Eric
Lindgren
*a,
Adam J.
Jackson
b,
Erik
Fransson
a,
Esmée
Berger
a,
Goran
Škoro
c,
Svemir
Rudić
c,
Rastislav
Turanyi
b,
Sanghamitra
Mukhopadhyay
c and
Paul
Erhart
*a
aDepartment of Physics, Chalmers University of Technology, Gothenburg, SE-41296, Sweden. E-mail: erhart@chalmers.se
bScientific Computing Department, STFC Rutherford Appleton Laboratory, Didcot OX11 0QX, UK
cISIS Neutron and Muon Source, STFC Rutherford Appleton Laboratory, Didcot OX11 0QX, UK
First published on 3rd July 2025
Machine learning has emerged as a powerful tool in materials discovery, enabling the rapid design of novel materials with tailored properties for countless applications, including in the context of energy and sustainability. To ensure the reliability of these methods, however, rigorous validation against experimental data is essential. Scattering techniques—using neutrons, X-rays, or electrons—offer a direct way to probe atomic-scale structure and dynamics, making them ideal for this purpose. In this work, we describe a computational workflow that bridges machine learning-based simulations with experimental validation. The workflow combines density functional theory, machine-learned interatomic potentials, molecular dynamics, and autocorrelation function analysis to simulate experimental signatures, with a focus on inelastic neutron scattering. We demonstrate the approach on three representative systems: crystalline silicon, crystalline benzene, and hydrogenated scandium-doped BaTiO3, comparing the simulated spectra to measurements from four different neutron spectrometers. While our primary focus is inelastic neutron scattering, the workflow is readily extendable to other modalities, including diffraction and quasi-elastic scattering of neutrons, X-rays, and electrons. The good agreement between simulated and experimental results highlights the potential of this approach for guiding and interpreting experiments, while also pointing out areas for further improvement.
However, the predictive power of these computational approaches necessitates rigorous experimental validation. Scattering experiments, such as neutron, X-ray, and electron scattering, provide critical insights into the structure and dynamics of materials but require precise simulations to interpret the data accurately.13 Bridging the gap between computational predictions and experimental observations remains a significant challenge in the field. Predictive simulations could also significantly enhance experimental planning and execution by ensuring that data acquisition is optimized for maximum information gain while reducing the likelihood of inconclusive or ambiguous results.14–16 Furthermore, such simulations can support the preparation of beamline proposals, providing quantitative justifications for instrument time requests by demonstrating expected signal strengths and resolving power. As experimental facilities increasingly integrate computational tools into their workflows, predictive capabilities are poised to play a crucial role in streamlining the experimental process, improving the overall efficiency of materials characterization, and ultimately accelerating scientific discoveries.
In response to these challenges, we here describe a comprehensive workflow that integrates DFT calculations, MLIPs in the neuroevolution potential (NEP) format, MD simulations using GPUMD,17 the computation of autocorrelation functions via dynasor,18,19 and their convolution with atomic form factors, instrument resolution functions and kinematic constraints. This enables instrument-specific predictions of scattering data from first-principles, allowing direct comparisons between simulations and experimental measurements. MLIPs are instrumental to this workflow, as they enable the large-scale MD simulations required to properly converge the density auto-correlation functions, and thus the experiment predictions.
The computational efficiency of NEPs as implemented in GPUMD allows our workflow to be applied to systems containing tens of thousands of atoms, simulated over several nanoseconds. To the best of our knowledge, this allows the access of both larger system sizes and longer simulation times than other workflows predicting inelastic neutron scattering (INS) spectra with MLIPs.20 Furthermore, the workflow is fully implemented in Python, offering direct and easy integration in existing computational workflows. We demonstrate the efficacy of this workflow by applying it to three example systems, including elemental Si, crystalline benzene, and hydrogenated Sc-doped BaTiO3, showcasing both its potential for guiding experimental design and accelerating the discovery of new materials as well as its current limitation. We focus specifically on simulating INS experiments, but the general workflow can be easily used to simulate other experimental modalities, including diffraction as well as quasi-elastic and inelastic scattering of neutrons, X-rays, and electrons.
The training set was initially composed of strained and scaled structures, based on ideal structures using reference data from DFT calculations (Subsection 2.4). In the case of benzene, the initial dataset also included dimer configurations to ensure that intermolecular interactions are captured in the training dataset. An initial model was trained on all available data. The dataset was then augmented with structures from several iterations of active learning. To this end, we trained an ensemble of five models by randomly splitting the data into training and validation sets, which was subsequently used to estimate the model uncertainty. MD simulations were then carried out between 10 and 200 K and at pressures ranging from 0 to 10 GPa for benzene, and from 300 to 2000 K and at pressures ranging from −1 to 10 GPa for Sc-doped BaTiO3 using the respective current generation of NEP models. The NEP model for Sc-doped BaTiO3 was trained on structures from the extended temperature range from 300 to 2000 K to increase the robustness of the model, by ensuring that the training data set contains a varied set of configurations. Structures encountered at the target temperature 15 K are well within the interpolative regime of the NEP model (Section S1 in the ESI†). The ensemble of models was used to select structures with a high prediction uncertainty, quantified by a range of predictions over the ensemble, for which we computed reference energies, forces, and stresses via DFT. These configurations were subsequently included when training the next-generation NEP model. The training set for benzene consisted of 798 unique benzene structures, corresponding to a total of 94
470 atoms. For Sc-doped BaTiO3, the training set contained 2280 unique structures, corresponding to a total of 138
438 atoms. Structures were generated and manipulated using the ASE25 and hiphive packages.26
We obtained the final NEP models after 13 iterations for benzene and 6 iterations for Sc-doped BaTiO3. The final models were trained on all available data. For the benzene model, the resulting average root mean square errors (RMSEs) over the ensemble are 1.1(8) meV per atom for the energies and 63(30) meV Å−1 for the forces, and 8.4(16) meV per atom for the virials. Note that the relatively large uncertainty in the predicted forces is due to one of the ensemble models being an outlier. The corresponding average coefficients of determination on the same folds are R2 = 0.9997(3), R2 = 0.9949(51), and R2 = 0.9988(6) for energies, forces, and virials, respectively. The RMSEs and R2 scores for the final benzene model were 8.510 meV per atom and R2 = 0.9998 for the energies, 59 meV Å−1 and R2 = 0.9962 for the forces, and 9.3 meV per atom and R2 = 0.9987 for the virials (Section S2 in the ESI†). For the Sc-doped BaTiO3 model, the ensemble RMSEs were 6.6(13) meV per atom for the energies, 186(34) meV Å−1 for the forces, and 31(5) meV per atom for the virials. The respective coefficients of correlation (R2) were R2 = 0.999 98(1), R2 = 0.9765(58), and R2 = 0.9964(9) for energies, forces, and virials, respectively. The RMSEs and R2 scores for the final Sc-doped BaTiO3 model were 6.2 meV per atom and R2 = 0.9999 for the energies, 172 meV Å−1 and R2 = 0.9792 for the forces, and 30 meV per atom and R2 = 0.9963 for the virials (Section S3 in the ESI†).
The resulting NEP models along with the reference data used for training are available via zenodo as specified in the Data Availability statement.
For Si, a supercell comprising 38 × 38 × 38 primitive cells for a total of 438976 atoms was simulated at 300 K, 900 K, 1200 K, and 1500 K, with equilibration of the system in the NPT ensemble and production for 1 ns in the NVE ensemble, with a timestep of 2 fs. The atomic positions were written to file every 14 fs in order to accurately resolve the fastest vibrations in the system when computing the dynamic structure factor.
Crystalline benzene was simulated in a supercell containing a total of 57024 atoms. The benzene system was equilibrated in the path-integral molecular dynamics (PIMD) ensemble27,28 to avoid the significant underestimation of the cell volume in the classical NPT ensemble at low temperatures. The MD simulations were conducted at 127 K to strike a balance between computational cost and the number of PIMD beads (see Sections S4 and S5 in the ESI†). Production runs were then performed for 1 ns in the NVE ensemble. A time step of 0.5 fs was used, and the positions were written every 3 fs. Ten independent MD runs were performed to improve the statistics of the computed dynamic structure factor.
Hydrogenated supercells of Sc-doped BaTiO3 were constructed for various Sc concentrations in the range 16% to 70% in both the cubic and hexagonal phase. The supercell contained ≃40
000 atoms. Equilibration was performed in the PIMD ensemble at a temperature of 15 K, and production was carried out for 350 ps in the thermostated ring-polymer MD ensemble29 with a timestep of 0.5 fs. This approach captures nuclear quantum effects on the frequencies,28 but it should be noted that the phonon occupation statistics are still classical. Both equilibration and production runs used 32 PIMD beads, for an effective system size of ≃13
00
000 atoms, limiting the length of the production run compared to Si and benzene because of the increased computational cost.
The specific supercell sizes used in this work were chosen in order to strike a balance between computational cost and convergence of the dynamic structure factor with regards to the number of q-points commensurate with the supercell (see Section S6 in the ESI† for a convergence study of the supercell size for crystalline benzene, as well as an extended discussion on supercell size and commensurate q-points).
![]() | (1) |
![]() | (2) |
![]() | (3) |
![]() | (4) |
The dynamic structure factor in eqn (4) can be further generalized for multi-component systems. Different atomic nuclei scatter neutrons, X-rays, and electrons with varying intensity, which can be taken into account by weighting the partial dynamic structure factor for species α and β accordingly. In the case of neutrons, the partial dynamic structure factor should be weighted by the scattering lengths, bα and bβ,
![]() | (5) |
The dynamic structure factor in eqn (5) was computed from the MD trajectories using the dynasor package18,19 in the third step of the workflow (Fig. 1c). q-points and time lags were selected to match the accessible range of the simulated neutron scattering instruments. Specifically, for Si a Brillouin zone path was sampled connecting the high-symmetry points Γ, X, K, and L. The path was sampled in 52 different Brillouin zones, randomly selected from the first zone up to |q| = 12 Å−1 for a total of 6136 q-points. Randomly selected q-points up to a magnitude |q| = 14 Å−1 and |q| = 18 Å−1 were sampled for benzene and Sc-doped BaTiO3, respectively, yielding 2116 and 2601 q-points, respectively. Gaussian broadening with a width of 0.01 Å was then applied to each q-point, followed by averaging over spherical shells in |q| to produce S(q,ω).
Instrument-specific resolution functions and kinematic constraints were applied to the calculated spectra using the euphonic package31 with the resolution functions defined in the resins package.32 The resolution functions used here are Gaussians with energy-dependent width; the functions for TOSCA and Lagrange are based on implementations in AbINS, and the functions for MAPS and ARCS are based on PyChop.33 (The instrument functions for both AbINS and PyChop are distributed in Mantid.34,35) Note that the true resolution functions are four-dimensional and non-Gaussian, but these 1-D approximations are used routinely in INS simulations. The kinematic constraints have their origin in the instrument geometry and transformation from time-of-flight measurements to (q,ω) space. In the simulations they are applied as a mask to data computed directly in the (q,ω) space.
Finally, a quantum correction factor was applied to all dynamic structure factors, in order to correct for the classical phonon statistics generated by the MD simulations. Specifically, we applied the following correction factor based on first-order Stokes-Raman scattering,36,37
![]() | (6) |
![]() | ||
Fig. 2 (a) Simulated INS dispersion of Si from MD for the ARCS spectrometer at the Spallation Neutron Source at ORNL, with the harmonic phonon dispersion calculated using the NEP model overlaid (turquoise lines). (b) The simulated intensity at the X-point, as well as (c) the first peak in the intensity at the X-point in the third Brillouin zone for 300 K, 900 K, 1200 K, and 1500 K (red lines). Note that the dispersion in (a) and intensity in (b) is aggregated over 52 different Brillouin zones, while in (c) the results for only a single Brillouin zone are shown. The experimental data (black circles) is from ref. 30. The MD simulation captures both the anharmonicity and multi-phonon effects present in the experimental data, as well as the mode softening as the temperature is increased. The multi-phonon effects manifest as non-zero intensity between the acoustic and optical branches at the X-point. |
We validate our results by comparing them to the experimental data measured on the ARCS spectrometer in ref. 30. Specifically, we consider the intensity at the high-symmetry X-point (Fig. 2b), where the simulated intensity has been multiplied by an extra factor of ω2. Our results are in quantitative agreement with experiments, with the centroids of the phonon mode peaks agreeing well. The relative intensity between the different phonon mode peaks in the experiment are not entirely reproduced in the simulation, which could be due to a missing correction factor or experimental variability.
We note, in particular, the nonzero intensity measured in the experiment and captured by the simulation in the region 30 meV to 50 meV. This scattered intensity corresponds to multi-phonon effects, which are inherently captured by MD simulations. Furthermore, the effect of thermal expansion as the temperature is varied is also directly included by the MD simulations, where specifically the low-energy mode at the X-point is softened as temperature is increased (Fig. 2c). One can observe that some of the predicted mode energies are slightly shifted compared to experiments, by approximately 1 meV. Given that the MLIP accurately reproduces the harmonic phonon dispersion from DFT around the X-point, it is most likely due to the underlying exchange-correlation functional (Section S7 in the ESI†).
At 1500 K it moreover appears that in the region around 40 meV the simulated and experimental intensities differ. This discrepancy could be due to the Brillouin zone in which the simulations have been conducted. Here, we show the X-point in the third Brillouin zone, i.e., q = [0.5, −1.0, −1.5], while in the experimental ref. 30 the exact q-point is not specified. In fact, the intensity at the X-point varies substantially depending on the Brillouin zone, especially the intensity of the multi-phonon shoulder around 30 meV (Fig. 4). For the comparison shown in (Fig. 2c), we selected the Brillouin zone for which the simulated spectrum best reproduces the experimental data, based on the mean-squared error calculated over the spectrum.
In MD, the dynamics of the system described by the potential model is captured at the classical level, including high-order phonon effects, thermal expansion, and full anharmonicity. Efforts have been made in recent years to include the effects of anharmonicity on top of harmonic models, including but not limited to using higher-order force constants,26 temperature-dependent effective potentials,55 anharmonic lattice models,56 and the self-consistent harmonic approximation.57 However, a harmonic model is inherently limited in describing such complicated dynamic events.
The scattered intensity from benzene is dominated by incoherent scattering from hydrogen, owing to the exceptionally large incoherent scattering length of hydrogen. We thus study the INS spectrum directly. The dynamic structure factor can be integrated over |q| in order to obtain Comparing the raw simulated spectrum with the experimental data, we find that the simulated spectrum captures the peaks corresponding to different modes but the relative intensity between them is not reproduced. Furthermore, the low-energy peak at 10 meV is not captured. The reason for these discrepancies is to a large extent due to the kinematic constraint and resolution function of the TOSCA spectrometer (Fig. 3). The two detector banks of TOSCA map out two lines in q–ω space, where high (low) frequencies correspond to large (low) q. By sampling along these q–ω lines and convoluting the resulting spectrum with the resolution function of the instrument, the agreement improves notably.
However, the ratio in intensity between the high and low-energy regions is still not reproduced. The main reason for this discrepancy is the classical statistics of MD simulations, which we correct for with the quantum correction factor according to eqn (6). Applying both kinematic constraint and quantum correction yields a simulated spectrum that is in near-quantitative agreement with experiments. The remaining difference to experiments is a redshift of the simulated spectrum by approximately 25 meV. We attribute this redshift to the DFT functional used to train the NEP model, as well as weak intermolecular interactions not being fully captured by the NEP model. A more detailed discussion comparing experiments and first-principles calculations to the predictions from the NEP model can be found in the ESI† (Section S8).
We can further elucidate the simulated INS spectrum by comparing it to the phonon dispersion according to the underlying MLIP (Fig. 5b). The simulated INS spectrum differs notably from the phonon density of states in terms of intensity, owing to the kinematic constraint of the TOSCA spectrometer, and the quantum correction. Furthermore, the full anharmonicity included in the MD simulation in combination with the TOSCA resolution function yields a broadening of the peaks in the simulated INS spectrum.
In summary, this study of crystalline benzene highlights the importance of considering the resolution and kinematic constraints of the specific instrument, as well as correcting the statistics from classical MD simulations, when aiming for quantitative predictions of neutron scattering experiments.
Sc-doped BaTiO3 undergoes a phase transition from a hexagonal structure to a cubic perovskite structure as the Sc concentration increases. On MD time scales, both structures are, however, at least metastable over the entire composition range, which (in contrast to experiment) allows us to sample structure and composition independently (Fig. 6).
![]() | ||
Fig. 6 Hydrogen dynamics in hydrogenated Sc-doped BaTiO3 (BaTi1−xScxO3Hx) for various concentrations of dopants, compared with experimental results measured at three different neutron spectrometers: IN1 Lagrange, TOSCA, and MAPS. Experimental data are from ref. 58. (a) IN1 Lagrange is a wide-range spectrometer that probes the dynamics in the region from 0![]() |
The simulated spectra using our workflow and the experimental spectra agree well in the full energy range 0to 500 meV. The peak at 125 meV corresponds to O–H vibrations, and is best described by the hexagonal phase for low Sc concentrations and by the cubic phase for high Sc-concentrations (Fig. 6a). However, the overtone peak at 250 meV is underestimated in both simulated phases. This discrepancy could be due to the quantum correction factor in eqn (6), which is only valid for first-order scattering. This is further supported by the simulated spectra obtained using AbINS, which accurately capture the intensity of the 250 meV overtone peak. The latter method handles multi-phonon effects perturbatively and includes quantum effects but does not account for anharmonicity, which explains the sharper first-order features compared to the MD-based simulations.
The results from TOSCA highlight the 125 meV feature further (Fig. 6b). For low Sc concentrations the simulated spectrum using our workflow for the hexagonal structure agrees well with experiments, although the simulated spectrum is redshifted by approximately 25 meV. The simulated spectrum for the cubic structure agrees better with experiments as the Sc concentration is increased, which is in line with the hexagonal to cubic phase transition with increasing Sc concentration. We can thus clearly distinguish the spectra for the two phases of Sc-doped BaTiO3, as the Sc-doping is varied.
Finally, the MAPS spectrometer probes the high-energy region between 300 meV to 600 meV. The feature in the experimental spectra at 450 meV corresponds to stretching of the O–H bond according to Perrichon et al., with the peak at 550 meV assigned as a combination mode of the O–H wag mode at 120 meV and the O–H stretch mode at 450 meV. The fundamental vibrational peak at 450 meV is captured by the simulations, although with a slight blueshift of 10 meV. However, the intensity for the combination mode is not reproduced by the simulation, neither using the MD-based workflow nor AbINS. In this case, we can further elucidate the nature of the combination modes at 550 meV using AbINS (Fig. 7). In these harmonic incoherent-approximation simulations, the intensity in that region is mainly composed of fourth-order scattering events and above. Such high-order phonons require a higher-order correction factor in order for the statistics to come out correctly using the MD-based workflow. However, applying a higher-order correction factor is not straightforward, as one would have to know a priori in which region of the spectrum to apply the correction, and the order of the higher-order scattering process.
The predictions show remarkable quantitative agreement with experiments. Almost all experimental features in the form of vibrational peaks are faithfully reproduced in the predicted spectra, including their relative intensities after applying correction factors for the quantum statistics. Remaining differences between the spectra, such as systematic blue- or redshifts of certain peaks in the case of crystalline benzene, can be most likely attributed to the difficulties in capturing the weak intermolecular interactions with the MLIP, as well as the DFT functional used for training the MLIP. However, the intensity for some higher-order phonon processes, such as in Sc-doped BaTiO3, are not reproduced faithfully compared to experiments at the present level. In principle these discrepancies can, however, be corrected for by applying higher-order correction factors to recover the correct statistics for overtones and combination modes.
Such a correction would not represent a general-purpose ab initio approach as it is only possible in regions where these features can clearly be identified and separated, such that a single correction can be applied. For an unknown system, identifying such modes in a MD simulation is problematic, although some information can be gained from harmonic calculations, e.g., using the AbINS algorithm in Mantid. However, these corrections only affect the relative intensity of these peaks, while their positions are directly obtained from the MD simulations. Applying higher-order quantum corrections is thus not strictly necessary in order to give a reasonable prediction of a neutron scattering experiment. In general, we suggest applying the lowest order correction for the whole spectrum to be sufficient for the purpose of guiding neutron scattering experiments.
The workflow as presented in this work relies on MLIPs in order to run accurate and efficient MD simulations. Classical force fields could be used but the results might be of limited accuracy, especially in systems involving both bonded and non-bonded interactions. However, training a MLIP or selecting an appropriate force field for a system of interest constitutes a bottleneck in the workflow, requiring domain knowledge and effort. Foundation models trained on large parts of the periodic table, such as MACE-MP-0 or CHGNet among others,59–63 offer an appealing alternative to creating bespoke MLIPs or using existing force fields. These foundation models can either be used out-of-the-box, or fine tuned with a small number of structures from DFT to yield an accurate model with comparatively low effort. It should, however, be noted that these models are often computationally much more demanding than either NEP models or classical force fields. Recently, a foundation model based on the NEP framework, NEP89, has been released, which combines the computational efficiency of NEP with a competitive transferability and accuracy compared to previously published foundation models out-of-the-box.64 Additionally, NEP89 can efficiently be finetuned for a specific application, using only a limited amount of training data from DFT. Using finetuned models based on NEP89 would thus directly alleviate the main bottleneck of our workflow, developing the MLIP, whilst retaining the high computational efficiency. Foundation models further improves the ease of use of our workflow, and enables researchers to quickly predict scattering signatures for systems without existing MLIPs, for the purpose of guiding experiments or materials discovery.
Footnote |
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d5ta03325j |
This journal is © The Royal Society of Chemistry 2025 |