Fang Wanga,
Zhi Yanga,
Fenglian Lib,
Jian-Li Shao*c and
Li-Chun Xu*a
aCollege of Physics, Taiyuan University of Technology, Jinzhong, 030600, China. E-mail: xulichun@tyut.edu.cn
bCollege of Information and Computer, Taiyuan University of Technology, Jinzhong, 030600, China
cState Key Laboratory of Explosion Science and Technology, Beijing Institute of Technology, Beijing, 100081, China. E-mail: shao_jianli@bit.edu.cn
First published on 30th October 2023
This study developed a machine learning-based force field for simulating the bcc–hcp phase transitions of iron. By employing traditional molecular dynamics sampling methods and stochastic surface walking sampling methods, combined with Bayesian inference, we construct an efficient machine learning potential for iron. By using SOAP descriptors to map structural data, we find that the machine learning force field exhibits good coverage in the phase transition space. Accuracy evaluation shows that the machine learning force field has small errors compared to DFT calculations in terms of energy, force, and stress evaluations, indicating excellent reproducibility. Additionally, the machine learning force field accurately predicts the stable crystal structure parameters, elastic constants, and bulk modulus of bcc and hcp phases of iron, and demonstrates good performance in predicting higher-order derivatives and phase transition processes, as evidenced by comparisons with DFT calculations and existing experimental data. Therefore, our study provides an effective tool for investigating the phase transitions of iron using machine learning methods, offering new insights and approaches for materials science and solid-state physics research.
Based on the extended X-ray absorption fine structure (EXAFS) technique, Wang and Ingalls13 initially proposed three possible microscopic mechanisms for the phase transition from bcc to hcp. The first two mechanisms involve different paths of bcc-to-hcp transition, while the third mechanism involves a path from bcc to an intermediate fcc phase and then to hcp phase. From an energetic standpoint, the third mechanism presents a more advantageous route for the phase transition. However, research evidence confirming its existence remains scarce. Kadau et al.14 employed classical molecular dynamics to simulate the phase transition in single-crystal iron under shock compression. They found that the bcc–hcp transition occurs when two adjacent crystal planes slip relative to each other along the [110] crystal direction, consistent with the Burgers relationship. Kalantar et al.15 directly observed the bcc–hcp phase transition in impaction iron via nanosecond X-ray diffraction (XRD). Moreover, based on first-principles calculations, Lu et al.16 argued that the transferable fcc state during the transition process is energetically unfavorable. Due to the small size of the unit cell typically employed in first-principles calculations, phase transitions involving larger atomic scales cannot be simulated. Moreover, temperature plays a crucial role in the dynamic phase transition process, and a thermodynamic energy-based description alone is inadequate to provide a comprehensive and accurate depiction of the kinetics of the phase transition. In simulations of phase transitions in iron, atomic-scale information requires molecular dynamics simulations that take into account temperature.17–24 The accuracy of molecular dynamics simulations heavily relies on the choice of interatomic potential. Quantum-mechanical potentials are computationally expensive for large-scale systems,25–27 while commonly used empirical potentials may have deviations in describing the high-energy transition region of phase transitions. Therefore, an efficient and high-precision empirical potential will benefit a deeper understanding of the phase transition process in simulations.
In recent years, various new interatomic potentials28–32 have been applied to simulate the atomic phase transition of iron under different loading conditions and initial microstructures at high pressure. Among them, machine learning force fields (MLFFs), as a typical representative of the new paradigm of “AI for science”, are increasingly being used by researchers in molecular dynamics simulations.33–38 In MLFF models, the potential energy is described as a function of descriptors representing the atomic structure of the material, and the parameters of the function are optimized to reproduce first-principles (FP) quantities, including the total energy, forces, and stress tensor components, to accurately and efficiently predict interatomic potentials.
In addition to the robustness of the model itself, the dataset required for fitting the model is crucial. To simulate the phase transition process, the dataset needs to cover the relevant configurations during the phase transition, enabling the model to predict the entire process through interpolation. In this study, we propose a systematic ML approach to construct an Fe interatomic potential, including the sampling process of the dataset, feature analysis of descriptors, and fitting process of the potential function. We demonstrate that this Fe MLFF can achieve close-to-DFT accuracy over a wide range of properties, including energy, forces and stress tensor, elastic properties. Importantly, it can also simulate the phase transition process of iron effectively.
Once the atomic configurations have been converted into structural descriptors, a labeled dataset is necessary to train the model. In the case of machine learning force fields, the dataset is labeled with the total energies of the configurations and the forces on each atom. At the time of our research, publicly available datasets specifically focused on the structures formed by the Fe element were not identified. However, subsequent to our research work, we came across a recent paper46 that provides a dataset on iron clusters. Regrettably, there remains a lack of available data on the bulk structure, which is directly relevant to our work. To efficiently construct the required dataset, we used a Bayesian learning algorithm with diverse sampling methods for real-time data collection and model fitting. After data collection is complete, a post-processing step involves using singular value decomposition to solve the system of linear equations in the ridge regression method to improve the model's accuracy.
In this paper, we used two sampling methods to construct a database suitable for phase transition studies. The construction of our training set is a fundamental component of our methodology, and we will provide an extended and more detailed description as follows.
In all sampling methods, each configuration was labeled with its energy, stress, and the forces on all atoms in it, which were calculated based on spin-polarized density functional theory (DFT).50 The DFT calculations were performed with the Perdew–Burke–Ernzerhof (PBE)51 exchange-correlation functional with projector-augmented wave (PAW)52 pseudopotentials implemented in the VASP code.53 The energy cutoff of 500 eV was used consistently across all configurations, which is much greater than 1.3 times the ENMAX of iron's pseudopotential. The strict condition of 1 × 107 eV was used to break the electronic self-consistent loop, and the Methfessel–Paxton smearing was set for each orbital occupation with 0.1 eV broadening. The k-space sampling of the first Brillouin zone was performed in gamma-centered Monkhorst–Pack grids with a linear density of 0.18 Å−1, the related SOAP descriptor and Bayesian formalism used in this work were implemented in the VASP(Version 6.3.2).53
We first analyzed the distribution characteristics of the data in the energy–volume space. As shown in Fig. 2(a), based on Bayesian inference, the energy distribution of the dataset is reasonable and generally normal. Among them, it should be noted that the number of samples in the low-energy region of the dataset is very rare, and the lower boundary presents a volume–energy relationship curve conforming to the law of the equation of state. This feature reflects the advantage of Bayesian inference in constructing dataset. Based on prior experience of existing data, it can infer whether new structures need to be added to the dataset before first-principles calculation, which avoids the problem of repeated sampling of structures. In the low-energy region, the structural changes are so small that a few samples can cover the typical atomic local environment. With the increase of energy, the degree of chaos of atomic arrangement increases rapidly. At this time, a large number of samples are added to the dataset, which increases the coverage of samples to the energy space, and thus the extrapolation ability of the dataset corresponding to the potential function. In order to analyze the descriptive ability of a dataset on phase transitions, we constructed a classical Burgers phase transition path and embedded it into an energy–volume curve. As shown by the green dotted line in Fig. 2(a), this dataset can essentially cover the region traversed by the phase transition path, and high-energy data still exist in regions with energies higher than the transition state. Therefore, the energy distribution of the dataset is reasonable.
Fig. 2 Distribution of the configurations in (a) the energy–volume space and (b) Steinhardt order parameter. The green path represents the classic Burgers phase transition path. |
After solving the problem of sample coverage to the energy space, we further analyzed the distribution characteristics of the dataset in the structural phase space. Due to the inherent symmetry of crystal structure, the current common means of machine learning potential function is to transform the structure into a local environment descriptor, and regression modeling is carried out between the descriptor and the energy and force properties of the structure. Therefore, we also adopted atomic local environment characteristics to analyze the distribution characteristics of samples in the structural phase space of the dataset.
The Steinhardt order parameters are a set of parameters that characterizes the local atomic environment and is popularly used to distinguish crystal structures, and its expression is as follows:
(1) |
In this article, the descriptor we used is smooth overlap of atomic positions (SOAP), which utilizes a local expansion of a Gaussian-smoothed atomic density with orthonormal functions that are based on spherical harmonics and radial basis functions. When using the typical truncation radius (5 Å), 8 radial basis functions, and 4 spherical harmonics quantum numbers in the construction of the SOAP descriptor, the characteristic dimension of each atom's local environment reached 5000. In order to analyze the distribution characteristics of the data after being mapped by the SOAP descriptor, we performed principal component analysis (PCA) to map the 5000-dimensional features into a 2-dimensional principal component space. To evaluate the sampling efficiency, we split the dataset according to the sampling method and included the number of data samples for each method. As shown in Fig. 3, the distribution of principal components of the SOAP descriptor covers a wide range, completely covering the region traversed by the classical Burgers path, which suggests that mapping the dataset with the SOAP descriptor is reasonable for studying phase transition problems.
Fig. 3 Smooth overlap of atomic positions (SOAP) descriptors combined with principal component analysis. The small figure shows the classification data obtained by different sampling methods. |
In addition, principal component analysis (PCA) based on different sampling methods revealed that the data distributions obtained by different methods were relatively concentrated, with significant differences in the coverage areas. The elastic sampling method obtained the fitted equation-of-state (EOS) structures by perturbing the structure through stretching or compressing. The obtained data showed that the SOAP descriptors could reflect the continuity of structural changes when describing structural features, which is necessary for studying phase transitions in variable cell solids.
The trajectory sampling data obtained by performing high-temperature molecular dynamics simulations based on the hcp equilibrium structure were highly concentrated (0.2 < PC1 < 0.05, 0.05 < PC2 < 0.1) even though the number of trajectories reached 1915. The trajectory sampling data obtained based on the bcc equilibrium structure had a larger coverage area (0.1 < PC1 < 0.3, 0.1 < PC2 < 0.3), but the overlap between the two types of structures obtained by the bcc and hcp molecular dynamics simulations was small, which was not conducive to accurately describe the phase transition between the two. The trajectory sampling data obtained by performing high-temperature molecular dynamics simulations based on the fcc equilibrium structure had the largest number of trajectories, and the distribution could cover the area between bcc and hcp structures well, which was related to some reports stating that fcc is the intermediate phase in the bcc–hcp phase transition. Nevertheless, the efficiency of the molecular dynamics simulation-based sampling methods was generally low in describing the intermediate processes of the bcc–hcp phase transition, and the dataset may have better coverage of the bcc–fcc phase transition (transition temperature 1180 K) instead. The iron phase diagram shows that the bcc–hcp phase transition can be driven by pressure, which requires considering sampling at different pressures in molecular dynamics simulations. However, the ability of ordinary molecular dynamics simulations to cross high-energy barriers is low, making it difficult to obtain samples near the transition saddle point using this sampling scheme. To quickly and specifically obtain a dataset and a machine learning potential suitable for studying phase transition problems, we also tried the random walk algorithm on the potential energy surface. This algorithm can cross high energy barriers and obtain continuous phase transition paths, which is very helpful for constructing phase transition research dataset. We further added Bayesian inference to the original algorithm to infer whether new samples can be described by the existing dataset while randomly walking. This set can accelerate the construction of the dataset and avoid the collection of duplicate samples. The red data points in Fig. 3 show the data obtained by the sampling method based on the Stochastic Surface Walking (SSW) approach. The random walk started from the bcc ground state structure. It can be seen from the figure that the sampling in the initial stage was in line with the Hamburg path, diverged in the middle area, and finally converged near the hcp ground state. Compared with the molecular dynamics-based sampling methods, the dataset obtained based on the random potential energy surface walking scheme had higher coverage of phase transitions, which is conducive to training potentials with stronger generalization ability and more suitable for phase transition research. Moreover, the surface model and defect model had significant advantages over ideal crystals in obtaining local configurations, which could increase the diversity of sampling data.
In this work, we applied the kernel-based Bayesian regression model integrated within the VASP code.34 In this model, a variant of the smooth overlap of atomic positions was adopted as the descriptors Xi, and a kernel function K was used to measure the similarity between a local configuration and the reference local configuration. The cutoff radius of radial descriptors and angular descriptors were 8.0 Å with a 0.5 Å Gaussian broadening. The descriptors were expanded by radial basis functions (N = 12) and spherical harmonics (Lmax = 4), the weight of radial descriptors in the kernel was set to 0.1. Bayesian linear regression was employed to get the fitting coefficients wiB in the linear equations for the energies U and kernel functions K,
(2) |
The threshold for the CUR algorithm used in the sparsification of local reference configurations was 10−9, the convergence criterion for the optimization of parameters was 10−15.
Fig. 4 Kernel-based Bayesian regression model predictions compared with first-principles results for (a) the energy, (b) the force, and (c) the stress. |
Table 1 provides a comparison of the Fe MLFF model predictions for the lattice constants and elastic properties of bcc and hcp Fe with experiments. It is found that the calculated lattice constants for bcc and hcp phases by the model are in excellent agreement with DFT and experimental values. The elastic properties of bcc Fe predicted by MLFF are also in good agreement with DFT. For instance, the predicted values of C11, C12, and C44 are 267, 145, and 86 GPa, respectively, with errors of 4.3%, 3.6%, and 7.5% compared to DFT, while the errors of EAM are significantly higher at 10.2%, 3.6%, and 46.3%.54,55 However, the elastic properties of hcp Fe predicted by MLFF show relatively larger errors compared to DFT. The volume modulus estimated using the Voigt–Reuss–Hill approximation56 shows good agreement with DFT, but the volume modulus from the MEAM potential is greatly underestimated.
bcc | hcp | |||||
---|---|---|---|---|---|---|
Exp | DFT | MLFF | Exp | DFT | MLFF | |
a | 2.866 | 2.832 | 2.831 | 2.347 | 2.458 | 2.461 |
c | — | — | — | 3.797 | 3.887 | 3.863 |
E | — | −8.237 | −8.234 | — | −8.153 | −8.169 |
C11 | 256 | 267 | 527 | 655 | ||
C12 | 140 | 145 | 178 | 219 | ||
C44 | 80 | 86 | 164 | 186 | ||
B | 180 | 185 | 289 | 347 | ||
Ev | 2.16 | 1.91 | 0.05 | 0.03 | ||
Em | 0.69 | 0.89 | 1.49 | 1.64 |
To further investigate predictions of force and lattice dynamics using Fe MLFF, phonon dispersion curves for bcc and hcp Fe 3 × 3 × 3 supercell were calculated by the finite displacement method, as shown in Fig. 5(a and b). The predicted phonon dispersion curves of bcc and hcp Fe are in good agreement with those calculated by DFT and measured by the experiments. The imaginary frequency is not observed, and the lowest frequency is located at the point. The deviation of the results calculated by DFT and MLFF mainly occurs in the range of high-frequency phonons. To determine the effect of these deviations on thermal properties, we calculated the Helmholtz free energy (A), entropy (S) and constant volume molar thermal capacity (Cv) of bcc and hcp Fe by DFT and MLFF. Both methods show nearly identical curves for all three quantities, demonstrating the accuracy of machine-learned potential functions in simulating thermal properties (Fig. 5(c and d)).
To explore whether the constructed machine learning potential function can simulate the phase transition process, we performed variable-cell double-ended surface walking method simulations57 using MLFF to determine the phase transition process. The 32-atom bcc Fe supercell was assumed as initial state, and hcp Fe was the final state. The energy trajectories from bcc to hcp phase was shown in Fig. 6, the phase transition from bcc to hcp needs to overcome a energy barrier of 0.12 eV, which is very close to the previous DFT calculation results (0.132 eV,58 0.156 eV,46 0.112 eV,8 0.185 eV59). We also calculated the energies of these intermediate configurations using DFT. The energies of DFT and MLFF on the side of the bcc phase are very consistent, and there is a certain deviation between the two on the side of the hcp phase, and the largest deviation occurs on the transition state structure. This is related to the lack of high-energy region data used in the fitting, although we use the SSW sampling method to avoid this problem. This phenomenon is not unique to our model; the PES calculated by Jana et al.46 used Dragoni's GAP potential28 and Mendelev's EAM potential60 also exhibit similar behavior. The low weight of transition state structures in the fitting process, owing to their small proportion in the dataset, contributes to the challenges in accurately capturing their energy during the fitting process. To evaluate the impact of potential functions, we employed three publicly available potentials: Dragoni's GAP potential, Mendelev's EAM potential, and Jana's TurboGAP potential. We applied these potentials to compute the energy of these same structures in PES, depicted in Fig. 6(b). Our findings reveal that, while Dragoni's GAP potential and Mendelev's EAM potential offer insights into the relative energy relationship between bcc and hcp structures, they fall short of characterizing the phase transition barrier between them. The most recent Jana's TurboGAP potential and our constructed potential demonstrate comparable predicted energies, both of which are slightly lower than those obtained through DFT calculations. However, it's essential to acknowledge that differences in data sets and variations in force predictions affect the assessment. For Dragoni's GAP potential and Mendelev's EAM potential, the structures near the boundary are not ground states, potentially impacting their ability to describe the potential energy surface for the phase transition. Notably, when comparing these results, it becomes evident that Dragoni's GAP potential, despite its GAP-type nature, is constrained by the limitations of the dataset, making it somewhat inadequate in describing the potential energy surface for the phase transition.
Fig. 6 (a)IS, TS, FS structures and (b) reaction energy profile along the Fe bcc-to-hcp phase transition pathway the double-ended surface walking trajectories from bcc to hcp phase, calculated by our MLFF, DFT, Mendelev's EAM potential,60 Dragoni's GAP potential28 and the latest Jana's TurboGAP potential46 with the same structures based on our MLFF PES. The dashed line represents the barrierless prediction using MLFF trained without SSW data. |
To explicitly demonstrated how SSW-added workflow outperforms standard MD sampling in the transition state region, we specifically fitted a machine learning potential using the subset of data collected through MD simulations (8016 data points) with the same parameters as a reference. This reference model, based on the “no-ssw data”, was then utilized to calculate the potential energy barrier between bcc and hcp structures. Simulated by variable-cell double-ended surface walking method simulations, within a system of 32 atoms, the energy of the transition state is only marginally higher (0.001 eV) than that of the bcc structure. Considering numerical errors, this implies the absence of a substantial energy barrier between the two states. This behavior closely resembles the response of the EAM and GAP potential functions, as shown in Fig. 6(b). This phenomenon may be attributed to the complexity of force properties. For each configuration, it necessitates the labeling of one total energy, six stress values, and as many as 3Natom force values. While machine learning can predict a single total energy value easily using interpolation, it becomes notably challenging for the 3Natom force values. Interpolation in such cases is intricate, and errors tend to amplify, making accurate force predictions more challenging, particularly when data is limited. When force predictions are inaccurate, it becomes impossible to derive precise phase transition paths. Therefore, in Fig. 6(b), we present a failed schematic representation of the barrierless prediction using MLFF trained without SSW data. In the absence of SSW data, the assessment of forces within the intermediate state encounters limitations. These limitations, in turn, undermine the meaningfulness of calculating the energies associated with these structures. Upon the inclusion of 821 data points generated through the SSW sampling process (comprising only 1/10 of the data from MD sampling), the constructed potential can accurately model the energy barrier between bcc and hcp structures, which means that both the force and the energy predictions are reasonable. This compelling evidence underscores the significance of the SSW sampling strategy and highlights the necessity of SSW in addressing the limitations of traditional MD sampling when studying phase transitions. These additional insights confirm the advantages of our workflow in modeling phase transitions with improved accuracy and efficiency. We have shown that our approach is more efficient, requiring SSW data to achieve comparable or better accuracy in predictions.
Considering the phase transition process from bcc to hcp as a whole, our MLFF model can reasonably reproduce the potential energy surface, energy, and force predictions, providing a reasonable representation of the phase transition behavior and reasonable barrier heights. In addition, due to the limitations of the current machine learning force field model and the large amount of calculation required to construct a database containing atomic oriented magnetic moments, the properties of magnetic moments are not explicitly reflected in the current force field model, which also limits an accurate description of phase transitions in magnetic materials.
This journal is © The Royal Society of Chemistry 2023 |