Christopher M. Collins,*ab Hasan M. Sayeed,c George R. Darling,a John B. Claridge,a Taylor D. Sparks*c and Matthew J. Rosseinsky*ab
aDepartment of Chemistry, University of Liverpool, Crown Street, Liverpool, L69 7ZD, UK. E-mail: rossein@liverpool.ac.uk; c.m.collins@liverpool.ac.uk
bLeverhulme Research Centre for Functional Materials Design, Materials Innovation Factory, University of Liverpool, Crown Street, Liverpool, L69 7ZD, UK
cDepartment of Materials Science and Engineering, University of Utah, 122 Central Campus Dr, Salt Lake City, Utah 84112, USA. E-mail: sparks@eng.utah.edu
First published on 30th May 2024
The prediction of new compounds via crystal structure prediction may transform how the materials chemistry community discovers new compounds. In the prediction of inorganic crystal structures there are three distinct classes of prediction: crystal structure prediction via heuristic algorithms, implemented in a range of established crystal structure prediction codes; an emerging community using generative machine learning models to predict crystal structures directly; and the use of mathematical optimisation to solve crystal structures exactly. In this work, we demonstrate the combination of heuristic and generative machine learning approaches, using a generative machine learning model to produce the starting population of crystal structures for a heuristic algorithm, and discuss the benefits, demonstrating the method on eight known compounds with reported crystal structures and three hypothetical compounds. We show that the integration of machine learning structure generation with heuristic structure prediction results in both faster compute times per structure and lower energies. This work provides to the community a set of eleven compounds with varying chemistry and complexity that can be used as a benchmark for new crystal structure prediction methods as they emerge.
A second, recently emerging approach to the prediction of crystal structures is through the use of machine learning (ML) models. ML structure generation makes use of now well-curated structure databases, such as the Materials Project7 or ICSD,8 to train models to rapidly generate plausible crystal structures for inorganic solids. These models include graph neural networks,9 diffusion models10 and large language models.11 ML models have reached the point where they can efficiently generate large numbers of plausible crystal structures for a target composition. However, when such models are used far from their training data, or for non-trivial compositions, they frequently fail to produce the correct structure, or a structure which can be relaxed into the ground state with a conventional chemistry calculation, for example with density functional theory (DFT). A final, newly emerging approach is that of mathematical optimisation, where the structure prediction problem is constructed as a series of linear equations, which can then be solved. Within the constraints of how the problem is formulated, this then provides a guarantee that the global minimum structure has been located.12
The types of heuristic methods for crystal structure prediction mentioned above are all dependent on an algorithm to generate initial structure(s) with an element of randomness that then evolve in some specified way. The initial structures can be produced using simple rules, for example randomly selecting a unit cell then populating it with atoms subject to minimum inter-atomic distance constraints, or with more complex rule sets based upon knowledge of inorganic chemistry, as used, e.g., in FUSE, where structures are assembled from randomly generated blocks using rules based on how such blocks connect in known compounds. Hereafter, we refer to all of these methods to generate trial crystal structures as “random structure generation”, in contrast to structures generated by ML models. Once structures are generated, they are optimised into their nearest energy minimum according to the forces acting on the atoms calculated using well-established chemistry methods, such as the Density Functional Theory (DFT) code VASP,13 referred to here as local optimisation. The vast majority of the computational cost of heuristic crystal structure prediction methods lies in these local optimisation steps, with the total cost very closely aligned with how close the initially generated structures are to local minima on the potential energy surface.
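The simplest form of random structure generation described above (pick a cell, then add atoms subject to a minimum inter-atomic distance) can be sketched as follows. This is an illustrative sketch only, not the generator used in FUSE: the function names, the restriction to an orthorhombic cell and the 1.5 Å threshold are all assumptions made for the example.

```python
import math
import random

def min_image_distance(f1, f2, cell_lengths):
    """Distance (Å) between two fractional points in an orthorhombic cell,
    using the nearest periodic image along each axis."""
    d2 = 0.0
    for a, b, length in zip(f1, f2, cell_lengths):
        delta = (a - b) % 1.0
        delta = min(delta, 1.0 - delta)  # wrap to the nearest periodic image
        d2 += (delta * length) ** 2
    return math.sqrt(d2)

def random_structure(species, cell_lengths, min_dist=1.5, max_tries=10000, seed=None):
    """Place atoms at random fractional positions, rejecting any trial
    position closer than min_dist (hypothetical threshold) to a placed atom."""
    rng = random.Random(seed)
    placed = []  # list of (symbol, (x, y, z))
    for symbol in species:
        for _ in range(max_tries):
            trial = (rng.random(), rng.random(), rng.random())
            if all(min_image_distance(trial, pos, cell_lengths) >= min_dist
                   for _, pos in placed):
                placed.append((symbol, trial))
                break
        else:
            raise RuntimeError("could not place atom; cell too small for min_dist")
    return placed

# Hypothetical usage: five atoms in a 6 Å cubic cell
structure = random_structure(["Ca", "Ti", "O", "O", "O"], (6.0, 6.0, 6.0), seed=0)
```

Structures produced this way satisfy the distance constraint but are otherwise arbitrary, which is why they can sit far from any local minimum and be expensive to relax.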
In this work we demonstrate a new hybrid approach to CSP: to use ML models to generate initial crystal structures for a given composition in place of the conventional random structure generation. We integrate this ML model into the heuristic CSP code FUSE, and demonstrate usage across eleven compounds: eight known compounds, and three hypothetical. We hypothesise that the inclusion of ML generated crystal structures can yield two potential benefits to CSP:
1. As ML generated structures should be close to plausible crystal structures, they should be closer to minima on potential energy surfaces and so reduce the computational cost per structure.
2. ML generation models will on average produce structures which are closer to ground state structures and so will reduce the energy of the global minimum structure located by FUSE within a similar amount of compute time when compared to using random structure generation.
The crystal graph used in the original GN was defined by nodes ($v_i$), edges connecting nodes ($e_k$), and global attributes ($u$), representing atoms, pairs, and macroscopic attributes, respectively. The crystal graph was numerically represented by $G(\{v_i\}_{i=1:n_v}, \{e_k\}_{k=1:n_e}, u)$, where $v_i$ and $e_k$ are elemental and pair attributes, and $n_v$ and $n_e$ are the number of atoms and pairs in the cell. The GN model is equipped with elemental embeddings and pair features, which are learned during the training process on two benchmark databases: the open quantum materials database (OQMD)14,15 and Matbench (MatB).16 An embedding layer and a matrix were added to accommodate atomic attributes and pair connectivity, respectively. The GN model consisted of MEGNet17 layers and set2set layers to update the elemental and pair matrices. The GN model, trained separately on OQMD and MatB, resulted in two models, GN(OQMD) and GN(MatB), with the latter demonstrating superior performance in CSP despite having slightly higher mean absolute error (MAE) during training.
Two benchmark datasets, OQMD and MatB, were employed for GN model training and evaluation. Data cleaning was performed to ensure reliability and comparability, and the datasets were split into training, validation, and test sets following a consistent ratio. The trained GN models, GN(OQMD) and GN(MatB), were selected based on hyperparameter optimization to minimize errors between GN-predicted and density functional theory (DFT)-calculated formation enthalpies (ΔH) on the test set.
To enhance the efficiency of CSP, a symmetry constraint was incorporated, taking into account the observed prevalence of symmetry in experimental crystal structures, particularly at low temperatures. Additional structural features, crystal symmetry S and the occupancy of Wyckoff position Wi, were incorporated, allowing for CSP with symmetry constraints. Symmetry operations were chosen from the 229 space groups after P1, and Wyckoff positions were selected accordingly. The procedure ensured the generation of symmetrical crystal structures during CSP.
Three optimization algorithms (OAs) – random searching (RAS), particle-swarm optimization (PSO), and Bayesian optimization (BO) – were adopted for CSP. BO, specifically implemented with a Gaussian mixture model based on the tree of Parzen estimators, demonstrated superior performance in efficiently exploring the structural space. The GN-OA approach iteratively generated and evaluated crystal structures until convergence, with BO demonstrating superior performance due to its effective balance between exploitation and exploration.
Given its superior performance, we opted to retrain the GN(MatB)-BO model specifically for our use case. An illustration of the model architecture that was chosen to be trained is shown in Fig. 1. The retraining process involved the direct adoption of the GN model from the original paper, with no hyperparameter tuning. The training, conducted on a “Tesla A100” GPU with 2 GPU devices per node and utilizing 80 GB global memory, proceeded until the 479th iteration step, achieving a validation MAE of 0.034786 eV per atom. The original model was directly adopted without additional evaluation, aligning with the reported results from the original paper.
Fig. 1 (a) The GN model was trained using the MatBench benchmark dataset. The input atomic feature for the crystal graph is (b) embedded atomic number (1 to Nv) for each compositional atom (1 to nv) and the input pair feature is (c) Gaussian-expanded distance (1 to Ne) for each pair connecting atom i (1 to nv) and atom j (1 to nv). (d) The structural generation phase encompasses adding symmetry constraints, generating structures, and converting them into crystal graphs. (e) The GN model integrates embedded atomic numbers, Gaussian-expanded pair distances, MEGNet blocks, set2set layers for atomic numbers and pair distances, a concatenation layer, and a fully connected layer to derive the correlation model between a crystal and its formation enthalpy. (f) A Bayesian Optimization block is included. This figure was adapted from ref. 9. |
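The Gaussian-expanded distance used as the pair feature can be illustrated with a short sketch. The functional form below, exp(−(d − c)²/w²), is the common choice for this kind of expansion; the centres, the width and the function name are assumptions chosen for the example and are not taken from the trained model.

```python
import math

def gaussian_expand(distance, centers, width=0.5):
    """Expand a scalar inter-atomic distance (Å) onto a set of Gaussian
    basis functions, giving one feature per centre."""
    return [math.exp(-((distance - c) ** 2) / width ** 2) for c in centers]

# Hypothetical basis: 11 centres from 0 to 5 Å in steps of 0.5 Å
centers = [0.5 * i for i in range(11)]
features = gaussian_expand(2.0, centers)
```

A pair separated by 2.0 Å gives its largest feature at the centre located at 2.0 Å, with smoothly decaying values at neighbouring centres, so similar distances map to similar feature vectors.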
In the next section, we discuss the integration of this ML model with FUSE, elucidating how this combination enhances overall performance.
For FUSE to be able to use the output of generative ML models in place of randomly generated structures, the code required altering such that it is possible to decompose any arbitrary crystal structure into a set of constituent sub-modules that can then be used in the code's basin hopping routine. The ability to do this for any arbitrary crystal structure will greatly increase the flexibility of the sub-modules that FUSE is able to use.
The new implementation of the code (written in python 3) starts by computing the size of a sub-module based upon the atomic radii within the composition as described above. Then along each of the three unit cell axes, FUSE calculates the nearest integer number of sub-modules along each direction. The resulting sub-modules are then populated by extracting atoms from the starting structure according to their fractional atomic co-ordinates. The angles for the sub-module are inherited from the unit cell of the structure being broken down. The stacking direction is taken as the c axis of the parent crystal structure. For example, as shown in Fig. 2, FUSE calculates the size of a sub-module for Ca3Al2Si3O12 (ref. 18) to be 4.24 × 4.24 × 2.12 Å. This sub-module size equates to the full structure being broken down into 54 sub-modules, in a 3 × 3 × 6 grid. This corresponds to the structure being broken down into 6 modules, each comprising 3 × 3 × 1 sub-modules.
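The grid for the garnet example can be reproduced with a short calculation. The sketch below assumes a cubic lattice parameter of about 11.85 Å for Ca3Al2Si3O12, which is an approximate value supplied for illustration and not quoted in the text; the function name is also hypothetical.

```python
def submodule_grid(cell_lengths, submodule_size):
    """Nearest integer number of sub-modules along each unit cell axis
    (at least one along every axis)."""
    return tuple(max(1, round(length / size))
                 for length, size in zip(cell_lengths, submodule_size))

# Assumed cubic cell of ~11.85 Å; sub-module size as given in the text
grid = submodule_grid((11.85, 11.85, 11.85), (4.24, 4.24, 2.12))
n_submodules = grid[0] * grid[1] * grid[2]
n_modules = grid[2]                      # modules stack along the c axis
per_module = grid[0] * grid[1]           # sub-modules in each 3 x 3 x 1 module
```

With these inputs the ratios 11.85/4.24 ≈ 2.8 and 11.85/2.12 ≈ 5.6 round to 3 and 6, giving the 3 × 3 × 6 grid of 54 sub-modules in 6 modules.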
Fig. 2 (a) Example of FUSE2 selecting a module shape to slice the known garnet compound Ca3Al2Si3O12,18 based upon creating sub-modules with the lattice parameter 4.24 × 4.24 × 2.12 Å, derived from atomic radii as outlined in Methods. FUSE2 slices the structure with a grid of 3 × 3 × 6, yielding a total of 54 sub-modules within the structure. (b) The first nine sub-modules extracted from the structure of Ca3Al2Si3O12, illustrating the diversity of sub-modules when compared to those in the previous version of FUSE, restricted to one of only eight possible motifs. Atoms coloured as follows: Ca (green), Al (cyan), Si (blue), O (red). |
Therefore, when FUSE extracts a sub-module at position x, y, z on the grid for the example above, it uses all atoms whose fractional co-ordinates fall within that grid cell, which spans one third of the unit cell along a and b and one sixth along c. The crystal structure, now broken down into its constituent sub-modules, can then be evolved using a basin hopping routine similar to that in the original implementation of the code. The flowchart for the new implementation of FUSE is shown in Fig. 3.
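Assigning atoms to sub-modules by fractional co-ordinate can be sketched as below. The zero-based grid indices and the half-open ranges [i/n, (i+1)/n) along each axis are assumptions made for this example; the indexing convention used inside FUSE2 may differ.

```python
from collections import defaultdict

def slice_into_submodules(atoms, grid):
    """Group atoms by the grid cell containing their fractional co-ordinates.

    atoms: list of (symbol, (fx, fy, fz)); grid: (na, nb, nc).
    Returns a dict mapping (ix, iy, iz) to the atoms in that sub-module."""
    cells = defaultdict(list)
    for symbol, frac in atoms:
        index = []
        for f, n in zip(frac, grid):
            f = f % 1.0                           # wrap into [0, 1)
            index.append(min(int(f * n), n - 1))  # guard the upper edge against rounding
        cells[tuple(index)].append((symbol, frac))
    return cells

# Hypothetical atoms on the 3 x 3 x 6 grid of the garnet example
atoms = [("Ca", (0.1, 0.1, 0.05)), ("O", (0.5, 0.9, 0.99)), ("Si", (0.34, 0.0, 0.17))]
submodules = slice_into_submodules(atoms, (3, 3, 6))
```

Because the sub-modules are filled directly from the parent structure, their contents are not limited to a fixed set of motifs.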
Fig. 3 Flowchart detailing the workflow of FUSE2 as presented in this work. Red sections indicate where FUSE has been modified for this work. “Run ML model”: the trained gn-boss model presented in this work is run for the given composition, to generate potential crystal structures for use in the main CSP search. “Rank generated structures”: in order to remove non-physical structures which are generated (as a result of noise in the model), the generated structures are re-ranked using a statistical proxy potential19 (see Methods). Structures are then fed into the initial population for FUSE2 using this re-ranking, and the remaining structures from the ML generated population may be used later in the “Alter structure” stage. “Alter structure: positions/unit cell”: here the basin hopping move which would generate a random new structure is replaced by using remaining structures from those generated using the ML model. If none remain, FUSE2 reverts to using the original version of the random structure generator.
In the new implementation of FUSE, at the start of a structure prediction run, the code will run the ML model outlined in Section 2.1, generating a user specified number of crystal structures for each number of formula units specified by the user in the input file, using the specified optimizer and range of space groups. To remove un-physical structures generated by the ML models (generated as a result of noise within the model), an option to re-rank the generated structures has been included. In this work, ML generated structures are re-ranked using universal, statistical inter-atomic proxy potentials19 (SPP), derived from all ordered crystal structures within the ICSD. The structures which can be successfully computed at the ranking stage are then compiled into a pool of structures which the basin hopping routine can access. Once structures have been generated and ranked, in order to proceed with the basin hopping search routine used in FUSE, the code needs to assemble the initial population of structures. The population is assembled by taking the top x structures (where x is a user defined initial population) from the ML generated pool, which are then broken down into their sub-modular structures as outlined above, and their energies calculated using the user chosen method. If the ML generator fails to produce x structures which FUSE is able to rank, it will revert to generating additional structures using the original structure generation algorithm to complete the initial population. With the new implementation of FUSE, all crystal structures can now be symmetrised prior to any geometry optimisation using the python spglib library, as this reduces the occurrence of crystal structures with very flat shapes, which can be slow to compute with computational chemistry codes. After the structure has been parsed by spglib (if used), the local optimisation proceeds in P1, with no symmetry imposed.
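The ranking and population assembly logic described above can be sketched in a few lines. This is a schematic illustration, not the FUSE2 source: the function names, the use of `None` to mark a structure that the proxy potential could not evaluate, and the callable used as a fallback generator are all assumptions.

```python
def rank_pool(structures, proxy_energy):
    """Rank structures by a proxy energy, lowest first, discarding any
    structure for which the proxy potential returns None (not computable)."""
    scored = []
    for structure in structures:
        energy = proxy_energy(structure)
        if energy is not None:
            scored.append((energy, structure))
    scored.sort(key=lambda pair: pair[0])
    return [structure for _, structure in scored]

def assemble_initial_population(ranked_pool, population_size, random_generator):
    """Take the top structures from the ranked ML pool; if too few are
    available, top up with the original random generator. The unused ML
    structures are returned as a pool for later basin hopping moves."""
    population = list(ranked_pool[:population_size])
    remaining = list(ranked_pool[population_size:])
    while len(population) < population_size:
        population.append(random_generator())
    return population, remaining

# Hypothetical usage with labelled placeholder structures
energies = {"A": -1.0, "B": None, "C": -3.0, "D": -2.0}
pool = rank_pool(list(energies), energies.get)
population, leftover = assemble_initial_population(pool, 2, lambda: "random")
```

Keeping the leftover structures separate from the initial population is what allows them to be introduced later in the search.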
The basin hopping routine then progresses in the same manner as per the original implementation, with the following move types written to work with this implementation (presented in order in which they are in the code), the probability of the code using each move is definable by the user, by default there is an equal probability of using each:
1. Swap the position of two atoms: locate two atoms of different species within the crystal structure and swap their positions, this move is not used if the composition is elemental.
2. Swap an atom into a vacant space within the structure: first choose an atom in the structure, then locate a space within the crystal structure which is more than x Å from any other atom, and move it into this space. Where x is a user specified distance used within the code as a threshold for atom–atom contacts which are too short, the default value is 1 Å.
3. Swapping the position of i atoms, where n > i > 2 where n is the total number of atoms within the structure. This move is not used if the composition is elemental, or the structure contains fewer than 4 atoms.
4. Swapping the positions of i atoms, where n > i > 2, as with move 3. But allowing atoms to move into vacant spaces within the structure as defined in move 2.
5. Swap the positions of all of the atoms within the structure, this move is not used if the composition is elemental.
6. Swap the positions of all of the atoms within the structure, allowing for atoms to be moved into vacant positions as outlined in move 2.
7. Swap the position of two sub-modules: swap the location of two sub-modules from within the crystal structure.
8. Swap the position of two modules within the structure.
9. Generate a new unit cell shape for the sub-modules within the current structure. This move reshapes the sub-modules of the current structure into a new unit cell shape, with the shape randomly determined according to the original implementation of FUSE.
10. Mutation of the structure: inspired by genetic algorithms, the code selects a sub-module from the current structure and modifies the positions of the atoms within it. Currently, the only option is for the code to select a sub-module with <5 atoms, and move their fractional co-ordinates to match one of the original sub-module motifs from the original implementation of FUSE.
11. Double the current structure along one axis. Providing the current structure contains ≤ half the maximum number of atoms permitted in the input file, the structure will be doubled along one randomly selected crystallographic axis. For the new part of the crystal structure there is an even chance to: repeat the structure; translate the structure by 0.5 in fractional co-ordinates in the plane perpendicular to the direction chosen; invert the atomic co-ordinates about the centre of the plane through which the structure has been doubled; or mirror the structure through a mirror plane in the face through which the structure has been doubled.
12. Triple the crystal structure along one axis. Providing the number of atoms within the current structure is ≤ one third of the maximum atoms permitted in the input file, the code will triple the current structure along one crystallographic axis. When the structure is tripled, there is an equal probability to: repeat the structure along the chosen axis or to translate the atomic co-ordinates of each new set of atoms by one third in the plane perpendicular to the direction in which the structure is being extended.
13. Generate a random new structure up to the size of the current structure. The current structure is replaced by a new random structure, with an equal probability to select a structure from the pool of ML generated structures if unused structures are available, or to generate a new structure using the algorithm used in the original implementation of FUSE.
14. Generate a random new structure up to the maximum size permitted from the input file. An identical copy of move 13, with the maximum number of atoms raised to be equal to the maximum number of atoms within the input file.
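A few of the moves listed above are simple enough to sketch directly. The code below illustrates weighted move selection, move 1 (swapping two atoms of different species) and the plain "repeat" option of move 11. It is a simplified sketch with hypothetical function names and a minimal structure representation (a list of symbol and fractional position pairs), not the FUSE2 implementation.

```python
import random

def choose_move(rng, weights=None, n_moves=14):
    """Pick a move number from 1 to n_moves; equal probability by default,
    or according to user supplied weights."""
    if weights is None:
        weights = [1.0] * n_moves
    return rng.choices(range(1, n_moves + 1), weights=weights, k=1)[0]

def swap_two_atoms(atoms, rng):
    """Move 1: swap the positions of two atoms of different species.
    Returns a new list, or None if the composition is elemental."""
    if len({symbol for symbol, _ in atoms}) < 2:
        return None
    i = rng.randrange(len(atoms))
    j = rng.choice([k for k, (symbol, _) in enumerate(atoms) if symbol != atoms[i][0]])
    swapped = list(atoms)
    swapped[i] = (atoms[i][0], atoms[j][1])
    swapped[j] = (atoms[j][0], atoms[i][1])
    return swapped

def double_along_axis(atoms, cell_lengths, axis):
    """Move 11, 'repeat' option only: double the cell along one axis and
    copy every atom into the new half of the cell."""
    new_lengths = list(cell_lengths)
    new_lengths[axis] *= 2
    doubled = []
    for symbol, frac in atoms:
        for shift in (0.0, 0.5):
            f = list(frac)
            f[axis] = frac[axis] / 2 + shift  # rescale into the doubled cell
            doubled.append((symbol, tuple(f)))
    return doubled, tuple(new_lengths)

rng = random.Random(1)
move = choose_move(rng)
atoms = [("Ca", (0.0, 0.0, 0.0)), ("O", (0.5, 0.5, 0.5)), ("O", (0.5, 0.0, 0.0))]
trial = swap_two_atoms(atoms, rng)
```

The other options of move 11 (translation, inversion and mirroring of the new half) would differ only in how the co-ordinates of the copied atoms are transformed.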
This new implementation of FUSE is hereafter referred to as FUSE2.
For each of the compositions tested in this work, the same set of four experiments with FUSE2 have been performed:
1. “fgen”: as our baseline experiment, FUSE2 is run without using the ML model outlined in Section 2.1. All crystal structures are therefore generated as in the original version of the code, using the original random unit cell selection and sub-module motifs to populate the unit cell. Local optimisation is then only performed using VASP. This experiment is similar to structure prediction runs using the original implementation of FUSE.
2. “mlgen”: the initial population of crystal structures is generated using the ML model in 2.1. The ML generated structures are ranked using SPPs and the top x structures selected and broken down into their sub-modules. The remaining ML structures which do not form the initial population of structures then form a pool of structures which may be introduced into the search via moves 13 and 14 outlined above. Local optimisation is then only performed using VASP.
3. “fgen-SPP”: as experiment 1, but local optimisation is performed in two stages: (1) structures are locally optimised using SPPs and (2) the SPP optimised structure is then re-optimised using VASP, with the final energy taken from VASP.
4. “mlgen-SPP”: as experiment 2, but local optimisation is performed in two stages: (1) structures are locally optimised using SPPs and (2) the SPP optimised structure is then re-optimised using VASP, with the final energy taken from VASP.
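The four experiments differ only in how structures are generated and whether an SPP pre-optimisation precedes VASP, which can be summarised in a small configuration table. The class and field names below are hypothetical and are not FUSE2 input keywords.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Experiment:
    name: str
    use_ml_generation: bool        # True: ML model seeds the search
    pre_optimiser: Optional[str]   # None: local optimisation with VASP only
    final_optimiser: str = "VASP"  # final energies are always taken from VASP

EXPERIMENTS = [
    Experiment("fgen", use_ml_generation=False, pre_optimiser=None),
    Experiment("mlgen", use_ml_generation=True, pre_optimiser=None),
    Experiment("fgen-SPP", use_ml_generation=False, pre_optimiser="SPP"),
    Experiment("mlgen-SPP", use_ml_generation=True, pre_optimiser="SPP"),
]

def optimisation_stages(experiment):
    """Ordered list of local optimisation stages for an experiment."""
    stages = [experiment.final_optimiser]
    if experiment.pre_optimiser is not None:
        stages.insert(0, experiment.pre_optimiser)
    return stages
```

Laid out this way, the design is a 2 × 2 comparison: generation method (random or ML) against optimisation route (VASP only or SPP followed by VASP).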
For each of the experiments performed in this work, the same approximate quantity of computing time has been allocated; note that this means that due to the differences in the compute time required per structure (see below), each experiment will not have explored the same number of individual crystal structures. In total, for the experiments below approximately 1.5M core hours were used, equally distributed among all experiments, with all experiments for a given composition performed on the same high performance computing cluster to ensure a fair comparison between experiments. For all of the timings discussed below where the ML structure generation is used, the computational cost of running the ML model for the initial structure generation is not factored in, as this is a constant value and the amount this contributes towards the overall runtime is minimal. In this work, the ML structure generation and ranking for each run of the code uses between 2 and 16 core hours (approximately 600 core hours in total), compared to the total CPU time budget of 1.5M core hours.
For each of the compositions described below (see Tables 1 and 3), a maximum limit of 50 atoms per structure was used, with an initial population of 25 structures. For each experiment, three independent runs of FUSE2 were performed, replicating a typical use case, and the lowest energy from the combined results of each experiment is reported. For timings, the run times across all three runs are aggregated, along with the total number of structures, to produce a mean run time. For all but one of the FUSE2 experiments described below, where ML generated structures were used, 5000 structures were generated per formula unit, using the Bayesian optimisation search which is integrated into the ML model (see Section 2.1) with only triclinic symmetries used. For each composition, the ML model was run for up to 8 formula units, or the highest number of formula units which does not exceed the set limit of 50 atoms. In the case of Si6Al6B3Fe3NaO30F, where only one formula unit could be used, 7500 structures were generated.
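The number of formula units available to the ML model for each composition follows from the 50 atom limit and the cap of 8. The sketch below treats the limit as inclusive, an interpretation based on Si6Al6B3Fe3NaO30F containing exactly 50 atoms per formula unit and being run with one formula unit; the function names and the simple formula parser (no brackets) are assumptions for the example.

```python
import re

def atoms_per_formula_unit(formula):
    """Count atoms in a simple formula such as 'Ca3Ti2O7' (no brackets)."""
    total = 0
    for _symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += int(count) if count else 1
    return total

def max_formula_units(formula, max_atoms=50, cap=8):
    """Largest number of formula units allowed by the atom limit, capped at 8."""
    return min(cap, max_atoms // atoms_per_formula_unit(formula))

for formula in ["Ca3Ti2O7", "Mn", "Pb5As3O12Cl", "Si6Al6B3Fe3NaO30F"]:
    print(formula, atoms_per_formula_unit(formula), max_formula_units(formula))
```

Under these assumptions Ca3Ti2O7 (12 atoms) allows up to 4 formula units, Pb5As3O12Cl (21 atoms) allows 2, and the elemental Mn case is limited by the cap of 8 rather than by the atom count.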
Where SPPs are used, both in ranking and in pre-optimising structures prior to optimisation with DFT, all structures were relaxed to their nearest minimum, with forces optimised until the normalised gradient of the forces was < 0.1, computed using GULP.28 DFT calculations were performed for all structures within searches, using the density functional theory code VASP13 with conventional PBE pseudo-potentials,29 for a total of 220 steps or until the forces are below 0.03 eV Å−1. Γ-centred k-point grids are generated using the “KSPACING” setting in VASP, with a value of 0.3 at the final step of the calculation. The plane-wave cutoff was set for each composition to be 1.3× the maximum plane wave energy from the pseudo-potentials.
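The composition dependent plane-wave cutoff can be written as a one-line rule. In the sketch below, the per-element maximum energies are illustrative placeholder values rather than numbers taken from this work, and the mapping of the stated settings onto the VASP tags `ENCUT`, `KSPACING` and `EDIFFG` is an assumption about how they would be expressed (a negative `EDIFFG` denotes a force-based convergence criterion in eV Å−1).

```python
def plane_wave_cutoff(enmax_by_element, factor=1.3):
    """Cutoff (eV) as factor times the largest pseudo-potential maximum energy."""
    return factor * max(enmax_by_element.values())

def final_stage_settings(enmax_by_element):
    """Settings for the final relaxation stage, using the values stated in the text."""
    return {
        "ENCUT": plane_wave_cutoff(enmax_by_element),
        "KSPACING": 0.3,   # value at the final step of the calculation
        "EDIFFG": -0.03,   # stop when forces fall below 0.03 eV/Å
    }

# Placeholder maximum energies (eV) for a hypothetical Ca-Ti-O calculation
settings = final_stage_settings({"Ca": 267.0, "Ti": 178.0, "O": 400.0})
```

Taking the maximum over all elements means the element with the hardest pseudo-potential sets the cutoff for the whole composition.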
For each of the four experiments described above, two key performance metrics were used: the mean time taken to obtain an energy for each individual structure and the lowest energy obtained from any run of each experiment.
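Table 1 Speed-up in mean compute time per structure for each experiment, relative to the baseline fgen experiment, for the eight known compounds

Composition | fgen-SPP | mlgen | mlgen-SPP |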
---|---|---|---|
Ca3Ti2O7 | 1.3 | 1.4 | 2.9 |
CoAs2 | 0.4 | 3.9 | 2.0 |
Cu7S4 | 0.5 | 1.9 | 1.9 |
Mn | 1.4 | 0.9 | 1.5 |
Pb5As3O12Cl | 3.3 | 2.2 | 5.3 |
Si6Al6B3Fe3NaO30F | 2.1 | 1.8 | 3.3 |
WCl2 | 1.5 | 2.2 | 8.3 |
YWB4 | 0.7 | 0.1 | 0.5 |
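The timing metric can be made concrete with a short sketch. It assumes that the mean time per structure is the aggregated run time of the three runs divided by the aggregated number of structures, and that the speed-up factor is the baseline fgen mean time divided by the mean time of the experiment in question. The run times and structure counts below are invented for illustration only.

```python
def mean_time_per_structure(run_times, structure_counts):
    """Aggregate run times and structure counts over all runs of an experiment."""
    return sum(run_times) / sum(structure_counts)

def speed_up(baseline_mean, experiment_mean):
    """Speed-up factor relative to the baseline; values below 1 are slower."""
    return baseline_mean / experiment_mean

# Invented example: three runs each of a baseline and a faster experiment
baseline = mean_time_per_structure([100.0, 120.0, 80.0], [10, 12, 8])  # 10.0 per structure
faster = mean_time_per_structure([100.0, 100.0, 100.0], [40, 50, 30])  # 2.5 per structure
factor = speed_up(baseline, faster)  # 4.0
```

Because every experiment receives the same compute budget, a speed-up factor above 1 also means that more structures were evaluated within that budget.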
The second metric used to compare the different experiments is the lowest energy which was obtained from each experiment as shown in Table 2. For all eight compositions, the lowest energies were obtained from experiments starting from crystal structures generated using the ML model. For two compositions Cu7S4 and Mn both the mlgen and mlgen-SPP experiments obtained the same energy, and for Pb5As3O12Cl and CoAs2 the lowest energies were obtained by the mlgen experiment. For the compositions Ca3Ti2O7, Si6Al6B3Fe3NaO30F and WCl2 the lowest energy was obtained by the mlgen-SPP experiment. For YWB4, both the baseline fgen and mlgen-SPP experiments have obtained equal energies.
Table 2 Lowest energy (eV per atom) obtained from each experiment for the eight known compounds

Composition | fgen | fgen-SPP | mlgen | mlgen-SPP |
---|---|---|---|---|
Ca3Ti2O7 | −7.69 | −7.64 | −7.69 | −7.73 |
CoAs2 | −5.55 | −5.43 | −5.62 | −5.57 |
Cu7S4 | −3.97 | −4.01 | −4.03 | −4.03 |
Mn | −8.93 | −8.92 | −8.93 | −8.93 |
Pb5As3O12Cl | −5.31 | −5.55 | −5.59 | −5.56 |
Si6Al6B3Fe3NaO30F | −7.03 | −7.09 | −7.01 | −7.37 |
WCl2 | −5.94 | −6.12 | −6.03 | −6.14 |
YWB4 | −8.14 | −8.12 | −7.79 | −8.14 |
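The comparison of lowest energies can be checked mechanically from the tabulated values. The sketch below transcribes the energies for the eight known compounds and lists, for each composition, every experiment that reaches the minimum at the two decimal places reported; the variable and function names are hypothetical.

```python
EXPERIMENT_NAMES = ("fgen", "fgen-SPP", "mlgen", "mlgen-SPP")

# Lowest energies (eV per atom) transcribed from the table above
LOWEST_ENERGIES = {
    "Ca3Ti2O7": (-7.69, -7.64, -7.69, -7.73),
    "CoAs2": (-5.55, -5.43, -5.62, -5.57),
    "Cu7S4": (-3.97, -4.01, -4.03, -4.03),
    "Mn": (-8.93, -8.92, -8.93, -8.93),
    "Pb5As3O12Cl": (-5.31, -5.55, -5.59, -5.56),
    "Si6Al6B3Fe3NaO30F": (-7.03, -7.09, -7.01, -7.37),
    "WCl2": (-5.94, -6.12, -6.03, -6.14),
    "YWB4": (-8.14, -8.12, -7.79, -8.14),
}

def best_experiments(energies):
    """Names of all experiments tied for the lowest tabulated energy."""
    lowest = min(energies)
    return [name for name, e in zip(EXPERIMENT_NAMES, energies) if e == lowest]

for composition, energies in LOWEST_ENERGIES.items():
    print(composition, best_experiments(energies))
```

At this precision every composition has at least one ML seeded experiment among those reaching the lowest energy, with ties involving the baseline fgen experiment for Mn and YWB4.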
For each of the three hypothetical compounds tested in this work, which are un-reported experimentally, the mlgen-SPP experiment showed the largest speed-up, with the smallest being for Li2Sn2S3Cl4 with a speed-up factor of 3.7 (Table 3). The largest speed-up was for the composition Li4OBrCl, with a speed-up factor of 6.3. For two of the compositions in this section, Li2Sn2S3Cl4 and Li3Si3O5Cl5, the lowest energy structure was obtained by the mlgen-SPP experiment. The lowest energy structures from the mlgen-SPP experiments are shown in Fig. 5. For Li4OBrCl, the lowest energy was obtained by all four experiments (Table 4).
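Table 3 Speed-up in mean compute time per structure for each experiment, relative to the baseline fgen experiment, for the three hypothetical compounds

Composition | fgen-SPP | mlgen | mlgen-SPP |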
---|---|---|---|
Li2Sn2S3Cl4 | 1.0 | 2.3 | 3.7 |
Li3Si3O5Cl5 | 1.5 | 3.1 | 5.0 |
Li4OBrCl | 5.3 | 4.6 | 6.3 |
Fig. 5 The minimum energy structures obtained for the three hypothetical compounds tested in this work, obtained from the mlgen-SPP experiment. |
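Table 4 Lowest energy (eV per atom) obtained from each experiment for the three hypothetical compounds

Composition | fgen | fgen-SPP | mlgen | mlgen-SPP |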
---|---|---|---|---|
Li2Sn2S3Cl4 | −3.66 | −3.73 | −3.78 | −3.80 |
Li3Si3O5Cl5 | −5.20 | −4.85 | −5.50 | −5.55 |
Li4OBrCl | −4.05 | −4.05 | −4.05 | −4.05 |
For these four experiments, 24.6K CPU hours of compute time were used, with each generator using the same amount of CPU time within each composition. For the ChgNET part of the structure relaxations, the structure was relaxed until the maximum force acting on the atoms was less than 0.05 eV Å−1, or 2000 ionic steps had passed. The resulting structures were then re-optimized using VASP as outlined above. The remaining setup of the experiment then remained the same as for the fgen-SPP and mlgen-SPP experiments listed above, with the exception of the ranking of the ML generated structures, for which only single point calculations were used with ChgNET. This was a result of preliminary testing, where attempting relaxations on ML generated structures failed due to isolated atoms within some structures (defined within ChgNET as having a distance of greater than 6 Å to its nearest neighbour), with this failure often resulting in ChgNET crashing and making FUSE2 runs unstable. In future versions of FUSE we will include filters prior to structures being passed to ChgNET to avoid this issue, allowing for ML generated structures to be optimised at this stage. The results for these experiments are shown in Table 5. It was observed that in both cases the mlgen-ChgNET experiment increases the speed of optimising crystal structures, by factors of 5 and 1.7 for the Li2Sn2S3Cl4 and Li3Si3O5Cl5 compositions respectively, relative to the fgen-ChgNET experiments. It was also observed that all four of the experiments using ChgNET to pre-optimise structures resulted in lower energy structures (structures shown in Fig. 6). This result, when combined with the observations in the previous section, continues to demonstrate that the use of ML based structure generation with heuristic structure prediction methods results in a significant increase in the speed of calculations.
Our observations with using both a data driven potential (SPPs) and machine learnt potential (ChgNET), additionally suggest that the speed increases are generally complementary to each other rather than in competition.
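Table 5 Speed-up of the mlgen-ChgNET experiment relative to the fgen-ChgNET experiment, and the lowest energies obtained from each

Composition | mlgen-ChgNET speed-up (relative to fgen-ChgNET) | fgen-ChgNET energy (eV per atom) | mlgen-ChgNET energy (eV per atom) |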
---|---|---|---|
Li2Sn2S3Cl4 | 5.0 | −3.79 | −3.80 |
Li3Si3O5Cl5 | 1.7 | −5.57 | −5.58 |
Fig. 6 The lowest energy structures obtained in experiments running FUSE2 using ChgNET30 in place of SPPs, combined with VASP for the compositions Li2Sn2S3Cl4 and Li3Si3O5Cl5. |
Examining the results of the experiments in terms of the energies obtained, for the majority of experiments, the lowest energy structure was obtained by an experiment using the ML structure generation. This is with the exception of the composition YWB4, where both the fgen and mlgen-SPP experiments were tied and Li4OBrCl where all four experiments obtained the same energy minimum. The largest difference comes for the composition Si6Al6B3Fe3NaO30F, where the mlgen-SPP experiment obtains an energy 338 meV per atom lower than obtained by the baseline fgen experiment.
The experiments using the ML structure generation in this work demonstrate the benefits of integrating ML structure generation to create a combined CSP method which is superior to using purely ML structure prediction or heuristic methods. In 58 out of the 66 calculations making up the main mlgen based experiments in the Results section (11 compositions, with 6 mlgen runs per composition) the lowest energy structure is not obtained in the starting generation of structures generated with ML; it is arrived at by the basin hopping within FUSE2, demonstrating the benefits of combining ML structure generation with heuristic CSP to obtain lower energy structures for a given composition (Table 6). Of all 66 runs, the mean improvement over the ML generated starting structures is 155 meV per atom, with a standard deviation of 209 meV per atom. The large standard deviation comes from the wide range of improvements, from the 8 runs where the improvement is zero to the largest improvement, for one run of WCl2 in the mlgen-SPP experiment, where the basin hopping improved the energy from the initial population by 969 meV per atom. An example is the composition Ca3Ti2O7 in the mlgen-SPP experiment, where the lowest energy structure from the initial population generated by ML has an energy of −7.61 eV per atom; during the basin hopping routine, FUSE2 then reduces the energy to −7.73 eV per atom, a reduction of 112 meV per atom, shown in Fig. 7. We have also implemented the recently developed machine learnt potential ChgNET, demonstrating that the speed advantage between the two structure generation methods presented in this work is maintained. Using our ML structure generation and ChgNET, we obtained lower energies than were obtained in the main study in this work, demonstrating the advantages of combining ML structure generation, data derived potentials and heuristic CSP methods.
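Table 6 Mean energy difference (meV per atom) between the lowest energy structure in the ML generated initial population and the lowest energy structure located by FUSE2, for the mlgen and mlgen-SPP experiments

Composition | mlgen mean energy difference | mlgen-SPP mean energy difference |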
---|---|---|
Ca3Ti2O7 | −34 | −92 |
CoAs2 | −191 | −150 |
Cu7S4 | −3 | −4 |
Mn | −29 | −207 |
Pb5As3O12Cl | −198 | −225 |
Si6Al6B3Fe3NaO30F | −22 | −214 |
WCl2 | −182 | −928 |
YWB4 | 0 | −224 |
Li2Sn2S3Cl4 | −54 | −70 |
Li3Si3O5Cl5 | −213 | −340 |
Li4OBrCl | −22 | −19 |
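The tabulated per-composition means can be cross-checked against the overall mean improvement quoted in the text. The sketch below assumes that each entry is a mean over the same number of runs (three), so that the mean of the 22 entries equals the mean over all 66 runs; the variable names are hypothetical and the values are transcribed from the table above.

```python
from statistics import mean

# Mean energy differences (meV per atom): (mlgen, mlgen-SPP)
MEAN_DIFFERENCES = {
    "Ca3Ti2O7": (-34, -92),
    "CoAs2": (-191, -150),
    "Cu7S4": (-3, -4),
    "Mn": (-29, -207),
    "Pb5As3O12Cl": (-198, -225),
    "Si6Al6B3Fe3NaO30F": (-22, -214),
    "WCl2": (-182, -928),
    "YWB4": (0, -224),
    "Li2Sn2S3Cl4": (-54, -70),
    "Li3Si3O5Cl5": (-213, -340),
    "Li4OBrCl": (-22, -19),
}

all_values = [v for pair in MEAN_DIFFERENCES.values() for v in pair]
overall_mean = mean(all_values)  # -155.5 meV per atom
```

The result, −155.5 meV per atom, agrees with the mean improvement of 155 meV per atom stated for all 66 runs. The standard deviation cannot be reproduced this way, since it depends on the individual run values rather than the per-composition means.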
In conclusion, we have demonstrated how data driven approaches to materials modelling can be integrated with conventional heuristic CSP methods to create a new implementation of our CSP method FUSE. Our results overall demonstrate that the integration of ML structure generation with heuristic CSP results in a speed up of up to 8 times vs. our baseline experiment and is able to obtain lower energy structures. As part of our demonstration, we have curated a set of eleven compounds (eight known, three hypothetical), which present a challenge to modern CSP methods, and so can be used as a benchmark set for the wider structure prediction community. We have demonstrated that FUSE2 provides a CSP platform that uses both data driven structure generation and data derived potentials, written in python, creating an open platform where researchers can readily integrate new models. As FUSE2 is developed in the future, we envision the development of better ML models to generate our initial structures, either through tuning the existing model or through the adoption or development of new ML structure generators, as well as the implementation of new ML interatomic potentials as they become available. Additional improvements are envisioned to the core code through the development of new move options for the basin hopping routine, through the implementation of reinforcement learning to guide the code on the weights to use for each of the available move types, as we have recently implemented for the original version of the code,31 and through the integration of mathematical optimisation to contribute to the construction of structures and determine when to stop a calculation.12 FUSE2 will enable accelerated CSP across solid state inorganic chemistry, which will only continue to improve as new and better data driven ML models for inorganic chemistry are created alongside future development of the code.
This journal is © The Royal Society of Chemistry 2025