Accurate and efficient machine learning interatomic potentials for finite temperature modelling of molecular crystals†
Abstract
As in many areas of the natural sciences, machine learning is revolutionizing the modelling of molecular crystals, here through machine learning interatomic potentials (MLIPs). However, challenges remain for the accurate and efficient calculation of sublimation enthalpies, a key thermodynamic quantity measuring the stability of a molecular crystal. Specifically, two key stumbling blocks are: (i) the need for thousands of ab initio quality reference structures as training data; and (ii) the sometimes unreliable nature of density functional theory, the main technique for generating such data. Exploiting recent developments in foundation models for chemistry and materials science alongside accurate quantum diffusion Monte Carlo benchmarks offers a promising path forward. Herein, we demonstrate the generation of MLIPs capable of describing molecular crystals at finite temperature and pressure with sub-chemical accuracy, using as few as ∼200 training structures, an order-of-magnitude improvement over the current state of the art. We apply this framework to compute the sublimation enthalpies of the X23 dataset, accounting for anharmonicity and nuclear quantum effects, and achieve sub-chemical accuracy with respect to experiment. Importantly, we show that our framework can be generalized to crystals of pharmaceutical relevance, including paracetamol and aspirin. Nuclear quantum effects are also accurately captured, as shown for the case of squaric acid. By enabling accurate modelling at ambient conditions, this work paves the way for deeper insights into pharmaceutical and biological systems.