Using physical property surrogate models to perform accelerated multi-fidelity optimization of force field parameters†
Abstract
Accurate representations of van der Waals dispersion–repulsion interactions play an important role in high-quality molecular dynamics simulations. Training the parameters of the Lennard-Jones (LJ) potential typically used to represent these interactions is challenging, generally requiring adjustment based on simulations of macroscopic physical properties. The large computational expense of these simulations, especially when many parameters must be trained simultaneously, limits the size of the training data set and the number of optimization steps that can be taken, often requiring modelers to perform optimizations within a local parameter region. To allow for more global LJ parameter optimization against large training sets, we introduce a multi-fidelity optimization technique that uses Gaussian process surrogate modeling to build inexpensive models of physical properties as a function of LJ parameters. This approach allows for fast evaluation of approximate objective functions, greatly accelerating searches over parameter space and enabling the use of optimization algorithms capable of searching more globally. In this study, we use an iterative framework that performs global optimization with differential evolution at the surrogate level, followed by validation at the simulation level and surrogate refinement. Using this technique on two previously studied training sets, containing up to 195 physical property targets, we refit a subset of the LJ parameters for the OpenFF 1.0.0 (Parsley) force field. We demonstrate that this multi-fidelity technique can find improved parameter sets compared to a purely simulation-based optimization by searching more broadly and escaping local minima. Additionally, this technique often finds significantly different parameter minima that have comparably accurate performance. In most cases, these parameter sets are transferable to other similar molecules in a test set.
Our multi-fidelity technique provides a platform for rapid, more global optimization of molecular models against physical properties, and it leaves a number of opportunities for further refinement of the technique.
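
The iterative loop summarized in the abstract (surrogate fit, global search with differential evolution at the surrogate level, validation at the simulation level, surrogate refinement) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: `expensive_objective` is a toy analytic stand-in for a simulation-based objective, the bounds on the two LJ parameters are hypothetical, and the Gaussian process is a bare-bones NumPy implementation with a fixed RBF length scale rather than the surrogate models used by the authors.

```python
import numpy as np
from scipy.optimize import differential_evolution

# Hypothetical search bounds for two LJ parameters: epsilon, sigma (illustrative only).
BOUNDS = np.array([[0.05, 0.50], [3.0, 4.0]])


def expensive_objective(theta):
    """Toy stand-in for a simulation-level objective (NOT a real simulation)."""
    eps, sigma = theta
    smooth = ((eps - 0.25) / 0.25) ** 2 + 50.0 * ((sigma - 3.4) / 3.4) ** 2
    ripple = 0.05 * np.sin(20.0 * eps) * np.sin(10.0 * sigma)  # adds local minima
    return smooth + ripple


def scale(theta):
    """Map parameters to the unit square so one length scale fits both."""
    return (np.atleast_2d(theta) - BOUNDS[:, 0]) / (BOUNDS[:, 1] - BOUNDS[:, 0])


def rbf(a, b, length=0.3):
    """Squared-exponential kernel with a fixed (assumed) length scale."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length**2)


def fit_gp(X, y, jitter=1e-6):
    """Return the GP posterior mean as a cheap surrogate objective."""
    Xs, mean = scale(X), y.mean()
    K = rbf(Xs, Xs) + jitter * np.eye(len(Xs))
    alpha = np.linalg.solve(K, y - mean)
    return lambda theta: float((rbf(scale(theta), Xs) @ alpha)[0] + mean)


def run_multifidelity(n_init=8, n_iter=5, seed=0):
    rng = np.random.default_rng(seed)
    # Initial design: a few expensive evaluations spread over the bounds.
    X = rng.uniform(BOUNDS[:, 0], BOUNDS[:, 1], size=(n_init, 2))
    y = np.array([expensive_objective(t) for t in X])
    for _ in range(n_iter):
        surrogate = fit_gp(X, y)  # low-fidelity model
        result = differential_evolution(
            surrogate, bounds=[tuple(b) for b in BOUNDS], maxiter=50
        )  # global search on the surrogate only
        y_new = expensive_objective(result.x)  # validate at high fidelity
        X = np.vstack([X, result.x])  # refine the surrogate's training set
        y = np.append(y, y_new)
    return X, y


thetas, objs = run_multifidelity()
best = thetas[np.argmin(objs)]
print("best parameters:", best, "objective:", objs.min())
```

In this sketch the expensive function is called only once per iteration, while the differential evolution search makes its many evaluations against the cheap surrogate. A more realistic setup would model each physical property with its own surrogate and fit the kernel hyperparameters rather than fixing them.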