Facilitating ab initio QM/MM free energy simulations by Gaussian process regression with derivative observations†
Abstract
In combined quantum mechanical and molecular mechanical (QM/MM) free energy simulations, combining the accuracy of ab initio (AI) methods with the speed of semiempirical (SE) methods for a cost-effective QM treatment remains a long-standing challenge. In this work, we present a machine-learning-facilitated method for obtaining AI/MM-quality free energy profiles through efficient SE/MM simulations. In particular, we use Gaussian process regression (GPR) to learn the energy and force corrections needed for SE/MM to match AI/MM results during molecular dynamics simulations. Force matching is enabled in our model by including energy derivatives in the observational targets through the extended-kernel formalism. We demonstrate the effectiveness of this method on the solution-phase SN2 Menshutkin reaction using AM1/MM and B3LYP/6-31+G(d,p)/MM as the base and target levels, respectively. Trained on only 80 configurations sampled along the minimum free energy path (MFEP), the resulting GPR model reduces the average energy error in AM1/MM from 18.2 to 5.8 kcal mol⁻¹ for the 4000-sample testing set, and the average force error on the QM atoms from 14.6 to 3.7 kcal mol⁻¹ Å⁻¹. Free energy sampling with the GPR corrections applied (AM1-GPR/MM) produces a free energy barrier of 14.4 kcal mol⁻¹ and a reaction free energy of −34.1 kcal mol⁻¹, in closer agreement with the AI/MM benchmarks and experimental results.
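To make the extended-kernel idea concrete, the following is a minimal one-dimensional sketch, not the authors' implementation. It assumes a squared-exponential kernel with fixed (unoptimized) hyperparameters, and the functions `base_energy` and `target_energy` are hypothetical analytic stand-ins for the SE/MM and AI/MM levels. The kernel matrix is extended with first and mixed second derivatives of the kernel so that energy corrections and their gradients (forces are the negative gradients) are fitted together.

```python
import numpy as np


def kernel_blocks(xa, xb, sigma=1.0, length=1.0):
    """Squared-exponential kernel and its derivative blocks (1D inputs)."""
    d = xa[:, None] - xb[None, :]
    k = sigma**2 * np.exp(-0.5 * d**2 / length**2)
    k_fd = k * d / length**2                          # cov(f(xa), f'(xb))
    k_df = -k * d / length**2                         # cov(f'(xa), f(xb))
    k_dd = k * (1.0 / length**2 - d**2 / length**4)   # cov(f'(xa), f'(xb))
    return k, k_fd, k_df, k_dd


def extended_kernel(xa, xb, **hyper):
    """Joint covariance of [values; derivatives] at xa with those at xb."""
    k, k_fd, k_df, k_dd = kernel_blocks(xa, xb, **hyper)
    return np.block([[k, k_fd], [k_df, k_dd]])


def fit(x_train, y, dy, noise=1e-8, **hyper):
    """Solve (K_ext + noise * I) alpha = [y; dy] for the GPR weights."""
    K = extended_kernel(x_train, x_train, **hyper)
    K[np.diag_indices_from(K)] += noise
    return np.linalg.solve(K, np.concatenate([y, dy]))


def predict(x_test, x_train, alpha, **hyper):
    """Posterior mean of the correction and of its derivative at x_test."""
    out = extended_kernel(x_test, x_train, **hyper) @ alpha
    n = len(x_test)
    return out[:n], out[n:]


# Hypothetical toy levels of theory (assumptions for illustration only).
def base_energy(x):
    return 0.5 * x**2


def base_gradient(x):
    return x


def target_energy(x):
    return 0.5 * x**2 + np.sin(2.0 * x) + 0.5 * x


def target_gradient(x):
    return x + 2.0 * np.cos(2.0 * x) + 0.5


# Train on the energy and gradient differences between the two levels.
x_train = np.linspace(0.0, 3.0, 8)
de_train = target_energy(x_train) - base_energy(x_train)
dg_train = target_gradient(x_train) - base_gradient(x_train)
alpha = fit(x_train, de_train, dg_train)

# Apply the learned corrections on top of the base level.
x_test = np.linspace(0.0, 3.0, 50)
de_pred, dg_pred = predict(x_test, x_train, alpha)
e_corrected = base_energy(x_test) + de_pred
g_corrected = base_gradient(x_test) + dg_pred

energy_err = np.max(np.abs(e_corrected - target_energy(x_test)))
gradient_err = np.max(np.abs(g_corrected - target_gradient(x_test)))
print("max energy error:", energy_err)
print("max gradient error:", gradient_err)
```

In a real QM/MM setting the inputs would be multidimensional descriptors of the QM region and the derivative blocks would become matrices of partial derivatives, but the block structure of the extended kernel stays the same.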
- This article is part of the themed collection: 2022 PCCP HOT Articles