Modeling molecular ensembles with gradient-domain machine learning force fields†
Abstract
Gradient-domain machine learning (GDML) force fields have shown excellent accuracy, data efficiency, and applicability for molecules with hundreds of atoms, but the global descriptor they employ limits transferability to ensembles of molecules. Many-body expansions (MBEs) should provide a rigorous procedure for size-transferable GDML by training models on fundamental n-body interactions. We developed many-body GDML (mbGDML) force fields for water, acetonitrile, and methanol by training 1-, 2-, and 3-body models on only 1000 MP2/def2-TZVP calculations each. Our mbGDML force fields include intramolecular flexibility and intermolecular interactions, provided that the reference data adequately describe these effects. Energy and force predictions for clusters containing up to 20 molecules are within 0.38 kcal mol⁻¹ per monomer and 0.06 kcal (mol Å)⁻¹ per atom of reference supersystem calculations. This deviation arises in part from restricting the mbGDML model to 3-body interactions. GAP and SchNet models in the same MBE framework achieved similar accuracies but occasionally showed abnormally high errors of up to 17 kcal mol⁻¹. NequIP trained on total energies and forces of trimers incurred much larger energy errors (at least 15 kcal mol⁻¹) as the number of monomers increased, demonstrating the effectiveness of size transferability with MBEs. Given these approximations, our automated mbGDML training schemes also produced fair agreement with reference radial distribution functions (RDFs) of bulk solvents. These results highlight the value of mbGDML for modeling explicitly solvated systems with quantum-mechanical accuracy.
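As a brief sketch of the truncation underlying this approach (standard MBE notation, assumed here rather than reproduced from the article), the total energy of a system of N monomers is decomposed into 1-, 2-, and 3-body contributions, each learned by a separate GDML model:

```latex
% Standard many-body expansion (MBE) of the total energy of N monomers,
% truncated at the 3-body level as in mbGDML (notation assumed, not
% taken verbatim from the article).
E_{\mathrm{total}} \approx
    \sum_{i}^{N} E^{(1)}_{i}
  + \sum_{i<j}^{N} \Delta E^{(2)}_{ij}
  + \sum_{i<j<k}^{N} \Delta E^{(3)}_{ijk},
\qquad \text{where} \qquad
\Delta E^{(2)}_{ij} = E_{ij} - E_i - E_j,
% and the 3-body term removes all lower-order contributions:
\Delta E^{(3)}_{ijk} = E_{ijk}
  - \Delta E^{(2)}_{ij} - \Delta E^{(2)}_{ik} - \Delta E^{(2)}_{jk}
  - E_i - E_j - E_k.
```

Because the models are trained only on monomer, dimer, and trimer data, predictions for arbitrarily large clusters reduce to summing these low-order contributions, which is the source of the size transferability discussed above.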