Exploring chemical and conformational spaces by batch mode deep active learning

Viktor Zaverkin; David Holzmüller; Ingo Steinwart; Johannes Kästner

doi:10.1039/D2DD00034B

Viktor Zaverkin,^a David Holzmüller,^b Ingo Steinwart^b and Johannes Kästner*^a

Author affiliations

* Corresponding authors

^a University of Stuttgart, Faculty of Chemistry, Institute for Theoretical Chemistry, Germany
E-mail: kaestner@theochem.uni-stuttgart.de

^b University of Stuttgart, Faculty of Mathematics and Physics, Institute for Stochastics and Applications, Germany

Abstract

The development of machine-learned interatomic potentials requires generating sufficiently expressive atomistic data sets. Active learning algorithms select data points on which labels, i.e., energies and forces, are calculated for inclusion in the training set. However, for batch mode active learning, i.e., when multiple data points are selected at once, conventional active learning algorithms can perform poorly. Therefore, we investigate algorithms specifically designed for this setting and show that they can outperform conventional algorithms. We investigate selection based on the informativeness, diversity, and representativeness of the resulting training set. We propose using gradient features specific to atomistic neural networks to evaluate the informativeness of queried samples, including several approximations allowing for their efficient evaluation. To avoid selecting similar structures, we present several methods that enforce the diversity and representativeness of the selected batch. Finally, we apply the proposed approaches to several molecular and periodic bulk benchmark systems and argue that they can be used to generate highly informative atomistic data sets by running any atomistic simulation.
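The idea of enforcing diversity within a selected batch can be illustrated with a minimal sketch. The code below is not the authors' implementation: it is a generic greedy farthest-point heuristic, and the function name `greedy_max_distance_batch` and its inputs are hypothetical. It assumes each structure has already been mapped to a fixed-length feature vector (plain NumPy arrays stand in here for the gradient features mentioned in the abstract) and that the training set is non-empty.

```python
import numpy as np


def greedy_max_distance_batch(pool_features, train_features, batch_size):
    """Greedily pick pool indices that lie far from everything covered so far.

    Hypothetical helper for illustration only. Each pick maximizes the squared
    Euclidean distance to its nearest neighbor among the training points and
    the earlier picks, so near-duplicate structures are not selected together.
    """
    pool = np.asarray(pool_features, dtype=float)
    train = np.asarray(train_features, dtype=float)

    # Squared distance from every pool point to its nearest training point.
    diffs = pool[:, None, :] - train[None, :, :]
    min_sq_dist = np.min(np.sum(diffs ** 2, axis=-1), axis=1)

    selected = []
    for _ in range(min(batch_size, len(pool))):
        idx = int(np.argmax(min_sq_dist))
        selected.append(idx)
        # The new pick now also counts as covered: update nearest distances.
        new_sq_dist = np.sum((pool - pool[idx]) ** 2, axis=1)
        min_sq_dist = np.minimum(min_sq_dist, new_sq_dist)
        # Exclude the pick itself from all later rounds.
        min_sq_dist[idx] = -np.inf
    return selected


# Toy data: two near-duplicate pairs and one isolated point.
pool = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [-4.0, 3.0]]
train = [[0.0, 0.0]]
batch = greedy_max_distance_batch(pool, train, batch_size=2)
```

In this toy example the two selected points come from different regions of feature space; the near-duplicate of the first pick is skipped. A purely uncertainty-ranked selection has no such mechanism, which is the failure mode for batch mode selection that the abstract describes.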

Article information

DOI: https://doi.org/10.1039/D2DD00034B
Article type: Paper
Submitted: 29 Apr 2022
Accepted: 11 Jul 2022
First published: 12 Jul 2022
This article is Open Access

Digital Discovery, 2022, 1, 605-620


V. Zaverkin, D. Holzmüller, I. Steinwart and J. Kästner, Digital Discovery, 2022, 1, 605 DOI: 10.1039/D2DD00034B

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
