Issue 5, 2022

Exploring chemical and conformational spaces by batch mode deep active learning

Abstract

The development of machine-learned interatomic potentials requires generating sufficiently expressive atomistic data sets. Active learning algorithms select data points on which labels, i.e., energies and forces, are calculated for inclusion in the training set. However, for batch mode active learning, i.e., when multiple data points are selected at once, conventional active learning algorithms can perform poorly. Therefore, we investigate algorithms specifically designed for this setting and show that they can outperform conventional algorithms. We investigate selection based on the informativeness, diversity, and representativeness of the resulting training set. We propose using gradient features specific to atomistic neural networks to evaluate the informativeness of queried samples, including several approximations allowing for their efficient evaluation. To avoid selecting similar structures, we present several methods that enforce the diversity and representativeness of the selected batch. Finally, we apply the proposed approaches to several molecular and periodic bulk benchmark systems and argue that they can be used to generate highly informative atomistic data sets by running any atomistic simulation.
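The diversity idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a generic greedy max-distance (farthest-point) selection over feature vectors, assuming each candidate structure has already been mapped to a fixed-length feature vector (for example, some form of gradient feature). The function name `select_batch`, the toy data, and the use of squared Euclidean distance are all assumptions made for illustration.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_batch(pool_features, train_features, batch_size):
    """Greedy max-distance batch selection (hypothetical helper, illustration only).

    Repeatedly picks the pool point that is farthest from everything already
    covered, i.e. the training set plus the points selected so far. Returns
    the indices of the selected pool points in selection order.
    """
    if batch_size > len(pool_features):
        raise ValueError("batch_size exceeds pool size")

    # Distance from each pool point to its nearest training point.
    # With an empty training set every point starts equally "far" (infinity),
    # so the first pick is simply the first pool point.
    min_dist = [
        min((squared_distance(p, t) for t in train_features), default=math.inf)
        for p in pool_features
    ]

    selected = []
    for _ in range(batch_size):
        candidates = [i for i in range(len(pool_features)) if i not in selected]
        best = max(candidates, key=lambda i: min_dist[i])
        selected.append(best)
        # The new point now counts as covered: shrink distances accordingly.
        for i, p in enumerate(pool_features):
            min_dist[i] = min(min_dist[i], squared_distance(p, pool_features[best]))
    return selected


# Toy data (assumed): two near-duplicate clusters and one isolated point.
pool = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [-4.0, 3.0]]
train = [[0.0, 0.0]]
print(select_batch(pool, train, batch_size=2))  # prints [3, 4]
```

Ranking the pool once by distance to the training set and taking the top two would return the near-duplicate points 3 and 2. The greedy update after each pick avoids this, which is the failure mode of conventional selection in the batch setting that the abstract refers to.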

Article information

Article type: Paper
Submitted: 29 Apr 2022
Accepted: 11 Jul 2022
First published: 12 Jul 2022
This article is Open Access
Creative Commons BY license

Digital Discovery, 2022, 1, 605-620

V. Zaverkin, D. Holzmüller, I. Steinwart and J. Kästner, Digital Discovery, 2022, 1, 605. DOI: 10.1039/D2DD00034B

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
