Issue 43, 2021

Graphical Gaussian process regression model for aqueous solvation free energy prediction of organic molecules in redox flow batteries

Abstract

The solvation free energy of organic molecules is a critical parameter in determining emergent properties such as solubility, liquid-phase equilibrium constants, pKa and redox potentials in an organic redox flow battery. In this work, we present a machine learning (ML) model that learns and predicts the aqueous solvation free energy of an organic molecule using Gaussian process regression with a new molecular graph kernel. To assess how well the ML model captures the electrostatic interaction, the nonpolar contribution of the solvent and the conformational entropy of the solute in the solvation free energy, we test three data sets that differ in whether the water solvent model is implicit or explicit and in whether the conformational entropy of the solute is included. We demonstrate that our ML model predicts the solvation free energy of molecules at chemical accuracy, with a mean absolute error of less than 1 kcal mol⁻¹, for subsets of the QM9 dataset and the FreeSolv database. To address the general data scarcity problem for graph-based ML models, we propose a dimension reduction algorithm based on the distance between molecular graphs, which can be used to examine the diversity of a molecular data set. It provides a promising way to build a minimum training set that improves prediction for test sets where the space of molecular structures is predetermined.
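
As a rough illustration of the two ingredients named in the abstract, the following Python sketch shows Gaussian process regression driven by a precomputed kernel matrix, together with the standard kernel-induced distance between items. It is not the authors' implementation: the function names `gp_predict` and `kernel_distance`, the noise value, and the small kernel matrix and target values are placeholders invented for illustration. In practice the matrix `K` would come from a molecular graph kernel evaluated on pairs of molecules, and the distance used in the paper may be defined differently.

```python
import numpy as np


def gp_predict(K_train, y_train, K_cross, noise=1e-6):
    """Posterior mean of a zero-mean GP from precomputed kernel matrices.

    K_train: (n, n) kernel values among training items.
    y_train: (n,) training targets.
    K_cross: (m, n) kernel values between test and training items.
    noise:   assumed observation-noise variance added to the diagonal.
    """
    n = K_train.shape[0]
    # Cholesky factorisation of the regularised kernel matrix.
    L = np.linalg.cholesky(K_train + noise * np.eye(n))
    # Solve (K + noise * I) alpha = y with two triangular-system solves.
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    return K_cross @ alpha


def kernel_distance(K):
    """Kernel-induced distance d_ij = sqrt(K_ii + K_jj - 2 * K_ij)."""
    diag = np.diag(K)
    d2 = diag[:, None] + diag[None, :] - 2.0 * K
    # Clip tiny negative values caused by floating-point round-off.
    return np.sqrt(np.clip(d2, 0.0, None))


# Placeholder kernel matrix for three hypothetical molecules (NOT real data).
K = np.array([
    [1.0, 0.5, 0.2],
    [0.5, 1.0, 0.3],
    [0.2, 0.3, 1.0],
])
# Made-up target values standing in for solvation free energies (kcal/mol).
y = np.array([-3.0, -5.0, -1.0])

# Train on the first two items, predict the third.
pred = gp_predict(K[:2, :2], y[:2], K[2:, :2])
print(pred)

# Pairwise distances; large values indicate structurally dissimilar items.
print(kernel_distance(K))
```

A distance matrix of this kind is one plausible input for examining how diverse a data set is, for example by embedding it in a low-dimensional space or by selecting training items that are far apart.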

Article information

Article type: Paper
Submitted: 29 Sep 2021
Accepted: 25 Oct 2021
First published: 28 Oct 2021

Phys. Chem. Chem. Phys., 2021, 23, 24892-24904

P. Gao, X. Yang, Y. Tang, M. Zheng, A. Andersen, V. Murugesan, A. Hollas and W. Wang, Phys. Chem. Chem. Phys., 2021, 23, 24892, DOI: 10.1039/D1CP04475C
