Expanded ensemble predictions of toluene–water partition coefficients in the SAMPL9 log P challenge
Abstract
The logarithm of the partition coefficient (log P) between water and a nonpolar solvent is useful for characterizing a small molecule's hydrophobicity. For example, the water–octanol log P is often used as a predictor of a drug's lipophilicity and/or membrane permeability, both of which are good indicators of its bioavailability. Existing computational predictors of water–octanol log P are generally very accurate due to the wealth of experimental measurements, but may be less so for other nonpolar solvents such as toluene. In this work, we participate in a Statistical Assessment of the Modeling of Proteins and Ligands (SAMPL) log P challenge to examine the accuracy of a molecular simulation-based absolute free energy approach for predicting water–toluene log P in a blind test of sixteen drug-like compounds with acid–base properties. Our simulation workflow used the OpenFF 2.0.0 force field and an expanded ensemble (EE) method for free energy estimation, which enables efficient parallelization over multiple distributed computing clients for enhanced sampling. The EE method uses Wang–Landau flat-histogram sampling to estimate the free energy of decoupling in each solvent, and can be performed in a single simulation. Our protocol also includes a step to optimize the schedule of alchemical intermediates in each decoupling. The results show that our EE workflow is able to accurately predict free energies of transfer, achieving an RMSD of 2.26 kcal mol⁻¹ (1.65 log P units) and an R² of 0.80. An examination of outliers suggests that improved force field parameters could achieve better accuracy. Overall, our results suggest that expanded ensemble free energy calculations provide reasonably accurate log P prediction for a general-purpose force field.
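The abstract reports the error both in kcal mol⁻¹ and in log P units. The relation between the two can be sketched as below: a free energy of transfer is divided by RT ln 10 to give log P. This is a minimal illustration, not the authors' code. The function names, the sign convention (solvation free energies, with more favorable solvation in toluene giving a positive log P), and the default temperature of 300 K are all assumptions made for the example.

```python
import math

# Gas constant in kcal mol^-1 K^-1
R_KCAL = 1.987204e-3


def kcal_to_logp_units(dg_kcal, temperature=300.0):
    """Convert a free energy in kcal/mol to log P units (divide by RT ln 10).

    The default temperature of 300 K is an assumption for this sketch.
    """
    return dg_kcal / (R_KCAL * temperature * math.log(10))


def log_p(dg_solv_water, dg_solv_toluene, temperature=300.0):
    """Toluene/water log P from solvation free energies in kcal/mol.

    Assumed sign convention: a more negative (more favorable) solvation
    free energy in toluene than in water gives a positive log P.
    """
    return kcal_to_logp_units(dg_solv_water - dg_solv_toluene, temperature)


# Convert the RMSD quoted in the abstract from kcal/mol to log P units.
rmsd_logp = kcal_to_logp_units(2.26)
print(f"RMSD in log P units at 300 K: {rmsd_logp:.2f}")
```

Under the assumed 300 K, 2.26 kcal mol⁻¹ corresponds to about 1.65 log P units, in line with the figure quoted above; a slightly different temperature shifts the second decimal place.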
- This article is part of the themed collection: The SAMPL Challenges