Development and application of a comprehensive machine learning program for predicting molecular biochemical and pharmacological properties†
Abstract
We establish a comprehensive quantitative structure–activity relationship (QSAR) model, termed AlphaQ, that uses a machine learning algorithm to associate fully quantum mechanical molecular descriptors with various biochemical and pharmacological properties. As a first step, a novel method for molecular structural alignment was developed that maximizes the quantum mechanical cross correlations among the molecules. In addition to the improved structural alignment, the three-dimensional (3D) distribution of the molecular electrostatic potential was introduced as the unique numerical descriptor for individual molecules. Together, these two modifications lead to a substantial enhancement in the accuracy of the various 3D-QSAR prediction models of AlphaQ. Most remarkably, AlphaQ proved applicable to structurally diverse molecules, to the extent that it outperforms conventional QSAR methods in estimating the inhibitory activity against thrombin, the water–cyclohexane distribution coefficient, the permeability across the Caco-2 cell membrane, and the metabolic stability in human liver microsomes. Owing to its simplicity in model building and its high predictive capability for a variety of biochemical and pharmacological properties, AlphaQ is anticipated to serve as a valuable screening tool at both early and late stages of drug discovery.
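The idea of aligning molecules by maximizing a cross correlation between their electrostatic potential (ESP) distributions can be illustrated with a minimal sketch. It is not the AlphaQ implementation: the abstract specifies fully quantum mechanical descriptors, whereas this toy example assumes fixed atomic point charges, a clamped Coulomb potential on a cubic grid, a normalized (cosine) overlap as the correlation measure, and a brute-force search over rotations about a single axis. All function names and parameters (`esp_on_grid`, `cross_correlation`, `make_grid`, `rotate_z`, `best_rotation`, `min_dist`, `n_angles`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def esp_on_grid(coords, charges, grid_points, min_dist=0.5):
    """Point-charge Coulomb potential (arbitrary units) at each grid point.

    Assumption: point charges stand in for a quantum mechanical ESP.
    """
    diff = grid_points[:, None, :] - coords[None, :, :]  # (n_grid, n_atoms, 3)
    dist = np.linalg.norm(diff, axis=2)
    dist = np.maximum(dist, min_dist)  # clamp to avoid the singularity at an atom
    return (charges[None, :] / dist).sum(axis=1)


def cross_correlation(a, b):
    """Normalized overlap of two descriptor vectors, in the range [-1, 1]."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def make_grid(half_width=4.0, n=9):
    """Cubic grid of n**3 points centred on the origin, as an (n**3, 3) array."""
    axis = np.linspace(-half_width, half_width, n)
    mesh = np.meshgrid(axis, axis, axis, indexing="ij")
    return np.stack(mesh, axis=-1).reshape(-1, 3)


def rotate_z(coords, angle):
    """Rotate Cartesian coordinates about the z axis by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return coords @ rot.T


def best_rotation(ref_desc, coords, charges, grid, n_angles=36):
    """Brute-force search for the z rotation that maximizes the overlap
    between the probe molecule's ESP descriptor and the reference descriptor."""
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    scores = [
        cross_correlation(ref_desc, esp_on_grid(rotate_z(coords, t), charges, grid))
        for t in angles
    ]
    best = int(np.argmax(scores))
    return float(angles[best]), scores[best]


# Toy example: a dipole as reference, and the same dipole rotated by 90 degrees as probe.
grid = make_grid()
ref_coords = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])
charges = np.array([0.5, -0.5])
ref_desc = esp_on_grid(ref_coords, charges, grid)

probe_coords = rotate_z(ref_coords, np.pi / 2)
angle, score = best_rotation(ref_desc, probe_coords, charges, grid)
```

In this toy setting the aligned ESP vector (`esp_on_grid` evaluated at the best rotation) would play the role of the per-molecule numerical descriptor passed to a regression model. A realistic alignment would also search over translations and all three rotational degrees of freedom.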