Machine learning-augmented docking. 1. CYP inhibition prediction†
Abstract
A significant portion of the oxidative metabolism carried out by the human body is accomplished by six cytochrome P450 (CYP) enzymes. The binding of small molecules to these enzymes affects drug activity and half-life. Additionally, the inhibition or induction of a CYP isoform by a drug can lead to drug–drug interactions, which in turn can cause toxicity. A variety of computational methods have been used to predict CYP inhibition, among which docking methods are less accurate than machine learning (ML) methods. However, ML methods are sensitive to their training data and show reduced accuracy on test sets that fall outside the chemical space represented in the training set. In contrast, docking methods do not have this generalization issue and allow for visual analysis. We hypothesize that combining ML methods with docking can improve CYP inhibition predictions. To test this hypothesis, we pair our in-house docking program FITTED with several ML techniques to investigate the accuracy and transferability of this hybrid methodology, which we term ML-augmented docking. We find that ML-augmented docking can significantly improve the accuracy of docking software while consistently surpassing the performance of ligand-only models. Additionally, we show that ML-augmented docking is more generalizable than ML models trained on ligand-only data. The open-source code created for this project can be found at https://github.com/MoitessierLab/ML-augmented-docking-CYP-inhibition.