Discovery of new Cdc2-like kinase 4 (CLK4) inhibitors via pharmacophore exploration combined with flexible docking-based ligand/receptor contact fingerprints and machine learning
Abstract
Cdc2-like kinase 4 (CLK4) inhibitors are of potential therapeutic value in many diseases, particularly cancer. In this study, we combined extensive ligand-based pharmacophore exploration, ligand–receptor contact fingerprints generated by flexible docking, physicochemical descriptors, and machine learning-based quantitative structure–activity relationship (ML-QSAR) analysis to investigate the pharmacophoric and binding requirements for potent CLK4 antagonists. Several ML methods were evaluated for correlating these properties with anti-CLK4 bioactivities, including multiple linear regression (MLR), random forests (RF), extreme gradient boosting (XGBoost), probabilistic neural network (PNN), and support vector regression (SVR). A genetic function algorithm (GFA) was coupled with each method for feature selection. GFA-SVR produced the most self-consistent and predictive model, which selected three pharmacophores, three ligand–receptor contacts, and two physicochemical descriptors. The GFA-SVR model and its associated pharmacophore models were used to screen the National Cancer Institute (NCI) structural database for novel CLK4 antagonists. Three potent hits were identified, the best of which showed an anti-CLK4 IC50 value of 57 nM.
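To illustrate the general idea of coupling a genetic algorithm with a regression model for feature selection, the following is a minimal sketch, not the implementation used in this study. It evolves binary masks over candidate descriptors and keeps the mask with the highest fitness. The function `ga_select`, the scorer `toy_fitness`, the index set `INFORMATIVE`, and all parameter values are hypothetical. In a GFA-SVR setting the fitness would be a validation statistic of an SVR model trained on the selected descriptors; a stand-in scorer is used here so the sketch runs with the Python standard library alone.

```python
import random


def ga_select(n_features, fitness, pop_size=30, generations=40,
              mutation_rate=0.05, seed=0):
    """Search for a descriptor subset with a simple genetic algorithm.

    A mask is a tuple of 0/1 flags, one per candidate descriptor.
    `fitness` maps a mask to a score, where higher is better.
    Returns (best_mask, best_score).
    """
    rng = random.Random(seed)
    pop = [tuple(rng.randint(0, 1) for _ in range(n_features))
           for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def tournament():
        # Pick two distinct individuals at random and keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = [best]  # elitism: the best mask survives unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = tuple(bit ^ 1 if rng.random() < mutation_rate else bit
                          for bit in child)
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best, fitness(best)


# Hypothetical stand-in scorer: rewards three "informative" descriptors
# and penalises every extra descriptor, mimicking a parsimony pressure.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(1 for i, bit in enumerate(mask) if bit and i in INFORMATIVE)
    extras = sum(mask) - hits
    return hits - 0.5 * extras


if __name__ == "__main__":
    mask, score = ga_select(10, toy_fitness, seed=1)
    print("selected descriptor indices:",
          [i for i, bit in enumerate(mask) if bit])
    print("fitness:", score)
```

To approximate a GFA-SVR workflow, `toy_fitness` could be replaced by a function that trains a support vector regressor on the masked descriptor columns and returns a cross-validated score, for example with scikit-learn's `SVR` and `cross_val_score`. Because such a scorer is expensive, caching the fitness of each mask would be advisable.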