Discovery of novel PDGFR inhibitors targeting non-small cell lung cancer using a multistep machine learning assisted hybrid virtual screening approach
Abstract
Non-small cell lung cancer (NSCLC) is a formidable global health challenge and a leading cause of cancer-related deaths worldwide. The platelet-derived growth factor receptor (PDGFR) has emerged as a promising therapeutic target in NSCLC, given its crucial involvement in cell growth, proliferation, angiogenesis, and tumor progression. Among PDGFR inhibitors, avapritinib has garnered attention for its selective activity against mutant kinases, particularly PDGFRA D842V and KIT exon 17 D816V, which are linked to resistance to conventional tyrosine kinase inhibitors. In recent years, machine learning has become a powerful tool in pharmaceutical research, offering data-driven insights and accelerating lead identification. Here, we apply machine learning, alongside the RDKit cheminformatics toolkit, to identify potential anti-cancer drug candidates targeting PDGFR in NSCLC. Our study demonstrates how machine learning models can efficiently narrow large screening collections down to target-specific sets of a few hundred small molecules, streamlining hit discovery. Using a machine learning-assisted virtual screening strategy, we preselected 220 compounds with potential PDGFRA inhibitory activity from a library of 1.048 million compounds, about 0.02% of the original library. To validate these candidates, we employed traditional genetic algorithm-based virtual screening and docking methods. In these studies, ZINC000002931631 exhibited inhibitory potential against PDGFRA comparable or even superior to that of avapritinib, which highlights the value of our machine learning approach. Moreover, as part of our lead validation, we conducted molecular dynamics simulations, which revealed critical molecular-level interactions responsible for the conformational changes in PDGFRA necessary for substrate binding.
Our study exemplifies the potential of machine learning in drug discovery, providing a more efficient and cost-effective means of identifying promising drug candidates for NSCLC treatment. The success of this approach in preselecting compounds with strong PDGFRA inhibitory potential highlights its significance in advancing personalized and targeted cancer therapies.
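The staged workflow summarized above (machine learning preselection, docking-based ranking, comparison against a reference inhibitor) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Compound` record, the compound identifiers, the scores, the threshold, and the reference docking value are all hypothetical placeholders, and the sketch assumes that a trained classifier and a docking program have already produced the per-compound scores.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Compound:
    compound_id: str      # hypothetical identifier, not a real library entry
    ml_score: float       # placeholder for a classifier's predicted probability of activity
    docking_score: float  # placeholder docking energy in kcal/mol (more negative = better)


def preselect(library: List[Compound], ml_threshold: float) -> List[Compound]:
    """Stage 1: keep compounds whose predicted activity meets the threshold."""
    return [c for c in library if c.ml_score >= ml_threshold]


def rank_by_docking(candidates: List[Compound]) -> List[Compound]:
    """Stage 2: order preselected compounds from best to worst docking score."""
    return sorted(candidates, key=lambda c: c.docking_score)


def at_least_as_good_as(candidates: List[Compound], reference_score: float) -> List[Compound]:
    """Stage 3: keep compounds that dock at least as well as the reference inhibitor."""
    return [c for c in candidates if c.docking_score <= reference_score]


def percent_retained(n_selected: int, n_library: int) -> float:
    """Share of the screening library that survives preselection, in percent."""
    if n_library <= 0:
        raise ValueError("library size must be positive")
    return 100.0 * n_selected / n_library


# Toy data: every value below is invented for illustration only.
TOY_LIBRARY = [
    Compound("cmpd-A", 0.95, -9.1),
    Compound("cmpd-B", 0.40, -10.2),
    Compound("cmpd-C", 0.88, -7.4),
    Compound("cmpd-D", 0.10, -6.0),
    Compound("cmpd-E", 0.91, -8.3),
]
ML_THRESHOLD = 0.85             # assumed cutoff on predicted activity
REFERENCE_DOCKING_SCORE = -8.0  # assumed score for a reference inhibitor

hits = preselect(TOY_LIBRARY, ML_THRESHOLD)
ranked = rank_by_docking(hits)
leads = at_least_as_good_as(ranked, REFERENCE_DOCKING_SCORE)

if __name__ == "__main__":
    print("preselected:", [c.compound_id for c in hits])
    print("ranked by docking:", [c.compound_id for c in ranked])
    print("leads vs. reference:", [c.compound_id for c in leads])
    print("percent retained:", percent_retained(len(hits), len(TOY_LIBRARY)))
```

In this toy funnel, a compound with a favorable docking score but a low predicted activity (`cmpd-B`) is never docked, which is the trade-off any preselection step makes: the docking workload shrinks sharply, at the risk of discarding true actives that the classifier misses.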