Nano-QSAR modeling for predicting biological activity of diverse nanomaterials†
Abstract
This study reports robust and reliable nano-QSAR models, based on an ensemble learning (EL) approach, for predicting the biological effects of diverse nanomaterials (NMs) from simple molecular descriptors. EL-based nano-QSAR models implementing stochastic gradient boosting and bagging algorithms were constructed and used to establish statistically significant relationships between the measured biological activity profiles of nanoparticles (NPs) and their simple structural properties. To demonstrate the predictive ability of the developed models, five representative data sets (case studies) of NMs recently studied using in vitro cell-based assays were employed: NPs with diverse metal cores, NPs with a similar core but diverse surface modifiers, metal oxide NPs, surface-modified multi-walled carbon nanotubes, and fullerene derivatives. Rigorous validation of the constructed classification and regression nano-QSAR models using various statistical parameters suggested that the EL-based models are robust enough for future use. The proposed nano-QSAR models showed high prediction accuracy in binary classification, more than 93.18% (case study 1) and 97.25% (case study 2), and yielded a correlation (R²) of more than 0.851 between experimental and model-predicted values of biological activity in the complete data of the different sets of NPs. Results for all five case studies demonstrated better predictive performance of the proposed nano-QSAR models than that reported in previous studies. The proposed models reliably predicted the biological activity of all considered NPs, and the methodology is expected to provide guidance for the future design and manufacturing of NMs, ensuring better and safer products.
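As a rough illustration of the two ensemble techniques named above, the following minimal sketch fits a stochastic gradient boosting regressor and a bagging classifier with scikit-learn. It is not the authors' implementation: the choice of library, the hyperparameters, and the synthetic descriptor matrix `X`, activity values `activity`, and binary labels `labels` are all assumptions made for illustration and do not correspond to any of the five case-study data sets.

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier, GradientBoostingRegressor
from sklearn.metrics import accuracy_score, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 200 hypothetical particles,
# 4 hypothetical descriptors, activity driven mostly by the first two.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
activity = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.normal(size=200)
labels = (activity > 0).astype(int)  # hypothetical active/inactive split

X_tr, X_te, y_tr, y_te, c_tr, c_te = train_test_split(
    X, activity, labels, test_size=0.25, random_state=0
)

# Regression: setting subsample < 1.0 fits each tree on a random
# fraction of the training rows, which makes the boosting stochastic.
sgb = GradientBoostingRegressor(
    n_estimators=200, learning_rate=0.05, subsample=0.7, random_state=0
)
sgb.fit(X_tr, y_tr)
r2 = r2_score(y_te, sgb.predict(X_te))

# Classification: bagging fits each base learner (a decision tree by
# default) on a bootstrap resample and aggregates their votes.
bag = BaggingClassifier(n_estimators=100, random_state=0)
bag.fit(X_tr, c_tr)
acc = accuracy_score(c_te, bag.predict(X_te))

print(f"held-out R^2 (stochastic gradient boosting): {r2:.3f}")
print(f"held-out accuracy (bagging): {acc:.3f}")
```

The held-out split here stands in for the external validation described in the abstract; the scores printed on this synthetic data say nothing about the performance reported for the real NM data sets.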