QSAR modeling for predicting reproductive toxicity of chemicals in rats for regulatory purposes†
Abstract
Determining the multi-generation reproductive toxicity of chemicals experimentally is costly, slow, and requires a large number of animal studies. Computational toxicology offers a way to overcome these difficulties. In this study, we established ensemble machine learning (EML) based quantitative structure–activity relationship (QSAR) models for predicting the reproductive toxicity potential (lowest observed adverse effect level, LOAEL) of structurally diverse chemicals in accordance with the OECD guidelines. Decision tree forest (DTF) and decision tree boost (DTB) QSAR models were developed using a novel dataset of toxicity endpoints for 334 chemicals. Structural features of the chemicals relevant to their toxicity potential were identified and used in QSAR modeling. The generalization and prediction abilities of the constructed QSAR models were evaluated by internal and external validation procedures and by deriving several stringent statistical criteria. In the test set, the DTF and DTB models yielded R² values of 0.856 and 0.945, respectively, between the experimental and predicted endpoint toxicity values. The models were also evaluated for predictive use through the most recent criteria based on root mean squared error (RMSE) and mean absolute error (MAE). The statistical validation coefficients derived for the test data were above their respective threshold limits, which lends high confidence to the analysis. The applicability domains of the constructed QSAR models were defined using the leverage and standardization approaches. The results suggest that the proposed QSAR models can reliably predict the reproductive toxicity potential of diverse chemicals and can be useful tools for screening new chemicals for safety assessment.
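As an illustration of the validation statistics and the leverage-based applicability domain mentioned above, the following is a minimal sketch and not the authors' implementation. It makes several assumptions: R² is computed as the coefficient of determination (the study may instead report a squared correlation coefficient), the descriptor matrix is used without an added intercept column, and the warning leverage follows the commonly used convention h* = 3(p + 1)/n, which the abstract does not specify. The function names are hypothetical.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return R^2 (coefficient of determination), RMSE, and MAE."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": float(np.sqrt(np.mean(residuals ** 2))),
        "mae": float(np.mean(np.abs(residuals))),
    }


def leverages(X_train, X_query):
    """Leverage h_i = x_i^T (X^T X)^{-1} x_i of each query compound.

    X_train: (n, p) descriptor matrix of the training set.
    X_query: (m, p) descriptor matrix of the compounds to check.
    """
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    # The pseudo-inverse keeps the sketch usable with collinear descriptors.
    xtx_inv = np.linalg.pinv(X_train.T @ X_train)
    return np.einsum("ij,jk,ik->i", X_query, xtx_inv, X_query)


def leverage_threshold(n_train, n_descriptors):
    """Assumed warning leverage h* = 3(p + 1) / n."""
    return 3.0 * (n_descriptors + 1) / n_train


def in_domain(X_train, X_query):
    """Boolean mask: True where a query compound has leverage <= h*."""
    n_train, n_descriptors = np.asarray(X_train).shape
    h_star = leverage_threshold(n_train, n_descriptors)
    return leverages(X_train, X_query) <= h_star
```

Under these assumptions, a compound whose leverage exceeds h* lies far from the training descriptors, and its prediction would be treated as an extrapolation rather than as a reliable estimate.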