In silico prediction of serious eye irritation or corrosion potential of chemicals†
Abstract
Rapid and accurate identification of eye irritants and corrosive chemicals is important in health hazard assessment. This study describes the development of in silico methods for classifying chemicals as irritants/corrosives or non-irritants/non-corrosives. A serious eye irritation (EI) dataset of 5220 chemicals and an eye corrosion (EC) dataset of 2299 chemicals were collected from available databases and the literature. Structure–activity relationship (SAR) models were developed with machine learning methods to predict serious EI and EC separately. In terms of overall prediction accuracy, the Pub-SVM model gave the best results for both serious EI (overall classification accuracy CA = 0.946) and EC (CA = 0.959). For serious EI, the sensitivity and specificity were 97.3% and 86.7% for the training set, and 96.9% and 82.7% for the external validation set, respectively. For EC, the sensitivity and specificity were 95.5% and 96.2% for the training set, and 94.9% and 96.2% for the external validation set, respectively. The high sensitivity and specificity indicate that the models are reliable and robust and can be used to predict the serious EI/EC potential of compounds. Moreover, several structural alerts characterizing serious EI/EC were identified by combining information gain with substructure frequency analysis.
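
For readers unfamiliar with the reported metrics, the following minimal Python sketch shows how sensitivity, specificity, and overall classification accuracy (CA) are conventionally computed from the counts of a binary confusion matrix. The function name and the example counts are hypothetical illustrations, not values or code from this study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp: positives (e.g. irritants) predicted as positive
    fn: positives predicted as negative
    tn: negatives (e.g. non-irritants) predicted as negative
    fp: negatives predicted as positive
    """
    sensitivity = tp / (tp + fn)                 # fraction of true positives recovered
    specificity = tn / (tn + fp)                 # fraction of true negatives recovered
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # overall classification accuracy (CA)
    return sensitivity, specificity, accuracy


# Hypothetical counts chosen only to illustrate the arithmetic.
se, sp, ca = classification_metrics(tp=90, fn=10, tn=40, fp=10)
print(f"sensitivity={se:.3f} specificity={sp:.3f} CA={ca:.3f}")
```

Because CA weights both classes by their size, a dataset dominated by positives can show a high CA even when specificity is noticeably lower, which is why sensitivity and specificity are reported alongside it.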