Applying local interpretable model-agnostic explanations to identify substructures that are responsible for mutagenicity of chemical compounds
Abstract
The local interpretable model-agnostic explanations (LIME) method was applied to identify substructures responsible for the mutagenicity of chemical compounds. Random forest and extremely randomized trees models, trained on the Hansen and Bursi Ames mutagenicity datasets, served as the models to be explained and were evaluated using precision, recall, F1, and accuracy metrics. The aim of this study is to address the challenge of identifying substructures that indicate mutagenicity and to provide stable and consistent explanations for it, which is crucial for trust in and acceptance of the findings in the sensitive field of computational toxicology. The approach thereby advances the interpretability and explainability of machine learning models in this domain. Identifying mutagenic substructures is important because it helps predict the potential toxicity of new chemical compounds, which is particularly relevant in drug development and environmental toxicology, where the risks of exposure to new compounds must be carefully evaluated. Structural classes previously identified as mutagenic include epoxides, N-aryl compounds, aromatic amines, N-oxides, nitro-containing compounds, and polycyclic aromatic hydrocarbons with a bay region; such examples underscore the importance of identifying and studying mutagenic substructures to better understand their potential risks and adverse effects on human health and the environment.
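The pipeline described above can be illustrated with a minimal sketch: featurize molecules, fit a tree-ensemble classifier, compute the four reported metrics, and apply LIME to one prediction. This is not the authors' exact implementation; the Morgan fingerprint settings, the toy SMILES stand-ins for the Hansen/Bursi Ames data, and the train/test split are illustrative assumptions, since the abstract does not specify them.

```python
# Sketch of the abstract's workflow under assumed featurization settings.
import numpy as np
from rdkit import Chem
from rdkit.Chem import AllChem
from sklearn.ensemble import RandomForestClassifier  # ExtraTreesClassifier swaps in directly
from sklearn.model_selection import train_test_split
from sklearn.metrics import (accuracy_score, f1_score,
                             precision_score, recall_score)
from lime.lime_tabular import LimeTabularExplainer


def featurize(smiles_list, n_bits=2048, radius=2):
    """Assumed featurization: 2048-bit Morgan fingerprints, radius 2."""
    fps = []
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        fps.append(np.array(
            AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits)))
    return np.array(fps)


# Tiny illustrative stand-in for an Ames-style dataset:
# 1 = mutagenic, 0 = non-mutagenic (labels here are for demonstration only).
smiles = ["C1CO1", "CCO", "O=[N+]([O-])c1ccccc1", "CC(C)O",
          "Nc1ccc2ccccc2c1", "CCCCO", "Nc1ccccc1", "CCN"]
labels = np.array([1, 0, 1, 0, 1, 0, 1, 0])

X = featurize(smiles)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, labels, test_size=0.25, random_state=0, stratify=labels)

# Model to be explained; the abstract evaluates it with four metrics.
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
y_pred = model.predict(X_te)
print("precision", precision_score(y_te, y_pred, zero_division=0))
print("recall   ", recall_score(y_te, y_pred, zero_division=0))
print("F1       ", f1_score(y_te, y_pred, zero_division=0))
print("accuracy ", accuracy_score(y_te, y_pred))

# LIME fits a local surrogate around one molecule; the weighted fingerprint
# bits trace back to atom environments, i.e. candidate mutagenic substructures.
explainer = LimeTabularExplainer(
    X_tr,
    feature_names=[f"bit_{i}" for i in range(X.shape[1])],
    class_names=["non-mutagenic", "mutagenic"],
    discretize_continuous=False)
exp = explainer.explain_instance(X_te[0], model.predict_proba, num_features=10)
print(exp.as_list())  # (fingerprint bit, local weight) pairs
```

In this sketch the bits with the largest positive local weights for the "mutagenic" class would be mapped back (via RDKit's bit-info machinery) to the atom environments they encode, which is one plausible way to recover substructure-level explanations of the kind the study reports.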