Issue 108, 2016

High-accuracy QSAR models of narcosis toxicities of phenols based on various data partition, descriptor selection and modelling methods

Abstract

The Environmental Protection Agency considers that quantitative structure–activity relationship (QSAR) analysis can effectively replace toxicity tests. In this paper, we developed QSAR models to evaluate the narcosis toxicities of 50 phenol analogues. We first built multiple linear regression (MLR), stepwise multiple linear regression (SLR) and support vector regression (SVR) models using five descriptors and three different training-test partitions. Under all three partitions, the optimal SVR models had the highest external prediction ability, about 10% higher than the models in the literature. Second, to identify more effective descriptors, we applied two in-house methods to select descriptors with clear meanings from 1264 descriptors calculated by the PCLIENT software, and used them to construct MLR, SLR and SVR models. Our best SVR model (R²pred = 0.972) showed an increase of 16.55% on the test set, and an appropriate partition yielded better stability. Results across the different training-test partitions also supported the strong predictive power of the best SVR model. We further evaluated the regression significance of our SVR model and the importance of each individual descriptor through interpretability analysis. Our work provides a valuable exploration of different combinations of data partition, descriptor selection and modelling method, as well as a useful theoretical understanding of the toxicity of phenol analogues, especially for such a small dataset.


Article information

Article type: Paper
Submitted: 22 Aug 2016
Accepted: 01 Nov 2016
First published: 02 Nov 2016

RSC Adv., 2016, 6, 106847-106855


W. Zhou, Y. Fan, X. Cai, Y. Xiang, P. Jiang, Z. Dai, Y. Chen, S. Tan and Z. Yuan, RSC Adv., 2016, 6, 106847 DOI: 10.1039/C6RA21076G
