Issue 70, 2020

QSPR model for Caco-2 cell permeability prediction using a combination of HQPSO and dual-RBF neural network

Abstract

The Caco-2 cell model is widely used to evaluate the in vitro human intestinal permeability of drugs because of its morphological and functional similarity to human enterocytes. Although it is safe and relatively economical, it is time-consuming. A rapid and accurate quantitative structure-property relationship (QSPR) model of Caco-2 permeability can therefore improve the efficiency of oral drug development. The aim of our study is to explore the predictive ability of the QSPR model, to study the permeation mechanism, and to develop a permeability prediction model for Caco-2 cells. A relatively large data set was collected, and abnormal data were eliminated using Monte Carlo regression and the hybrid quantum particle swarm optimization (HQPSO) algorithm. The remaining 1827 compounds were then used to establish QSPR models. To generate multiple chemically diverse training and test sets, we used a combination of principal component analysis (PCA) and self-organizing map (SOM) neural networks to split the modeling data set, which was characterized by PaDEL descriptors. After a preliminary selection of descriptors by the mean decrease impurity (MDI) method, the HQPSO algorithm was used to select the key descriptors. Six methods, namely multivariate linear regression (MLR), support vector machine regression (SVR), XGBoost, radial basis function (RBF) neural networks, dual-SVR and dual-RBF, were employed to develop QSPR models. The best model, dual-RBF, achieved $R^2 = 0.91$ and $R_{cv5}^2 = 0.77$ for the training set, and $R_T^2 = 0.77$ for the test set. A series of validation methods were used to assess the robustness and predictive ability of the dual-RBF model under OECD principles. A new application domain (AD) definition method based on the descriptor importance-weighted and distance-based (IWD) method was proposed, and the outliers were analyzed carefully. Based on the importance of the descriptors used in the dual-RBF model, we concluded that the “H E-state” and hydrogen bonds are important factors affecting the permeability of drugs passing through Caco-2 cells. Compared with previously reported studies, our method offers some advantages in data set size, transparency of the modeling process, and prediction accuracy, and is a promising tool for virtual screening in the early stage of drug development.
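The three statistics quoted in the abstract (the training-set R², the cross-validated R², and the test-set R²) can be illustrated with a minimal sketch. It assumes that the "cv5" subscript denotes five-fold cross-validation and that the cross-validated value is computed from pooled out-of-fold predictions; neither detail is stated in the abstract. A one-descriptor least-squares line stands in for the dual-RBF model, and the function names and toy data are hypothetical.

```python
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - y_bar) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares for y = a * x + b (stand-in model, not dual-RBF)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, y_bar - a * x_bar


def cross_validated_r_squared(x, y, k=5):
    """Pool out-of-fold predictions from k folds, then score them once."""
    n = len(x)
    preds = [0.0] * n
    for fold in range(k):
        held_out = set(range(fold, n, k))  # every k-th sample forms one fold
        train = [i for i in range(n) if i not in held_out]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in held_out:
            preds[i] = a * x[i] + b  # prediction from a model that never saw i
    return r_squared(y, preds)


# Toy data (hypothetical): one descriptor against a permeability-like response.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
y = [0.1, 0.9, 2.2, 2.8, 4.1, 5.2, 5.8, 7.1, 8.0, 8.9]

a, b = fit_line(x, y)
r2_fit = r_squared(y, [a * xi + b for xi in x])  # fitted (training) R^2
r2_cv5 = cross_validated_r_squared(x, y, k=5)    # cross-validated R^2
```

The test-set R² would be obtained by calling `r_squared` on compounds held out before any fitting. A cross-validated value noticeably below the fitted value, as in the reported 0.91 versus 0.77, is the usual pattern, since each out-of-fold prediction comes from a model that was not trained on that compound.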

Article information

Article type: Paper
Submitted: 25 Sep 2020
Accepted: 06 Nov 2020
First published: 26 Nov 2020
This article is Open Access
Creative Commons BY license

RSC Adv., 2020, 10, 42938-42952

Y. Wang and X. Chen, RSC Adv., 2020, 10, 42938, DOI: 10.1039/D0RA08209K

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
