Zemin Zhuᵃ,
Ziaur Rahmanᵃ,
Muhammad Aamirᵃ,
Syed Zahid Ali Shahᵇ,
Sattar Hamidᶜ,
Akhunzada Bilawalᵈ,
Sihong Liᵉ and
Muhammad Ishfaq*ᵃ
ᵃ College of Computer Science, Huanggang Normal University, Huanggang 438000, China. E-mail: muhammad@hgnu.edu.cn; Tel: +86 15972855212
ᵇ Department of Pathology, Faculty of Veterinary and Animal Sciences, The Islamia University of Bahawalpur, Pakistan
ᶜ The University of Agriculture Peshawar, Khyber Pakhtunkhwa, 25130, Pakistan
ᵈ College of Food Science, Northeast Agricultural University, Harbin, China
ᵉ Key Laboratory of Applied Technology on Green-Eco-Healthy Animal Husbandry of Zhejiang Province, College of Animal Science and Technology, College of Veterinary Medicine, Zhejiang A&F University, Hangzhou 311300, China
First published on 11th January 2023
Mycoplasma pneumoniae (MP) is one of the most common pathogens causing upper and lower respiratory tract infections, lung injury, and even death in young children. Toll-like receptors (TLRs) play an important role in innate immunity by allowing the host to recognize invading pathogens. Previous studies demonstrated that TLR4 is a potential therapeutic target for the treatment of MP pneumonia. The present study therefore aimed to screen biologically active ingredients that target the TLR4 receptor pathway. We first used molecular docking to screen for active compounds that inhibit the TLR4 pathway, and then used regression and classification machine learning algorithms to establish quantitative structure–activity relationship (QSAR) models that predict the biological activity of the screened compounds. A total of 78 molecules, retrieved from the ChEMBL database, were used in QSAR modelling. The regression QSAR models had acceptable coefficients of determination (R²), ranging from 0.91 to 0.96 on the training dataset and from 0.76 to 0.93 on the testing dataset. The multiclass classification models showed accuracies of 0.70 to 1.0 on the training data and 0.63 to 0.96 on the testing data, with log loss ranging from 0.27 to 8.63. In addition, molecular descriptors and fingerprints were studied as structural elements associated with increased and decreased inhibitory activity. These results provide a quantitative analysis of QSAR and classification models applicable to high-throughput screening, as well as insights into the inhibition mechanisms of TLR4 antagonists.
An increasing number of active compounds from natural products have recently been found to inhibit the TLR4 pathway. Studies have revealed that natural products are rich in molecules with the potential to inhibit the TLR4 protein, and these have attracted the attention of researchers.12–15 Furthermore, small-molecule-mediated TLR4 inhibition has prompted an array of research into the molecular mechanism of action of TLR4 inhibitors.16 However, further studies are needed to confirm these findings. In addition, the emergence of antibiotic resistance represents another challenge in the treatment of MP infections.17 It is therefore crucial to find alternatives to antibiotics to prevent the emergence of resistance to anti-mycoplasma drugs. Active ingredients of natural products could thus be used to modulate the host immune inflammatory response. However, these active ingredients require more experimental data on their toxicology and pharmacology before they can be used in clinical trials, which is a lengthy process. To screen TLR4 inhibitors in a limited period of time, fast, accurate, and reliable screening methods based on a detailed understanding of TLR4 inhibition and regulation are needed. In this context, modern machine learning technologies make better use of information obtained from multiple sources to predict the bioactivities of drugs for several diseases, thus making the discovery of new drugs more efficient. In recent years, computer-aided drug screening has advanced towards practicality and is emerging as a core technology for innovative drug research.
Several drug libraries have been screened in a very short time, leading to the discovery of many active compounds in traditional Chinese medicines and the successful repurposing of several approved drugs.18,19 Building on previous research, virtual screening has paved the way for the development of improved chemical analogs for treating a wide variety of human and animal diseases through medicinal chemistry structure–activity relationships and drug screening.20 Molecular docking and quantitative structure–activity relationship (QSAR) models provide structural information and insight into TLR4 inhibitors that can be used to guide more effective drug development, including screening and rational drug discovery of TLR4 inhibitors. Therefore, the objective of the present study was to identify lead compounds that can inhibit the TLR4 protein for the treatment of MP pneumonia. Regression and classification QSAR models were developed from a set of known chemical TLR4 inhibitors. These QSAR models are used to predict and classify bioactive compounds based on their predicted bioactivity (pIC50) values, and they provide theoretical foundations for the development of potent drugs from natural products for the prevention and treatment of MP pneumonia. The flow chart for the experimental process is shown in Fig. 1.
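
The pIC50 scale used as the modelling target is the negative base-10 logarithm of the IC50 expressed in mol/L. A minimal sketch of that conversion follows, together with a hypothetical three-class binning of the kind a multiclass model needs. The function names and the thresholds (pIC50 of at least 6 as active, at most 5 as inactive) are illustrative assumptions and are not taken from this study.

```python
import math


def pic50_from_nm(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in mol/L).

    Since 1 nM = 1e-9 mol/L, this equals 9 - log10(IC50 in nM).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def bioactivity_class(pic50: float,
                      active_min: float = 6.0,
                      inactive_max: float = 5.0) -> str:
    """Bin a pIC50 value into three classes.

    The default thresholds are assumptions chosen for illustration only.
    """
    if pic50 >= active_min:
        return "active"
    if pic50 <= inactive_max:
        return "inactive"
    return "intermediate"


if __name__ == "__main__":
    # Example IC50 values in nM (made up for illustration).
    for ic50 in (110.0, 2500.0, 10000.0):
        p = pic50_from_nm(ic50)
        print(f"IC50 = {ic50:8.1f} nM -> pIC50 = {p:.3f} ({bioactivity_class(p)})")
```

On this scale a larger pIC50 means a more potent inhibitor, and a one-unit increase corresponds to a tenfold lower IC50.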
S. No. | Drug names | MW | Docking score | RFR model | ETR model | DTR model | ABR model | GBR model |
---|---|---|---|---|---|---|---|---|
1 | (R)-2-(3-(3-Carbamoyl-5-methylphenylsulfonamido)tetrahydrofuran-3-yl)acetic acid | 342.37 | −5.228 | 5.430829277 | 6.314821411 | 6.958607315 | 4.813787229 | 5.136554754 |
2 | N-(2-Oxo-2-((6R)-4-oxo-3,9-diazabicyclo[4.2.1]nonan-9-yl)ethyl)-1H-indole-2-carboxamide | 340.38 | −5.127 | 4.815372966 | 6.16129079 | 4.978810701 | 4.644380954 | 4.864479656 |
3 | 4-(N-(2-Carbamoylphenyl)sulfamoyl)-3-fluorobenzoic acid | 338.31 | −5.014 | 5.131434015 | 5.942968037 | 4.779891912 | 4.644380954 | 4.848622926 |
4 | (R)-3-(5-Oxo-2,5-dihydro-1H-1,2,4-triazol-3-yl)-N-((3-(trifluoromethyl)-1H-1,2,4-triazol-5-yl)methyl)piperidine-1-carboxamide | 360.30 | −4.888 | 5.335803885 | 4.311301872 | 5 | 5.128070842 | 5.154233598 |
5 | 1-(((1S,2R)-2-Hydroxy-1,2,3,4-tetrahydronaphthalen-1-yl)carbamoyl)cyclopent-3-enecarboxylic acid | 301.34 | −4.857 | 5.162769848 | 5.498993506 | 5.229147988 | 5.853871964 | 5.136223419 |
6 | 3-(3-Amino-5-methylisoxazole-4-sulfonamido)-4-methoxybenzoic acid | 327.31 | −4.841 | 5.42557839 | 5.515008442 | 7.096910013 | 6.198970004 | 4.768100468 |
7 | 5-(2-((2-Aminoethyl)amino)thiazol-4-yl)-2-hydroxybenzamide | 278.33 | −4.788 | 5.062946566 | 4.451147788 | 6.795880017 | 4.676057904 | 4.715750754 |
8 | (R)-3-((4-Amino-6,7-dimethoxyquinazolin-2-yl)amino)-2-hydroxy-2-methylpropanoic acid | 322.32 | −4.765 | 5.384378737 | 4.723592771 | 5 | 5.551578992 | 5.318842099 |
9 | 4-((3-(1-(3-Methylbutanoyl)piperidin-4-yl)ureido)methyl)benzoic acid | 361.44 | −4.729 | 5.352768581 | 4.604814799 | 6.958607315 | 5.128070842 | 4.930009052 |
10 | (S)-3-(5-Fluoropyridine-3-sulfonamido)-2-hydroxypropanoic acid | 264.23 | −4.626 | 5.469392305 | 5.131277132 | 6.958607315 | 5.857242359 | 5.598004671 |
11 | 3,4-Difluoro-N-(2-((2R,4R)-4-hydroxy-2-(hydroxymethyl)pyrrolidin-1-yl)-2-oxoethyl)benzamide | 314.29 | −4.559 | 5.303713126 | 4.866537918 | 6.958607315 | 5.121293699 | 4.907629008 |
12 | (S)-3-(2,3-Dichlorophenylsulfonamido)-2-hydroxypropanoic acid | 314.14 | −4.530 | 5.448314838 | 4.792741186 | 6.958607315 | 5.121293699 | 4.849140284 |
13 | 2-(((3R,4R)-3-Methyltetrahydro-2H-pyran-4-yl)amino)-5-sulfamoylbenzoic acid | 314.36 | −4.524 | 5.737739993 | 4.597185429 | 6.795880017 | 4.644380954 | 5.211471563 |
14 | (R)-2-(3-Methyl-1H-1,2,4-triazol-5-yl)-N-(4-oxochroman-3-yl)acetamide | 286.29 | −4.508 | 5.198586102 | 4.835972975 | 6.795880017 | 5.121293699 | 4.841100257 |
15 | 4-(N-(2-Amino-2-oxoethyl)-N-benzylsulfamoyl)-1H-pyrrole-2-carboxylic acid | 337.35 | −4.507 | 4.93587296 | 5.078963707 | 4.256568635 | 4.644380954 | 5.278315448 |
16 | (R)-4-Isopropoxy-3-(1-(tetrahydrofuran-3-yl)-1H-pyrazole-4-sulfonamido)benzoic acid | 395.43 | −4.504 | 5.659485927 | 5.673346573 | 5 | 4.676057904 | 5.283553443 |
17 | 3-(N-(5-Cyano-2-(methylamino)phenyl)sulfamoyl)-5-fluorobenzoic acid | 349.34 | −4.489 | 5.2393042 | 6.270021811 | 4.586700236 | 4.676057904 | 5.202654077 |
18 | 1-((2-(5-Fluoro-1H-indol-3-yl)ethyl)carbamoyl)azetidine-3-carboxylic acid | 305.30 | −4.463 | 5.138107357 | 5.712863539 | 6.958607315 | 5.121293699 | 4.859899139 |
19 | 4-(3-Ethylphenylsulfonamido)-3-hydroxybenzoic acid | 321.35 | −4.456 | 5.326559803 | 5.193816275 | 5 | 5.266000713 | 5.426620204 |
20 | (R)-3-(3-(5,6-Dimethyl-4-oxo-1,4-dihydrothieno[2,3-d]pyrimidin-2-yl)propanamido)-2-hydroxy-2-methylpropanoic acid | 353.39 | −4.450 | 5.372020827 | 5.783184637 | 5 | 5.121293699 | 5.442711298 |
21 | (2R,3S,4R)-1-(tert-Butoxycarbonyl)-3,4-dihydroxypyrrolidine-2-carboxylic acid | 247.25 | −4.443 | 5.469071944 | 5.613893366 | 5 | 5.121293699 | 4.953810874 |
22 | 2,4-Difluoro-N-(2-((2R,4R)-4-hydroxy-2-(hydroxymethyl)pyrrolidin-1-yl)-2-oxoethyl)benzamide | 314.29 | −4.411 | 5.278958496 | 4.61805273 | 6.958607315 | 4.813787229 | 4.931341817 |
23 | (3R,5R)-1-(6-(((3-Cyclopropyl-1H-pyrazol-5-yl)methyl)amino)pyrimidin-4-yl)-5-((dimethylamino)methyl)pyrrolidin-3-ol | 357.45 | −4.399 | 5.284046701 | 4.564151855 | 6.602059991 | 5.121293699 | 5.125098228 |
24 | (R)-2-(1-Oxo-1,2-dihydroisoquinoline-3-carboxamido)-3-phenylpropanoic acid | 336.34 | −4.386 | 5.163010804 | 5.302824295 | 5.853871964 | 5.595633098 | 5.484515313 |
25 | (R)-4-(N-(2-Oxo-2-((tetrahydro-2H-pyran-3-yl)amino)ethyl)sulfamoyl)benzoic acid | 342.37 | −4.380 | 5.820985738 | 5.221450444 | 6.958607315 | 5.035830554 | 5.35223672 |
26 | (R)-2-(6,7-Dihydro-5H-pyrrolo[1,2-a]imidazole-3-sulfonamido)-2-(3-methoxyphenyl)acetic acid | 351.38 | −4.365 | 5.275202795 | 5.291271987 | 5 | 4.676057904 | 5.012575341 |
27 | (R)-N-(1-Amino-3-methoxy-1-oxopropan-2-yl)-7-methyl-1H-indole-2-carboxamide | 275.30 | −4.342 | 5.130831094 | 5.946522112 | 6.795880017 | 5.121293699 | 4.992894312 |
28 | N-((2R,3R)-4-Hydroxy-3-(methylthio)butan-2-yl)-2-oxo-2,3-dihydrobenzo[d]oxazole-6-sulfonamide | 332.40 | −4.339 | 5.36250487 | 5.963766686 | 6.958607315 | 4.813787229 | 4.866545583 |
29 | 1-(2-Morpholino-2-oxoethyl)-3-(pyridin-3-yl)urea | 264.28 | −4.278 | 5.323176653 | 4.938421827 | 6.795880017 | 5.121293699 | 4.845185655 |
30 | (3R,4R)-1-((3-Carbamoylphenethyl)carbamoyl)-4-methylpiperidine-3-carboxylic acid | 333.38 | −4.260 | 5.316126193 | 4.52083702 | 6.476253533 | 5.121293699 | 4.632121561 |
Fig. 3 Panels (A–F) show the exploratory analysis of the TLR4 inhibitor data, and panel (G) shows the chemical space analysis. The scatter plot shows the diversity of ALogP versus MW of the TLR4 inhibitory compounds.
Models | R² (train) | RMSE (train) | MAE (train) | R² (test) | RMSE (test) | MAE (test) | R² (CV) | RMSE (CV) | MAE (CV) |
---|---|---|---|---|---|---|---|---|---|
RFR | 0.91 | 0.36 | 0.25 | 0.89 | 0.39 | 0.3 | 0.71 | 0.65 | 0.68 |
ETR | 0.96 | 0.23 | 0.06 | 0.76 | 0.53 | 0.41 | 0.76 | 0.56 | 0.41 |
DTR | 0.96 | 0.22 | 0.04 | 0.82 | 0.53 | 0.42 | 0.63 | 0.69 | 0.51 |
ABR | 0.91 | 0.36 | 0.26 | 0.89 | 0.39 | 0.29 | 0.74 | 0.59 | 0.42 |
GBR | 0.96 | 0.23 | 0.06 | 0.93 | 0.29 | 0.21 | 0.79 | 0.51 | 0.36 |
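
The three statistics reported for each split (R², RMSE, and MAE) can be computed as sketched below. This is a dependency-free illustration on made-up values. The function name `regression_metrics` is hypothetical, and the sketch does not claim to reproduce the tooling used in the study.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float],
                       y_pred: Sequence[float]) -> Dict[str, float]:
    """Return R^2, RMSE and MAE for paired observed/predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
    }


if __name__ == "__main__":
    # Made-up pIC50 values, for illustration only.
    observed = [5.0, 5.5, 6.0, 6.5, 7.0]
    predicted = [5.2, 5.4, 6.1, 6.3, 7.1]
    metrics = regression_metrics(observed, predicted)
    print({name: round(value, 3) for name, value in metrics.items()})
```

Because MAE can never exceed RMSE when both are computed on the same residuals, comparing the two is a quick consistency check on any tabulated metrics.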
Accuracy (train) | RMSE (train) | MAE (train) | Accuracy (test) | RMSE (test) | MAE (test) | Accuracy (CV) | RMSE (CV) | MAE (CV) | Log loss |
---|---|---|---|---|---|---|---|---|---|
1 | 0 | 0 | 0.96 | 0.2 | 0.04 | 0.82 | 0.46 | 0.19 | 0.27 |
Class | Precision | Recall | F1-score | Support |
---|---|---|---|---|
Classification report of RF model on test data | ||||
0 | 0.888889 | 1 | 0.941176 | 8 |
1 | 1 | 0.933333 | 0.965517 | 15 |
2 | 1 | 1 | 1 | 1 |
Accuracy | 0.958333 | 0.958333 | 0.958333 | 0.958333 |
Macro avg. | 0.962963 | 0.977778 | 0.968898 | 24 |
Weighted avg. | 0.962963 | 0.958333 | 0.95884 | 24 |
Classification report of RF model on cross validation data | ||||
0 | 0.736842 | 0.777778 | 0.756757 | 18 |
1 | 0.844828 | 0.924528 | 0.882883 | 53 |
2 | 1 | 0.142857 | 0.25 | 7 |
Accuracy | 0.820513 | 0.820513 | 0.820513 | 0.820513 |
Macro avg. | 0.860557 | 0.615054 | 0.62988 | 78 |
Weighted avg. | 0.833834 | 0.820513 | 0.79698 | 78 |
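
The per-class figures in the test-set report follow directly from counts of true positives, false positives, and false negatives. The sketch below recomputes them from label vectors that are an assumed reconstruction consistent with the reported supports (8, 15, and 1) and a single class-1 sample predicted as class 0. These vectors are not the study's data, and the helper names are hypothetical. The sketch also assumes that the RMSE and MAE quoted for the classifier were computed on the integer class labels.

```python
import math
from typing import Dict, List, Tuple


def per_class_report(y_true: List[int], y_pred: List[int]) -> Dict[int, Dict[str, float]]:
    """Precision, recall, F1 and support for every class present in y_true."""
    report: Dict[int, Dict[str, float]] = {}
    pairs = list(zip(y_true, y_pred))
    for c in sorted(set(y_true)):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[c] = {"precision": precision, "recall": recall,
                     "f1": f1, "support": tp + fn}
    return report


def averages(report: Dict[int, Dict[str, float]], key: str) -> Tuple[float, float]:
    """Macro (unweighted) and support-weighted averages of one metric."""
    total = sum(row["support"] for row in report.values())
    macro = sum(row[key] for row in report.values()) / len(report)
    weighted = sum(row[key] * row["support"] for row in report.values()) / total
    return macro, weighted


# Assumed reconstruction of the 24 test labels (not the study's data):
# one class-1 sample is predicted as class 0, everything else is correct.
y_true = [0] * 8 + [1] * 15 + [2]
y_pred = [0] * 8 + [0] + [1] * 14 + [2]

report = per_class_report(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Errors measured on the integer class labels (an assumption about how
# RMSE and MAE were obtained for a classifier).
errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
mae = sum(errors) / len(errors)
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))

if __name__ == "__main__":
    for label, row in report.items():
        print(label, {k: round(v, 6) for k, v in row.items()})
    print("accuracy", round(accuracy, 6))
    print("macro / weighted F1", tuple(round(v, 6) for v in averages(report, "f1")))
    print("MAE", round(mae, 2), "RMSE", round(rmse, 2))
```

The macro average treats all three classes equally, whereas the weighted average is dominated by the larger classes. This is why a poorly recalled minority class lowers the macro average far more than the weighted one.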
Models | Accuracy (train) | RMSE (train) | MAE (train) | Accuracy (test) | RMSE (test) | MAE (test) | Log loss |
---|---|---|---|---|---|---|---|
RF model | 1 | 0 | 0 | 0.96 | 0.2 | 0.04 | 0.27 |
KNeighbors classifier | 0.76 | 0.64 | 0.3 | 0.75 | 0.5 | 0.25 | 5.95 |
SVC | 0.7 | 0.54 | 0.29 | 0.63 | 0.61 | 0.37 | 0.87 |
Decision-tree classifier | 1 | 0 | 0.35 | 0.88 | 0.35 | 0.13 | 4.31 |
AdaBoost classifier | 0.98 | 0.14 | 0.02 | 0.92 | 0.46 | 0.13 | 0.54 |
Gradient boosting classifier | 1 | 87.5 | 0 | 0.88 | 0.35 | 0.13 | 0.97 |
Linear discriminant analysis | 0.98 | 0.27 | 0.04 | 0.71 | 0.54 | 0.29 | 5.79 |
Quadratic discriminant analysis | 1 | 0 | 0 | 0.75 | 0.5 | 0.25 | 8.63 |
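
Log loss penalizes the probability a model assigns to the true class, so it can be large even when accuracy is moderate. The sketch below is a minimal multiclass implementation in plain Python. The function name is hypothetical, and the clipping constant of 1e-15 is an assumption made for illustration.

```python
import math
from typing import Sequence


def multiclass_log_loss(y_true: Sequence[int],
                        proba: Sequence[Sequence[float]],
                        eps: float = 1e-15) -> float:
    """Mean negative log-probability assigned to the true class.

    Probabilities are clipped to [eps, 1 - eps] so that log(0) never occurs.
    """
    if len(y_true) != len(proba) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for label, row in zip(y_true, proba):
        p = min(max(row[label], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(y_true)


if __name__ == "__main__":
    labels = [0, 1, 2]
    examples = {
        # Confident and correct: small loss.
        "soft, correct": [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]],
        # No information: loss equals ln(3) for three classes.
        "uniform": [[1 / 3] * 3 for _ in labels],
        # Hard 0/1 outputs with one miss: the miss dominates the mean.
        "hard, one miss": [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],
    }
    for name, p in examples.items():
        print(f"{name:>15}: {multiclass_log_loss(labels, p):.3f}")
```

With this clipping, each fully confident miss adds about 34.5 to the sum, so 3 misses in 24 test samples give a log loss of about 4.3 and 6 misses give about 8.6. That is in line with the decision-tree and quadratic discriminant rows above if those models returned hard 0/1 probabilities, which is an assumption here rather than something the table states.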
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d2ra06178c
This journal is © The Royal Society of Chemistry 2023