Marianna
Kotzabasaki‡
*,
Iason
Sotiropoulos‡
,
Costas
Charitidis
and
Haralambos
Sarimveis
*
School of Chemical Engineering, National Technical University of Athens, 9 Heroon Polytechneiou Street, Zografou Campus, 15780, Athens, Greece. E-mail: mariannako@chemeng.ntua.gr; hsarimv@central.ntua.gr; Fax: +30 2107723138; Tel: +30 2107723236 Tel: +302107723237
First published on 12th April 2021
Multi-walled carbon nanotubes (MWCNTs) are made of multiple single-walled carbon nanotubes (SWCNTs) which are nested inside one another forming concentric cylinders. These nanomaterials are widely used in industrial and biomedical applications, due to their unique physicochemical characteristics. However, previous studies have shown that exposure to MWCNTs may lead to toxicity and some of the physicochemical properties of MWCNTs can influence their toxicological profiles. In silico modelling can be applied as a faster and less costly alternative to experimental (in vivo and in vitro) testing for the hazard characterization of MWCNTs. This study aims at developing a fully validated predictive nanoinformatics model based on statistical and machine learning approaches for the accurate prediction of genotoxicity of different types of MWCNTs. Towards this goal, a number of different computational workflows were designed, combining unsupervised (Principal Component Analysis, PCA) and supervised classification techniques (Support Vectors Machine, “SVM”, Random Forest, “RF”, Logistic Regression, “LR” and Naïve Bayes, “NB”) and Bayesian optimization. The Recursive Feature Elimination (RFE) method was applied for selecting the most important variables. An RF model using only three features was selected as the most efficient for predicting the genotoxicity of MWCNTs, exhibiting 80% accuracy on external validation and high classification probabilities. The most informative features selected by the model were “Length”, “Zeta average” and “Purity”.
MWCNTs were discovered by Sumio Iijima in 1991,2 as soot-like products in the Krätschmer–Huffman arc discharge synthesis reactor used for the formation of fullerene (C60).3 Later, single-walled carbon nanotubes (SWCNTs) were discovered.4 SWCNTs are assembled into coaxial Russian-doll structures, forming MWCNTs with walls corresponding to each single-walled nanotube. The intertubular distance in MWCNTs is 0.340 nm. The main difference between SWCNTs and MWCNTs is that the former are flexible, while the latter are tough and rigid, rod-shaped structures. SWCNTs are also of smaller widths, with diameters typically in the range 1–2 nm, with curved structures rather than straight.5
Due to unique tubular structures and perfect covalent C–C bonding, MWCNTs exhibit fascinating physical and chemical properties, such as ultra-high mechanical strength, very good electrical and thermal conductivity (∼3000 W m−1 K−1, comparable to diamond), large aspect ratios, high surface area (SA), desirable chemical and environmental stability and distinct optical characteristics.6 MWCNTs are the strongest materials ever discovered. A MWCNT's highest measured tensile strength was up to 63 GPa, which is around 50 times higher than steel.5,7
The aforementioned MWCNTs' superb (and unique) characteristics give them great potential in various emerging applications in areas from biotechnology (imaging, tissue engineering, sensors and targeted drug delivery) to electronics (energy production and storage, transistors) and other fields of materials science (photonics, multi-functional coatings/films and nanocomposites).8 In the biomedical area, a number of applications of MWCNTs have been proposed including drug delivery, diagnosis and imaging, cancer diagnosis, gene therapy, photothermal therapy, etc. For instance, MWCNTs are efficiently used as carrier to deliver quantum dots (QDs) and proteins into cancer cells because QDs have photoluminescent properties, which are beneficial in bioimaging. MWCNTs may also be used for the treatment of HIV/AIDS and Neurodegenerative Diseases like Alzheimer Syndrome, while carboxylated-nanotubes may have potential as antioxidants in anti-ageing cosmetics and food preservation. Moreover, they have emerged as potential biosensors and nanorobots in diagnostics and as nanoprobes in scanning probe microscopy.9
Although MWCNTs have exceptional physicochemical properties, concerns have been raised about their safety, which are due to their small sizes and fibrous structures. MWCNTs may pose hazards, similar to asbestos.10 A number of physicochemical properties and characteristics of MWCNTs have been associated with their toxicological profiles: tube length and diameter, specific surface area, surface reactivity, metal impurities/catalysts, agglomeration state/dispersion, surface functionalization and rigidity/flexibility.11 As Poland et al. reported,12 longer MWCNTs exhibited stronger inflammation activity to mouse in the intraperitoneal administration route. Lam et al.,13 also pointed out that the content of metal used as a catalyst is very important. They concluded that the impure metals may enhance the pulmonary toxicity of carbon nanoforms. In other studies, it has been shown that contamination of amorphous carbon in the MWCNTs may affect strongly their biological activity, compared to purer MWCNTs.14,15 Additionally, it was reported that the agglomeration state of carbon nanomaterials also affected toxicity. Exposure to well-dispersed MWCNTs led to fewer granulomatous lesions in the lung, while non-dispersed nanostructures produced granulomatous inflammation.9
The investigation and study of potential hazards resulting from exposure to engineered MWCNTs is of particular interest in the nanotoxicology area. In silico predictive modelling is a computational data-driven approach that can be applied for the prediction of adverse effects of MWCNTs, in the effort to reduce animal testing. This aligns with Annex XI16 of Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) regulation (European Parliament and Council 2006), which describes alternative (non-animal) approaches such as grouping and read-across,17 Quantitative Structure–Activity Relationship (QSAR),18in vitro methods and weight of evidence, which can be used instead of in vivo tests to examine and evaluate the risks of exposure to chemicals.
Read-across methods are based on the hypothesis that similar chemical substances, follow a regular pattern, based on their chemical composition and/or physicochemical (PC) properties and may also have comparable toxicological properties. Thus, the toxicological profile of one target substance can be predicted by using data for the same endpoint from other reference substances. Grouping of MWCNTs types to read-across data for toxicological endpoints has been efficiently evaluated by Aschberger et al.,11 following the workflow proposed by the European Chemical Agency (ECHA) for grouping and reading across nanomaterials.19
Nano-QSARs, are mathematical models, which are based on the idea that the structure of a substance affects its activity and thus similar structures exhibit similar activities. Molecular descriptors (which are the physicochemical properties of MWCNTs in our case) play a fundamental role in QSAR and other in silico models, since they formally are the numerical representation of a molecular structure. For instance, Kotzabasaki et al.20 developed a robust QSAR model for predicting the nanotoxicity of superparamagnetic iron oxide nanoparticles (SPIONs) in stem-cell monitoring applications, using as molecular descriptors only the “overall particle size” and the “magnetic core chemical composition” of SPIONs. Kim21 and co-workers have successfully proposed a nano-QSAR model based on Quasi-SMILES to predict the cytotoxicity of MWCNTs to human lung cells.
In this study, we have developed a fully validated QSAR model for the prediction of genotoxicity of MWCNTs. Firstly, we selected 15 different types of MWCNTs and constructed a dataset that includes both physicochemical and toxicological data. Genotoxicity was selected as the hazard endpoint. According to REACH, genotoxicity testing is required, even at the lowest tonnage level (above 1 tonne per year per manufacturer/importer) for all substances manufactured in the EU.22 Next, we developed and fully validated a nanotoxicity classification QSAR model for predicting genotoxicity of MWCNTs using physicochemical characteristics as input features. A number of cheminformatics workflows based on machine learning algorithms, were constructed and implemented in order to produce the best performing model. The modelling workflows were designed to include the selection of the most important variables affecting the toxicity of MWCNTs and the definition of the domain of applicability (DOA)23 of the predictive model. The model was finally implemented as a ready-to-use web application in the Jaqpot computational platform (https://app.jaqpot.org/) and is available to the community through the BIORIMA virtual organisation.
In the final step of this study, the model with the best performance was chosen and was implemented as a user-friendly web application in the Jaqpot platform.
Features |
---|
Carbon purity (%)11 |
Minimum length (nm)29 |
Maximum length (nm)29 |
Average length (nm)29 |
Minimum diameter (nm)29 |
Maximum diameter (nm)29 |
Average diameter (nm)29 |
Specific surface area (SSA) measured by BET (m2 g−1)29 |
% Impurities (Fe2O3, CoO, NiO, MgO, MnO)29 |
Concentration of endotoxins (Eu ml−1)29 |
Combustion elemental analysis (CEA), C, H, N, O (wt%)27 |
Surface coatings OH, COOH, NH2, (mmol g−1)27 |
Zeta average (Zave) batch and zeta average at 12.5 and 200 μg ml−1 (nm)27 |
Polydispersity index (PdI) batch and PdI at 12.5 and 200 μg ml−1 (ref. 27) |
Reactive oxygen species (ROS) and respective peak concentrations (μg ml−1)27 |
The information available for each MWCNT was structured in a ready-for-modelling dataset including both physicochemical and toxicological information (Table S1 in the ESI†). The 15 MWCNTs with their partitioning into groups, the genotoxicity end-points and some key physicochemical features are presented in Table 2.
Group | Code | Type | Length | Diameter | BET | Impurity | Purity | CEA | Endotoxinest | Genotoxicity |
---|---|---|---|---|---|---|---|---|---|---|
Group I | NRCWE-040 | Pristine | 518.9 | 22.1 | 150 | 0.773 | 98.60 | 96.00 | 0.18 | 0 |
NRCWE-041 | –OH | 1005.0 | 26.9 | 152 | 0.462 | 99.20 | 97.00 | 0.22 | 0 | |
NRCWE-042 | –COOH | 723.2 | 30.2 | 141 | 0.321 | 99.20 | 96.00 | 026 | 0 | |
Group II | NRCWE-043 | Pristine | 771.3 | 55.6 | 82 | 1.219 | 98.50 | 96.00 | 0.25 | 0 |
NRCWE-044 | –OH | 1330.0 | 32.7 | 74 | 1.006 | 98.60 | 97.00 | 0.27 | 1 | |
NRCWE-045 | −COOH | 1553.0 | 30.2 | 119 | 2.782 | 96.30 | 93.00 | 0.34 | 1 | |
Group III | NRCWE-046 | Pristine | 717.2 | 29.1 | 223 | 0.783 | 98.70 | 96.00 | 0.19 | 0 |
NRCWE-047 | –OH | 532.5 | 22.6 | 216 | 0.781 | 98.70 | 97.00 | 0.01 | 0 | |
NRCWE-048 | –COOH | 1604.0 | 17.9 | 185 | 0.721 | 98.80 | 96.00 | 0.03 | 0 | |
NRCWE-049 | –NH | 731.4 | 14.9 | 199 | 0.738 | 98.80 | 97.0 | 0.05 | 0 | |
Standard | NM-400 | Pristine | 847.0 | 11.0 | 254 | 0.368 | 90.00 | 88.00 | 0.24 | 1 |
Materials | NM-401 | Pristine | 4048.0 | 67.0 | 18 | 0.065 | 99.19 | 98.00 | 0.42 | 0 |
NM-402 | Pristine | 1372.0 | 11.0 | 226 | 1.313 | 92.97 | 92.00 | 0.01 | 1 | |
NM-403 | Pristine | 1373.0 | 13.0 | 227 | 5.313 | 90.00 | 97.00 | 0.01 | 1 | |
NRCWE-006 | Pristine | 5700.0 | 65.0 | 26 | 0.680 | 99.00 | 98.00 | 0.51 | 1 |
Next step was the augmentation of the dataset. The standard deviations of “Diameter” values were used to create two new features: “Minimum Diameter” and “Maximum Diameter”. Feature “Type” indicates the type of the functional group of an observation and was originally a categorical field. This feature was encoded via the “One-Hot Encode” method resulting to 4 new binary features: “Type_Pristine”, “Type_OH”, “Type_COOH” and “Type_NH2”. For each analogue, these columns contained the value “0”, except from the column that represented the specific type of the analogue, e.g. for analogue NRCWE-041 “Type_Pristine”, “Type_COOH” and “Type_NH2” values were “0”, whereas the “Type_OH” column had the value “1”. The original “Type” column was then discarded.
The endpoint “genotoxicity” of this study was a binary column with values “1”, indicating an analogue as genotoxic, and “0” indicating an analogue as non-genotoxic. The endpoint values were taken from the Aschberger et al. study.11 Hence, according to the results of the in vitro chromosome aberration (micronucleus) analogues, NM-400, NM-402, NM-403 and NRCWE-006 were considered as toxic. In addition, results of in vivo DNA damage-comet assay indicated that NRCWE-044 and NRCWE-045 in different concentrations could result to DNA strand breaks, and these NMs were also considered as genotoxic.11
Finally, all features were scaled between values “0” and “1” (Table S2†). Scaling was performed by subtracting from each feature the lowest value and dividing it with the difference between highest and lowest values (min–max scaling).
Support Vector Machines (SVM) 34,37 is a method that aims to represent the data as points in the multi-dimensional space, mapped in a way that the samples of each class are divided by the widest possible gap. Given a training set, SVM constructs a set of hyperplanes that discern the analogues according to their class (in this case genotoxic or non-genotoxic). Finally, the hyperplane with the largest distance to the nearest training sample of any class is chosen for the classification.
Logistic Regression (LR) 34,38 is a statistical model that uses the “logistic function” in order to predict if a sample is labeled in class “0” or in class “1”. In LR, the logit function (log-odds) is calculated as a linear combination of the features and the values of this function can vary between 0 and 1. Then, by applying the logistic function the log-odds are converted to probability of class and also vary between 0 and 1. Finally, if the probability of a sample is higher than 0.5 the sample is labeled in class “1”, else it is tagged in class “0”.39
Naive Bayes (NB) 34,39 is a probabilistic model, based on the Bayes' theorem (eqn (1)), assuming independence among the features. A Gaussian NB model (GNB) is a Naive Bayes model where the likelihood of the features is assumed to follow a Gaussian distribution (eqn (2)).
(1) |
(2) |
Model | Hyperparameter | Description |
---|---|---|
RF | n_estimators | Number of trees in the forest |
Min_samples_split | Minimum number of observations required to split a node | |
Max_features | Maximum number of features to consider when looking for the best split of a node | |
LR | C | Inverse of regularization strength |
Penalty | Norm used in the penalization | |
SVM | C | Regularization parameter |
Gamma | Kernel coefficient | |
Kernel | Kernel type |
(3) |
hi = xTi(XTX)−1xi | (4) |
In the second workflow, the aim was to eliminate the “noisy” features, i.e. features which were not highly informative for predicting the endpoint of interest. The RFE method was used for selecting the most important variables.
Fig. 2 Diagram representing the percentage of the explained variance as a function of the principal components. |
For the selection of the testing samples, the KS algorithm31 was applied on the reduced dataset produced by the application of the PCA method. The following five analogues:
• NRCWE-040
• NRCWE-041
• NRCWE-048
• NM-401
• NM-402
were included in the test set, while the remaining 10 MWCNTs formulated the training set.
The RF, LR, SVM and NB statistical models were applied on the training set, in order to examine if the dimensional reduction process improves the prediction of genotoxicity endpoint. The hyperparameters of the RF, LR, SVM methods were optimized using the Bayesian optimization technique where the cross-validated accuracy score in the training set was the objective function that was minimised.40 The optimal parameters are presented in Table 4.
Model | Hyperparameter | Optimal values |
---|---|---|
RF | n_estimators | 9 |
Min_samples_split | 0.1025 | |
Max_features | 0.6983 | |
LR | C | 43.71 |
Penalty | L2 | |
SVM | C | 4.37 |
Gamma | — | |
Kernel | Linear |
The metrics of the models on the training set showed that all models were able to learn the underlying patterns of the training dataset, leading to an accuracy score of 1.0. The trained models were then used to predict the genotoxicity endpoint of the testing analogues. The performance of the produced models on the test set was examined using several classification metrics: confusion matrix, accuracy score, precision, specificity, sensitivity, F1-score and MCC metrics (eqn (S1)–(S7) of the ESI†). The accuracy score in a 4-fold cross validation test25 was used to evaluate the robustness of the models. The results are summarized in Table 5.
RF | LR | SVM | NB | |||||
---|---|---|---|---|---|---|---|---|
Accuracy | 0.800 | 0.800 | 0.800 | 0.600 | ||||
Precision | 0.000 | 0.50 | 0.500 | 0.000 | ||||
Sensitivity | 0.000 | 1.000 | 1.000 | 0.000 | ||||
Specificity | 1.000 | 0.750 | 0.750 | 0.750 | ||||
F1-score | 0.000 | 0.670 | 0.667 | 0.000 | ||||
MCC | 0.000 | 0.612 | 0.612 | −0.250 | ||||
Cross-validation | 0.833 | 0.542 | 0.708 | 0.625 | ||||
Confusion matrix | 4 | 0 | 3 | 1 | 3 | 1 | 3 | 1 |
1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 |
The values of the metrics indicated that NB was unable to predict correctly two out of five testing samples. In addition, the RF model classified all the testing samples as non-genotoxic indicating a possible overfit of the model. The rest of the models had an accuracy score of 0.8 due to a misclassification of a non-genotoxic MWCNT. However, this particular MWCNT (NRCWE-048) is considered to be slightly out the domain of applicability of the models, hence in can be neglected. The low cross validation metric indicated that the LR and NB models failed the cross-validation test. The RF and SVM models' high cross-validation scores illustrated that both these models are highly robust.
Model | Hyperparameter | Optimal values |
---|---|---|
RF | n_estimators | 10 |
Min_samples_split | 0.5 | |
Max_features | 0.1 | |
LR | C | 43.71 |
Penalty | L2 | |
SVM | C | 9.98 |
Gamma | 0.254 | |
Kernel | Sigmoid |
The metrics of the models on the training set showed that all models were able to learn the training dataset. In more details, the RF, LR and NB models classified all training samples correctly, whereas the SVM model misclassified one non-genotoxic MWCNT, leading to a 0.9 training accuracy score and 0.82 MCC score. The produced models were validated on the test-dataset using the same classification metrics that were presented in the first approach. The results are presented in Table 7.
RF | LR | SVM | NB | |||||
---|---|---|---|---|---|---|---|---|
Accuracy | 0.600 | 0.800 | 0.8000 | 0.600 | ||||
Precision | 0.0.333 | 0.500 | 0.333 | 0.333 | ||||
Sensitivity | 1.000 | 1.000 | 1.000 | 0.500 | ||||
Specificity | 0.500 | 0.500 | 0.750 | 0.500 | ||||
F1-score | 0.500 | 0.667 | 0.500 | 0.500 | ||||
MCC | 0.666 | 0.612 | 0.612 | 0.408 | ||||
Cross-validation | 0.708 | 0.458 | 0.917 | 0.458 | ||||
Confusion matrix | 2 | 2 | 3 | 1 | 3 | 1 | 2 | 2 |
0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |
The performance of all produced models was clearly poorer compared to the models trained on the reduced PCA dataset. None of the models managed to predict correctly all the testing samples, while cross validation scores were lower, with the exception of the SVM method. The reason for this poor performance was the existence of noisy and non-informative features in the training set. Therefore, we proceeded with eliminating these variables, by applying the RFE method.41 This method was applied on the RF and LR algorithms only, because these models include attributes that distinguish the most significant features. The significances of the features in the LR model are represented by their coefficient at the exponential of the decision function, whereas the significance of the features in the RF models is indicated by the Gini criterion.36 The importance of the features in the two models is presented graphically in Fig. 3.
The RFE iterative process41 was applied to exclude the less important features, by eliminating the features with the smallest importance in each iteration. The result of this process was the development of LR and RF models, which use only four or three features accordingly, as shown in Table 8.
Features for the LR model | Features for the RF model |
---|---|
Zeta average at 12.5 μg ml−1 | Zeta average at 12.5 μg ml−1 |
Length (average) | Length (average) |
Polydispersity index (batch) | Purity (%) |
Purity (%) | — |
The “Purity” of MWCNTs as well as the “Zeta average” values at a 12.5 μg ml−1 dose and the “Length” were considered as significant features for both models. The two models were further optimized by applying the Bayesian optimization method. The optimal values of the hyperparameters are displayed in Table 9.
Model | Hyperparameter | Optimal values |
---|---|---|
LR | C | 4.37 |
Penalty | L2 | |
RF | n_estimators | 19 |
Min_samples_split | 0.116 | |
Max_features | 0.666 |
The performance of these trained models was tested by classifying both the training and the corresponding testing analogues. Both models classified correctly all training samples, leading to an accuracy score equal to 1.0. The metric values corresponding to the test set are presented in Table 10.
RF | LR | |||
---|---|---|---|---|
Accuracy | 0.800 | 0.800 | ||
Precision | 0.500 | 0.500 | ||
Sensitivity | 1.000 | 1.000 | ||
Specificity | 0.750 | 0.750 | ||
F1-score | 0.666 | 0.666 | ||
MCC | 0.612 | 0.612 | ||
Cross-validation | 0.917 | 1.000 | ||
Confusion matrix | 3 | 1 | 3 | 1 |
0 | 1 | 0 | 1 |
The accuracy score, precision, recall and F1-metrics and the confusion matrices all indicated that both LR and RF models were able to classify correctly the majority of the testing samples. Among the two methods, LR had a higher cross-validation score and was more robust. To further evaluate the performance of the two models, we present in Table 11 the probability distributions of all predicted outputs over the set of two classes. The probabilities were similar for the RF and the LR models and not close to 0.5, which indicating that the successful classifications are not due to chance correlations.
RF | LR | |||
---|---|---|---|---|
MWCNT | Prob. of “0” class | Prob. of “1” class | Prob. of “0” class | Prob. of “1” class |
NRWCE-040 | 0.74 | 0.26 | 0.74 | 0.26 |
NRWCE-041 | 0.74 | 0.26 | 0.82 | 0.18 |
NRWCE-048 | 0.68 | 0.32 | 0.73 | 0.27 |
NM-401 | 0.31 | 0.69 | 0.36 | 0.64 |
NM-402 | 0.00 | 1.00 | 0.31 | 0.69 |
Next step of the workflow was the calculation of DOA23,42 according to the Leverage method for the models. The thresholds were estimated to be equal to 1.2 for the LR model and 0.9 for the RF model and the values of the testing analogues are displayed below in Table 12. According to this table, all test MWCNTs were clearly within the DOA of the model, which further supported our confidence on the predictions of the model.
Name | RF | LR | ||
---|---|---|---|---|
Leverage value | Reliability | Leverage value | Reliability | |
NRCWE-040 | 0.20 | Reliable | 0.20 | Reliable |
NRCWE-041 | 0.18 | Reliable | 0.36 | Reliable |
NRCWE-048 | 0.33 | Reliable | 0.43 | Reliable |
NM-401 | 0.48 | Reliable | 0.51 | Reliable |
NM-402 | 0.13 | Reliable | 0.19 | Reliable |
A number of models produced by the two workflows were highly successful with respect to the prediction accuracy criterion. The final LR and RF models were similar in terms of performance. The LR model's mean cross validation score is equal to 1.0 indicating that it may be considered as a slightly more robust model compared to the RF modes (mean cross validation score 0.917). However, the simplicity criterion was considered as a key parameter to select the best model to classify the genotoxicity of MWCNTs. Under this consideration, the RF that predicted the genotoxicity end point using only three input features was selected as the model that outperformed the rest with respect to all other criteria: highest cross validation score, high MCC score on the testing data, high probabilities of the predictions and use of only three features. This model was chosen as the most efficient model for predicting genotoxicity of MWCNTs. The model was finally trained on the full dataset, in order to take into account all the information in the available dataset. The final model predicted correctly the genotoxicity endpoint for all samples in the full dataset. The threshold that defines the DOA of the full model was 0.9.
In conclusion, this study exhibited the value of using chemoinformatic techniques to produce a reliable model for predicting genotoxicity of MWCNTs. In future studies the predictive power of the model can be improved by considering information and data on extrinsic properties of the surrounding medium (pH, serum proteins, ionic strength).46,47
Footnotes |
† Electronic supplementary information (ESI) available. See DOI: 10.1039/d0na00600a |
‡ Both authors contributed equally to this work. |
This journal is © The Royal Society of Chemistry 2021 |