Ali Kamran‡a,
Abdul Naman‡a,
Muhammad Irfan Majeed*a,
Haq Nawaz*a,
Najah Alwadie*b,
Noor ul Hudaa,
Umm-e-Habibaa,
Tania Tabussama,
Aqsa Banoa,
Hawa Hajaba,
Rabeea Razaqa,
Ayesha Ashrafa,
Saima Aziza,
Maria Asghara and
Muhammad Imranc
aDepartment of Chemistry, University of Agriculture Faisalabad, Faisalabad 38000, Pakistan. E-mail: irfan.majeed@uaf.edu.pk; haqchemist@yahoo.com
bDepartment of Physics, College of Science, Princess Nourah bint Abdulrahman University, P. O. Box 84428, Riyadh 11671, Saudi Arabia. E-mail: nhalwadie@pnu.edu.sa
cDepartment of Chemistry, Faculty of Science, King Khalid University, P. O. Box 9004, Abha 61413, Saudi Arabia
First published on 13th March 2024
The ability of surface-enhanced Raman spectroscopy (SERS) to generate spectroscopic fingerprints has made it an emerging tool for biomedical applications. The objective of this study is to confirm the potential use of Raman spectroscopy for early disease diagnosis based on blood serum. A total of sixty blood serum samples were used, forty from patients with tuberculosis and twenty (controls) from healthy individuals. Because disease biomarkers, found in the lower molecular weight fraction, are suppressed by higher molecular weight proteins, 50 kDa Amicon ultrafiltration centrifugation devices were used to separate whole blood serum into two fractions: a filtrate, which is the low molecular weight fraction, and a residue, which is the high molecular weight fraction. These fractions were then analyzed, and their SERS spectral data were compared with those of the corresponding healthy fractions. Using silver nanoparticles as the SERS substrate, the technique was applied to the blood serum, filtrate and residue of patients with tuberculosis to identify characteristic SERS spectral features associated with the development of the disease, which can be used to differentiate these samples from healthy ones. Principal component analysis (PCA) was then used to qualitatively differentiate all the analyzed samples based on their SERS spectral features. Partial least squares discriminant analysis (PLS-DA) classified the filtrate portions of healthy and tuberculosis samples with 97% accuracy, 97% specificity, 98% sensitivity, and an area under the receiver operating characteristic (AUROC) curve of 0.74.
The tuberculin skin test (TST) and interferon gamma release assay (IGRA) are used as medical tests for the diagnosis of tuberculosis disease. Different biomedical tests are used to detect Mycobacterium tuberculosis, including sputum smear microscopy,5 nucleic acid amplification test (NAAT),6 enzyme-linked immunosorbent assay (ELISA),7 chest radiograph,8 biosensor technique,9 acid-fast bacilli test,10 different DNA biomarkers and polymerase chain reaction.11 These tests are time-consuming, laborious and require costly consumables.
Raman spectroscopy is used for the early detection and identification of molecular components related to disease. Vibrational spectroscopy has been widely used to analyze and understand the processes involved in disease development. Raman spectroscopy of bodily fluids, notably blood serum, can be used to identify the chemical fingerprints of samples.12,13 It is an efficient qualitative and quantitative analytical tool for rapidly obtaining critical information.14 Although Raman spectroscopy is rapid, inexpensive, and precise, its signal intensity is often weak, which makes it difficult to detect low-concentration components of biofluids such as serum, plasma, joint submucosa, and lymph.15 In SERS, these weak Raman signals are enhanced by using nanoparticles as substrates, which improves the ability to detect components present at low concentrations in biofluids.16 SERS has previously been used to characterize and compare different types of TB disease, including pulmonary and extra-pulmonary TB, using healthy and diseased blood samples.2,17
Blood serum contains high molecular weight fraction (HMWF) and low molecular weight fraction (LMWF) proteins. The LMW protein fractions are thought to be tuberculosis disease biomarkers, but the analysis of human whole blood serum for the detection of TB disease-specific biomarkers is difficult because human blood serum contains large molecular weight proteins such as albumin and globulin that hinder the detection and identification of smaller molecular weight proteins.18
The current study focuses on the SERS analysis of serum filtrates obtained from healthy individuals and patients with TB using 50 kDa filtration devices. This can assist in identifying characteristic SERS spectral signatures linked to protein molecules smaller than the 50 kDa cutoff of the filter, and hence bypass the features related to larger proteins. Thus, the blood serum samples from patients with TB can be identified and differentiated based on their distinct SERS spectral features. To the best of our knowledge, no studies have yet been published on this particular topic.
All blood serum samples were collected from PINUM Hospital, Faisalabad. All serum samples were collected from males 45 to 50 years of age, and the healthy samples came from males without any co-morbidity or participation in any therapy. The mean age across all samples was 47.71 years, with a standard deviation of 1.01 years. Thus, the controls and patients were of the same age group and sex. A total of 60 samples were centrifuged, consisting of 20 healthy samples and 40 disease samples. This study was approved by the Bioethical Committee of the University of Agriculture Faisalabad, Pakistan.
PLS-DA is a supervised methodology that plays a vital role in statistical modeling. It is used to examine the relationship between two distinct sets of variables: a set of independent variables (here, the SERS spectral intensities) and a set of dependent variables (here, the class labels). PLS-DA finds the linear combinations of the variables that best explain the variation in the data, so that the data can be classified into different groups or classes. Monte Carlo simulation can be used to estimate the uncertainty in PLS-DA results by repeatedly drawing random samples from the data. PLS-DA is performed on each draw, which provides a range of possible outcomes and the likelihood of certain results.
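The combination of PLS-DA and Monte Carlo resampling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses synthetic data in place of SERS spectra, a single-response PLS fitted with the NIPALS algorithm, a 0.5 decision threshold on the predicted 0/1 label, and a 70/30 random split repeated 100 times. The function names (`pls1_fit`, `plsda_predict`, `monte_carlo_accuracy`) and all parameter values are hypothetical.

```python
import numpy as np


def pls1_fit(X, y, n_components=2):
    """Fit a single-response PLS model with NIPALS; y is coded 0/1."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                 # direction of maximal covariance with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:              # nothing left to explain
            break
        w = w / norm
        t = Xk @ w                    # scores of the latent variable
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        c = (yk @ t) / tt             # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - c * t               # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)   # regression vector
    return beta, x_mean, y_mean


def plsda_predict(X, beta, x_mean, y_mean):
    """Assign class 1 when the predicted response exceeds 0.5 (assumed cutoff)."""
    return ((X - x_mean) @ beta + y_mean) > 0.5


def monte_carlo_accuracy(X, y, n_repeats=100, test_fraction=0.3, seed=0):
    """Repeat random train/test splits and collect the test accuracy of each."""
    rng = np.random.default_rng(seed)
    n_test = int(round(test_fraction * len(y)))
    accuracies = []
    for _ in range(n_repeats):
        idx = rng.permutation(len(y))
        test, train = idx[:n_test], idx[n_test:]
        model = pls1_fit(X[train], y[train])
        predicted = plsda_predict(X[test], *model)
        accuracies.append(np.mean(predicted == (y[test] == 1)))
    return np.array(accuracies)


# Synthetic stand-in for spectra: 20 "healthy" (0) and 40 "disease" (1) rows,
# 200 variables, with one band (columns 40-59) raised in the disease class.
rng = np.random.default_rng(1)
y = np.r_[np.zeros(20), np.ones(40)]
X = rng.normal(0.0, 0.1, size=(60, 200))
X[y == 1, 40:60] += 1.0

accuracies = monte_carlo_accuracy(X, y)
print(f"mean accuracy {accuracies.mean():.3f}, std {accuracies.std():.3f}")
```

The spread of `accuracies` over the repeats is what gives the range of possible outcomes; with real spectra, the number of latent variables and the split ratio would need to be chosen and reported explicitly.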
The receiver operating characteristic (ROC) curve is a visual representation of the performance of a binary classifier as the decision threshold is adjusted. It illustrates how the true positive rate (sensitivity) and false positive rate (1 − specificity) of the classifier change with varying thresholds. The ROC curve is summarized by the area under the curve (AUC), which quantifies the overall effectiveness of the classifier. A perfect classifier would have an AUC of 1, indicating ideal discrimination ability, while an AUC of 0.5 signifies a classifier that performs no better than random guessing.
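As an illustration of the definition above, the following sketch builds a ROC curve by sweeping the threshold over the classifier scores and integrates it with the trapezoidal rule. The scores and labels are made-up values for demonstration only, and the function names are hypothetical.

```python
import numpy as np


def roc_curve(scores, labels):
    """Return (fpr, tpr) obtained by lowering the threshold through the scores.

    labels: 1 for the positive class, 0 for the negative class.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores, kind="stable")      # highest score first
    scores, labels = scores[order], labels[order]
    # Keep one ROC point per distinct score so tied scores move together.
    last_of_tie = np.r_[np.diff(scores) != 0, True]
    tps = np.cumsum(labels)[last_of_tie]            # true positives so far
    fps = np.cumsum(1 - labels)[last_of_tie]        # false positives so far
    tpr = np.r_[0.0, tps / labels.sum()]
    fpr = np.r_[0.0, fps / (1 - labels).sum()]
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


# Made-up example: three positives and three negatives, one pair mis-ranked.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]
labels = [1, 1, 0, 1, 0, 0]
fpr, tpr = roc_curve(scores, labels)
example_auc = auc(fpr, tpr)        # 8 of the 9 positive/negative pairs are ranked correctly
perfect_auc = auc(*roc_curve([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))
```

The AUC equals the fraction of positive/negative pairs in which the positive sample receives the higher score, which is why a perfectly separating classifier gives 1 and random scoring gives about 0.5.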
Fig. 1 SERS mean spectra of uncentrifuged TB serum, and the filtrate and residue portions of TB serum.
SERS bands | SERS peak assignment | Components | References |
---|---|---|---|
515–592 cm−1 | Phosphatidylinositol | Lipids | 22 and 23 |
640 cm−1 | Uric acid | Uric acid | 24 |
650 cm−1 | Tyrosine C–C twisting mode | Proteins | 23 |
681 cm−1 | Resonant ring vibrations within the DNA bases | DNA/RNA | 25 |
708 cm−1 | Cholesterol ester | Lipids | 26 |
728 cm−1 | Hypoxanthine | Hypoxanthine | 24 |
743 cm−1 | Symmetric breathing of tryptophan | Protein | 23 |
778 cm−1 | Cytosine/uracil ring breathing (nucleotides) | DNA/RNA | 27 |
802 cm−1 | Thymine-based DNA/RNA bases exhibiting ring-breathing mode | DNA/RNA | 23 |
815 cm−1 | C–O–C stretching vibration | Protein | 28 |
838 cm−1 | Glucose saccharide α-anomers exhibiting an α-saccharide or α-band | Carbohydrates | 16 |
889 cm−1 | Uric acid | Uric acid | 24 |
918 cm−1 | Proline, hydroxyproline, glycogen, and lactic acid | Proteins | 25 |
945 cm−1 | Stretching vibrations of single bonds in amino acids, including proline and valine | Proteins | 4 |
1008 cm−1 | Pectin, phenylalanine | Proteins | 13 |
1028 cm−1 | Stretching of methoxy groups (O–CH3) | DNA/RNA | 29 |
1089 cm−1 | Phospholipids | Lipids | 29 |
1099 cm−1 | Phospholipids | Lipids | 30 |
1133 cm−1 | C–C stretching mode of lipids | Lipids | 27 |
1173 cm−1 | C–H bending mode of tyrosine | Proteins | 31 |
1203 cm−1 | Uric acid | Uric acid | 24 |
1273 cm−1 | Adenine ring-breathing mode | DNA | 32 |
1288 cm−1 | Cytosine ring-breathing mode in RNA | DNA/RNA | 32 |
1314 cm−1 | Guanine base | DNA/RNA | 33 and 34 |
1359 cm−1 | Guanine (N7, B, Z-marker) | DNA | 16 |
1401 cm−1 | Thymine | DNA | 35 |
1448 cm−1 | C–H vibration | Proteins | 27 and 36 |
1494 cm−1 | DNA base ring-breathing modes | DNA | 37 |
1540 cm−1 | Aromatic hydrogen, amide carbonyl (CO) vibrations | Proteins | 38 |
1689 cm−1 | Amide I (non-hydrogen bonded; disordered structure) | Proteins | 39 |
1786 cm−1 | Lipid content | Lipids | 40 |
In Fig. 1, three different types of SERS peak features are identified, which are denoted by solid, dotted, and dotted-dashed lines. The peaks that are present solely in uncentrifuged/whole serum are denoted by dotted-dashed lines and include 640, 708, 1089, 1173, 1203 and 1288 cm−1. The peaks that are present only in the filtrate and residue portions are denoted by solid lines and include 650, 728, 743, 778, 802, 1099, 1133, 1273, 1359, 1401, 1494 and 1540 cm−1. The features common to the filtrate and residue portions and the whole serum are shown by dotted lines and include 515, 592, 681, 838, 889, 945, 1028 and 1786 cm−1. These SERS spectral features are the biomarkers that differentiate the filtrate and residue portions from the uncentrifuged/whole serum samples.
The bands found only in uncentrifuged serum are assigned to uric acid (640 cm−1), cholesterol ester (708 cm−1), phospholipids (1089 cm−1), the C–H bending mode of tyrosine (1173 cm−1), uric acid (1203 cm−1) and the cytosine ring-breathing mode in RNA (1288 cm−1). After centrifugation of the serum samples through the 50 kDa filter, biomolecules larger than 50 kDa were removed from the filtrate. The SERS peaks of the biomolecules remaining in the filtrate portion were compared with those of the uncentrifuged samples to identify the differentiating SERS features.
Fig. 2 Mean SERS spectra of filtrate portions of healthy and tuberculosis positive blood serum samples with standard deviation.
Some peaks present in the filtrate portions of both healthy and disease samples, denoted by dotted and dotted-dashed lines, show increasing and decreasing peak intensities, respectively. The SERS spectral peak at 592 cm−1, which is associated with lipids (phosphatidylinositol), is present at a higher intensity in the disease filtrate than in the healthy filtrate portion, while the peaks at 640, 728, 1089, 1099, 1133, 1359 and 1448 cm−1 were higher in intensity in the healthy filtrates than in the disease filtrate portions.
The SERS spectral peaks of the tuberculosis filtrate include 515 (phosphatidylinositol), 681 (ring-breathing patterns in DNA bases), 708 (cholesterol ester), 802 (thymine-based ring-breathing mode of DNA/RNA base), 838 (glucose saccharide α-anomers exhibiting an α-saccharide or α-band), 945 (stretching vibrations of amino acids), 1028 (stretching of O–CH3), 1314 (guanine base of DNA), 1540 (aromatic hydrogen, amide carbonyl with (CO) vibrations) and 1786 cm−1 (C–C stretching of lipids). These are the major biomarkers of the filtrate portion from patients with tuberculosis. Other SERS peaks that are only present in healthy filtrates were observed at 547 (lipids), 743 (symmetric breathing of tryptophan), 815 (C–O–C stretching vibration), 1008 (pectin, phenylalanine) and 1099 cm−1 (palmitic acid). SERS peaks at 889 and 1203 cm−1 are associated with uric acid.
Some SERS spectral features were present with higher intensity in the disease filtrate portions (which are denoted by dotted-dashed lines) as compared to healthy filtrates of the serum, such as 592 cm−1, which is associated with lipids. Other features of the SERS spectra with higher intensity of peaks in the healthy filtrates as compared to disease filtrates are observed at 640 (uric acid), 728 (hypoxanthine), 889 (uric acid), 1089 (phospholipids), 1099 (palmitic acid), 1133 (palmitic acid), 1203 (uric acid), 1359 (guanine) and 1448 cm−1 (C–H vibration). The original SERS spectra of some diseased and healthy/control samples are shown in Fig. S1 and S2,† respectively.
Fig. 3(a) shows the PCA scatter plot of the SERS spectra of filtrates of tuberculosis and healthy samples. Green dots represent the spectra of healthy filtrate portions, while pink dots show the spectra of filtrates of disease samples. The green dots cluster on the positive side of the x-axis, while the pink dots cluster on the negative side. PC-1 (first principal component) on the x-axis, which separates the two groups, explains 70.92% of the variability in the dataset, and PC-2 (second principal component) explains 9.78%. These results show that the diseased and healthy samples are clustered separately from each other, which indicates distinct and significant SERS spectral differentiation between them. PCA only serves to visualize the data; it does not provide any quantitative information about the variation between the separately clustered groups.
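The quantities discussed here (scores, loadings, and the percentage of variability explained by each principal component) can be obtained from a singular value decomposition of the mean-centered data matrix. The sketch below is illustrative only: the data are synthetic, the `pca` helper is hypothetical, and no spectral preprocessing is applied.

```python
import numpy as np


def pca(X, n_components=2):
    """PCA by SVD of the mean-centered data (rows = spectra, columns = variables)."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)            # fraction of variance per component
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components]               # one row of loadings per component
    return scores, loadings, explained[:n_components]


# Synthetic stand-in: 20 "healthy" and 40 "disease" rows differing in one band.
rng = np.random.default_rng(0)
is_tb = np.arange(60) >= 20
X = rng.normal(0.0, 0.1, size=(60, 200))
X[is_tb, 40:60] += 1.0

scores, loadings, explained = pca(X)
print("variance explained by PC-1 and PC-2 (%):", np.round(100 * explained, 2))
print("mean PC-1 score, healthy:", scores[~is_tb, 0].mean())
print("mean PC-1 score, disease:", scores[is_tb, 0].mean())
```

Because the sign of each principal component is arbitrary, which group falls on the positive side of PC-1 can flip between software packages; only the separation of the groups and the matching signs of scores and loadings are meaningful.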
Fig. 3 Pair-wise PCA analysis. (a) Scatter plot and (b) loadings between SERS spectral datasets of filtrate portions of healthy and tuberculosis disease samples.
Fig. 3(b) shows the loadings between the SERS spectral bands of filtrates of disease and healthy samples. The negative loadings correspond to the spectra clustered on the negative side of the axis in the scatter plot, while the positive loadings correspond to the spectra clustered on the positive side. The negative loadings include 547 (lipids), 640 (uric acid), 728 (hypoxanthine), 815 (C–O–C stretching vibration), 889 (uric acid), 1089 (phospholipids), 1133 (palmitic acid), 1203 (uric acid), 1273 (ring-breathing mode of adenine), 1359 (guanine (N7, B, Z marker)), 1448 (C–H vibration) and 1689 cm−1 (amide-I). The loadings of TB filtrates on the positive side of PC-1 include the SERS bands of 515 (phosphatidylinositol), 592 (phosphatidylinositol), 802 (thymine-based ring-breathing mode (DNA/RNA)), 838 (α-saccharide, α-anomers of glucose), 945 (stretching vibrations of single bonds in amino acids, including proline and valine), 1028 (stretching of (O–CH3)), 1401 (thymine), 1540 (aromatic hydrogen, amide carbonyl vibrations) and 1786 cm−1 (C–C stretching of lipids).
SERS spectral bands are strongly influenced by the removal of larger proteins from biological samples, such as blood serum, through filtration with a 50 kDa filter. Separating the major high molecular weight proteins from the lower molecular weight proteins using a cellulose membrane increases the efficiency of SERS detection, which leads to greater differentiation of the SERS spectral bands associated with healthy and tuberculosis samples.
The presence of phosphatidylinositol in disease filtrates and its absence in healthy filtrates, as indicated by peaks observed at 519, 708, and 1786 cm−1, suggests that the bacteria causing tuberculosis produce this molecule, and that it could potentially play a pathogenetic role. Phosphatidylinositol (PI) is a phospholipid that is commonly found in the plasma membranes of eukaryotic cells, as well as in some bacteria, and it plays a role in a range of cellular processes, including signal transduction, membrane trafficking, and autophagy. PI(4)P is a specific form of PI that is involved in regulating these processes. The peaks at 889, 1089, 1099 and 1133 cm−1 were present in the filtrate portion of serum samples from both healthy and diseased patients. A common peak at 592 cm−1 was also observed, but it was higher in intensity in the disease filtrate portion due to the presence of phosphatidylinositol.
The SERS peaks at 945 and 1540 cm−1, which are associated with proline (protein), were only present in the disease filtrate portion and were absent in healthy filtrate samples. The tuberculosis bacterium secretes various proteins that assist in bacterial survival and replication within host cells. These proteins are involved in various processes such as nutrient acquisition, virulence, and immune evasion. For example, one protein called ESAT-6 is known to assist the tuberculosis bacterium in escaping from immune cells and establishing infection in the host. The SERS peak appearing at 815 cm−1 is specific to healthy serum filtrate and is related to proteins. The other peaks are present in both healthy and disease filtrate portions, but are higher in intensity in healthy samples. These peaks correspond to various protein assignments and vibrations, such as the symmetric breathing of tryptophan (743 cm−1) and the C–H vibration (1448 cm−1). Some SERS spectral features of DNA/RNA that appear in the filtrate portion of disease serum include 681, 862, 1028, and 1314 cm−1.
Fig. 4(a) shows the PLS-DA score plot between SERS spectra of healthy and disease filtrate samples. The PLS-DA model was used to classify the samples into two different categories based on their characteristic SERS spectral features. In the score plot, the cluster of green dots represents healthy filtrate samples on the positive axis, while the disease filtrate samples are represented by pink dots and are present on the negative axis. Fig. 4(b) shows the receiver operating characteristic (ROC) curve, which is a measure of the performance of the PLS-DA model. The area under the curve (AUC) value of 0.74 indicates very good performance of the model. An AUC value close to 1 represents high accuracy and validity of the model, while an AUC value of 0.5 or below indicates that the model performs no better than random guessing, and the results may be invalid or false.
Table 2 provides the values of specificity, sensitivity, precision, and accuracy, which are important metrics for evaluating the performance of a classification model. These metrics provide insights into the ability of the model to correctly classify the SERS spectra of different samples. A PLS-DA score plot of the 60 samples, obtained by averaging the 10 spectra of each sample so that one dot represents one sample, is shown in Fig. S3.†
Parameters of the PLS-DA model | Values |
---|---|
Sensitivity | 0.9874 |
Specificity | 0.9795 |
Precision | 0.9481 |
Accuracy | 0.9711 |
AUC | 0.74 |
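For reference, the four metrics in Table 2 follow from the counts of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN). The sketch below uses made-up labels, not the study's predictions, assumes the tuberculosis class is coded as the positive class (1), and uses a hypothetical function name.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, precision and accuracy for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),   # positives correctly detected
        "specificity": tn / (tn + fp),   # negatives correctly rejected
        "precision": tp / (tp + fp),     # predicted positives that are real
        "accuracy": (tp + tn) / len(pairs),
    }


# Made-up example: 5 positive and 5 negative samples with one error of each kind.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

With unequal class sizes, as in a 40-to-20 patient-to-control design, accuracy is dominated by the larger class, so sensitivity and specificity are usually read alongside it.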
To demonstrate the efficiency of SERS, advanced multivariate techniques such as principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were employed. PCA was used to qualitatively distinguish between the SERS datasets of healthy and tuberculosis samples. The quantitative approach enables precise discrimination of tuberculosis-positive samples based on the SERS spectral data. The PLS-DA model further enhances the categorization of filtrate fractions from healthy individuals and patients with tuberculosis on the basis of their SERS spectra. The model exhibits high accuracy (0.9711), precision (0.9481), sensitivity (0.9874), and specificity (0.9795), and thus enables precise disease diagnosis based on the SERS data.
Footnotes
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4ra00420e
‡ The first two authors contributed equally.
This journal is © The Royal Society of Chemistry 2024