Haobo Guo ab, Pooria Lesani acde, Hala Zreiqat ac and Elizabeth J. New *bcf
aSchool of Biomedical Engineering, The University of Sydney, Sydney, NSW 2006, Australia
bSchool of Chemistry, The University of Sydney, Sydney, NSW 2006, Australia. E-mail: elizabeth.new@sydney.edu.au
cThe University of Sydney Nano Institute, Sydney, NSW 2006, Australia
dSchool of Science, STEM College, RMIT University, Melbourne, VIC 3000, Australia
eKoch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
fAustralian Research Council Centre of Excellence for Innovations in Peptide and Protein Science, The University of Sydney, Sydney, NSW 2006, Australia
First published on 12th November 2024
In this study, we present a sensor array for precise pH monitoring, based on carbon dots (CDs) synthesised from fructose and p-phenylenediamine through a one-step hydrothermal method. The CDs exhibited significant photostability and different fluorescence emissions in different pH conditions, making them suitable for real-time pH sensing across a wide pH range of 3 to 10. Our mechanistic studies revealed how surface functionalisation affects the pH response. For statistical analysis of our array data, we employed Gaussian process regression to create a predictive model for determining pH levels of test samples based on the array spectral data, and linear discriminant analysis for high-precision classification of pH values. This combined approach enabled a comprehensive analysis of the CDs' pH-sensitive fluorescence, demonstrating a significant methodological advancement in pH sensor technology. Our research advances the understanding of the fluorescence mechanisms of CDs in response to pH variations. Furthermore, our study demonstrates the power of integrating machine learning techniques to improve the performance and application scope of fluorescent nanomaterials. These findings have important implications for chemical sensing across a range of fields, including environmental monitoring and biomedical diagnostics.
A broad range of analytical tools has been developed to measure pH within complex environments, including H+ permeable microelectrodes,13 advanced nuclear magnetic resonance (NMR) techniques,14 and fluorescent sensors.15 The latter are advantageous as they do not require physical breaching of cell membranes, and they can be coupled to inexpensive and readily available instruments for quantifying and imaging fluorescence. A broad range of pH-responsive fluorescent sensors has been reported to date, including those based on fluorescent proteins,16 organic fluorophores,17 and fluorescent nanomaterials.18 The protonatable groups of such sensors typically have a working range within ±1 pH unit of the pKa, limiting the working range of the final assay to less than 2 pH units.
One approach to address this limitation is to simultaneously use multiple sensors with different pKa values, and therefore different pH working ranges. This can be achieved through the use of a sensor array, built by combining multiple sensors, each with specific properties, into a single platform.19 Array-based sensing offers several advantages, including increased selectivity and sensitivity, and reduced interference from extraneous substances.6,20,21 This versatility enables the detection of a wide range of pH values and the ability to tailor the sensor array for specific applications or environments. The use of array sensing technologies in pH sensing applications can lead to enhanced accuracy compared to traditional methods. This improvement in accuracy is crucial for fields where precise pH measurements are essential, such as environmental monitoring and biomedical applications. This approach has been elegantly demonstrated by Kim and co-workers, who used a library of 30 fluorescent sensors to achieve pH determination within 0.2 pH units.22
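The working-range argument above can be sketched numerically. The following is a minimal illustration, assuming each sensor behaves as a single monoprotic group described by the Henderson–Hasselbalch relation and that "working range" means 9–91% deprotonation (roughly pKa ± 1). The function names and the pKa values 4, 6 and 8 are hypothetical and are not taken from this work.

```python
def fraction_deprotonated(ph, pka):
    """Henderson-Hasselbalch fraction of a monoprotic group in its deprotonated form."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def in_working_range(ph, pka, low=0.09, high=0.91):
    """True when the group is between ~9% and ~91% deprotonated (about pKa +/- 1)."""
    return low <= fraction_deprotonated(ph, pka) <= high


def covered_ph(pkas, start=0.0, stop=14.0, step=0.1):
    """pH grid points at which at least one sensor is inside its working range."""
    n = int(round((stop - start) / step))
    grid = [round(start + i * step, 1) for i in range(n + 1)]
    return [ph for ph in grid if any(in_working_range(ph, pka) for pka in pkas)]


single = covered_ph([6.0])            # one hypothetical sensor
array = covered_ph([4.0, 6.0, 8.0])   # three hypothetical sensors with staggered pKa values
print(single[0], single[-1])  # 5.0 7.0
print(array[0], array[-1])    # 3.0 9.0
```

Under these assumptions a single sensor spans about 2 pH units, whereas staggered pKa values tile a wider window, which is the motivation for array-based sensing.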
We were interested in developing an array with fewer sensor elements, in order to increase the efficiency of synthesis and array application. We therefore aimed to develop a new sensor array based on carbon dots (CDs) due to their previously reported pH-responsive properties.23–26 Traditional pH indicators, such as dyes and organic fluorophores, face limitations including reduced stability, susceptibility to photobleaching, and potential cytotoxicity.27 Conversely, CDs have emerged as a superior alternative, renowned for their robust fluorescence, exceptional photostability, low toxicity, and facile synthesis.28–30 As small carbon nanoparticles, CDs display unique optical properties due to quantum confinement effects, surface modifications, and diverse carbonaceous compositions.31–33 These features render CDs ideal for pH sensing applications, as they can significantly alter their emission wavelength and intensity in response to environmental pH changes, thus serving as effective fluorescent probes for pH differentiation across a spectrum of applications.23,34 Recent research has focused on exploring the fluorescence mechanisms of CDs.35–37 Understanding these mechanisms is crucial for several reasons: it paves the way for enhancing their fluorescent properties, facilitates the creation of novel CDs with tailored features, and aids in developing standardised methodologies for their characterisation. In this paper, we focused on a mechanistic study of the impact of surface functionality on the pH response of CDs.
Here we synthesised CDs from fructose and p-phenylenediamine precursors. After determining the surface groups contributing to pH sensing, we meticulously analysed the pH sensing capabilities exhibited by the CDs array using two advanced statistical methodologies: Gaussian process regression (GPR) and linear discriminant analysis (LDA).
Attenuated total reflectance (ATR) spectroscopy provided structural insights into the five synthesised CDs. CD100, synthesised from fructose alone, displayed bands at 3300, 2970, 1660, 1405, and 1050 cm−1, corresponding to the stretching vibrations of O–H, C–H, C=O, COO–, and C–O respectively (Fig. S2†). Introduction of some nitrogen doping, in CD92 and CD69, gave rise to an additional peak at approximately 1500 cm−1, corresponding to N–H. CD43 and CD25 had the highest N-doping levels, and therefore exhibited strong ATR peaks corresponding to N–H (1510 cm−1), as well as peaks corresponding to C–N/C–O (1390, 1260, and 1070 cm−1).
X-ray photoelectron spectroscopic (XPS) analysis of CDs (Fig. 1 and S3 and Tables S2–S5†) gave insight into the surface functionality of the particles. CD100 showed 67.5% carbon (C–C/C=C, C–O, and O–C=O) and 32.5% oxygen (–OH and =O), in agreement with the ATR results (Fig. S2†). As the degree of N-doping increased from CD92 to CD25, the nitrogen surface content determined by XPS increased from 5.5% to 17.13%, with a concomitant decrease in the oxygen atomic percentage (from 26.28% to 5.81%). This corresponds to a decrease in the proportion of –OH groups and a corresponding increase in the proportion of amino groups.
UV-visible spectroscopy gave further insight into the composition of the CDs (Fig. 2). CD100 shows a predominance of oxygen-containing functional groups, such as hydroxyl and carboxyl groups, evidenced by strong absorption between 290–310 nm, corresponding to aromatic or graphitic π–π* transitions,39,40 and a secondary peak between 320–350 nm, corresponding to n–π* transitions of C=O groups,41 suggesting significant oxidation.42 As the N-doping level increased from CD92 to CD25, there was an absorption increase around 350–400 nm, which was related to the π–π* transitions of the aromatic C=C bonds and n–π* transitions of C=N heterocycles. This shift was evidenced by a lower intensity in the 290–310 nm range and diminished n–π* transitions in the 320–350 nm range.
CD | Excitation maximum (nm) | Emission maximum (nm) | Stokes shift (nm) |
---|---|---|---|
CD100 | 360 | 510 | 150 |
CD92 | 393 | 510 | 117 |
CD69 | 369 | 530 | 161 |
CD43 | 354 | 525 | 171 |
CD25 | 359 | 545 | 186 |
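The Stokes shift column above is simply the emission maximum minus the excitation maximum. A short sketch using the tabulated values reproduces it. The wavenumber conversion is an added convenience that is not reported in the table, and the function names are illustrative.

```python
# Excitation and emission maxima (nm) as listed in the table above.
SPECTRA_NM = {
    "CD100": (360, 510),
    "CD92": (393, 510),
    "CD69": (369, 530),
    "CD43": (354, 525),
    "CD25": (359, 545),
}


def stokes_shift_nm(excitation_nm, emission_nm):
    """Stokes shift expressed as a wavelength difference in nm."""
    return emission_nm - excitation_nm


def stokes_shift_cm1(excitation_nm, emission_nm):
    """Stokes shift expressed in wavenumbers (cm^-1), using 1e7 / wavelength in nm."""
    return 1e7 / excitation_nm - 1e7 / emission_nm


shifts = {name: stokes_shift_nm(ex, em) for name, (ex, em) in SPECTRA_NM.items()}
print(shifts)
```

Expressing the shift in wavenumbers is often preferable when comparing samples with different excitation maxima, because equal wavelength differences do not correspond to equal energy differences.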
In addition to variation in fluorescence wavelengths, total emission intensity also varies across the series of CDs (Fig. 3(f)). As the proportion of the nitrogen source PPD increased during hydrothermal synthesis, the fluorescence intensity of the CDs initially rose and then declined, with CD43 exhibiting the highest fluorescence intensity.
Thus, the effect of nitrogen doping on the fluorescence intensity of CDs was not straightforwardly linear. It involved a complex interplay of factors that could either enhance or quench fluorescence, depending on the specific levels of doping and the environmental conditions.47
The emission and absorption spectra of the CDs (Fig. S5, S6 and S9 and Table S6†) demonstrate a distinct dependency on pH levels, with clear shifts in peak positions and variations in intensity. For instance, CDs with a higher oxygen content, such as CD100, consistently exhibited emission peaks around 500 nm at low pH, accompanied by Stokes shifts ranging from 85 to 160 nm. On the other hand, CDs with a higher nitrogen content, like CD25, presented higher emission peaks at lower pH levels, with Stokes shifts ranging from 130 to 190 nm. These differences implied that the fluorescence properties of CDs were significantly affected by their oxygen and nitrogen content, which in turn influenced their sensitivity and stability across various pH levels. Specifically, higher nitrogen content appears to stabilise both the emission and excitation wavelengths over a broader pH range. Conversely, CDs with a higher oxygen content displayed more pronounced shifts, particularly in their emission peaks. These findings highlight the potential use of CDs in pH sensing applications. They demonstrate how variations in oxygen and nitrogen content could be leveraged to tune the fluorescence properties of CDs for specific sensing needs.
We found a clear transition from pH turn-on to pH turn-off response with increased N-doping (Fig. 4). The fluorescence spectra of the synthesised CDs over the full pH range are presented in Fig. S6, and naked-eye observations of the CDs at different pH values are presented in Fig. S10.† CD100 exhibits turn-on behaviour with increasing pH (Fig. 4a). We hypothesise that this pH response is dominated by surface –OH groups. In comparison, CD25, which is rich in amino groups, presents a turn-off fluorescence response with increasing pH (Fig. 4e). The intermediate CDs, CD92, CD69 and CD43, show a turn-on pH response in acidic environments and a turn-off response in basic environments (Fig. 4b–d). This can be explained by the competition between hydroxyl groups, which dominate in acidic environments, and amino groups, which dominate in basic environments.
The CDs retained their pH sensitivity even after three months in solution at room temperature without protection from sunlight (Fig. S7†). Furthermore, the reversibility of response was confirmed through alternating addition of sodium hydroxide and hydrochloric acid to CD100 and CD25 (Fig. S8†).
The oxygen content of each CD (determined both by precursor ratio and by XPS characterisation) is well-correlated with the pH value at which the highest fluorescence intensity was observed (Fig. 5). This is consistent with our hypothesis that the surface functional groups influence the pH responsiveness of CDs.
To validate our hypothesis, we masked the hydroxyl groups on CD100 and CD69 using TBS protection (Fig. 5(a)), and the amino groups on CD69 and CD25 using Boc protection (Fig. 5(b)). This approach enabled us to investigate the specific impact of hydroxyl and amino groups on the fluorescence and pH response of the CDs. For CD100, masking the –OH groups using TBS protection resulted in significant fluorescence quenching, and very little pH sensitivity (Fig. 5(c)). This result supports our hypothesis that the pH turn-on response of CD100 can largely be attributed to the presence of –OH groups.
For CD25, masking the –NH2 groups led to a significant enhancement in fluorescence, but a dampening of the pH-responsive behaviour (Fig. 5(d)). The fluorescence enhancement may be due to the fact that amino groups themselves are fluorescence quenchers through surface passivation, and through non-radiative pathways for energy dissipation.48,49 The turn-off performance of CD25 decreased by 55%, confirming our hypothesis that –NH2 groups were responsible for the CDs' pH turn-off response and limited the CDs' fluorescence.
CD69 exhibited a pH response pattern characterised by fluorescence turn-on in the pH range of 3–7 and turn-off in the pH range of 7–10, as displayed in Fig. 5(e). This behaviour was due to its surface functionalisation, which contained considerable amounts of both –OH and –NH2 groups. For CD69, the original sample had a peak fluorescence value at pH 6.525, as determined from plot fitting results. After masking the –OH groups, the peak shifted to the acidic range at pH 4.816, and the fluorescence was clearly quenched compared to the original CD69. Conversely, after masking the –NH2 groups, the peak shifted to the basic range at pH 8.163, and the fluorescence was enhanced.
These results confirmed that hydroxyl groups promoted the transformation of CDs into pH-sensitive ‘turn-on’ sensors, while amino groups drove CDs to function as pH-sensitive ‘turn-off’ sensors. The masking experiments validated our hypothesis and provided a deeper understanding of the mechanisms underlying the pH responsiveness of CDs.
Principal component analysis (PCA) was used to analyse feature sets, such as those obtained by adding sensors to a standard configuration, and to understand how the different features compared and contributed to the overall sensor array performance. PCA revealed that the combined features of the 9 CDs contributed significantly to the overall variance (Fig. 6), indicating their collective importance in accurately sensing pH levels. This comprehensive approach leverages the strengths of each individual CD, leading to an enhanced and reliable pH sensing capability. LDA using this 9-CD array gave 100% accuracy for both the original classification and cross-validation, as presented in Fig. S11 and Table S7.†
We had therefore demonstrated that these 9 CDs could correctly classify these pH values over a range of more than 7 pH units. However, applying 9 elements in a sensor array is still labour-intensive, so we sought to reduce the number of sensor elements while retaining 100% correct classification. The number of array elements can be reduced by an iterative trial-and-error approach,50 or by examination of the correlation statistics derived from PCA,51 as we have previously described. However, both techniques are cumbersome, and do not enable investigation of all combinations of the full sensor set. We therefore sought to develop an algorithm that would let us rapidly and comprehensively perform array reduction.
To address this challenge, we developed a MATLAB-based classifier using LDA and support vector machine (SVM) to carry out array-based classification and array optimisation for CD-based pH array development. A 4-fold cross-validation setup was implemented for robust performance evaluation. This program automatically tested all possible combinations of the 9 CD sensor features: a total of 511 combinations. Both SVM and LDA classifiers were trained and evaluated for each combination individually. The performance metrics, including accuracy, precision, recall, and F1 score, were calculated and averaged over the cross-validation folds (Table S9†). The F1 score is the harmonic mean of precision and recall, giving equal weight to both metrics. An F1 score ranges from 0 to 1, with 1 indicating a perfect model where all predictions are correct, and 0 representing the worst possible performance. In general, combinations containing our protected CDs were less accurate than those containing the original 5 CDs, consistent with the fact that masking groups led to less pH sensitivity.
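The exhaustive search and the F1 metric described above can be sketched in a few lines. This is an illustration in Python rather than the MATLAB program used in this work: it only enumerates the non-empty sensor subsets and computes a macro-averaged F1 score from a confusion matrix, without training any classifier. The labels for the protected CDs and all function names are hypothetical.

```python
from itertools import combinations

# Five original CDs plus four protected variants; the protected labels are hypothetical names.
SENSORS = ["CD100", "CD92", "CD69", "CD43", "CD25",
           "CD100-TBS", "CD69-TBS", "CD69-Boc", "CD25-Boc"]


def all_sensor_subsets(sensors):
    """Yield every non-empty combination of sensors (2**n - 1 subsets in total)."""
    for size in range(1, len(sensors) + 1):
        yield from combinations(sensors, size)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_f1(confusion):
    """Unweighted mean of per-class F1; confusion[i][j] counts true class i predicted as j."""
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp
        fn = sum(confusion[k]) - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores.append(f1_score(precision, recall))
    return sum(scores) / n


subsets = list(all_sensor_subsets(SENSORS))
print(len(subsets))  # 511
```

In a full search, each subset would be passed to the classifier of choice inside a cross-validation loop, and the resulting confusion matrices scored as above. Whether the reported F1 values were macro-averaged is not stated here, so that choice is an assumption.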
The best-performing feature sets for both SVM and LDA were identified based on the highest accuracy. The SVM model used a linear kernel with a one-vs.-one approach for multi-class classification, while the LDA model employed a pseudo-linear discriminant type. This program provided a comprehensive evaluation of the sensor features, identifying the most effective combinations for pH level classification, and ensuring reliable and robust performance through data normalisation and cross-validation. CD43 alone was found to achieve 100% classification accuracy for both SVM and LDA. However, since we sought to use a sensor array with at least two elements for subsequent pH quantification, we then looked for two-sensor combinations that contained CD43 (Table S10†). Classification accuracies of 100% for both SVM and LDA classifiers were achieved by using combinations of CD43 and CD69, and CD43 and CD25. These results were verified by SPSS-based LDA, and 100% classification accuracies for both original classification and cross-validation were achieved for both arrays.
The LDA classification results are shown in Fig. 7, Tables S14–S16† (CD43 and CD69), and Tables S17–S19 (CD43 and CD25).
Fig. 7 The LDA classification results obtained using two-CD arrays (a: CD43 and CD69; b: CD43 and CD25) as pH sensor arrays to discriminate 8 different pH environments.
Of the three arrays tested, the array built using CD43 and CD69 achieved the best prediction for the “unknown” pH 6.9 sample (Fig. 8(a)). For the kernel parameters, the length scale (σl) was calculated to be 427.7, indicating a significant influence of input features on predictions. The signal variance (σf) was 0.316, representing the variation of function values from their mean. The standard deviation of predicted values (n = 4) was 0.110, indicating consistent predictions close to the average value. This demonstrates that the GPR model can predict pH values close to those measured by a pH meter.
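For readers unfamiliar with the kernel parameters quoted above, the role of the length scale and signal variance can be written out explicitly. The block below assumes a zero-mean GPR with a squared-exponential kernel, which is not stated in the text. The noise term σn, the training inputs X with targets y, and the test input x* are symbols introduced here for illustration only.

```latex
% Squared-exponential kernel with length scale \sigma_l and signal standard deviation \sigma_f
k(\mathbf{x}, \mathbf{x}') = \sigma_f^{2}
  \exp\!\left( -\frac{\lVert \mathbf{x} - \mathbf{x}' \rVert^{2}}{2\sigma_l^{2}} \right)

% Predictive mean and variance at a test input \mathbf{x}_*,
% with K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j) and (\mathbf{k}_*)_i = k(\mathbf{x}_i, \mathbf{x}_*)
\bar{f}_* = \mathbf{k}_*^{\top} \left( K + \sigma_n^{2} I \right)^{-1} \mathbf{y}

\operatorname{var}(f_*) = k(\mathbf{x}_*, \mathbf{x}_*)
  - \mathbf{k}_*^{\top} \left( K + \sigma_n^{2} I \right)^{-1} \mathbf{k}_*
```

In this form, σl sets the distance in feature space over which predictions remain correlated, and σf sets the typical amplitude of the modelled function about its mean.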
Fig. 8 GPR prediction using the two-CD array built from CD43 and CD69: (a–c) the test pH values were 6.9, 4.07, and 9.05.
The R2 was 1 and the MSE was 0 to the reported precision, indicating that the model explains essentially all of the variance in the data and fits it closely. The MAE was 0.009, indicating that the model's predictions were accurate.
To further validate the performance of the pH sensor array constructed using CD43 and CD69, GPR tests were conducted at pH 4.07 and 9.05 (as presented in Fig. 8(b and c) and Table S20†). At pH 4.07, the standard deviation of the predicted values (n = 4) was 0.106, with an average predicted pH of 4.040 ± 0.106. The prediction range was 3.934 to 4.145, and the R2 was 0.987. The MSE was 0.055, and the MAE was 0.165. For pH 9.05, the standard deviation of the predicted values was 0.0359, with an average predicted pH of 8.970 ± 0.0359. The prediction range was 8.934 to 9.006, and the R2 value was 0.996. The MSE was 0.019, and the MAE was 0.102. These results demonstrate that the pH sensor array provided accurate and precise pH predictions, with very high R2 values and low error metrics for both pH levels tested in acidic and basic conditions.
Overall, the CD-based pH sensor array performed well in predicting pH values (from acidic to alkaline) based on the fluorescence response of the two sensors using the GPR model. The high R2 values, low MSE, and low MAE indicated a good fit and accurate predictions. The standard deviation of predicted values was relatively small, indicating consistency in predictions. The model's predictions for the test pH values were reasonably close to the actual values, confirming its effectiveness.
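The error metrics quoted in this section follow standard definitions, sketched below with the Python standard library. How the metrics were computed for each test sample in this work (for example, over which data points) is not specified here, so this is a generic illustration. The example values and the use of the sample standard deviation are assumptions.

```python
from statistics import mean, stdev


def mse(y_true, y_pred):
    """Mean squared error."""
    return mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean([abs(t - p) for t, p in zip(y_true, y_pred)])


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    centre = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - centre) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def summarise_replicates(predictions):
    """Mean, sample standard deviation, and the mean +/- SD range of replicate predictions."""
    m = mean(predictions)
    s = stdev(predictions)
    return m, s, (m - s, m + s)


# Hypothetical values for illustration only.
y_true = [3.0, 4.0, 5.0]
y_pred = [3.1, 3.9, 5.2]
print(mse(y_true, y_pred), mae(y_true, y_pred), r_squared(y_true, y_pred))
print(summarise_replicates([4.0, 4.1, 3.9, 4.0]))
```

The "prediction range" reported in the text is consistent with the mean ± one standard deviation form returned by `summarise_replicates`.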
The performance of the GPR model, using CD69 and CD43 as features, demonstrated a high degree of accuracy in predicting the pH of fresh milk (Fig. 9(a)). The model predicted an average pH value of 6.89 ± 0.18, with a range of predictions between 6.70 and 7.07. This closely aligns with the actual pH of 6.74, reflecting the reliability of the sensing array in accurately detecting minor pH variations in fresh milk. The model exhibits a high R2 value of 0.971, indicating a strong correlation between the predicted and observed pH values. Furthermore, the MSE (0.084) and MAE (0.234) are relatively low, supporting the model's capacity to provide precise pH predictions.
Fig. 9 GPR prediction using the two-CD array built from CD43 and CD69: (a) fresh milk and (b) spoiled milk.
For the expired milk (Fig. 9(b)), the GPR model demonstrated strong predictive accuracy. The model predicted an average pH of 5.62 ± 0.48, with a prediction range of 5.14 to 6.09, closely approximating the actual pH value of 5.20.
The high R2 (0.9772) reflects a strong correlation between predicted and observed values, with low error rates – an MSE of 0.066 and an MAE of 0.179 – indicating precise prediction.
The relatively small standard deviation (0.48) confirms the model's robustness. These results suggest that the sensor array retained high accuracy even in expired milk, despite the potential influence of factors like microbial spoilage and chemical decomposition. The naked eye observation of pH sensor array applied for spoiled milk is shown in Fig. S12.†
Our CD-based sensor array exhibited considerable promise as a rapid and reliable method for assessing milk freshness. The high level of accuracy observed in milk pH predictions underscored its potential application in food quality monitoring systems, where real-time and non-invasive pH sensing is of significant commercial interest.
We were able to use this set of CDs to distinguish between pH values over a range of more than 7 pH units, and we have shown that just two carbon dots can be used to construct an array that gives 100% accuracy of classification, as well as very high accuracy in quantification of pH. Employing Gaussian process regression and linear discriminant analysis, we developed an analytical framework that not only predicts pH levels with high accuracy but also classifies them with high precision. Importantly, we have shown that these methods can be used to rapidly and efficiently scan a full set of array data to identify the most promising subset of sensors.
In our study, GPR was employed to construct a comprehensive multivariate regression model aimed at predicting pH levels in diverse test sensing environments. By analysing variations in the spectral features of the CDs' fluorescence emission in response to pH changes, GPR facilitated a nuanced understanding of the relationship between spectral attributes and pH levels, thereby enabling accurate and reliable pH estimation. The inherent flexibility of GPR, coupled with its ability to provide confidence intervals, significantly bolstered its applicability in sensor technology and analytical chemistry.
Concurrently, LDA was utilised as a powerful statistical tool for dimensionality reduction and classification within the pattern recognition domain.54,55 By maximising the ratio of between-class variance to within-class variance, LDA ensured optimal separation among different classes.56,57 In our CDs-based array, LDA harnessed the unique fluorescence-based pH responses of the CDs to create distinct fingerprints for varying pH levels. This enabled the effective reduction of data dimensionality while retaining critical distinguishing features, thereby streamlining the classification process.
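The variance-ratio criterion mentioned above is commonly written as the Fisher objective. The block below uses textbook notation that does not appear in the original text: x_i are feature vectors, μ_c and n_c are the mean and size of class c, μ is the overall mean, and w is a projection direction.

```latex
% Between-class and within-class scatter matrices
S_B = \sum_{c=1}^{C} n_c \, (\boldsymbol{\mu}_c - \boldsymbol{\mu})(\boldsymbol{\mu}_c - \boldsymbol{\mu})^{\top},
\qquad
S_W = \sum_{c=1}^{C} \sum_{i \in c} (\mathbf{x}_i - \boldsymbol{\mu}_c)(\mathbf{x}_i - \boldsymbol{\mu}_c)^{\top}

% Fisher criterion maximised by each discriminant direction \mathbf{w}
J(\mathbf{w}) = \frac{\mathbf{w}^{\top} S_B \, \mathbf{w}}{\mathbf{w}^{\top} S_W \, \mathbf{w}}

% The maximisers are the leading eigenvectors of S_W^{-1} S_B
S_W^{-1} S_B \, \mathbf{w} = \lambda \, \mathbf{w}
```

With C classes and d features there are at most min(C − 1, d) discriminant directions, so for the 8 pH environments classified here the number of discriminants cannot exceed 7.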
This study contributes to the fields of nanotechnology and sensor development by providing detailed mechanistic insights into the fluorescence behaviour of CDs at different pH levels. It not only adds to the fundamental understanding of CDs but also offers practical guidelines for the future design and optimisation of pH-responsive CDs, expanding their potential in various scientific and industrial applications. It also showcases the integration of nanomaterials with machine learning to create more sensitive, selective, and versatile sensing technologies. Our optimised two-CD sensor array was utilised for pH sensing in milk. During the sensor stability and fluorescence recovery tests, the CD sensors exhibited a dynamic fluorescent response to pH fluctuations, demonstrating their effectiveness in detecting pH variations. Sensors of this type could be embedded onto filter paper or a similar support for paper-based sensing, or could be used in solution, such as in a microwell plate, as we demonstrate here. While the former could provide a binary reading of high or low pH, for example for detecting food spoilage, the latter enables us to harness the full potential of intensity measurement and statistical analysis to accurately determine pH values. We are actively studying how to apply such systems to the measurement of biologically-relevant pH, both in intracellular pH mapping and in studies of clinical samples. Future research should aim to refine the synthesis and functionalisation of CDs to further enhance their sensitivity and selectivity for specific ions or molecules, thereby expanding their utility in a wider range of applications such as environmental monitoring, industrial processes, and biomedical diagnostics.
X-ray photoelectron spectroscopy (XPS) measurements were collected using a Kratos Axis Nova spectrometer (Kratos Analytical, UK), equipped with a monochromatised aluminium X-ray source (Al Kα, 1486.6 eV), operating at 10 mA and 15 kV (150 W). Survey and high-resolution spectra were recorded at detector pass energies of 160 and 20 eV, respectively. The obtained XPS data were processed and analysed using Thermo Avantage software (version 5.9902). Photophysical evaluations were conducted using a PerkinElmer EnSpire Multimode Plate Reader, with experiments carried out in 300 μL, 96-well polypropylene microplates (item no.: 655209, Greiner Bio-One), ensuring standardised and reproducible measurement conditions.
CD | Fructose/g | p-Phenylenediamine/g | Synthesis time/h | Synthesis temperature/°C |
---|---|---|---|---|
CD100 | 0.5 | 0 | 8 | 180 |
CD92 | 0.43 | 0.07 | 8 | 180 |
CD69 | 0.278 | 0.222 | 8 | 180 |
CD43 | 0.147 | 0.353 | 8 | 180 |
CD25 | 0.078 | 0.422 | 8 | 180 |
Expected pH | Actual pH | Chemical 1 | Chemical 2 | Buffer concentration/mM |
---|---|---|---|---|
3 | 2.76 | Sodium citrate dihydrate | Citric acid | 50 |
4 | 3.92 | Sodium citrate dihydrate | Citric acid | 50 |
5 | 4.87 | Sodium citrate dihydrate | Citric acid | 50 |
6 | 6.9 | Potassium phosphate dibasic | Potassium phosphate monobasic | 50 |
7 | 6.64 | Potassium phosphate dibasic | Potassium phosphate monobasic | 50 |
8 | 8 | Potassium phosphate dibasic | Potassium phosphate monobasic | 50 |
9 | 9.05 | Sodium bicarbonate | Sodium carbonate (anhydrous) | 50 |
10 | 10.03 | Sodium bicarbonate | Sodium carbonate (anhydrous) | 50 |
The obtained solid was redissolved in Milli-Q water and transferred to a dialysis bag (molecular weight cutoff: 2 kDa) for purification purposes. The dialysis process was conducted over 24 h to remove any unreacted reagents and by-products.
GPR and LDA analyses were conducted using MATLAB (MathWorks, USA). Custom scripts were meticulously developed to handle the fluorescence data, train the models, and evaluate their performance comprehensively. To ensure the robustness and generalisability of the models, cross-validation techniques were employed to validate the results from both GPR and LDA.
For the fluorescence recovery test of CD25, a similar procedure was followed, with the order of HCl and NaOH additions reversed. In this case, 200 μL of 5 M NaOH was added first, followed by 400 μL of 5 M HCl, then 400 μL of NaOH, and finally 400 μL of HCl. After each addition, the solution was mixed thoroughly, and four 200 μL aliquots were collected. Fluorescence measurements were performed under the same conditions as for CD100.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4sd00275j
This journal is © The Royal Society of Chemistry 2024 |