Raven L. Buckman Johnson a, Hark Karkee a and Alexander Gundlach-Graham *ab
aDepartment of Chemistry, Iowa State University, Ames, Iowa, USA. E-mail: alexgg@iastate.edu
bTOFWERK, Thun, Switzerland
First published on 31st May 2025
Single-particle inductively coupled plasma time-of-flight mass spectrometry (spICP-TOFMS) can be used to measure metal-containing nanoparticles (NPs) and sub-micron particles (μPs) at environmentally relevant concentrations. Multielement fingerprints measured by spICP-TOFMS can also be used to differentiate natural and anthropogenic particle types. Thus, the approach offers a promising route to classify, quantify, and track anthropogenic NPs and μPs in natural systems. However, biases in spICP-TOFMS data caused by analytical sensitivities, Poisson detection statistics, and elemental variability at the single-particle level complicate particle-type classification. To overcome the inherent bias in spICP-TOFMS data for the classification of particle types, we have developed a multi-stage semi-supervised machine learning (SSML) strategy that identifies and subsequently trains on systematic noise in spICP-TOFMS data to produce more robust particle-type classifications. Here, we apply our two-stage SSML model to classify individual Ti-containing NPs and μPs via spICP-TOFMS analysis. To build our model, we measure neat suspensions of anthropogenic TiO2 particles (E171) and natural titanium-containing particle types (rutile, ilmenite, and biotite) by spICP-TOFMS. Element mass amounts recorded per particle are used to classify particle type by SSML, and then systematic particle misclassifications are identified and recorded as uncertainty classes. A second SSML model is then trained with the addition of these uncertain particle-type categories. With two-stage SSML, we demonstrate low false-positive rates (≤5%) and moderate particle recoveries (50–90%) for all anthropogenic and natural particle types. Two-stage SSML is a streamlined, hands-off method to identify and overcome bias in spICP-TOFMS training data that provides a robust particle-type classification.
The use of single-particle inductively coupled plasma time-of-flight mass spectrometry (spICP-TOFMS) to measure and classify natural or anthropogenic NPs and μPs is a growing area of research interest. spICP-TOFMS provides high-throughput measurement and quantification of elements in individual particles, and these data can also be used to classify particle types based on elemental fingerprints and their mass fractions. Strategies and rules for particle-type classification based on spICP-TOFMS data are still being developed and a consensus has not been reached. In the simplest case, particle-type detection limits and decision tree classification based on known, or measured, unique element associations and ratios can be established.8,24–26 Other approaches to classify individual particles by spICP-TOFMS use clustering algorithms to group and classify particles according to their detected elemental signatures.19,20,27,28 Supervised machine learning has also been used to classify particle types with t-stochastic neighbor embedding (tSNE) and light gradient boosted decision trees,29 binomial logistic regression,30 and k-nearest neighbor embedding.31 While all of these examples demonstrate reasonable classification results, few workflows that apply broadly to multiple particle types without extensive refinement have been reported. This is, in part, due to the inherent bias of spICP-TOFMS measurements.
In spICP-TOFMS, elements are only recorded in a given particle if they produce enough signal to be registered as a particle event, i.e. more signal than element-specific critical values (LC,sp). While LC,sp values depend on steady-state background levels and the background-signal distribution, the likelihood that an element produces signal greater than LC,sp also depends on a number of factors, including the true mass amount of the element present in a given particle, the measurement sensitivity, and random signal variations due to Poisson counting noise.32 In addition, the shape of the analyte particle size distribution and the heterogeneity of multi-element composition per particle can also lead to systematic biases apparent in spICP-TOFMS data. For example, minor elements in small particles may be undetectable, but become detectable in larger particles. These smaller particles do not have true altered element mass fractions, but—in the spICP-TOFMS data—their multi-element fingerprints lack the minor elements, and so their composition appears to have changed. This is bias in the spICP-TOFMS measurement, and it will influence the element compositions recorded for populations of particles. For single-particle classification, a workflow must be robust enough to account for the uncertainty and bias in a spICP-TOFMS measurement and should be readily adaptable for various particle-types.
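The dependence of detection on Poisson counting noise can be illustrated with a minimal sketch. It assumes a purely Poisson-distributed signal and a hypothetical critical value expressed in counts; the function name `detection_probability` and all numbers are illustrative assumptions and are not taken from this work. Real TOFMS signals are not strictly Poisson, so this is a simplification.

```python
import math


def detection_probability(expected_counts: float, critical_value: int) -> float:
    """Probability that a Poisson signal with the given mean exceeds critical_value."""
    # Cumulative probability of observing 0..critical_value counts.
    cdf = sum(
        math.exp(-expected_counts) * expected_counts**k / math.factorial(k)
        for k in range(critical_value + 1)
    )
    return 1.0 - cdf


# Hypothetical minor element: the mean signal scales with the element mass in
# the particle, so larger particles are more likely to exceed the critical value.
for mean_counts in (1.0, 5.0, 20.0):
    p = detection_probability(mean_counts, critical_value=5)
    print(f"mean = {mean_counts:5.1f} counts -> P(detect) = {p:.3f}")
```

Under these assumptions, the same element in the same mass fraction is rarely detected in small particles and almost always detected in large ones, which is the source of the apparent change in composition described above.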
To address the limitations of spICP-TOFMS particle classification, we developed a multi-stage semi-supervised machine learning (SSML) strategy.33 In our multi-stage SSML workflow, determined element mass amounts per particle from spICP-TOFMS measurements of known particle types are used to train an initial decision tree-based ensemble semi-supervised classification model. Misclassified particles from this first SSML model are largely due to systematic biases inherent to spICP-TOFMS measurements. Therefore, we reclassify the misclassified particles as belonging to uncertain class types. These new uncertain classes are incorporated into the second stage SSML model, which then produces more robust classification results. Two-stage SSML classifies particles based on both elemental mass distributions and associated elemental signatures, which allows for a practical and streamlined approach for particles with inherent heterogeneity. In this study, we apply our SSML workflow to classify mixtures of Ti-containing NPs and μPs, i.e. E171 (TiO2), rutile (TiO2), ilmenite (FeTiO3), and biotite (K(Fe2+/Mg)2(Al/Fe3+/Mg/Ti)([Si/Al/Fe]2Si2O10)(OH/F)2) with median diameters ranging from 100 to 200 nm, and size ranges from 50 to 600 nm.8 In previous work, two-stage SSML was used to classify Ce-rich natural and anthropogenic particle types.33 Here, we extend our method to the (more challenging) analysis of Ti-containing particles; our aim is to further explore the utility of the two-stage SSML classification approach as a tool for spICP-TOFMS data analysis.
Fig. 1 Mass distributions of 48Ti in the pristine suspensions of biotite, E171, rutile, and ilmenite.
In Fig. 2A and B, we show the classification performance of the first SSML model for the unlabeled training dataset; classification results for the unlabeled training data are shown in Table S3.† Confusion matrices are used in machine learning research to describe the performance of the model;37 these can be used to show the true-positives, false-negatives, true-negatives, and false-positives (TP, FN, TN, and FP, respectively). In the confusion matrix in Fig. 2A, the tiles are colored according to whether they were correctly (blue) or incorrectly (pink) predicted by the model. Row- and column-summaries are normalized to the row and column, respectively, and reflect percentages of TPs, FNs, FPs, and positive-predictions (PPs).
The accuracy of the first SSML model was determined to be 90.3 ± 0.2%, which indicates that ∼90% of the particles were correctly classified. From Fig. 2A, it can be seen that the first SSML model best classifies biotite and ilmenite particles; these particles, generally, have lower mass amounts of 48Ti and more additional elemental associations than rutile and E171 (see Fig. 1 and S1†). Therefore, the first SSML model is able to achieve low FN and FP percentages for biotite and ilmenite. However, the model does not demonstrate the same robustness for rutile (FP = 13.9%) and E171 (FP = 15.9%) particles. The FP percentages do not reflect the true false-positive rate (FPR) because they do not consider the TNs.33 FPR is a metric that reflects the probability of a type I error; typically, an FPR less than 0.05 (i.e. 5%) is desirable. The FPR and other machine learning figures of merit were calculated based on the results described in Table S3† and are summarized in Fig. S3.† The Eng and Rut particle classes have FPRs of 5.1 ± 0.4% and 5.5 ± 0.7%, respectively, while Bio and Ilm have FPRs of 0.7 ± 0.1% and 1.5 ± 0.1%. While these FPRs are within an acceptable range, we still observe systematic misclassifications of rutile and E171 particles.
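The distinction between the column-normalized FP percentage and the FPR can be made concrete with a short sketch. The confusion-matrix counts below are hypothetical and chosen only for illustration; they are not the values in Fig. 2A or Table S3. The helper name `fp_percentage_and_fpr` is likewise an assumption.

```python
from typing import Dict, List


def fp_percentage_and_fpr(
    matrix: List[List[int]], labels: List[str]
) -> Dict[str, Dict[str, float]]:
    """matrix[i][j] = number of particles of true class i predicted as class j."""
    total = sum(sum(row) for row in matrix)
    out: Dict[str, Dict[str, float]] = {}
    for j, label in enumerate(labels):
        tp = matrix[j][j]
        predicted = sum(row[j] for row in matrix)  # column sum: positive predictions
        fp = predicted - tp
        fn = sum(matrix[j]) - tp                   # row sum minus true positives
        tn = total - tp - fp - fn
        out[label] = {
            # FP percentage: share of positive predictions that are wrong (ignores TNs).
            "fp_percent": 100.0 * fp / predicted if predicted else 0.0,
            # FPR: share of all true negatives-or-FPs that were called positive.
            "fpr_percent": 100.0 * fp / (fp + tn) if (fp + tn) else 0.0,
        }
    return out


# Hypothetical counts (rows: true class, columns: predicted class).
labels = ["Bio", "Eng", "Ilm", "Rut"]
counts = [
    [950, 10, 30, 10],
    [5, 850, 5, 140],
    [20, 10, 960, 10],
    [5, 140, 5, 850],
]
for name, metrics in fp_percentage_and_fpr(counts, labels).items():
    print(f"{name}: FP = {metrics['fp_percent']:.1f}%, FPR = {metrics['fpr_percent']:.1f}%")
```

Because the FPR denominator includes the true negatives from every other class, it is smaller than the FP percentage whenever the other classes contribute many correctly rejected particles.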
In Fig. 2B, we plot all the particle events in the labeled training dataset as a function of their 48Ti mass (fg); particles are grouped vertically based on their true class and colored according to predicted class. Bubble sizes in Fig. 2B reflect the number of elements detected in the particle event. Here, we see that a subset of E171 particles with a single-metal fingerprint and 48Ti mass less than 2 fg are systematically misclassified as rutile. Similarly, single-metal rutile particles with 48Ti mass between 2 and 20 fg are systematically misclassified as E171. These misclassifications arise from the mass distributions of 48Ti in particles (see Fig. 1). Generally, the E171 particles have larger Ti mass than rutile particles, so the model tends to predict that low-mass single-metal Ti particles are rutile and high-mass single-metal Ti particles are E171. To overcome these systematic misclassifications, we create new labels based on the initial misclassifications and incorporate these labels into a second machine learning model.33 For example, particle events that were falsely predicted to be Eng by the first model are relabeled as ‘unclassifiable engineered’, i.e. UEng. Likewise, the particles misclassified as Ilm, Rut, or Bio are relabeled as UIlm, URut, or UBio, respectively. These unclassifiable classes are a way to represent the uncertainty of class predictions and also account for the apparent bias in the first SSML model.
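The relabeling step described above can be sketched in a few lines. This is a minimal illustration of the labeling rule only (a misclassified particle receives the label "U" plus the class it was falsely predicted as); the function name and the example labels are hypothetical, and the sketch does not reproduce the SSML training itself.

```python
from typing import List


def relabel_misclassified(true_labels: List[str], predicted_labels: List[str]) -> List[str]:
    """Build second-stage training labels from first-stage predictions.

    Correctly classified particles keep their true label; a misclassified
    particle is relabeled 'U' + the class it was falsely predicted to be.
    """
    return [
        true if true == pred else "U" + pred
        for true, pred in zip(true_labels, predicted_labels)
    ]


# Hypothetical first-stage results for five particle events.
true_labels = ["Eng", "Eng", "Rut", "Rut", "Bio"]
first_stage = ["Eng", "Rut", "Eng", "Rut", "Bio"]
print(relabel_misclassified(true_labels, first_stage))
# → ['Eng', 'URut', 'UEng', 'Rut', 'Bio']
```

The second model is then trained on these extended labels, so that particle events resembling earlier misclassifications are assigned to an unclassifiable class rather than to a definite particle type.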
In Fig. 2C, we provide the confusion matrix for the second SSML model, which incorporates the unclassifiable particle types. As seen in the second SSML model, the FP percentages for E171 and rutile particles decrease substantially compared to the first SSML model: from 16 to 5% and 14 to 1%, respectively. In our work, we strive for low false-positive classifications because they are important for the classification of particle types with large differences in number concentration. For example, if there are 100× more natural Ti-particles than E171 particles, then the 5% FP in E171 classifications would be more abundant than the TP E171 classifications. However, lowering the FP percentage requires the model to be more selective, which results in a decrease in TP classifications and an increase in FNs. Therefore, the overall accuracy of the second-stage SSML model was reduced to 75.2 ± 2.8%. In Table S4,† we provide classification results for the unlabeled training data. Comparisons of the ROC and PR curves and figures of merit of the first and second SSML models are shown in Fig. S2–S4.† ROC curves describe how well the ML model is able to separate the positive classifications from negative classifications; optimal models should have high TPRs and low FPRs, resulting in an AUC close to one. In contrast, PR curves illustrate how many predictions made by the model are truly correct and are best utilized when the number of positive predictions is low. As with ROC curves, AUC values of a PR curve that are closer to one describe ideally performing models. Metrics for the models described here are shown in Fig. S3.† The calculated FPR for the second SSML model decreases for the Eng and Rut classes but does not significantly change for the Bio and Ilm classes (see Fig. S3†).
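The 100× example can be worked through numerically. The sketch below assumes that 5% of natural particles are falsely called Eng and, as a best case, that every E171 particle is recovered; the particle numbers, the recovery value, and the function name are hypothetical choices made only to illustrate the arithmetic.

```python
from typing import Tuple


def expected_eng_calls(
    n_eng: int, n_natural: int, eng_recovery: float, natural_to_eng_rate: float
) -> Tuple[float, float]:
    """Return (true-positive, false-positive) counts of 'Eng' classifications."""
    tp = n_eng * eng_recovery               # E171 particles correctly called Eng
    fp = n_natural * natural_to_eng_rate    # natural particles falsely called Eng
    return tp, fp


# Hypothetical mixture: 100 E171 particles and 100x as many natural particles.
tp, fp = expected_eng_calls(
    n_eng=100, n_natural=10_000, eng_recovery=1.0, natural_to_eng_rate=0.05
)
print(f"TP Eng calls: {tp:.0f}, FP Eng calls: {fp:.0f}")
print(f"Fraction of Eng calls that are correct: {tp / (tp + fp):.2f}")
```

Even with complete recovery of E171, the false Eng calls from the natural background outnumber the true ones under these assumptions, which is why a low false-positive rate is prioritized over overall accuracy.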
In Fig. 2D, we show the classification results of the second SSML model for the labeled training data in the same format as Fig. 2B. As seen, particles that were incorrectly classified by the first SSML model are now classified as uncertain by the second model. The increase in FN classification can also be observed, as the model now classifies particle events with masses and elemental signatures similar to those categorized as UEng, URut, UBio, or UIlm as uncertain. While the creation of the uncertain classes worsens the overall model accuracy, the reduction in FPR enables classification across a broader number concentration range and in more varied particle backgrounds. Our two-stage SSML workflow results demonstrate that the approach is suitable for spICP-TOFMS analysis of anthropogenic and natural Ti-containing particles.
In Fig. 3, we provide classification results for data from the spICP-TOFMS analysis of a mixture containing only the three natural particle types. In these data, we should not record any Eng particles because they were not in the sample; thus, any Eng classifications are deemed to be FPs. Likewise, any unclassifiable classifications are FNs. As seen, with just the first SSML model, 5.2% of the detected particles are incorrectly identified as Eng. By adding the second SSML model, FP Eng classifications decrease by a factor of 2, though 20% of particle events are recorded as ‘unclassifiable’. Clearly, there is a trade-off between type I and type II errors (i.e. FPs and FNs)—reducing FPs results in an increase in FNs. For our application, we claim that it is more desirable to allow the model to produce FNs (i.e. to classify particles as uncertain) than to provide incorrect information to a user regarding the presence of engineered particles. While FPs are not entirely eliminated via the second-stage SSML training, we do see an improvement compared to the first model.
To test the classification performance of E171 particles in the presence of natural particles and vice versa via our two-stage SSML model, we applied the model to classify particle events in mixtures of engineered and natural particles at varying concentrations. Four dilution conditions were studied: (i) low and (ii) high natural Ti-containing particle backgrounds with E171 at different concentrations, as well as (iii) low and (iv) high E171 particle backgrounds with natural Ti-containing particles at different concentrations. In Fig. 4, we plot the number of particle events assigned to each class by our second SSML model versus the dilution amount of the analyte particles. In these results, the three natural particle-type classifications were summed together and presented, generically, as ‘natural’; the individual classifications are shown in Fig. S5.† For perfect recovery, the slope of the regression line should be equal to 1 when plotted on a log–log scale; deviations from this slope indicate incomplete or overestimated particle-type recoveries.
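The slope criterion can be illustrated with an ordinary least-squares fit in log–log space. This is a minimal sketch with hypothetical relative concentrations and particle counts, not the regression or data used for Fig. 4; the function name `loglog_slope` is an assumption.

```python
import math
from typing import Sequence


def loglog_slope(relative_concentration: Sequence[float], particle_counts: Sequence[float]) -> float:
    """Least-squares slope of log10(counts) versus log10(relative concentration)."""
    xs = [math.log10(x) for x in relative_concentration]
    ys = [math.log10(y) for y in particle_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical dilution series of an analyte particle type.
concentration = [1, 2, 5, 10]
proportional_counts = [100, 200, 500, 1000]      # counts scale exactly with concentration
sub_proportional_counts = [100, 190, 450, 850]   # counts fall behind at higher concentration

print(f"proportional:     slope = {loglog_slope(concentration, proportional_counts):.3f}")
print(f"sub-proportional: slope = {loglog_slope(concentration, sub_proportional_counts):.3f}")
```

In this sketch, counts that scale exactly with concentration give a slope of 1, while counts that lag at higher concentrations give a slope below 1.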
In Fig. 4, across the four dilution cases, the number of particle events identified as background remained constant with slopes not significantly different from zero; this is improved from the performance observed in the first stage of classification (Fig. S6†). Likewise, for the spiked Ti-particles (E171 or natural), the number of particle events identified as the analyte increased with increasing concentrations; however, none of the dilution cases demonstrated perfect classification. The deviations of the analyte regressions are likely caused by the high FNR observed in the second model; thus, slopes less than 1 are expected. It is possible that FPs, while generally estimated to be low in contribution, are dominant at low number concentrations and contribute to the overestimation in the cases in which natural particles are the analyte of interest. In general, recoveries between 80 and 90% were observed, which indicates reasonable linearity for all studied conditions.
spICP-TOFMS | Single-particle inductively coupled plasma time-of-flight mass spectrometry |
NPs | Nanoparticles |
μPs | Microparticles |
SSML | Semi-supervised machine learning |
TEM | Transmission electron microscopy |
AF4 | Asymmetric flow field-flow fractionation |
tSNE | t-Stochastic neighbor embedding |
LC,sp | Single-particle element-specific critical value |
Rut | Rutile |
Ilm | Ilmenite |
Bio | Biotite |
Eng | Engineered, E171 TiO2 |
ROC curve | Receiver operating characteristic curve |
PR curve | Precision-recall curve |
TP | True positive |
FP | False positive/prediction |
TN | True negative |
FN | False negative |
PP | Positive prediction |
FPR | False-positive rate |
TPR | True-positive rate |
fg | Femtogram |
UEng | Unclassifiable engineered |
UIlm | Unclassifiable ilmenite |
URut | Unclassifiable rutile |
UBio | Unclassifiable biotite |
Footnote |
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d5ja00108k |
This journal is © The Royal Society of Chemistry 2025 |