Josselyn Mata Calidonio (a) and Kimberly Hamad-Schifferli* (a, b)
(a) Dept. of Engineering, University of Massachusetts Boston, Boston, MA 02125, USA. E-mail: kim.hamad@umb.edu
(b) School for the Environment, University of Massachusetts Boston, Boston, MA 02125, USA
First published on 15th March 2024
Optimizing paper immunoassay conditions for diagnostic accuracy is often achieved by tuning running conditions in a trial-and-error manner. We developed an approach to use machine learning (ML) in the optimization process, demonstrating it on a COVID-19 assay to detect IgG and IgM antibodies for both SARS-CoV-2 spike and nucleocapsid proteins. The multiplexed test had a multicolor readout by using red gold nanoparticles and blue gold nanostars. Spike and nucleocapsid proteins were immobilized on a nitrocellulose strip at different locations, and the assay was run with red nanoparticles conjugated to anti-IgG and blue nanostars conjugated to anti-IgM. The spatial location of the signal indicated whether the antibody present was anti-spike or anti-nucleocapsid, and the test area color indicated the antibody type (IgG vs. IgM). Linear discriminant analysis (LDA) and ML were used to evaluate the test accuracy, and were then used iteratively to modify running conditions (presence of quencher molecules, nanoparticle types, washes) until the test accuracy reached 100%. The resulting assay could be trained to distinguish between 9 different antibody profiles indicative of different disease cases (prior infection vs. vaccinated, early/mid/late stage post infection). Results show that supervised learning can accelerate test development, and that using the test as a selective array rather than a specific sensor could enable rapid immunoassays to obtain more complex information.
However, rapid diagnostic production has major bottlenecks that inhibit rapid response. With each new disease, an entirely new LFA must be developed for a corresponding target,2 requiring production of specific antibodies. The process of developing new antibodies can take 1–2 years and cost hundreds of millions of dollars.3 Moreover, every diagnostic has its own required sensitivity, time window, and biological fluids, which cause different matrix effects with their own requisite sample preparation protocols. LFAs and paper immunoassays are tedious to optimize because multiple parameters spanning physical, chemical, and biological properties must all be systematically varied to arrive at conditions where a test can detect the target at a biologically relevant level, while also exhibiting no signal in its absence. The chemical and physical parameters are diverse, including the materials for the strips, absorbent pads, conjugate pads, chemical stabilizers and passivators, paper blocking agents, reagent concentrations, timing of reagent addition, and strip dimensions.4 Another complicating factor is that the gold NPs responsible for the signal can suffer from nano-bio interface effects in commonly used biological fluids like blood and saliva, which have protein concentrations of ∼60–80 mg mL−1 and millimolar ionic concentrations.5,6 These conditions are ripe for non-specific adsorption, protein corona formation, and NP precipitation,7 which can ultimately cause both false positives and false negatives, confounding test results.8–10 Optimization of the nano-bio interface requires varying the NP synthesis protocol, surfactant types, and bioconjugation chemistries.
To accelerate the development process and access to what is anticipated to be a $12.6B market in the US in 2026,11 several companies offer training courses and consulting services devoted to finding optimal conditions for LFA fabrication. However, these can cost several thousand dollars. While there are some general guides on LFA construction in the literature, it is impossible to predict precipitation and undesirable surface effects, so ultimately one simply varies all of the parameters in an ad hoc manner to determine which are relevant. This represents a major bottleneck in LFA development, which is eventually borne out as delayed response times.
There is an opportunity in machine learning (ML), which has been demonstrated to be a powerful tool in creating molecules and materials. It can explore large areas of synthesis space efficiently, and thus has been used successfully to fabricate molecules with desired properties without prior knowledge of the required conditions, and can arrive at solutions that a human normally would not be able to access.12 ML has been extensively explored for fully automated systems, where autonomously completing the feedback loop yields an efficient synthesis engine. Consequently, the benefits of ML have been demonstrated mostly in fabrication/synthesis approaches that are high throughput systems, and only in a single phase, especially an entirely solution phase synthesis that can leverage liquid handling robotics.
Here, we show that an ML-assisted process can be used to optimize the test parameters for making a viable paper immunoassay. We demonstrated it on a COVID-19 antibody test for IgG and IgM, designed to profile antibodies across different immunity states. The multiplexed assay leveraged gold NPs of two different colors, where identification of the target relied not only on the presence of a signal at a given location, but also on its color. We optimized the assay properties by performing ML in an iterative loop (Fig. 1) to arrive at test conditions that could accurately distinguish a subset of antibodies. The system was trained using linear discriminant analysis (LDA), and the ML-assisted approach could reach a test accuracy of 100%. Then, we used ML to train the optimized test to distinguish 9 distinct antibody profiles representative of different disease histories, with the ability to discriminate between vaccinated and infected profiles.
For the biological reagents, goat anti-human IgM (αIgM), rabbit anti-goat IgG Fc, and human serum were purchased from Sigma-Aldrich. Goat anti-human IgG (αIgG) was purchased from Abcam. The spike (S) protein and human αS IgG were purchased from Native Antigen. Nucleocapsid (N) protein was purchased from Sino Biological. Human αS IgM, human αN IgG, and human αN IgM were purchased from Genscript. Gold NPs with a functional group for covalent attachment to amines were obtained commercially as a kit (Innova, Abcam).
Red-colored gold NPs were synthesized using a double boiling technique. A 600 mL beaker was filled with 275 mL diH2O and placed on top of a magnetic mixer. A 100 mL uncapped glass bottle containing 49.5 mL of 18 MΩ Milli-Q water and a magnetic stir bar was placed inside the beaker. The temperature was increased until the water boiled. Once boiling, the stirring was set to low, followed by addition of 500 μL of 25 mM HAuCl4·3H2O. The solution was left to stir for 10 min. Afterward, 500 μL of 34.1 mM sodium citrate dihydrate was added, which initiated NP formation and changed the solution color to red. After 10 min, heating and stirring were turned off and the solution was left to cool for 30 min. Finally, ∼6.5 mg of BPS was added.
Prior to the formation of the NP–Ab conjugate, 1 mL of red NPs was centrifuged at 12312g for 12 min to remove excess reagents. After removing the supernatant, the NP pellet was resuspended in a solution of 140 mM HEPES (pH 7.48) and αIgG (10 μg) and left to incubate on an orbital shaker for 60 min. A PEG backfill was conducted by adding PEG-SH (1 × 10−9 mol) and incubating for 20 min on an orbital shaker. Lastly, the NP–Ab complex was centrifuged at 7607g for 12 min and resuspended in 50 μL of 0.01 M PBS buffer.
Conjugation of αIgG to the commercial NPs was achieved following the procedure outlined by the commercial supplier in the kit.
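The reagent amounts in the synthesis above can be turned into final concentrations with a short calculation. The following is a minimal sketch that uses only the volumes and stock concentrations quoted in the text. It assumes volumes are additive and ignores evaporation from the uncapped bottle during boiling, so the values are nominal. All variable and function names are illustrative.

```python
# Nominal reagent bookkeeping for the red gold NP synthesis.
# Assumptions: volumes are additive; evaporation during boiling is ignored.

WATER_ML = 49.5            # Milli-Q water in the inner bottle, mL
GOLD_STOCK_MM = 25.0       # HAuCl4·3H2O stock, mM
GOLD_ADDED_UL = 500.0      # volume of gold stock added, uL
CITRATE_STOCK_MM = 34.1    # sodium citrate dihydrate stock, mM
CITRATE_ADDED_UL = 500.0   # volume of citrate stock added, uL


def micromoles(stock_mm: float, volume_ul: float) -> float:
    """mM x uL gives nmol; divide by 1000 to obtain umol."""
    return stock_mm * volume_ul / 1000.0


def final_mm(umol: float, total_ml: float) -> float:
    """umol per mL is numerically equal to mM."""
    return umol / total_ml


gold_umol = micromoles(GOLD_STOCK_MM, GOLD_ADDED_UL)           # 12.5 umol
citrate_umol = micromoles(CITRATE_STOCK_MM, CITRATE_ADDED_UL)  # about 17.05 umol
total_ml = WATER_ML + (GOLD_ADDED_UL + CITRATE_ADDED_UL) / 1000.0  # 50.5 mL

gold_mm = final_mm(gold_umol, total_ml)
citrate_mm = final_mm(citrate_umol, total_ml)
citrate_to_gold = citrate_umol / gold_umol

print(f"Au(III): {gold_mm:.3f} mM, citrate: {citrate_mm:.3f} mM, "
      f"citrate:Au molar ratio: {citrate_to_gold:.2f}")
```

Under these assumptions the nominal citrate-to-gold molar ratio is about 1.4; the actual ratio in the flask may differ slightly because of evaporation.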
Optical absorption spectroscopy was performed on a SpectraMax plate reader (Molecular Devices) in 1 cm cuvettes. The hydrodynamic diameters (DH) of the NPs and NP conjugates were obtained on a nanoparticle analyzer (SZ-100, HORIBA Scientific).
Fig. 2 a) Antibody infection profile. b) Architecture of the immunoassay. c) Disease cases investigated.
IgG and IgM tests can be challenging to make because patient serum has a high concentration of antibodies, where prior exposure to antigens means that a patient can have a wide range of different IgGs and IgMs. Consequently, the test architecture can impact the test signal and ultimately the readout. One possible immunoassay architecture is to immobilize an antibody that binds IgG (αIgG) as the capture agent and to use S conjugated to the NP as the label.21–23 In this case, the αIgG at the test area would capture all the IgGs present in the sample, and then only the αS antibodies would bind to the NP label. However, patients possess a large number of other IgGs, and this high abundance can result in the hook effect, where the signal decreases at high target concentrations. At high target concentration, both the immobilized capture antibody at the test line and the label on the NP can bind to their own respective targets, reducing the probability of sandwich formation and thus compromising the signal.24,25
To reduce the impact of the hook effect, we chose to use an alternative binding geometry where the capture agents on the paper were immobilized S and N proteins (Fig. 2b). Thus, immobilized S would capture only αS antibodies and N only αN. Then, αIgG and αIgM served as label antibodies, where they are conjugated to the NPs that yield the visual signal. Elimination of the hook effect can also be assisted by running the NP–Ab conjugates in different steps, where the sample is not run at the same time as all of the NP–Ab conjugates. To distinguish which type of antibody is present (IgG vs. IgM), we used NPs of different colors for anti IgG and anti IgM, where αIgG was conjugated to red NPs and αIgM was conjugated to blue ones.
Because the assay targeted 4 different species, this architecture involves 4 different possible sandwiches. Ultimately, a given patient could potentially have a mixture of SARS-CoV-2 antibodies depending on where they are in the disease progression states and whether they have been infected or vaccinated (Fig. 2c). Determination of which antibodies are present was achieved by the combination of the color of the signal and the spatial location. The location distinguishes whether the antibody is αS or αN, and the color of the spot tells which type, so the readout is a pattern of colored signals as opposed to a presence or absence of a signal at a single test area.26
UV-vis of the particle solutions (Fig. 3a) showed that the red NPs had a peak at 526 nm and the blue gold nanostars (GNS) had a peak at 622 nm due to the surface plasmon resonance (SPR). DLS measured the hydrodynamic diameter (DH) of the particles, where the red NPs had 〈DH〉 = 77.0 ± 29.2 nm, and the blue GNS 〈DH〉 = 195.8 ± 37.2 nm (Fig. 3b, red).
Antibody conjugation to the blue GNS was achieved by physisorption, where the GNS was incubated in solution with the antibodies. DLS of the GNS–Ab exhibited an increase in DH to 277.4 nm (Fig. 3b, blue), confirming successful antibody conjugation. A similar effect was observed for the red NPs, which increased to a DH of 99.2 nm. UV-vis spectra of the red NPs showed minimal spectral changes, indicating that conjugation did not induce significant aggregation. For blue GNS, the SPR exhibited a slight redshift and broadening, indicating that some aggregation occurred upon conjugation. However, GNS–antibody conjugates were still stable in solution and thus still viable for immunoassays.
Finally, the red NPs and blue GNS had visually distinct colors that were very different in RGB space, as evidenced by the color of the solutions. They were also easy to distinguish visually when dropped on paper, the format in which they would ultimately be read out (Fig. 3c).
However, instead of optimizing the test for all 9 classes directly, we used ML to optimize the test first on a representative subset of the 4 classes representing the case for vaccinated early/mid/late: negative control, αS IgG, αS IgM, and αS IgG + αS IgM (Fig. 4a). We chose this subset because it required the ability to distinguish color on a single spot, which is more complex than distinguishing the location of a signal.
Tests were run on dipstick paper immunoassays, which consisted of laser-cut nitrocellulose strips attached to an absorbent wick (Fig. 4a). N protein was spotted by pipetting onto position 2, and S protein onto position 3. αFc was spotted onto the control line (position 4), which binds to NP–αIgG and GNS–αIgM directly regardless of whether the target is present, serving as a positive control that confirms fluid flow occurred. To run the strips, they were immersed in a solution containing NP–αIgG and GNS–αIgM, running buffer, human serum, and targets (αS IgG, αS IgM, or αS IgG + αS IgM) depending on the running procedure. The solution was allowed to wick up the strip to the absorbent pad, which acted as a fluid sink.
The strip was designed such that if αS IgG is present, it should bind to the immobilized S at location 3. Then, the red NP–αIgG would bind to it to form the sandwich, resulting in a red spot at location 3. If αS IgM was present, it should bind to immobilized S at location 3. Then, the blue GNS–αIgM would bind to form the sandwich, resulting in a blue spot at location 3. If both αS IgG and αS IgM were present, they would bind to immobilized S at location 3. Both red NP–αIgG and blue GNS-αIgM would bind to it to form the sandwich, resulting in a purple spot (Fig. 4b).
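The intended readout logic described above can be written down as a small lookup. This is a sketch of the idealized design only: it assumes each spot has already been assigned one of four hypothetical color names ("none", "red", "blue", "purple"), which in practice would have to come from image analysis rather than from the eye. The function and dictionary names are illustrative.

```python
from typing import Dict, List

# Location on the strip -> immobilized antigen (positions 2 and 3 in the text).
ANTIGEN_AT_LOCATION: Dict[int, str] = {2: "N", 3: "S"}

# Idealized spot color -> isotypes implied by the labels
# (red NP-αIgG, blue GNS-αIgM, both together appear purple).
ISOTYPES_FOR_COLOR: Dict[str, List[str]] = {
    "none": [],
    "red": ["IgG"],
    "blue": ["IgM"],
    "purple": ["IgG", "IgM"],
}


def decode_strip(spot_colors: Dict[int, str]) -> List[str]:
    """Translate {location: color} into antibody names such as 'αS IgG'."""
    antibodies: List[str] = []
    # Report the S location (3) first, then the N location (2).
    for location in sorted(spot_colors, reverse=True):
        antigen = ANTIGEN_AT_LOCATION[location]
        for isotype in ISOTYPES_FOR_COLOR[spot_colors[location]]:
            antibodies.append(f"α{antigen} {isotype}")
    return antibodies


# A purple spot at S and no signal at N corresponds to αS IgG + αS IgM.
print(decode_strip({3: "purple", 2: "none"}))
```

The lookup makes explicit that location encodes the antigen specificity and color encodes the isotype, so the readout is a pattern rather than a single present/absent signal.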
The first trial consisted of running the immunoprobes separately, with wash steps in between (ESI† Fig. S1). First, a casein prewash was run as a blocking step. Then, GNS–αIgM was mixed with human serum (HS), running buffer (Tween/sucrose) and targets. This was followed by a mid-wash of casein. Then, red NP–αIgG was mixed with HS and running buffer, followed by a casein post wash. Strips were run in triplicate (ESI† Fig. S2). The first trial (Fig. 4c) had spots at the S location, indicating sandwich formation with the αS IgG and IgM antibodies. However, the colors were not exactly what was expected, as the red/blue color was not distinguishable by eye. The N location displayed no signal, indicating that there was no cross reactivity.
RGB values of intensities at both the N and S locations (locations 2 and 3) were obtained by image analysis of the scanned strips.19 This resulted in 2 sets of RGB results for each strip, or 6 components for the 12 training examples of this set. The 6 components were run through linear discriminant analysis (LDA, ESI† Fig. S3c) in order to maximize the ratio of between-class variance to within-class variance. The LDA plot showed clustering but with overlap between classes, indicating that the separability of the classes was suboptimal. αS IgM (blue diamonds) clustered but overlapped with the αS IgG + αS IgM cluster (yellow triangles). αS IgG (green squares) also overlapped with the others. The negative control (black circles) was distinct from the rest of the cases. A confusion matrix (ESI† Fig. S3d) was used to display the accuracy of the method, showing in a matrix format the true classes (rows) vs. the predictions from LDA (columns). The on-diagonal values in the 4 × 4 confusion matrix indicate correctly predicted classes, whereas off-diagonal values indicate errors. First, we evaluated the confusion matrix using the spot RGB values. LDA resulted in a confusion matrix with an overall accuracy of 50%, where it mistook αS IgG for αS IgG + IgM, αS IgM for αS IgG + IgM, and αS IgG + IgM for αS IgM. This can be attributed to the inability to distinguish the color at the S location, where what should be a red or blue spot appeared purple.
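The confusion matrix and overall accuracy described above can be tallied in a few lines. The following sketch uses the four class names from this subset and three replicate strips per class (12 examples). The predicted labels are hypothetical and chosen only to show how a single off-diagonal entry appears; they are not the paper's data.

```python
from typing import List, Sequence

CLASSES = ["negative control", "αS IgG", "αS IgM", "αS IgG + αS IgM"]


def confusion_matrix(true: Sequence[str], pred: Sequence[str],
                     classes: Sequence[str]) -> List[List[int]]:
    """Rows are true classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true, pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix: List[List[int]]) -> float:
    """Fraction of examples on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Three replicate strips per class.
true_labels = [name for name in CLASSES for _ in range(3)]

# Hypothetical predictions: one αS IgG strip is called αS IgM.
pred_labels = list(true_labels)
pred_labels[3] = "αS IgM"

matrix = confusion_matrix(true_labels, pred_labels, CLASSES)
for name, row in zip(CLASSES, matrix):
    print(f"{name:>18}: {row}")
print(f"accuracy = {accuracy(matrix):.1%}")
```

With 12 strips, each misclassified strip lowers the accuracy by about 8.3 percentage points, which is why small training sets give coarse accuracy values.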
To better separate the red and blue contributions, the spot colors were deconvoluted into stain vectors. First, each pure stain by itself was imaged, where just the red NP–αIgG immunoprobe was dropped on nitrocellulose and imaged to define its stain vector S1, followed by the blue GNS–αIgM to define S2. A third vector, S3, was assigned as the cross product of S1 and S2 and is thus orthogonal to both. Then, the colors at the S and N locations were deconvoluted into their S1, S2, and S3 components. Using the stain-vector components of each spot in the LDA model (6 components) (Fig. 4d), the resulting confusion matrix had an accuracy of 92% (Fig. 4e), where it only mistook αS IgG for αS IgM, and was better than the LDA using RGB values of the spots (ESI† Fig. S3c and d).
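The stain-vector decomposition described above amounts to solving a 3 × 3 linear system for each spot. The sketch below is one way to do it, not necessarily the procedure used by the authors. It assumes colors are first converted to optical density (as in standard color deconvolution), and the two pure-stain RGB values are hypothetical placeholders rather than measured values.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def rgb_to_od(rgb: Sequence[float], white: float = 255.0) -> Vec3:
    """Convert 8-bit RGB to optical density (assumption: white = 255)."""
    return tuple(-math.log10(max(c, 1.0) / white) for c in rgb)


def normalize(v: Sequence[float]) -> Vec3:
    length = math.sqrt(sum(x * x for x in v))
    return tuple(x / length for x in v)


def cross(a: Sequence[float], b: Sequence[float]) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def det3(a: Sequence[float], b: Sequence[float], c: Sequence[float]) -> float:
    """Determinant of the matrix whose columns are a, b, c."""
    bc = cross(b, c)
    return a[0] * bc[0] + a[1] * bc[1] + a[2] * bc[2]


def stain_components(color_od: Sequence[float], s1_od: Sequence[float],
                     s2_od: Sequence[float]) -> Vec3:
    """Solve color = c1*S1 + c2*S2 + c3*S3 with S3 = S1 x S2 (Cramer's rule)."""
    s1, s2 = normalize(s1_od), normalize(s2_od)
    s3 = normalize(cross(s1, s2))
    d = det3(s1, s2, s3)
    return (det3(color_od, s2, s3) / d,
            det3(s1, color_od, s3) / d,
            det3(s1, s2, color_od) / d)


# Hypothetical pure-stain colors (placeholders, not measured values).
RED_STAIN_OD = rgb_to_od((200, 60, 70))    # stands in for red NP-αIgG on paper
BLUE_STAIN_OD = rgb_to_od((70, 90, 180))   # stands in for blue GNS-αIgM on paper

# A hypothetical purple spot, decomposed into red, blue, and residual parts.
spot_od = rgb_to_od((120, 70, 130))
print(stain_components(spot_od, RED_STAIN_OD, BLUE_STAIN_OD))
```

Applying this to both the S and N spots yields the six components per strip that are fed to LDA. A large S3 component would suggest that the spot color is not well described by a mixture of the two stains.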
To improve the test, additional trials using modified running conditions were performed. Conditions for a given trial were based on the analysis of the incorrect predictions of the previous trial. For trial 1, the mistake in prediction came from the inability to distinguish the color between blue and red (i.e., mistaking IgG for IgM). To address this, commercial red NPs (Innova) that are less susceptible to aggregation were used for the following trials. Because conjugation to commercial NPs relies on NHS–ester chemistry, TBS was added as a quencher as it contains amines (Fig. 4, trial 2). Resulting trial 2 strips (Fig. 4f) had a very weak signal at the S location on all strips.
LDA exhibited poor clustering (Fig. 4g) and ML analysis showed an accuracy of 50% (Fig. 4h), with multiple mistakes in which every class was mistaken for another. For example, αS IgM was erroneously classified as αS IgG + IgM, and αS IgG + IgM as the control and αS IgM. Clearly, the weak signal made it difficult for the system to be trained to accurately distinguish the classes.
For the 3rd trial, we changed the running conditions by removing the casein wash in between running the blue GNS and red NPs, as it could be washing away NP–Ab and thus reducing the visible signal. Additionally, TBS was removed from the solution when running the red NPs to reduce particle aggregation. This time, the resulting signal intensity at the S area was much stronger, resulting in colors that were visually more distinct (Fig. 4i), especially between αS IgG and αS IgM. LDA (Fig. 4j) and ML reached an accuracy of 100% (Fig. 4k). Thus, all classes were correctly predicted, indicating that we converged on running conditions that can be used to successfully identify each of the four cases for this subset.
Fig. 5 a) Strip runs for targets of N IgG and N subsets. b) Images of test strips. c) LDA. d) Confusion matrix showing 100% accuracy.
These were run in triplicate (ESI† Fig. S4). Under these conditions, LDA and ML yielded a 3 × 3 confusion matrix with an accuracy of 100% (Fig. 5c and d), showing that the assay running conditions were also optimal for this subset.
When the strip was run with all four antibodies (αS IgG + αS IgM + αN IgG + αN IgM), representing an infected patient in the mid stage post infection, both the S and N spots were colored, and both again appeared purple.
Even though the test color patterns were not distinguishable by eye, ML could still be trained to distinguish the different cases. The confusion matrix (Fig. 7a) had an accuracy of 96.3%, with only one error, where αS IgG + αS IgM + αN IgG + αN IgM was mistaken for αS IgM + αN IgM. LDA clustering showed minimal overlap apart from this one error (Fig. 6c, brown x). This indicates that for one strip, blue and purple were not successfully differentiated at both the S and N locations.
Fig. 8 Limit of detection (LOD) of the test for a) αS IgG, b) αN IgG, c) αS IgM, and d) αN IgM. e) Scree plot of the assay.
ML has already been demonstrated to be a powerful and versatile tool for arriving at a desired synthesis outcome, impacting a broad range of scientific areas. It has proven to be successful in the synthesis of organic molecules, soluble NPs, solid materials for batteries, natural compounds, fluorescent polymers, and many other species.12,33–36 The benefits of ML have been demonstrated for solution synthesis because all of the reagents are in a single phase, thus rendering the process suitable for completely autonomous control via microfluidics and liquid handling robots. In contrast, LFA development is less convenient to map to autonomous ML because the format is not in a single phase, where running assays involve utilizing solid materials such as paper strips in combination with solutions, so completing the feedback loop often requires human intervention. While modifications to running conditions were not performed autonomously, our results demonstrate the principle that the overall process can potentially use ML to aid development.
Even though the scale is small here, these results highlight the opportunity to utilize ML to rapidly create new tests for an emerging pathogen. Emergency preparedness for the next outbreak relies critically on diagnostics, which are the foundation of rapid response in disease surveillance and patient treatment.37 Delays in point of care test development can have dangerous consequences, especially early in an outbreak. Thus, any means to expedite this process could have a consequential impact on the availability of diagnostic tests.38
Furthermore, this approach can be easily shared and used to improve development time in areas with limited or no access to antibody production.39,40 This could potentially enhance the detection capabilities of reagents that are limited to what is currently available. Thus, this approach can expand access to diagnostic development and aid in confining the spread of a disease. We have already demonstrated that using multicolor NPs in combination with ML can be used to repurpose antibodies of one flavivirus to make a diagnostic for another.30 It has the potential to be applied to other diseases as well as emerging pathogens, and consequently help reduce response time during the critical stages of an outbreak, ultimately improving emergency preparedness.
IgG and IgM tests can increase the reach of disease surveillance tools and give a better handle on the impact of an outbreak. With the emergence of new variants,41 one can introduce reagents that distinguish antibodies of different variants, and the test can yield historical information on what a patient has been exposed to. Here, for SARS-CoV-2, they can be used to determine past vaccination and/or infection status. Looking back to the early stages of the COVID-19 pandemic, a point of care diagnostic available earlier in the outbreak, either for diagnosing current infections by detecting protein biomarkers or past infections by detecting antibodies, might have enabled containment and quarantining and resulted in different outcomes. The frequency of infectious disease emergence is increasing due to a variety of factors,42 and using ML in LFA development could better prepare us for the next outbreak43,44 by expediting the response with diagnostics. Future work includes testing patient samples, potentially differentiating variants so that one can determine what someone has been infected with, and extending the approach to other disease biomarkers.
Footnote
† Electronic supplementary information (ESI) available: Experimental workflow, strip replicates, and color deconvolution comparison. See DOI: https://doi.org/10.1039/d3sd00327b
This journal is © The Royal Society of Chemistry 2024