Henry E. Lanyon, Benjamin P. Todd and Kevin M. Downard*
Infectious Disease Responses Laboratory, Prince of Wales Clinical Research Sciences, Sydney, Australia. E-mail: kevin.downard@scientia.org.au
First published on 8th November 2023
A selected ion monitoring (SIM) approach combined with high resolution mass spectrometry is employed to identify and distinguish common SARS-CoV2 omicron and recombinant variants in clinical specimens. Mutations within the receptor binding domain (RBD) of the surface spike protein of the virus result in a combination of peptide segments of unique sequence and mass that were monitored to detect BA.2.75 (including CH.1.1) and XBB (including 1.5) variants prevalent in the state's population in early 2023. Using SIM detection of pairs of peptides unique to each variant, the variants were confidently detected and differentiated in 57.3% of the specimens, with a further 10 or 17.5% (for a total of 74.8%) detected based on a single peptide biomarker. The BA.2.75 sub-variant was detected in 18.7%, while recombinant variants XBB and XBB.1.5 were detected in 13.3% and 25.3% of the specimens respectively, consistent with circulating levels in the population characterised by RT-PCR. Virus was detected in 75 SARS-CoV2 positive specimens by mass spectrometry down to the low or mid 10⁴ copy level, with a single false positive and no false negative identified. This article is the first paper to characterise recombinant strains of the SARS-CoV2 virus by this, or any other, MS method.
Viral recombination plays a major role in increasing the variability of RNA viruses, enhancing their viral fitness, and accelerating their adaptation within new hosts. It can occur in both non-segmented and segmented RNA viruses, such as SARS-CoV2 (ref. 2) and type A influenza respectively. Both in silico3 and in vivo4 studies have presented evidence for recombination within SARS-CoV-2 strains.
Recombinant strains generally emerge when a major strain, associated with a wave of infections, begins to decline in the population at the same time that a new variant emerges.5 Strains associated with the recombination of delta and omicron variants (so-called deltacron) were of particular concern given their widespread circulation in the world's population and that these variants represented the deadliest and most mutated forms respectively.6 An analysis of SARS-CoV2 genome sequences collected from late 2019 to mid 2022 showed a substantial increase in the emergence of such SARS-CoV-2 recombinant lineages during the omicron wave.7
In early 2022, the World Health Organisation (WHO) reported8 three recombinant variants of SARS-CoV-2 that posed the greatest threat to public health. These were denoted XD, derived from both delta (AY.4) and omicron BA.1 variants, XE from two omicron variants (BA.1 and BA.2), and XF from a delta and omicron BA.1 variant. As of March 2023, the WHO declared the need to monitor additional common XBB and XBF recombinant variants. The XBB variants are a recombinant of BA.2.10.1 and BA.2.75 sub-lineages, while the XBF variant combines BA.5.2.3 and BA.2.75.3 sub-lineages (Fig. 1 and Table 1).
Fig. 1 Representation of the SARS-CoV2 genome of common delta-omicron (“deltacron”) and omicron recombinant variants.

| | Recombinant designation | Originating strains | Spike protein mutations | Date first identified |
|---|---|---|---|---|
| Delta-omicron recombinants | XD | Delta AY.4 and BA.1; omicron S gene incorporated into a Delta genome | T19R; A27S; T95I; G142D; R158G; L212I; G339D; S371L; S373P; S375F; K417N; N440K; G446S; S477N; T478K; E484A | 01/2022 |
| | XF | Delta AY.4 and BA.1 | A67V; T95I; Y145D; L212I; G339D; S371L; S373P; S375F; K417N; N440K; G446S; S477N; T478K; E484A | 02/2022 |
| Omicron recombinants | XE | BA.1 and BA.2; majority of S gene from BA.2 | T19R; A27S; G142D; V213G; G339D; S371L; S373P; S375F; T376A; D405N; R408S; K417N; N440K; S477N; T478K; E484A | 03/2022 |
| | XBB | BA.2.10.1 and BA.2.75 sublineages | G339H; R346T; L368I; V445P; G446S; N460K; F486S; F490S | 08/2022 |
| | XBB.1.5 | BA.2.10.1 and BA.2.75 sublineages | XBB + F486P | 01/2022 |
| | XBD | BA.5 and BA.2.75 sublineages | R346T; G446S; N450L; N460K; F486S; R493Q | 08/2022 |
| | XBF | BA.5.2.3 and CJ.1 (a distant descendant of BA.2.75) | BA.5 + K147E; W152R; F157L; I210V; G257S; G339H; R346T; G446S; N460K; F486P; F490S | Late 2022 |

Sources: Pavan et al. 2022;28 World Health Organisation (https://www.who.int/activities/tracking-SARS-CoV-2-variants), and NSW Government Agency for Clinical Innovation (https://aci.health.nsw.gov.au/covid-19/critical-intelligence-unit/sars-cov-2-variants).
Viral recombination can be difficult to detect whenever sub-lineages have minimal mutational differences.9 A typical strategy combines whole-genome sequencing with phylogenetics to detect unusual combinations of single nucleotide polymorphisms (SNPs) which are phylogenetically diverged.10 However, the limited numbers of SNPs that distinguish certain clades are also often clustered within short regions of the genome that, together with reversion events, restrict the ability to identify potential recombinant forms. Algorithms designed to detect recombination from such genetic data assess all SNPs equally, regardless of how phylogenetically significant they are, and thus are prone to misassignments.11
Other strategies to identify recombination events focus on the identification of mutations in the receptor binding domain (RBD) within the surface spike protein.12,13 Recombination events have been inferred to occur disproportionately in the 3′ portion of the genome,14 which contains the spike protein gene. The accrual of variant-defining spike protein mutations has been attributed to recombination events.9 Multiple amino acid mutations in the spike protein have been used to identify recombination events.15
In this laboratory, recombinant influenza viruses have been detected at the protein level with high resolution mass spectrometry using specifically designed, in-house software. Two algorithms, known as FluShuffle and FluResort, were shown to be able to identify reassorted influenza viruses from protein mass map data generated from whole virus digests.16 FluShuffle considers different combinations of viral protein identities that match the mass map data using a Gibbs sampling algorithm. FluResort maps those identities onto phylogenetic trees, constructed from viral protein sequence alignments, to calculate the weighted distance of each across two or more different trees. Each weighted mean distance value is normalized by conversion to a Z-score that is used to establish the probability of a reassorted strain. A combination of two strains represents a reassorted virus from one reassortment event, a combination of three strains represents a reassorted virus from two reassortment events, and so on. Minimum composite Z-scores were compared across differing numbers of reassortment events to determine whether or not the virus was reassorted.16
In parallel, we advanced a proteotyping strategy for typing and subtyping viruses17,18 to identify reassortment in 2009 pandemic strains of influenza19 through the detection of unique human and swine host-specific signatures co-circulating in viruses during the same period.20 Given that the SARS-CoV2 virus contains a single genome segment, rather than the segmented genome of influenza, evidence for recombination can be uncovered using this simpler viral protein strategy that identifies unique peptide biomarkers. These derive from the accumulation of multiple spike protein mutations that originate from different variants21,22 and combine into a single recombinant form. Those that are ionised and detected in expressed spike protein forms can be used to detect and distinguish recombinant viruses in proteolytically digested whole virus clinical specimens. This article represents the first paper to characterise recombinant SARS-CoV2 virus strains by this, or any other, MS method.
Virus was precipitated from the remainder of the solution through the addition of 95% ethanol solution cooled to −20 °C, collected on a 300 K molecular weight cut-off (MWCO) filter (Pall Corporation, Cheltenham, Victoria, Australia), and washed with purified water to recover the retentate. The filter-recovered virus was resuspended in a digestion buffer solution (100 μL of 50 mM ammonium bicarbonate, 10% acetonitrile (99.9% purity), 2 mM dithiothreitol, and 5 mM octyl β-D-glucopyranoside) at pH 7.5. The solution was sonicated for 15 min, incubated for 2 h at 37 °C, and digested overnight following the addition of 15 ng μL⁻¹ sequencing-grade modified trypsin (Promega Corporation, Sydney, Australia).
XBB is the most widespread inter-lineage omicron recombinant variant. It represents a recombination of two BA.2 lineages: BJ.1 (or BA.2.10.1.1) and a BA.2.75 variant (namely BA.2.75.3.1.1.1). XBB inherited the 5′ part of its genome from BJ.1 and the 3′ end of its genome from BA.2.75, with a single breakpoint within the receptor binding domain (RBD) of the surface spike protein. This breakpoint provides potent antigenic RBD mutations from both BJ.1 and BA.2.75.
Sequences comprising the RBD of the spike protein (residues 320–541, numbered according to the NCBI reference sequence, accession QHD43416.1) for the BA.2.75 subvariants and recombinant forms XBB.1 and XBB.1.5 are shown in Fig. 2. Because of the associated different mutations, and the propensity of proline to prevent tryptic cleavage where it resides C-terminal to lysine or arginine residues, a number of tryptic peptides of unique mass are produced. These are shown both boxed and shaded (each with a unique colour) together with their protonated monoisotopic masses located above or below their sequence.
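The effect of proline on tryptic cleavage described above can be illustrated with a minimal in silico digest. This is only a sketch: the function name `tryptic_digest` and both toy sequences are hypothetical (they are not taken from the spike protein), and missed cleavages or other enzyme behaviour are not modelled.

```python
def tryptic_digest(sequence: str) -> list[str]:
    """Cleave C-terminal to K or R, except when the next residue is proline."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Remaining C-terminal peptide that does not end in K or R
        peptides.append(sequence[start:])
    return peptides


# Hypothetical toy sequences (assumption, for illustration only).
parent = "AAKVGGRTTK"
mutant = "AAKPGGRTTK"  # V -> P directly after the first lysine

print(tryptic_digest(parent))
print(tryptic_digest(mutant))
```

In the toy mutant, the proline that follows the first lysine blocks cleavage there, so two peptides of the parent merge into one longer peptide of different mass. This is the kind of change that gives rise to variant-specific tryptic peptides.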
Fig. 2 Receptor binding domain sequences (residues 320–541) of spike protein of common omicron and recombinant SARS-CoV2 variants showing peptide segments of unique sequence and mass.
Those that exclusively distinguish the BA.2.75 subvariant and recombinant XBB forms span residues 329–346 in the BA.2.75 subvariant at m/z 2120.0382 and residues 329–355 in the XBB recombinant forms at m/z 3159.5145, and residues 467–509 at m/z 4815.2507, 4695.1779 and 4705.1986 in the BA.2.75, XBB.1 and XBB.1.5 forms respectively. Subsequent MALDI-MS analyses on the trypsin-digested, recombinant spike protein for each variant were used to establish the ionisability and detectability of these ions.
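Protonated monoisotopic masses of this kind can be computed by summing standard monoisotopic residue masses and adding one water and one proton. The sketch below assumes unmodified residues (for example, no cysteine alkylation or glycosylation); the function name `protonated_mass` and the short example peptides are hypothetical and are not the spike peptides themselves, whose full sequences appear only in Fig. 2.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565   # H2O added for the free N- and C-termini
PROTON = 1.007276   # charge carrier for the singly protonated ion


def protonated_mass(peptide: str) -> float:
    """Monoisotopic m/z of the singly protonated, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


# Hypothetical short peptides, for illustration only.
for pep in ("GG", "AAK"):
    print(pep, round(protonated_mass(pep), 4))
```

Applied to the full tryptic peptide sequences of Fig. 2, the same sum would be expected to give values close to the theoretical m/z values quoted above, provided the peptides carry no modifications.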
These ions do not appear in the spectrum for XBB.1.5 (Fig. 4). This spectrum displays variant-specific peptide ions at m/z 3159.5163, 3436.5914 and 4705.0770 (shown in bold), the latter unique to the XBB.1.5 form. These ions correspond to tryptic peptide residues 329–355, 425–454 and 467–509.
Subsequent narrow-band excitations, in selected ion monitoring (SIM) experiments, were performed to detect these ions in clinical specimens for diagnostic purposes. By focussing only on the variant-specific peptide markers of interest, SIM analysis reduces the total spectral acquisition time and improves sensitivity compared with full-scan MS measurements,25 which aids peptide detection at low virus titre levels.
A narrow mass range of 20 mDa (±0.01 m/z), centred on the theoretical monoisotopic values (Fig. 2), was monitored for each of the respective protonated peptide ions. The detection of at least two variant-specific ions, and not only a single peptide marker, was adopted for diagnostic purposes. This helps reduce, if not completely eliminate, the chance that another single viral peptide, unique to a particular strain, or a specimen contaminant, is mistaken for one of the variant-specific mass markers. The high mass accuracy achieved in these high resolution experiments also significantly reduces this possibility.
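The window and two-marker rule can be sketched as follows. This is an illustrative sketch and not the acquisition or processing software used in the study: the function names are hypothetical, the 2 ppm tolerance is the value quoted with Fig. 5, only the BA.2.75 marker pair whose theoretical values are given in the text is included, and the observed peak lists are invented.

```python
SIM_HALF_WIDTH = 0.01   # m/z, i.e. a 20 mDa window
PPM_TOLERANCE = 2.0     # tolerance quoted with Fig. 5

# Theoretical [M+H]+ values quoted in the text for the BA.2.75 marker pair.
MARKERS = {"BA.2.75": (2120.0382, 4815.2507)}


def sim_window(theoretical: float, half_width: float = SIM_HALF_WIDTH):
    """Lower and upper m/z bounds of the SIM window around a marker."""
    return theoretical - half_width, theoretical + half_width


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def count_matched_markers(observed_peaks, theoretical_markers,
                          tol_ppm: float = PPM_TOLERANCE) -> int:
    """Number of theoretical markers matched by at least one observed peak."""
    return sum(
        any(abs(ppm_error(obs, theo)) <= tol_ppm for obs in observed_peaks)
        for theo in theoretical_markers
    )


def classify(observed_peaks, variant: str, min_markers: int = 2) -> str:
    """Apply the two-marker rule to one specimen for one variant."""
    matched = count_matched_markers(observed_peaks, MARKERS[variant])
    if matched >= min_markers:
        return "confident"
    if matched == 1:
        return "single marker"
    return "not detected"


# Invented peak lists (assumption, for illustration only).
print(classify([2120.0390, 4815.2550], "BA.2.75"))
print(classify([2120.0390, 4815.2700], "BA.2.75"))
```

Requiring both members of a pair means that a single coincident contaminant peak within tolerance yields only a "single marker" result rather than a confident assignment.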
The data are represented in box-plot format (Fig. 5), as employed in a recent study of SARS-CoV2 sub-variants.22 The box edges indicate the first and third quartiles of intensities, and the central line indicates the median value. Of the 75 SARS-CoV2 RT-PCR positive specimens, ions at both m/z 2120 and 4815 (nominal mass for BA.2.75) were detected within 2 ppm of the theoretical values, with intensities above the 15 negative control specimens, in 14 (or 18.7%) of the samples. Ions with intensities above the controls at both m/z 3159 and 3436 (for XBB) were detected in 10 (or 13.3%) of the samples, while those at m/z 4695 and 4705 (for XBB.1.5) were detected in 19 (or 25.3%) of the samples. Virus titres across all samples were quantified at between 10⁴ and 10⁶ copies per mL by RT-PCR, and this is reflected in the significant deviation of the ion intensities from the box edges in Fig. 5.
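The box-plot summary and the comparison against negative controls can be sketched as below. Several things here are assumptions rather than details from the study: "above the controls" is interpreted as exceeding the highest control intensity for that ion, the quartiles use the default method of Python's `statistics.quantiles` (plotting software may use a different convention), and all intensity values are invented.

```python
import statistics


def box_summary(intensities):
    """First quartile, median and third quartile, as drawn in a box plot."""
    q1, median, q3 = statistics.quantiles(intensities, n=4)
    return {"q1": q1, "median": median, "q3": q3}


def above_controls(sample_intensity, control_intensities):
    """Assumed rule: positive if above every negative control for that ion."""
    return sample_intensity > max(control_intensities)


def both_markers_positive(sample, controls):
    """sample: {nominal m/z: intensity}; controls: {nominal m/z: [intensities]}."""
    return all(above_controls(sample[mz], controls[mz]) for mz in sample)


# Invented intensities (arbitrary units) for the two BA.2.75 marker ions.
controls = {2120: [1.0, 1.5, 2.0], 4815: [0.5, 0.8, 1.1]}
specimen = {2120: 6.3, 4815: 2.4}

print(box_summary([1, 2, 3, 4, 5, 6, 7]))
print(both_markers_positive(specimen, controls))
```

A stricter threshold, such as the control mean plus a multiple of the standard deviation, could be substituted in `above_controls` without changing the rest of the sketch.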
Fig. 5 Selected ion monitoring (SIM) MALDI-MS analysis of variant specific peptide ion biomarkers in trypsin digested, whole virus SARS-CoV2 positive and negative (control) samples. Intensities measured above those in the negative (controls) for each of the ions, detected within 2 ppm of the theoretical values (of Fig. 2), are deemed positive.
The results indicate that the BA.2.75, XBB and XBB.1.5 variants could be confidently detected and differentiated in 57.3% of the clinical specimens, while a further 10 or 17.5% (for a total of 74.8%) were identified based on the detection of one (and not two) of the peptide biomarkers. The remaining 25.2% either contained other variants, or were positive for the BA.2.75, XBB and XBB.1.5 variants but at levels lower than could be detected by mass spectrometry. A single false positive result (1.3%) was obtained for one of the control samples, based on RT-PCR analysis, with a single m/z 4695 ion associated with the XBB.1.5 variant detected above background control levels. No false negative results were found.
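The two-marker detection rates follow directly from the specimen counts reported above (14, 10 and 19 of 75 positive specimens). The short tally below reproduces that arithmetic; the variable and function names are illustrative only.

```python
TOTAL_POSITIVE = 75  # SARS-CoV2 RT-PCR positive specimens analysed

# Specimens in which both variant-specific ions were detected (from the text).
two_marker_calls = {"BA.2.75": 14, "XBB": 10, "XBB.1.5": 19}


def percent(count: int, total: int) -> float:
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * count / total, 1)


per_variant = {v: percent(n, TOTAL_POSITIVE) for v, n in two_marker_calls.items()}
combined = percent(sum(two_marker_calls.values()), TOTAL_POSITIVE)

print(per_variant)
print(combined)
```

The per-variant values round to 18.7%, 13.3% and 25.3%, and the combined 43 of 75 specimens correspond to the 57.3% figure quoted for confident detection.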
Contaminants unique to the unassigned specimens, or at higher concentration levels in these specimens, may also mask the detection of the variants under study by suppressing the ionization and detection of the peptide biomarkers of interest. Further, some additional strain-specific mutations within the segments studied may alter the masses of peptides, just as mutations at the nucleotide level can confound PCR based hybridization assays.
These, and separate studies, have revealed that mass spectrometry approaches can detect virus components to the low or mid 10⁴ copies per mL level, but are challenged below this level.18,25,26 A SARS-CoV-2 RNA quantitation PCR assay, in contrast, has a limit of detection (LoD) of some 25 copies per mL. These limits remain a challenge for mass spectrometry based studies, which directly detect virus present in clinical specimens without the benefit of amplification afforded by PCR analytical approaches. However, given that a viral load greater than approximately 10³ copies per mL is typically required for a patient to be infectious,27 mass spectrometry has a role to play as a diagnostic tool as infection takes hold. Indeed, the ability to reliably detect and distinguish the most common BA.2.75, XBB and XBB.1.5 SARS-CoV2 variants in up to 75% of positive specimens analysed in this study demonstrates the very real and powerful applicability of high resolution mass spectrometry in the analysis of SARS-CoV2 viruses within clinical specimens and settings.
Here a mass-only detection approach, with high mass accuracy and assignment confidence, is employed. Any high resolution MALDI based mass spectrometer, such as an extended-path TOF32,33 or Orbitrap (Kingdon trap)34 instrument, could be used in place of an ICR instrument for this purpose. This is coupled to SIM analysis, rather than full-scan experiments, to improve sensitivity. The SIM of multiple peptides associated with specific variants21 and sub-variants22 can also be conducted to differentiate these from recombinant forms. The pre-MS analysis steps are minimal, with the virus specimen filtered to remove extraneous contaminants, then precipitated and proteolytically digested. The digestion step could be accelerated using immobilized enzymes35 or other reported approaches.36
The confident detection of BA.2.75, XBB and XBB.1.5 variants in the majority of SARS-CoV2 positive clinical specimens analysed in this study, at percentages consistent with those characterised by RT-PCR sequencing that are in circulation in the local state (NSW) population, demonstrates the real power and benefits of a mass-based detection approach to follow variant-specific mutations in the surface spike protein. The ability to confidently detect and differentiate omicron sub-variants and emerging recombinant forms, without the need and considerable time (days) required for gene or peptide and protein sequencing, is of vital importance to manage outbreaks of the virus and develop effective clinical responses. The approach overcomes limitations of RT-PCR approaches, where the limited number of SNPs, together with reversion events, restricts the ability to identify potential recombinant forms.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3an01376f
This journal is © The Royal Society of Chemistry 2023