Lynda J. Donald a, Maureen Spearman a, Neha Mishra† a, Emy Komatsu b, Michael Butler ac and Hélène Perreault *b
aDepartment of Microbiology, University of Manitoba, Winnipeg, Manitoba R3T 2N2, Canada
bDepartment of Chemistry, University of Manitoba, 144 Dysart Road, Winnipeg, Manitoba R3T 2N2, Canada. E-mail: Helene.Perreault@umanitoba.ca
cNational Institute of Bioprocessing Research & Training (NIBRT), Fosters Avenue, Dublin 4, A94 X099, Ireland
First published on 4th March 2020
Electrospray mass spectrometry (ESI-MS) was used to measure the mass of an intact dimeric monoclonal antibody (Mab) and to assess its fucosylation level. The Mab under study was EG2-hFc, a chimeric human–camelid antibody of about 80 kDa (A. Bell et al., Cancer Lett., 2010, 289(1), 81–90). It was obtained from cell culture with and without a fucosylation inhibitor, and treated with EndoS, which cleaves between the two core N-acetylglucosamine (GlcNAc) residues. This is the first time that this combined approach, with a unique mass spectrometer, has been used to measure 146 Da differences as part of a large intact dimeric antibody. Results showed that in the dimer, both heavy chains were fucosylated on the core GlcNAc of the Fc Asn site equivalent to Asn297. In the presence of the fucosylation inhibitor, fucosylation was lost on both subunits. Following reduction, monomers were analyzed and the masses obtained corroborated the dimer results. Dimeric EG2-hFc Mab treated with PNGase F, to deglycosylate the protein, was also measured by MS for mass comparison. In spite of the success of the fucosylation level measurements, the experimental masses of deglycosylated dimers and GlcNAc–Fuc bearing dimers did not correspond to the masses calculated from our reference sequence (A. Bell et al., Cancer Lett., 2010, 289(1), 81–90; http://www.uniprot.org; http://www.expasy.org), which prompted experiments to determine the protein backbone sequence. Digest mixtures from trypsin, GluC, as well as trypsin + GluC proteolysis were analyzed by matrix-assisted laser desorption/ionization (MALDI) MS and MS/MS. A few variations were found relative to the reference sequence, which are discussed in detail herein. These measurements allowed us to build a new “experimental” sequence for the EG2-hFc samples investigated in this work, although there are still ambiguities to be resolved in this new sequence. MALDI-MS/MS also confirmed the fucosylation pattern in the Fc tryptic peptide EEQYNSTYR.
One functional problem of Mabs used in treatment is their large size (typically 150 kDa) which may make penetration into tissues more difficult, thus reducing efficacy.24 Genetic engineering was previously used to produce a truncated hybrid Mab with properties similar to camelid antibodies (produced in bactrian camels, dromedaries, and llamas) which contain heavy chains only, and lack the CH1 domain of the heavy chain that binds to the light chains through disulphide bonds.25 The resulting Mab, EG2-hFc is an 80 kDa chimeric antibody with a camelid fragment variable (Fv) attached by a hinge to a humanized fragment crystallisable (Fc) region.26 The latter region was derived from a cloned portion of human IgG1, but no sequence information was provided.27 We have used the sequences of EG2 and part of IgG1 (P01857) to create a reference EG2-hFc sequence because it was important to know in advance the expected mass of the various components. Fig. 1 shows the reference sequence with the position of the glycosylation sites analogous to the Asn297 site of full-length IgG1. EG2-hFc has also retained, at Asn208, the glycosylation pattern of IgG at its Asn297 site.9,28 In addition, the antigenic target of EG2-hFc, the epidermal growth factor receptor (EGFR), is able to bind through the camelid Fv region (VHH) and Fc receptor binding can occur through the human Fc to afford initiation of the ADCC response.19,29 The smaller size of EG2-hFc may allow better penetration and increased efficacy for the treatment of EGFR expressing cancer cells.26
Fig. 1 Reference sequence of EG2-hFc compiled from the literature. “Asn297” is in bold type. (A) The constructed sequence: Glu5 was revised to Val.26,27 Val97 of IgG1 becomes Ala126 in this hybrid. Calculated mass is 39946 Da for the nonglycosylated monomer (compute pI/MW, http://www.expasy.org); (B) N-terminus: amino acids 1–125 (NCBI ABX79392.1 anti-EGFR single domain antibody, partial [Lama glama]);26 (C) PCR primer hFc6, 3′5′ frame 2 translation27 to overlap llama and human sequences at the hinge region, thus creating the hybrid; (D) C-terminus: amino acids 98–330 from IgG1 (UniprotKB|P01857|IGHG1_HUMAN Immunoglobulin heavy constant gamma 1 (http://www.expasy.org)). Reference peptides are underlined (http://www.proteomicsdb.org).
Even though EG2-hFc is smaller than the usual Mabs, the presence of highly variable glycosylation makes the complete protein dimer a difficult subject for mass spectrometry measurements. All the glycans can be removed by incubation of the Mab with peptide:N-glycosidase F (PNGase F), an amidase which leaves a deamidated protein or peptide and a free glycan.30 The mass of the protein, about 80 kDa,26 can be ascertained by running the samples on an SDS-PAGE gel and the glycans can undergo structural analysis through a variety of methods.31 This is an excellent method for recovery and representative analysis of total glycans, using hydrophilic interaction liquid chromatography (HILIC) glycan analysis32 or MS.33 However, both methods give total percent fucosylation, and offer no information on the percentage of fucosylation on individual dimers of the intact protein. Therefore, we have also used a specific endoglycosidase, endoglycosidase S (EndoS), which removes the majority of the glycan, leaving only the core GlcNAc ± fucose. This allows MS analysis of the intact protein while removing the microheterogeneity of the glycan.
Intact proteins and their noncovalent complexes can be preserved in the gas phase if suitable conditions can be found.34,35 Electrospray ionization from a volatile buffer produces a mist of ionized droplets that are desolvated and focussed as they pass into the high vacuum conditions of the mass spectrometer. Folded proteins and their complexes usually carry less charge than denatured proteins because they should have less surface available to take up the charge. If a denaturing solvent is used to disrupt their folded conformation, the shape of the spectrum should change to a lower m/z range, enabling an accurate measurement of the mass of a purified protein. Our Manitoba-built time-of-flight mass spectrometer has been used successfully to determine stoichiometry of large complexes such as tetrameric AmpR protein with DNA and a repressor molecule,36 dsRNA and OAS with and without mercaptoethanol adducts,37 and others with high m/z values well beyond the range of most commercial instruments.38
Our primary aim was to develop methodology to differentiate the core fucosylated species (nonfucosylated, monofucosylated and difucosylated) on the full-length dimeric form of the protein. This was necessary to monitor the effects of limiting fucose addition by fucosylation inhibitors and genetic manipulation.39 Previous work has shown that the most abundant form of the dimer has two fucose molecules,9,28 consistent with our analysis. However, detailed analysis indicated that there was a small portion with no fucose and that the major protein species was considerably smaller than that calculated from the reference amino acid sequence illustrated in Fig. 1. Our secondary aim was to identify which amino acids are missing or modified from the reference sequence.
The EG2-hFc Mab was purified from the culture media using a HiTrap™ Protein A HP 1 mL column (GE Healthcare, Fairfield, CT). Cell culture supernatant was applied to the column, the column was washed with 10 mL PBS (pH 7.4), and the protein was then eluted with 0.1 M glycine–HCl, pH 2.7 (Sigma-Aldrich, St. Louis, MO). The protein solution was neutralized immediately using 1 M Tris–HCl, pH 9.0 (Thermo Fisher Scientific, Waltham, MA) and concentrated using an Amicon Ultra 30 kDa MWCO filter (EMD Millipore, Etobicoke, ON).
Two different enzymatic digests were used to decrease the amount of glycosylation on the protein prior to sample preparation for mass spectrometry analysis. Analysis of the glycans is normally done after a PNGase F (Promega, Wisconsin, USA) digestion on the Protein A column, removing all glycan residues,40,41 and allowing subsequent elution of a fully deglycosylated protein. In order to retain the core GlcNAc and fucose residues, the purified protein was bound to immobilized endoglycosidase S (deGlyIT, Genovis, Cambridge MA) to hydrolyze the β-1,4 linkage between the variable glycans and the core GlcNAc residue.42,43 One preliminary sample of EG2-hFc from another experiment39 was prepared from media containing 40 μM 2-fluoroperacetylated fucose (2FF), a competitive inhibitor of the FUT8 enzyme that normally adds the fucose to the core GlcNAc.44 SDS-PAGE gels were routinely run to ascertain purity and recovery of the protein at each stage of protein preparation.
Fig. 2 Electrospray spectra of the complete denatured EG2-hFc. (A) After EndoS digest. The protein was 10 μM in 50% methanol/1% acetic acid and sprayed at 200 V declustering voltage. Capillary voltage was 3000 V. Deconvolution is shown in Fig. 3A. (B) After PNGase F digestion. The protein, at 10 μM in 50% methanol/1% acetic acid, was sprayed at 150 V declustering voltage, 2127 V capillary voltage. Deconvolution is shown in Fig. 3B. (C) After EndoS digest from media with 2FF. The protein, at 10 μM in 50% methanol/1% acetic acid, was sprayed at 120 V declustering voltage, 3000 V capillary voltage. Deconvolution is shown in Fig. 3C. ■ GlcNAc, ◀ fucose.
The deconvolutions are shown together in Fig. 3, and the data are summarized in Table 1. Deconvolution of the ions in the envelope at higher charge states (29+ to 36+) from the spectrum shown in Fig. 2A (EndoS) revealed more than one mass, but the major species, at 80096 Da, agreed with the “about 80 kDa” cited in the literature.26 Deconvolution of the 17+ to 19+ ions gave a mass of 80098 ± 20 Da for the major species showing that it is from the same protein, but less well resolved. The most prominent minor component of 79848 Da did not fit with the loss of either fucose (146 Da) or GlcNAc (203 Da). The very small minor component was about 79670 Da. For the spectrum shown in Fig. 2B (PNGase F), deconvolution of the 16+ to 20+ ions gave a pattern quite similar to that of the EndoS sample with the exception of lower mass values, a major species at 79397 Da, and a minor one of 79158 Da. For the spectrum shown in Fig. 2C, from a protein expected to have no fucose on the core GlcNAc residues, the deconvolution of the ions of higher charge states has a pattern similar to those shown in Fig. 3A and B, but with intermediate masses, 79798 Da for the major species, and 79552 Da for the less abundant species. There may be other, minor species in the spectra but their contribution to the high baseline interferes with further identifications.
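The relationship between a neutral mass and its charge-state envelope can be sketched as follows. This is a minimal illustration only, not the deconvolution software used for Fig. 3; the function names are hypothetical, ions are assumed to be simple [M + zH]z+ species, and the 80096 Da value is the major species quoted above.

```python
PROTON = 1.00728  # mass of a proton in Da (standard value)


def mz_for_charge(mass, z):
    """m/z of an [M + zH]z+ ion for a neutral mass given in Da."""
    return (mass + z * PROTON) / z


def mass_from_adjacent_peaks(mz_low_charge, mz_high_charge):
    """Infer charge and neutral mass from two adjacent charge-state peaks.

    mz_low_charge carries charge z and mz_high_charge carries z + 1,
    so mz_low_charge is the larger m/z value.
    """
    z = round((mz_high_charge - PROTON) / (mz_low_charge - mz_high_charge))
    mass = z * (mz_low_charge - PROTON)
    return z, mass


# Predicted envelope for the 80096 Da major species (EndoS sample), 29+ to 36+
envelope = {z: mz_for_charge(80096, z) for z in range(29, 37)}
for z, mz in envelope.items():
    print(f"{z}+  m/z {mz:.1f}")

# Round trip: recover the charge and mass from the 29+ and 30+ peaks
z, mass = mass_from_adjacent_peaks(envelope[29], envelope[30])
print(z, round(mass))
```

In practice each charge state gives an independent estimate of the mass, and the spread of those estimates is one way to arrive at an error of the kind reported in Table 1.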
Fig. 3 Deconvolutions of the spectra shown in Fig. 2. (A) The 29+ to 36+ ions of the dimeric protein after EndoS digestion; (B) the 16+ to 20+ ions of the dimeric protein after PNGase F digestion; (C) the 29+ to 36+ ions of the dimeric protein after EndoS digest from culture media with 2FF. Detailed analysis, error measurements, and comparison with the expected masses are in Table 1. ■ GlcNAc, ◀ fucose.
Enzyme | Inhibitor | Core glycan | Monomer mass, expected (Da) | Monomer mass, observed (Da) | Dimer mass, expected (Da) | Dimer mass, observed (Da) | Fig. |
---|---|---|---|---|---|---|---|
PNGase F | — | None, Asn208Asp | 39947 | 39699 ± 7 | 79892 | 79397 ± 15; 79158 ± 27 | Fig. 2B and 3B |
EndoS | — | None | 39946 | — | 79890 | — | |
EndoS | 2FF | 1GlcNAc | 40149 | 39905 ± 8; 39780 ± 17 | 80093 | — | |
EndoS | 2FF | 2GlcNAc | n/a | n/a | 80296 | 79798 ± 14; 79552 ± 25 | Fig. 2C and 3C |
EndoS | — | 1GlcNAc, 1fucose | 40295 | 40050 ± 7; 39929 ± 11 | 80239 | — | |
EndoS | 2FF | 2GlcNAc, 1fucose | n/a | n/a | 80442 | — | |
EndoS | — | 2GlcNAc, 2fucose | n/a | n/a | 80588 | 80096 ± 14; 79848 ± 30; ca. 79670 | Fig. 2A and 3A |
The difference in mass between the major species shown in Fig. 2A, B and 3A, B (Table 1) was very close to that expected for two GlcNAc and two fucose residues (expected 698 Da, observed 699 Da). The minor species differed by nearly the same Δmass (690 Da), which agrees with the major species within the error of measurement. The differences in mass between the corresponding species in Fig. 2A, C and 3A, C are 298 Da (major) and 296 Da (minor), close to that expected for two fucose residues. The same calculation for the pairs of species in Fig. 2B, C and 3B, C gives 401 and 394 Da, close to what is expected for two GlcNAc residues. There are more precise methods of measuring the mass of pure GlcNAc and fucose, but these are 203 and 146 Da additions to an 80 kDa protein. The difference in mass between the major and minor species was about the same in all three spectra, and did not fit with either GlcNAc or fucose.
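The arithmetic behind these comparisons can be laid out as a short sketch. The observed masses and errors are taken from Table 1 and the nominal residue masses from the text; combining the two errors in quadrature is an assumption made here for illustration, as is ignoring the roughly 1 Da per site shift from Asn to Asp in the PNGase F sample. The variable and function names are hypothetical.

```python
import math

# Observed dimer masses (Da) and reported errors, from Table 1
OBSERVED = {
    "EndoS": {"major": (80096, 14), "minor": (79848, 30)},
    "PNGase F": {"major": (79397, 15), "minor": (79158, 27)},
    "EndoS + 2FF": {"major": (79798, 14), "minor": (79552, 25)},
}

GLCNAC = 203  # nominal residue mass used in the text, Da
FUCOSE = 146  # nominal residue mass used in the text, Da

# (heavier sample, lighter sample, expected difference for two Fc sites)
COMPARISONS = [
    ("EndoS", "PNGase F", 2 * (GLCNAC + FUCOSE)),
    ("EndoS", "EndoS + 2FF", 2 * FUCOSE),
    ("EndoS + 2FF", "PNGase F", 2 * GLCNAC),
]


def difference(a, b):
    """Mass difference a - b, with errors combined in quadrature (assumption)."""
    (mass_a, err_a), (mass_b, err_b) = a, b
    return mass_a - mass_b, math.hypot(err_a, err_b)


results = {}
for heavier, lighter, expected in COMPARISONS:
    for species in ("major", "minor"):
        delta, err = difference(OBSERVED[heavier][species],
                                OBSERVED[lighter][species])
        results[(heavier, lighter, species)] = (delta, err, expected)
        print(f"{heavier} - {lighter} ({species}): "
              f"{delta} ± {err:.0f} Da, expected {expected} Da")
```

Under these assumptions every observed difference lies within its combined error of the expected value, which is the sense in which the measurements are described as "close to" the expected additions.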
Fig. 4 Electrospray spectrum of the same sample shown in Fig. 2A after DTT treatment. The protein was at 10 μM in 50% methanol/1% acetic acid and sprayed at 220 V declustering voltage and 2611 V capillary voltage. Inset is the deconvolution of the 12+ to 14+ ions. ■ GlcNAc, ◀ fucose.
These measurements showed that the spectrum in Fig. 2A was from a protein with GlcNAc and fucose residues on both halves of the dimer, although the observed mass is almost 500 Da smaller than expected from the reference sequence in Fig. 1. The protein with no fucose on the core (Fig. 2C), and the one lacking any core glycans (Fig. 2B), were similarly about 500 Da smaller than expected from the calculated values. The measurable minor component in each case is about 250 Da smaller than the major one, a value that could be explained by a pair of lysine residues, and is borne out by the measurements of the monomers. An extra, optional, lysine could easily be lost from the C-terminus of the protein,48 but this does not take into account the 500 Da discrepancy in the total mass of the dimer, and half that amount in the total mass of the monomers. Preliminary experiments had verified the sequence of the initial tryptic peptide and the unique glycosylation site, but afforded little information on the rest of the protein. Thus we digested the protein with trypsin alone, with GluC alone, and with a mixture of GluC and trypsin.
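The lysine interpretation of the major/minor gap can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: the masses and errors come from Table 1, 128.17 Da is the standard average mass of a lysine residue, the errors are combined in quadrature (an assumption), and the pairing of each minor species with its major species follows the order in which they are listed in the table.

```python
import math

LYS_RESIDUE = 128.17  # average mass of one lysine residue, Da

# (label, major species, minor species, lysines assumed lost); masses in Da
PAIRS = [
    ("dimer, EndoS", (80096, 14), (79848, 30), 2),
    ("dimer, PNGase F", (79397, 15), (79158, 27), 2),
    ("dimer, EndoS + 2FF", (79798, 14), (79552, 25), 2),
    ("monomer, EndoS", (40050, 7), (39929, 11), 1),
    ("monomer, EndoS + 2FF", (39905, 8), (39780, 17), 1),
]

gaps = {}
for label, (major, err_major), (minor, err_minor), n_lys in PAIRS:
    gap = major - minor
    err = math.hypot(err_major, err_minor)  # quadrature: an assumption
    expected = n_lys * LYS_RESIDUE
    gaps[label] = (gap, err, expected)
    print(f"{label}: gap {gap} ± {err:.0f} Da, "
          f"{n_lys} x Lys = {expected:.1f} Da")
```

The dimer gaps are compared with two lysine residues and the monomer gaps with one, which is the pattern expected if each chain can lose a single C-terminal lysine.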
The results of the three kinds of digests are shown, together with our experimentally determined protein sequence, in Fig. 5. Three of the five reference tryptic peptide sequences of P01857 (underlined in Fig. 1D) were confirmed by MS/MS analysis. The ions matching peptide 204–212, with the unique glycosylation site, are illustrated in Fig. 6. The missing reference peptides were located at each end of the P01857 protein: the first and last peptides of IgG1. The first is located in the part of IgG1 replaced by the EG2 sequence. There was no ion matching that expected for the final tryptic peptide, although there was an ion from the GluC digest that might represent a larger, final peptide, for which there was insufficient material for further analysis.
We found six laboratory-determined differences (Fig. 5) from the reference sequence given in Fig. 1: Gln1 is pyroglutamate (Ƿ), Lys3 is Gln, Val126 is Ala, Asp181 is Gly, Tyr189 is His and Ala342 is Gly. The first pair of changes is in the initial peptide, where an abundant ion at m/z 1895.982 did not match any predicted from the published sequence. This was fortuitous, because MS/MS revealed it to be the first 19 amino acids of the protein, with the last y ion and all the b ions 17 Da too small, most likely from cyclization of the first glutamine residue to pyroglutamic acid with loss of ammonia.49,50 If amino acid three was lysine, the measured ion should have been at m/z 1896.019, and we would expect to observe another tryptic digest ion at m/z 1557.824. Thus, we believe the initial sequence of most of the hybrid protein is ǷVQLVESGGGLVQAGDSLR, where Ƿ represents pyroglutamate. However, the expected ion at m/z 1913.009 was also present at about 10% of the abundance of the ion at m/z 1895.982, indicating that the conversion to pyroglutamate was incomplete. The Val126Ala and Tyr189His changes were identified by overlapping fragments, and then confirmed by MS/MS analysis. The Asp181Gly and Ala342Gly changes were inferred from overlapping fragments, but the ions that fit with our expected sequences were of insufficient abundance for successful MS/MS analysis. There were no ions matching those expected for peptides having Asp at position 181 or Ala at position 342.
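The m/z values quoted for the N-terminal peptide can be reproduced from standard monoisotopic residue masses. The sketch below is illustrative only and is not the software used in this work; the function name and dictionary keys are hypothetical, and the Lys3 candidates are written from the reference sequence as described in the text.

```python
# Monoisotopic residue masses in Da (standard values)
RESIDUE = {
    "A": 71.03711, "D": 115.02694, "E": 129.04259, "G": 57.02146,
    "K": 128.09496, "L": 113.08406, "Q": 128.05858, "R": 156.10111,
    "S": 87.03203, "V": 99.06841,
}
WATER = 18.01056
PROTON = 1.00728
AMMONIA = 17.02655  # lost when an N-terminal Gln cyclizes to pyroglutamate


def mh_plus(sequence, pyroglu=False):
    """Monoisotopic m/z of the singly protonated peptide, [M+H]+."""
    mass = sum(RESIDUE[aa] for aa in sequence) + WATER + PROTON
    if pyroglu:
        if sequence[0] != "Q":
            raise ValueError("pyroglutamate from Gln needs an N-terminal Q")
        mass -= AMMONIA
    return mass


candidates = {
    "Gln1, Gln3": mh_plus("QVQLVESGGGLVQAGDSLR"),
    "pyroGlu1, Gln3": mh_plus("QVQLVESGGGLVQAGDSLR", pyroglu=True),
    "pyroGlu1, Lys3": mh_plus("QVKLVESGGGLVQAGDSLR", pyroglu=True),
    "residues 4-19, released if Lys3": mh_plus("LVESGGGLVQAGDSLR"),
}
for label, mz in candidates.items():
    print(f"{label}: m/z {mz:.3f}")
```

The Gln3 and Lys3 candidates differ by only about 0.04 in m/z, so the absence of the shorter tryptic peptide is the more decisive piece of evidence against lysine at position three.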
The peptide EEQYNSTYR with Asn208 (analogous to Asn297 of IgG) could have been present without glycosylation (at m/z 1189.513), with GlcNAc (at m/z 1392.592), and with both GlcNAc and fucose (at m/z 1538.650). In the sample prepared with EndoS, the ion at m/z 1538.650 was present at high abundance, and there was a very small amount of an ion at m/z 1392.610, evidence that a small amount of the protein was missing the fucose. In the sample prepared with PNGase F, there was no ion at m/z 1189.513 but rather a prominent ion at m/z 1190.510. Fig. 6 shows the fragmentation patterns of three forms of the tryptic peptide EEQYNSTYR, characteristic of EG2-hFc. The top MS/MS spectrum (A) was obtained after PNGase F and tryptic treatments, denoting the sequence EEQYDSTYR. The second spectrum (B) shows an m/z 50–626 range similar to that of (A), while the higher m/z peaks are shifted upwards by 203, the residual mass of GlcNAc. In (C), this increment is larger as it corresponds to GlcNAc–fucose, following EndoS and tryptic digestion. In (B) and (C), the higher m/z portions feature the characteristic (peptide − 16)+, (peptide + 1)+ and (peptide + 84)+ ions of glycopeptide MS/MS spectra,51 where peptide = mass of nonglycosylated peptide, which is 1188 in this case for EEQYNSTYR. The peptide − 16, + 1 and + 84 ions appear at m/z 1172, 1189 and 1272 in both spectra.
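The expected m/z values for the Fc tryptic peptide and its truncated glycoforms follow from standard monoisotopic masses, as sketched below. This is an illustration only, with hypothetical names; the GlcNAc and fucose values are the standard residue masses of HexNAc and deoxyhexose, and all ions are assumed to be singly protonated.

```python
# Monoisotopic residue masses in Da (standard values)
RESIDUE = {
    "D": 115.02694, "E": 129.04259, "N": 114.04293, "Q": 128.05858,
    "R": 156.10111, "S": 87.03203, "T": 101.04768, "Y": 163.06333,
}
WATER = 18.01056
PROTON = 1.00728
GLCNAC = 203.07937  # HexNAc residue
FUCOSE = 146.05791  # deoxyhexose residue


def mh_plus(sequence, glycan=0.0):
    """Monoisotopic m/z of [M+H]+ for a peptide with an optional glycan mass."""
    return sum(RESIDUE[aa] for aa in sequence) + WATER + PROTON + glycan


forms = {
    "EEQYNSTYR": mh_plus("EEQYNSTYR"),
    "EEQYNSTYR + GlcNAc": mh_plus("EEQYNSTYR", GLCNAC),
    "EEQYNSTYR + GlcNAc + Fuc": mh_plus("EEQYNSTYR", GLCNAC + FUCOSE),
    "EEQYDSTYR": mh_plus("EEQYDSTYR"),  # Asn converted to Asp by PNGase F
}
for label, mz in forms.items():
    print(f"{label}: m/z {mz:.3f}")
```

The Asn to Asp conversion adds just under 1 Da, which is why the PNGase F sample shows an ion near m/z 1190.5 rather than at m/z 1189.513.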
Another approach has been developed for fucosylation studies on Mabs. Upton et al. were interested in the core afucosylation levels of Herceptin™ Fc glycans, which they were able to correlate with receptor binding affinity studies by surface plasmon resonance (SPR) and ADCC.53 Their approach consisted of treating intact Herceptin™ with IdeS enzyme, releasing two separate truncated Fc chains with the glycans of interest. These were then treated with endoglycosidase EndoS, leaving either GlcNAc or GlcNAcFuc on each chain, as measured by mass spectrometry.53 A similar process was applied in industry, showing a high level of interest in fucosylation in the scientific community.54 Although this approach is effective at quantifying total fucosylation or afucosylation levels, it cannot specifically determine if, within each antibody molecule, both chains are fucosylated or not. We believe that our method of measuring the mass of the intact Mab after EndoS treatment brings in this level of specificity, as demonstrated by our results (Fig. 2 and 3).
Mass spectrometry technology is at the forefront of Mab analysis, having progressed from simple experiments to determine biotin additions on a small Mab55 to antibody–drug conjugates where a mass difference of 20 Da could be detected on the fully glycosylated Mab.56 Instrument development using commercial standards of IgG has shown the potential applications of microfluidics,57 variable temperature58 and Orbitrap™ MS56 to the characterization of intact antibodies. Electrospray ionization has been particularly useful in determining the mass and structure of intact proteins and non-covalent complexes.35,59–61
It would have been possible to perform the type of experiments reported here on a commercial high resolution ESI-MS instrument with a restricted m/z range, e.g. up to 4000. For example, Jacobs et al. developed a high resolution method for the quantitative study of glycosylation of intact Mabs and IgG from whole serum, using nano liquid chromatography chip technology on a Q-TOF instrument.62 Chip ESI led to multiply charged intact antibodies over an m/z range of 2000–3200 for the specific analyses of Trastuzumab™ and Bevacizumab™.62
This is the first mass spectrometry analysis of a Mab where the primary purpose was to measure the fucosylation on the intact EG2-hFc dimer. We have analyzed the protein with no glycan (PNGase F-treated), and also an EndoS-treated protein, from which most of the glycan has been removed, leaving only the core GlcNAc with or without the core fucose. The EndoS-treated protein was derived from Mab produced under normal conditions, which has a high fucosylation index (70–80%),11,19 or in the presence of a fucosylation inhibitor (40 μM 2FF), which produces a nonfucosylated protein.39 Our experiments went beyond measuring only the fucosylation level of intact EG2-hFc, and allowed us to assess differences in protein folding because there were well-defined ion species up to m/z 5500. Our results suggest that the absence of glycosylation prevents the natural unfolding process that was expected to happen in the presence of acid.
The first step in examining any protein should be to determine its nominal mass, which is not always exactly as expected from the provided amino acid sequence, and we were interested in measuring a mass difference of one or two fucose residues on the “about 80 kDa” protein.26 Thus, we used a methanol/acetic acid mixture to disrupt the secondary structure of the protein. This was not expected to affect the glycans, as glycopeptide analysis by MALDI-MS has previously shown intact glycan structures in the Fc region of EG2-hFc.33 We expected to see a single ion envelope at high charge state, such as shown by the ions in the lower m/z region in Fig. 2A and C. The ions at lower charge states from those samples that had retained some part of the core glycan were continuous with the higher charge state ions. Those ions could be explained as not quite denatured, or more folded, because the reduced surface area of a folded protein cannot hold as much charge as a typical denatured protein. However, the sample treated with PNGase F (Fig. 2B) had a spectrum typical of a folded protein, and no amount of dilution or rough treatment changed the spectrum. A reasonable explanation could be that the PNGase F-treated protein is not properly folded, but rather collapsed, so that the acid/methanol mixture is ineffective for solvating or unfolding it. Davis et al.63 reported that fully deglycosylated immunoglobulin superfamily I (IgSFI) tended to aggregate although it still bound ligand and antibody.
Preparation of antibody for crystallisation studies was successful when cells were made sensitive to Endo H, but the use of PNGase F was reported to cause aggregation.64 Folding studies on a mono-N-glycosylated immune cell receptor have shown that the core GlcNAc is a major contributor to its stability, and that this stability is enhanced when more of the glycan is present.65 Therefore, although PNGase F treatment is ideal for the preparation of the glycans, the residual protein is probably not properly folded. An EndoS preparation would leave at least the core GlcNAc, and a protein with more of its natural structure.
There are at least three sources of heterogeneity in EG2-hFc: the variable N-terminus, the variable C-terminus, and the variation in the core glycosylation. None of these fully explained the difference between the measured mass of the nonglycosylated monomer and that calculated from either the reference sequence (mass 39946 Da, Fig. 1) or the experimentally determined sequence (mass 39831 Da, Fig. 5). The new calculated mass of a dimer with 2GlcNAc and 2fucose (80358 Da) is still 262 Da larger than the measured mass (Table 1). With the exception of the small dipeptide GR (213.1226 Da), there was apparently complete coverage of the protein when we used the three digestion approaches. Loss of a C-terminal Lys residue has been observed and is considered common,48 but we cannot prove that the apparently lost lysine is the one shown in the sequence as residue 358. Additionally, the final tryptic peptide, one of the key reference peptides for P01857, was not observed, and, except for a weak ion of unconfirmed sequence from the GluC digest, there was no evidence for its presence, suggesting some C-terminal proteolysis beyond the loss of lysine. IgG1 has been shown to have at least three different C-termini, at Pro, Gly, and Lys,66 although these variants do not quite account for the 262 Da of missing sequence in our EG2-hFc. There is a great deal of heterogeneity in both the N- and C-termini of many Mabs, but these did not impact the potency or efficacy of the Mabs used clinically,67 so the final peptide is probably not important to the functionality of EG2-hFc. We did not anticipate such variability in the protein itself, variation that we are unable to explain completely at the present time.
Footnote
† Current address: Horizon Discovery, 8100 Beach Drive, Waterbeach, CB25 9TL, UK.
This journal is © The Royal Society of Chemistry 2020 |