Lei Li‡ a, Yunpeng Liu‡ a, Cheng Ma‡ a, Jingyao Qu a, Angie D. Calderon a, Baolin Wu b, Na Wei a, Xuan Wang a, Yuxi Guo a, Zhongying Xiao a, Jing Song a, Go Sugiarto c, Yanhong Li c, Hai Yu c, Xi Chen c and Peng George Wang *a
aDepartment of Chemistry and Center of Diagnostics & Therapeutics, Georgia State University, 50 Decatur St SE, Atlanta, GA 30303, USA. E-mail: pwang11@gsu.edu
bChemily, LLC, 58 Edgewood Ave NE, Atlanta, GA 30303, USA
cDepartment of Chemistry, University of California, Davis, One Shields Avenue, Davis, CA 95616, USA
First published on 23rd June 2015
Quantification, characterization and biofunctional studies of N-glycans on proteins remain challenging tasks due to the complexity, diversity and low abundance of these glycans. The availability of structurally defined N-glycan (especially isomer) libraries is essential to help solve these tasks. We report herein an efficient chemoenzymatic strategy, namely Core Synthesis/Enzymatic Extension (CSEE), for rapid production of diverse N-glycans. Starting with 5 chemically prepared building blocks, 8 N-glycan core structures containing one or two terminal N-acetyl-D-glucosamine (GlcNAc) residue(s) were chemically synthesized via consistent use of oligosaccharyl thioethers as glycosylation donors in a convergent fragment coupling strategy. Each of these core structures was then extended to 5 to 15 N-glycan sequences by enzymatic reactions catalyzed by 4 robust glycosyltransferases. Success in synthesizing N-glycans with Neu5Gc and core-fucosylation further expanded the ability of the enzymatic extension. Meanwhile, high performance liquid chromatography with an amide column enabled rapid and efficient purification (>98% purity) of N-glycans in milligram scales. A total of 73 N-glycans (63 isomers) were successfully prepared and characterized by MS2 and NMR. In summary, the CSEE strategy provides a practical approach for “mass production” of structurally defined N-glycans, which are important standards and probes for glycoscience.
N-glycans found in nature possess an inherent complexity and diversity. These are mainly due to the variable and multiple connectivity of glycan building blocks (monosaccharides) and the processes by which they are assembled in biosystems. In mammalian glycomes, numerous glycan structures, including branched-, regio- and stereo-isomers, can be formed from only 10 common monosaccharide building blocks.3 Unlike the precise template-directed transcription/translation of nucleic acids/proteins, glycan structures are determined by the activities of glycosyltransferases (GTs), glycosidases, and other glycan biosynthetic enzymes, as well as by the availability of donor substrates. For example, more than 30 GTs and glycosidases in the Golgi complexes of human cells are involved in processing N-glycans.4 The expression, activity, substrate specificity, and localization of each enzyme all have the potential to influence the assembly of N-glycans. It is thus understandable that N-glycans are extremely micro-heterogeneous even at one particular N-glycosylation site. For example, 58 different complex N-glycan structures were identified at one N-glycan site in mouse zona pellucida glycoprotein 3.4 As a result, despite decades of effort in developing novel approaches for glycan analysis,5 absolute quantification and characterization of complex mixtures of N-glycans remain challenging tasks. At present, the main approach for characterizing N-glycan isomers is ion-trap mass spectrometry (MS) analysis of permethylated glycans, which requires large quantities of sample and is therefore not suitable for low-abundance glycans and rare biological samples. Libraries of structurally defined N-glycans (especially isomers) would provide essential standards and probes for MS-based N-glycan analysis and glycan microarray studies of carbohydrate-binding proteins.
Given the difficulties in separating structurally defined glycans from natural resources, chemical or chemoenzymatic approaches have been developed for the synthesis of mostly symmetric N-glycans in the last two decades.6 Among chemically synthesized N-glycans, only a few contain terminal sialic acid (Sia) due to difficulties in sialic acid chemistry,7 which were only lately overcome by enzymatic glycosylation using sialyltransferases.8 Most recently, Boons developed a strategy for chemoenzymatic synthesis of asymmetrical N-glycans, and 14 tri-antennary complex N-glycans were obtained.9 Nevertheless, only a few N-glycan structures were prepared in each report, mainly due to their complexity and diversity. A simple and robust strategy for efficient production of large numbers of N-glycan structures is still highly desirable.
Another roadblock to the rapid access of glycans in high purity is the purification strategy, which now largely relies on gel filtration chromatography (usually Sephadex G-25 or Bio-gel P2).8,9 However, even though gel filtration has been applied for decades in purifying glycans, it is time-consuming, less efficient, and may waste a significant portion of products in preparing small quantities of precious N-glycans. Therefore, a more reliable and rapid N-glycan purification approach is yet to be developed.
LewisX [LeX, Galβ1,4-(Fucα1,3-)GlcNAc] and sialyl LewisX [SLeX, Siaα2,3-Galβ1,4-(Fucα1,3-)GlcNAc] are among the most biologically significant glycan epitopes. For example, LeX, also known as CD15 antigen or SSEA-1 trisaccharide, plays a role in the development of the central nervous systems of vertebrates,10 and interferes with pathogen transfer in breastfed infants.11 SLeX is a specific ligand on human leukocytes for E-, L-, and P-selectins, and has been shown to mediate leukocyte recruitment.12 SLeX (on both N- and O-glycans of glycoprotein zona pellucida) has also been shown to mediate human sperm binding during fertilization.13 In addition, LeX and SLeX are usually overexpressed on the surface of cancer cells.14 Despite their high significance, N-glycans containing these epitopes were not synthesized until recently.9 In this study, we describe an efficient Core Synthesis/Enzymatic Extension (CSEE) strategy and an HPLC-based purification approach for rapid preparation of N-glycans with/without (S)LeX epitopes. In this strategy, 8 N-glycan core structures with GlcNAc residue(s) at the non-reducing terminal were first synthesized by convergent assembly of 5 building blocks. A set of robust GTs was then used to elongate these cores to yield a library of 73 N-glycans (Fig. 1 and 3). The development of an HPLC-based approach using an amide column enabled rapid purification of milligrams of the chemoenzymatically synthesized N-glycans to a minimum of 98% purity. In addition, MS2 analysis of selected N-glycans yielded unique fragmentation patterns that may be used for distinguishing certain isomers.
We envisaged that trisaccharide 1 (Fig. 2) containing a crucial β-mannoside would be a versatile precursor for the synthesis of the core structures, as the C4,C6-hydroxyl groups (OH) of the β-Man are protected with benzylidene and the C3-OH is unprotected to allow further chemical glycosylation. Installation of β-mannoside, the most challenging task in N-glycan synthesis, was accomplished using Crich–Kahne conditions with satisfactory yield and β-selectivity.17 The benzylidene acetal ring can be selectively opened at either C6-OH (ref. 9) or C4-OH (ref. 18) of the β-mannoside for further chemical assembly. In order to prepare the target N-glycans, 8 GlcNAc terminated core structures (Fig. 2) were designed. Among these, N110 and N210 were partially protected by peracetylation of the GlcNAc residue on either the α1,6Man or α1,3Man branch for the synthesis of asymmetric bi-antennary N-glycans. After enzymatic extension of the unprotected branch, the acetyl groups can be removed easily for further elongation.
Fig. 2 The versatile trisaccharide precursor 1 and four donor fragments (2, 3, 4, 5) for the assembly of the 8 core structures.
The versatile precursor trisaccharide 1 (ref. 7b) and donor fragments 2 (ref. 19), 3 (ref. 20), 4 and 5 (ref. 21) were prepared as previously reported (ESI†). Using these building blocks, the syntheses of the 8 core structures were performed in a convergent strategy (Scheme 1). For example, to synthesize core structure N010, thioether donor 4 was first stereoselectively installed onto the C3-hydroxyl of trisaccharide 1 in the presence of N-iodosuccinimide (NIS)/AgOTf with a yield of 93%. Pentasaccharide 7 was then obtained in an excellent yield (96%) by selective opening of the benzylidene ring at C6 using Et3SiH/PhBCl2. The octasaccharide 8 was assembled by stereoselective installation of 5 onto the C6-hydroxyl of the β-Man of acceptor 7 with a yield of 85%. The two phthalimides of 8 were then converted into acetamides, followed by the global deprotection of Bn by catalytic hydrogenolysis with Pd(OH)2/H2 in MeOH/H2O (10:1). The core structure N010 was produced in a total yield of 63% over the three steps.
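The step yields quoted above can be combined into a cumulative figure for the route from trisaccharide 1 to N010. The following is a minimal sketch of that arithmetic only; the step labels are paraphrased, and treating the final 63% as a single combined stage is an assumption based on the wording "over the three steps".

```python
from math import prod

# Fractional yields quoted in the text for the route to core N010.
# The labels are paraphrases; the last entry lumps the three final
# steps together because only their combined yield is reported.
step_yields = {
    "glycosylation of 1 with donor 4": 0.93,
    "benzylidene opening to give 7": 0.96,
    "glycosylation of 7 with donor 5 to give 8": 0.85,
    "phthalimide conversion and hydrogenolysis": 0.63,
}


def overall_yield(yields):
    """Multiply fractional step yields to obtain the cumulative yield."""
    return prod(yields)


total = overall_yield(step_yields.values())
print(f"Cumulative yield from trisaccharide 1 to N010: {total:.1%}")
```

Under these assumptions the cumulative yield is just under one half, which illustrates why a short convergent route matters for material throughput.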
Similarly, cores N000, N020, N030, N050, N110, and N210 were synthesized by first installing 2, 3 or 4 onto the C3-hydroxyl of the β-Man of 1, followed by installation of the corresponding building blocks onto the α1,6Man branch. For the synthesis of N040, simple 3-O-benzylation and controlled reductive cleavage of the benzylidene acetal of 1 were performed to afford acceptor 15, which was then glycosylated with 4 to yield pentasaccharide 16 in 91% yield and with satisfactory stereoselectivity (α/β = 3.5:1). Compound 16 was further deprotected to yield N040 as previously described. The structures and stereochemistry of all glycosidic linkages were confirmed by NMR (ESI†).
Utilizing the enzymatic extension strategy, N011–N015 were prepared starting with the chemically prepared core N010 (Fig. 3A). Firstly, in a 1.5 mL reaction system, 9 mg of N010 (4 mM) was incubated with UDP-Gal (8 mM), MnCl2 (5 mM), and B4GALT1 (20 μU per μmol acceptor). One microliter of the reaction mixture was sampled every hour for analysis. MS analysis showed a peak at m/z = 719.7645, corresponding to N011 [M + 2H]2+. Meanwhile, on the HPLC-ELSD (evaporative light scattering detector) profile, a new peak (TR = 16.79 min) was observed, whose area grew while that of the peak corresponding to N010 (TR = 14.86 min) shrank. After 6 h of incubation, the reaction was freeze-quenched at −80 °C for 30 min and concentrated to 300 μL for HPLC purification using a water/acetonitrile gradient elution, yielding 9.4 mg of N011 (94% yield). The purified N011 (99% pure) was then utilized for the syntheses of N012, N013, and N014 (Fig. 3A) catalyzed by PmST1m, Pd2,6ST, and Hpα1,3FT, respectively (see ESI† for details). It is worth noting that the reaction for the synthesis of N012 was only allowed to proceed for 30 min due to the sialidase activity of PmST1m. N015 was then synthesized from N012 using Hpα1,3FT. The reaction took 20 h to achieve complete conversion (Fig. S4†). Similarly, starting with the other chemically synthesized cores (N000, N020, N030, N040 and N050), N-glycans N001–N005, N021–N025, N031–N035, N041–N045 and N051–N055 were prepared in a manner analogous to that described above. All prepared N-glycans were analyzed by HPLC-ELSD, ESI/MALDI-MS, and NMR to confirm their purity and structure (ESI†).
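As a worked illustration of the galactosylation setup, the concentrations and volume quoted above translate into absolute amounts as follows. This is a sketch of the unit arithmetic only (mM × mL = μmol); the function and variable names are hypothetical, and no molecular weights are assumed.

```python
def micromoles(conc_mM: float, volume_mL: float) -> float:
    """Convert a concentration in mM and a volume in mL to micromoles."""
    return conc_mM * volume_mL


VOLUME_ML = 1.5  # reaction volume given in the text

acceptor_umol = micromoles(4, VOLUME_ML)  # N010 at 4 mM
udp_gal_umol = micromoles(8, VOLUME_ML)   # UDP-Gal at 8 mM
mncl2_umol = micromoles(5, VOLUME_ML)     # MnCl2 at 5 mM

# Donor excess relative to acceptor, and enzyme loading at the quoted
# 20 microunits of B4GALT1 per micromole of acceptor.
donor_equiv = udp_gal_umol / acceptor_umol
enzyme_microunits = 20 * acceptor_umol

print(f"acceptor: {acceptor_umol} umol, UDP-Gal: {udp_gal_umol} umol "
      f"({donor_equiv:.1f} equiv), MnCl2: {mncl2_umol} umol, "
      f"B4GALT1: {enzyme_microunits} uU")
```

The same helper can be reused for the sialylation and fucosylation steps once their concentrations are taken from the ESI.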
The syntheses of asymmetric bi-antennary N-glycans N1xx and N2xx (Fig. 1) were carried out by enzymatic extension of the unprotected antenna first and then the other. The synthesis of N1xx is illustrated in Fig. 3B. Firstly, Gal was added by B4GALT1 to the GlcNAc residue in the α1,3Man branch of N110 to form N110a; galactosylation on the α1,6Man branch was avoided by peracetylation of the corresponding GlcNAc residue. It should be noted that partial de-acetylation was observed when the reaction was incubated for over 12 h. After HPLC purification, N110a was de-acetylated using 30% ammonium hydroxide:H2O (1:10) to afford N111, which was then used as a substrate for synthesizing the other N1xx glycans in a strictly controlled sequential manner. For example, to obtain N155, the α1,3Man branch was first extended by PmST1m (Step 1) and Hpα1,3FT (Step 2) to yield N115; the α1,6Man branch was then extended by B4GALT1 (Step 3) and Hpα1,3FT (Step 4) (Fig. 3B). Such synthetic routes were designed according to the substrate specificities of the GTs to avoid undesirable glycosylation. In particular, N144 was not designed to be synthesized from N124, to avoid potential sialylation on the α1,3Man branch by Pd2,6ST. Instead, N-glycan N244 was synthesized from N123 catalyzed by Hpα1,3FT (Fig. 3B). Similarly, N-glycans N2xx and N144 were synthesized from N210 (Fig. S5†).
N-Glycolylneuraminic acid (Neu5Gc), often found in mammalian glycans, is another common sialic acid besides N-acetylneuraminic acid (Neu5Ac).25 Even though human cells cannot produce Neu5Gc because of the inactivation of the gene encoding CMP-Neu5Ac hydroxylase,26 it is frequently detected in glycans of cancer cells, probably due to metabolic incorporation from Neu5Gc-containing structures in the diet.25 Previously, the two sialyltransferases (PmST1, Pd2,6ST) used in the enzymatic extension were shown to be extremely promiscuous towards sugar donors, and were applied in an efficient synthesis of a number of sialosides and their derivatives.23,27 To further expand the current library, N-glycans with the Neu5Gc residue (N012G and N013G) were synthesized via a one-pot two-enzyme system (Fig. 3C). In detail, for the synthesis of N012G, 3 mM of N011 was incubated with 5 mM of Neu5Gc and cytidine 5′-triphosphate (CTP), 5 μg mL−1 of PmST1m, and an excess of CMP-Sia synthetase (NmCSS). After 30 min of incubation at 37 °C (94% conversion as detected by HPLC), the reaction mixture was concentrated and subjected to HPLC purification. The synthesis of N013G was achieved by simply replacing PmST1m with Pd2,6ST. Surprisingly, it was found that the incorporation of a Neu5Gc residue resulted in a longer retention time (by >1 min) on the amide column compared to the Neu5Ac counterpart (N012, TR = 17.93 min; N012G, TR = 19.13 min; N013, TR = 19.09 min; N013G, TR = 20.16 min) (ESI†). Fucosylation of N012G (to generate N015G) was shown to be as efficient as that of N012, indicating that Hpα1,3FT can tolerate substrates with Neu5Gc. Theoretically, another set of 57 N-glycans can be easily synthesized by simply replacing Neu5Ac of the glycans in Fig. 1 with Neu5Gc.
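The retention times quoted above make the Neu5Gc effect easy to quantify. The short sketch below simply subtracts the Neu5Ac value from the Neu5Gc value for each pair; the dictionary and function names are illustrative and not taken from the source.

```python
# Retention times (min) on the amide column, as quoted in the text.
retention_min = {
    "N012": 17.93,   # Neu5Ac alpha2,3
    "N012G": 19.13,  # Neu5Gc alpha2,3
    "N013": 19.09,   # Neu5Ac alpha2,6
    "N013G": 20.16,  # Neu5Gc alpha2,6
}

# Each pair is (Neu5Ac glycan, Neu5Gc counterpart).
pairs = [("N012", "N012G"), ("N013", "N013G")]


def gc_shift(ac_name: str, gc_name: str) -> float:
    """Extra retention (min) of the Neu5Gc glycan over its Neu5Ac analogue."""
    return round(retention_min[gc_name] - retention_min[ac_name], 2)


shifts = {gc: gc_shift(ac, gc) for ac, gc in pairs}
for name, delta in shifts.items():
    print(f"{name}: +{delta:.2f} min relative to the Neu5Ac counterpart")
```

Both differences exceed one minute, consistent with the ">1 min" statement for the two linkages examined.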
Core-fucosylated N-glycans are widely found in mammalian glycoproteins and are particularly abundant in brain tissues.28 Alteration of core-fucosylation has been associated with human cancers, chronic hepatitis, etc.29 Thus, the ability to prepare homogeneous core-fucosylated N-glycans is important. The N-glycans prepared above are well-suited substrates for specificity studies of α1,6-fucosyltransferase (FUT8), the sole enzyme responsible for the core-fucosylation of N-glycans, and for preparing a core-fucosylated N-glycan library. Specifically, 4 N-glycans with an identical α1,3Man branch but a different α1,6Man branch were selected for FUT8-catalyzed core-fucosylation (Fig. 3D). The results showed that FUT8 was highly active in using all 4 N-glycans as acceptors. The corresponding core-fucosylated N-glycans (N6030, N6000, N6211, N6212) (0.5–1 mg each) were synthesized accordingly. Further substrate specificity studies showed that FUT8 may have a stricter requirement for structures on the α1,3Man branch than for those on the α1,6Man branch (detailed study is ongoing).
Hydrophilic interaction liquid chromatography (HILIC) provides a rapid and effective strategy for separating small polar compounds, and has been used extensively in glycan analysis.30 In these cases, N-glycans from biological samples have usually been fluorescently labeled via reductive amination and then detected on a picomole scale by UPLC-HILIC. However, HILIC has not been applied to milligram scale N-glycan purification. Using an analytical HILIC column (XBridge BEH amide column, 5 μm, 4.6 mm × 250 mm, Waters) under a gradient running condition (solvent A: 100 mM ammonium formate, pH 3.4; solvent B: acetonitrile; flow rate: 1 mL min−1; B%: 70–50% within 50 min), the abovementioned Bio-gel P2-purified products were analyzed. Four peaks were observed in the HPLC profile using an evaporative light scattering detector (Fig. S1†). Peaks 1 (TR = 21.68 min) and 2 (TR = 22.16 min) were next to each other and partially overlapped. The same was observed for peaks 3 (TR = 24.51 min) and 4 (TR = 25.04 min). These peaks were collected in a parallel run monitored at A210 nm and subjected to MS analysis. The same m/z values were observed for peaks 3 (821.2997) and 4 (821.2991) (Fig. S1†), implying that both peaks represented N-glycan N001 [M + 2H]2+, possibly as α and β anomers, which is common for free glycans due to mutarotation in water. This was confirmed by 1H NMR analysis that showed chemical shifts of both α and β anomeric protons (ESI†). Similarly, peaks 1 and 2 represented α and β anomers of N001 minus a Gal residue [M + 2H]2+.
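The statement that peaks 3 and 4 share "the same m/z" can be made quantitative. The sketch below, which uses only the two values quoted above plus the standard proton mass, expresses their difference in parts per million and back-calculates the neutral mass implied by a doubly protonated ion. The function names are hypothetical.

```python
PROTON_MASS = 1.007276  # Da, standard value for the proton


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral monoisotopic mass implied by an [M + zH]z+ ion."""
    return charge * (mz - PROTON_MASS)


def ppm_difference(mz_a: float, mz_b: float) -> float:
    """Relative difference between two m/z values in parts per million."""
    return abs(mz_a - mz_b) / mz_b * 1e6


peak3_mz, peak4_mz = 821.2997, 821.2991  # values quoted in the text

delta_ppm = ppm_difference(peak3_mz, peak4_mz)
mass_from_peak3 = neutral_mass(peak3_mz, charge=2)

print(f"peak 3 vs peak 4: {delta_ppm:.2f} ppm apart")
print(f"neutral mass implied by peak 3: {mass_from_peak3:.4f} Da")
```

A sub-ppm difference is well within what is usually considered identical on a high-resolution instrument, which supports assigning both peaks to the same composition and attributing the split to anomers.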
These results encouraged us to purify N001 using a semi-preparative HILIC column (10 × 250 mm). Under a similar gradient running condition (solvent A: 100 mM ammonium formate; solvent B: acetonitrile; flow rate: 4 mL min−1; B%: 70–50% within 50 min; monitored at A210 nm), 10.5 mg of N001 was separated in 3 injections (Fig. S2†) with a purity higher than 98% as analyzed by HPLC-ELSD (ESI†). Different solvent combinations were later tested for N-glycan purification (Fig. S6†). The results showed that 100 mM ammonium formate/acetonitrile gradient elution gave the best separation of all N-glycans tested. In addition, N-glycans without Sia residues were separated to a similar level using water/acetonitrile gradient elution, whereas sialylated N-glycans were eluted rapidly (TR < 3 min). Furthermore, it was found that a shorter running time with a narrower B% gradient (65–50% in 25 min) was able to achieve a similarly good separation. Such running conditions were applied to separate enzymatically synthesized N-glycans to 98% purity (ESI†).
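To put the analytical and semi-preparative conditions side by side, the sketch below computes three quantities from the numbers quoted above: the gradient slope of each programme, the load per injection, and the flow rate that a cross-sectional-area scale-up would suggest. The area-scaling rule is a common chromatography rule of thumb and an assumption here; the text does not say how the 4 mL min−1 flow rate was chosen.

```python
def gradient_slope(start_pct: float, end_pct: float, minutes: float) -> float:
    """Rate of change of solvent B in percent per minute."""
    return (start_pct - end_pct) / minutes


def area_scaled_flow(flow_mL_min: float, d_from_mm: float, d_to_mm: float) -> float:
    """Flow rate giving the same linear velocity on a column of another
    inner diameter (assumed rule of thumb: scale by the diameter ratio squared)."""
    return flow_mL_min * (d_to_mm / d_from_mm) ** 2


long_slope = gradient_slope(70, 50, 50)   # 70-50% B within 50 min
short_slope = gradient_slope(65, 50, 25)  # 65-50% B within 25 min

load_per_injection_mg = 10.5 / 3          # 10.5 mg of N001 in 3 injections

# Analytical column: 4.6 mm i.d. at 1 mL/min; semi-preparative: 10 mm i.d.
suggested_flow = area_scaled_flow(1.0, 4.6, 10.0)

print(f"gradient slopes: {long_slope:.1f} and {short_slope:.1f} %B/min")
print(f"load per injection: {load_per_injection_mg:.1f} mg")
print(f"area-scaled flow for the 10 mm column: {suggested_flow:.2f} mL/min")
```

Under this assumption the scaled flow comes out somewhat above the 4 mL min−1 actually used, so the semi-preparative run was operated at a slightly lower linear velocity than the analytical one.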
Under a standard running condition (solvent A: 100 mM ammonium formate; solvent B: acetonitrile; flow rate: 1 mL min−1; B%: 65–50% within 25 min), all purified N-glycans were analyzed by HPLC-ELSD (ESI†). It was found that when different sugar residues were added to N-glycans, the retention time shifts of peaks on HPLC chromatograms generally decreased in the following order: Neu5Gcα2,3 with Fucα1,3 > Neu5Gcα2,6 > Neu5Acα2,3 with Fucα1,3 > Neu5Gcα2,3 > Neu5Acα2,6 > Galβ1,4 > Fucα1,3 > Neu5Acα2,3 > Fucα1,6. For example, the retention times of N015G, N013G, N015, N012G, N013, N014, N012 are 20.66, 20.16, 19.39, 19.13, 19.09, 18.59, and 17.93 min respectively. Such regularity may be found useful in HILIC-based profiling and identification of N-glycans.
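The elution order described above can be reproduced directly from the seven retention times given in the example. The following sketch ranks the glycans and reports the closest-eluting adjacent pair; it uses only values quoted in the text, and the variable names are illustrative.

```python
# Retention times (min) under the standard running condition, from the text.
retention_min = {
    "N012": 17.93,
    "N014": 18.59,
    "N013": 19.09,
    "N012G": 19.13,
    "N015": 19.39,
    "N013G": 20.16,
    "N015G": 20.66,
}

# Rank from most to least retained.
ranked = sorted(retention_min, key=retention_min.get, reverse=True)

# Gap (min) between each glycan and the next one down the ranking.
gaps = [
    (round(retention_min[a] - retention_min[b], 2), a, b)
    for a, b in zip(ranked, ranked[1:])
]
closest_gap, later, earlier = min(gaps)

print(" > ".join(ranked))
print(f"closest pair: {later} and {earlier}, {closest_gap:.2f} min apart")
```

The ranking matches the order listed in the example, and the smallest gap flags the one pair in this set that would be hardest to tell apart by retention time alone.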
This work addresses the three obstacles described above. Firstly, a highly efficient strategy was developed based on the consistent use of oligosaccharyl thioethers for the convergent installation of branched GlcNAc-terminated antennae to achieve high stereoselectivity with excellent yields. This approach minimized synthetic steps and maximized yield, and proceeded very efficiently with less glycosyl donor (1.3 equivalents) and under mild conditions (at 0 °C). Notably, when the (Ac3)GlcNAcβ1,2-Man disaccharide thioether 3 and Bn-GlcNAcβ1,2-Man disaccharide thioether 4 were used as donors, installations on the 3-OH of trisaccharide 1 were achieved with excellent yield and high stereoselectivity. We were also able to install the Man3 thioether donor 5 on the 6-OH of the β-Man in good yield and with high stereoselectivity, as seen in our previous report.18,32 Using this strategy, 8 N-glycan core structures with 5–8 monosaccharide residues were convergently synthesized. We expect that this strategy will allow us to prepare more N-glycans with various glycoforms for enzymatic extension.
Secondly, a general enzymatic extension strategy was developed that can extend any GlcNAc-terminated glycan to 5 more glycans (including LeX and SLeX) using B4GALT1 and three robust bacterial GTs. Such a strategy enabled the generation of 5–15 more N-glycans from each chemically synthesized core structure. During the synthesis of these N-glycans, each of the GTs was tested towards 10 to 21 N-glycan acceptors. For example, PmST1m showed comparably high activities towards N001, N011, N021, N031, N041, N051, N111, N123, N124, N125, N211, N223, N224 and N225 (which share a common Galβ1,4-GlcNAc motif), and efficiently catalyzed the formation of the corresponding α2,3-sialylated N-glycans. In addition, the successful synthesis of the Neu5Gc-terminated N-glycan N012G indicated that PmST1m is also promiscuous towards sugar donors. Furthermore, a substrate specificity study revealed that Hpα1,3FT readily accepts various N-glycans terminated with LacNAc or Siaα2,3LacNAc (ESI, Table S1†). Similarly relaxed substrate specificities were also found for B4GALT1 and Pd2,6ST towards various N-glycan acceptors. These results clearly indicate that: (a) the 4 robust GTs only recognize the adjacent one or two monosaccharide residues in the glycosylation reactions, and thus have great potential to extend various N-glycans; (b) the promiscuity of the bacterial GTs towards the sugar donors is not affected by acceptors, whether simple oligosaccharides23,27,33 or complex N-glycans are used, and thus they have great potential to synthesize N-glycan derivatives.
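The way one GlcNAc terminus gives rise to 5 further glycans can be sketched as a small reaction graph. The edges below follow the enzyme sequence described for N010 to N011–N015; the epitope labels and the function name are illustrative, and the graph deliberately ignores the branch-protection logic needed for asymmetric cores.

```python
from collections import deque

# Terminal epitope -> list of (enzyme, product epitope), following the
# sequence described in the text for extending a single GlcNAc arm.
EXTENSIONS = {
    "GlcNAc": [("B4GALT1", "LacNAc")],
    "LacNAc": [
        ("PmST1m", "Siaα2,3-LacNAc"),
        ("Pd2,6ST", "Siaα2,6-LacNAc"),
        ("Hpα1,3FT", "LeX"),
    ],
    "Siaα2,3-LacNAc": [("Hpα1,3FT", "SLeX")],
}


def enumerate_products(start: str = "GlcNAc") -> dict:
    """Breadth-first walk returning {product: ordered list of enzymes}."""
    routes = {}
    queue = deque([(start, [])])
    while queue:
        epitope, path = queue.popleft()
        for enzyme, product in EXTENSIONS.get(epitope, []):
            if product not in routes:
                routes[product] = path + [enzyme]
                queue.append((product, routes[product]))
    return routes


routes = enumerate_products()
for product, enzymes in routes.items():
    print(f"{product}: {' -> '.join(enzymes)}")
```

The walk yields five products per GlcNAc arm, with SLeX requiring the longest route (galactosylation, α2,3-sialylation, then fucosylation), in line with the order of steps used in the syntheses above.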
Thirdly, instead of generally used gel-filtration, each N-glycan was purified to >98% by HPLC utilizing a HILIC column on milligram scales (up to 4 mg per run). This HPLC-based approach could well separate complex N-glycans with only one monosaccharide difference, and takes only 30 min per injection.
Among the synthesized structures, only a few (e.g. N011, N001, N002, N003, N6000) were previously synthesized via chemical7b,15c,34 or chemoenzymatic approaches.8b,31,35 This library covers a number of low molecular weight N-glycans which have or have not been identified,36 including the most common hybrid and bi-antennary complex types. More importantly, this work represents the first report of preparing highly pure N-glycan isomers. This N-glycan library contains 21 groups of isomers (Fig. S7†) (2 to 6 distinct structures in each group), e.g. glycans N125, N134, N144, N225, N234 and N244 are isomers with the same molecular weight of 2077.7455. These groups of isomers are valuable standards that may be applied in absolute quantification and structural identification of N-glycans by MS.
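Isomers share a monosaccharide composition and therefore a monoisotopic mass, which is why MS1 alone cannot distinguish them. As a hedged illustration, the sketch below computes the mass of a free glycan from standard monoisotopic residue masses. The composition used (HexNAc4 Hex5 Neu5Ac1 Fuc1) is an assumption that is not stated in the text; it was chosen because it reproduces the quoted value of 2077.7455.

```python
# Standard monoisotopic residue masses (Da), i.e. monosaccharide minus water.
RESIDUE_MASS = {
    "Hex": 162.052823,     # Man, Gal
    "HexNAc": 203.079373,  # GlcNAc
    "Fuc": 146.057909,
    "Neu5Ac": 291.095417,
    "Neu5Gc": 307.090331,
}
WATER = 18.010565  # added once for the free reducing-end glycan


def glycan_mass(composition: dict) -> float:
    """Monoisotopic mass of a free glycan from its residue counts."""
    return sum(RESIDUE_MASS[res] * n for res, n in composition.items()) + WATER


# Assumed composition for the isomer group quoted in the text.
assumed = {"HexNAc": 4, "Hex": 5, "Neu5Ac": 1, "Fuc": 1}
mass_ac = glycan_mass(assumed)

# Swapping Neu5Ac for Neu5Gc adds one oxygen atom to the composition.
assumed_gc = {"HexNAc": 4, "Hex": 5, "Neu5Gc": 1, "Fuc": 1}
mass_gc = glycan_mass(assumed_gc)

print(f"Neu5Ac form: {mass_ac:.4f} Da")
print(f"Neu5Gc form: {mass_gc:.4f} Da (+{mass_gc - mass_ac:.4f})")
```

Because every structure in an isomer group maps to the same composition, telling them apart requires retention time or MS2 fragmentation, which is the role such standards are meant to serve.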
Footnotes
† Electronic supplementary information (ESI) available: Materials, detailed experimental protocols, supporting schemes and figures, synthetic methods, analytical data including HPLC profiles, mass and NMR spectra of synthesized N-glycans. See DOI: 10.1039/c5sc02025e
‡ Contributed equally to this work.
This journal is © The Royal Society of Chemistry 2015