Madhusudhan Reddy Gadi‡ a, Congcong Chen‡ ab, Shumin Bao‡ a, Shuaishuai Wang a, Yuxi Guo a, Jinghua Han a, Weidong Xiao c and Lei Li* a
aDepartment of Chemistry and Center for Diagnostics & Therapeutics, Georgia State University, Atlanta, GA 30303, USA. E-mail: lli22@gsu.edu
bShandong Academy of Pharmaceutical Science, Key Laboratory of Biopharmaceuticals, Engineering Laboratory of Polysaccharide Drugs, National-Local Joint Engineering Laboratory of Polysaccharide Drugs, Jinan 250101, China
cDepartment of Pediatrics, Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202, USA
First published on 18th January 2023
All O-GalNAc glycans are derived from 8 cores with 2 or 3 monosaccharides linked via α- or β-glycosidic bonds. While chemical and chemoenzymatic syntheses of β-linked cores 1–4 and 6 and derived glycans have been well developed, the preparation of α-linked rare cores 5, 7, and 8 is challenging due to the presence of this 1,2-cis linkage. Meanwhile, the biosynthesis and functional roles of these structures are poorly understood. Herein, we synthesize 3 α-linked rare cores with exclusive α-configuration from a versatile precursor through multifaceted chemical modulations. Efficient regioselective α2-6sialylation of the rare cores was then achieved by Photobacterium damselae α2-6sialyltransferase-catalyzed reactions. These structures, together with β-linked cores 1–4 and 6, and their sialylated forms, were fabricated into a comprehensive O-GalNAc core microarray to profile the binding of clinically important GalNAc-specific lectins. Only Tn, (sialyl-)core 5, and core 7 were found to bind WFL, VVL, and SBA, whereas DBA recognized only (sialyl-)core 5, and Jacalin was the only lectin that bound core 8. In addition, activity assays of human α-N-acetylgalactosaminide α2-6sialyltransferases (ST6GalNAcTs) towards the cores suggested that ST6GalNAc1 may be involved in the biosynthesis of previously identified sialyl-core 5 and sialyl-core 8 glycans. In conclusion, we provide efficient routes to access α-linked O-GalNAc rare cores and derived structures, which are valuable tools for functional glycomics studies of mucin O-glycans.
All O-GalNAc glycans initiate with a GalNAc residue α-linked to the hydroxyl group of serine (Ser), threonine (Thr), or rarely tyrosine (Tyr).10 This initial GalNAc can then be extended with one or two additional sugar residues (Gal, GalNAc, or GlcNAc) to generate 8 core structures (Fig. 1A).11 Among these, cores 1–4 are commonly observed with relatively high abundance, whereas cores 5–8 are rare cores with restricted occurrence and low abundance.12 Structurally, the initial GalNAc of cores 1–4 and 6 is extended with β-linked residues, which can be further elongated to present various glycan epitopes.13 Cores 5, 7, and 8, on the other hand, are extended with an α-linked GalNAc or Gal residue. To date, further glycosylations of these cores other than sialylation of cores 5 and 8 have not been observed in mammals.
Fig. 1 The 8 O-GalNAc glycan core structures (A) and extended rare cores 5, 7, and 8 found in mammalian systems and fish (B).
In mammalian systems, core 5 (GalNAcα1-3GalNAcα) was only identified in human gastric mucins and meconium, as well as Toxoplasma gondii mucin-like glycoprotein.14,15 Its sialylated form GalNAcα1-3(Neu5Acα2-6)GalNAcα (Fig. 1B) was identified in human colon mucosa, rectal adenocarcinoma, and meconium,16–18 as well as bovine submaxillary mucin.19 In contrast, core 5 and elongated glycans (Fig. 1B) are abundant in skin mucus of fish including rainbow trout and Atlantic salmon,20–23 suggesting that the corresponding biosynthetic enzymes may be highly expressed or upregulated in such species. So far, core 7 (GalNAcα1-6GalNAcα) has only been reported in bovine submaxillary mucin, accounting for 3% of total glycans by mass,24 and sialyl-core 8 (Galα1-3(Neu5Acα2-6)GalNAcα) was solely identified in human bronchial mucin.25 Limited reports on these α-linked cores may be ascribed to their low abundance in mammals and the lack of appropriate methods to enrich/distinguish them from highly abundant cores. Furthermore, the biosynthesis and potential functions of cores 5, 7, and 8 are still unknown. Such studies require well-defined structures as standards and probes.
Our interest lies in the facile synthesis of O-GalNAc glycans for glycomics studies. We have recently developed an efficient modular assembly strategy to rapidly prepare diverse cores 1–4 and 6 glycans starting from a versatile precursor (Fig. 2, compound 4).26 In this work, we focused on rare cores 5, 7, and 8 and their sialylated forms. Unlike cores 1–4 and 6, cores 5, 7, and 8 contain 1,2-cis linkages, which pose a major challenge in devising a synthetic route. All early attempts to synthesize core 5,27–30 sialyl-core 5,28 core 7,27,29 and core 8 (ref. 31) gave 1,2-cis (α-linkage)/1,2-trans (β-linkage) mixtures or low overall yields. An efficient and stereoselective strategy to access these rare cores and their natural sialylated forms is still lacking. Here, we report the convergent synthesis of cores 5, 7, and 8 with exclusive α-selectivity using the same precursor 4, followed by regioselective preparation of their sialylated forms using a bacterial α2-6sialyltransferase.
Facile stereoselective synthesis of 1,2-cis linkages has been a significant challenge in the synthesis of oligosaccharides.36 The most common approach is to place a non-participating group at the C2 position of a glycosyl donor.37 However, such non-participating groups did not afford exclusive α-selectivity (data not shown). We therefore also investigated solvent-assisted glycosylation with pre-activation of the donor38 to obtain the desired stereochemical outcomes. For example, to synthesize cores 5 and 8, thioglycosyl donor 5 or 6 was first activated with p-TolSCl in diethyl ether at −78 °C using AgOTf as the promoter (Scheme 1). The acceptor 4 in diethyl ether was then slowly added to the activated donor to provide protected disaccharide 1 or 2, respectively, in very good yields with exclusive α-selectivity (Scheme 1).35,39,40 Solvent assistance from the β-side provided by diethyl ether ensured exclusive α-selectivity, whereas our initial attempts using a mixture of dichloromethane and diethyl ether resulted in a mixture of α/β-anomers. Next, the azide functional group on 1 and 2 was reduced to an amine with zinc and protected in situ as the acetamide using acetic anhydride. The crude reaction mixture was then subjected to hydrogenation using 10% Pd/C under acidic conditions to afford deprotected core 5 (Scheme 1, 8) and core 8 (9). Finally, deprotection of the tBu ester using trifluoroacetic acid afforded Fmoc-protected core 5 (10) and core 8 (11) quantitatively.
To synthesize core 7 (Scheme 1, 14), the versatile precursor 4 was first converted to the diol 12 by protecting the C3–OH as benzyl ether and subsequent deprotection of the benzylidene acetal under acidic conditions. The higher nucleophilicity of the C6–OH over the C4–OH allowed regioselective glycosylation using glycosyl donor 5 or 7 under similar solvent-assisted glycosylation conditions for the synthesis of cores 5 and 8. However, when glycosyl donor 5 was used, a mixture of anomers (α:β = 7:3) was formed, presumably due to the higher nucleophilicity of the acceptor. We then tried glycosyl donor 7 with benzylidene protection, which afforded the disaccharide 3 in good yield (75%) with exclusive α-selectivity (Scheme 1). It is likely that the benzylidene acetal assisted in blocking the β-face attack of the acceptor 12. Finally, successive azide reduction and amine acetylation followed by global deprotection provided compound 13, which was converted to the Fmoc protected core 7 (14) under acidic conditions. Collectively, the introduction of non-participating groups at C2 of glycosyl donors and the pre-activation of donors, together with solvent-assisted glycosylation realized perfect stereoselectivity in the synthesis of α-linked O-GalNAc cores starting from the versatile acceptor 4. The overall yields for cores 5, 7, and 8 are 58% (4 steps), 33% (5 steps), and 59% (4 steps), respectively.
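As a rough way to compare the three routes, the overall yields quoted above can be converted into an average (geometric-mean) yield per step. The sketch below is illustrative only: the individual step yields are not restated in this section, and the function and variable names are hypothetical.

```python
# Overall yield (as a fraction) and number of steps for each core,
# taken from the overall yields quoted in the text.
routes = {
    "core 5": (0.58, 4),
    "core 7": (0.33, 5),
    "core 8": (0.59, 4),
}


def average_step_yield(overall: float, steps: int) -> float:
    """Geometric-mean yield per step implied by an overall yield."""
    return overall ** (1.0 / steps)


# Average per-step yield for each route (illustrative, not measured values).
average = {core: average_step_yield(y, n) for core, (y, n) in routes.items()}

for core, value in average.items():
    print(f"{core}: about {value:.0%} per step on average")
```

On this reading, the lower overall yield for core 7 reflects both the extra step and a somewhat lower average yield per step.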
Donor | Concentration of core | Acceptor 10 (core 5): product, conversion | Acceptor 14 (core 7): product, conversion | Acceptor 11 (core 8): product, conversion |
---|---|---|---|---|
CMP–Neu5Ac | 10 mM | | | |
CMP–Neu5Ac | 5 mM | 15, 62.1% | 16, 16.6% | 17, 67.0% |
CMP–Neu5Ac | 1 mM | 15, 15.3% | 16, <1% | 17, 11.7% |
CMP–Neu5Gc | 10 mM | | | |
CMP–Kdn b | 10 mM | | | |
a Reactions were performed in 20 μL systems in 100 mM Tris–HCl (pH 8.0), containing varying concentrations of cores (1, 5, and 10 mM), donors CMP–sialic acid (2 equivalents), and 80 μg of purified Pd26ST. All reactions were incubated at 37 °C for 1 h unless otherwise stated. Conversion of cores is monitored by HPLC. b One-pot two-enzyme system was used to generate CMP–Kdn in situ (ESI). O/N, overnight reaction.
Surprisingly, for cores 5 and 8, only MS peaks corresponding to mono-sialylated core 5 (m/z = 1023.3695, [M–H]−) and core 8 (m/z = 982.3695, [M–H]−) were observed in Pd26ST-catalyzed reactions. Meanwhile, HPLC analyses of the reactions showed a single new peak (TR = 14.87 min for core 5 and TR = 15.07 min for core 8) corresponding to the products in both reactions, with high conversion rates of 85.5% and 84.4%, respectively (ESI†). Note that prolonged reaction times did not result in di-sialylated forms; instead, significantly lower conversion rates (49.0% and 42.7% for core 5 and core 8, respectively) were observed (Table 1, ESI†), indicating that sialylated forms of cores 5 and 8 may be labile under the reaction conditions. Nevertheless, these results suggested that one specific Gal/GalNAc residue in cores 5 and 8 was sialylated. Sialylated core 5 and core 8 were then synthesized in very good yields (92%, 7 mg for core 5; 90%, 6 mg for core 8) and purified for NMR characterization. The 2D-HMBC NMR spectra (ESI†) of both products showed a positive correlation between the anomeric C2 of Neu5Ac and the C6 of GalNAc at the reducing end, revealing that only the initial GalNAc was α2-6sialylated in both cases. These results confirmed that the sialylated cores are the natural sialyl-core 5 and sialyl-core 8 identified in mammalian systems.16,17,25 Such strict regioselectivity may stem from the distorted structures of cores 5 and 8, where the C6–OH of the distal GalNAc/Gal residue is adjacent to and thus masked by the proximal C2–N-acetyl of the initial GalNAc residue (Fig. S1†). Similar to those observed in core 7 reactions, the conversion rates of cores 5 and 8 again showed positive correlations with the concentration of acceptor cores (Table 1). These results may be explained by high Km values of Pd26ST towards α-linked GalNAc residues.43
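To make the concentration dependence concrete, the percent conversions in Table 1 can be turned into absolute amounts of product. The following is a minimal sketch, assuming that percent conversion equals the molar fraction of acceptor converted and using the 20 μL reaction volume from the Table 1 footnote; the helper names are hypothetical.

```python
VOLUME_UL = 20.0  # reaction volume stated in the Table 1 footnote

# Percent conversion with CMP-Neu5Ac from Table 1, keyed by acceptor
# concentration in mM. The "<1%" entry for core 7 at 1 mM is left out
# because no exact value is given.
conversion = {
    5.0: {"core 5": 62.1, "core 7": 16.6, "core 8": 67.0},
    1.0: {"core 5": 15.3, "core 8": 11.7},
}


def acceptor_nmol(conc_mm: float, volume_ul: float = VOLUME_UL) -> float:
    """Amount of acceptor in nmol (mM x uL = nmol)."""
    return conc_mm * volume_ul


def product_nmol(conc_mm: float, percent: float) -> float:
    """Amount of product in nmol, assuming conversion is a molar fraction."""
    return acceptor_nmol(conc_mm) * percent / 100.0


product = {
    conc: {core: product_nmol(conc, pct) for core, pct in row.items()}
    for conc, row in conversion.items()
}

for conc, row in product.items():
    for core, nmol in row.items():
        print(f"{conc:g} mM {core}: {nmol:.2f} nmol product")
```

Under these assumptions, lowering the core 5 concentration five-fold (5 mM to 1 mM) lowers the amount of product roughly twenty-fold, which is the trend the text attributes to high Km values.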
N-Glycolylneuraminic acid (Neu5Gc) and deaminated sialic acid (Kdn) are two other common sialic acid forms found in eukaryotes besides Neu5Ac.44 Sialyl-core 5 structures with Neu5Gc and Kdn were previously identified in salmon.21,22 We performed activity assays of Pd26ST using CMP–Neu5Gc and CMP–Kdn as donors and cores 5, 7, and 8 as acceptors (Table 1, ESI†). Pd26ST efficiently catalyzed the sialylation of cores 5 and 8 using CMP–Neu5Gc as the donor (70.6% and 74.2%). The conversion rate of core 7 (18.9%) is low but comparable to that using CMP–Neu5Ac as the donor (32.2%). On the other hand, the conversion rates are much lower with Kdn, which may be partially due to the use of a one-pot multi-enzyme system to generate the donor CMP–Kdn in situ (ESI†) instead of pure CMP–Kdn. Nevertheless, mg-scale reactions with excess amounts of Pd26ST afforded Neu5Ac and Kdn modified cores 5, 7, and 8 in good overall yields. The compounds were purified by HPLC and analyzed by HPLC, MS, and/or NMR (ESI†).
Enzyme | Acceptor core 1: product, conversion | Acceptor core 1: second product, conversion | Acceptor 10 (core 5): product, conversion | Acceptor 14 (core 7): product, conversion | Acceptor 11 (core 8): product, conversion |
---|---|---|---|---|---|
ST6GalNAc1 b | 6-Sialyl-T, 21.2% | ND | 15, 47.5% | 16, 9.4% | 17, 33.8% |
ST6GalNAc2 b | 6-Sialyl-T, 4.3% | ND | ND | ND | ND |
ST6GalNAc4 b | 6-Sialyl-T, 11.8% | Disialyl-T, 52.9% | ND | ND | ND |
ST6GalNAc4 c | 6-Sialyl-T, 58.0% | Disialyl-T, 3.7% | ND | ND | ND |
ST6GalNAc5 b | 6-Sialyl-T, 2.4% | Disialyl-T, 3.9% | ND | ND | ND |
ST6GalNAc6 b | 6-Sialyl-T, 6.2% | Disialyl-T, 7.1% | ND | ND | ND |
a Reactions were carried out in a 10 μL system in 100 mM MES buffer (pH 7.0), containing 0.1 mM acceptor, 10 mM CMP–Neu5Ac, and 2 μg of enzymes. Reactions were incubated at 37 °C for 48 hours. b GTs with a N-terminal GFP-tag expressed in CHO cell lines were obtained from Glyco Expression Technologies, Inc. (Athens, GA). c ST6GalNAc4 with a C-terminal His6-tag expressed in a mouse myeloma cell line was obtained from R&D Systems. ND, not detected.
As expected, both ST6GalNAc1 and ST6GalNAc2 catalyzed the sialylation of core 1 to generate 6-sialyl-T, albeit with low conversion rates of 21.2% and 4.3%, respectively. This may be explained by the possibly low activity of ST6GalNAcs towards Fmoc-protected glyco-amino acids, as parallel reactions using a MUC1 glycopeptide bearing core 1 gave excellent yields (Fig. S2†). Most surprisingly, HPLC profiles of reactions catalyzed by ST6GalNAc4–6 showed two new peaks, one (TR = 14.48 min) corresponding to 6-sialyl-T as expected, and the other (TR = 11.45 min) corresponding to disialyl-T (Neu5Acα2-3Galβ1-3(Neu5Acα2-6)GalNAcα) (Fig. S3†). The conversion rate of the ST6GalNAc4-catalyzed reaction was much higher than those of the ST6GalNAc5- and ST6GalNAc6-catalyzed ones, with 11.8% of 6-sialyl-T and 52.9% of disialyl-T (Table 2, Fig. S3†). In contrast, ST6GalNAc4 bearing a His6-tag yielded 58.0% of 6-sialyl-T but only 3.7% of disialyl-T (Table 2). Such disparate activities may result from the fused tags (N-terminal GFP vs. C-terminal His6) and glycosylation patterns derived from the expression cell lines (CHO vs. mouse myeloma). Nevertheless, both 6-sialyl-T and disialyl-T were observed in all reactions catalyzed by ST6GalNAc4–6, suggesting that they can recognize core 1 (T antigen) besides their preferred substrate 3′-sialyl-T,46 and, surprisingly, may also possess weak α2-3sialyltransferase activity (at least in vitro) to sialylate T or 6-sialyl-T antigens, which are subsequently converted to disialyl-T.
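The comparison between enzymes is easier to follow when the two core 1 products are summed per enzyme. The sketch below does this with the values from Table 2; it assumes that the 6-sialyl-T and disialyl-T percentages are additive fractions of the same core 1 pool, and the dictionary keys are labels chosen for this example.

```python
# Core 1 conversions from Table 2 as (6-sialyl-T %, disialyl-T %).
# None stands for "ND" (not detected).
core1 = {
    "ST6GalNAc1 (GFP)": (21.2, None),
    "ST6GalNAc2 (GFP)": (4.3, None),
    "ST6GalNAc4 (GFP)": (11.8, 52.9),
    "ST6GalNAc4 (His6)": (58.0, 3.7),
    "ST6GalNAc5 (GFP)": (2.4, 3.9),
    "ST6GalNAc6 (GFP)": (6.2, 7.1),
}


def total_conversion(mono, di):
    """Total percent of core 1 consumed, treating ND as zero."""
    return round(mono + (di or 0.0), 1)


totals = {enzyme: total_conversion(*pair) for enzyme, pair in core1.items()}

# Print enzymes from highest to lowest total conversion of core 1.
for enzyme, total in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{enzyme}: {total}% of core 1 converted")
```

Summed this way, the two ST6GalNAc4 preparations consume similar amounts of core 1 (about 65% and 62%) but distribute it very differently between 6-sialyl-T and disialyl-T, while ST6GalNAc5 and ST6GalNAc6 remain far lower.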
Activity assays of these ST6GalNAcs towards the rare cores revealed that only ST6GalNAc1 is active towards α-linked cores 5 and 8, generating sialyl-core 5 and sialyl-core 8 with moderate conversion rates of 47.5% and 33.8%, respectively. Interestingly, ST6GalNAc1 also sialylated core 7 with a low conversion rate of 9.4% (ESI†). These results further confirmed that ST6GalNAc1 has a broad substrate specificity.46 None of the other ST6GalNAcTs showed activity towards the 3 α-linked rare cores. ST6GalNAc3 was not tested because it was not available and our heterologous expression attempts failed. Nevertheless, our results indicate that ST6GalNAc1, rather than the other ST6GalNAcTs, is highly likely to be involved in the biosynthesis of sialyl-core 5 and sialyl-core 8. Accordingly, putative biosynthetic routes for these glycans were proposed (Fig. S4†).
Our results showed that WFL (Fig. 3B) and VVL (Fig. 3C) strongly bind all structures with a terminal unmodified GalNAc residue, including Tn antigen (24), core 5 (10), sialyl-core 5 (15, 18, and 21), and core 7 (14), concordant with previous reports.49 Modifications on either C3–OH (e.g., cores 1, 3, and 8) or C6–OH (e.g., sialyl-T, core 6, and sialyl-core 7) of GalNAc completely abolished binding. SBA has a similar binding profile towards the O-glycan core array and prefers terminal GalNAc (Fig. 3D). Unlike for WFL and VVL, however, α2-6sialylation of proximal residues (e.g., sialyl-core 5) significantly diminished SBA binding (Fig. 3D).
DBA has been used as a probe for terminal α-GalNAc residues and to bind blood group A antigen. However, Forssman antigen (GalNAcα1-3GalNAcβ) was identified to be the best binder of DBA, whereas only weak binding was observed to other α-GalNAc terminal glycans.49,50 In consonance with this, we found that DBA binds very weakly or not at all to Tn (24) and core 7 (14), which carry a terminal α-GalNAc residue (Fig. 3E). In contrast, it strongly binds core 5 (10), which contains an α-linked Forssman disaccharide (GalNAcα1-3GalNAcα). Surprisingly, DBA tolerated α2-6sialylation on the initial GalNAc residue well (15, 18, 21) (Fig. 3E). These results identified the GalNAcα1-3GalNAc disaccharide, instead of the β-linked Forssman antigen, as the minimum binding motif of DBA.
Jacalin is generally considered a T-antigen binder, but several reports concluded that the lectin primarily binds C3–OH substituted GalNAcα motifs.26,49,51 It also binds Tn antigen, but substitution at the C6 of the core GalNAc is not tolerated according to Consortium for Functional Glycomics (CFG) microarray data.49 Consistent with these reports, our results (Fig. 3F) showed strong binding to Tn (24), T (26), 6′-sialyl-T (27), core 3 (31), and core 8 (11) but not sialyl-core 8 (17, 20, and 23). In addition, Jacalin did not recognize core 5, further confirming that this Forssman antigen-like structure is not a binder.49 Interestingly, Jacalin showed comparably strong binding to core 7 (14), which contains an α1-6GalNAc modification on the initial GalNAc, contrary to previous conclusions.26,49,51 It is possible that the binding site of Jacalin on core 7 is the distal α-GalNAc instead of the initial α-GalNAc. Collectively, the O-GalNAc glycans recognized by Jacalin include Tn, core 1, core 3, core 8, their extended structures without C6-modification on the core GalNAc, and core 7. In contrast, PNA is a strict T-antigen binder (Fig. 3G), which recognizes the Galβ1-3GalNAc motif that is devoid of any modification on the Gal residue (26, 28, and 30).49
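The qualitative binding profiles described above can be summarized as a small lookup table. The sketch below is a simplification for illustration: it records only glycans that the text describes as bound, omits those with weak, diminished, or abolished binding, lists only the T antigen for PNA, and uses glycan labels and a helper function that are hypothetical.

```python
# Glycans described in the text as bound by each lectin (qualitative only).
binders = {
    "WFL": {"Tn", "core 5", "sialyl-core 5", "core 7"},
    "VVL": {"Tn", "core 5", "sialyl-core 5", "core 7"},
    # Sialyl-core 5 is left out for SBA because sialylation diminished binding.
    "SBA": {"Tn", "core 5", "core 7"},
    "DBA": {"core 5", "sialyl-core 5"},
    "Jacalin": {"Tn", "T", "6'-sialyl-T", "core 3", "core 7", "core 8"},
    # Only the T antigen is listed for PNA in this sketch.
    "PNA": {"T"},
}


def lectins_binding(glycan: str) -> list[str]:
    """Return the lectins (sorted by name) recorded as binding a glycan."""
    return sorted(name for name, glycans in binders.items() if glycan in glycans)


for glycan in ("Tn", "core 5", "sialyl-core 5", "core 7", "core 8"):
    print(f"{glycan}: {', '.join(lectins_binding(glycan)) or 'none'}")
```

Queried this way, core 8 returns only Jacalin, and DBA appears only for core 5 and sialyl-core 5, matching the selectivity described in the text.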
Footnotes
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d2sc06925c
‡ Equal contribution.
This journal is © The Royal Society of Chemistry 2023