Joel A. Cain ab, Ashleigh L. Dale ab, Zeynep Sumer-Bayraktar ab, Nestor Solis† a and Stuart J. Cordwell* abcd
a School of Life and Environmental Sciences, The University of Sydney, 2006, Australia
b Charles Perkins Centre, The University of Sydney, Level 4 East, The Hub Building (D17), 2006, Australia. E-mail: stuart.cordwell@sydney.edu.au; Tel: +612-9351-6050
c Discipline of Pathology, School of Medical Sciences, The University of Sydney, 2006, Australia
d Sydney Mass Spectrometry, The University of Sydney, 2006, Australia
First published on 22nd April 2020
Campylobacter jejuni is a major cause of bacterial gastroenteritis in humans that is primarily associated with the consumption of inadequately prepared poultry products, since the organism is generally thought to be asymptomatic in avian species. Unlike many other microorganisms, C. jejuni is capable of performing extensive post-translational modification (PTM) of proteins by N- and O-linked glycosylation, both of which are required for optimal chicken colonization and human virulence. The biosynthesis and attachment of N-glycans to C. jejuni proteins is encoded by the pgl (protein glycosylation) locus, with the PglB oligosaccharyltransferase (OST) enabling en bloc transfer of a heptasaccharide N-glycan from a lipid carrier in the inner membrane to proteins exposed within the periplasm. Seventy-eight C. jejuni glycoproteins (represented by 134 sites of experimentally verified N-glycosylation) have now been identified, and include inner and outer membrane proteins, periplasmic proteins and lipoproteins, which are generally of poorly defined or unknown function. Despite our extensive knowledge of the targets of this apparently widespread process, we still do not fully understand the role N-glycosylation plays biologically, although several phenotypes, including wild-type stress resistance, biofilm formation, motility and chemotaxis have been related to a functional pgl system. Recent work has described enzymatic processes (nitrate reductase NapAB) and antibiotic efflux (CmeABC) as major targets requiring N-glycan attachment for optimal function, and experimental evidence also points to roles in cell binding via glycan–glycan interactions, protein complex formation and protein stability by conferring protection against host and bacterial proteolytic activity. 
Here we examine the biochemistry of the N-linked glycosylation system, define its currently known protein targets and discuss evidence for the structural and functional roles of this PTM in individual proteins and globally in C. jejuni pathogenesis.
Human disease is generally self-limiting and symptoms present as fever and abdominal cramping that progress from mild to, in some cases, severe diarrhoea.2,3 Relapse is possible in the absence of medical intervention, and is likely due to gut persistence for up to 3 weeks.9 C. jejuni infection is also an established antecedent for an increasing number of debilitating conditions including Guillain–Barré Syndrome (GBS), Miller–Fisher Syndrome (MFS), immunoproliferative small intestine disease, reactive arthritis and Sweet's syndrome.10,11 These post-acute immune-mediated disorders are thought to arise largely from cross-reactivity between antibodies directed against C. jejuni surface lipooligosaccharide (LOS) and human cell surface gangliosides, and this relationship has been reviewed extensively.12–14
Several C. jejuni genomes have been sequenced from laboratory-adapted and clinical strains and several features remain consistent; the organism encodes ∼1620–1650 genes, a large proportion of which encode membrane-associated proteins that are poorly functionally annotated.15–17 Human infection is not completely understood but involves bacterial adherence to gut epithelial cells, followed by invasion and subsequent toxin production. Several factors are critical in C. jejuni host colonization, including flagellar-based motility, cell shape, chemosensing and chemotaxis mediated by transducer-like proteins (Tlps), as well as a number of adhesins including the fibronectin-binding proteins Campylobacter adherence factor CadF and fibronectin-like protein FlpA, the surface-exposed lipoprotein JlpA and the PEB antigens [reviewed in 18–20]. The ability to survive in the hostile environment encountered during gut infection (for example, low pH, the presence of bile salts and competitive factors from established microflora) is paramount to establishing disease, and C. jejuni is adapted to utilize nutrients, such as amino and organic acids as primary carbon sources, that are in rich supply in the gut micro-environment (e.g. serine and proline from mucins, organic acids produced as a by-product of metabolism by resident microorganisms).21,22 C. jejuni lacks typical virulence-associated type III/IV secretion systems (T3SS/T4SS) employed by many other enteric bacteria to secrete toxins and proteases that directly interact with host cells, although it is now well-established that extracellular virulence factors (e.g. the Campylobacter invasion antigens [Cia] and cytolethal distending toxin [CDT]) are secreted via the flagellar export apparatus that acts as a pseudo-T3SS.20,23 Another mechanism by which C. jejuni virulence determinants can interact with host cells is via their packaging into outer membrane vesicles (OMVs).24–26
Finally, despite the somewhat small size of the genome, C. jejuni devotes considerable resources to post-translational modification (PTM) of proteins by N- and O-linked glycosylation, both of which are considered established virulence determinants.
It has been suggested that the major outer membrane protein (MOMP), which accounts for ∼40–50% of the total membrane protein in C. jejuni,17 can also be O-glycosylated58 with a glycan unrelated to the flagellin modification described above. MOMP may be modified at Thr-268 with the tetrasaccharide Gal-β1,3-GalNAc-β1,4-GalNAc-β1,4-GalNAc-α1, although intact glycan-peptide MS validation is yet to be generated. MOMP modification was further indicated by Whitworth and colleagues in C. jejuni strain 81–176 by galactose oxidase (GalO)-mediated selective biotinylation and subsequent enrichment of GalNAc-containing cell surface glycoconjugates.59 Site-directed mutagenesis of Thr-268 indicated that this residue is important for autoagglutination, biofilm formation and colonization of both human Caco-2 cells and chickens,58 a phenotype also consistent with observations on the roles of FlaA glycosylation. It remains to be seen whether additional O-glycoproteins are present in C. jejuni and whether this PTM is widespread on proteins secreted by the organism during infection.
Biosynthesis and transfer of the N-glycan to proteins (Fig. 1) involves the actions of 10 Pgl proteins (an eleventh member of the locus, pglG, does not appear to contribute to the process and remains functionally undefined68) and begins with the cytoplasmic synthesis of nucleotide-activated (uridine diphosphate; UDP) UDP-diNAcBac from UDP-GlcNAc, which is catalyzed by the activities of (in order) the PglF dehydratase (conferring the rate-limiting step in the Pgl pathway), PglE aminotransferase and PglD acetyltransferase.69–74 Synthesis of diNAcBac has been reviewed extensively elsewhere.75,76 The potential for cross-talk between the N- and O-linked pathways is evidenced by shared nucleotide-activated precursors and by the activity of PglD, which can form intermediates from within the legionaminic acid biosynthetic pathway, albeit at substantially reduced catalysis compared with LegH.38 DiNAcBac is attached to the cytoplasmic side of an inner membrane spanning lipid carrier (undecaprenyl-pyrophosphate [Und-P]) by the PglC glycosyl-1-phosphate transferase, and Und-P then serves as the carrier for the nascent N-glycan.77,78 Continued synthesis of the glycan on Und-P-diNAcBac involves the sequential addition of 5 N-acetylgalactosamine (GalNAc) residues by three pgl-encoded glycosyltransferases (the first by PglA, the second by PglJ and the final three by PglH).79 Glycan length is controlled by increased competitive inhibition of the PglH active site relative to the number of GalNAc residues, and is considered limited by the final GalNAc(x5) product.80 The PglH tertiary structure also contains a novel ‘ruler helix’ that binds the pyrophosphate of Und-P and limits PglH catalysis to 3 GalNAc.81 Glycan synthesis is completed by the PglI glucosyltransferase, which adds a single glucose (Glc) branch to the third GalNAc in the N-glycan.68 This last Glc residue is not a strict requirement for N-glycosylation as, unlike all previous steps, addition of the complete N-glycan 
(without the Glc branch) to proteins still occurs in the absence of pglI,77,82 albeit at lower catalytic efficiency. Deletion of other pgl genes results in either complete loss of the N-glycan or the presence of significantly truncated N-glycans (e.g. pglD82), as well as compromised protein transfer efficiency.77 Once the heptasaccharide has been completed, the PglK flippase translocates the Und-P-linked glycan from the cytoplasm into the periplasmic space utilizing a mechanism dependent on the hydrolysis of two molecules of ATP.83,84 The mature glycan is then transferred en bloc from Und-P onto target proteins by the PglB oligosaccharyltransferase (OST),68,85 which recognizes both the Und-P-N-glycan complex and peptide acceptor as substrates.86
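The assembly order described above can be summarized as a small data structure. The following is a minimal sketch only: the step list mirrors the enzymes and residue counts given in the text, while the function name `lipid_linked_glycan` and its `include_glc_branch` flag are hypothetical names introduced here for illustration. It deliberately models only the one optional step named in the text (the PglI glucose branch) and does not attempt to model truncations caused by loss of other pgl genes.

```python
from collections import Counter

# Ordered glycan-building steps taken from the text: (enzyme, residue, count).
# Names of the Python objects are illustrative assumptions, not an established API.
ASSEMBLY_STEPS = [
    ("PglC", "diNAcBac", 1),  # attaches diNAcBac to the lipid carrier
    ("PglA", "GalNAc", 1),    # first GalNAc
    ("PglJ", "GalNAc", 1),    # second GalNAc
    ("PglH", "GalNAc", 3),    # final three GalNAc residues
    ("PglI", "Glc", 1),       # single Glc branch on the third GalNAc
]


def lipid_linked_glycan(include_glc_branch=True):
    """Return residue counts for the lipid-linked glycan.

    Setting include_glc_branch=False mimics the absence of pglI, the only
    step the text describes as dispensable for transfer to protein.
    """
    composition = Counter()
    for enzyme, residue, count in ASSEMBLY_STEPS:
        if enzyme == "PglI" and not include_glc_branch:
            continue
        composition[residue] += count
    return composition


full = lipid_linked_glycan()
without_glc = lipid_linked_glycan(include_glc_branch=False)
print(dict(full), "residues:", sum(full.values()))
print(dict(without_glc), "residues:", sum(without_glc.values()))
```

The residue total of the full glycan gives a quick consistency check against the term "heptasaccharide" used throughout.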
The PglB OST is also capable of releasing the N-glycan from Und-P into the periplasm as a ‘free oligosaccharide’ (fOS),82,87,88 although the exact proportion of N-glycan as Asn-bound:fOS remains a point of contention. Nothaft et al. reported a ratio favouring high fOS at ∼1:10, while Scott et al. reported a distribution of 4.5:1 in favour of protein-bound N-glycan.82,89 While there are a number of technical considerations that may help explain this discrepancy,89,90 there may also be dynamic control of fOS production based on environmental conditions and the kinetics of PglB. The fOS itself has been shown to provide protection against osmotic stress, further supporting the notion that the cellular fate of the N-glycan may be determined to some degree by environmental sensing.82 Unlike protein N-glycosylation, the free N-glycan is highly dependent on the synthesis of the complete heptasaccharide, as pglI deficient strains produce ∼55% less fOS.90 The N-glycan itself can also be further modified with a phosphoethanolamine (pEtN) group, which is added to the terminal GalNAc of the heptasaccharide at a small number of glycosites by the sole C. jejuni pEtN transferase, EptC.91 An inability to detect pEtN-modified fOS suggests that variation of the glycan by EptC occurs post-attachment to protein targets.91
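To make the size of this discrepancy concrete, the two reported ratios can be converted to the fraction of total N-glycan that is protein-bound. This is a simple arithmetic sketch; it assumes both ratios are read in the same Asn-bound:fOS order used in the text, and the function name `bound_fraction` is introduced here purely for illustration.

```python
def bound_fraction(asn_bound, free):
    """Fraction of total N-glycan that is Asn-bound, given an Asn-bound:fOS ratio."""
    return asn_bound / (asn_bound + free)


# Ratios as quoted in the text (Asn-bound : fOS).
nothaft = bound_fraction(1, 10)   # ~1:10, favouring fOS
scott = bound_fraction(4.5, 1)    # 4.5:1, favouring protein-bound glycan

print(f"Nothaft et al.: {nothaft:.1%} protein-bound")
print(f"Scott et al.:   {scott:.1%} protein-bound")
```

Under this reading, the two studies place roughly 9% versus roughly 82% of the N-glycan on protein, which illustrates why the authors describe the proportion as a point of contention.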
Despite our knowledge of the structure and function of PglB and other OSTs, a number of elements remain poorly understood. Firstly, observations of N-glycosylation at non-canonical sequons89,97 are not consistent with the above model, particularly considering that it has previously been demonstrated that such substitutions are catalytically unfavourable.92 It has been suggested that these atypical or non-canonical occupied sequons may reflect observations that peptide binding is not necessarily the rate-limiting step in the PglB reaction.95 Evidence for this can be seen in similar turnover rates between sequons containing Thr or Ser at the +2 position despite an apparent 4-fold reduced affinity of PglB for Ser,92 and this is further supported by similar propensities for the two amino acids at this position in vivo.89 Additionally, while PglB can modify glutamine (Gln) at very low rates in in vitro peptide-based assays,94 no glycosite at Gln has been demonstrated in any C. jejuni glycopeptide identified thus far, with the very small number of non-canonical sequons limited to differences at the −2 and +2 positions.89 Moreover, the exquisite sensitivity of MS-based approaches for glycopeptide identification may mean that even experimentally verified non-canonical sequons could occur at extremely low occupancy and have potentially little biological value. Finally, the current model does not address how PglB is able to perform fOS release given the catalytic importance of also binding a peptide substrate. In yeast, purified STT3 can generate fOS by hydrolyzing the lipid-linked oligosaccharide (linked to dolichol rather than Und-P) irrespective of peptide binding;98 however, an observation that the WWD motif is required for PglB-mediated fOS release82 suggests a peptide substrate is necessary in Campylobacter.
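For readers who want to screen sequences for candidate sites, the canonical/non-canonical distinction can be expressed as a pattern match. The sketch below assumes the canonical bacterial sequon is D/E-X1-N-X2-S/T with X1 and X2 being any residue except proline; that definition is stated here as an assumption, and the function name `find_sequons` is hypothetical. It reports sequence motifs only and says nothing about localization, folding or actual occupancy.

```python
import re

# Assumed canonical sequon: D/E at -2, Asn at 0, S/T at +2, no Pro at -1 or +1.
# The lookahead makes the scan report overlapping motifs as well.
SEQUON = re.compile(r"(?=([DE][^P]N[^P][ST]))")


def find_sequons(sequence):
    """Return (asn_position, motif) pairs, with 1-based positions of the Asn."""
    sequence = sequence.upper()
    # m.start() is the 0-based index of the -2 residue, so the Asn sits at +3 (1-based).
    return [(m.start() + 3, m.group(1)) for m in SEQUON.finditer(sequence)]


print(find_sequons("AAEANFTAA"))  # motif embedded in a toy peptide
print(find_sequons("ENNPT"))      # Pro at +1: not a canonical sequon
```

A non-canonical site (for example one lacking D/E at −2) is by construction not returned by this pattern, so such a scan would need to be relaxed explicitly to recover sites of that kind.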
Addition of the N-glycan does not appear to be coupled to any particular membrane translocation pathway, as CmeA was modified when shuttled into the periplasm via either the secretory (Sec) or twin-arginine translocation (Tat) systems.93 Kowarik et al. did, however, demonstrate differences in N-glycosylation site occupancy when proteins were transported via these different systems in E. coli,93 and, consistent with their finding of lower N-glycosylation efficiency for Tat-translocated proteins, only 1 identified C. jejuni N-glycoprotein is predicted (by SignalP99) to be translocated using this system.100
Site-specific glycopeptide analysis initially relied on collision-induced dissociation (CID) MS-based fragmentation; however, the highly labile nature of the glycosidic bonds resulted in very poor peptide backbone sequence coverage and therefore an inability to identify the modified sites. The advent of higher energy collisional dissociation (HCD) fragmentation enabled switching between CID (for glycan confirmation) and HCD for peptide fragmentation and sequencing,35 while concurrent advances in hydrophilic interaction liquid chromatography (HILIC) facilitated better enrichment and separation of glycosylated C. jejuni peptides compared to previous studies employing SBA affinity and gel electrophoresis. An optimized workflow employing HCD tandem MS (MS/MS) provides glycan-derived diagnostic oxonium ions from the C. jejuni N-glycan (e.g. GalNAc, 204.08 mass:charge [m/z]) and peptide sequence.89,91 In addition to improvements in MS-based glycan site identification, glycoprotein analysis can also be coupled to a multi-protease digestion strategy (e.g. employing alternatives to trypsin, including pepsin and chymotrypsin) that improves N-glycosite coverage and provides independent site verification in many cases.35
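The diagnostic ion value quoted above can be reproduced from elemental compositions. The sketch below is illustrative only: it uses standard monoisotopic atomic masses, and the residue formulas (in particular C10H16N2O4 for the diNAcBac residue) and the helper names `residue_mass` and `oxonium_mz` are assumptions introduced here rather than values taken from the text.

```python
# Monoisotopic atomic masses (Da) and the proton mass.
ATOM = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727647

# Residue (anhydro) elemental compositions; assumed here for illustration.
RESIDUES = {
    "HexNAc": {"C": 8, "H": 13, "N": 1, "O": 5},     # e.g. GalNAc
    "Hex": {"C": 6, "H": 10, "O": 5},                # e.g. Glc
    "diNAcBac": {"C": 10, "H": 16, "N": 2, "O": 4},
}


def residue_mass(name):
    """Monoisotopic mass of a glycan residue from its elemental composition."""
    return sum(ATOM[element] * n for element, n in RESIDUES[name].items())


def oxonium_mz(name):
    """m/z of the singly protonated residue (diagnostic oxonium ion)."""
    return residue_mass(name) + PROTON


hexnac_ion = oxonium_mz("HexNAc")
heptasaccharide = (
    residue_mass("diNAcBac") + 5 * residue_mass("HexNAc") + residue_mass("Hex")
)
print(f"HexNAc oxonium ion: {hexnac_ion:.4f} m/z")
print(f"Heptasaccharide residue mass: {heptasaccharide:.2f} Da")
```

The same helper can be pointed at other residues to predict which low-mass ions to look for in an HCD spectrum, under the stated composition assumptions.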
This approach has now yielded the identification of 134 sites of C. jejuni N-glycosylation from 78 membrane-associated proteins that have been experimentally confirmed (predominantly by MS), including periplasmic proteins, lipoproteins, inner membrane proteins and at least one protein that is thought to be surface-exposed (the lipoprotein JlpA101) across 5 C. jejuni strains,33,35,61,63,89 meaning that the C. jejuni glycoproteome is likely to be the most complete yet described in the literature (Table 1). Some glycoproteins are modified at multiple sites; for example, the Cj0152c putative membrane protein (which shares significant sequence similarity with the Helicobacter pylori neuraminidase/sialidase) contains 6 occupied canonical sites, as well as a single atypical site (Fig. 2A). Cj0152c also contains an additional pseudo-sequon (70ENNPT74) that is not occupied, and that is predicted to be located in the cytoplasmic region of the protein. Cj0610c (encoding the peptidoglycan O-acetyltransferase PatB) is potentially the most ‘modified’ protein in C. jejuni since it contains 5 confirmed N-glycosites and 10 N-sequons in total, all of which are predicted to be located within the periplasm; structural elucidation of this protein could be particularly useful in determining the three-dimensional constraints involved in N-glycan site occupancy (see below). A further 5 proteins (Cj0114, Cj0592c, Cj0843c, Cj1013c and Cj1670c) each contain 4 verified N-glycosites (Table 1). Additionally, eight proteins have been identified with the pEtN-modified N-glycan attached.91 Although the function of the pEtN-glycan remains completely unknown, the proteins displaying this modification are amongst the most immunogenic in C. jejuni, including the major antigen PEB3 (Cj0289c), and the previously identified immunogens CjaC (Cj0734c), CjaA (Cj0982c) and JlpA (Cj0983).17,34,101–103 Despite this, deletion of the eptC pEtN transferase responsible for pEtN modification of the N-glycan did not influence the reactivity of these proteins with human serum.91 Further work is required to better understand the relative occupancy of unmodified versus pEtN-modified N-glycan on these glycosites and thus to assist in determining the biological role of the pEtN group in this context. It is also important to note that a second Campylobacter species, C. gracilis, exclusively modifies proteins with an N-glycan displaying a terminal pEtN group;63 however, the role of this modification again remains to be elucidated.
Cj | 81–176 | Gene | Identification | Sequence/site# | Location | Topology |
---|---|---|---|---|---|---|
# Where a glycosylation site was identified only in another strain, this sequence is shown in (brackets); + non-canonical sequon denoted by underlining of the atypical amino acid at the −2 or +2 position; Asn (N) highlighted in bold, and shaded in italicized bold where also modified by the pEtN-modified N-glycan; location, predicted subcellular localization as determined by PSORTb (version 3.0.2)106 and LipoP 1.0,139 (x) number of predicted transmembrane regions (TMR), or presence of signal peptide (SP); unless experimentally proven, all lipoproteins were considered anchored to OM or IM (dependent on LipoP use of the ‘+2 rule’: Asp at +2 from lipo-Cys predicts IM anchoring, all other amino acids predict OM anchoring) with the protein facing into the periplasm; topology, predicted location of the N-glycosylation site as determined by TmPred (https://embnet.vital-it.ch/software/TMPRED) and TOPCONS.140 Cyto, cytoplasm; E, extracellular; IM, inner membrane; LP, lipoprotein; OM, outer membrane; PP, periplasm; SE, surface exposed; Unk, unknown. ^ Site identified by expression in E. coli containing the pgl cluster and over-expression in C. jejuni [H. M. Frost, PhD Thesis, University of Manchester, 2015], not seen in any wild-type C. jejuni glycoproteome studies. a Cj0017c localization depends on correct prediction of orientation for the N- and C-terminus of the protein. b Cj0371 co-localises to the poles of C. jejuni cells and thus co-localises with flagella.134 Thus, the protein is potentially surface-exposed (SE). c Cj0592c: PSORTb predicts unknown localization; LipoP predicts a lipoprotein signal peptide with OM anchor (Asp at +2 position to SpII cleavage site); protein is described as ‘putative periplasmic protein’. d Cj0599: PSORTb predicts unknown localization; protein contains a C-terminal OmpA domain suggesting OM localization and therefore topology could be PP or SE. e Cj0776c: PSORTb predicts cytoplasmic localization; 1 predicted TMR; TOPCONS predicts 1 TMR with the majority of the protein localized to the periplasm. f Cj0864: Ding et al.141 reported this sequence as DMoxNVS (where the methionine is methionine sulfoxide); however, the NCTC11168 sequence indicates the −2 position is an alanine. This sequence was also low scoring as discussed in the text. g Cj0944c: PSORTb predicts cytoplasmic localization; LipoP and TOPCONS predict 1 SP and periplasmic location. h Cj0982c: PSORTb predicts periplasmic localization; TOPCONS and LipoP predict a lipoprotein with IM anchoring. Experimental evidence in ref. 103. i Cjj81176_1263 was originally described in ref. 89 and 91 as CJE1384. | ||||||
Cj0011c | 0037 | cj0011c | Putative non-specific DNA-binding protein (competence ComEA-like; natural transformation protein) | 49EANFT53 | IM (1) | PP |
Cj0017c | 0044 | dsbI | Disulfide bond formation protein DsbI | 3EINKT7 | IM (5) | Cytoa |
Cj0081 | 0118 | cydA | Cytochrome bd oxidase subunit I | 283DNNES287 | IM (9) | PP |
| | | | | 351EN(S)NDT355 | | PP |
Cj0089 | 0124 | cj0089 | Putative lipoprotein (TPR tetratricopeptide repeat-like helical domain protein) | 73DFNKS77 | LP/IM (SP) | PP |
Cj0114 | 0149 | cj0114 | Putative periplasmic protein (TPR tetratricopeptide repeat-like helical domain protein; putative Tol-Pal system protein YbgF/putative cell division coordinator CpoB) | 99ENNFT103 | OM | PP |
| | | | | 153DA(V)NLS157 | | PP |
| | | | | 171DSNST175 | | PP |
| | | | | 177ENNNT181 | | PP |
Cj0131 | 0166 | cj0131 | Putative peptidase M23 family protein/putative zinc metallopeptidase (putative Gly–Gly endopeptidase) | 73DDNTS75 | Unk (1) | PP |
Cj0143c | 0179 | znuA | Putative periplasmic ABC transport solute-binding protein (zinc-binding ABC transporter ZnuA) | 26E(D)QNTS30 | PP | PP |
Cj0152c | 0188 | cj0152c | Putative membrane protein (45.3% similarity to H. pylori sialidase A/neuraminidase) | 126EQNNT130 | Unk (1) | PP |
| | | | | 157DNNK161+ | | PP |
| | | | | 163ETNRT167 | | PP |
| | | | | 182DKNIS186 | | PP |
| | | | | 188ENNIS192 | | PP |
| | | | | 193ENNTT197 | | PP |
| | | | | 250DFNIS254 | | PP |
Cj0158c | 0194 | cj0158c | Putative haem-binding lipoprotein (cytochrome c oxidase Cbb3-like protein) | 119DKNHS123 | LP/OM (SP) | PP |
Cj0168c | 0204 | cj0168c | Putative periplasmic protein | 26DVNQT30 | PP (SP) | PP |
Cj0176c | 0212 | cj0176c | Putative lipoprotein | 29DLNKT33 | LP/OM (SP) | PP |
Cj0177 | ND | ctuA/chaN | Putative iron transport protein (putative iron-regulated lipoprotein) | 83EGNLS87^ | IM (1) | PP |
Cj0182 | 0213 | cj0182 | Putative transmembrane transport protein (ABC transporter transmembrane family; long chain fatty acid ABC transport protein; peptide antibiotic transport protein SbmA) | 58DSNST62 | IM | PP |
| | | | | 70ENNAT74 | | PP |
Cj0199c | 0230 | | Putative periplasmic protein | 126DINLS130 | Unk (1) | PP |
Cj0200c | 0231 | cj0200c | Putative periplasmic protein | 33DNNKT37 | Unk (SP) | PP |
Cj0235c | 0260 | secG | Uncharacterized protein (preprotein translocase subunit SecG) | 87ENNNT91 | IM (2) | PP |
| | | | | 118DVNSS122 | | PP |
Cj0238 | 0263 | cj0238 | Putative mechanosensitive ion channel family protein (MscS family membrane integrity protein) | 24DANIS28 | IM (5) | PP |
| | | | | 56DENSS60 | | PP |
Cj0256 | 0283 | eptC | Putative sulfatase family protein (phosphoethanolamine transferase EptC; lipid A/lipooligosaccharide pEtN transferase EptC) | 213ENNHT217 | IM (5) | PP |
Cj0268c | 0295 | cj0268c | Putative transmembrane protein (SPFH domain/band 7 family protein; FtsH protease regulator HflC) | 274EANAT278 | Unk (1) | PP |
Cj0277 | 0304 | mreC | Homolog of E. coli rod-shape determining protein MreC | 91DQNST95 | Unk (1) | PP |
Cj0289c | 0315 | peb3 | Major antigenic peptide PEB3 (thiosulfate/sulfate-binding protein) | 88DFNVS92 | Unk (SP) | PP |
Cj0313 | 0335 | cj0313 | Putative integral membrane protein (putative lipooligosaccharide export ABC transporter permease LptG) | 173DLNLS177 | IM (6) | PP |
| | | | | 196DGNIT200 | | PP |
Cj0365c | 0388 | cmeC | Outer membrane channel protein CmeC (multi-drug antibiotic efflux system CmeABC protein) | 30EANYS34 | OM (SP) | PP |
| | | | | 47ENNSS51 | | PP |
Cj0366c | 0389 | cmeB | Efflux pump membrane transporter CmeB (Multi-drug antibiotic efflux system CmeABC protein) | 634DRNVS638 | IM (12) | PP |
Cj0367c | 0390 | cmeA | Periplasmic fusion protein CmeA (multi-drug antibiotic efflux system CmeABC protein) | 121DFNRS125 | IM (1) | PP |
| | | | | 271DNNNS275 | | PP |
Cj0371 | 0395 | cj0371 | UPF0323 lipoprotein Cj0371 (putative secreted protein involved in flagellar motility) | 75DLNGT79 | LP/OM (SP) | PP/SEb |
Cj0376 | 0400 | cj0376 | Putative periplasmic protein | 50DKNQT54 | Cyto | PP |
Cj0397c | 0420 | cj0397c | Uncharacterized protein | 105DFNNT109 | Unk (1) | PP |
Cj0399 | 0422 | cvpA | Colicin V production protein homolog CvpA | 179DLNNT183 | IM (4) | PP |
Cj0404 | 0428 | dedD | Putative transmembrane protein (SPOR sporulation domain-containing protein; putative cell division protein DedD) | 101EQNNT105 | Unk (1) | PP |
Cj0454c | 0479 | cj0454c | Putative membrane protein | 91ENNKS95 | IM (1) | PP |
Cj0455c | 0480 | cj0455c | Putative membrane protein | 60QNQT64+ | IM (1) | PP |
Cj0494 | 0515 | cj0494 | Putative exporting protein | 26DNNIT30 | Unk | PP/SE |
Cj0508 | 0536 | pbpA | Penicillin-binding protein PbpA (penicillin-binding protein 1A; peptidoglycan transpeptidase PBP1A) | 312DANLS316 | IM (1) | PP |
Cj0511 | 0539 | ctpA | Putative secreted protease (protease family S41; carboxy-terminal protease CtpA) | 67DQNIS71 | IM (1) | PP |
Cj0515 | 0543 | cj0515 | Putative periplasmic protein | 207ELNAT211 | IM (3) | PP |
| | | | | 234DFNAS238 | | PP |
Cj0530 | 0555 | cj0530 | Putative periplasmic protein (AsmA family protein DUF3971 domain) | 519DFNAS523 | OM (1) | PP/SE |
| | | | | 617DSNKT621 | | PP/SE |
Cj0540 | 0565 | cj0540 | Putative exporting protein | 173ENNNS177 | Unk (0) | PP/SE |
Cj0587 | 0615 | cj0587 | Putative integral membrane protein | 282DNNLS286 | IM (8) | PP |
Cj0592c | 0620 | cj0592c | Putative periplasmic protein (putative lipoprotein; Cj0591 paralog) | 96DINQS100 | Unk (SP)c | PP |
| | | | | 103ENNES107 | | PP |
| | | | | 127ENNQS131 | | PP |
| | | | | 137DVNMT141 | | PP |
Cj0599 | 0627 | cj0599 | Putative OmpA family membrane protein (putative chemotaxis protein MotB; Putative flagellar motor motility protein MotB; Cj0336c MotB paralog) | 97EANIT101 | Unk (1) | PP/SEd |
| | | | | 109DLNST113 | | PP/SE |
| | | | | 168DNNIT172 | | PP/SE |
Cj0608 | 0637 | cj0608 | Putative outer membrane efflux protein (putative TolC-like outer membrane protein; putative antibiotic efflux CmeC paralog) | 35DLNLT39 | OM (2) | PP |
Cj0610c | 0639 | cj0610c | Putative periplasmic protein (SGNH family hydrolase; putative lipase/esterase; peptidoglycan O-acetyltransferase PatB) | 82DENLS86 | Unk (1) | PP |
| | | | | 98DENTS102 | | PP |
| | | | | 113DANIS117 | | PP |
| | | | | 296ENNRS300 | | PP |
| | | | | 331EENAS335 | | PP |
Cj0633 | 0661 | cj0633 | Putative periplasmic protein (putative polysaccharide deacetylase; putative glycoside hydrolase/deacetylase) | 73DNNKS77 | Cyto (1) | PP |
| | | | | 123DTNLT127 | | PP |
| | | | | 129DQNLT133 | | PP |
Cj0648 | 0676 | cj0648 | Putative membrane protein (putative lipooligosaccharide transport system substrate-binding protein LptC) | 49ESNTS53 | IM (1) | PP |
| | | | | 103EGNVT107 | | PP |
Cj0652 | 0680 | pbpC | Penicillin-binding protein PbpC (penicillin-binding protein PBP2; peptidoglycan transpeptidase PBP2) | 99DLNAS103 | IM (1) | PP |
| | | | | 467ENNNT471 | | PP |
Cj0694 | 0717 | ppiD | Putative periplasmic protein (SurA domain-containing outer membrane protein folding protein; peptidyl-prolyl cis/trans isomerase PpiD) | 132DFNKT136 | IM (1) | PP |
| | | | | 306DQNIS310 | | PP |
| | | | | 426DQNSS430 | | PP |
Cj0734c | 0757 | hisJ | Probable histidine-binding protein (periplasmic lipoprotein CjaC; solute transport protein HisJ) | 26EN(S)NAS30 | IM (PP) | PP |
Cj0776c | 0797 | cj0776c | Putative periplasmic protein | 87DENQS91 | Cyto (1)e | PP |
| | | | | 103ENNQS107 | | PP |
| | | | | 111DTNTS115 | | PP |
Cj0780 | 0801 | napA | Periplasmic nitrate reductase NapA (catalytic subunit of the NapAB complex) | 385DDNES389 | IM (PP) | PP |
Cj0783 | 0804 | napB | Periplasmic nitrate reductase NapB (electron transfer subunit of the NapAB complex) | 48EANFT52 | IM (PP) | PP |
Cj0843c | 0859 | slt | Putative secreted transglycosylase (soluble lytic murein peptidoglycan transglycosylase) | 97DANLT101 | IM (PP) | PP |
| | | | | 173DLNTG(S)177 | | PP |
| | | | | 327DANAS331 | | PP |
| | | | | 374DYNKT378 | | PP |
Cj0846 | 0862 | cj0846 | Uncharacterized metallophosphoesterase (Ser/Thr phosphatase family protein) | 280DLNTS284 | IM (3) | PP |
Cj0864 | 0880 | cj0864 | Putative periplasmic protein (putative thiol: disulfide interchange protein DsbA homolog) | 50MNVS54+f | IM (PP) | PP |
Cj0906c | 0915 | pgp2 | Putative periplasmic protein (peptidoglycan L-D-carboxypeptidase Pgp2) | 53DKNIS57 | IM (SP) | PP |
Cj0944c | 0968 | cj0944c | Putative periplasmic protein (putative flagellar protein FliL; chemotaxis-associated protein) | 219ENNAS223 | Cyto (0)g | PP |
| | | | | 238DENST242 | | |
Cj0958c | 0981 | yidC | Membrane protein insertase YidC (integral membrane protein assembly/folding protein YidC) | 40EQNIT44 | IM (5) | PP |
| | | | | 48QNTS52+ | | PP |
| | | | | 154DENGS158 | | PP |
Cj0982c | 1001 | cjaA | Putative amino acid transporter periplasmic solute-binding protein CjaA | 137DSNIT141 | IM (PP/LP)h | PP |
Cj0983 | 1002 | jlpA | Uncharacterized lipoprotein Cj0983 (surface-exposed lipoprotein JlpA) | 105E(K)ANAS109 | OM (SE) | SE |
| | | | | 144DINAS148 | | SE |
Cj1007c | 1025 | cj1007c | Putative mechanosensitive ion channel family protein (MscS family osmotic stress resistance protein) | 17DVNRT21 | IM (4) | PP |
Cj1013c | 1032 | cj1013c | Putative cytochrome c biogenesis protein CcmF/CycK/CcsA family protein CcsB | 178ENNNS182 | IM (14) | PP |
| | | | | 230DENLT234 | | PP |
| | | | | 530DLNST534 | | PP |
| | | | | 731DGNWT(I)735 | | PP |
Cj1032 | 1051 | cmeE | Membrane fusion component of antibiotic efflux system CmeDEF | 199DQNGT203 | IM (1) | PP |
Cj1053c | 1073 | cj1053c | Putative integral membrane protein (amino acid/carbohydrate/antibiotic transport permease motifs protein; lipooligosaccharide ligase-like motif protein) | 75DINVS79 | IM (2) | PP |
| | | | | 96DNNQS100 | | PP |
Cj1055c | 1075 | cj1055c | Putative sulfatase family protein (putative arylsulfatase; putative phosphoglycerol transferase lipooligosaccharide synthesis protein homolog) | 616ESNDT620 | IM (5) | PP |
Cj1126c | 1143 | pglB | Undecaprenyl-diphosphooligosaccharide-protein glycosyltransferase (PglB oligosaccharyltransferase) | 532DYNQS536 | IM (12) | PP |
Cj1219c | 1232 | cj1219c | Putative periplasmic protein (uncharacterized protein involved in outer membrane biogenesis assembly) | 47DVNIT51 | OM (1) | PP |
Cj1345c | 1344 | pgp1 | Putative periplasmic protein (peptidoglycan D-L-carboxypeptidase Pgp1) | 59DYNIT63 | Cyto (1) | PP |
| | | | | 159EINAS163 | | PP |
| | | | | 348DGNET352 | | PP |
Cj1373 | 1376 | cj1373 | Putative integral membrane protein (antibiotic resistance sterol-sensing domain protein; RND superfamily export protein MmpL family) | 134DINRT138 | IM (12) | PP |
| | | | | 497DQNTS501 | | PP |
Cj1444c | 1438 | kpsD | Capsule polysaccharide export system periplasmic protein KpsD | 37DQNLS41 | IM (PP) | PP |
| | | | | 50ENNLT54 | | PP |
Cj1496c | 1488 | cj1496c | Putative periplasmic protein (putative magnesium transporter MgtE-like protein; putative motility chaperone MotE; putative flagellar protein FliG) | 71EVNAT75 | Cyto (PP) | PP |
| | | | | 167DNNAS171 | | PP |
Cj1565c | 1550 | pflA | Paralysed flagellar motility protein A PflA | 456DNNAS460 | Cyto (PP) | PP |
| | | | | 495EGNFS499 | | PP |
Cj1621 | 1608 | cj1621 | Putative periplasmic protein | 197DLNKT201 | E (1) | PP |
Cj1661 | 1652 | cj1661 | Putative ABC transport system permease (putative antibiotic macrolide export protein MacB; putative cell division protein FtsX) | 188ENNQS192 | IM (4) | PP |
Cj1670c | 1666 | cgpA | Putative periplasmic protein (campylobacter glycoprotein A; AMIN-domain containing protein, membrane protein assembly protein) | 26DQNIT30 | Unk (0) | PP |
| | | | | 71DVNKS75 | | PP |
| | | | | 104EKNSS108 | | PP |
| | | | | 111ESNST115 | | PP |
ND | 0063 | sirA | Dissimilatory sulfite reductase SirA/MccA | 213DGNLS217 | IM (1) | PP |
ND | 0701 | kdpC | Potassium-transporting ATPase KdpC subunit | 83DTNES87 | IM (1) | PP |
ND | 1263 | 1263 | Uncharacterized proteini | 26EQNGS30 | Unk (SP) | PP |
VirB10 | pVir0003 | virB10 | Type IV secretion system protein VirB10 | 30EENVS34 | OM (SP) | PP |
| | | | | 95DNNIT99 | | PP |
Fig. 2 Modelling of predicted surface topologies of 3 C. jejuni N-glycoproteins. (A) Cj0152c; positions of experimentally verified N-glycosites (Asn; N) are shown in red circles with occupied sequons shown in blue fill, the position of a non-canonical, but occupied sequon is shown in green fill; (B) Cj0179 (ExbB1); positions of two sequons (not experimentally verified) are shown in blue with the Asn residues in red, the N-terminal signal peptide that overlaps the first sequon is in green; (C) Cj1087c; positions of two sequons (not experimentally verified) are shown in blue with the Asn residues in red, the sequon at position 12DINGS16 is predicted to reside in the cytoplasm and hence cannot be glycosylated. All topologies were visualized using Protter.142
As discussed above, despite PglB showing a preference for Thr at the +2 position,92 there is no obvious bias towards Thr in the identified N-glycosites; in fact only 60 of 134 identified sites contain sequons with a Thr in this position (44.8%), with 73 containing Ser (54.5%) and the final site displaying alanine (Ala) in a non-canonical sequon (Table 1).89 Conversely, there is a clear preference for Asp at the −2 position, with 84 sequons displaying this amino acid (62.7%) compared with only 47 displaying Glu (35.1%). The final 3 sequons were non-canonical (Table 1). These data align with previous studies that tested the glycosylation efficiency of various sequon compositions by the PglB OST and found DQNAT to be the optimal sequon, as well as an ∼5-fold preference for Asp, rather than Glu, at the −2 position.104
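The residue frequencies quoted above follow directly from the site counts. A minimal sketch that reproduces the arithmetic is given below; the counts and the total of 134 sites are taken from the text, while the dictionary and function names are illustrative only.

```python
# Counts reported in the text for the 134 experimentally verified N-glycosites.
TOTAL_SITES = 134
plus2_counts = {"Thr": 60, "Ser": 73, "Ala (non-canonical)": 1}
minus2_counts = {"Asp": 84, "Glu": 47, "non-canonical": 3}


def percentages(counts, total):
    """Return each count as a percentage of total, rounded to one decimal place."""
    if sum(counts.values()) != total:
        raise ValueError("counts do not sum to the stated total")
    return {residue: round(100 * n / total, 1) for residue, n in counts.items()}


plus2_pct = percentages(plus2_counts, TOTAL_SITES)
minus2_pct = percentages(minus2_counts, TOTAL_SITES)

for label, table in (("+2", plus2_pct), ("-2", minus2_pct)):
    for residue, pct in table.items():
        print(f"{label} {residue}: {pct}%")
```

The check inside `percentages` confirms that each set of categories accounts for all 134 sites before any percentage is reported.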
Beyond localization and topology, the next major influence on sequon occupancy is the tertiary conformation of the protein, with the three-dimensional structure of both the target protein and PglB itself dictating site accessibility.85,93,107 Unlike in eukaryotes, where N-glycosylation occurs in the endoplasmic reticulum (with further processing in the Golgi apparatus) prior to or during folding (and hence partially dictates the final conformation), the prevailing viewpoint is that the C. jejuni N-glycan is added to already folded, or at least partially folded, substrates,93,100 meaning that existing tertiary structural constraints are a major factor in the final attachment and kinetics of the modification. Sequons buried within the tertiary structure are therefore inaccessible to PglB and cannot be modified, irrespective of their sub-cellular location. The earliest structural consideration of C. jejuni N-glycosylation was based on the crystal structure of the major antigen PEB3 (Cj0289c),108 which showed that the N-glycosite at 88DFNVS92 occurs in a flexible exposed loop region readily accessible to the PglB OST. Without determined structures of glycoproteins, it therefore remains difficult to predict which sequons will be occupied and the likely level of site occupancy, and there are very few N-glycoproteins for which three-dimensional structures are currently available. In addition to PEB3, and PglB itself,85 structures of components of the tripartite antibiotic efflux system CmeABC (Cj0365c–Cj0367c) have also been elucidated,109,110 and all 3 are N-glycoproteins (Table 1). CmeA is the periplasmic membrane fusion family protein, with 2 N-glycosites both predicted to be located within the periplasm (Table 1).
CmeC is the outer membrane channel and examination of the crystal structure109 shows that both experimentally verified glycosylated sequons (30EANYS34 and 47ENNSS51) are located in a periplasmic disordered exposed loop region that leads from the membrane-embedded N-terminal lipidated cysteine (following removal of the signal peptide) to the first structured part of the protein. Therefore both sequons are consistent with the known structural requirements for N-glycosylation.93,99,108
CmeB, which is the inner membrane efflux pump, contains one well characterized N-glycosite (634DRNVS638). A second sequon (653DRNAS657) is located proximal to this confirmed site, but no experimental evidence exists for this site being occupied in any C. jejuni strain, and hence CmeB is the only protein with both an occupied and an unoccupied sequon for which structural information can currently be determined. These two sites are also of interest since their sequons are nearly identical: the residues at the −2 and +2 positions are the same, so the preferences described above cannot account for any difference in occupancy (indeed the arginine [Arg] at the −1 position is also shared, while the +1 position is a semi-conservative substitution from valine [Val] to alanine [Ala], which are both aliphatic amino acids), and sequon composition most likely does not influence site occupancy here. Interrogation of the CmeB tertiary structure shows that both sequons are located in the large periplasmic section of the protein located between the sixth and seventh transmembrane-spanning regions (TMR; residues 554–867, with CmeB predicted to contain 11 TMR, excluding the N-terminal signal peptide) and are found in short disordered exposed loop regions separated by a small alpha-helix (Fig. 3A). Tertiary structure modelling shows that 634DRNVS638 is located close to the membrane and the modified Asn is highly solvent accessible, while 653DRNAS657 is located further into the periplasm. Although solvent accessible, Asn-655 is partially occluded by Arg-654 (Fig. 3A). The CmeB structure was next modelled in protein complex with the PglB OST, using the model sequon DQNAT104 to provide the PglB binding conformation. CmeB/PglB docking clearly demonstrated a preference for the Asn-636 site, consistent with the identification of this site in several MS-based studies (Fig. 3B and C), while the Asn-655 site does not appear to readily interact with the PglB model, and hence is likely to be either not glycosylated or glycosylated at only very low site stoichiometry.
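The position-by-position comparison of the two CmeB sequons can be sketched as follows. This is an illustration only, not part of the structural analysis: it assumes the canonical bacterial sequon D/E-X1-N-X2-S/T (X1, X2 not Pro), writes the two sequons with the central Asn explicit (DRNVS for residues 634–638, DRNAS for residues 653–657), and uses hypothetical names (`SEQUON`, `compare_sequons`).

```python
import re

# Assumption: canonical bacterial sequon D/E-X1-N-X2-S/T, where X1 and X2
# are any of the 19 standard amino acids other than Pro.
NOT_PRO = "[ACDEFGHIKLMNQRSTVWY]"
SEQUON = re.compile(f"[DE]{NOT_PRO}N{NOT_PRO}[ST]")

# Positions relative to the modified Asn (position 0).
POSITIONS = (-2, -1, 0, 1, 2)


def compare_sequons(a, b):
    """Return {relative position: (residue in a, residue in b)} where a and b differ."""
    for s in (a, b):
        if not SEQUON.fullmatch(s):
            raise ValueError(f"{s} is not a canonical sequon")
    return {pos: (x, y) for pos, x, y in zip(POSITIONS, a, b) if x != y}


occupied = "DRNVS"    # CmeB residues 634-638 (Asn-636), experimentally verified
unoccupied = "DRNAS"  # CmeB residues 653-657 (Asn-655), not observed as occupied

print(compare_sequons(occupied, unoccupied))
```

Under these assumptions the only difference is Val versus Ala at the +1 position, which is why the structural context, rather than sequon composition, is examined to explain the difference in occupancy.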
Fig. 3 CmeB modelling with PglB highlighting N-glycosylation sequons. (A) CmeB trimer (Protein Data Bank [PDB]: 5LQ3) has a transmembrane domain (highlighted in blue with 11 TMR) and periplasmic domain. The experimentally validated sequon (634DRNVS638) is labelled red and the non-identified sequon (653DRNAS657) is labelled green, with both Asn labelled cyan. Both Asn are located on the periplasmic side on exposed loops and are solvent-accessible with Asn-636 more accessible than Asn-655; (B) The PglB OST (PDB: 3RCE) shown in yellow has a transmembrane spanning domain (highlighted in blue with 12 TMR) and a larger periplasmic region where the catalytic domain is located. The sequon recognition site is highlighted in orange and facing towards CmeB with the glycan-binding site located behind. Sequon 634DRNVS638 is in closer proximity and has better accessibility to the PglB catalytic site; (C) (left) PglB viewed from the front (90° counter-clockwise rotation to upper panels) reveals the sequon-binding surface in orange, (middle) PglB fitted with the model peptide mimic DQNAT, (right) alignment of CmeB to PglB (90° counter-clockwise rotation to panel B) reveals that sequon 634DRNVS638 is more spatially likely to fit into the active site of PglB, suggesting this sequon is more readily glycosylated than 653DRNAS657. Analysis was performed in UCSF Chimera 1.14 (build 42094).
Further evidence for structural constraints determining optimal glycosylation has been shown for the doubly glycosylated surface-exposed glycoprotein JlpA.101 Scott et al. showed that JlpA must be glycosylated at one site (144DINAS148) before a second site (105EANAS109) can be glycosylated, implying that structural modifications to JlpA conferred by Asn-146 glycosylation open the protein conformation and allow PglB to add the N-glycan to the second site. These structural constraints have since been confirmed using structural predictions and crystallography.111 Finally, nuclear magnetic resonance (NMR) analysis of a recombinant C. jejuni CmeA domain indicates that the N-glycan itself adopts a rigid rod conformation112 that appears to fold back over the exposed protein (thus suggesting a role in protection from proteolysis), although it remains to be seen how well conserved this is in vivo. Although no examples have been shown in the literature, the converse may also be true in that the N-glycan itself may hinder accessibility of a second site in a given protein to the PglB OST. Despite this possibility, proteins such as Cj0152c (Fig. 2A) have multiple sites in close sequence space; occupied sequons are found at 7 sites, with 3 (Asn-184, Asn-190 and Asn-195) located within 20 amino acids. To determine if N-glycan steric hindrance of PglB occurs, better understanding of individual site occupancy, in the context of tertiary structures, is needed.
A final structural/topological consideration is the role of N-glycosylation in OMVs that have been associated with C. jejuni virulence.24–26 OMVs package cytoplasmic, periplasmic, outer membrane-associated and N-glycoproteins in a ‘bleb’-like structure.24 PglB is located in the cytoplasmic/inner membrane, which is not typically associated with OMVs. It is possible, however, that inner membrane fragments may also be packaged into OMVs, and all C. jejuni OMV proteomics studies have demonstrated the identification of integral cytoplasmic membrane proteins (e.g. CmeB24). Packaging of PglB into OMVs may enable glycosylation of sites not typically found in the membrane; however, we and others have observed no such cytoplasmic N-glycosites, even at low levels, which may imply that PglB does not occur in OMVs, or that OMVs are not induced (or collected) under the culture conditions employed in the N-glycosite discovery studies conducted thus far. C. jejuni OMV composition is, however, altered in pgl-negative compared with wild-type C. jejuni,113 suggesting N-glycosylation does impact protein packaging, although no differences were observed in the ability of OMVs from either pgl positive or negative bacteria to induce an immune response.113
The membrane-associated targets of the pgl N-glycosylation system are largely functionally uncharacterized ‘putative’ proteins. The remaining proteins share some degree of sequence identity with well characterized proteins from other organisms, while only a very small number have been experimentally validated. Examination of the relationships between glycoprotein identifications (Table 1) highlights several clusters of potentially functionally related classes of protein, including those involved in antibiotic resistance (all 3 members of the CmeABC antibiotic efflux system are glycosylated, as is CmeE of the CmeDEF efflux system), and antibiotic resistance has been strongly associated with the pgl system.117 Additionally, proteins with putative functions, or sequence similarity to proteins, involved in peptidoglycan biosynthesis, modification and C. jejuni helical cell morphology (Pgp1, Pgp2, MreC, PatB [Cj0610c], Cj0843c and the penicillin-binding proteins PbpA and PbpC), LOS and capsular polysaccharide (CPS) transport and assembly (Cj0313/LptG, Cj0648/LptC, Cj1053c, Cj1055c and KpsD), and membrane protein translocation and assembly (SecG, Cj0238, PpiD, YidC, Cj1219c, CgpA) are also enriched in the 78 identified N-glycoproteins; however, these phenotypes have not yet been tested in pgl negative C. jejuni or N-glycosite mutants.
While several of the above studies have examined phenotypes from the perspective of pgl negative and positive C. jejuni, comparatively fewer studies have attempted to exploit site-directed mutagenesis to understand the role of the N-glycan in individual proteins. This is mainly due to the difficulty in generating site mutants in C. jejuni, which is considered poorly tractable and somewhat recalcitrant to molecular biology approaches considered standard in species such as E. coli. Despite this, a limited number of studies have been performed.122–124 N-Glycosite point mutants in cmeA (CmeA contains 2 glycosites; Table 1) have increased susceptibility to several antimicrobials including bile salts and ciprofloxacin, and are attenuated for chicken colonization.125 The PglB OST is also capable of transferring the N-glycan to itself;61 however, recombinant PglB expressed in otherwise non-glycosylating E. coli remains capable of catalyzing the transfer of N-glycans to proteins,62 suggesting PglB does not strictly require modification with the heptasaccharide to maintain function. Plasmid encoded VirB10 (as well as CmeA, discussed above) was reported to require N-glycosylation to perform its function (in natural transformation) at wild-type levels.123 VirB10 is not universally distributed among strains of C. jejuni; however, impaired natural transformation in the absence of N-glycosylation has also been observed in studies of the Cj0011c N-glycoprotein.126 Several confirmed N-glycoproteins (including DsbI, JlpA, PEB3, EptC, Cj0268c, Cj0371, Cj0454c, Cj0511c/CtpA, Cj0587 and Pgp1/Pgp2) have been associated with host colonization;127–134 however, these focused studies of individual glycoproteins have only rarely attempted to provide evidence of a contribution from the N-glycan, rather than testing gene-specific deletion mutants. In vitro expression and functional analysis of C. jejuni N-glycoproteins in non-pgl-containing E. coli suggest that N-glycosylation is not required for the function of a number of glycoproteins;128,130,135,136 however, without site mutants or comparative expression in pgl-positive expression systems, it is not possible to compare the functional efficiency of these proteins when glycosylated.
While evidence that C. jejuni protein N-glycosylation occurs on folded substrates indicates that the modification is not a driver of protein folding, there is a mounting body of evidence to suggest that the N-glycan may be important for protein stability. Mansell et al. demonstrated that the glycoproteins PEB3, CjaA and PatB/Cj0610c displayed differences in protein stability in an N-glycosylation competent, pgl system-containing E. coli.137 These proteins also showed altered folding when glycosylated, further supporting the JlpA evidence that indicates glycan attachment can alter conformational state.111 Similarly, Min et al. showed an increase in thermostability for recombinantly expressed PEB3 engineered to have an additional N-glycosylation site in comparison to an unmodified variant.138 Finally, Alemka et al. showed that a pgl-negative strain displayed reduced viability when cultured under physiological levels of human- and chicken-derived proteases,120 which also supports the notion that N-glycosylation is involved in conferring protein stability.
Footnote
† Current address: Centre for Blood Research, Department of Oral Biological and Medical Sciences, University of British Columbia, Vancouver, Canada.
This journal is © The Royal Society of Chemistry 2020