Ana García-García,a Thomas Hicks,b Samir El Qaidi,c Congrui Zhu,c Philip R. Hardwidge,c Jesús Angulo*bde and Ramon Hurtado-Guerrero*afg
aInstitute of Biocomputation and Physics of Complex Systems (BIFI), University of Zaragoza, Mariano Esquillor s/n, Campus Rio Ebro, Edificio I+D, Zaragoza, Spain. E-mail: rhurtado@bifi.es
bSchool of Pharmacy, University of East Anglia, Norwich Research Park, Norwich, NR4 7TJ, UK
cCollege of Veterinary Medicine, Kansas State University, Manhattan, KS 66506, USA
dDepartamento de Química Orgánica, Universidad de Sevilla, Sevilla, 41012, Spain. E-mail: jangulo@us.es
eInstituto de Investigaciones Químicas (CSIC-US), Sevilla, 41092, Spain
fCopenhagen Center for Glycomics, Department of Cellular and Molecular Medicine, School of Dentistry, University of Copenhagen, Copenhagen, Denmark
gFundación ARAID, Zaragoza, Spain
First published on 19th August 2021
NleB/SseK effectors are arginine-GlcNAc-transferases expressed by enteric bacterial pathogens that modify host cell proteins to disrupt signaling pathways. While the conserved Citrobacter rodentium NleB and E. coli NleB1 proteins display a broad selectivity towards host proteins, Salmonella enterica SseK1, SseK2, and SseK3 have a narrowed protein substrate selectivity. Here, by combining computational and biophysical experiments, we demonstrate that the broad protein substrate selectivity of NleB relies on Tyr284NleB/NleB1, a second-shell residue contiguous to the catalytic machinery. Tyr284NleB/NleB1 is important in coupling protein substrate binding to catalysis. This is exemplified by S286YSseK1 and N302YSseK2 mutants, which become active towards FADD and DR3 death domains, respectively, and whose kinetic properties match those of enterohemorrhagic E. coli NleB1. The integration of these mutants into S. enterica increases S. enterica survival in macrophages, suggesting that better enzymatic kinetic parameters lead to enhanced virulence. Our findings provide insights into how these enzymes finely tune arginine-glycosylation and, in turn, bacterial virulence. In addition, our data show how promiscuous glycosyltransferases preferentially glycosylate specific protein substrates.
A rare PTM described a few years ago, arginine-glycosylation, is catalyzed by Gram-negative bacterial GTs.5 This unusual PTM occurs on the arginine guanidinium group, a very poor nucleophile. In Pseudomonas and Neisseria species, arginine-glycosylation is catalyzed by EarP, while in enteropathogens, it is catalyzed by the type III secretion system effectors arginine GTs NleB and SseK.6–9 EarP is a rhamnosyl-transferase that uniquely glycosylates the bacterial translation elongation factor P (EF-P) to activate its function and drive bacterial pathogenicity.6 The NleB/SseK GTs transfer GlcNAc to arginines of several mammalian proteins and to at least five bacterial proteins.7–12 The NleB/SseK GTs are not classified in the CAZy database (GTnc).13
While C. rodentium only encodes one NleB, most E. coli strains encode two NleB proteins named NleB1 and NleB2. For several years, the role of NleB2 was unclear10 until a recent publication reported that NleB2 is an arginine GT that preferentially transfers glucose to RIPK1, inhibiting host protein function similarly to other NleB/SseK GTs.14 The change in sugar donor preference was attributed to Ser252NleB2, which corresponds to a Gly residue in all homologous sequences.14 CrNleB is highly conserved with respect to the NleB1s of the attaching/effacing pathogens enterohemorrhagic Escherichia coli (EHEC) and enteropathogenic E. coli (EPEC).15 Specifically, the sequence identity is ∼89% between CrNleB and the NleB1s, and 98% between the NleB1s (Fig. 1a). Salmonella enterica strains encode up to three functional NleB orthologs named SseK1, SseK2, and SseK3. When the CrNleB/NleB1sEPEC/EHEC are compared to SseK1/2/3, the sequence identities drop significantly, ranging from 51 to 57% (Fig. 1a). At the structural level, these enzymes show a high degree of similarity and are built from two conserved major domains and a C-terminal lid, which is also required for catalytic activity. The GT-A fold-adopting catalytic domain is the largest domain and includes the essential DxD and HEN (His–Glu–Asn) motifs. The helix–loop–helix (HLH) domain comprises two helices, α3 and α4, connected by a loop16–19 (Fig. 1a and b).
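The identity percentages quoted above are read from a multiple sequence alignment (Fig. 1a). As a minimal sketch of how such a percentage can be computed for one pair of aligned sequences: the function name `percent_identity`, the convention of skipping any column that has a gap in either sequence, and the toy sequences are all assumptions made for illustration. The source does not state which identity convention was used.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment.

    Assumption: columns with a gap in either sequence are ignored.
    Other conventions (e.g. dividing by the full alignment length) exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned sequences (hypothetical, not taken from Fig. 1a).
toy_a = "MKT-YEHLS"
toy_b = "MKSAYEH-S"
print(round(percent_identity(toy_a, toy_b), 1))
```

In the toy pair, seven columns are free of gaps and six of them match, so the identity is 6/7 of the compared positions.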
Fig. 1 Analysis of the interacting residues in the NleB1EPEC–FADDDD interface. (a) Multiple sequence alignment of CrNleB, NleB1EPEC, NleB1EHEC, SseK1wt, SseK2wt and SseK3wt. Residues are color-coded by their degree of sequence conservation where black, grey and white colors denote identity, high similarity and dissimilarity, respectively. Shown above the NleB1EPEC sequence, in gray (catalytic domain) and brown (HLH domain), are the secondary structure elements (α-helices and β-strands) based on the NleB1EPEC–UDP–Mn2+–FADDDD structure (PDB entry 6ACI19). The residues forming part of the C-terminal lid are indicated within a red box while a blue rectangle marks the five C-terminal residues. The five inverted green triangles indicate the residues in NleB GTs that are non-conserved or partly conserved with the SseK GTs and are engaged in FADDDD interaction. These residues were targeted for site-directed mutagenesis in SseK1wt and SseK2wt. (b) Cartoon representation of the NleB1EPEC–UDP–Mn2+–FADDDD. The catalytic and HLH domains of NleB1EPEC are shown in gray and brown, respectively. The FADDDD is shown in cyan. Residues are shown as sticks with carbon atoms with the corresponding colors indicated above. UDP and Mn2+ are shown as green carbon atoms and as a pink sphere, respectively; hydrogen bond interactions are shown as dotted orange lines but only for residues interacting at the interface of the complex. The interactions with UDP and Mn2+ have been extensively discussed before16,17 and will not be further discussed.
NleB GlcNAc-transferase activity is essential to bacterial virulence.7 Multiple host protein substrates for the CrNleB and NleB1EPEC/EHEC have been described and include the death domains (DD) of tumor necrosis factor receptor 1 (TNFR1), the TNFR1-associated death domain protein (TRADD), the receptor-interacting serine/threonine-protein kinase 1 (RIPK1), the TNF-receptor superfamily member 25 (DR3 or TNFRSF25), and FAS-associated death domain protein (FADD).9,19,20 In addition, these enzymes glycosylate other proteins that lack DDs. CrNleB and NleB1EHEC, but not NleB1EPEC, glycosylate GAPDH,10 and NleB1EPEC glycosylates HIF-1α.21 Overall, NleB1 disrupts TNFR-associated factor (TRAF) signaling, leading to inhibition of the pro-inflammatory NF-κB pathway.7,9,10 SseK1, SseK2, and SseK3 have a narrower protein substrate selectivity; SseK1 glycosylates TRADD9 and GAPDH,10 but not FADD;10 SseK2 glycosylates FADD10 but not TRADD17 or GAPDH;10 SseK3 glycosylates TNFR1,18 TRAIL,18 and the small GTPase Rab1,22 but not GAPDH10 or FADD.10 This illustrates that although the SseK and NleB GTs are highly similar at the sequence and structural level, they differ in their protein substrate selectivity. Furthermore, the SseK GTs are inactive towards some NleB-specific protein substrates.10,17
To address the molecular basis of NleB/SseK protein substrate selectivity and to determine why some SseK GTs do not glycosylate particular protein substrates, we report herein a multidisciplinary approach on wild type (wt) NleB1EHEC, SseK1, and SseK2, combined with the characterization of different SseK1 and SseK2 mutants, which reveals that a single second-shell residue near the catalytic machinery finely tunes substrate selectivity and catalysis. We also show that optimal kinetic parameters are achieved by the mutants S286YSseK1 and N302YSseK2, which contain a Tyr residue that replaces SseK1wt Ser286 and SseK2wt Asn302. We recover full activity with S286YSseK1 on FADDDD and N302YSseK2 on DR3DD. Finally, we demonstrate that the integration of S286YSseK1 and N302YSseK2 mutants in a Salmonella enterica strain devoid of all SseK enzymes increases Salmonella survival in macrophages, suggesting that better enzymatic kinetic parameters lead to enhanced virulence.
A thorough comparison of the degree of conservation of the fourteen interacting residues of NleB1EPEC with the aligned residues from other orthologs indicated that only five residues might be responsible for the null activity of SseK1wt on FADDDD (Table S1†). These residues were mostly non-conserved or partially conserved with SseK1wt residues. Tyr145 and Glu149 were quite close in the structure and located in a loop connecting α3 with α4 of the HLH domain (Fig. 1b). The other three residues, Tyr284, Lys289, and Lys292, were located in the catalytic domain and exclusively in α9 (Fig. 1a). While both Lys residues were proximal in the structure and established salt bridges with Asp123FADD and Glu130FADD, Tyr284 was more isolated from them and interacted through its side chain with the Val121FADD backbone and the Ile126FADD side chain (Fig. 1 and Table S1†). Tyr284 was also engaged in a CH–π interaction with the proximal Tyr283 side chain, which likely controls the interaction between the acceptor Arg117FADD and the catalytic base Glu253 (Fig. 1b). Hence, Tyr284 is a second-shell residue with respect to the catalytic machinery in which Glu253 is one of the key players.
To initiate this study, we determined the kinetic parameters of NleB1EHEC on UDP-GlcNAc and FADDDD (see Fig. 2a, left panel, ESI, Fig. S2, and Table S2†). NleB1EHEC displayed a clear substrate inhibition profile in the presence of variable concentrations of FADDDD (Fig. 2a; Ki = 793 ± 160 μM) that was not present in SseK2wt or the mutant proteins. This behavior might be attributed to a different non-productive FADDDD structural arrangement at high concentrations of this substrate. On the contrary, substrate inhibition under variable concentrations of UDP-GlcNAc was barely present (Fig. S2†). The Kms for UDP-GlcNAc and FADDDD were 125 ± 33 and 13 ± 2.5 μM, respectively, and the kcat was ∼100 min−1 (Fig. 2b, left and middle panels, and Table S2†), a value in agreement with previously reported kcat values for a similar protein GT such as EarP23 (kcat of 35 min−1), and an unrelated one such as PoFUT2 (ref. 24) (kcat of 144 min−1), both of which are glycosyltransferases that glycosylate other types of folded domains. Furthermore, these kcat values also match reported kcat values for unstructured peptides in the presence of other protein GTs (a range between 46–400 min−1, and 18–300 min−1 for different peptides using GalNAc-T3 (ref. 25) and N-glycosyltransferase,26,27 respectively). As expected from previous8,9 and our own studies,10 SseK1wt was inactive on FADDDD and SseK2wt was slow on FADDDD (Fig. 2a, middle and right panel, and Table S2†). Specifically, the Km, kcat, and catalytic efficiency of SseK2wt were 95-, 6.7-, and 630-fold worse than those of NleB1EHEC (Fig. 2b). Note that the kinetic parameters for SseK2wt are estimates owing to its poor binding to FADDDD.
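The substrate-inhibition profile described above can be illustrated with the standard rate law for inhibition by excess substrate, v/[E] = kcat[S]/(Km + [S] + [S]²/Ki). The source does not state which equation was fitted, so the choice of model here is an assumption; the constants are the FADDDD values quoted in the text (Km = 13 μM, Ki = 793 μM, kcat ≈ 100 min−1), and the function names are hypothetical.

```python
import math

# Values quoted in the text for NleB1EHEC acting on FADD-DD.
KCAT_PER_MIN = 100.0  # approximate kcat, min^-1
KM_UM = 13.0          # Km for FADD-DD, micromolar
KI_UM = 793.0         # substrate-inhibition constant, micromolar


def turnover_rate(s_um, kcat=KCAT_PER_MIN, km=KM_UM, ki=KI_UM):
    """Rate per enzyme (min^-1) under the assumed substrate-inhibition model."""
    if s_um < 0:
        raise ValueError("substrate concentration must be non-negative")
    return kcat * s_um / (km + s_um + s_um ** 2 / ki)


def optimal_substrate(km=KM_UM, ki=KI_UM):
    """Substrate concentration (micromolar) that maximizes the rate.

    Setting d(rate)/d[S] = 0 for the model above gives [S] = sqrt(Km * Ki).
    """
    return math.sqrt(km * ki)


s_opt = optimal_substrate()
for s in (13.0, s_opt, 800.0):
    print(f"[S] = {s:7.1f} uM -> v/[E] = {turnover_rate(s):5.1f} min^-1")
```

Under this model the rate passes through a maximum near 100 μM FADDDD and declines at higher concentrations, which is the qualitative signature of substrate inhibition; an enzyme without inhibition would instead plateau at kcat.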
Fig. 2 Enzyme kinetics and ITC experiments of NleB1EHEC/SseK1wt/SseK2wt and mutants on FADDDD. (a) Glycosylation kinetics of NleB1EHEC/SseK1wt/SseK2wt and mutants against FADDDD. (b) Plots comparing the Km, kcat and catalytic efficiency (kcat/Km) of the different NleB1EHEC/SseK1wt/SseK2wt and mutants. Additional kinetic data are given in Table S2.† Asterisks indicate that the kinetic parameters for SseK2wt are estimated due to its poorer binding to FADDDD. (c) (left) Thermodynamic dissection of the interaction of the different enzyme forms with FADDDD. The binding Gibbs energy (ΔG), enthalpy (ΔH), and entropy (−TΔS) are in kcal mol−1. Any negative value represents a favorable contribution to the binding, whereas a positive value represents an unfavorable contribution. (right) Graph depicting the Kds of the different enzymes.
Seeking to switch on the activity of SseK1wt towards FADDDD, we characterized the double mutants M147Y–K151ESseK1 and N291K–R294KSseK1, and the single mutant S286YSseK1. In all of these mutants, we replaced the SseK1wt residues with those at the corresponding positions in NleB1EPEC/EHEC. While the initial velocity of the double mutants was very slow (∼46-fold worse than the NleB1EHEC initial velocity at 800 μM FADDDD; Fig. 2a, middle panel), strikingly, the kcat for the single mutant S286YSseK1 matched that of NleB1EHEC (only 1.38-fold worse; Fig. 2b, middle panel), implying that with a single mutation we reached the optimal kcat found for NleB1EHEC. On the contrary, Km and the catalytic efficiency for S286YSseK1 were slightly worse (4.4- and 6.7-fold worse than the ones reported for NleB1EHEC; Fig. 2b, left and right panels, and Table S2†). To obtain an SseK1 mutant with kinetic parameters similar to those of NleB1EHEC, we combined the double mutants to generate a quadruple mutant (M147Y-K151E-N291K-R294KSseK1). We further added the S286YSseK1 mutation to the latter mutant, generating a quintuple mutant (M147Y-K151E-S286Y-N291K-R294KSseK1). Additionally, we made another mutant, named the quintuple-del mutant, by combining the quintuple mutant with a deletion of the last 5 C-terminal residues, which are only found in SseK1wt (Fig. 1a). The kcat, Km and catalytic efficiency for the quadruple mutant were 4.3-/2.7-, 6.1-/1.4-, and 27-/4-fold worse than the ones for NleB1EHEC and S286YSseK1, respectively (Fig. 2a, b, and Table S2†). This demonstrates that the kinetic parameters of the quadruple mutant are closer to those of S286YSseK1 than those of NleB1EHEC. Hence, either combining multiple changes (quadruple mutant) or introducing a single mutation (S286YSseK1), both of which lead to higher affinity of SseK1 towards FADDDD, achieves kcat values close to that of NleB1EHEC.
The kinetic parameters of the quintuple and quintuple-del, mainly kcat and kcat/Km, were highly similar to each other and very close to those of S286YSseK1 (Fig. 2b, middle and right panels). However, they differed slightly more in their Kms, with the quintuple/quintuple-del mutants Kms being ∼2-fold lower than that of S286Y (Fig. 2b, left panel). The reduction in Km for the quintuple/quintuple-del mutants enhanced their catalytic efficiencies, approximating their values to that of NleB1EHEC. Overall, the removal of the C-terminal 5 amino acids in SseK1wt and the addition of 5 mutations led to an SseK1 form with the closest kinetic parameters to those of NleB1EHEC, mainly due to the lowest Km towards FADDDD of these SseK1 mutants.
Having established that a single mutation (S286YSseK1) was sufficient to achieve optimal kinetic parameters and render the best kcat of all the characterized mutants, we mutated Ser286SseK1 and Asn302SseK2 to Asn/Ile, and Ser/Ile/Tyr, respectively, rendering the mutants S286NSseK1, S286ISseK1, N302SSseK2, N302ISseK2 and N302YSseK2. These single mutants are derived from the alignment of Ser286SseK1 with Asn302SseK2, Ile289SseK3 and Tyr284CrNleB/NleB1 (Fig. 1a). As expected from the previous results10 of SseK3wt on FADDDD, the S286ISseK1 and N302ISseK2 were completely inactive, and N302SSseK2 was also completely inactive on FADDDD, implying that both Ser and Ile residues are likely deleterious for catalysis (Fig. 2a, middle and right panel). In addition, S286NSseK1 showed poor glycosylation of FADDDD (∼15-fold worse initial velocity than the one reported for NleB1EHEC at 800 μM FADDDD) as found for SseK2wt (Fig. 2a, middle panel). Although N302YSseK2 achieved a highly similar kcat to those of NleB1EHEC and S286YSseK1, its Km towards FADDDD was worse (3- and 12-fold higher than those of S286YSseK1 and NleB1EHEC, respectively). This caused a drop in catalytic efficiency compared to that of NleB1EHEC that was more drastic than that for S286Y (3- and 20-fold worse than those of S286YSseK1 and NleB1EHEC, respectively; Fig. 2a, b, and Table S2†).
Overall, our data indicate that a single mutation, either from Ser286SseK1 or Asn302SseK2 to Tyr, is sufficient to switch on and improve SseK1 and SseK2 glycosylation on FADDDD, respectively. This mutation allows reaching kinetic parameters very close to those of NleB1EHEC.
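Because catalytic efficiency is defined as kcat/Km, the fold changes quoted in this section are linked: an enzyme whose kcat is a-fold lower and whose Km is b-fold higher than the reference has an efficiency that is a × b-fold lower. The minimal sketch below checks this relation against two sets of fold changes quoted in the text (SseK2wt and the quadruple mutant, both relative to NleB1EHEC). The function name is hypothetical, and small differences from the reported efficiencies are expected because the published fold changes are rounded and some parameters are estimates.

```python
def efficiency_fold_loss(kcat_fold_worse, km_fold_worse):
    """Fold loss in kcat/Km implied by a fold loss in kcat and a fold rise in Km."""
    return kcat_fold_worse * km_fold_worse


# Fold changes relative to NleB1EHEC, as quoted in the text.
reported = {
    "SseK2wt": {"kcat": 6.7, "km": 95.0, "efficiency": 630.0},
    "quadruple mutant": {"kcat": 4.3, "km": 6.1, "efficiency": 27.0},
}

for name, fold in reported.items():
    predicted = efficiency_fold_loss(fold["kcat"], fold["km"])
    print(f"{name}: predicted {predicted:.1f}-fold, "
          f"reported {fold['efficiency']:.0f}-fold")
```

For both enzymes the product of the individual fold changes lands within a few percent of the reported loss in catalytic efficiency.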
Once we determined that FADDDD binding to these enzymes requires prior UDP binding, we measured the thermodynamic parameters for all mutants versus FADDDD under an excess of UDP. We could only obtain titration curves for NleB1EHEC and the S286YSseK1, N302YSseK2, quadruple, quintuple, and quintuple-del mutants (Fig. S3†).
Detailed analysis of the thermodynamic parameters of the interaction showed that the binding of FADDDD to NleB1EHEC and the mutants was largely entropy-driven (−TΔS), while the binding of UDP was favored by a gain in enthalpy (ΔH), with a reduced entropic component (Fig. 2c and Table S3†), implying distinct interaction behaviors between these molecules. The unique thermodynamic profile exhibited by FADDDD might be due to the release of a vast number of surface water molecules from both FADDDD and NleB1EHEC/SseK1/SseK2 mutants upon binding, promoting a favorable desolvation entropy. On the contrary, the significant reduction in donor substrate mobility upon binding to the enzyme, along with the large number of hydrogen bonds between UDP or UDP-GlcNAc and these enzymes, are the major factors explaining the reduction in the entropic component and the favorable enthalpy.17,19 Interestingly, the single mutants S286YSseK1 and N302YSseK2 achieve favorable binding Gibbs energy to FADD by reducing the beneficial entropy component of the interaction, which is accompanied by a more favorable enthalpy. Binding to FADD thus globally follows a pattern of enthalpy–entropy compensation where multiple mutants show similar thermodynamic profiles to that of NleB1, with single mutants benefitting from enthalpy, suggesting that the solvation/desolvation process at the interface of interaction with FADD is more similar to NleB1 for the multiple mutants than for the single mutants (Fig. S5†).
The Kds are in the low μM range except for that of N302YSseK2. Although the Kds are much lower than the Kms, there is some correlation between the Kms and the Kds: those enzymes with lower Km values also possess lower Kds (Table S3†). Again, NleB1EHEC displays the highest affinity (Kd = 0.2 ± 0.04 μM), being 3-, ∼6-, 13-, and ∼472-fold better than those of quintuple-del, quintuple/S286YSseK1, quadruple, and N302YSseK2 mutants, respectively (Fig. 2c, right panel, and Table S3†).
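The dissociation constants can be related to the binding Gibbs energies through ΔG = RT ln(Kd/c°), with c° = 1 M, so that a fold difference in Kd corresponds to ΔΔG = RT ln(fold). The sketch below applies these relations to the NleB1EHEC Kd of 0.2 μM and to the 472-fold difference quoted for N302YSseK2. The temperature of 298.15 K is an assumption (the experimental temperature is not given in this section), and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def delta_g_from_kd(kd_molar, temperature_k=298.15):
    """Binding Gibbs energy (kcal/mol) from Kd in mol/L (1 M standard state)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


def delta_delta_g(fold_weaker, temperature_k=298.15):
    """Gibbs energy penalty (kcal/mol) for a Kd that is fold_weaker times higher."""
    if fold_weaker <= 0:
        raise ValueError("fold change must be positive")
    return R_KCAL * temperature_k * math.log(fold_weaker)


kd_nleb1 = 0.2e-6  # mol/L, Kd of NleB1EHEC for FADD-DD quoted in the text
print(f"dG(NleB1EHEC)  = {delta_g_from_kd(kd_nleb1):.2f} kcal/mol")
print(f"ddG(472-fold)  = {delta_delta_g(472.0):.2f} kcal/mol")
```

At the assumed temperature, a Kd of 0.2 μM corresponds to a ΔG of roughly −9 kcal mol−1, and a 472-fold weaker Kd costs between 3.5 and 4 kcal mol−1 of binding energy.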
Overall, our data show that the improvement in binding of SseK1 mutants and N302YSseK2 to FADDDD is essential to promote catalysis. Strikingly, this can be achieved by a single mutation, S286YSseK1 or N302YSseK2, or by a combination of multiple mutations in different regions of the enzyme. Nevertheless, these single mutations are enough to account for the best kcat of all mutants, implying that the Tyr residue in that position might also play a catalytic role.
In the case of S286YSseK1, Tyr286 maintains similar favorable contacts as those in the complex of NleB1EPEC with FADDDD (Fig. 3), where its aromatic side chain is inserted into the groove formed by helices α2 and α3 and the loop connecting them, making close contacts with the backbone of Val121FADD and the side chain of Ile126FADD. However, for SseK1wt and S286ISseK1 the side chain at the point of mutation is either too small (Ser286) or too bulky (Ile286), respectively, to be properly accommodated in the groove between helices α2 and α3 (Fig. S7†). In the case of S286NSseK1, the side chain is likewise not well accommodated in the FADD groove, although a persistent hydrogen bond with Asp123FADD at helix α2 was observed (Fig. S7†).
These Gaussian accelerated molecular dynamics (GaMD) simulation results correlate very well with the kinetics measurements for SseK1wt and mutants, supporting a key role of the interaction of the side chain at the point of mutation with Ile126FADD from the FADDDD α2–α3 groove, most likely in the form of a favorable enthalpy contribution (Fig. S8†).
The GaMD simulations also allowed us to identify an important correlation between the side chain present at the point of mutation and catalysis, by analyzing the internal dynamics of the acceptor site (Arg117FADD side chain). In NleB1EPEC, the salt-bridge interaction between the proposed catalytic base (Glu253NleB1) and the guanidinium of Arg117FADD is maintained and holds the Arg117 in an orientation appropriate for the nucleophilic attack over the beta face. That interaction is not well conserved in the simulations of the other enzymes, and the role of Glu253NleB1 (Glu255SseK1) is taken over by the carboxylate of another residue, Asp188SseK1, which holds the Arg117 side chain in a rather rigid, proper orientation throughout the simulation time. This only occurs for the mutants that show glycosylating activity, S286YSseK1 and S286NSseK1, whereas that interaction is absent in the cases of S286ISseK1 and SseK1wt (Fig. S8†), where the Arg117 is more dynamic. This is reflected in the root-mean-square fluctuation (RMSF) values of Arg117 in the complexes with NleB1EPEC, S286YSseK1, and S286NSseK1, which showed the lowest values (below 1 Å; Table S4†).
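The RMSF values used above measure how far each atom deviates, on average, from its mean position over a trajectory. A minimal sketch of the calculation follows, assuming the frames have already been superimposed on a common reference and are stored as a NumPy array of shape (frames, atoms, 3). The function name and the toy coordinates are hypothetical and are not taken from the GaMD trajectories.

```python
import numpy as np


def rmsf(coords):
    """Per-atom root-mean-square fluctuation.

    coords: array of shape (n_frames, n_atoms, 3), in any length unit.
    Assumes the frames are already aligned; no fitting is done here.
    """
    coords = np.asarray(coords, dtype=float)
    if coords.ndim != 3 or coords.shape[2] != 3:
        raise ValueError("coords must have shape (n_frames, n_atoms, 3)")
    mean_pos = coords.mean(axis=0)                   # (n_atoms, 3)
    sq_dev = ((coords - mean_pos) ** 2).sum(axis=2)  # (n_frames, n_atoms)
    return np.sqrt(sq_dev.mean(axis=0))              # (n_atoms,)


# Toy two-frame trajectory: atom 0 is static, atom 1 moves between x = +1 and x = -1.
toy = np.array([
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
    [[0.0, 0.0, 0.0], [-1.0, 0.0, 0.0]],
])
print(rmsf(toy))
```

In the toy trajectory the static atom has an RMSF of zero and the moving atom an RMSF of one length unit, mirroring the contrast between a rigidly held and a more dynamic Arg117 side chain.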
GaMD simulations also show that Phe187SseK1, Asp188SseK1, and Arg191SseK1 form a stable network of interactions with Arg117FADD (acceptor) and the sugar nucleotide (donor, Fig. S9†). Asp188SseK1 and Arg191SseK1 constitute the Asp/Arg dyad present in other bacterial GT effectors.19 These results explain the need for a GT-bound sugar nucleotide to have efficient FADDDD binding (ordered bi–bi mechanism). The identified network of interactions leads to favorable contacts of the carboxylate side chain of Asp188SseK1 with the acceptor Arg117FADD, and of the latter with the sugar nucleotide. These interactions are conserved in the NleB1–FADDDD complex (Fig. S9†).
The GaMD simulations support that a single mutation of Ser286SseK1 to Tyr leads to a favorable coupling between increased affinity and stability of the Arg117FADD orientation, which is appropriate for the nucleophilic attack on the anomeric carbon to render inversion of the configuration. This is achieved by a favorable interaction of the Tyr286SseK1 residue with Ile126FADD, leading to a stable salt bridge of the Arg117FADD guanidinium polar head with the carboxylate of Asp188SseK1. The fact that Glu255SseK1 is far away from Arg117FADD and that Asp188SseK1 takes the role of the leading carboxylate in keeping the guanidinium in a proper orientation over the beta face of the GlcNAc residue of the donor substrate strongly suggests that Asp188SseK1 might function as the catalytic base in S286YSseK1 and S286NSseK1 mutants. In fact, the D186A mutation in NleB1EPEC (the aligned residue in NleB1) has also been reported to be detrimental for NleB1 activity.19
Fig. 4 Enzyme kinetics of NleB1EHEC/SseK1wt/SseK2wt and mutants on DR3DD. (a) Glycosylation kinetics of NleB1EHEC/SseK1wt/SseK2wt and mutants against DR3DD. (b) Plots comparing the Km, kcat and catalytic efficiency (kcat/Km) of the different NleB1EHEC/SseK1wt/SseK2wt and mutants. Additional kinetic data are shown in Table S5.† Asterisks indicate that the kinetic parameters for N302YSseK2 are estimated due to its poorer binding to DR3DD.
We then performed enzyme kinetics assays on mutants S286I/N/YSseK1 and N302I/S/YSseK2. As found for SseK2wt on DR3DD, S286NSseK1 was also inactive on DR3DD (Fig. 4a, right panel). The other mutants displayed different degrees of initial velocities. Although the initial velocity for the mutants S286ISseK1 and N302I/SSseK2 was approximately half that of NleB1EHEC at 140 μM DR3DD, these mutants did not reach saturation kinetics, preventing us from determining their kinetic parameters. However, we could obtain kinetic parameters for S286YSseK1 and N302YSseK2 (Fig. 4a, b, Table S5†). Again, the mutation to Tyr in both enzymes provided DR3DD saturation curves. The Kms for NleB1EHEC and S286YSseK1 were similar while kcat and catalytic efficiency were ∼1.75-fold better for NleB1EHEC than S286YSseK1. On the contrary, kcat values were similar for NleB1EHEC and N302YSseK2, differing more in Km and catalytic efficiency (3.3- and 4.1-fold better constants for NleB1EHEC than those for N302YSseK2; Table S5†). Overall, our data with DR3DD are slightly more complex than those for FADDDD; in particular, they suggest that a Tyr residue at positions 284CrNleB/NleB1, 286SseK1 and 302SseK2 is more beneficial for enzyme kinetics than a Ser, and a Ser more than an Ile, whereas an Asn residue at those positions is deleterious for activity.
Here, we have addressed the molecular basis of this narrowed substrate selectivity, which relies on a unique second-shell residue, variable between the SseK GTs and located in the interface of the NleB/SseK-protein substrate complex. The mutation of this second-shell residue, either Ser286SseK1 or Asn302SseK2, to TyrNleB/NleB1, leads to mutants with optimal kinetic and thermodynamic parameters. The mutants S286YSseK1 and N302YSseK2 become active on particular protein substrates such as FADDDD and DR3DD, respectively, leading potentially to mutants with a broader substrate selectivity, similar to that of CrNleB/NleB1EPEC/EHEC. This is also supported by the increase in Salmonella abundance in macrophages by strains expressing these mutants. These mutants promote binding and catalysis, likely because the binding of this second-shell residue to surrounding residues is coupled to the stability of the interaction between the acceptor Arg and the catalytic base residue. Therefore, the identity of the second-shell residue finely tunes protein substrate selectivity and, in turn, glycosylation, and might explain whether GT substrate selectivity is narrow or broad.
Promiscuous GTs act on multiple protein substrates and are found in all animal kingdoms.4 Several GT mechanisms have been discovered by combining X-ray crystallography experiments with other biophysical and biochemical techniques. Initiating GTs such as POFUT1/POGLUT1 (also called Rumi) and POFUT2 require folded EGF and TSR repeats, respectively. These GTs have in common that the EGF and TSR repeats are tethered by direct hydrogen bonds, and that they require EGF and TSR repeats containing minimal consensus sequences that encompass mostly variable residues located in between two cysteine residues. However, POFUT2, and to a lesser extent POFUT1, apply additional strategies to recognize many different protein substrates, leveraging the water molecules present in the interface of the complexes to recognize different repeats. In addition, POGLUT1 and POFUT2 also interact with the repeats by hydrophobic interactions.31–33 Other initiating GTs, such as OGT and the large family of GalNAc-T isoenzymes (20 in humans), recognize mostly unfolded extended and compact structures of the peptide acceptor substrates, respectively, by establishing hydrogen bonds. Thus, for example, while for OGT a proline (Pro) residue in −2 (promoting the extended conformation) is required for optimal glycosylation,34 for the GalNAc-Ts a Pro-X-Pro motif (promoting the compact conformation) contiguous to the preceding Ser/Thr residue usually favors glycosylation.35 In addition, GalNAc-Ts contain a flexible linker located in between the catalytic and the lectin domains, and a flexible loop that provides them with different behaviors due to their dissimilar amino acid sequences.
While the flexible linker is behind the dynamics of these isoenzymes and the location of the GalNAc-binding site in the lectin domain,36,37 the flexible loop controls the catalytic cycle and is also behind the recognition of protein substrates.38,39 Together, both the flexible linker and loop determine whether several GalNAc-T isoenzymes are highly specific for particular protein substrates.38,40 The SseK GTs have evolved different features at the molecular level to selectively recognize particular protein substrates. These features rely on variable residues located in the HLH domain and mostly on a particular variable second-shell residue with respect to the conserved Tyr284NleB/NleB1 residue.
In conclusion, and to our knowledge, this is a unique example of restoring the activity of enzymes that are inactive on particular protein substrates by a single site-directed mutation. Overall, our findings provide the molecular basis of the differences in substrate selectivity between the NleB/SseK GTs and offer clues on the molecular pathogenesis of enteropathogens.
Footnote
† Electronic supplementary information (ESI) available: Supporting methodology, 9 supporting figures and 5 supporting tables. See DOI: 10.1039/d1sc04065k
This journal is © The Royal Society of Chemistry 2021