Milda Kaniusaite abc, Robert J. A. Goode ad, Julien Tailhades abc, Ralf B. Schittenhelm ad and Max J. Cryle *abc
aThe Monash Biomedicine Discovery Institute, Department of Biochemistry and Molecular Biology, Monash University, Clayton, Victoria 3800, Australia. E-mail: max.cryle@monash.edu
bEMBL Australia, Monash University, Clayton, Victoria 3800, Australia
cThe Australian Research Council Centre of Excellence for Innovations in Peptide and Protein Science, Monash University, Clayton, Victoria 3800, Australia
dMonash Proteomics and Metabolomics Facility, Monash University, Clayton, Victoria 3800, Australia
First published on 24th August 2020
Non-ribosomal peptide synthesis is an important biosynthesis pathway in secondary metabolism. In this study we have investigated modularisation and redesign strategies for the glycopeptide antibiotic teicoplanin. Using the relocation or exchange of domains within the NRPS modules, we have identified how to initiate peptide biosynthesis and explored the requirements for the functional reengineering of both the condensation/adenylation domain and epimerisation/condensation domain interfaces. We have also demonstrated strategies that ensure communication between isolated NRPS modules, leading to new peptide assembly pathways. This provides important insights into NRPS reengineering of glycopeptide antibiotic biosynthesis and has broad implications for the redesign of other NRPS systems.
Given the importance of the products of natural megaenzyme synthases, an ability to alter such biosynthetic pathways to engineer the production of desired compounds would be of great value.15 In such endeavours, the modularity of NRPS machineries would appear to make reengineering such assembly lines highly feasible, given the shared enzymology and stepwise nature of peptide biosynthesis.1,2 However, the reality of reengineering these large and complex proteins has often shown that there are significant challenges yet to solve if we are to be able to perform such biosynthetic reengineering in a reliable and efficient way.15 Given the challenges of working with large proteins in vitro and the use of in vivo biosynthesis for eventual scale up and production of non-ribosomal peptides, it is unsurprising that the majority of efforts have been performed in vivo.16–23 Recent efforts in this regard have focused on C-domains as crucial junctions in modular redesign, either via module division within C-domains for linear systems from strains of Xenorhabdus and Photorhabdus19 or alternate module architecture for iterative fungal systems.24 Whilst in vivo approaches have made valuable contributions to NRPS redesign, few have been explored in vitro, which leads to difficulties in fully characterising these approaches especially in situations that are only partially successful or that show unexpected outcomes. Given this, our approach to NRPS redesign has focussed on the in vitro reconstitution of GPA biosynthesis from teicoplanin and related molecules (Fig. 1).6,7,14,25–28 Now, we explore the ability to generate alternate NRPS assembly lines from teicoplanin biosynthesis using a combination of approaches including module hybridisation, re-purposing extension modules as initiation modules and redesigning modules to control intermodule communication through the use of specific domain interaction interfaces. 
In this way, we present a set of instructions to tackle the reconstitution and reengineering of complex NRPS assembly lines in vitro.
Fig. 2 Summary of the modules and their domain composition in the teicoplanin NRPS together with the alternate constructs designed and tested in this study. |
Fig. 3 Initiation of peptide biosynthesis from elongation modules of the teicoplanin NRPS. Peptide biosynthesis assays included either 3 (M5–M7, A) or 4 (M4–M7, B and C) modules of the teicoplanin NRPS in various configurations together with ATP and the substrates of the A-domains of each module (M4, M5 – Hpg; M6 – ClBht; M7 – 3,5-Dpg). Peptides detected for M5–M7 are indicated in orange, for M4–M7 in red; yield is calculated for each species as a percentage of the total ion current determined by LCMS analysis. Peptide species indicated; for m/z data see ESI Table S3.† Domain key: A – adenylation, C – condensation, PCP – peptidyl carrier protein, E – epimerisation, X – P450-recruitment, TE – thioesterase, COM – communication. Module colour codes: M4 – green, M5 – pale blue, M6 – dark blue, M7 – violet. |
Given that the majority of the NRPS initiation modules begin with an A-domain (except modules that contain a C-domain to load acyl groups, for example) we first generated an M5-6 dimodule construct in which the initial C-domain had been deleted. When M5-6 was used in a peptide reconstitution assay together with M7, biosynthesis of tripeptide 3 (Hpg–ClBht–Dpg) was observed (Fig. 3A). Next, we attempted to extend 3 into tetrapeptide 7 (Hpg–Hpg–ClBht–Dpg) via the addition of module 4 (Fig. 3B and C). This was not achieved by adding module 4 constructs (M4, M4a) into the M5-6 + M7 assay (Fig. 3B), but was possible when replacing M4 + M5-6 with complete trimodule constructs (M4-6, M4-6a; Fig. 3C).
These experiments revealed three key findings. Firstly, initiation of NRPS biosynthesis is achievable using re-purposed extension modules. The second finding stems from the inability of a single module (M4) to compete for initiation with the M5-6 dimodule. This indicates that the affinity of the C/A domain interface between M4 and M5-6 is insufficient to allow peptide extension from a single amino acid, but that such extension is possible when these modules are fused in the M4-6 constructs. Thus, module fusion allows peptide synthesis to be initiated because the modules are restrained in the same construct, which dramatically increases their effective local concentration. The third finding stems from the fact that 7 is formed using both M4-6 and M4-6a, although M4-6a lacks the natural N-terminal C-domain. This indicates that the presence of such C-domains in an NRPS module is insufficient to prevent the initiation of peptide synthesis. Rather, it appears that interactions with neighbouring modules control this process. This suggests that the affinity between modules split into separate proteins must be sufficiently high to prevent unwanted peptide chain initiation during normal peptide biosynthesis, which is clearly of interest for NRPS module redesign.
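The yields quoted in Fig. 3 are described as percentages of the total ion current for each species. A minimal sketch of that arithmetic follows, assuming the total is taken as the sum over the detected peptide species. The function name and the example values are hypothetical and are not data from this study.

```python
def relative_yields(ion_currents):
    """Return each species' share of the summed ion current, in percent.

    ion_currents maps a species label to its integrated ion current
    (arbitrary units). Assumption: the "total" is the sum over the
    species supplied here.
    """
    total = sum(ion_currents.values())
    if total <= 0:
        raise ValueError("total ion current must be positive")
    return {species: 100.0 * signal / total
            for species, signal in ion_currents.items()}


# Hypothetical integrated ion currents, for illustration only.
example = {"tripeptide 3": 6.0e6, "tetrapeptide 7": 2.0e6}
print(relative_yields(example))
```

Because each value is normalised to the summed signal, the percentages compare species within one assay but do not account for differences in ionisation efficiency between peptides.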
Fig. 4 Engineering modular interactions across E- and C-domains for modules M6 and M3. Peptide biosynthesis assay using the engineered dimodule M5-6a + M3 together with ATP and the substrates of the A-domains of each module (M3 – 3,5-Dpg; M5 – Hpg; M6 – ClBht); HRMS analysis shows the formation of 3a; for m/z data see ESI Table S3.† Double peak caused by epimerisation of C-terminal Dpg residue during methylamine offloading. Domain key: A – adenylation, C – condensation, PCP – peptidyl carrier protein, E – epimerisation, COM – communication. Module colour codes: M2 – orange, M3 – yellow, M5 – pale blue, M6 – dark blue. |
These experiments show that natural communication between M2 and M3 requires the M2 E-domain. The ability of M5-6a + M3 to generate tripeptide 3 demonstrates effective communication between reengineered M6 and M3, which are not adjacent within the natural NRPS assembly line (Fig. 4). As the same results were obtained with M5-6a and M5-6b, this shows that there was no need to transplant the PCP–E didomain in this case. It is important to note, however, that the activity of the transplanted E-domain appears unable to compete with peptide extension by M3. This is consistent with previous results that have shown the M3 C-domain is able to extend incorrectly configured peptides with little effect on extension rate.25
First, we explored the role of the individual M4 and M5 E-domains on peptide hydrolysis. As constructs, we generated active site mutants of the catalytic histidine residue for the E-domains in M4 (His to Ala, M4b) and M5 (His to Gln, M5a; His to Ala, M5b). Next, we prepared synthetic tetra- and pentapeptides (8, 9 respectively, see ESI†) matching the natural teicoplanin peptide sequence but in which the C-terminal residue of each peptide was in either the L- or D-configuration. These peptides were converted into peptidyl-CoAs (ESI Fig. S5–S8†) and loaded enzymatically onto the apo-PCP domains in these M4 and M5 constructs using Sfp. Hydrolysis and epimerisation were then measured via LCMS by comparing the retention times of the product peptides with synthetic standards.
Using this assay setup, we first compared the effect of the E-domain on peptide epimerisation in M4 and mutant M4b constructs using synthetic tetrapeptides D/L-8 loaded onto these modules (ESI Fig. S9A–H†). After overnight incubation, M4 led to hydrolysis of both D- and L-8, whilst M4b displayed no hydrolysis in either case (ESI Fig. S9C–H†). Here, we were unable to assay epimerisation activity due to the co-elution of the tetrapeptides 4 L-8 and 4 D-8 (ESI Fig. S9A and B†). Next, we tested the activity of M5 constructs using synthetic pentapeptides D/L-9 loaded onto these modules (ESI Fig. S9I–P†). After overnight incubation, all of 5 L-9 loaded on M5 had been converted into the 5 D-form, whilst the mutants showed that epimerisation was either suppressed (M5a, 3:2 L:D) or abolished (M5b) (ESI Fig. S9M–P†). The hydrolysis of 9 was low in all cases. These experiments reveal that the M4 E-domain displays significant hydrolytic activity that is not seen for M5.
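To put the reported L:D ratios on a common scale, a ratio can be converted into the percentage of peptide found in the D-form. The sketch below is illustrative only and uses a hypothetical function name. It assumes the two parts of the ratio are proportional to the amounts of each diastereomer.

```python
def percent_d_form(l_part, d_part):
    """Convert an L:D ratio into the percentage of peptide in the D-form."""
    total = l_part + d_part
    if l_part < 0 or d_part < 0 or total <= 0:
        raise ValueError("ratio parts must be non-negative and not both zero")
    return 100.0 * d_part / total


# The 3:2 L:D ratio reported for M5a corresponds to 40% D-form.
print(percent_d_form(3, 2))  # 40.0
```

On this scale, complete conversion by wild-type M5 corresponds to 100% and the abolished activity of M5b to 0%, assuming the loaded peptide started entirely in the L-form.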
With evidence of the role of the M4 E-domain in peptide hydrolysis, we undertook the design of modified M4 modules in which the E-domain was replaced with the corresponding E-domain from M5. As the M4 construct contains the downstream C-domain from module 5 fused with the E-domain of module 4, it was necessary to find an appropriate non-cognate E-domain accommodation site in these constructs (Fig. 5). We therefore designed two constructs in which the interdomain linker between the C- and E-domains was either retained from the E4- and C5-domains (M4c) or matched that of the E5- and C6-domains (M4d).
Fig. 5 Importance of the E–C interdomain linker for activity in hybrid modules with transplanted E-domains (A). Within the M4–M6 protein there are two linkers between E- and C-domains, shown in green (M4 to M5) and red (M5 to M6). Peptide biosynthesis assays analysed by LCMS commencing from synthetic tripeptide 10 loaded on M3 with M4 and M5 shows effective biosynthesis of pentapeptide 12 (B) when incubated together with ATP and the specific substrates of the A-domains of each extension module (M4, M5 – Hpg). Incorporation of the M5 E-domain in M4 either retaining the M4 linker (M4c, C) or incorporating the M5 linker (M4d, D) into comparable assays shows that biosynthesis of 12 is only maintained for the M4 linker. Peptide species indicated; for m/z data see ESI Table S3.† Double peaks caused by epimerisation of C-terminal Hpg/Dpg residues during methylamine offloading. Domain key: A – adenylation, C – condensation, PCP – peptidyl carrier protein, E – epimerisation, COM – communication. Module colour codes: M3 – yellow, M4 – green, M5 – pale blue, M6 – dark blue. |
To explore the activity of these modified modules, we pre-loaded M3 with synthetic tripeptide 10 (as performed above) before using this together with M4c/d and M5 in peptide reconstitution assays. Reconstitution using M4c provided excellent conversion of 10 into pentapeptide 12. In contrast, reconstitution using M4d revealed formation of tetrapeptide 11 without hydrolysis, although there was little extension to 12. These results indicate that the existing E–C linker found within an NRPS module should be retained when transplanting E-domains into modules (Fig. 5). This also offers an explanation for the lack of activity of the transplanted E-domain in the M5-6a and M5-6b constructs given the lack of native E–C linker in the M6 module.
First, we wanted to test if communication between M7 and M4 could be enabled solely by relocating the small COM domain that is found in M3. Relocation of the M3 COM domain onto M7 generated the M7a construct and we explored if this protein was active in peptide reconstitution assays. Firstly, M5-6 was incubated together with M7a as a control, and afforded tripeptide 3 as anticipated (Fig. 6). Next, we attempted to extend the M5-6 + M7a assay by adding M4. This did not afford tetrapeptide formation (ESI Fig. S11†), which showed a lack of communication between M7a and M4.
Fig. 6 COM-domain transplantation is insufficient to allow interaction across modules M7 and M4. Peptide biosynthesis assays using dimodule M5-6 and an engineered M7 module M7a bearing the M3 COM-domain (A) forms 3, but addition of M4 has no effect (LCMS, ESI Fig. S12†). HRMS analysis shows formation of 3a when M5-6 + M7a or M5-6 + M7a + M4 was incubated together with ATP and the substrates of the A-domains of each module (M4, M5 – Hpg; M6 – ClBht; M7a – 3,5-Dpg). Peptide species indicated; for m/z data see ESI Table S3.† Double peaks caused by epimerisation of C-terminal Dpg residue during methylamine offloading. Domain key: A – adenylation, C – condensation, PCP – peptidyl carrier protein, E – epimerisation, COM – communication. Module colour codes: M4 – green, M5 – pale blue, M6 – dark blue, M7 – violet. |
Fig. 7 Engineering modular interactions across A- and C-domains for modules M7 and M4. Peptide biosynthesis assays using dimodule M5-6, an engineered M7 replacement module M3a bearing the M7 C-domain and M4 (A) shows communication between M3a and M4 and affords two starting points for NRPS-mediated peptide assembly (B). Peptide biosynthesis assays including ATP and the substrates of the A-domains of each extension module (M3a – 3,5-Dpg; M4, M5 – Hpg; M6 – ClBht). HRMS analysis (C and D) shows the formation of tetrapeptides 7a, 13a & 15a and pentapeptides 14a, 16a & 17a. 17a contains a sequence that can be rationalised through dipeptide acting as an acceptor substrate during peptide biosynthesis. Apparent pentapeptide peak indicated with an asterisk (*) did not provide an MS2 spectrum to allow structural analysis. Peptide species indicated; for m/z data see ESI Table S3.† Colour code indicates the amino acids added during the assays and matches the colours used in the LCMS traces. Double peaks caused by epimerisation of C-terminal Hpg/Dpg residues during methylamine offloading. Domain key: A – adenylation, C – condensation, PCP – peptidyl carrier protein, E – epimerisation, COM – communication. Module colour codes: M3 – yellow, M4 – green, M5 – pale blue, M6 – dark blue, M7 – violet. |
We tested the functionality of M3a in the reengineered NRPS assembly line M5-6 + M3a + M4 using our peptide reconstitution assay. This array of modules also offered the possibility for several cycles of peptide synthesis due to the potential interaction of M4 with M5-6. Analysis of the results of this assay showed a complex series of products was present (Fig. 7B). Firstly, a small amount of tetrapeptide 13 was detected, which corresponds to the anticipated M5-6 + M3a + M4 pathway (Fig. 7). Tetrapeptides 15 and 7 were also detected at very low level, and result from M3a + M4 + M5-6 and M4 + M5-6 + M3a pathways, respectively (Fig. 7C and D). However, far more significant production of several pentapeptide species was detected, with smaller amounts of 16 and larger amounts of 14 (Fig. 7D). Production of these pentapeptides can be described by the same cyclic “set” of NRPS interactions 〈-M5-6 + M3a + M4-〉 albeit with different initiation points. Peptide 16 can be rationalised as being formed by M3a + M4 + M5-6 + M3a, whilst formation of 14 is supported by M5-6 + M3a + M4 + M5-6 activity, with no final M6 activity in this case.
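The cyclic set of interactions described above can be made concrete by enumerating the residue order produced from each starting construct. The following is a minimal sketch rather than a model of the enzymology. The residue assignments are taken from the A-domain substrates listed for this assay, while the function and variable names are hypothetical. Stereochemistry and offloading chemistry are ignored.

```python
# Residues contributed by each construct, in order of incorporation
# (M5 - Hpg, M6 - ClBht, M3a - Dpg, M4 - Hpg).
RESIDUES = {"M5-6": ["Hpg", "ClBht"], "M3a": ["Dpg"], "M4": ["Hpg"]}

# Cyclic order of construct-to-construct hand-over: M5-6 -> M3a -> M4 -> M5-6 ...
CYCLE = ["M5-6", "M3a", "M4"]


def peptide_from(start, n_constructs, skip_last_residue=False):
    """List the residues added when synthesis starts at `start` and
    passes through `n_constructs` constructs around the cycle.

    skip_last_residue drops the final residue, which mimics a terminal
    dimodule whose second module does not act.
    """
    first = CYCLE.index(start)
    residues = []
    for step in range(n_constructs):
        residues.extend(RESIDUES[CYCLE[(first + step) % len(CYCLE)]])
    if skip_last_residue:
        residues = residues[:-1]
    return "-".join(residues)


print(peptide_from("M5-6", 3))  # Hpg-ClBht-Dpg-Hpg (pathway assigned to 13)
print(peptide_from("M3a", 3))   # Dpg-Hpg-Hpg-ClBht (pathway assigned to 15)
print(peptide_from("M4", 3))    # Hpg-Hpg-ClBht-Dpg (pathway assigned to 7)
print(peptide_from("M3a", 4))   # Dpg-Hpg-Hpg-ClBht-Dpg (pathway assigned to 16)
print(peptide_from("M5-6", 4, skip_last_residue=True))  # Hpg-ClBht-Dpg-Hpg-Hpg (14)
```

Every product of the cyclic set can be generated this way by choosing a start point and a number of hand-overs. A sequence that this enumeration cannot reach needs a different explanation.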
Further analysis of the MS2 fragmentation of the pentapeptide products of this assay revealed the presence of pentapeptide sequence 17, which cannot be formed through the biosynthesis pathway discussed above (Fig. 7D). To determine the pathway responsible for formation of 17, we first hypothesised that the module reengineering undertaken to produce M3a could have affected the amino acid specificity of this module, specifically here allowing activation of Hpg. We explored the Hpg vs. Dpg activation properties of M3a and compared them to M3 using a spectroscopic activity assay, in which an enzymatic cascade couples the formation of pyrophosphate during amino acid activation with the oxidation of NADH.27 These assays showed no appreciable difference in the activity of M3a for Hpg versus M3, which is in agreement with A-domain mutagenesis data that shows how selective M3 is for Dpg vs. Hpg (ESI Fig. S13†).28 This makes the incorporation of Hpg by M3a highly unlikely to explain the formation of 17.
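For readers unfamiliar with NADH-coupled assays, the conversion from an absorbance slope to a pyrophosphate release rate can be sketched with the Beer-Lambert law. This is a generic illustration, not the protocol used here. It assumes NADH is monitored at 340 nm with the commonly used extinction coefficient of 6220 M⁻¹ cm⁻¹. The number of NADH molecules oxidised per pyrophosphate depends on the enzymatic cascade and is therefore left as a required parameter. All names are hypothetical.

```python
NADH_EXTINCTION_340 = 6220.0  # M^-1 cm^-1, commonly used value for NADH at 340 nm


def ppi_release_rate(delta_a340_per_min, nadh_per_ppi, path_length_cm=1.0):
    """Return the pyrophosphate release rate in micromolar per minute.

    delta_a340_per_min: slope of the A340 trace (negative as NADH is consumed).
    nadh_per_ppi: NADH oxidised per pyrophosphate released; this depends on
                  the coupling cascade and must be supplied by the user.
    path_length_cm: optical path length of the cuvette or well.
    """
    if nadh_per_ppi <= 0 or path_length_cm <= 0:
        raise ValueError("stoichiometry and path length must be positive")
    # Beer-Lambert: concentration change = absorbance change / (epsilon * l)
    nadh_rate_molar = abs(delta_a340_per_min) / (NADH_EXTINCTION_340 * path_length_cm)
    return 1e6 * nadh_rate_molar / nadh_per_ppi


# Hypothetical slope of -0.0622 A340 per minute with 1:1 coupling.
print(ppi_release_rate(-0.0622, nadh_per_ppi=1))
```

Comparing such rates for Hpg and Dpg across M3 and M3a is the kind of readout that allows activation selectivity to be assessed, provided the same cascade and path length are used throughout.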
In these experiments, it is clear that engineering hybrid modules to enable alternate module interactions through A/C interfaces is possible. Formation of 17 is unexpected: one explanation would be the unusual extension of the M4 + M5-6 + M3a assembly line at the N-terminus by an additional round of either M4 or M5 activity. Results obtained using synthetic peptides loaded on M3 together with M5 (see below and ESI Fig. S12†) support the ability of M5 to perform such extensions. However, the inability of M4 to compete with M5-6 initiation would argue against this pathway. Instead, an alternative that is supported by other experiments (see ESI Fig. S12†) is the formation of tripeptide 3 by M5-6 + M3a, followed by two rounds of N-terminal extension by M5.
Results obtained in these experiments indicate that the pathway of NRPS-mediated peptide synthesis is maintained throughout synthesis of these peptides, with alterations in sequence occurring either through alternate start modules within the pathway (M5 or M3; a general inability to start at M4) or unusual extension of the peptide due to effects on module interactions because of the modular division of the assembly line. In the case of M3a, this retains the naturally split A/C interface between M3 and M4, and shows that these interfaces must result in higher affinity than the artificially split M4 and M5 modules. This also indicates that such interaction interfaces between divided modules extend beyond isolated COM domain pairs, and suggests that further interactions – presumably mediated between A- and C-domains – are required for effective intermodule interaction.
Fig. 8 Comparative four module NRPS biosynthesis using engineered M5-6a dimodule. Peptide biosynthesis assays using engineered dimodule M5-6a, M3 and M4 (A) shows communication between M5-6a and M3, and reveals competition for NRPS-mediated peptide elongation from M4 that favours M3 over M5 (B). Peptide biosynthesis assays including ATP and the substrates of the A-domains of each extension module (M3a – 3,5-Dpg; M4, M5 – Hpg; M6 – ClBht). HRMS analysis (C) shows tetrapeptides 13a, 15a & 18a, pentapeptide 20a and hexapeptide 19a; 19a is the result of initiation from M5 but with M4 tetrapeptide extension mediated by M3 in this case (box). Peptide species indicated; for m/z data see ESI Table S3.† Double peaks caused by epimerisation of C-terminal Hpg/Dpg residues during methylamine offloading. Colour code indicates the amino acids added during the assays and matches the colours used in the LCMS traces. Domain key: A – adenylation, C – condensation, PCP – peptidyl carrier protein, E – epimerisation, COM – communication. Module colour codes: M2 – orange, M3 – yellow, M4 – green, M5 – pale blue, M6 – dark blue. |
Given these unusual findings, we further investigated hybrid modules through C-domain incorporation. We generated a novel M5 in a similar manner to that used above, where we now fused the M4 C-domain into M5 to generate M5c (C–A–PCP–E–C; Fig. 2 and 9A). To test the functionality of M5c, we utilised this module in peptide reconstitution assays together with synthetic tripeptide 10-loaded M3 and M6 (Fig. 9B). This demonstrated the successful extension of 10 to pentapeptide 21. When M3 loaded with 10 was used together only with M5c, this led to the anticipated tetrapeptide 11 as the major product. In this assay we also identified pentapeptides bearing an additional Hpg residue at the peptide N- or C-termini (ESI Fig. S12†). A control assay using M3 with M5 also showed tetrapeptide and pentapeptide formation, although in this case both were exclusively found at the N-terminus of the synthetic tripeptide (ESI Fig. S12†). Whilst such peptide extension is unexpected, it is clearly able to be an effective process under these conditions as opposed to traditional peptide biosynthesis. These results support those obtained for the M5-6 + M3a + M4 reconstitution experiments, which indicate that unusual N-terminal peptide extension is likely also proceeding in these pathways.
Fig. 9 Engineering modular interactions across A- and C-domains for modules M3 and M5. Peptide biosynthesis assays commencing from synthetic tripeptide 10-loaded on M3 with engineered M5c and M6 (A) shows effective biosynthesis of pentapeptide 21 when incubated together with ATP and the substrates of the A-domains of each extension module (M5 – Hpg; M6 – ClBht). HRMS analyses (B) show no residual 10, with conversion to tetrapeptide 11a and pentapeptide 21. Peptide species indicated; for m/z data see ESI Table S3.† Colour code indicates the amino acids added during the assays and matches the colours used in the LCMS traces. Domain key: A – adenylation, C – condensation, PCP – peptidyl carrier protein, E – epimerisation, COM – communication. Module colour codes: M3 – yellow, M4 – green, M5 – pale blue, M6 – dark blue, M7 – violet. |
These results show that the effective redesign of naturally separated modules within an NRPS can be obtained through the use of C-domain replacement, which we have demonstrated here by retaining the M3/M4 and M6/M7 interfaces across reengineered modules. This supports other studies that highlight the value of modular redesign by altering C-domains, and also shows how unexpected module interactions can be identified by the modularisation of larger NRPS multimodular proteins.
Fig. 10 Exploring the reconstitution of hybrid NRPS assembly lines from individual modules. Peptide biosynthesis assays using engineered module M6a, M3, M4, M5, M6 and M7 (A) shows a preference for the biosynthesis of peptides of 3–5 residues, and evidence for M5–M5 interactions (B). Peptide biosynthesis assays including ATP and the specific substrates of the A-domains of each extension module (M3, M7 – 3,5-Dpg; M4, M5 – Hpg; M6, M6a – ClTyr, chosen here to favour M3 activity). HRMS analysis (C) shows tripeptide 24, tetrapeptide 22, pentapeptides 23 & 25 and hexapeptide 26. Peptide species indicated; for m/z data see ESI Table S3.† Double peaks caused by epimerisation of C-terminal Hpg/Dpg residues during methylamine offloading. Colour code indicates the amino acids added during the assays and matches the colours used in the LCMS traces. Domain key: A – adenylation, C – condensation, PCP – peptidyl carrier protein, E – epimerisation, X – P450-recruitment, TE – thioesterase, COM – communication. Module colour codes: M2 – orange, M3 – yellow, M4 – green, M5 – pale blue, M6 – dark blue, M7 – violet. |
Firstly, we tested the hypothesis that elongation modules, as well as dimodular and trimodular NRPS proteins, can be converted into initiation modules. Investigating constructs derived from the M4-6 protein Tcp11 showed that this was indeed possible from M4, M5 and M6, although the abilities of dimodular constructs from Tcp11 to initiate peptide biosynthesis were generally higher than those of single modules. In contrast to other reported results,31 we identified that elongation modules containing an N-terminal C-domain were not inhibited from peptide initiation, indicating that the affinity of the C-domain acceptor site in such cases is insufficient to prevent an aminoacyl-PCP from acting as the donor substrate for a downstream C-domain.
We next endeavoured to understand how to artificially induce intermodule communication between separate protein constructs. Given the similarities in the peptide formed by M5-6 and M1-2, we explored reengineering to allow M5-6 to replace M1-2 in its interaction with M3. Here, we tested two new approaches: (1) using C/A interface reengineering between distinct modules and (2) the relocation of an E-domain to ensure communication with the downstream module. The success of these experiments, taken together with the results of our previous M6 PCP exchange experiments performed in the M6 A-PCP construct,26 indicates that both C/A and A/PCP regions are flexible in terms of domain exchange. The critical factor in these bio-combinatorial experiments is the selectivity of the acceptor site of the upstream C-domain, which in teicoplanin M6 is high due to the need for this domain to gate aminoacyl-PCP modifications by trans-acting enzymes.7,26 In order to generate communication between non-adjacent modules in the teicoplanin NRPS we identified that, in contrast to other NRPS systems,32–34 it is not sufficient to relocate/exchange compatible COM domains at the end of the module of interest. Whilst recent strategies that divide fused modules have shown compatibility with COM domain relocation,35 our results clearly show that larger adjacent domain–domain interaction surfaces are also required to ensure module–module recognition and ensure communication between M5-6 and M3. This maintenance of E- and C-domain interactions is likely the result of the tight coupling of activity of these two domains, where the E-domain is required to act prior to the acceptance of the modified peptide by the subsequent C-domain.
It also helps to explain why inactive E-domains are retained within some NRPS assembly lines, exemplified in GPAs by M3/M4 interactions in A47934 biosynthesis36 and M6/M7 interactions in complestatin biosynthesis.37,38 The utility of this approach is not limited by the requirement to maintain E-domain activity, as C-domains are not always exclusively active on the correct peptide stereochemistry as seen previously for teicoplanin M3 and M7, for example.14,25
Having seen the importance of E-domains within modular exchange strategies, we also tested domain exchange experiments investigating the linkers with the E-domains found in the M4 and M5 modules from the Tcp11 protein. Here, we demonstrated that the construct possessing the C–E domain inter-modular linker (IML) connecting M4 with M5 retains activity but not one with the IML connecting M5 with M6. This finding is in agreement with a published IML compatibility analysis, which found that successful domain exchanges require compatible linkers connecting the upstream and downstream modules of interest.39 We identified that the 26 amino acid length linkers connecting E- and C-domains – displaying more than 80% sequence identity in this case – are key to ensuring productive substrate delivery to the E-domain and C-domains. In contrast, we did not observe that E-domain activity was linked to the presence of its partner PCP domain (M5-6a vs. M5-6b constructs, ESI Fig. S10†). This disagrees with a previous study based on structural observations that PCP–E di-domains should act as a functional and conservative unit.40 In our case, an inappropriate E/C interface in these reengineered constructs, combined with the lack of stereoselectivity exhibited by the downstream module, would appear to explain these results. The lack of structural data for large NRPS biosynthetic protein constructs limits our understanding of the structural role of different NRPS inter-modular linkers that connect different modules. Recent dimodular NRPS protein X-ray structures41 as well as photocrosslinking studies42 provide us with an understanding of the flexibility of the NRPS biosynthetic machinery, which suggests that linker exchange in multi-modular NRPS proteins could well alter domain–domain motion and potentially prevent productive substrate delivery to downstream peptide processing domains.
However, further structural investigations of large multi-modular NRPS constructs are needed in order to deliver the molecular insights into the roles that such linkers play in NRPS-mediated peptide assembly.
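The linker comparison above quotes a sequence identity for two linkers of equal length. A minimal sketch of how such a figure can be computed for two ungapped, equal-length sequences is given here. The two sequences are hypothetical placeholders chosen only to illustrate the calculation and are not the teicoplanin linker sequences.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions with identical residues in two
    equal-length, ungapped sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical 26-residue placeholders differing at four positions.
linker_one = "GSAPTLEDRVKQGSAPTLEDRVKQGS"
linker_two = "GSAPTLEDRVKQGSAPTLEDRVAAAA"
print(round(percent_identity(linker_one, linker_two), 1))  # 84.6
```

For a 26-residue linker, an identity above 80% corresponds to at most five mismatched positions, so the difference in activity between the two linkers would have to arise from a small number of residues.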
Given that E-domains are optional in NRPS modules, the exploration of redesigning modules by fusion at the C/A interface shows the most general promise in redesign. In exploring this interface, we again noted the importance of maintaining the catalytic terminal domains (i.e. beyond the COM-domain) between modules split into separate proteins. In redesigning the assembly line to allow M7 to communicate with M4, we showed that the M3a construct allowed communication with M6 via the M7 C-domain and M4 via the A-domain of M3. We also note that adding the COM-domain of M3 to M7 was insufficient for communication with M4. The suitability of this redesign strategy was further supported by the ability to append the M4 C-domain directly onto M5 and produce M5c. M5c retained the interaction with M3 via the M4 C-domain and also retained M6 interaction via the natural M5/6 A/C interface. The A/C interface also provides an explanation why the excision of modules through the division of C-domains provides a path to successful modular redesign for NRPS systems.19
Whilst most of the peptide synthesis pathways established in this work conform to those anticipated based on the natural assembly line, we did identify that modularisation of the teicoplanin NRPS led to unanticipated modular interactions in some cases. Within the M5-6a + M3 + M4 system, for example, the acceptance of the M4 loaded tetrapeptide by M3 was the major pathway present, and shows an unusual M4/M3 interaction (Fig. 8). Here, it is important to note that this interaction only occurs after the activity of M3 within the anticipated M5-6a + M3 + M4 pathway, possibly due to the lack of affinity between the M4 and M5 modules that are normally fused within one protein. Perhaps most curiously of all, we have also noted the apparent N-terminal extension of M3-loaded tripeptides by M5 in a number of assays. This extension must be occurring on the PCP-bound peptide, as all these unusual peptide products were detected in their methylamide forms. Whilst in the context of the M5-6 + M3a + M4 pathway this could possibly be explained by other as yet unidentified intermodule interactions, in the experiments where synthetic tripeptide was loaded on M3 and incubated with M5 the evidence for N-terminal extension appears unambiguous. Whilst unexpected, it should be noted that there is a general lack of structural information concerning the presentation of acceptor substrates within C-domains,5,41,43–45 and that coupled with the reported flexibility of the NRPS assembly line (even within fused modules),25,41,42,46 there is no evidence that the attack of an acceptor peptide onto a donor amino acid is explicitly prevented. 
This intriguing result highlights the importance of obtaining further structural snapshots of the NRPS C-domain in relevant catalytic states, and is further underlined by the diverse range of catalytic activities performed by domains derived from C-domains.5,9,13,47,48 It also raises further questions as to the origins of the replacement of the teicoplanin M3 domain in vancomycin/pekiskomycin-type GPAs,49–51 given that the M3 module appears to be the major source of atypical module interactions in our experiments with the teicoplanin NRPS. The unexpected interactions uncovered in these experiments (M4 with M3, and M3 with M5) also show further promise for NRPS redesign, for this indicates that the alteration of the fused state of modules within an assembly line can then lead to alterations in the assembly pathway, and hence the formation of new peptide products.
Footnote
† Electronic supplementary information (ESI) available: Primer sequences and template DNA for construct design; analysis of protein purification; peptidyl-CoA analysis; analysis of biochemical assays including E-domain epimerisation experiments. See DOI: 10.1039/d0sc03483e
This journal is © The Royal Society of Chemistry 2020