Liwen Zhang† a, Chen Wang† a, Kang Chen a, Weimao Zhong b, Yuquan Xu* a and István Molnár* bc
a Biotechnology Research Institute, The Chinese Academy of Agricultural Sciences, 12 Zhongguancun South Street, Beijing 100081, P. R. China. E-mail: xuyuquan@caas.cn
b Southwest Center for Natural Products Research, University of Arizona, 250 E. Valencia Rd., Tucson, AZ 85706, USA
c VTT Technical Research Centre of Finland, P.O. Box 1000, FI-02044 VTT, Espoo, Finland. E-mail: istvan.molnar@vtt.fi
First published on 7th July 2022
Covering: 2011 up to the end of 2021.
Fungal nonribosomal peptides (NRPs) and the related polyketide–nonribosomal peptide hybrid products (PK–NRPs) are a prolific source of bioactive compounds, some of which have been developed into essential drugs. The synthesis of these complex natural products (NPs) utilizes nonribosomal peptide synthetases (NRPSs), multidomain megaenzymes that assemble specific peptide products by sequential condensation of amino acids and amino acid-like substances, independent of the ribosome. NRPSs, collaborating polyketide synthase modules, and their associated tailoring enzymes involved in product maturation represent promising targets for NP structure diversification and the generation of small molecule unnatural products (uNPs) with improved or novel bioactivities. Indeed, reprogramming of NRPSs and recruitment of novel tailoring enzymes is the strategy by which nature evolves NRP products. Recent years have witnessed rapid progress in the discovery and identification of novel NRPs and PK–NRPs, and significant advances have also been made towards the engineering of fungal NRP assembly lines to generate uNP peptides. However, the intrinsic complexities of fungal NRP and PK–NRP biosynthesis, and the large size of the NRPSs, still present formidable conceptual and technical challenges for the rational and efficient reprogramming of these pathways. This review examines key examples of successful (and some less successful) re-engineering of fungal NRPS assembly lines to inform future efforts towards generating novel, biologically active peptides and PK–NRPs.
Nature employs multiple ways to diversify NRPs, including duplication, recruitment, deletion, or mutation of the genes that encode the biosynthetic enzymes. Fittingly, NRP biosynthetic gene clusters (BGCs) are often located in dynamic regions of fungal chromosomes such as in loci near the telomere, or on accessory chromosomes.9,10 Accessory chromosomes are of particular interest due to their capacity for horizontal transfer between strains and their dynamic “crosstalk” with core chromosomes.11 The resulting structural complexity of NRPs makes them especially well-suited to interact with a variety of biological targets, including some that have been considered to be undruggable by other types of molecules.12 The structural modifications of the NRPs may also improve their properties as pharmaceutical drug candidates (e.g., N-alkylation for enhancing metabolic stability and intestinal permeability). Thus, NRPSs represent a promising target for synthetic biology to engineer the production of bioactive uNPs (unnatural products, i.e., NP analogues and de novo obtained small molecule products that are not present in nature, but produced in an amenable host organism through biosynthetic manipulation).
Fig. 2 Schematic representation of fungal NRP biosynthesis catalyzed by modular NRPSs. The amino acid substrates are activated by the adenylation domain (A) through the formation of aminoacyl–AMP intermediates. These intermediates are captured by the thiol group of the flexible 4′-phosphopantetheine arm (wavy line) tethered to a thiolation domain (T, yellow circle 1). Condensation domains (C) catalyze successive peptide bond formation between the thioester intermediates loaded onto adjacent T domains (yellow circles 2 and 3 in T domains). Initiation modules often omit a C domain. Elongation modules are followed by a termination module featuring a terminal condensation domain (CT), or much less frequently a Dieckmann cyclase domain (D, also known as R, reductive release domain) or a thioesterase domain (TE) that catalyzes the release of the peptide from the NRPS by hydrolysis or regiospecific intramolecular cyclization. In contrast, the release of the mature peptide by hydrolysis or cyclization is typically catalyzed by a terminal TE domain in bacteria. NRPS modules may also contain additional domains that modify the loaded substrate or the bound intermediates through epimerization (E), N-methylation (N-MT), oxidation etc.15 The released peptide can subsequently be modified by tailoring enzymes, further increasing structural diversity.
Fungal NRPSs assemble peptides in two modes. Linearly working NRPSs (type A) such as those that synthesize the marketed drugs cyclosporine and echinocandins use each of their constituent modules sequentially and only once during the biosynthetic cycle.16 In contrast, iterative NRPSs such as those that yield enniatin and beauvericin use some of their domains (type C) or complete modules (type B) in an iterative, recursive fashion following a biosynthetic metaprogram that is difficult to predict from sequence data alone.17
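The difference between the two assembly modes can be illustrated with a deliberately simplified sketch. It is not a biochemical model: the `Module` class, the function names, the domain strings, and the substrate labels are hypothetical placeholders introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Module:
    """One NRPS module, reduced to a domain string and the block it loads."""
    domains: str    # placeholder, e.g. "A-T" (initiation) or "C-A-T" (elongation)
    substrate: str  # placeholder label, not a real substrate assignment


def assemble_linear(modules: List[Module]) -> List[str]:
    """Type A: each module acts exactly once, in order."""
    return [m.substrate for m in modules]


def assemble_iterative(modules: List[Module], cycles: int) -> List[str]:
    """Type B-like: the complete module set is reused `cycles` times."""
    if cycles < 1:
        raise ValueError("cycles must be at least 1")
    return [m.substrate for _ in range(cycles) for m in modules]


# Hypothetical assembly lines with placeholder building blocks.
linear_line = [Module("A-T", "aa1"), Module("C-A-T", "aa2"), Module("C-A-T", "aa3")]
iterative_line = [Module("C-A-T", "hydroxy_acid"), Module("C-A-T", "amino_acid")]

print(assemble_linear(linear_line))                       # ['aa1', 'aa2', 'aa3']
print(len(assemble_iterative(iterative_line, cycles=3)))  # 6
```

In this toy model the number of cycles has to be supplied from outside: nothing in the module list encodes it. That is one way to picture why the metaprogram of an iterative NRPS is hard to read from sequence data alone.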
Chemical total synthesis of NRPs with their diverse structural modifications often faces significant challenges, including the poor availability of the constituent non-proteinogenic amino acids, inefficient regio- and stereospecific cyclization protocols, and low compatibility among different synthetic steps.18 Thus, the production of such compounds at a commercial scale typically necessitates in vivo biosynthesis (fermentation), while the generation of NRP-based uNPs with improved or novel bioactivities may require the reprogramming of the NRPS assembly line.
To date, a variety of reprogramming approaches have been demonstrated to produce NRP-based uNPs, mainly with bacterial NRPS assembly lines. These approaches include precursor-directed biosynthesis,19 mutasynthesis,20 site-directed mutagenesis of A domains,21 and combinatorial biosynthesis,22 the latter of which mainly includes module and domain exchanges. Similar approaches are gradually being adapted to fungal NRPSs. However, two major obstacles remain to be addressed. First, our mechanistic understanding of NRPSs is still incomplete. This includes lingering questions about the intrinsic programming of substrate selectivity in the A and C domains; the extrinsic programming of domain–domain interactions through protein interfaces; and the determinants of the overall metaprogram of the whole NRPS assembly line. The second obstacle is technical, and stems from the lack of convenient and high-throughput techniques to genetically manipulate a large variety of filamentous fungi, or to transfer and manipulate the enormous fungal NRPS megaenzymes in heterologous hosts.
Even with these advancements, far fewer compounds are routinely detected in fungal cultures than predicted from genome surveys, due to a lack of expression of the relevant BGCs under general laboratory conditions. Activation of these “cryptic” BGCs through the manipulation of global or local regulation cascades in the native hosts25,29,30 may remediate this situation, and lead to the production of novel natural products. Activation of multi-gene pathways in the native hosts has been greatly accelerated by the application of CRISPR/Cas9 gene editing techniques.31,38 Epigenetic remodeling of chromatin structure by the deletion/overexpression of global histone modifying proteins, both writers (which add modifications to histone tails) and erasers (which remove modifying groups), has been used extensively to activate cryptic BGCs in fungi. For example, deletion of the histone deacetylase (HDAC)-encoding gene hdaA increased expression in 42 of 68 BGCs in Calcarisporium arbuscula, revealing two novel cyclic peptides and two novel meroterpenoids;39 while deletion of the chromatin reader-encoding gene sntB in A. flavus allowed the production of depsipeptide aspergillicins.40 Similarly, deletion or overexpression of genes for other global regulators such as VeA and LaeA orthologues also affects the production of NPs.31 More examples of new NP discovery by manipulation of regulatory elements were summarized in previous reviews.10,23,25,29,41,42
Another frequently used strategy is the expression of the BGC in a heterologous chassis.43 Considering the lack of molecular tools for most native producers and the increasing availability of metagenomic sequences for pathway discovery, heterologous expression of cryptic BGCs is becoming an increasingly prominent strategy, as discussed in previous reviews.27,44–47 In addition to Saccharomyces cerevisiae,43,48–51 other chassis such as Aspergilli,45,52,53 Fusarium54 and Penicillium55 spp. have also been developed. These filamentous fungi may correctly recognize and splice introns, bypassing the need for predicting and assembling large intron-free genes as is necessary when yeast is used as the host.44,47 Filamentous fungi may also offer a larger variety of precursors available for NP biosynthesis. However, host enzymes may modify or degrade heterologous products, and the detection and purification of the engineered compounds may be more complicated in the more complex metabolomic background of filamentous fungi. Both the increasingly sophisticated manipulation of native producers, and the more facile transfer of BGCs to heterologous chassis are critical for engineering the production of nonribosomal peptide uNPs by synthetic biology.
Along with the substrate selectivity of the A domains, the specificity of the C domains was also found to play a critical role in the determination of the sequence of the NRPs. In addition to peptide bond formation, specific C domains perform β-lactam formation, dehydration, hydrolysis, cycloaddition, Pictet–Spengler cyclization, Dieckmann condensation and recruitment of auxiliary enzymes.66,67 Considering that C domains clade into various subtypes during phylogenetic analysis and that these groups may be recognized by specific conserved motifs, prediction of C domain catalytic functions from genome sequence information alone is increasingly feasible.68 Future structural, biochemical, and bioinformatics studies of fungal NRPSs will undoubtedly facilitate the use of combinatorial biosynthesis to generate novel NRPS-based uNPs in a more predictable and reliable fashion.13
In fungi, the power of chemoenzymatic synthesis was mainly utilized to identify biosynthetic pathways or to characterize the biocatalytic functions and mechanisms of NRPS domains.14,77–79 In a remarkable example, the full-length polyketide synthase–nonribosomal peptide synthetase (PKS–NRPS) hybrid enzyme ApdA (aspyridone synthetase, 439 kDa) was purified from an S. cerevisiae chassis to examine the programming rules of both the PKS and NRPS modules.49 Often, such studies with reconstituted enzymes also generate shunt products, some of which are new to nature.14 In some studies, generation of uNPs was explicitly attempted, such as when a dissected NRPS module from a PKS–NRPS of Aspergillus terreus was reconstituted in vitro and was provided with different amino acids and free thiols to produce >60 different thiopyrazine compounds,80 as discussed in Section 3.3.
Fig. 3 Domain and module architectures, substrate selectivities and product structures of representative cyclooligomer depsipeptide NRPSs.56 The table summarizes the structural features of naturally occurring fungal cyclooligomer depsipeptides. Hiv, D-hydroxyisovaleric acid; Pla, 3-phenyl-D-lactic acid; Lac, D-lactic acid.
Early attempts to engineer the production of uNP cyclooligomer depsipeptides utilized precursor-directed biosynthesis85–89 and mutasynthesis20,90,91 as detailed in a previous review.56 Thus, supplying amino acid analogues to growing cultures or purified NRPS enzymes led to the gradual replacement of the aminoacyl constituents of the peptides by other (N-methyl)-L-amino acids.86 In a pioneering attempt at precursor supply engineering combined with mutasynthesis, three genes for p-aminophenylpyruvate biosynthesis from Streptomyces venezuelae were incorporated into a chorismate mutase-deficient strain of Rosellinia sp. Using its native phenyllactate dehydrogenase enzyme, the fungus produced p-aminophenyllactate, and its oxidation product p-nitrophenyllactate. These two in situ-produced alternative building blocks were then incorporated into PF1022 analogues to replace the native D-phenyllactate constituents (Fig. 4).90
Fig. 4 PF1022 analogues where D-phenyllactate was substituted with in situ produced hydroxycarboxylic acids.90
The naturally occurring PF1022 and enniatin NRPSs were challenged with a set of >30 aliphatic or aromatic α-D-hydroxycarboxylic acids to generate new analogues.87,88 These experiments revealed a surprising promiscuity for the PF1022 synthetase, with some noncognate precursor analogues turning out to be even better substrates than the native ones (Fig. 5). In contrast, the enniatin synthetase was found to be more selective, as it did not accept aromatic α-D-hydroxycarboxylic acids.
Fig. 5 Substrate tolerance of the PF1022 and the enniatin synthetases towards α-D-hydroxycarboxylic acids. Incorporation of different synthetic 2-hydroxycarboxylic acids into: (A) enniatin; and (B) PF1022. The percentages describe the kcat,app in comparison to the natural substrate (enniatin: D-hydroxyisovaleric acid; PF1022: D-phenyllactic acid for the aromatic, and D-lactic acid for the aliphatic precursors, respectively). Figure reprinted with permission from Süssmuth et al.56
In vivo precursor-directed biosynthesis was also used to produce analogues of beauvericin that were evaluated for their cytotoxicity and directional cell migration (haptotaxis) inhibitory activity.89 Knockout of the novel gene encoding ketoisovalerate reductase (KIVR) in the producer fungus eliminated the production of the natural precursor D-hydroxyisovaleric acid, and allowed the mutasynthesis of 15 unnatural beauvericin congeners from noncognate hydroxycarboxylic acids fed to the culture. Some of these uNP beauvericin congeners displayed increased antiproliferative activity.91 Finally, 11 new beauvericin analogues were produced by in vitro chemoenzymatic and in vivo whole cell biocatalytic syntheses using either a B. bassiana ΔkivR mutant or an E. coli strain expressing the bbBeas beauvericin synthetase gene.20
Further expansion of the cyclooligomer depsipeptide structure space required advances in fungal NRPS engineering. First, whole modules were exchanged among the highly homologous cyclooligomer depsipeptide synthetases.92,93 Module swapping between the beauvericin and bassianolide synthetases showed that product formation requires the maintenance of the N-terminal linker region of the C2 domain of the second module (Fig. 6).92 Chimeric enzymes constructed from the hydroxycarboxylic acid-activating module of the PF1022 synthetase and the amino acid-activating modules of the enniatin and beauvericin synthetases were also expressed in E. coli and Aspergillus niger, and afforded new cyclodepsipeptides (Fig. 6).93 These experiments also showed that the assembly lines could be recombined using different switchover positions, thus paving the way for the combinatorial biosynthesis of fungal NRPs.92–94
Fig. 6 Hybrid cyclooligomer depsipeptide NRPSs generated by module swapping.93 Fusion of the first module of the octadepsipeptide-synthesizing PF1022 synthetase (PfM1) with the second and the third (termination) modules of the beauvericin (BeM2 and BeM3) or the enniatin (EnM2 and EnM3) synthetases generated chimeric enzymes that produced uNP hexadepsipeptides.
Nevertheless, these combinatorial engineering attempts were limited to the four known, highly homologous cyclodepsipeptide synthetases, which consequently restricted the structural diversity of the uNPs. For further structural diversification, the programming of these iterative fungal NRPSs had to be deciphered. Two groundbreaking studies in 2017 revealed that fungal cyclooligomer depsipeptide NRPSs adopt a radically different strategy compared to their bacterial counterparts.78,94 Bacterial cyclooligomer depsipeptide synthetases had previously been described to follow a parallel (recursive) logic (Fig. 7A), whereby dipeptidol monomers are assembled first from the precursors, and these monomers are then oligomerized by the C-terminal thioesterase (TE) domain that controls the ultimate chain length (degree of oligomerization).72,95 Unexpectedly, assembly of cyclooligomer depsipeptides in fungi follows a linear (looping) model (Fig. 7B), involving the stepwise incorporation of the precursors. Surprisingly, the CT and C2 domains of the fungal synthetases take turns to incorporate the two biosynthetic precursors into the growing depsipeptide chain that shuttles between the T1 and T2a/T2b domains. When the peptide chain reaches its preordained length, the CT domain releases the product by cyclization. This proposed dual function sets the terminal CT domains of cyclooligomer depsipeptide synthetases apart from “canonical” CT domains in fungal linear assembly lines that perform only the termination step.14
Fig. 7 Comparison of the assembly mechanisms of cyclooligomer depsipeptide synthetases. (A) The parallel (recursive) logic seen in bacterial enzymes;72,95 and (B) the linear (looping) assembly mechanism of fungal cyclooligomer depsipeptide synthetases.78,94
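The contrast between the two models can be made concrete by listing the enzyme-bound intermediates each one predicts. The sketch below is purely illustrative. The function names are hypothetical, the order in which the two precursors enter the chain is an assumption made for the example, and the `n_monomers` argument stands in for the chain length that the releasing domain enforces.

```python
from typing import List, Tuple

Chain = Tuple[str, ...]


def parallel_intermediates(hydroxy_acid: str, amino_acid: str, n_monomers: int) -> List[Chain]:
    """Parallel (recursive) logic: preassembled dipeptidol monomers are oligomerized."""
    monomer: Chain = (hydroxy_acid, amino_acid)
    chain: Chain = ()
    snapshots: List[Chain] = []
    for _ in range(n_monomers):
        chain = chain + monomer          # the chain grows by a whole monomer
        snapshots.append(chain)
    return snapshots


def looping_intermediates(hydroxy_acid: str, amino_acid: str, n_monomers: int) -> List[Chain]:
    """Linear (looping) logic: precursors are added one at a time."""
    chain: Chain = ()
    snapshots: List[Chain] = []
    for _ in range(n_monomers):
        for unit in (hydroxy_acid, amino_acid):  # assumed order, for illustration
            chain = chain + (unit,)              # the chain grows by one precursor
            snapshots.append(chain)
    return snapshots


# Beauvericin-like example: three Hiv / N-Me-Phe dipeptidol units.
parallel = parallel_intermediates("Hiv", "N-Me-Phe", 3)
looping = looping_intermediates("Hiv", "N-Me-Phe", 3)

print([len(c) for c in parallel])   # [2, 4, 6]
print([len(c) for c in looping])    # [1, 2, 3, 4, 5, 6]
print(parallel[-1] == looping[-1])  # True
```

Both routes end in the same linear precursor of the ring, so the models differ only in the intermediates they predict, including odd-length chains in the looping case. Calling either function with `n_monomers=4` mimics a change in chain length from a hexa- to an octadepsipeptide.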
Swapping parts of the CT domain also uncovered functional aspects of macrocyclization and ring size control.94 Thus, elongation and cyclization are competitive processes in CT domains, with macrocyclization being performed when the tail end of the linear depsipeptidyl chain is positioned in such a way that the hydroxy group is able to approach the thioester and the catalytic histidine of the CT domain. According to this model, product ring size is primarily determined by the size of the cyclization pocket, collectively formed by the CT-NTD (N-terminal subdomain) and the CT-CTD (C-terminal subdomain). Exploiting this “gauge” role of the CT domain, uNP cyclodepsipeptides with altered numbers of monomers were produced by swapping CT domains that favor different chain lengths (Fig. 8). The new tetrameric product FX1 (octa-beauvericin) was generated by substituting the CT of the beauvericin synthetase with the bassianolide CT (in the form of T2a–T2b–CT, T2b–CT or CT alone), and expressing the chimera in yeast. Utilization of A. niger as a robust heterologous host supported the production of hexa-bassianolide at a very high titer of 1.3 g L−1, and allowed the biosynthesis of new-to-nature octa-enniatin B (4 mg L−1) and octa-beauvericin (10.8 mg L−1) with chimeric synthetases (Fig. 8).94 While the selected CT domains firmly controlled the chain lengths of the products, these domains showed considerable promiscuity towards aliphatic as well as aromatic amino acid side chains (N-Me-L-Val/Leu/Ile/Phe). Importantly, the two hybrid cyclooctadepsipeptides showed up to 12-fold higher activity against the parasites Leishmania donovani and Trypanosoma cruzi compared to the reference drugs miltefosine and benznidazole, respectively. In addition, desmethyl congeners were also produced by deleting the M domains, demonstrating that the absence of structure-modulating N-methylations was tolerated by the assembly lines. 
From the twin T domains, the T2a domain alone was sufficient for complete precursor processing, though product yields were lower without the T2b domain partner.78,94 These studies provided highly important insights into the programming of the C2, the CT, the twin T2 and the M domains of cyclooligomer depsipeptide synthetases that may be exploited to design and generate novel uNPs. However, future work should define the key amino acid residues that influence chain length control in the CT domains, and those that modulate precursor permissiveness in the C2 and the CT domains.
Fig. 8 New trimeric and tetrameric uNP congeners of cyclooligomer depsipeptides generated by swapping the CT domains with or without the T2 domains.78,94
In another interesting attempt, CT domains were reassigned to fill the role of a canonical condensation domain at the junction of two consecutive modules. Thus, the beauvericin, bassianolide, and enniatin synthetases were fused head-to-tail in different combinations, with or without the T2b domain, but with the CT domain replacing the nonfunctional C1 domain of the downstream synthetase (Fig. 9).66 The C1 starter domain at the N-terminal end of the fused NRPS was retained, considering that deletion of such C1 domains was shown to have a deleterious effect on the product yield.78 Remarkably, the CT turned out to be at least partially competent to act in a linear assembly mode, catalyzing the condensation of the peptidyl intermediates to the precursor activated by the downstream A domain. This then led to the production of novel hybrid cyclooligomer depsipeptides with asymmetric structures instead of symmetric oligomers with tandem repeats of identical dipeptidols. The fusion of all three synthetases led to a cocktail of “rainbow” cyclooligomer depsipeptides (Fig. 9).66 Removal of T2b was shown to promote the linear mode of precursor incorporation: apparently, the presence of both T2 domains somehow facilitates loop-back elongation.92,94
Fig. 9 “Rainbow” cyclooligomer depsipeptide analogues generated in a mixed linear/iterative assembly mode by fusing two or three iterative NRPSs.66 The assembly line was constructed with or without the T2b domains, while the C1 domains of the downstream synthetases were replaced by CT domains. Colors are as described for Fig. 3.
Modules from NRPSs of both linear and iterative assembly modes were also successfully combined. Different swapping sites were found to be productive, provided that the integrity and specificity of the C domains were respected.82 Thus, modules 2 (processing N-methyl-L-leucine), 4 (processing N-methyl-L-valine), 7 (processing N-methylglycine) and 10 (processing N-methyl-L-leucine) of the cyclosporine synthetase (CySYN; linear assembly mode) from Tolypocladium inflatum82 could be integrated into cyclooligomer depsipeptide synthetases (iterative assembly mode). However, only those CySYN modules whose substrate specificity was identical to that of the native assembly lines were accepted. Thus, modules 2, 4 and 10 were compatible with the enniatin synthetase, while only modules 2 and 10 were functional as part of the bassianolide synthetase. These chimeric NRPSs showed higher product specificity compared to the native enzymes, but no new cyclooligomer depsipeptide was obtained. Nevertheless, with the growing number of characterized linear fungal NRPS systems and our expanding knowledge of building block recognition and processing, this approach should eventually result in the production of new-to-nature NRPs.
More generally, these experiments also helped to develop a novel strategy (the “two-face exchange system”) to fuse different NRPSs (Fig. 10).82 This strategy combines various exchange units (XU) in a way to maximize their fit with the substrate requirements of the C domains at both sides of the fusion. Thus, an appropriate A–T, C–A–T, C–A–T–C, or A–T–C domain assembly (plus any additional processing domains within these modules) is selected as an XU for a given C domain, based on the following generalized rules: (1) C domains should be presented by the upstream XU with a peptide chain that ends in a building block that is identical (or at least sterically similar) to the native donor-site substrate of the C domain; (2) the precursor offered by the downstream XU must strictly meet the C domain acceptor-site specificity requirements;84 (3) the crossover sites should be positioned right at the C-terminal end of the T domain for new T–C fusions,92,94 and at the C-terminus of the C domain66 or after the α-helical linker element84 for C–A fusions.
Fig. 10 The “two-face” exchange concept for fungal NRPS engineering (reprinted with permission from Steiniger et al.82). The donor (don) and acceptor (acc) site specificities are indicated as a capital letter on top of each C domain. Combining exchange units (XU) of the C–A–T (upper left)22,96 and the A–T–C type84 (upper right) to the so-called two-face system (bottom)82 broadens combinatorial possibilities. °XU, selectivity requirement that must be met by the adjacent XU (upstream or downstream); SXU, starter exchange unit; XUT, termination exchange unit.
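Rules (1) and (2) of the two-face exchange system amount to a compatibility check at each C domain junction. A minimal sketch of that check follows, under stated assumptions. The `CDomain` class, the `junction_allowed` function, and the single-letter substrate labels are hypothetical. The set of sterically similar donors is treated as a user-supplied input rather than something that can be computed, and rule (3), the position of the crossover site, is not modeled.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class CDomain:
    """Donor- and acceptor-site preferences of one condensation domain."""
    donor: str                                    # native donor-site substrate
    acceptor: str                                 # native acceptor-site substrate
    donor_similar: FrozenSet[str] = frozenset()   # assumed: curated by the user


def junction_allowed(upstream_last: str, c_domain: CDomain, downstream_next: str) -> bool:
    """Apply rules (1) and (2) to one fusion junction."""
    # Rule (1): the donor may be identical or at least sterically similar.
    donor_ok = upstream_last == c_domain.donor or upstream_last in c_domain.donor_similar
    # Rule (2): the acceptor must match strictly.
    acceptor_ok = downstream_next == c_domain.acceptor
    return donor_ok and acceptor_ok


# Placeholder labels in the spirit of the capital letters used in Fig. 10.
c = CDomain(donor="X", acceptor="Y", donor_similar=frozenset({"X'"}))

print(junction_allowed("X", c, "Y"))   # True: both sites match
print(junction_allowed("X'", c, "Y"))  # True: similar donor is tolerated
print(junction_allowed("X", c, "Z"))   # False: acceptor mismatch
print(junction_allowed("Z", c, "Y"))   # False: donor neither identical nor similar
```

The asymmetry in the check mirrors the rules as stated: some leeway on the donor side, none on the acceptor side.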
Fig. 11 Inactivation of biosynthetic genes results in uNP pneumocandin analogues.98,108 (A) Genetic organization of the pneumocandin biosynthetic gene cluster. The core NRPS gene and the inactivated genes are color-coded as indicated. (B) Structures of pneumocandins B0 (3) and A0 (30). (C) Structures of mutasynthetic pneumocandin analogues resulting from the inactivation of the PKS-encoding gene GLPKS4 and feeding of various fatty acid precursors. (D) Schematic illustration of the structures of uNPs resulting from the inactivation of tailoring enzyme-encoding genes. (E) Structures of pneumocandin analogues.
Insertional inactivation of two P450-type hemeprotein monooxygenase-encoding genes (GLP450-1 and GLP450-2) and one non-heme mononuclear iron oxygenase gene (GLOXY1) in Glarea lozoyensis generated 13 different pneumocandin analogues that lack one, two, three, or four hydroxyl groups from the 4R,5R-dihydroxy-ornithine and 3S,4S-dihydroxy-homotyrosine moieties of the parent hexapeptide (Fig. 11).98 Among them, pneumocandins F and G were more potent in vitro against Candida species and Aspergillus fumigatus than the principal native fermentation products, pneumocandins A0 and B0.
In a similar manner, deletion of the ecdH gene encoding a cytochrome P450 monooxygenase in Emericella rugulosa generated echinocandin analogues lacking both hydroxyl groups of the 4R,5R-dihydroxy-Orn1 moiety present in the parental molecule (Fig. 12).109 The deletion of the nonheme mononuclear iron oxygenase-encoding gene ecdG led to the production of echinocandin analogues with a non-hydroxylated homotyrosine residue. In vitro evaluation of the deshydroxy-echinocandin scaffolds in an anticandidal assay revealed up to a threefold loss of potency for the products from the ΔecdG strain. In contrast, a threefold gain of potency was seen for the ΔecdH-derived compounds, in line with prior results on deoxyechinocandin homologues.
Fig. 12 Inactivation of tailoring enzyme-encoding genes results in new uNP echinocandin analogues.109 (A) Genetic organization of the echinocandin B biosynthetic gene cluster. The core NRPS gene and the inactivated genes are color-coded as indicated. (B) Structure of echinocandin B (48). (C) Structures of echinocandin analogues. (D) Schematic illustration of the structures of echinocandin analogues.
Another example of the successful mutasynthesis of a cyclic NRP was described for the cycloaspeptides, cyclic pentapeptides with various bioactivities obtained from Isaria farinosa and various Aspergillus and Penicillium species.110 The minor analogue, cycloaspeptide E, has shown potent insecticidal activity and drawn interest from the agricultural industry. The cycloaspeptide biosynthetic gene cluster contains a five-module NRPS where each module consists of C–A–T domains only. Two of the modules (modules 3 and 5) preferentially accept and incorporate N-methylated amino acids supplied by a trans-acting N-methyltransferase. Deletion of the N-methyltransferase-encoding gene and supplementation of the fermentation medium with N-methylated amino acids resulted in the production of analogues with fluorinated amino acids at the third and/or the fifth positions of the pentapeptide scaffold.110 A similar N-methyltransferase also participates in the ditryptophenaline biosynthetic pathway,110 suggesting the possibility for an analogous engineering approach to produce uNPs in a broader subclass of N-methylated cyclic peptides featuring trans-acting N-methyltransferases.
Fig. 13 Fungal hybrid enzymes assembled from PKS and NRPS components. (A) Domain organization and reaction mechanisms of a canonical PKS–NRPS hybrid enzyme, represented by the tenellin synthetase TenS.116,117 The PKS module is composed of KS–AT–DH–MT–KR–ACP domains and uses the TenC ER domain as a trans-acting enzyme. The NRPS module has a typical domain organization of C–A–T–D. The product is released by the D domain (Dieckmann cyclase, also known as R, reductive release domain) from the NRPS T domain by forming a tetramate (or pyridone) moiety, which may be further modified by tailoring enzymes. (B) Structures of selected fungal polyketide–amino acid hybrids. The polyketide moieties and amino acid moieties are drawn in red and blue, respectively. The inactive ER domain is represented with ER0. (C) Examples of fungal NRPS–PKS hybrids with incomplete PKS or NRPS modules and iterative or noniterative assembly mechanisms. Tas1, tenuazonic acid synthetase;113 SwnK, swainsonine synthetase;114 HispS, hispidin synthetase.115
Fungal iterative PKS–NRPS hybrid systems have been successfully engineered to expand product structure diversity by combining non-cognate PKS and NRPS modules.6,77 These works demonstrated that the combination of heterologous PKS and NRPS modules, constructed by domain or module swapping116–118 or assembled as freestanding modules acting in trans,49 can generate the expected new chimeric molecules, albeit often in a reduced yield. Unfortunately, the condensation reaction between the polyketide chain and the incoming amino acid monomer failed in many of these unnatural assembly lines, especially when the similarity between the parent enzymes was not high enough.
In a pioneering study, the dissected ApdA PKS and NRPS modules of the aspyridone synthetase and the standalone ER ApdC were incubated in equimolar amounts in the presence of appropriate cofactors and building blocks to afford preaspyridone in a comparable yield to the intact ApdA.49 When the ApdA PKS module and the ApdC ER were coexpressed with the NRPS module of the cyclopiazonic acid PKS–NRPS CpaS in yeast, a new analogue was produced, but at a greatly reduced yield (2.5% relative to that of the intact ApdA and ApdC; Fig. 14).
Fig. 14 Hybrid PK–NRP production with dissected PKS–NRPS modules.49 (A) Biosynthesis of preaspyridone 69 by ApdA (PKS–NRPS) and ApdC (trans-acting ER). Preaspyridone is converted to the final natural product aspyridone A after enzymatic oxidation and ring expansion.30 (B) Dissected modules from ApdA (the PKS module) and CpaS (the NRPS module) were expressed as freestanding proteins in the same host cells of S. cerevisiae BJ5464-NpgA to afford the expected chimeric product 68.
Domain swapping between the tenellin synthetase TenS and the desmethylbassianin synthetase DmbS, which share 87% amino acid sequence identity and incorporate the same amino acid building block, was also successful: the chimeric enzyme as well as combinations of the trans-acting ER domains produced the expected tetramic acids (Fig. 15).116,117 Moreover, no reductions in the titer were observed with these chimeric systems; the yield of the TenS PKS–DmbS NRPS hybrid enzyme was even higher than that of the native TenS.116
Fig. 15 Chimeric PKS–NRPS enzymes constructed from the tenellin synthetase TenS and the desmethylbassianin synthetase DmbS biosynthesize the expected tetramic acid products.116 (A) and (B) Domain compositions of the chimeric synthetases. (C) Structures of products. The polyketide moieties and amino acid moieties are shown in red and blue, respectively. No significant variations in the total titer were observed between the native and the hybrid systems.
Encouragingly, module swapping between the PKS–NRPS CcsA from Aspergillus clavatus involved in the biosynthesis of cytochalasin E, and Syn2 from Magnaporthe oryzae (52% amino acid sequence identity) led to the biosynthesis of the novel products niduchimaeralin A and B (Fig. 16).120
Fig. 16 Chimeric PK–NRP production with hybrid enzymes constructed from the cytochalasin E synthetase CcsA119 and the related synthetase Syn2 from Magnaporthe oryzae.120 (A) Overexpression of CcsA and CcsC in A. nidulans led to the production of niduclavin. Expression of chimeric CcsA–Syn2 affords niduchimaeralin A. (B) Expression of chimeric Syn2–CcsA leads to production of niduchimaeralin B. Overexpression of Syn2 and Rap2 of M. oryzae in A. nidulans affords niduporthin. |
In an even more daring attempt, PKS and NRPS modules from five different PKS–NRPS enzymes (EqxS,121 FsdS,122 CpaS,123 PsoA,124 and LovB125) synthesizing five chemically distinct fungal NPs (equisetin, fusaridione, cyclopiazonic acid, pseurotin, and lovastatin, respectively) were swapped in 34 combinations, utilizing several fusion sites. Unfortunately, only 16 fusion enzymes yielded any product at all, and most of the combinations only afforded the polyketide portions of the expected hybrid molecules. Only the equisetin PKS (EqxS) and the fusaridione NRPS (FsdS) proved compatible to produce a single hybrid PK–NRP product, presumably due to the relatively high similarity (47% amino acid sequence identity) between EqxS and FsdS (Fig. 17).112 The PKS module of the lovastatin synthetase LovB also failed to produce hybrid compounds when paired with other NRPS modules such as that of the chaetoglobosin A synthetase CheA.118
Fig. 17 Compounds biosynthesized by chimeric PKS–NRPS enzymes, constructed by using different fusion sites.112 (A) Structures of the natural products of the five PKS–NRPS enzymes that were used for the construction of the chimeras. The names of the parental PKS–NRPS enzymes responsible for their synthesis are shown under the structures. Dihydromonacolin L is not an amidated product because its biosynthetic enzyme, LovB, possesses a truncated NRPS module. (B) Design of the fusion sites. PKS–NRPS1 (blue) represents PsoA, CpaS, LovB, EqxS or FsdS; and PKS–NRPS2 (red) represents EqxS or FsdS. (C) Compounds afforded by the different PKS/NRPS fusions. Compounds 83, 84, 87, 88 and 89 are novel (nd = no product was detected). Note: compound 83, all double bonds are trans. |
These attempts highlight the significance of considering the selectivity of the downstream NRPS module (with the C domain likely being the most demanding) for the successful biosynthesis of uNPs with PKS–NRPS chimeras. At this stage, the importance of appropriate protein–protein interactions between the PKS and the NRPS modules cannot be discounted either. The PKS and NRPS modules are connected by a relatively long inter-modular linker (70–150 amino acid residues), with no apparent sequence conservation among different synthetases. While PKS–NRPS hybrids were seen to tolerate the alteration of the sequence and the length of such linkers,120 the compatibility of the noncognate domains turned out to be far more important for a successful chimeric enzyme. The significance of selecting an appropriate ACP domain for the PKS module was demonstrated by the systematic study of fusion sites around the ACPs (Fig. 17B and C). All PKSs were active in making polyketides when fused to their own ACP domains, but only a subset of fusions with a non-cognate ACP led to polyketide products.112 Whether the polyketide products can be passed on to the downstream NRPS module largely depends on the selectivity of the C domains. For example, although the EqxS–FsdS fusion produced a chimeric product, the reciprocal fusion (FsdS–EqxS) yielded only the polyketide part assembled by the FsdS PKS module, without being fused to the serine moiety expected from the EqxS NRPS partner. The fusion product was absent even when the native EqxS ACP was present (Fig. 17C).112 Since both the FsdS and the EqxS ACPs could collaborate with the other domains of either PKS, this incompatibility was unlikely to be caused by insufficient protein–protein interactions between the two modules. Instead, the failure must have been the consequence of the EqxS C domain rejecting the intermediate synthesized by the FsdS PKS partner.112
Successful biosynthesis of uNP analogues with PKS–NRPS megasynthetases may require the identification of promiscuous NRPS catalysts. Thus, the NRPS module (NRPS325) of the only PKS–NRPS (ATEG00325) of A. terreus produces thiol-substituted pyrazines in vitro. Substrate promiscuity of NRPS325 toward different amino acids and free thiols allowed the production of 63 different thiopyrazine compounds in good yields (Fig. 18).80
Fig. 18 The NRPS module (NRPS325) of a PKS–NRPS from Aspergillus terreus was reconstituted in vitro and its substrate promiscuity towards different amino acids and free thiols was explored to produce >60 different thiopyrazine compounds.80 R1 and R2 represent various amino acid and thiol residues, respectively. |
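The way a promiscuous catalyst with two variable substrate slots spans a compound library can be illustrated with a minimal Python sketch. The substrate names below are hypothetical placeholders, and the function name is illustrative; the text here does not state how the reported thiopyrazines are distributed over the amino acid and thiol substrates, so the sketch only shows the upper bound on library size if every pairing were accepted.

```python
from itertools import product


def enumerate_library(amino_acids, thiols):
    """Return every (amino acid, thiol) pairing as a list of tuples.

    This is the theoretical maximum library; real enzymes may reject
    some pairings, so experimental libraries can be smaller.
    """
    return list(product(amino_acids, thiols))


if __name__ == "__main__":
    # Hypothetical placeholder substrates, not taken from the source.
    amino_acids = ["amino_acid_1", "amino_acid_2", "amino_acid_3"]
    thiols = ["thiol_1", "thiol_2"]

    library = enumerate_library(amino_acids, thiols)
    print(f"{len(library)} possible R1/R2 combinations")
    for r1, r2 in library:
        print(r1, r2)
```

The library size grows as the product of the two substrate set sizes, which is why promiscuity at both positions is so valuable for diversification.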
The creation of chimeric products with NRPS–PKS enzymes is not constrained by the stringent filtering function of C domains, and may take advantage of the inbuilt promiscuity of some KS and even A domains. For example, an NRPS–PKS hybrid enzyme (AnATPKS) from Aspergillus niger yields compounds where an amino acid starter is extended by a diketide, forming the α-pyrones pyrophen and campyrone B. Upon heterologous expression of the synthetase and precursor feeding, the promiscuous A domain loaded a variety of substituted phenylalanine analogues to the T domain, and these were then successfully extended by the PKS module to afford a library of substituted pyrophen analogues. These analogues may also be further processed by O-methylation and N-acetylation (Fig. 19).126
Fig. 19 The promiscuity of the A domain and the PKS module of AnATPKS allows the production of various pyrophen analogues.126 (A) Domain organization of AnATPKS and illustration of the biosynthetic pathway. (B) The structure of pyrophen, campyrone B and the analogues obtained upon precursor feeding and heterologous expression of AnATPKS and an associated O-methyltransferase (AnOMT) in A. nidulans. |
Substrate configuration | Substrate R | Product | ee (%) |
---|---|---|---|
L | H | 98 | >99 |
L | 5-OMe | 99 | >99 |
rac | 5-CN | 100 | 63 |
rac | 5-NO2 | 101 | 44 |
rac | 4-F | 102 | >99 |
rac | 5-F | 103 | >99 |
rac | 6-F | 104 | >99 |
rac | 5-Cl | 105 | >99 |
rac | 6-Cl | 106 | 98 |
rac | 5-Br | 107 | >99 |
rac | 6-Br | 108 | 75 |
rac | 7-Br | 109 | 87 |
rac | 2-Me | 110 | 1.3 |
rac | 4-Me | 111 | >99 |
rac | 5-Me | 112 | >99 |
rac | 6-Me | 113 | 97 |
rac | 7-Me | 114 | >99 |
a Reaction conditions: 1.5 mM substrate, 5 μM IvoA, 5 mM ATP, 10 mM MgCl2, 100 mM K2HPO4, pH 7.5.
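The ee (%) values listed above follow the standard definition of enantiomeric excess, ee = |major − minor| / (major + minor) × 100. As a reading aid, the following minimal Python sketch converts between ee and the share of the major enantiomer; the function names are illustrative and not taken from the source.

```python
def enantiomeric_excess(major: float, minor: float) -> float:
    """Return ee (%) from the amounts of the two enantiomers."""
    total = major + minor
    if total <= 0:
        raise ValueError("total amount must be positive")
    return 100.0 * abs(major - minor) / total


def major_fraction(ee_percent: float) -> float:
    """Return the percentage of the major enantiomer for a given ee (%)."""
    if not 0.0 <= ee_percent <= 100.0:
        raise ValueError("ee must be between 0 and 100")
    return (100.0 + ee_percent) / 2.0


if __name__ == "__main__":
    # Example ee values as they appear in the table.
    for ee in (63, 44, 98):
        print(f"ee {ee}% -> {major_fraction(ee):.1f}% major enantiomer")
```

An ee of 63%, for instance, corresponds to an 81.5 : 18.5 mixture of the two enantiomers, so even the lower ee entries still reflect a clear stereochemical preference.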
The hancockiamides are a family of N-cinnamoylated piperazines obtained from the Australian soil fungus Aspergillus hancockii. An NRPS-like enzyme (Hkm11) activates and transfers trans-cinnamate to the piperazine scaffold (Fig. 20).129 The substrate flexibility of Hkm11 allowed the production of bioisosteric thienyl and furyl analogues 121–126 by expressing the truncated biosynthetic gene cluster (hkm4–12) lacking an acetyltransferase in A. nidulans (Fig. 20B). The structural diversity of novel hancockiamides was further expanded by supplementing the culture medium with unnatural cinnamic acid congeners. Notably, compounds 115 and 118 displayed potent cytotoxicity against murine myeloma NS-1 cells (MIC 1.6 and 3.1 μg mL−1, respectively), but were inactive against neonatal foreskin normal fibroblasts, suggesting potential applications in cancer chemotherapy. Compound 117, the likely end product of the hkm pathway, also showed potent anti-germination activity against seeds of the dicot Arabidopsis thaliana (MIC of 6.3 μg mL−1), but was inactive against seeds of the monocot Eragrostis tef, suggesting that this compound may be used as a dicot-targeting herbicide lead. The herbicidal activity of compound 117 was proposed to derive from its inhibition of plant lignin biosynthesis due to its phenylpropanoid-like structure.
Fig. 20 Biosynthesis of hancockiamide analogues.129 (A) The hkm gene cluster is responsible for the production of hancockiamides. AcT, acetyltransferase; MT, methyltransferase; PAL, phenylalanine ammonia lyase. (B) Structures of hancockiamides A–F (115–120) isolated from A. hancockii, and hancockiamides G–I (121–123) and xenocockiamides A–D (124–126) obtained by heterologous expression of the truncated hkm gene cluster in A. nidulans. |
As an example of domain swapping in NRPS-like enzymes, the A domain of the BtyA enzyme responsible for butyrolactone IIa (128) production in A. terreus was replaced by an equivalent A domain from the aspulvinone E synthase ApvA, reconstituting the biosynthesis of butyrolactone IIa (128). More interestingly, replacement of the A domain of BtyA with that of the phenguignardic acid (129) synthetase PgnA afforded the new compound phenylbutyrolactone IIa (130) in A. nidulans (Fig. 21).130
Fig. 21 Domain swapping in related NRPS-like enzymes yields a uNP.130 (A) Three NRPS-like enzymes from Aspergillus terreus and their products, aspulvinone E (127), butyrolactone IIa (128), and phenguignardic acid (129).131 (B) Domain organization of chimeric NRPS-like enzymes and their products. The two ApvA/BtyA hybrids produce butyrolactone IIa (128), while both PgnA/BtyA hybrids yield the uNP phenylbutyrolactone IIa (130). |
However, it is also becoming increasingly clear that NRP assembly line engineering is a complex puzzle with multiple interrelated factors. Before we can harness the full potential of designer NRP biosynthesis, we will have to understand the reasons for drastic drops in titer, and often the complete loss of peptide production when modifying NRP assembly lines. An immediate need is to understand the intrinsic programming of A domain substrate selectivity: we need to decipher the specificity-conferring code of fungal A domains; and next, we need to derive facile engineering rules for the reprogramming of these domains in a keyhole surgery-like manner. A corollary of having a reliable fungal A domain specificity code, especially in conjunction with a more accurate prediction of C domain subtypes, would be a more precise prediction of the structures of the peptidyl backbones assembled by the huge number of NRPSs emerging from fungal genome sequencing projects.
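To make the idea of a specificity-conferring code concrete, the following minimal Python sketch reads out the residues of a query A domain at a set of positions defined on a reference A domain, given a pairwise alignment of the two. It assumes, as with the bacterial specificity code, that selectivity is summarized by a small number of binding-pocket positions. The sequences and positions shown are toy placeholders, not real A domain data, and no validated fungal code is implied.

```python
def extract_code(ref_aligned: str, query_aligned: str,
                 ref_positions: list[int]) -> str:
    """Return query residues aligned to the given reference positions.

    ref_aligned / query_aligned: the two rows of a pairwise alignment,
        equal length, with '-' marking gaps.
    ref_positions: 1-based positions in the ungapped reference sequence.
    A '-' in the result means the query has a gap at that position.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")

    wanted = set(ref_positions)
    found = {}
    ref_index = 0  # position in the ungapped reference, 1-based
    for ref_res, query_res in zip(ref_aligned, query_aligned):
        if ref_res == "-":
            continue  # insertion in the query; no reference position
        ref_index += 1
        if ref_index in wanted:
            found[ref_index] = query_res

    missing = wanted - set(found)
    if missing:
        raise ValueError(f"positions beyond reference: {sorted(missing)}")
    return "".join(found[p] for p in ref_positions)


if __name__ == "__main__":
    # Toy alignment and hypothetical pocket positions (illustration only).
    ref_row = "MKD-AWTLG"
    query_row = "MRDQAF-LG"
    print(extract_code(ref_row, query_row, [2, 5, 6]))
```

In practice, the extracted residue strings from many characterized A domains would be compared against their known substrates; the difficulty noted above is that such a mapping is not yet reliable for fungal enzymes.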
The second factor is the significant influence of the C domain on the rate and specificity of amino acid incorporation, as well as its proofreading activity for peptide intermediates.133 The C domain has a pseudo-dimeric structure with a catalytic center at the interface of two sub-domains, occupied by the two T-domain-tethered amino acids during catalysis.83 Therefore, C domains have selectivity towards both the donor (the growing peptide) and the acceptor (the activated amino acid) substrates, and these domains enact a proofreading role to ensure the synthesis of the correct peptide sequence. This selectivity restricts the amino acids that may be incorporated by engineered NRPSs to only those building blocks that are structurally similar to the native substrate. NRPS engineering strategies such as the “two-face exchange system”82 try to overcome this bottleneck by combining appropriate domain assemblies with A–T, C–A–T, C–A–T–C, or A–T domain constituents to match, as best as possible, the substrate requirements of the C domains in hybrid assembly lines. Another attempt to solve the C domain bottleneck problem is to assemble chimeric C domains from two noncognate subdomains, each with the desired specificity, to create an “exchange unit condensation domain”.83 Either way, more C domains with characterized selectivity for both of their substrates need to be catalogued to (1) provide a sufficient toolbox for sub-domain selection; and (2) further elucidate the “specificity code” of C domains, which is apparently more complex than that of the A domains.
The C domain bottleneck and the attempts to generate chimeric C domains or fused assembly lines also emphasize the need for a better understanding of the protein–protein interactions within the subdomains of the C domain, and those with the other domains of the NRPS assembly line.12 Compared to bacterial NRPSs, much less structural information is available for fungal synthetases.134 However, advances in AI-based computational biology tools such as AlphaFold2 (ref. 135) and RoseTTAFold136 will facilitate the understanding of the structural basis for the dynamic intra- and inter-domain interactions in fungal NRPSs.
Another barrier, the enormous size of the fungal megaenzymes and their encoding genes, continues to pose a challenge for NRPS engineering in both the endogenous producer and in heterologous hosts. In prokaryotes, NRPS assembly lines are usually encoded by several genes, while in fungi, the megaenzyme is typically the product of a single gene. Therefore, multiple bacterial NRPS subunits may be independently manipulated and then expressed in heterologous hosts such as E. coli. These subunits then may efficiently cooperate through N- and C-terminal docking domains (originally referred to as communication domains) composed of 15–25 amino acids, and synthesize the targeted product.137 Artificial docking domains138 and synthetic zippers139 have also been developed to engineer bacterial NRPS exchange units without natural docking domains. In contrast, fungal multimodule NRPSs are too large to be routinely transferred to model synthetic biology chassis such as S. cerevisiae or E. coli, limiting the development of fungal NRPS engineering. Currently, the successfully engineered fungal NRPSs or PKS–NRPS hybrids contain no more than three modules.
Fungal NRP assembly lines represent a promising yet underexplored target for biosynthetic engineering. While many efforts have been focused on their bacterial counterparts, it is clear that harnessing the exceptional versatility of fungal NRPS assembly lines by advanced synthetic biology tools will offer tremendous opportunities for expanding the chemical diversity of NRP-based uNPs for medical and veterinary drug discovery and crop protection applications.
Footnote |
† These authors contributed equally to this review. |
This journal is © The Royal Society of Chemistry 2023 |