Silja Mordhorst†‡ a, Fleur Ruijne‡ b, Anna L. Vagstad a, Oscar P. Kuipers* b and Jörn Piel* a
aInstitute of Microbiology, Eidgenössische Technische Hochschule (ETH), Zürich, Vladimir-Prelog-Weg 4, 8093 Zürich, Switzerland. E-mail: jpiel@ethz.ch
bDepartment of Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Nijenborgh 7, 9747, AG Groningen, The Netherlands. E-mail: o.p.kuipers@rug.nl
First published on 6th December 2022
Peptide natural products are important lead structures for human drugs and many nonribosomal peptides possess antibiotic activity. This makes them interesting targets for engineering approaches to generate peptide analogues with, for example, increased bioactivities. Nonribosomal peptides are produced by huge mega-enzyme complexes in an assembly-line like manner, and hence, these biosynthetic pathways are challenging to engineer. In the past decade, more and more structural features thought to be unique to nonribosomal peptides were found in ribosomally synthesised and posttranslationally modified peptides as well. These streamlined ribosomal pathways with modifying enzymes that are often promiscuous and with gene-encoded precursor proteins that can be modified easily, offer several advantages to produce designer peptides. This review aims to provide an overview of recent progress in this emerging research area by comparing structural features common to both nonribosomal and ribosomally synthesised and posttranslationally modified peptides in the first part and highlighting synthetic biology strategies for emulating nonribosomal peptides by ribosomal pathway engineering in the second part.
The two largest groups of peptide natural products are nonribosomal peptides (NRPs) and ribosomally synthesised and posttranslationally modified peptides (RiPPs). For NRPs, hundreds of different building blocks are available, whereas only the limited set of 20 proteinogenic amino acids (22 including selenocysteine and pyrrolysine) is available for ribosomal biosynthesis. Several biosynthetic pathways of highly modified peptide natural products remained elusive for years, since the products were assumed to be synthesised by nonribosomal peptide synthetases (NRPS), although no corresponding NRPS gene cluster could be found in the genome. Prominent examples are thiopeptides and polytheonamides, the latter of which were for a long time considered the largest NRPs known.4 However, in 2009, thiopeptides were established as ribosomal peptides,5–7 and in 2012, a ribosomal origin from an uncultivated sponge symbiont was demonstrated for polytheonamides.8 Patellamides9 are moderately cytotoxic peptides produced by various cyanobacterial strains and were assumed to be NRPS products as well. In 2005, the ribosomally encoded biosynthetic genes for patellamide A and C were first identified in Prochloron didemni.10 There are also examples of fungal natural products previously considered to be NRPS products. The omphalotins are cyclopeptides that resemble the NRPS product cyclosporine A, but are synthesised via a ribosomal pathway.11 Only recently, the decades-old mystery of the biochemical origin of victorin, a toxin from the oat pathogen Cochliobolus victoriae, was solved: the cyclic hexapeptides are also members of the RiPP superfamily.12
As their names suggest, the two biosynthetic pathways differ substantially (Fig. 1). NRPs are synthesised by large multienzyme complexes (type I NRPS) consisting of different biosynthetic modules that produce the peptide chain in an assembly-line-like manner by a conserved thiotemplate mechanism.13 Each module typically incorporates one amino acid building block into the final product. In the first step, the adenylation domain (A domain) selects the specific building block, which can be a proteinogenic or nonproteinogenic amino acid, the latter synthesised by additional enzymes. A domains normally exhibit a high substrate specificity and activate amino acids as acyl adenylates with adenosine 5′-monophosphate (AMP) from an adenosine 5′-triphosphate (ATP) donor. In the next step, the aminoacyl adenylate is transferred to the peptidyl carrier protein (PCP) by forming a covalent link with the free thiol group of the PCP-bound 4′-phosphopantetheinyl cofactor. At this stage, different modifications can be installed on the substrate by optional domains, such as epimerisation (E domain) to result in D-amino acid residues, peptide-bond N-methylation (NMT domain), cyclisation (Cy domain) to yield 5-membered heterocycles, or oxidation (Ox domain). The condensation domain (C domain) mediates peptide bond formation between the PCP-bound growing peptide chain and the newly activated amino acid leading to chain elongation. The last module contains domain(s) that catalyse release of the peptide. This is typically a C-terminal thioesterase domain (TE domain) that releases the peptide chain by hydrolysis or a (macro)cyclisation reaction, though other off-loading mechanisms are known, such as reduction by a reductase domain (Red domain). The released peptide is often further modified by so-called tailoring reactions, such as glycosylation, acylation, halogenation, or hydroxylation, to yield the mature natural product(s).
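The module-by-module logic described above can be illustrated with a deliberately simplified toy model. This is only a sketch: the `Module` class, the `run_assembly_line` function, and the three-module example are hypothetical illustrations, and the model ignores real enzymology such as the E domain equilibrium and C domain stereoselectivity.

```python
from dataclasses import dataclass


@dataclass
class Module:
    """One NRPS module; `substrate` is the building block selected by its A domain."""
    substrate: str
    optional_domains: tuple = ()  # e.g. ("E",) or ("NMT",); hypothetical encoding


def run_assembly_line(modules, release="hydrolysis"):
    """Walk the modules in order and return the released peptide as a string."""
    chain = []
    for module in modules:
        residue = module.substrate  # A domain: selection and activation
        # Loading onto the PCP is implicit; optional domains modify the tethered residue.
        if "E" in module.optional_domains:
            residue = "D-" + residue  # epimerisation
        if "NMT" in module.optional_domains:
            residue = "N-Me-" + residue  # N-methylation
        chain.append(residue)  # C domain: peptide bond formation
    linear = "-".join(chain)
    # Final module: the TE domain releases by hydrolysis or (macro)cyclisation.
    if release == "cyclisation":
        return "cyclo(" + linear + ")"
    return linear


# Hypothetical three-module assembly line.
modules = [Module("Phe", ("E",)), Module("Pro"), Module("Val")]
print(run_assembly_line(modules))
print(run_assembly_line(modules, release="cyclisation"))
```

The point of the sketch is that the order and composition of the modules fully determine the product, which is why changing a single residue requires re-engineering the protein template itself.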
In RiPP biosynthetic pathways, one or more precursor peptides and posttranslational maturation enzymes are encoded in a gene cluster. The ribosomally synthesised precursor peptide usually consists of a core peptide region and an additional N-terminal leader region and/or a C-terminal follower region. The maturation enzymes install modifications in the core peptide region prior to proteolytic release. The posttranslational modifications (PTMs) installed by RiPP maturases greatly expand the structural diversity of the canonical proteinogenic amino acids.
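The precursor architecture can be sketched as a simple data structure. The function name, the coordinate convention, and the example sequence below are assumptions made for illustration; real leader/core boundaries must be determined experimentally or from class-specific recognition motifs.

```python
from typing import NamedTuple


class Precursor(NamedTuple):
    leader: str    # N-terminal leader region (may be empty)
    core: str      # region that is modified and becomes the mature peptide
    follower: str  # C-terminal follower region (may be empty)


def split_precursor(sequence, core_start, core_end):
    """Split a precursor using 0-based, half-open core coordinates."""
    if not 0 <= core_start <= core_end <= len(sequence):
        raise ValueError("core coordinates lie outside the sequence")
    return Precursor(
        leader=sequence[:core_start],
        core=sequence[core_start:core_end],
        follower=sequence[core_end:],
    )


# Made-up sequence, not a real precursor peptide.
print(split_precursor("MKELLAACSTGCEE", 6, 12))
```

Because the core is simply a stretch of a gene-encoded sequence, changing the product amounts to editing a few codons, in contrast to the protein-level re-engineering needed for an NRPS.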
The ribosomal pathway uses mRNA as a template for peptide biosynthesis, whereas the A domains of NRPS enzyme complexes act as a template for NRP biosynthesis. These templates are substantially larger with approximately 100 kDa of protein machinery per incorporated amino acid, instead of one base triplet of mRNA in the ribosomal biosynthesis. Therefore, NRPS products are limited in size and most often consist of <10 amino acids.14 Syringopeptin 25A, a virulence factor of Pseudomonas syringae, is the largest NRPS compound described so far, comprising 25 amino acid building blocks.15,16 In contrast, the largest RiPP product described until now contains 70 amino acid residues: the head-to-tail cyclised bacteriocin uberolysin A from Streptococcus uberis.17 Bacteriocins are gene-encoded antimicrobial peptides produced by bacteria that traditionally exhibit narrow spectrum antibiotic activity against closely related strains.18
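The difference in template size can be made concrete with a rough calculation. The value of about 100 kDa of NRPS protein per residue comes from the text above; the average mass of roughly 0.33 kDa per RNA nucleotide is an assumed round figure used here for illustration only.

```python
NRPS_KDA_PER_RESIDUE = 100.0   # from the text: ~100 kDa of protein per amino acid
RNA_KDA_PER_NUCLEOTIDE = 0.33  # assumption: approximate average nucleotide mass


def nrps_template_kda(n_residues):
    """Approximate mass of NRPS machinery needed to template n residues."""
    return n_residues * NRPS_KDA_PER_RESIDUE


def mrna_template_kda(n_residues):
    """Approximate mass of the mRNA coding region (three nucleotides per residue)."""
    return n_residues * 3 * RNA_KDA_PER_NUCLEOTIDE


# Syringopeptin 25A (25 building blocks) versus a 70-residue RiPP core.
print(nrps_template_kda(25))
print(mrna_template_kda(70))
```

Under these assumptions the protein template is about two orders of magnitude heavier per residue than the mRNA template, which is consistent with the size limitation of NRPS products noted above.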
In general, NRPSs involve complex enzyme systems encoded by large gene clusters and the assembly-line production route makes them difficult targets for engineering. In comparison, RiPP pathways are genetically simple, since they encode separate enzymes and the modularity of RiPP enzymes facilitates manipulation to generate designer peptides.19
Extensive reviews on both NRPS3,16,20–23 and RiPP24,25 biosynthetic machineries have been published in the past. This review focusses on noncanonical amino acids and single structural features common to both of these peptide natural product classes and the great potential of the enzymology found in ribosomal pathways to emulate NRPS natural products.
In NRPSs, most D-amino acids are converted from L-amino acids in situ by module-embedded cis-domains. Commonly, an E domain located between the PCP and C domains racemises the PCP-bound intermediate at the α-carbon of the most-recently added monomer to generate an equilibrium of D- and L-configured products.26 The downstream C domain exhibits strict stereochemical control for the D-aminoacyl or -peptidyl variant in peptide bond formation, thus syphoning the preferred D-stereoisomer into the growing peptide chain.26 Bifunctional C domains are also known that catalyse both the condensation and epimerisation of the amino acid and tend to form distinct phylogenetic clades to standard C domains.27,28 More exotic examples of in situ mechanisms for D-amino acid incorporation in NRPS products exist. The C-terminal TE domain of the nocardicin NRPS catalyses both epimerisation of the terminal phenylglycine residue and hydrolytic release of the monolactam-containing pentapeptide product.29 In pyochelin and yersiniabactin biosynthesis, a so-called “stuffed” E domain embedded within the A domain and resembling a defunct methyltransferase enables formation of a D-thiazoline moiety.30 Alternatively, D-amino acids, typically generated by free-standing pyridoxal 5'-phosphate (PLP)-dependent racemases, can be selectively activated by A domains for direct incorporation into the peptide. This is the case for selection of D-Ala by the first module of cyclosporine A synthase31 and by the stand-alone NRPS module involved in a post-synthase tailoring modification of the ansatrienin polyketides.32
D-Amino acids have also been recognised as important structural features of many ribosomal-origin peptides, as several different enzyme classes have convergently evolved to posttranslationally epimerise RiPPs. Peptidyl isomerases have long been known from bioactive eukaryotic peptide biosynthetic processes predominantly found in venoms or nervous tissues. Although they favour production of the D-amino acid, they generally yield a mixture of D- and L-stereoisomers, relying on deprotonation/reprotonation chemical steps. Examples include D-Ala in the potent μ-opioid agonist dermorphin from frog skin,33 D-Ser in ω-agatoxin from funnel-web spider,34 and various D-residues from mollusc conotoxins,35,36 among others.37 Although the responsible isomerases catalyse similar acid-base chemistry, the enzymes characterised to date are largely unrelated structurally to each other and contain no cofactors.
Some lanthipeptides contain D-Ala or D-aminobutyric acid residues, which arise from a two-enzyme process. In the first step, Ser or Thr residues undergo a formal dehydration to dehydroalanine (DhA) or dehydrobutyrine (DhB) intermediates, respectively, by the dehydratase component of lanthionine synthetases. Although these dehydroamino acids are typically subject to conjugate addition of Cys sulfhydryls to form the namesake lanthionine thioether bridges, here, dehydration is followed by diastereoselective hydrogenation to produce the D-stereoisomers. The hydrogenation enzyme types differ and include zinc-dependent dehydrogenases termed LanJA, members of the flavin-dependent oxidoreductase superfamily termed LanJB, and the most recently discovered F420H2-dependent reductases termed LanJC from, for example, lacticin 3147,38 carnolysin,39 and lexapeptide40 biosyntheses, respectively.
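The two-step conversion has a simple mass signature: loss of water followed by addition of two hydrogens. The sketch below computes the resulting residue masses from standard monoisotopic values; the function name is a hypothetical helper. Note that mass alone cannot distinguish D- from L-configured products, so the stereochemistry must be assigned by other methods.

```python
H2O = 18.010565  # monoisotopic mass of water (Da)
H2 = 2.015650    # monoisotopic mass of two hydrogen atoms (Da)

# Monoisotopic residue masses (Da) of the two precursor residues.
RESIDUE_MASS = {"S": 87.032028, "T": 101.047679}


def reduced_residue_mass(residue):
    """Residue mass after dehydration (-H2O) and hydrogenation (+H2).

    Ser gives an Ala residue; Thr gives an aminobutyric acid residue.
    """
    return RESIDUE_MASS[residue] - H2O + H2


print(round(reduced_residue_mass("S"), 4))  # Ala residue
print(round(reduced_residue_mass("T"), 4))  # aminobutyric acid residue
```

The net change of about 15.995 Da per converted residue corresponds to the formal loss of one oxygen atom.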
Two subfamilies of radical S-adenosylmethionine (rSAM) enzymes that utilise oxygen-sensitive [4Fe–4S]-cluster cofactors and SAM co-substrates are known to install diverse D-amino acids. The cytotoxic, 48-mer polytheonamides contain 18 D-amino acids (natively at Val, Ala, Asn, Ser, and Thr residues) installed by a single enzyme, i.e., the rSAM peptide epimerase PoyD.8 Characterisation of PoyD and several homologues, as well as various mutagenesis studies, demonstrated that this enzyme class catalyses epimerisation on a wide range of peptide targets and residue types.41–46 A second class of rSAM epimerases with distinct domain architecture to the PoyD-type comes from epipeptide RiPPs first characterised in Bacillus subtilis.47 The epimerase YydG installs D-allo-Ile and D-Val in the YydF precursor protein. These posttranslational epimerisations are essential for the peptide's function to induce a major component of the bacterial cell envelope stress-response. Radical-mediated epimerisation is advantageous over racemases that use acid-base catalysis in that D-amino acids are installed irreversibly via a radical mechanism involving abstraction of the α-hydrogen of the target amino acid residue and hydrogen donation on the back-face, likely from a conserved Cys side chain.45,47,48 However, D-amino acid patterns installed by a rSAM epimerase vary greatly depending on the core sequence and it is not yet possible to predict the product stereochemistry.42
Single representatives of other epimerase types have been characterised in additional RiPPs. In the biosynthesis of the antibiotic bottromycin, Asp is converted from the L- to D-configuration by the α/β-hydrolase family member BotH.49 The C-terminal D-Tyr of the lasso peptide MS-271 was recently shown to be installed by a metal- and cofactor-independent enzyme MslH belonging to the metallo-dependent phosphatase family.50 In other RiPPs, the source of D-amino acids remains elusive. Salinipeptins, for example, contain up to nine D-amino acids at diverse residues, although the biosynthetic gene cluster lacks genes for any known class of epimerase.51 Recent work suggests that the gene sinL from the salinipeptin biosynthetic gene cluster encodes a novel peptide epimerase, which is also found in other type-A linaridins such as grisemycin and cypemycin.52 The diversity and prevalence of epimerases and the importance of D-amino acids to RiPP bioactivities suggest that additional new epimerase families will be discovered in the coming years.
Examples of macrocyclised NRPS products include the antibiotics daptomycin,55,56 gramicidin S,57 and tyrocidine A (Fig. 3A),58 the immunosuppressant cyclosporine A,59 the thrombin inhibitor cyclotheonamide,60,61 the antifungal lipopeptides echinocandin B62 and fengycin,63,64 and the biosurfactant surfactin A,65 among many others. An unusual macrocyclic structure is found in the weakly cytotoxic nostocyclopeptides, in which an imino linkage is responsible for the ring closure.66,67 Further examples of peptide-containing macrocycles are hybrid polyketide synthase (PKS)-NRP products, such as streptogramins (pristinamycin68 and virginiamycin69), rapamycin,70,71 and tricholide A and B.72
In NRPSs, the macrocyclisation reaction is usually catalysed by the C-terminal TE domain. It is included in the final module of the enzyme complex and responsible for off-loading the peptide product. For the cyclisation reaction, a nucleophilic group of the nascent peptide attacks the thioester bond to the PCP domain and releases the cyclic peptide. In case of imine formation in nostocyclopeptides, an NAD(P)H-dependent Red domain is predicted to be involved in the ring closure.67
Several RiPP natural products also contain macrocycles. Circular bacteriocins are large antimicrobial peptides forming monocyclic skeletons; prominent representatives are enterocin AS-48 from Enterococcus faecalis, uberolysin A from Streptococcus uberis, and circularin A from Clostridium beijerinckii.17,73–76 Other common examples of macrocycles are found in the class of cyanobactins including the patellamides (Fig. 3B) and trunkamides.24 The plant-derived RiPP classes of orbitides (e.g. cyclolinopeptide A24) and cyclotides (e.g. kalata B177,78) include macrocyclic structures as well. The fungal peptide omphalotin A also features a head-to-tail cyclised peptide produced by the basidiomycete Omphalotus olearius. Further fungal examples are amatoxins (α-amanitin) and phallotoxins (phalloidin).24 The telomerase inhibitor telomestatin, a potential therapeutic agent for cancer treatment, was only recently identified as a RiPP and contains eight heterocyclised amino acids in a macrocyclic ring.79
In contrast to NRPS macrocyclisation reactions, which are commonly catalysed by TE domains, RiPP macrocyclisation reactions are accomplished by diverse enzymes and some remain elusive to date. For example, the biosynthetic mechanism and the role of the leader peptide in the formation of circular bacteriocins are not yet well understood, and the nature and type of the responsible enzyme remain enigmatic. It has been suggested that no single protein, but rather a group of four to five gene products, is responsible for their circular maturation and secretion.76 In contrast, the cyanobactin macrocyclases are well-studied enzymes. They are dual-action subtilisin-like serine proteases that cleave the C-terminal follower peptide upstream of the recognition sequence and concomitantly catalyse the N-to-C cyclisation.80 The crystal structure of PatG from the patellamide biosynthetic gene cluster has been solved,81 and a homologue, PagG from prenylagaramide biosynthesis, has been used to generate a large peptide library, highlighting the biotechnological potential of cyanobactin macrocyclases.82 A different family of macrocyclases are cysteine proteases of the asparaginyl endoprotease family found in plants; these enzymes cleave C-terminally to Asn or Asp residues, generating an acyl-enzyme intermediate that is intra- or intermolecularly captured to yield cyclisation or ligation products, respectively.25 They are typically involved in the biosynthesis of cyclotides and orbitides. The macrocyclisation of fungal amatoxins and omphalotins is proposed to be catalysed by a prolyl oligopeptidase performing two reaction steps: first, cleavage at the N-terminus and second, a transpeptidation reaction to cleave the C-terminus of the core peptide and form the cyclic peptide.83,84
Besides macrocycles, many other cyclic structures are formed by both NRPS and RiPP pathways. Various NRPS intramolecular crosslinks are found in, for example, glycopeptide antibiotics, such as vancomycin, teicoplanin, and kistamycin, which contain multiple crosslinks via side chains of aromatic amino acids. These crosslinks are introduced by cytochrome P450 enzymes.85,86 In RiPP products, many different cyclisation variants have been discovered.24,25 Common crosslinks are thioether bridges, which are found in lanthipeptides,87 sactipeptides,88 and ranthipeptides.89 Disulphide bonds occur in cyclotides90 and in class I, II, and IV lasso peptides.91 In addition to disulphide bridges, lasso peptides contain a macrolactam cycle formed by an isopeptide bond between the N-terminal α-amino group and an Asp or Glu side chain.91 A Cys-Trp crosslink is found in amatoxins leading to a tryptathionine structure.25 C–C crosslinks have been discovered in streptides (Lys–Trp crosslink)25 and in ryptides (Arg–Tyr crosslink),92 and ether bonds (C–O) in rotapeptides (Thr–Gln crosslink).93 Both C–C and C–O crosslinks have recently been identified in indole- or phenyl-bridged cyclophanes.94 Most of these crosslinks are installed by rSAM enzymes, while the lanthionine and methyllanthionine bridges are introduced between dehydrated Ser or Thr residues and Cys by different classes of lanthionine synthetases.25
Lipid modifications can also be added by post-synthase tailoring reactions, as in the lipoglycopeptides. The representatives teicoplanin and A40926 isolated from Actinomadura species are clinically used against Gram-positive bacterial infections where they bind the D-Ala-D-Ala moiety of peptidoglycan with high affinity.98 Their crosslinked, heptapeptide backbones are synthesised by NRPSs in concert with trans-acting oxidases and are sequentially modified by glycosyl- and acyltransferases to attach lipid-modified amino sugars to the 4-hydroxyphenylglycine at residue 4. Other examples for post-synthase lipidation are prenylated NRPs, e.g. the antitumour fungal metabolite terrequinone A from Aspergillus sp. Terrequinone A is built by head-to-tail dimerisation of indole pyruvate followed by bisprenylation.99
N-Acetylations were identified in a number of RiPPs, including microviridin toxins from various cyanobacteria,100 the linear azol(in)e-containing peptide goadsporin (Fig. 4B),101 and the lasso peptide albusnodin,102 but until recently, longer chain-length acylations were unknown. In 2018, the founding members of the anti-staphylococcal lipolanthines, i.e., microvionin and nocavionin, containing N-terminal N,N′-bismethylated guanidine fatty acid modifications were reported (Fig. 4B).103 Genome mining and biosynthetic studies indicated that an extended family of lipolanthine-type peptides occurs in Actinobacteria. The presence of genes encoding fatty acid synthase (FAS), PKS, and NRPS biosynthetic machineries clustered adjacent to the RiPP precursor and maturase encoding genes is suggestive of diverse acyl group appendages.104 In 2020, the structures of and biosynthetic route to another type of lipidated lanthipeptide, the goadvionins, were reported.105 Goadvionin and goadpeptin analogues vary slightly in their peptide sequence (encoded by slightly different precursor cores) and in the oxidation state of the lipid moieties. They contain trimethylammonio 32-carbon, polyhydroxylated polyketide acyl groups generated by a dedicated PKS. A Gcn5-related N-acetyltransferase superfamily (GNAT)-member, GdvG, catalyses the polyketide-RiPP coupling step in an ACP-dependent manner from the PKS to the N-terminus of the modified core peptide following proteolytic cleavage of the precursor leader. Lipolanthine clusters also contain GNAT homologues thought to be responsible for diverse N-acylations.
Whereas to date most lipid modifications in lipopeptides of NRP or RiPP origins are located at the peptide N-termini, the newest family of lipopeptide RiPPs, termed selidamides, are N-acylated at Lys or ornithine side chains.106 Ornithines arise from modification of Arg by peptide arginases, as described in the ornithine section (see Section 2.8 Ornithine residues).107,108 Three founding members were recently characterised by heterologous expression of pathways in E. coli: kamptornamide and nostolysamides from cyanobacteria, and phaeornamide from the marine alphaproteobacterium Phaeobacter arcticus. Phaeornamide was furthermore directly detected in the host, indicating it is an authentic natural product. Acyltransferases belonging to the GNAT superfamily, but with distinct phylogenetic lineage to the lipolanthine and goadvionin type GNATs, are responsible for catalysing the side-chain fatty acylations. The acyl groups appear to be recruited from the primary metabolic lipid pools, and the GNATs exert tight chain-length specificity, transferring C12, C10–OH and C16(–OH) fatty acids to kamptornamide, phaeornamide, and nostolysamides, respectively. This specificity stands in contrast to many NRP products, which are generated in difficult-to-separate mixtures of differentially fatty-acylated congeners. Bioinformatic analyses of the novel subclass of GNATs and their respective biosynthetic gene clusters indicate a wide distribution of selidamides in bacteria with diverse precursor sequences. Their compact gene clusters and propensity to make a single fatty acylated product lend the selidamides well to synthetic biology strategies for peptide engineering.
In addition to N-acylations, isoprenoid alkylations of diverse carbon and heteroatom positions in linear and cyclic cyanobactin RiPPs are known.109 The responsible prenyltransferases tend to attach 5-carbon prenyl units, but 10-carbon geranyl transferases have also been characterised.110 Tyr, Ser, and Thr O-prenylations were the earliest examples and recent genome mining efforts have expanded the scope to C- and N-prenylations of Trp, bis-N-prenylation of Arg,111 C-geranylation of His,110 and even prenylations at the amide nitrogen or carboxy oxygen of linear peptide termini.112,113 Crystal structures of several peptide prenyltransferases have been solved, providing a mechanistic basis for their broad substrate specificity toward alternative peptide and even small-molecule substrates containing suitable acceptor moieties in a largely leader-independent manner.110,114 Furthermore, the preference for prenyl versus geranyl units can be toggled by a single point mutation in the prenyltransferase, as shown for interconversion between C5 and C10 isoprenoid preferences for PagF from prenylagaramide and PirF from piricyclamide RiPPs.115–117 In a distinct class of prenylated RiPPs, ComX pheromones from Bacillus subtilis are geranylated at C3 of Trp4 within a 6-residue core peptide.118
In NRPS products (and NRPS-PKS hybrids), heterocycles of the oxazole series are, for example, found in the iron-chelating siderophores vibriobactin (Fig. 5A) from Vibrio cholerae121 and mycobactin from Mycobacterium tuberculosis122 (both contain phenyloxazoline moieties), and in the antiviral thiangazole123,124 (β-methyloxazole). Two adjacent, cyclodehydrated and dehydrogenated Ser residues occur in the cytotoxic compound diazonamide A (monochloro-bioxazole) and in the antiviral hennoxazole A (4,2-linked bisoxazole). Heterocycles of the thiazole series are found in the siderophores yersiniabactin125 (two thiazolines, one thiazolidine) and pyochelin126,127 (one thiazoline, one thiazolidine), as well as in the antibiotic bacitracin128 (thiazoline). Analogously to bisoxazoles, bithiazoles can be formed by two adjacent Cys residues, as discovered in myxothiazole129 and bleomycin.130
Cyclodehydration during NRPS biosynthesis is catalysed by Cy domains, which are variants of C domains (sometimes referred to as C’). The reaction occurs during elongation of the covalently bound substrate. It is presumed that Cy domains first catalyse the peptide bond condensation reaction and subsequently cyclise the thiol or hydroxyl side chain of Cys or Ser/Thr onto the previously formed peptide bond, yielding (thio)hemiaminal intermediates that are dehydrated to form the C=N bond in the thiazoline and oxazoline rings. These may subsequently be oxidised to thiazoles and oxazoles by oxidative (Ox) domains or reduced to thiazolidines and oxazolidines by reductive (Red) domains.
Products of ribosomal pathways can also contain heterocycles derived from Cys, Ser, or Thr residues. The family of linear azol(in)e-containing peptides (LAPs) is characterised by the presence of such heterocycles.25 A prominent representative is the antibiotic microcin B17 with four oxazoles and four thiazoles, where two of each form mixed tandem pairs of biazoles. Other RiPP families with azol(in)e heterocycles include thiopeptides (e.g. thiostrepton, nosiheptide, sulfomycin), cyanobactins (e.g. patellamide, trunkamide), and bottromycins (e.g. bottromycin A1).25 A RiPP product consisting solely of heterocycles is the telomerase inhibitor telomestatin (Fig. 5B), which has only recently been identified as a ribosomal product.79
While heterocyclisation in NRPS products occurs “co-translationally”, it happens posttranslationally in RiPPs. Oxazole and thiazole installations proceed in two steps: in the first step, the oxazoline or thiazoline is installed in an ATP-dependent reaction and in the second step, the formed heterocycle is oxidised to an oxazole or thiazole moiety in a flavin mononucleotide (FMN)-dependent reaction. The whole process is catalysed by a trimeric synthetase, whereby the first step requires an association of a YcaO enzyme and an E1 ubiquitin-activating protein; these two are often fused in one protein (cyclodehydratase/heterocyclase). The second step is optional and catalysed by an FMN-dependent dehydrogenase.25,131 Distinctly, a parallel pathway towards thio(seleno)oxazole has been identified recently that relies on rSAM chemistry.132
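The two-step installation translates into a characteristic mass shift that is commonly used to count heterocycles by mass spectrometry: each cyclodehydration removes one water, and each subsequent oxidation removes two hydrogens. The following sketch computes that shift; the function name is hypothetical and the constants are standard monoisotopic masses.

```python
H2O = 18.010565  # monoisotopic mass of water (Da)
H2 = 2.015650    # monoisotopic mass of two hydrogen atoms (Da)


def azole_mass_shift(n_azolines, n_azoles):
    """Mass change (Da) of a core peptide relative to its unmodified form.

    n_azolines: rings left at the oxazoline/thiazoline stage (step 1 only).
    n_azoles:   rings oxidised further to oxazoles/thiazoles (steps 1 and 2).
    """
    if n_azolines < 0 or n_azoles < 0:
        raise ValueError("ring counts must be non-negative")
    rings = n_azolines + n_azoles
    return -(rings * H2O) - (n_azoles * H2)


# Microcin B17 carries four oxazoles and four thiazoles, i.e. eight azoles.
print(round(azole_mass_shift(0, 8), 4))
```

A single azoline thus appears as a loss of about 18.011 Da and a single azole as a loss of about 20.026 Da.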
Glycosylation is ubiquitous in secondary metabolite biosynthetic pathways. Approximately 20% of all bacterial natural products are glycosides.140 For a detailed overview of glycosylated peptides, the reader is referred to chapter 15 of the review article entitled “A comprehensive review of glycosylated bacterial natural products” by Elshahawi et al.140 In this section, we list a few examples of the most common NRPS glycopeptides and the (still) rather small number of RiPP glycopeptides.
Glycopeptide antibiotics are clinically used as cell wall biosynthesis inhibitors to fight infections by Gram-positive bacteria. Famous representatives are vancomycin from Amycolatopsis orientalis and teicoplanin from Actinoplanes teichomyceticus, which both contain O-glycosylations (Fig. 6A).141,142 Vancomycin carries two sugar moieties, i.e. D-glucose and L-vancosamine,141 whereas teicoplanin contains the carbohydrates N-acyl-β-D-glucosamine, N-acetyl-β-D-glucosamine, and D-mannose. Teicoplanin is a mixture of several compounds that differ in the fatty acyl side chain attached to β-D-glucosamine.142 Another O-glycoside natural product is the cytotoxin bleomycin, which is conjugated to a disaccharide of D-mannose and L-gulose.130 The sugar moieties can also be attached to nitrogen atoms; examples of this are the cyclic glycopeptide antibiotics mannopeptimycins. These structures contain the unusual amino acid α-amino-β-[4′-(2′-iminoimidazolidinyl)]-β-hydroxypropionic acid that is fused to D-mannose.143
In NRP biosynthesis, glycosylation is a tailoring reaction, which occurs after peptide release from the NRPS assembly line. In general, this regio- and stereospecific bond formation is catalysed by glycosyltransferases. Many biosynthetic pathways comprise several glycosyltransferases, each enzyme being specific for the glycosylated position and the transferred sugar moiety. Some glycosyltransferases are promiscuous and have been employed for in vitro glycorandomisation to generate libraries of peptides with diverse glycosylation patterns.144–146
The RiPP class of glycocins (glycosylated bacteriocins) is defined by the PTM glycosylation. Sugar moieties are found on Cys, Ser, or Thr residues yielding S- or O-glycosides.25 Representatives are, amongst others, glycocin F from Lactobacillus plantarum, sublancin 168 from Bacillus subtilis 168, and pallidocin from Aeribacillus pallidus.139,147 Sublancin 168 and pallidocin contain a single glycosylated Cys residue, whereas glycocin F contains both glycosylated Ser and Cys residues. There are also diglycosylated examples of glycocins; listeriocytocin and enterocin 96 both contain a diglucosylated Ser.148
Glycosylation has also been reported for a few other RiPP classes, such as thiopeptides (e.g., glycothiohexide α, philipimycin, nocathiacin I), lanthipeptides (NAI-112), lanthidins (cacaoidin), lasso peptides (pseudomycoidin), and sulfatyrotides (plant peptide PSY1).25,149–152 These examples are distinguished from the glycocins by the residue carrying the sugar moiety. Cacaoidin features an O-glycosylated Tyr (Fig. 6B),150 and PSY1 a glycosylated hydroxy-Pro residue.152 In the case of NAI-112, a Trp residue is glycosylated and it is the indole nitrogen that forms the glycosidic bond. This N-glycoside is to date unique to RiPPs.149 A special case is also found in pseudomycoidin; a C-terminal phosphorylated Ser is glycosylated at the phosphate group.151
The sugar moieties attached to RiPPs range from glucose (sublancin, pallidocin, listeriocytocin) and arabinose (PSY1) to rhamnose and gulose (both cacaoidin). Analogously to NRPS biosynthesis, they are generally installed by glycosyltransferases. Some of these enzymes are able to transfer the sugar moiety onto different acceptor residues; for example, the glycosyltransferases from the biosynthetic gene clusters of thurandacin and glycocin F glycosylate both Cys and Ser residues.139,153 For some glycosylation steps, the enzymes have not been characterised to date, as is the case for nocathiacin I.154 The glycosylation of the phosphate group in pseudomycoidin biosynthesis is hypothesised to be catalysed by a nucleotidyltransferase.151 The class-defining enzyme of glycocins is a SunS-like glycosyltransferase (protein family PF00535) in combination with a peptidase-containing ATP-binding cassette transporter.25
In NRP or NRP-PKS hybrid products, β-amino acids are commonly synthesised prior to incorporation and are selected by β-amino acid specific A domains. In a typical scenario, the β-amino acid is activated by the A domain and transferred to the PCP domain. The β-amino group then acts as nucleophile for condensation with the growing peptide chain thioester. Examples include the incorporation of β-Ala in bleomycin,159 (2R,3S)-MeAsp in microcystin (Fig. 7A),160 and (R)-β-Tyr in chondramides161 by modular cis-acting A domains, whereas representative discrete A domains select 3-aminononanoic acid in cremimycin,162 (S)-β-Phe in andrimid,163 and (S)-β-Tyr in the enediyne antibiotic C-1027.164,165 Many of these monomers arise from enzymatic rearrangements of standard α-amino acids. A variety of enzyme types catalyse rearrangements from α- to β-amino acids, but PLP-dependent transaminases and aminomutases are the major classes.
Fig. 7 β-Amino acid residues occur in NRP (A) and in ribosomal peptides (B). The amino acid residues of the first structure shown in panel B are expected based on the PlpA2 core peptide. The common three-letter amino acid code is employed. The formation of the highlighted β-amino acid has been confirmed experimentally.166
β-Amino acids may also be formed in situ as NRPS-bound intermediates. In the antitumour drug bleomycin, on-line condensation of the N1 of Asp with the C3 of a Ser-derived α,β-unsaturated dehydroalanine in an aza-Michael addition reaction is proposed to yield the β-amino acid residue 2,3-diaminopropionate.
Examples of β-amino acids being added as late-stage tailoring reactions in NRPs are also known, as in the biosynthesis of the anti-tuberculosis antibiotics viomycin (Fig. 7A) and capreomycins. Here, a β-lysyl-carrier protein intermediate is generated from L-Lys by a minimalistic NRPS module containing only A and PCP domains in concert with a lysine-2,3-aminomutase. A discrete C domain is proposed to append the β-Lys to the main cyclic NRP structure.167,168
β-Amino fatty acids are also common starter units or in situ generated intermediates of lipopeptides. The β-amine is typically generated in situ on the mega-synthetase-bound thioester intermediate. The canonical mechanism for this is exemplified in the microcystin-type fatty-acylated cyclic peptides.169 In these, a PKS-extended β-keto-ACP fatty acyl unit is converted by an embedded PLP-dependent transaminase domain to form the analogous β-aminoacyl-ACP. This β-aminoacyl-intermediate is then elongated by NRPS modules. The β-amino group may be involved in macrocyclisation to release the mature macrolactam peptide product from the synthetase, as occurs in microcystin170 and the related PKS-NRP mycosubtilin, among others.171
In RiPPs, only a few examples of β-amino acids are known, and the biosynthetic logic for their production is unrelated. The spliceotides are a widespread family of RiPPs with posttranslationally installed α-keto-β-amides at diverse residues within distinct classes of precursor proteins.166,172 Such ketoamides are known pharmacophores that inhibit serine and cysteine proteases and are naturally found in some NRPs, such as cyclotheonamides, cyclotheonellazoles,173 orbiculamide,174 jahnellamides,175 and calyxamides,176 suggesting that spliceotides may serve a similar function in nature. In further support of this function, three synthetically generated spliceotide representatives exhibited selective and potent inhibitory activity against pharmaceutically relevant proteases.172 Furthermore, the α-keto-moiety is uniquely electrophilic in a protein environment, facilitating chemical derivatisation at these modified sites, exemplified by the fluorescent tagging of a modified precursor protein in the original study.166 To date, the final spliceotide natural products from native hosts remain elusive, but heterologous expression in E. coli or alternative hosts has established the activity of the rSAM superfamily member enzymes, termed “spliceases”, that are responsible for catalysing α-ketoamide formation in their cognate precursor substrates. Labelling studies revealed that spliceases excise all but C1 of a Tyr residue (equivalent to removal of tyramine) within an “XYG” motif. In an unusual rearrangement of the peptide backbone, a new C–C bond is formed between C1 of the former Tyr and C1 of the upstream adjacent residue. In the originally characterised type I spliceotide systems from Pleurocapsa spp. cyanobacteria, the resulting β-amino acid (“X” from the target motif) was natively Met or Leu, but mutagenesis studies and natural precursor variants from other spliceotide pathways showed that a variety of α-keto-β-amino acids can be formed.166,172 Jahnellamides also contain α-keto-β-methionine as a direct structural parallel in an NRP product.175
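The “XYG” recognition motif described above lends itself to a simple sequence scan. The following is a minimal sketch, assuming a core peptide given in one-letter code; the function name and the example sequence are hypothetical, and a motif hit only marks a candidate site, since it says nothing about whether a given splicease would actually accept the substrate.

```python
import re

def find_xyg_sites(core: str):
    """Return (1-based position of X, residue X) for every 'XYG' motif.

    X is the residue upstream of Tyr-Gly, i.e. the residue that would
    become the alpha-keto-beta-amino acid after tyramine excision.
    """
    sites = []
    # A lookahead is used so that overlapping motifs are also reported.
    for match in re.finditer(r"(?=([A-Z])YG)", core.upper()):
        sites.append((match.start() + 1, match.group(1)))
    return sites

# Hypothetical core sequence, for illustration only.
print(find_xyg_sites("AAMYGLLYGK"))  # [(3, 'M'), (7, 'L')]
```

A Tyr-Gly pair at the very N-terminus of the supplied sequence is not reported, because no upstream residue is available to serve as “X”.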
In a second example of posttranslational installation of a β-amino acid, a cryptic methylation of the carboxylate side chain of a conserved L-Asp by an O-methyltransferase is key for its rearrangement to L-isoaspartate in the lanthipeptide OlvA(BCSA).177 The olv pathway from Streptomyces olivaceus NRRL B-3009 was studied by heterologous expression in E. coli. The former Asp side chain becomes integrated by a rearrangement into the backbone, introducing an extra methylene to form a β-amino acid with a carboxylate side chain. Mechanistic studies support that a succinimide intermediate is formed when the amide nitrogen of the neighbouring Gly cyclises onto the carbonyl of the methyl ester within the precursor core. Hydrolytic opening of the succinimide yields isoaspartate. It is unclear if the electrophilic succinimide group is the true enzymatic product, as multiple rounds of methylation occur on the same substrate residues. Homologues of the responsible O-methyltransferase are abundant in lanthipeptide pathways in Actinobacteria, which contain precursor proteins with a conserved Asp core residue, indicating that this modification is widespread.
In cinnamycin RiPP biosynthesis, the C-terminal Lys of the core peptide attacks a dehydroalanine intermediate in a β-amino acid-generating aza-Michael-addition macrocyclisation reaction.178 This general type of reaction parallels the way β-amino acids are formed in some NRPs. β-Amino acids are commonly generated by Michael addition of ammonia or an amino-acid amino group to α,β-unsaturated carboxylates that can occur on either free monomers or mega-synthetase-bound intermediates.
Fig. 8 N-Methylamides. Peptide natural products with backbone N-methylation produced by NRPS (A) and RiPP (B) pathways.
Backbone N-methylation in NRPs is commonly introduced by NMT domains, which are integrated in A domains and share sequence similarity with class I SAM-dependent methyltransferases.189 Hence, N-methylation occurs on the activated aminoacyl adenylate substrate prior to the condensation reaction.190–193
Until recently, peptide backbone N-methylation was considered an exclusive characteristic of NRPs. However, in 2017, the first posttranslationally modified α-N-methylated peptides from the mushroom Omphalotus olearius were characterised: the omphalotins. In contrast to NRPS biosynthesis, where methylation occurs before amide bond formation, in RiPP biosynthesis the N-methyl group is introduced after amide bond formation. Omphalotins contain nine methylated backbone amide nitrogen atoms out of twelve in total (Fig. 8B), which are installed by an unprecedented precursor-fused methyltransferase domain that acts iteratively on the core region in an autocatalytic fashion.194 In both NRPS and RiPP biosynthetic pathways, SAM serves as the methyl donor for N-methylation. Interestingly, the omphalotin precursors form homodimers, where each α-N-methyltransferase acts on the C-terminal core peptide of the other precursor.194,195 Upon completion of methylation, the core peptide is cleaved off and macrocyclised by OphP to yield the final omphalotin (Fig. 8B).11,194 The nematotoxic omphalotins are the founding members of the RiPP family of borosins that have recently been expanded and divided into different classes with other fungal representatives such as the gymnopeptides,196 lentinulins and dendrothelins.83 Even more recently, the N-methylated family of peptides was expanded by a model type IV borosin RiPP system: the more distantly related “split borosins”.197 As the name indicates, split borosins are characterised by a separately encoded α-N-methyltransferase and precursor peptide, with their gene clusters predominantly identified in bacteria. In vitro biochemical characterisation of the methyltransferase SonM and precursor peptide SonA from the bacterium Shewanella oneidensis showed that SonA is composed of an N-terminal five-helix-bundle (named the borosin binding domain) that serves as the leader peptide, which is connected to the core peptide by a linker.
Besides borosins, backbone N-methylation has also been found in the RiPP family of proteusins, to which the polytheonamides also belong: the FkbM-like methyltransferase EreM from the pythonamide pathway installs up to six backbone N-methylations, mainly on Val residues. EreM is the first member of the FkbM family that displays N-methyltransferase activity instead of O-methyltransferase activity.198
In in vivo co-expression experiments, EreM showed activity with a variety of core analogues.198 Likewise, for both the fused and split borosins, the N-methyltransferase showed moderate sequence flexibility for the core peptide, which is a promising rationale for engineering cleavable, N-methylated linear peptides. Initial results indicate that NRP-related cores corresponding to cyclosporine A-like and dictyonamide-like peptide sequences can be modified by the omphalotin N-methyltransferase with up to five and nine methylations, respectively.194
Fig. 9 Ornithine-containing peptides can be produced nonribosomally (A) or by a PTM of ribosomally synthesised peptides (B).
In NRPS biosynthesis, ornithine residues are installed by an ornithine-specific A domain. The formation of ornithine from Arg takes place before loading onto the NRPS module. Some modifications, such as N-acylation, occur when ornithine is bound to the PCP domain.20 Other modifications of ornithines can take place before loading onto the A domain. These are for example catalysed by δ-N-ornithine monooxygenase (hydroxylation) and δ-N-hydroxy-L-ornithine acetyltransferase (acetylation) during erythrochelin biosynthesis.200
Until recently, the incorporation of ornithine residues was thought to be limited to NRPSs. A ribosomal pathway could potentially be used to install tRNA-bound ornithine residues, but the tRNA aminoacyl ester is highly unstable since ornithine has the ideal chain length to form a δ-lactam ring. Likewise, this lactamisation reaction is the reason for the instability of ornithine-containing peptides and proteins: the formation of the lactam ring can lead to spontaneous cleavage of the backbone. This is probably the reason why many ornithine residues are further modified, e.g., by hydroxylations and/or acylations, where the induced steric hindrance would prevent intramolecular lactamisation.
It has been shown that ornithine residues can be mimicked by Lys in RiPP biosynthesis; the designed NRPS analogues of brevicidine—containing three Lys residues instead of ornithines—showed similar anti-Gram-negative bacterial activity.201 Nevertheless, ornithine residues were recently discovered in the RiPP natural product landornamide A (Fig. 9B).107 In its biosynthesis, an Arg residue is posttranslationally converted to ornithine by an enzyme termed peptide arginase. Several representatives of this new enzyme family (PF12640) that modify a variety of precursor types have been identified in bioinformatics analyses and a few representatives were characterised by coexpression experiments conducted in E. coli. In general, peptide arginases seem to be promiscuous toward diverse precursor sequences, which makes them an interesting family for synthetic biology approaches.108 A distinct lineage of peptide arginases was recently identified from the biosynthetic pathway for enteropeptins, unusual sactipeptides containing an N-methylated ornithine residue and possessing self-bacteriostatic activity isolated from the gut microbe Enterococcus cecorum.202
In several RiPP products, ornithine residues have also been identified to be further modified. Similarly to NRPS products, peptide arginase-generated ornithines can also be hydroxylated and acylated in RiPP biosynthetic pathways, such as was observed in phaeornamide that contains an ornithine residue with a C10–OH fatty acid attached to the side-chain amino group (see Section 2.3 Lipopeptides).106 Likewise, the ornithines from enteropeptins are further modified by N-methylation of the side chain and thioether crosslinking to a neighbouring Cys at the backbone α-carbon.202
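Each of the modifications discussed above changes the elemental composition of the peptide by a fixed amount, so the expected mass difference between an unmodified and a modified core can be worked out by simple arithmetic. The following is a minimal sketch under stated assumptions: the dictionary and function names are hypothetical, the elemental changes are those implied by the chemistry (Arg to ornithine removes CH2N2, N-methylation adds CH2, dehydration removes H2O, epimerisation leaves the composition unchanged), and standard monoisotopic atomic masses are used.

```python
# Monoisotopic atomic masses (Da) of the elements involved.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}

# Net elemental change per modification event (illustrative selection).
PTM_DELTA = {
    "arg_to_orn": {"C": -1, "H": -2, "N": -2},  # loss of CH2N2
    "n_methylation": {"C": 1, "H": 2},          # gain of CH2
    "dehydration": {"H": -2, "O": -1},          # loss of H2O
    "epimerisation": {},                        # no change in composition
}

def formula_mass(delta: dict) -> float:
    """Mass of a (possibly negative) elemental composition change."""
    return sum(MONO[element] * count for element, count in delta.items())

def mass_shift(counts: dict) -> float:
    """Total expected mass shift for the given number of each PTM."""
    return sum(n * formula_mass(PTM_DELTA[ptm]) for ptm, n in counts.items())

# One Arg converted to ornithine, and nine backbone N-methylations.
print(round(mass_shift({"arg_to_orn": 1}), 4))     # -42.0218
print(round(mass_shift({"n_methylation": 9}), 4))  # 126.1409
```

Because epimerisation gives no mass change, D-amino acids installed in this way cannot be counted by mass alone and require other analytical evidence.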
Over the past few years, much knowledge has become available on the minimal requirements of leader recognition for several RiPP classes.104,218–221 In a key example of precursor engineering, utilising knowledge of such minimal recognition motifs, Burkhart and co-workers combined the required leader elements from two different RiPP enzyme classes to rationally design a hybrid substrate.222 By using E. coli as the heterologous expression host, they were able to introduce a thiazoline, a lanthionine, and a D-amino acid into a single core peptide. Similarly, two thiazolines and sactionines (sulphur-to-α-carbon thioether bridges) were introduced into the same core peptide using a redesigned hybrid leader, showing the versatility of this approach (Fig. 10A).
Designer leader peptides have also been created for the in vitro flexizyme-enabled biosynthesis of macrocyclic thiopeptides. Flexizyme technology223 allows the reprogramming of codons to accept unnatural amino acids in vitro using aptamers to charge tRNAs. In this study, flexizyme-incorporated unnatural Se-phenylselenocysteine residues were chemically converted to dehydroalanine as an alternative to lanthionine dehydratases.224 The dehydroalanines act as substrates for macrocyclisation by the pyridine synthase TclM from thiocillin biosynthesis.225 Thiazoles were incorporated by the cumulative action of the cyclodehydratase LynD from aestuaramide biosynthesis and the leader-independent azoline-oxidase TbtE from thiomuracin biosynthesis.224 In this hybrid chemical and biocatalysis approach, the leader recognition sequences from the unrelated TclM and LynD RiPP enzymes were combined in a chimeric leader to facilitate the generation of thiocillin-type thiopeptide scaffolds.
As an alternative approach to hybrid leader design, Franz and Koehnke exchanged full-length leader peptides for a shared core by a sortase A-mediated transpeptidation (Fig. 10B).226 As a proof of principle for the method, the in vitro maturation of the core with enzymes from unrelated pathways by both the cyclodehydratase LynD and the ω-ester-bond installing heterocyclase MdnC from microviridin J biosynthesis was realised.226 This approach could potentially be useful for combinatorial biosynthesis using RiPP enzymes that have strict leader peptide requirements or are not compatible with hybrid leader design, as is the case for the fused borosins, where the backbone N-methylating enzyme OphMA comprises the enzyme and the substrate as parts of a single polypeptide chain (see Section 2.7 N-Methylamides).194 Notably, all of the aforementioned chimeric and hybrid leader design and leader swapping examples used natural core sequences as substrates for the PTM enzymes to date.
Non-native core peptides have also been tested as substrates for various RiPP maturases to elucidate the biosynthetic potential of RiPP maturases.194,227,228 Often these studies include peptide libraries based on the natural core peptide yielding peptides still resembling native core sequences (with multiple variations).42,229,230 For example, testing the rSAM peptide epimerase OspD in a library approach revealed that OspD is an extraordinarily promiscuous maturase that does not require specific amino acid moieties or positions, and the epimerisation pattern is dictated by the peptide sequence.42 However, until recently, few attempts have been made to use RiPP enzymes to introduce modifications into designer peptides that are completely unrelated to native core sequences. Evidence for the combinatorial use of different RiPP enzymes on such non-native substrates is even more sparse. Indeed, the design of a hybrid leader is not required in some cases, as the RiPP modification enzymes required already stem from pathways with PTM enzymes that act on the same precursor leader type and have promiscuity for diverse core sequences (Fig. 10C).44,46,231 The application of such a pathway for NRP-mimicking is exemplified by landornamide. The wild-type precursor protein OspA is natively modified by three enzymes: peptide epimerase OspD, lanthionine synthetase OspM, and peptide arginase OspR to introduce D-amino acids, lanthionine bridges, and ornithines, respectively.107 Modifications by OspR and OspD were combined on several NRP-mimicking substrates fused to the native OspA precursor leader, successfully yielding a brevicidine-mimetic with both ornithines and a D-amino acid.108
For example, the LynD heterocyclase converts multiple Cys residues to thiazolines within a peptide substrate with broad promiscuity to the Cys-flanking residues, making it an interesting enzyme to install thiazolines within non-native substrates. These heterocycles are a recurring motif in bioactive secondary metabolites (see Section 2.4 Heterocyclisations), including some important NRPs, such as bleomycin and bacitracin. Thiazoline-installing enzymes such as LynD are therefore potentially interesting for mimicking these NRPs. LynD can be activated by providing the leader peptide in trans or by using an engineered LynD enzyme with a leader fusion, thereby eliminating the requirement for designing chimeric leader peptides.236 As an added benefit, free-peptide substrates do not require proteolytic cleavage of the leader and may facilitate combinatorial biosynthesis. Similarly engineered leader-fusions were created for the LctM lantibiotic synthetase from the lacticin 481 biosynthetic pathway234 and the graspetide ATP-grasp ligases.233,237
Furthermore, some of these tailoring maturases might require the instalment of a previous PTM in order to introduce the next one, where the leader might be necessary for the initial modification (Fig. 10E). For example, in duramycin biosynthesis, a lysinoalanine crosslink is formed by DurN242 that distantly resembles the crosslink found in the nonribosomal antibiotic bacitracin. DurN does not require a leader peptide for its activity, but uses a substrate-assisted mechanism that involves a hydroxylated Asp residue that is previously installed by a different duramycin biosynthetic enzyme.242 Similarly, for the incorporation of D-amino acids by the hydrogenase LtnJ, an α,β-dehydrated Ser or Thr residue is required that can be installed by the leader-dependent NisBC machinery.246 In landornamide biosynthesis, the activity of the arginase OspR, the rSAM epimerase OspD, and the lanthionine synthetase OspM is coordinated to yield a product containing two ornithines, two D-amino acids, and two lanthionine rings. Investigations on the progression of biosynthetic events of these enzymes on the native core peptide indicated that OspD and OspM are co-dependent: neither enzyme catalysed complete conversion of the core peptide when the activity of the other enzyme was absent.107 In addition, instead of a leader peptide, modification enzymes might require a certain “recognition” motif in the core peptide or a structural feature such as a macrocycle to install their modification.166,239
Some of the most promiscuous RiPP enzymes described to date are macrocyclases, for example the macrocyclases from cyanobactin pathways, such as PatG and PagG,82,228,249,250 and the macrocyclases from some lanthipeptide pathways, such as ProcM and SyncM.251,252 The ProcM lanthionine synthetase naturally accepts 29 different substrate precursor proteins encoded in the host genome to generate a mixture of RiPP products collectively known as prochlorosins (or cyanotins as the collective name for products from similar pathways).251 Although lanthionine bridges are not present in any NRP described to date, lanthionine ring structures might provide similar proteolytic stability to that observed for lactones, lactams and other common NRP macrocycles.201 Head-to-tail cyclisation, such as the reaction catalysed by cyanobactin macrocyclases, is a more commonly observed modification in NRPs (see Section 2.2 Macrocycles) and is present in potent antibiotics such as gramicidin S and tyrocidines.228 The ProcM lanthionine synthetases and cyanobactin macrocyclases both have an exceptionally high tolerance for variable core sequences, which has been exploited to generate libraries of non-native products, including macrocycles with D-amino acids and other nonproteinogenic amino acids.229,250 Furthermore, core substrate requirements are minimal for these cyclising enzymes: lanthionine generation by ProcM requires only a Cys and a Ser/Thr to form the lanthionine bridge in the core peptide, and in cyanobactin biosynthesis, a conserved C-terminal recognition element flanking the core peptide, which is excised upon N-to-C cyclisation, and an azoline (heterocycle) or Pro at the N-terminus of the released core are all that is required.82,227 Similarly, the macrocyclisation enzymes involved in lactazole biosynthesis, as well as those catalysing macrolactamisation in lasso peptides, were shown to tolerate heavily mutated native substrates.
However, compared to ProcM and PagG, these enzymes call for more specific core motifs and thus might be less amenable to NRP mimicry.218,253,254 Nonetheless, since macrocycles are a commonly observed modification in NRPs, the broad substrate tolerance of these enzymes shows promise for generating NRP-like products.
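The minimal requirement stated above for ProcM, a Cys together with a Ser or Thr in the core peptide, can be illustrated by enumerating the residue pairs in a designed core that could in principle be joined. This is a minimal sketch, assuming one-letter sequence input; the function name and example sequence are hypothetical, and the output lists formal possibilities only, not a prediction of which rings an enzyme would actually form.

```python
def candidate_lanthionine_pairs(core: str):
    """List (Ser/Thr position, Cys position) pairs, 1-based.

    Each pair combines a dehydratable residue (Ser or Thr) with a Cys,
    the two residues required to form a (methyl)lanthionine bridge.
    """
    seq = core.upper()
    donors = [i for i, aa in enumerate(seq, start=1) if aa in "ST"]
    cysteines = [i for i, aa in enumerate(seq, start=1) if aa == "C"]
    return [(d, c) for d in donors for c in cysteines]

# Hypothetical core sequence, for illustration only.
print(candidate_lanthionine_pairs("ASACT"))  # [(2, 4), (5, 4)]
```

An empty result indicates that a designed core lacks the residues needed for any lanthionine bridge, which is a quick check before fusing it to a leader peptide.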
Further interesting promiscuous enzymes for NRP mimicry are for example D-amino acid-installing enzymes, such as LtnJ and OspD. LtnJ-type hydrogenases from lanthipeptide-type pathways convert dehydrated Ser or Thr residues to D-Ala or D-ethylglycine, respectively. The rSAM peptide epimerase OspD from the proteusin landornamide pathway has been shown to install D-amino acids in highly diverse unnatural core sequences at many different proteinogenic residues and in different patterns (see Section 2.1 D-Amino acids).42 Core-swap experiments using chimeric precursors as substrates for various OspD-like proteusin epimerases suggested that core elements rather than epimerase identity largely direct the regiospecificity of modifications to control the pattern of installed D-amino acids.43 However, the core peptide elements that direct the site(s) of epimerisation are not yet defined and are therefore difficult to predict at this time.42
Thus, an important challenge is how to guide the PTM enzymes towards modifying the desired amino acid, as the substitution of certain amino acids can alter the pattern and the extent of modification.45,255–257 For example, the NRPS product cyclosporine A has been closely mimicked by replacing the fused borosin OphMA core peptide with a cyclosporine A-resembling one. Subsequent backbone N-methylation by OphMA and macrocyclisation by OphP yielded products with methylation patterns that did not fully correspond to the pattern of the cyclosporine A natural product.194 Although the NRP has been closely mimicked structurally, the position of the methyl groups of cyclosporine A is very likely to be important for bioactivity. Nonetheless, this study shows the possibility of using OphMA for introducing N-methyl groups in non-native substrates. In addition, the cyanobactin prenyltransferases have been shown to act on a variety of linear and macrocyclic peptides with high regio- and stereospecificity. The broad substrate selectivity includes forward and reverse O-prenylation of hydroxyl and phenolic groups, as well as prenylation of C- and N-termini, Arg and Trp, thereby providing a broad RiPP toolset for peptide alkylation.111,113–116,258–260 Interestingly, the prenyltransferase PagF was also shown to prenylate a lanthipeptide, highlighting its broad substrate specificity and potential use for the generation of NRP lipopeptide mimics or to boost the bioactivity or pharmacokinetics of peptides.110,117,261
Besides natural promiscuity of many RiPP enzymes, certain limitations of their substrate selectivity can be circumvented by the search for novel analogues and by engineering the enzyme itself.82 For example, although the cyanobacterial prenyltransferases can function on a diverse range of acceptor substrates, they show strict specificity for the prenyl donor. However, this prenyl donor specificity was altered by changing a single amino acid in the isoprene-binding pocket of the PagF enzyme, highlighting the ability to further engineer RiPP enzymes by site-directed mutagenesis to increase donor promiscuity.117 Similarly, adapted substrate specificity for the NisB dehydratase was obtained.262
Altogether, it is possible to engineer RiPP combinatorial pathways that act on NRP-mimicking core substrates. This, however, requires the careful consideration and design of hybrid leaders, engineered RiPP enzymes, biosynthetic timing, and installation of other substrate requirements. Another unique challenge is to maintain efficient conversion of the non-native core by several RiPP enzymes to obtain sufficient production of the multiply modified product. Also, specific selection methods must be put into place to select products of interest from the heterogeneous product mixtures that are generated. When applied and designed successfully, RiPP combinatorial pathways can generate libraries of novel NRP-like products. The production, screening and selection methods of these libraries will be addressed in the following section.
An important factor for the successful engineering of NRP mimics is the production method. Although heterologous expression and native expression of a variety of RiPP enzymes in vivo is well-established,263–267 the in vitro generation of NRP-mimicking substance libraries by combinatorial biosynthesis offers several advantages over in vivo systems. First, as the core peptide is often not completely modified in vivo, combinatorial biosynthesis with an NRP-mimicking substrate will generate a heterogeneous mixture in which the desired peptide is produced only in low amounts. In vitro systems are advantageous as they allow more control over reaction conditions. For example, monitoring reaction times, optimising substrate-to-enzyme ratios, and exerting temporal control (e.g., the stepwise addition of enzymes) can reduce the amount of non-modified or partially modified peptide and mitigate the product mixtures caused by competition of multiple enzymes for the same functional group.238 Second, a cell-free production framework can avoid issues encountered in in vivo heterologous expression systems, such as product toxicity, poor gene expression, pathway enzyme inactivity, or other host incompatibilities. Furthermore, noncanonical amino acids present in many NRPs, including D-amino acids, β-amino acids and N-methylated amino acids (see Chapter 2), have been introduced in vitro using methods that do not rely on PTMs but rather on incorporation of the noncanonical amino acid by engineered ribozymes that charge tRNAs with noncanonical amino acid monomers.268,269 One could imagine using these technologies to generate modified precursor proteins that are further combined with RiPP modifying enzymes in vitro to generate hybrid modified peptides. One advantage of in vivo production approaches is that any biosynthetic enzyme, including difficult-to-purify enzymes, can be utilised to install PTMs.
Moreover, the most noteworthy advantage of using native or heterologous hosts is that it allows for large-scale production of RiPP analogues and the possibility for direct bioactivity screening as compared to in vitro methods.
Flexizyme-assisted biosynthesis is another in vitro approach that has been applied to study a variety of RiPP enzymes.101,224,235,253,274 Flexizymes (flexible tRNA acylation ribozymes) are aptamers developed to condense a wide array of amino acid esters with tRNAs of choice, thereby eliminating the requirement of structural similarity to the canonically loaded amino acid as is needed for aminoacyl-tRNA synthetase-based loading. Flexizymes thus facilitate reprogramming of the genetic code to express peptides containing various nonproteinogenic amino acids.223 This in vitro production platform has been exploited for the incorporation of multiple noncanonical amino acids within a thiopeptide scaffold, which would be highly challenging using solely RiPP combinatorial biosynthesis.253 In addition, chemical handles such as the noncanonical amino acid Se-phenylselenocysteine (SecPh) can be incorporated, which allow for the introduction of dehydroalanines by oxidative elimination with H2O2, thereby circumventing the use of a lanthipeptide dehydratase and consequently reducing the number of RiPP enzymes in a combinatorial pathway, which yields a less heterogeneous mixture of products.224 Though this approach can in theory generate a plenitude of NRP mimics, and is not hampered by proteolytic stability issues (as one might observe with in vivo systems and cell-free expression systems), it requires the successful purification of the RiPP enzymes of interest and generates low production titres relative to other technologies. Low production titres can pose a significant challenge, as they complicate the characterisation of the heterogeneous product mixture and demand sufficient knowledge of the promiscuity and efficiency of the RiPP enzymes used to be able to shift the reactions towards a more homogeneous product.
Combinatorial pathways from in vitro libraries have also been integrated with high-throughput screening, such as by mRNA display.275–277 mRNA display is a peptide display technology in which peptides are associated to their own mRNA during in vitro ribosomal translation via a puromycin linker. By panning this mRNA-peptide fusion library after reverse transcription of the mRNA tag against the target of interest, the peptide of interest can be isolated by several means of selection (Fig. 11A), after which sequence information of the peptide is obtained by sequencing of the mRNA tag.278 As this technique yields a linear peptide, modifications can be introduced either during in vitro ribosomal translation via the previously mentioned FIT system (flexible in vitro translation facilitated by flexizymes) or via PTMs. For the latter, mRNA display has been shown to be compatible with the lactazole and pantocin RiPP machineries.276,277 The mRNA display method was used here as a biochemical tool to study substrate recognition by the RiPP enzyme PaaA, an enzyme that catalyses the double dehydration/decarboxylation of two Glu residues in its peptide substrate PaaP to form a fused-bicyclic core. The modification of Glu allowed for the selection of modified from unmodified substrate analogues by a degradation step with the Glu-specific GluC protease and subsequent enrichment by a streptavidin pulldown assay. A broader applicability for the technique with other RiPP enzymes looks promising. Recent advances in the combination of incorporating multiple noncanonical amino acids through the FIT system with mRNA display (referred to as the RaPID system, for Random nonstandard Peptides Integrated Discovery) theoretically already allow for mimicking of natural products, though screening for bioactivity is still a barrier to be overcome.276,277,279
Together, in vitro synthesis approaches are well suited for screening a variety of RiPP peptides bearing several noncanonical amino acids, given that the RiPP enzyme(s) can be rendered active in these cell-free systems and an appropriate selection method is in place to exclude unmodified variants from the RiPP library. Interesting tailoring reactions catalysed by oxygen-sensitive rSAM enzymes might not yet be accessible on cell-free platforms or will require special reaction conditions in vitro in order to pave the way for combinatorial biosynthesis.280 In addition to making use of the RiPP biosynthetic enzymes at hand, the in vitro synthesis methods described can be combined with synthetic chemistry through hybrid chemical reactions to further tailor the final NRP-emulating product.
Besides antimicrobial discovery screening assays, phage, yeast, and bacterial display techniques have also been adapted to allow for screening of large RiPP libraries, mainly using lanthipeptide biosynthetic enzymes (LctM, HalM, ProcM, NisBC).283–285 These screening methods were largely successful due to the high conversion rates of alternative peptide sequences to cyclic structures and the likely lack of activity for unmodified linear lanthipeptides. Aside from cell surface display techniques of modified peptides, successful intracellular lanthipeptide libraries have been generated as well. Such an intracellular lanthipeptide library was generated using the previously described promiscuous ProcM enzyme in E. coli, thereby identifying a bicyclic lanthipeptide that was able to successfully disrupt the protein–protein interaction crucial for HIV budding from infected cells. This lanthipeptide, displaying novel biological activity, was selected by coupling the lanthipeptide library to a bacterial reverse two-hybrid system, where inhibition of the viral budding particle protein–protein interaction caused reporter genes for cell survival to be expressed.286 Recently another in vivo peptide selection strategy was developed to identify RiPPs that bind to a single target of interest, in this case the SARS-CoV-2 spike receptor binding domain. In this study, a precursor peptide library was fused to half of a split intein and split σ factor, with the target of interest fused to the other half of the split intein and σ factor. Upon binding of the modified precursor peptide to the target of interest, the fusion of the split inteins releases the σ factor that turns on a promoter for a selectable marker (Fig. 11B).287 In contrast to most other reporter RiPP peptide libraries that are based on lanthipeptide biosynthesis, the cyclic peptide library herein is generated through rSAM mediated biosynthesis from the freyrasin family,288 in which a thioether macrocycle is formed between a Cys and Asp or Glu. This highlights the possibility of using rSAM enzymes for the generation of in vivo RiPP libraries, which can, as described in the previous sections, mimic many different NRP features.
In conclusion, several library generation and screening techniques have been developed for different classes of RiPPs, covering peptides generated by both in vivo and in vitro techniques. Recent advances in cell-free protein synthesis and flexible in vitro translation (FIT) systems have significantly facilitated the construction of vast peptide libraries, as they allow the incorporation of several noncanonical amino acids and, depending on the desired bioactivity of the final peptide product, coupling to high-throughput assays for screening of new drug leads. Although most of these techniques have been developed using a single RiPP class, i.e., lanthipeptides, they are in theory amenable to combinatorial biosynthesis, provided that the tags for mRNA, phage and yeast display, which are often necessary for selection, are compatible with the promiscuity of the enzyme and with the leader peptide design. Interesting new lead peptides can subsequently be synthesised in vivo by combinatorial biosynthesis.
As reviewed here, mimicking NRPs with RiPP biosynthetic enzymes requires sufficient promiscuity of the RiPP enzymes to install modifications in the NRP-mimicking core template, correct biosynthetic timing, optional leader design, a choice between in vitro and in vivo expression, and coupling of synthesis to efficient screening methods. As the biochemical mechanisms and astounding biosynthetic plasticity of several RiPP pathways are being deciphered, the opportunities and limitations for the creation of rationally engineered products are starting to be discerned.260,289 Though vast libraries of NRP-mimicking peptides can be generated, the promiscuity of the RiPP enzymes involved is not limitless and remains a bottleneck. RiPP enzymes from many different classes are known to disfavour certain amino acids in the core peptide, such as the negatively charged residues Asp and Glu in lactazole and PagG macrocycle biosynthesis, or do not accept nonproteinogenic amino acids, such as D-amino acids in the case of OphMA.82,253 Moreover, despite their promiscuity, RiPP enzymes might require certain motifs, preformed secondary structures or scaffolds in their core peptide that are indispensable for, or contribute strongly to, efficient modification, thereby limiting the NRP-mimicking variants that can be synthesised in practice.166,253,290–292 Recently, a number of studies have indeed indicated that substrate recognition by RiPP enzymes resides at least partially on the core peptide.
For example, highly specific recognition of its core substrate was shown for the graspetide macrocyclase PsnB, which forms side chain-to-side chain macrolactams and macrolactones.292 Another example is the leader-independent installation of α-keto-β-amino acids in protein backbones; the rSAM splicease PlpXY recognises an 11-residue splice tag from the natural core peptide in diverse proteins.245 Mechanistic studies on ProcM also suggest that its extreme substrate tolerance results from the substrate sequence, rather than the enzyme, determining the outcome of the cyclisation process.251,256,257 Some enzymatic limitations can be overcome by engineering individual RiPP enzymes or by genome mining for more promiscuous variants. Another possible bottleneck for NRP mimicking is that, in novel combinatorial biosynthesis, a desired order or combination of modifications might still not be possible even after optimisation of the new biosynthetic pathway. In certain RiPP pathways, the enzymes are proposed to communicate with each other via the substrate or to require a certain cooperativity, and interactions between installed PTMs influence the assembly process considerably.242,293,294
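The substrate constraints discussed above can be treated as a simple pre-filter when a core-peptide library is designed. The following is a minimal sketch in Python, not a published workflow: the core template `CAAGSAAC`, the two randomised positions and the helper names are hypothetical and chosen only for illustration, and the only rule encoded is the disfavouring of Asp and Glu noted above for some enzymes.

```python
from itertools import product

# One-letter codes of the 20 proteinogenic amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def enumerate_library(template, positions, alphabet=AMINO_ACIDS):
    """Yield every variant of `template` with the given positions randomised."""
    for combo in product(alphabet, repeat=len(positions)):
        seq = list(template)
        for pos, aa in zip(positions, combo):
            seq[pos] = aa
        yield "".join(seq)


def passes_filter(core, disfavoured="DE"):
    """Return True if the core contains none of the disfavoured residues."""
    return not any(aa in disfavoured for aa in core)


# Hypothetical core template and randomised positions (0-based indices).
TEMPLATE = "CAAGSAAC"
POSITIONS = [1, 2]

library = list(enumerate_library(TEMPLATE, POSITIONS))
accepted = [seq for seq in library if passes_filter(seq)]

print(f"{len(library)} variants designed, {len(accepted)} without Asp/Glu")
```

In practice, the filter would have to be replaced by whatever sequence rules have been established experimentally for the enzyme in question; the point of the sketch is only that such rules shrink the accessible library before any screening takes place.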
This brings us to another open question, namely how much flexibility NRP scaffolds themselves tolerate while still exerting their biological action. When using combinatorial biosynthesis, the chance is high that the NRP “scaffold” will need to be adapted in order to ensure modification by two different enzymes. Even though the aim is not to mirror the NRP but to mimic it structurally, will structural mimicking of the NRP be sufficient to create novel drug leads? Structure-activity relationship (SAR) studies have, for example, shown that replacing the thiazolidine ring in the novel anti-MRSA peptide lugdunin with several similar structures diminished the activity of the peptide, while the activity was retained for its enantiomer.295 Further important questions for creating NRP analogues are whether all the biosynthetic tools are at hand to mimic the NRP pharmacophore, whether these RiPP enzymes are compatible with other biosynthetic enzymes (when necessary), and whether appropriate screening methods are available. Moreover, are NRP analogues possibly more readily attainable through chemo-enzymatic methods? The choice of the NRP to mimic and knowledge from SAR studies on motifs essential for bioactivity will therefore have to be considered carefully for the successful rational design of these new-to-nature peptides. Notably, the RaPID system, which integrates genetic code reprogramming by flexizymes with mRNA display technology, is advancing rapidly. The specific introduction of several noncanonical amino acids and ring structures that structurally mimic NRPs, with or without the addition of RiPP enzymes, is already possible.279 If sufficient amounts of peptides and a suitable method to screen for bioactive peptides are available, this in vitro chemo-enzymatic technology is another viable strategy for the generation of novel NRP-emulating drug leads.
Despite these challenges, some newly discovered RiPP enzymes are of great interest for creating NRP mimics, as their modifications are challenging to install by synthetic approaches or are essential for the bioactivity of certain NRPs. Examples are the enzymes of the hybrid PKS-RiPPs lipolanthine and goadvionin for creating lipopeptides (lipidation being essential for many NRP lipopeptide antibiotics), glycosyltransferases for creating glycosylated peptides, which are abundant among NRPS products, the bicyclic structures in cittilins, which resemble the glycopeptide antibiotics vancomycin and teicoplanin, and backbone N-methylation to improve the pharmacological properties of the peptide.103,105,150,296–298 However, the biotechnological applicability of many of these enzymes for NRP mimicking remains largely unexplored to date, as the enzymes have not yet been characterised in detail.
All in all, the generation of novel NRP mimics by RiPP enzymes is still at an early stage of development, although a few proofs of principle for the partial structural mimicking of brevicidine, daptomycin, gramicidin S, and cyclosporine A have already been demonstrated.108,194,201 Coupling these early NRP engineering efforts to combinatorial biosynthesis and library screening has the potential to yield many interesting novel drug lead scaffolds.
Footnotes
† Present address: Pharmaceutical Institute, Department of Pharmaceutical Biology, University of Tübingen, Auf der Morgenstelle 8, 72076 Tübingen, Germany.
‡ Equal contribution.
This journal is © The Royal Society of Chemistry 2023