Tom H. Eyles, Natalia M. Vior, Rodney Lacret and Andrew W. Truman*
Department of Molecular Microbiology, John Innes Centre, Norwich Research Park, Norwich, NR4 7UH, UK. E-mail: andrew.truman@jic.ac.uk
First published on 19th April 2021
Thiostreptamide S4 is a thioamitide, a family of promising antitumour ribosomally synthesised and post-translationally modified peptides (RiPPs). The thioamitides are one of the most structurally complex RiPP families, yet very few thioamitide biosynthetic steps have been elucidated, even though the biosynthetic gene clusters (BGCs) of multiple thioamitides have been identified. We hypothesised that engineering the thiostreptamide S4 BGC in a heterologous host could provide insights into its biosynthesis when coupled with untargeted metabolomics and targeted mutations of the precursor peptide. Modified BGCs were constructed, and in-depth metabolomics enabled a detailed understanding of the biosynthetic pathway to thiostreptamide S4, including the identification of a protein critical for amino acid dehydration that has homology to HopA1, an effector protein used by a plant pathogen to aid infection. We use this biosynthetic understanding to bioinformatically identify diverse RiPP-like BGCs, paving the way for future RiPP discovery and engineering.
1 features multiple post-translational modifications that are common to most thioamitides but are otherwise rare in nature, including four thioamide bonds, a 2-aminovinyl-3-methyl-cysteine (AviMeCys) macrocycle,8 histidine bis-N-methylation, histidine β-hydroxylation, and an N-terminal pyruvyl group. 1 also features tyrosine O-methylation, which is not found in other thioamitides.5,9 These features are interesting due to their structural and biosynthetic rarity, along with the possible influence they have on bioactivity. For example, histidine bis-N-methylation is a modification not found in other RiPPs, and the installation of multiple thioamide bonds is extremely rare.10 However, there was limited data on thioamitide biosynthesis at the onset of this study.11,12 We hypothesised that understanding thiostreptamide S4 biosynthesis would reveal new biosynthetic machinery involved in RiPP maturation, which could inform future pathway engineering and genome mining for new RiPPs with related biosynthetic machinery. Notably, thioamitide biosynthesis is predicted to require lanthipeptide-like Ser/Thr dehydrations, but homologues of the Lan proteins that usually catalyse this step are not encoded in thioamitide BGCs.2
Gene deletions are commonly used to understand natural product biosynthesis, as they can lead to the production of intermediates and therefore reveal the role of a gene, especially as there can be substantial challenges in the in vitro reconstitution of complex multi-step pathways from Actinobacteria. However, there are significant difficulties in using gene deletions to understand RiPP biosynthesis.13 If the deleted biosynthetic gene produces a protein that acts early in a biosynthetic pathway, then the resulting precursor peptide intermediate is often unstructured and minimally modified. These peptides can be readily digested by endogenous proteases and acetylated endogenously.13 Therefore, the identification of intermediates and shunt metabolites can be very challenging, especially if these issues are combined with low productivity in complex media (Fig. S1†).
Here, we use a combination of heterologous expression, gene deletions, untargeted metabolomics, and yeast-mediated core peptide engineering to understand the biosynthesis of thiostreptamide S4. This provides a genetic basis for almost every post-translational modification in thioamitide biosynthesis. In addition, the identification of genes associated with Ser/Thr dehydration enables the discovery of diverse RiPP-like BGCs across multiple bacterial taxa.
All genes in the predicted tsa BGC were independently deleted in pCAPtsa using PCR targeting, replacing most of the target gene with an in-frame 81 bp scar sequence while retaining the original start and stop codons.17 The up- and downstream regions described above were also deleted. These mutated plasmids were then expressed in S. coelicolor M1146. This revealed that tsaA–tsaJ and tsaMT were required for the biosynthesis of 1, whereas the molecule was still produced in ΔtsaK, ΔtsaL and ΔtsaMO (Fig. S2B; † for simplicity, each S. coelicolor M1146 strain harbouring a mutated version of pCAPtsa will herein be referred to by the mutation only). Production of 1 following deletion of tsaK was surprising given that this gene is conserved amongst thioamitide BGCs5 and encodes a cysteine protease that we predicted was involved in leader peptide removal. It is possible that native peptidases from the heterologous host, S. coelicolor M1146, can complement this deletion as there was a small drop in the production of 1 (Fig. S3†). TsaK may only be necessary when 1 is produced in the native host. Similarly, tsaL-like genes are conserved amongst almost all thioamitide BGCs, although there is no clear catalytic domain in TsaL (Table S1†). This analysis also demonstrated that tsaMO is not required for the biosynthesis of 1.
The deletion of the trio of upstream genes, tsa-3, tsa-2, and tsa-1, caused a significant drop in production (Fig. S3†), whereas production was unaffected by deletion of tsa+1, tsa+2, and tsa+3. With the exception of ΔtsaA, each deletion that abolished production was successfully complemented (Fig. S2B†), which ensured there were no unwanted polar effects of each gene deletion. Genetic complementation experiments were carried out by expressing the gene from the strong constitutive promoter PermE*18 in pIJ10257,19 which integrates into a φBT1 site in the S. coelicolor M1146 genome. This enabled us to determine the correct start codon of each gene (Table S3, Fig. S4†), which revealed that there are two series of genes with overlapping start and stop codons within the BGC, tsaC-G and tsaH-MT, with an untranslated 28 bp region between tsaG and tsaH.
Fig. 2 Extracted ion chromatograms (EICs) normalised for intensity showing the varied methylation and hydroxylation patterns of thiostreptamide-like molecules produced in S. coelicolor M1146 expressing the WT, ΔtsaG, ΔtsaJ and ΔtsaMT BGCs. Structures were inferred using detailed MS/MS analysis (Fig. S5†), which is consistent with 2–5 featuring full modifications on the N-terminal linear peptide portion (as in 1). The structure of the macrocycle from each metabolite is shown above the relevant traces; in each case the rest of the molecule is identical to 1. The 3* label indicates the +2 isotope peak of 3.
Deletion of tsaMT, which encodes a class I SAM-dependent methyltransferase, resulted in the loss of 1 and 3 (Fig. 2). Instead, 2 was produced, which lacks the tyrosine methylation but is otherwise identical to 1, therefore confirming that TsaMT is the protein responsible for this modification. Tyrosine O-methylation is a rare modification but is observed prior to assembly line biosynthesis of the fungal phytotoxin pyrichalasin H.20 A retro-aldol MS/MS fragmentation that provides a loss of m/z 125.07 is consistent with histidine hydroxylation and bis-N-methylation in 2 (Fig. S5†). This shows that the tyrosine methylation is not required for the histidine hydroxylase and methyltransferase to function. ΔtsaJ produces 3 and 4 (Fig. 2), which both lack the histidine hydroxylation. This indicates that TsaJ, a non-heme Fe(II) and α-ketoglutarate dependent dioxygenase, is responsible for histidine hydroxylation. The production of 3 shows that histidine hydroxylation is not a prerequisite for tyrosine methylation or histidine bis-N-methylation.
Deletion of tsaG, which encodes a SAM-dependent methyltransferase, abolished production of 1 and instead led to production of 5, a version of 1 that lacks all modifications to the macrocycle but is otherwise fully mature (Fig. 2). This means that histidine bis-N-methylation is a prerequisite for TsaJ-catalysed histidine hydroxylation and TsaMT-catalysed tyrosine methylation. This indicates that TsaJ and TsaMT are unable to recognise a TsaA-derived substrate without the histidine methylations, which provide a permanent positive charge. The thioamitides are the only RiPPs that feature a bis-N-methylated histidine. Given that the macrocycle is correctly formed in each mutant, these results are consistent with a biosynthetic model where these modifications to the macrocycle are among the final steps in thiostreptamide S4 biosynthesis. TsaG-catalysed histidine methylation occurs first, which is then followed by TsaJ-catalysed histidine hydroxylation and TsaMT-catalysed tyrosine methylation in an undefined order. The role of TsaJ is consistent with a parallel study on the homologue in thioholgamide biosynthesis, ThoJ.12
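The gatekeeping logic described above can be written down as a small dependency model. The following is a minimal sketch, not a representation of the authors' analysis: it assumes that each strain is summarised by its most fully modified product only, and the assignment of 4 as carrying only the histidine bis-N-methylation is inferred here from the macrocycle fragment masses reported for 1–5.

```python
# Toy model of the macrocycle tailoring order inferred from the deletion mutants.
# Assumption: only the most fully modified product of each strain is predicted.

# Enzyme -> (modification installed, enzymes that must have acted first).
TAILORING = {
    "TsaG": ("His bis-N-methylation", set()),
    "TsaJ": ("His beta-hydroxylation", {"TsaG"}),
    "TsaMT": ("Tyr O-methylation", {"TsaG"}),
}

# Compound numbers as used in the text, keyed by the set of modifications.
# The entry for 4 is an inference from the reported macrocycle fragment masses.
COMPOUNDS = {
    frozenset({"His bis-N-methylation", "His beta-hydroxylation", "Tyr O-methylation"}): 1,
    frozenset({"His bis-N-methylation", "His beta-hydroxylation"}): 2,
    frozenset({"His bis-N-methylation", "Tyr O-methylation"}): 3,
    frozenset({"His bis-N-methylation"}): 4,
    frozenset(): 5,
}


def predict_product(deleted):
    """Return (compound number, modifications) for a strain lacking `deleted` enzymes."""
    present = set(TAILORING) - set(deleted)
    active = set()
    changed = True
    while changed:
        changed = False
        # An enzyme can act once all of its prerequisite enzymes have acted.
        for enzyme in present - active:
            if TAILORING[enzyme][1] <= active:
                active.add(enzyme)
                changed = True
    mods = frozenset(TAILORING[enzyme][0] for enzyme in active)
    return COMPOUNDS[mods], sorted(mods)


if __name__ == "__main__":
    strains = [("WT", []), ("ΔtsaMT", ["TsaMT"]), ("ΔtsaJ", ["TsaJ"]), ("ΔtsaG", ["TsaG"])]
    for strain, deleted in strains:
        print(strain, predict_product(deleted))
```

Under these assumptions the model returns 1, 2, 3 and 5 for the WT, ΔtsaMT, ΔtsaJ and ΔtsaG strains, matching the major products described in the text; it does not capture minor co-products such as 4 in ΔtsaJ.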
To see if other thiostreptamide-like metabolites were produced by these mutants, the characteristic fragmentation pattern of these metabolites was used to search the LC-MS/MS data from the WT, ΔtsaG, ΔtsaJ, and ΔtsaMT strains. The macrocycle is one of the main fragments of 1–5, and so the masses of the different macrocycle fragments seen in 1–5 (m/z 687.33, 673.31, 671.33, 657.32, and 629.29, respectively) were used to search all fragmentation events in LC-MS/MS spectra. This enabled the preliminary identification of six new metabolites, 6–11 (Fig. S7 and S8†). 6–10 are proposed to be versions of 1–5 that are hydrolysed between Ala4 and Ala5 (Fig. 3, Fig. S7†), while 11 is predicted to be a version of 5 where the other non-thioamide bond between Ala7 and the macrocycle is hydrolysed (Fig. S8†). These therefore result from hydrolysis of the only non-thioamidated peptide bonds in the tail portion of the molecule, which supports previous evidence that thioamide bonds protect molecules from proteolysis.21
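The fragment-based search described above can be illustrated with a short sketch. This is not the software used in the study: the spectrum representation, the function name `find_macrocycle_spectra`, the 0.02 m/z tolerance and the toy spectra are all assumptions made for illustration, and only the five diagnostic ion masses come from the text.

```python
# Diagnostic macrocycle fragment ions for compounds 1-5 (m/z values from the text).
MACROCYCLE_IONS = {1: 687.33, 2: 673.31, 3: 671.33, 4: 657.32, 5: 629.29}


def find_macrocycle_spectra(spectra, tol=0.02):
    """Return (precursor m/z, matched compound numbers) for spectra with a diagnostic ion.

    `spectra` is a list of (precursor_mz, [fragment_mz, ...]) tuples; `tol` is an
    absolute m/z tolerance (an assumed value, not taken from the paper).
    """
    hits = []
    for precursor_mz, fragments in spectra:
        matched = sorted(
            compound
            for compound, ion in MACROCYCLE_IONS.items()
            if any(abs(fragment - ion) <= tol for fragment in fragments)
        )
        if matched:
            hits.append((precursor_mz, matched))
    return hits


# Hypothetical MS/MS events, invented purely to exercise the function.
toy_spectra = [
    (900.40, [687.33, 342.10, 120.08]),  # carries the macrocycle ion of 1
    (850.35, [629.30, 415.20]),          # carries the macrocycle ion of 5
    (612.28, [301.15, 188.07]),          # no diagnostic ion
]

print(find_macrocycle_spectra(toy_spectra))
```

Any precursor flagged in this way would still need manual inspection of its full MS/MS spectrum and accurate mass before a structure could be proposed.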
Fig. 3 Untargeted metabolomic analysis of thiostreptamide S4 biosynthesis. (A) Matrix of detected molecules versus mutants. Red shading indicates the production of a molecule in a given mutant. “+2” indicates that the doubly charged m/z is detected. (B) Proposed structures of molecules 6–11 based on MS/MS and accurate mass data (Fig. S7 and S8†). *Permanent charge on bis-methylated histidine means that a single protonation generates a doubly charged molecule. Proposed structures of 13–16 and associated MS data are shown in Fig. S9–S10 and Fig. S17.†
There were numerous difficulties in interpreting these data. Identification by MS/MS initially proved difficult due to limited fragmentation, and fragments that were observed could not be accounted for by the simple loss of proteinogenic amino acids. Notably, molecules containing thioamide bonds can undergo fragmentation to lose H2S, corresponding to a mass loss of 33.9877 Da that does not break the backbone of the molecule.22 This signature loss can be seen very clearly in the fragmentation of 1 (Fig. S6†) and can be used as a tool to identify metabolites that contain thioamides. This indicated that previously unidentified metabolites (m/z 552.23, 503.15, 465.22, 453.16, 392.11, 370.12, 348.14, 330.13 and 259.09) have thioamide bonds in their structure due to this signature fragmentation and are not produced by the ΔtsaA mutant (Table S4†). These molecules are hypothesised to be short shunt metabolites that are protected from proteolytic degradation by thioamidation.21
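The 33.9877 Da signature corresponds to the monoisotopic mass of hydrogen sulphide, and a neutral-loss screen for it is simple to sketch. The code below is an illustrative assumption rather than the workflow used in the study: the function name `h2s_loss_pairs`, the 0.005 Da tolerance and the toy peak list are hypothetical, and apart from 552.23 (a precursor m/z quoted above) the peak values are invented.

```python
# Monoisotopic masses (Da) of the most abundant isotopes of H and S.
H, S = 1.00782503, 31.97207070
H2S_LOSS = 2 * H + S  # neutral loss expected from a thioamide


def h2s_loss_pairs(peaks, charge=1, tol=0.005):
    """Return (parent, product) m/z pairs separated by one H2S loss.

    `peaks` is a list of m/z values from one MS/MS spectrum; `tol` (Da) and the
    default charge state are assumptions chosen for illustration.
    """
    delta = H2S_LOSS / charge
    ordered = sorted(peaks)
    pairs = []
    for i, low in enumerate(ordered):
        for high in ordered[i + 1:]:
            if abs((high - low) - delta) <= tol:
                pairs.append((high, low))
    return pairs


# Hypothetical fragment list: only 552.23 is taken from the text; the other
# values are invented so that exactly one pair differs by the H2S loss.
toy_peaks = [552.23, 518.2423, 400.15, 250.10]

print(round(H2S_LOSS, 4))
print(h2s_loss_pairs(toy_peaks))
```

For doubly charged ions the spacing halves, which is why the charge state is exposed as a parameter.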
Fig. 4 NMR characterisation of 12 in CD3OD. See Fig. S11–S16 and Table S5† for NMR assignments.
Following the characterisation of 12, the structures of 13 (m/z 552.23), 14 (m/z 465.21), and 15 (m/z 370.12) could also be proposed to be related acetylated and thioamidated short peptides, based on similar MS/MS fragmentation patterns, accurate mass data and predicted thioamidations (Fig. S9A†). This similarly enabled us to propose the structure of 16 (m/z 503.14), a metabolite produced by the WT and all ΔtsaC-F mutants (Fig. 3 and Table S4†). 16 is proposed to be N-acetyl-SerValSMetSAla, which we hypothesise derives from the precursor peptide (Ser1 to Ala4, Fig. 1A) that has undergone expected thioamidation of Val2-Met3 (Fig. S10†). Further support for this structure was provided by precursor peptide modifications (S1T and M3I), which led to expected mass shifts to this metabolite (Fig. S10; † see later section for a description of site-directed mutagenesis).
The structure of 12 (Fig. 4), and the predicted structures of 13–16 (Fig. S17†), provide key information towards the proteins involved in dehydration (Fig. 5). 12–14 are shunt metabolites of an intermediate that lacks the macrocycle but contains the Dhb8 residue that is required for macrocycle formation. In contrast, 15 can derive from an intermediate that contains an unmodified Thr8, the lack of modification making it susceptible to proteolysis. In all strains containing deletions of any of tsaC–F, all detected metabolites lack the macrocycle (12–16), which implies that these genes are involved in steps prior to its formation. Of these mutants, ΔtsaC and ΔtsaD produce thioamidated 15 and 16 (Fig. 3) but do not produce shunt metabolites containing Dhb8. We hypothesise that these metabolites derive from a modified TsaA that is not yet dehydrated at Thr8 and is therefore more susceptible to proteolysis at that position. This would indicate that TsaC and TsaD cooperate to catalyse dehydration of Thr8 to Dhb8.
TsaC contains an aminoglycoside phosphotransferase (APH)-like domain (pfam01636). APHs are structurally similar to eukaryotic protein kinases30 and it has been shown that some APH enzymes can also phosphorylate serine residues.31 We therefore propose that TsaC is responsible for phosphorylation of Thr8, allowing for a subsequent elimination reaction to dehydrate Thr8 (Fig. 4). The role of TsaD in threonine dehydration is currently unclear, although TsaD contains a HopA1 effector family domain (pfam17914). HopA1 itself is a type III effector that aids plant infection by the plant pathogen Pseudomonas syringae,32,33 although the mechanistic basis for this activity is unknown. Other effectors, such as the OspF family,34 do function as lyases to inactivate protein kinases in the host cell. OspF family effectors are also known as HopAI1-like proteins, but have no sequence homology to the similarly named HopA1-like proteins. The N-terminal lyase domain of the LanL family of lanthionine synthetases has sequence homology with OspF proteins.35 TsaD may therefore act as a C–O lyase to catalyse the elimination of a TsaC-installed phosphate group to dehydrate Thr8. Alternatively, TsaD may have a non-catalytic role that is essential for TsaCD-catalysed dehydration, such as precursor peptide binding. Very recently, two RiPPs containing lanthionine cross-links and AviMeCys macrocycles were reported (cacaoidin36 and lexapeptide37), whose BGCs encode homologues of TsaC and TsaD. Intriguingly, our results are in contrast to a very recent report on the biosynthesis of thiosparsoamide, which indicated that lanthipeptide synthetases encoded outside of the BGC catalyse thioamitide dehydration.38
TsaE possesses weak homology to the APH-like phosphotransferase domain (pfam01636) that is also found in TsaC. However, ΔtsaE has a very similar metabolite profile to ΔtsaF (Fig. 3 and 5, Table S4†). It is somewhat surprising that the macrocycle cannot form in ΔtsaE, given that Dhb8-containing molecules are produced by this mutant and the cysteine decarboxylase, TsaF, is present. The lack of macrocycle could be explained if TsaE assists with AviMeCys cyclisation. However, it is unclear what role a phosphotransferase could play in cyclisation and there are no tsaE homologues encoded in BGCs for other Avi(Me)Cys RiPPs, such as the linaridins. An alternative hypothesis is that TsaE functions as the phosphotransferase involved in Ser1 dehydration to 2,3-dehydroalanine (Dha), which we predicted to be necessary for the formation of the N-terminal pyruvyl group of 1.
To demonstrate that the pyruvyl group originates from a dehydrated Ser1 instead of a pyruvyl transferase,46 a S1T mutant of TsaA was generated, producing the construct pCAPtsaS1T. S. coelicolor M1146-pCAPtsaS1T produced a molecule with m/z 1391.5598 (17) that was absent in the WT (Fig. S18†). This mass reflects an extra methyl group compared to 1 and MS/MS fragmentation is consistent with 17 containing an N-terminal 2-oxobutyryl moiety instead of a pyruvyl moiety (Fig. S18†). This therefore confirms that the natural N-terminal pyruvyl group originates from a dehydrated serine. Hydrolytic removal of the leader peptide generates an enamine in equilibrium with an imine that is predicted to spontaneously hydrolyse to the pyruvyl group (Fig. 6E). A Thr1 residue is seen naturally in the predicted core peptides for uncharacterised thioamitides from Micromonospora eburnea and Salinispora pacifica.5 The amino acid origin of the pyruvyl group is consistent with previous co-expression studies on epilancin 15X47 and polytheonamide dehydratases.48 The serine origin of the pyruvyl group means that it is plausible that TsaC, TsaD and/or TsaE are involved in Ser1 dehydration, given that 16 is proposed to contain an unmodified Ser1 residue and is produced by each of these mutants (Fig. 3 and S10†), but further experimental work is required to confirm the serine dehydratase.
We propose that the AviMeCys macrocycle is then formed, which first involves Cys13 decarboxylation to generate a reactive thioenolate. This is proposed to be catalysed by flavoprotein TsaF, given the absence of cyclised molecules produced in ΔtsaF and the prior characterisation of the TsaF orthologue from the thioviridamide pathway. Macrocyclisation itself may be non-enzymatic (Fig. 6D), given the lack of an obvious cyclase protein, although the apparent absence of multiple diastereoisomers of 1 suggests stereochemical control during AviMeCys formation.
The next step is bis-N-methylation of His12. Deletion experiments show that TsaG is responsible for this step. Histidine bis-methylation is not present in any other natural product family and provides a positive charge that may be important for biological activity.49 Gene deletion experiments show that this methylation acts as a gatekeeper for subsequent modifications: His12 β-hydroxylation and Tyr11 O-methylation, installed by TsaJ and TsaMT respectively. Our data indicate that these proteins preferentially act on substrates containing a bis-methylated histidine. Whilst mature thiostreptamide S4-like molecules detected in this study rarely lack the histidine N-methylations, we could readily detect mature thiostreptamide S4-like molecules lacking the histidine β-hydroxylation and tyrosine O-methylation (Fig. 2). Therefore, these modifications are not a prerequisite for leader peptide cleavage and associated pyruvyl formation, and may happen following leader peptide removal.
There are no clear data to allow the assignment of an enzyme for leader peptide cleavage. This was unexpected, as bioinformatic analysis shows that TsaK is a C1A family cysteine protease and homologues are encoded in other thioamitide BGCs. This protease family is rare in bacteria, although a C1A family protease catalyses removal of the leader peptide in polytheonamide biosynthesis.50 It is possible that the small change in production observed when tsaK is deleted (Fig. S3†) is because endogenous proteases catalyse hydrolysis of the leader peptide, as in the biosynthesis of many class III lanthipeptides.51 Following proteolysis, the pyruvyl group is likely to be formed spontaneously from Dha1 (Fig. 6E). This is supported by the production of 17 (featuring an N-terminal 2-oxobutyryl group) when the precursor peptide contains a S1T mutation (Fig. S18†).
This strategy was used to generate the S1T mutant that was discussed earlier. To test the tolerance of the biosynthetic enzymes to modifications to the macrocycle amino acids, we made four further mutants of the TsaA core peptide: T8S, Y11V, H12A and H12W (Fig. S19B†). T8S was constructed to assess whether dehydration and macrocyclisation take place when Thr8 is swapped with a serine residue, which is found in this position in some related precursor peptides.5 This led to the production of 18, which has a mass (calc. m/z 1363.5311, obs. m/z 1363.5261) and MS/MS fragmentation that is consistent with a fully modified derivative of 1 featuring the expected AviCys moiety (Fig. S20†). In contrast, the other modifications were not tolerated, as no macrocyclised molecules were detected with the Y11V, H12A and H12W mutants. However, an increase in the production of 12 (Fig. S21†) in each mutant indicated that early stage thioamidation and Thr8 dehydration took place, but either TsaE or TsaF would not function.
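The agreement between the calculated and observed m/z of 18 can be expressed as a mass error in parts per million. The helper below is a minimal sketch; the function name `ppm_error` is hypothetical and only the two m/z values come from the text.

```python
def ppm_error(observed_mz, calculated_mz):
    """Mass error in parts per million relative to the calculated m/z."""
    return (observed_mz - calculated_mz) / calculated_mz * 1e6


# Values reported in the text for compound 18 (T8S mutant).
error_18 = ppm_error(observed_mz=1363.5261, calculated_mz=1363.5311)
print(f"18: {error_18:.2f} ppm")
```

The result is a deviation of a few ppm below the calculated value.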
A common metabolite detected throughout growth and extraction of thiostreptamide S4 (1) is the methionine sulphoxide derivative (19; Fig. S22†). Met3 is particularly susceptible to oxidation, which would be problematic if this molecule were used in a clinical setting, as the methionine sulphoxide version of a similar molecule, thioholgamide, is around ten times less active than un-oxidised thioholgamide.3 To engineer thiostreptamide S4 into a more stable molecule, a version was made with Met3 swapped for an isoleucine (M3I), which is naturally found at this position in the thioalbamide precursor peptide.5 This modification was tolerated and led to the production of 20 (m/z 1359.59, Fig. S23†). As with a site-directed mutagenesis study on thioviridamide,53 these data indicate that precursor peptide mutagenesis represents a viable route to novel thioamitides, although the complexity of these pathways means that there are mutants that are not tolerated by all tailoring enzymes (Fig. S19B†).
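The masses reported for the tolerated mutants can be cross-checked against the elemental change expected for each substitution. The sketch below rests on stated assumptions: each substitution changes only the atoms of the swapped side chain, every other modification is retained, and the m/z of 1 is back-calculated from the calculated m/z of 18 rather than taken from the text. It is an illustration, not the authors' calculation.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
C, H, S = 12.0, 1.00782503, 31.97207070
CH2 = C + 2 * H

# Net elemental change of each core peptide substitution in the final product.
SHIFTS = {
    "S1T": +CH2,           # pyruvyl -> 2-oxobutyryl (17)
    "T8S": -CH2,           # AviMeCys -> AviCys (18)
    "M3I": C + 2 * H - S,  # Met side chain C3H7S -> Ile side chain C4H9 (20)
}

# Assumption: back-calculate the m/z of 1 from the calculated m/z of 18.
MZ_18_CALC = 1363.5311
mz_1 = MZ_18_CALC - SHIFTS["T8S"]

predicted = {name: mz_1 + shift for name, shift in SHIFTS.items()}
reported = {"S1T": 1391.5598, "T8S": 1363.5261, "M3I": 1359.59}  # 17, 18, 20

for name in SHIFTS:
    diff = reported[name] - predicted[name]
    print(f"{name}: predicted {predicted[name]:.4f}, reported {reported[name]}, diff {diff:+.4f}")
```

Under these assumptions all three reported values fall within 0.01 of the predicted m/z, consistent with the structural assignments of 17, 18 and 20.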
Thioalbamide has a hydroxylated Phe5 not seen in other characterised thioviridamide-like compounds, so it was hypothesised that TaaCYP is responsible for this hydroxylation. To test this, we used yeast-mediated assembly to generate two new versions of the thiostreptamide S4 BGC with mutated tsaA genes: one encoding a core peptide containing a phenylalanine at position 5 (A5F), and TsaCoreTaa, where the entire thiostreptamide S4 core peptide was replaced with the thioalbamide core peptide (Fig. S19B†). Unfortunately, no related metabolites could be detected when these clusters were expressed in S. coelicolor M1146, meaning that these modifications were not tolerated by the thiostreptamide S4 tailoring enzymes.
These putative BGCs were manually assessed for characteristic features of RiPP BGCs: co-linearity of putative biosynthetic genes and a position at the beginning of biosynthetic genes for the short peptide gene. The thioamitides themselves belong to peptide Family 10. The majority of BGCs belong to Family 1A (Cyanobacteria) and Family 1B (Actinobacteria), which includes the BGCs for the antibiotics cacaoidin and lexapeptide, the first members of the recently described lanthidin RiPP family.36,37 Genes cao7, cao9 and caoD in the cacaoidin BGC encode a HopA1-like protein, a phosphotransferase and a cysteine decarboxylase homologous to TsaD, TsaC and TsaF respectively, which suggests that the AviMeCys group found in this molecule is installed by a mechanism similar to that in the thioamitides. The precursor peptides in this family feature C-termini with highly conserved Thr and Cys residues (Fig. 7A), consistent with the production of diverse AviMeCys-containing RiPPs. In parallel with our study, a new RiPP genome mining algorithm, decRiPPter, also identified a similar set of actinobacterial RiPP BGCs encoding HopA1-like proteins and phosphotransferases, which led to the discovery of pristinin A3.56
Fig. 7 Selected HopA1-associated precursor peptides and examples of corresponding BGCs. Networks57 represent short peptide networking output from RiPPER with a 40% identity cut-off (Fig. S27†). Sequence logos58 are shown for selected portions of the C-termini of each family (see Fig. S28 and S29† for full logos). In each BGC, the HopA1/phosphotransferase pair is highlighted with a grey bar and the HopA1-like protein accession is listed. (A) Families 1A and 1B, with precursor peptides of recently identified RiPPs highlighted.36,37,56 (B) Family 11/20 peptides, which co-occur in the same HopA1-LanC BGCs. Additional associated peptide networks are highlighted.
Family 1A precursor peptides (570 peptides across 183 BGCs) are exclusive to Cyanobacteria and are encoded in partially conserved BGCs (Fig. 7A). These peptides have high conservation of their leader region, which features a conserved double glycine motif that is a common cleavage motif in lanthipeptides.29 In contrast, their C-terminal regions, which are predicted to correspond to the core peptide regions, are highly variable and do not feature conserved Ser, Thr or Cys residues (Fig. S28†). These BGCs typically encode multiple non-identical precursor peptides, which is common for cyanobacterial RiPP BGCs.59 20% of HopA1-like proteins are encoded near, or fused to, LanC-like cyclases, such as Family 11/20 (Fig. 7B) and Family 22 (Fig. S30†). The HopA1-phosphotransferase pair could catalyse the dehydration required for LanC-catalysed lanthionine bond formation.29 HopA1-LanC fusions could represent a new uncharacterised lanthionine synthetase, where the lyase and cyclase are fused in a single protein, thereby resembling LanM.29 We also identified additional diverse RiPP-like BGC families (Fig. S31–S32†). Determining the true products of these BGCs represents a significant future effort.
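The peptide families discussed here come from networking short peptides with a 40% identity cut-off (Fig. 7). The grouping step can be sketched as finding connected components in a graph whose edges are peptide pairs at or above the cut-off. This is an illustrative assumption and not RiPPER's implementation: the function name `network_families`, the peptide names, the pairwise identity values and the choice of an inclusive threshold are all hypothetical.

```python
def network_families(peptides, identities, cutoff=40.0):
    """Group peptides into connected components of an identity network.

    `identities` maps (peptide_a, peptide_b) to percent identity; pairs at or
    above `cutoff` are joined by an edge (inclusive threshold is an assumption).
    """
    parent = {peptide: peptide for peptide in peptides}

    def find(peptide):
        # Union-find lookup with path halving.
        while parent[peptide] != peptide:
            parent[peptide] = parent[parent[peptide]]
            peptide = parent[peptide]
        return peptide

    for (a, b), identity in identities.items():
        if identity >= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for peptide in peptides:
        groups.setdefault(find(peptide), []).append(peptide)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical peptides and pairwise identities, invented for illustration.
toy_peptides = ["pepA", "pepB", "pepC", "pepD", "pepE", "pepF"]
toy_identities = {
    ("pepA", "pepB"): 62.0,
    ("pepB", "pepC"): 45.0,
    ("pepA", "pepC"): 31.0,  # below cut-off, but linked to pepA through pepB
    ("pepD", "pepE"): 40.0,
    ("pepC", "pepD"): 22.0,
    ("pepE", "pepF"): 39.9,
}

print(network_families(toy_peptides, toy_identities))
```

Because membership is transitive, two peptides can fall in the same family without sharing 40% identity directly, which is one reason the resulting networks need manual assessment.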
These data include a number of key findings about thioamitide biosynthesis, which enable a biosynthetic pathway to be proposed (Fig. 6A). Our work confirms that YcaO and TfuA domain proteins (TsaH and TsaI) are required for iterative thioamidation and that this functions as a gatekeeper for all subsequent biosynthetic steps. Prior studies of archaeal YcaO proteins indicate that this is an ATP-dependent process.26,60 We define the proteins responsible for histidine hydroxylation and bis-methylation, as well as the reductase required for N-terminal reduction in the thioalbamide pathway. Bis-methylated histidine is currently only found in thioamitides, although 1-N-methyl-His is found in archaeal methyl-coenzyme M reductase, which is a protein that intriguingly features a number of other RiPP-like modifications, including thioamidation, methylation, oxidation and hydroxylation.61,62 Yeast-mediated assembly provided a route to site-directed mutagenesis of the thiostreptamide S4 precursor peptide, which demonstrated that the pathway is tolerant to precursor peptide mutations, but does stall at an early stage in the biosynthetic pathway with some mutations. This indicates that macrocyclisation is a bottleneck for engineering thiostreptamide S4 biosynthesis.
We show that a phosphotransferase and a HopA1-like protein (TsaC and TsaD) are required for dehydration, which represents a new route to α,β-dehydroamino acids. Our results contrast with a recent study indicating that lanthipeptide synthetases encoded outside of the BGC catalyse dehydration in thioamitide biosynthesis.38 Metabolomic results show that a further phosphotransferase (TsaE) is essential for biosynthesis, where it may have a role in either dehydration or macrocyclisation. A detailed informatic analysis using RiPPER22 shows that the phosphotransferase/HopA1-like protein pair defines multiple new RiPP BGC families, with representatives across over 1,000 sequenced genomes. The variety of tailoring enzymes and precursor peptide sequences indicates that the products will be highly diverse. This is supported by the parallel identification of HopA1-containing BGCs by the decRiPPter algorithm,56 which have been recently defined as lanthidins in antiSMASH 5.0.63
Our insights are supported by parallel studies of individual enzymes in other thioamitide pathways,11,12,64 as well as the recent discoveries of cacaoidin,36 lexapeptide37 and pristinin,56 which contain AviMeCys macrocycles, as predicted from our experimental and informatic analyses. We anticipate that the data reported here will inform further experimental work on the thioamitides and related RiPPs to determine key biosynthetic steps, including the true role of HopA1 domain proteins in both RiPP biosynthesis and as a P. syringae effector protein,33 given that this domain does not feature a known catalytic domain. Similarly, it will be important to determine whether bacterial RiPP-associated TfuA proteins function in an equivalent way to the recently characterised archaeal TfuA protein, which hydrolyses thiocarboxylated ThiS to provide a sulphur donor for its cognate YcaO protein.28 A further key goal is to determine the effect that each thioamitide post-translational modification has on antiproliferative activity towards cancer cells.6,7 More widely, understanding the diversity of products made by HopA1-like associated RiPP BGCs will be a substantial and exciting research effort, especially given the diversity of pathways identified.
Footnote
† Electronic supplementary information (ESI) available: Methods, ESI tables and ESI figures, ESI dataset 1 (XLSX): co-occurrence data for HopA1 proteins, ESI dataset 2 (XLSX): data for networked peptides from RiPPER, ESI dataset 3 (Cytoscape file): networked short peptides and associated data. See DOI: 10.1039/d0sc06835g
This journal is © The Royal Society of Chemistry 2021