Jialiang Wang, Zixin Deng*, Jingdan Liang* and Zhijun Wang*
State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Science & Biotechnology, Shanghai Jiao Tong University, Shanghai, China. E-mail: wangzhijun@sjtu.edu.cn; jdliang@sjtu.edu.cn; zxdeng@sjtu.edu.cn
First published on 15th August 2023
Time span of literature covered: up to mid-2023
Iterative type I polyketide synthases (iPKSs) are outstanding natural chemists: megaenzymes that repeatedly utilize their catalytic domains to synthesize complex natural products with diverse bioactivities. Perhaps the most fascinating but least understood question about type I iPKSs is how they perform the iterative yet programmed reactions in which the usage of domain combinations varies during the synthetic cycle. The programmed patterns are fulfilled by multiple factors, and strongly influence the complexity of the resulting natural products. This article reviews selected reports on the structural enzymology of iPKSs, focusing on the individual domain structures followed by highlighting the representative programming activities that each domain may contribute.
Based on domain organization, PKSs are usually divided into three types: type I PKSs, which are multidomain megaenzymes that act linearly through multiple modules (modPKSs) or iteratively through one module (iPKSs), type II PKSs, which contain several discrete stand-alone proteins, and type III PKSs, which contain a single domain.11 Among the most interesting yet least understood PKSs are the type I iterative PKSs, mostly found in fungi.10 Based on the various reductive degrees of the polyketide chain backbone, fungal type I iPKSs are further classified into highly reducing iPKSs (HR-iPKSs), nonreducing iPKSs (NR-iPKSs), and partially reducing PKSs (PR-iPKSs). Mycocerosic acid synthase (MAS), MAS-like PKSs and the related mammalian fatty acid synthase (mFAS), which generate fully reduced products, may also be viewed as subsets of type I iPKSs.12,13 In addition, a remarkable group of modular PKSs in bacteria such as aureothin PKSs obtained from Streptomyces thioluteus are observed to function iteratively.14
The hallmark of type I iPKSs is that the usage of a single set of catalytic domains is repeated yet programmed. To appreciate the catalytic complexity of type I iPKSs, it is worth comparing their biosynthetic logic to that of the related mammalian fatty acid synthase (mFAS) and modPKSs (Fig. 2). In the representative case of the HR-iPKS LovBC complex involved in the biosynthesis of lovastatin15 (Fig. 2A), the biosynthetic cycle is initiated by loading malonyl-CoA catalyzed by a malonyl-acetyl transferase (MAT) domain, presumably followed by decarboxylation to generate the starter acetyl unit. Chain elongation is performed iteratively by eight rounds of Claisen condensation catalyzed by a β-ketosynthase (KS) domain, together with one Diels–Alder reaction. After each condensation reaction, the growing polyketide chain, tethered to the phosphopantetheinyl (pPant) arm of the acyl carrier protein (ACP) domain, is subjected to modification processes, including α-carbon methylation by C-methyltransferase (CMeT), β-keto group reduction by the ketoreductase (KR) domain, β-hydroxyl group dehydration by the dehydratase (DH) domain and α,β-double bond reduction by the enoylreductase (ER) domain. Intriguingly, during the eight extension cycles, the CMeT-KR-DH-ER (LovC)-modifying domain usage is permutative. This is in stark contrast to either the mFAS in mammalian palmitic acid biosynthesis,16 where the elongated intermediate chain is faithfully modified in each extension cycle to finally form fully reduced C16 or C18 fatty acid (Fig. 2B), or the modPKSs, such as in erythromycin biosynthesis,17 where the modification steps are specified by each modular domain component (Fig. 2C). In addition, the modification processes of NR-iPKS PksCT18 (Fig. 3A) and PR-iPKS 6-MSAS19 (Fig. 3B) are less complicated than that of HR-iPKS LovBC.
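The contrast between faithful and permutative domain usage can be made concrete with a toy model. The Python sketch below is illustrative only: all function names are hypothetical, and `toy_hr_program` is an assumed schedule, not the experimentally determined LovB program. It assumes CMeT acts only at the tetraketide stage and ER only at the tetra-, penta- and heptaketide stages, with KR and DH applied in every cycle purely for simplicity.

```python
from typing import Callable, FrozenSet, List, Tuple

Program = Callable[[int], FrozenSet[str]]


def beta_state(domains: FrozenSet[str]) -> str:
    """State of the beta-carbon after one cycle, given the modifying domains used."""
    if "KR" not in domains:
        return "keto"        # no reduction: beta-keto group retained
    if "DH" not in domains:
        return "hydroxyl"    # KR only
    if "ER" not in domains:
        return "enoyl"       # KR + DH
    return "saturated"       # KR + DH + ER


def run_cycles(n_cycles: int, program: Program) -> List[Tuple[int, str, bool]]:
    """Apply program(cycle) after each KS condensation.

    Returns (cycle, beta-carbon state, alpha-methylated?) for every cycle.
    """
    record = []
    for cycle in range(1, n_cycles + 1):
        used = program(cycle)
        record.append((cycle, beta_state(used), "CMeT" in used))
    return record


def mfas_program(cycle: int) -> FrozenSet[str]:
    """mFAS-like logic: the full reductive loop is used in every cycle."""
    return frozenset({"KR", "DH", "ER"})


def toy_hr_program(cycle: int) -> FrozenSet[str]:
    """Hypothetical permutative schedule (an assumption, for illustration only)."""
    ketide = cycle + 1                 # n condensations give the (n+1)-ketide
    used = {"KR", "DH"}                # assumed to act in every cycle
    if ketide == 4:
        used.add("CMeT")               # methylation at the tetraketide stage
    if ketide in (4, 5, 7):
        used.add("ER")                 # ER at tetra-, penta- and heptaketide stages
    return frozenset(used)


def backbone_carbons(n_cycles: int) -> int:
    """Acetyl starter (C2) plus one C2 unit per condensation; branches not counted."""
    return 2 + 2 * n_cycles


if __name__ == "__main__":
    print(backbone_carbons(7), run_cycles(7, mfas_program))
    print(backbone_carbons(8), run_cycles(8, toy_hr_program))
```

In this picture, the mFAS-like program returns the same domain set in every cycle, whereas an iPKS-like program is a function of the cycle number; the open question discussed in this review is what physically implements that function.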
Fig. 3 Representative biosynthetic logic of (A) NR-iPKSs (PksCT in citrinin biosynthesis) and (B) PR-iPKSs (6-methylsalicylic acid synthase (6-MSAS) in 6-methylsalicylic acid (6-MSA) biosynthesis). See also Fig. 2 for comparison.
How does a single set of catalytic domains of type I iPKSs use different domain combinations to construct polyketides under specific program rules? The underlying programming is extremely complicated,20,21 intertwined with intrinsic and extrinsic contributions such as starter/extender unit selections, intermediate specificity, kinetic competition, gatekeeping and interdomain in trans interactions.
Fortunately, the biochemical activities established by in vitro reconstitution, domain swap and re-engineering have begun to unveil the program rules.20 Additionally, significant advances in obtaining structural information have been made in the past decade.16 Looking into the architectures, the active sites and substrate tunnels of these molecular machines will provide the necessary structural basis for the catalytic cycles of polyketide biosynthesis. This review summarizes the currently reported structures of type I iPKSs, either relatively complete regions or excised domains, and highlights representative examples of the programming mechanisms contributed by the catalytic domain(s).
Fig. 4 Overall architectures of type I iPKSs and mFAS. (A) HR-iPKS LovBC complex, PDB: 7CPY. (B) Loading/condensing region of NR-iPKS CTB1, PDB: 6FIJ. (C) Hybrid model containing the modifying and condensing regions of MAS-like PKS Pks5, PDB: 5BP4 and 5BP1, respectively. (D) mFAS (pig), PDB: 2VZ9. For each structure, the linear domain organization and dimensions are labeled. The hypothetical substrate shuttling trajectories (one side) within the catalytic chamber are shown as lines with each arrow pointing toward the next step; the locations of active site residues of each domain are marked as gray balls. The dashed arrows indicate the loading of the malonyl or acetyl starter unit. The double-sided arrow in (B) indicates a possible mechanism of NR-iPKSs in which the iterative polyketide extension is rapid and processive in the KS domain (PksA, discussed in Section 2.3).
Great insights into the loading/condensing region architecture of an NR-iPKS have been provided by the 2.8 Å crystal structure of CTB1 comprising starter unit acyltransferase (SAT)-KS-MAT domains23 (Fig. 4B). A distinctive feature of the structure is the interaction between the SAT and MAT domains of opposite chains, covering an average of 957 Å², which results in an overall rhomboid-shaped compact dimer. Furthermore, by using a mechanism-based crosslinker, the 7.1 Å cryo-EM structure traps the ACP docked to only one side of the KS dimer, which indicates that asymmetric domain arrangements mediate polyketide biosynthesis.
A breakthrough in the understanding of a fully reducing PKS came from the hybrid model comprising the crystal structures of the condensing and modifying regions of MAS-like PKS Pks524 (Fig. 4C). The condensing region adopts a conformation similar to that of the known structures.25,26 In contrast, instead of the V-shaped conformation seen in LovB22 and mFAS,26 the DH dimer of the Pks5-modifying region adopts a linear conformation at an angle of 222°, similar to that of modPKSs.27,28 The Pks5 hybrid model reveals a linker-based, rather than domain–domain interaction-based, architecture and establishes a framework for the modPKS domain organization.
Compared to the well-studied mFAS structure26 (Fig. 4D), type I iPKSs exhibit a similar dimeric architecture; however, the condensing region of HR-iPKS LovB is rotated ∼180° opposite to that of mFAS when both modifying regions are superposed. Furthermore, the condensing and modifying regions of LovB clearly show contact at the “waist”, which is distinct from the central flexible linker observed in mFAS. This contact indicates that the condensing region of LovB may not be able to undergo large-scale rotation, as seen in the multiple conformations of mFAS analyzed by cryo-EM.29 Nevertheless, structural dynamics, either subtle or dramatic, are observed in all of these megaenzymes and believed to be necessary for creating asymmetric chambers during substrate shuttling.
Fig. 5 Structures of type I iPKS MAT and AT domains of DynE8 acylated with Mal ((A), PDB: 4AMP), mFAS (murine) in complex with Mal-CoA ((B), PDB: 5MY0, chain C), LovB, CTB1, Pks5 ((C), PDB: 7CPX, 6FIJ, 5BP1, respectively), and MAS WT and S726F mutant acylated with MMal-CoA, MMal-CoA and Mal-CoA ((E), PDB: 7AGS, 7AGU, 7AGT, respectively). The Ser-His catalytic dyads, key residues and interactions with substrates are labeled. The substrate pocket (yellow surface) is located between the two subdomains (shown as mesh), and the G-X-S-X-G motifs (nucleophilic elbows) are rainbow colored in (A) and (B). (D) Proposed catalytic mechanism of MAT.
The deep substrate binding pockets are located between the two subdomains where the substrates are sandwiched (Fig. 5A and B). The interface formed by the two subdomains harbors a Ser-His catalytic dyad (e.g., S651-H753 in DynE8-MAT35), in which Ser is located in the highly conserved G-X-S-X-G motif that is referred to as the “nucleophilic elbow”36 (Fig. 5A and B). The MAT-catalyzed transfer of building blocks to the ACP is conducted via a ping-pong bi–bi mechanism37 facilitated by the geometry of the nucleophilic elbow (Fig. 5D). The nucleophilic Ser attacks the thioester carbonyl carbon of malonyl-CoA to form a covalent Ser-acylated intermediate. Then, His acts as a general base mediating the deprotonation of the ACP pPant thiol group, and a subsequent nucleophilic attack results in malonyl-ACP. A key conserved Arg residue (e.g., R676 in DynE8-MAT) interacts with the malonyl carboxylate group by forming a salt bridge, which is responsible for holding the substrate. The specific substrate selectivity for malonyl-CoA (Mal-CoA) over α-substituted extender units such as methylmalonyl-CoA (MMal-CoA) is partly contributed by a phenylalanine with a bulky side chain (e.g., F752 in DynE8-MAT), one residue before the catalytic His. Recently, molecular-level evidence of substrate specificity has been provided by the AT crystal structures of iterative mycocerosic acid synthase (MAS) in complex with substrates38 (Fig. 5E). The M624V-S726F double mutant exhibits almost completely inverted specificity, from the natural MMal-CoA to Mal-CoA. Although the S726F single mutation does not hinder the binding of MMal-CoA through the proposed steric clash with the additional methyl group, perhaps owing to the motion of the entire ferredoxin-like small subdomain, the M624V-S726F double mutation completely abolishes the formation of the complex with MMal-CoA. This study reinforces the influence of this Phe residue on Mal-CoA substrate specificity.
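For orientation, the ping-pong bi–bi scheme described above corresponds to the standard steady-state rate law for a two-substrate, substituted-enzyme mechanism in the absence of products. In the notation below, which is introduced here for illustration and is not taken from the cited work, A is the acyl donor (e.g., malonyl-CoA), B is the holo-ACP acceptor, and K_A and K_B are their respective Michaelis constants:

```latex
v = \frac{V_{\max}\,[A]\,[B]}{K_{B}\,[A] + K_{A}\,[B] + [A]\,[B]}
```

The denominator lacks a constant term, which distinguishes this mechanism from one that proceeds through a ternary complex and gives rise to parallel lines in double-reciprocal plots.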
To date, two crystal structures have been reported: the SAT domain of CTB123 and that of CazM acylated with a hexanoyl substrate.42 Similar to the known MAT domains in both overall structure and catalytic mechanism, SAT comprises a large α/β-hydrolase-like subdomain and a ferredoxin-like small subdomain (Fig. 6A and B). The substrate pocket is located at the two-subdomain interface that contains a Cys-His catalytic dyad (e.g., C155-H277 in CazM-SAT). Acylated hexanoyl, the triketide substrate mimic, is stabilized by multiple hydrophobic interactions with H22, A156, A158, I191, I272 and I276, which may contribute to acyl substrate selection (Fig. 6A). The highly conserved substrate-holding Arg residue of the MAT domain is replaced by A187 in CazM-SAT, which lines the bottom end of the pocket, resulting in a deeper L-shaped hydrophobic substrate pocket than that of MATs.
Fig. 6 Structures of SAT domains of CazM ((A), PDB: 4RPM) and CTB1 ((B), PDB: 6FIJ). Cys-His catalytic dyads, key residues and hydrophobic interactions with the hexanoyl substrate mimic (cyan) are labeled. The substrate pocket (yellow surface) is located between the two subdomains (shown as mesh). (C) Programming mechanism of the RrDalS2 SAT domain revealed by chain length control.
An excellent example of the programming of the SAT domain by chain-length control has been recently investigated by domain swapping43 (Fig. 6C). HR-iPKS RrDalS1 collaborates with downstream NR-iPKS RrDalS2 and AtCurS2 to produce triketide- and tetraketide-primed main products, respectively. By exchanging the SAT domains of the two NR-iPKSs, the chimeric RrDalS2-SAT/AtCurS2 intercepts and forces the HR-iPKS to offer the triketide, resulting in the dominance of the triketide-primed polyketide. This study shows that the SAT domain can, although rarely, be a proactive selector that acts in trans to strictly accept the preferred substrate based on the chain length (the triketide by the SAT of RrDalS2) from the upstream partner HR-iPKS to further polyketide biosynthesis.
Fig. 7 Structures of KS domains of type I iPKSs and mFAS. (A) KS of HR-iPKS LovB, PDB: 7CPX. (B) KS of NR-iPKS CTB1, PDB: 6FIJ. (C) KS of mFAS, PDB: 2VZ9. (D) Monomeric KS of MAS-like PKS Pks5, PDB: 5BP1. For each KS structure, the Cys-His-His catalytic triads, tunnel bottleneck residues, and substrate tunnels (yellow surfaces) are labeled. (E) Superposition of the catalytic triads and bottleneck residues reveals the potential side chain rotation of LovB-F436 of ∼40°. (F) Proposed catalytic mechanism of KS.
An exceptional characteristic of the LovB KS domain is the substrate tunnel (Fig. 7A). In contrast to the end-to-end connected KS tunnels of CTB1 and mFAS (Fig. 7B and C), the LovB KS pPant pocket is disconnected from the acyl chain pocket by the F436-M132 hydrophobic interaction in the tunnel and truncated at the end by the putative ionic interactions formed by H134-E137-D178 (Fig. 7E). The acyl pocket volume is significantly smaller than that of CTB1 and mFAS. Both the Euclidean distance and the solvent-accessible surface (SAS) distance of LovB KS from the reactive Cys to the tunnel end are also shorter. The superposition of the three KSs shows that F436 of LovB exhibits an approximately 40° side-chain rotation (Fig. 7E) and perhaps suggests a “breathing motion” mechanism mediating the substrate entry process. Similarly, the corresponding F395 of mFAS (murine) has been postulated to act as a “gatekeeper” whose side chain rotates approximately 120° after loading with the C8 fatty acid chain substrate.33 The residues at this Phe residue-equivalent position of NR-iPKSs vary. Nevertheless, whether a gate-keeping or breathing motion mechanism is utilized, the potential conformational variability of F436 of the LovB KS domain needs to be further clarified by capturing the KS complex structure with Cys181-acylated intermediates inside the tunnel.
Although the chain length control of type I HR-iPKSs has not been deeply investigated, NR-iPKSs and type II HR-iPKSs have both offered valuable information. The clade II NR-iPKS16 PksA with the SAT-KS-MAT-PT-ACP-TE domain organization was interrogated by mass spectrometry.45 Interestingly, partially elongated intermediates covalently attached to ACP were not detectable, and only the fully elongated octaketide could be observed. This observation suggests an unexpected yet efficient working mode of KS (Fig. 8A): since the β-carbon of the β-ketoacyl polyketide chain does not need to be modified, the intermediate chain elongation is processive and may never leave the KS substrate tunnel, shuttling back and forth expeditiously and progressively between the active site cysteine and the pPant-tethered extender unit (malonyl-ACP) until the correct chain length is achieved. The abovementioned gate-keeping residue Phe in LovB and mFAS is replaced in PksA by alanine, which lacks the bulky side-chain restriction and may also contribute to the processive mechanism of clade II NR-iPKS. Evidence of chain length control by KS has also been shown by a domain swap between the two closely related NR-iPKSs CoPKS1 and CoPKS4, in which it is the KS domain that determines whether the dominant product is a hepta- or octaketide.46
Fig. 8 Representative programming mechanisms revealed by NR-iPKS PksA (A) which indicates a processive extension mechanism of KS, and type II HR-iPKS Iga11–Iga12 in complex with Iga10-tethered C8 β-chloracrylamide pantetheinamide (C8Cl), a mimic of the pPant arm ((B), PDB: 6KXF). The chain length is proposed to be restricted by L125 of Iga12.
The second insight comes from the bacterial type II HR-iPKS Iga11-Iga1247 (Fig. 8B). Chain length control by Iga11 (KS) involves a chain length factor (CLF, Iga12) that lacks an active site and may be viewed as an inactive version of the KS monomer. The negative charge of the Asp113 side chain of KS promotes the release of the Iga10 (ACP)-tethered β-ketoacyl intermediate and drives the chain forward for further β-carbon modifications. The substrate tunnel is rigid, and Leu125 of CLF creates steric hindrance in the tunnel, which prevents the acyl moiety from being transferred to Cys170, thus limiting the chain length to C14.
Fig. 9 Structures and mechanism of CMeT domains. (A) CMeT structure of NR-iPKS PksCT, PDB: 5MPT. (B) H2067-E2093 catalytic dyad and pocket-end residues are labeled. The substrate pocket (yellow surface) is located between the two subdomains (shown as mesh), and the G-X-G-X-G SAM-binding motif is shown in rainbow colors. (C) Hydrogen bonding and hydrophobic interactions with the byproduct SAH. (D) Structures of HR-iPKS LovB CMeT and trans-acting PsoF CMeT and CalH'. PDB: 7CPX, 6KJI and 7DMB, respectively. (E) Proposed catalytic mechanism of CMeT.
Both the cis- and trans-acting CMeTs exhibit similar overall monomeric architectures (Fig. 9A–D). For example, PksCT CMeT adopts a two-subdomain organization with a hydrophobic substrate pocket, and the conserved Tyr1955 and His2067-Glu2093 catalytic dyads are located at the interface (Fig. 9A). The core C-terminal subdomain displays a Rossmann-like fold and belongs to the typical class I methyltransferase superfamily,52 which is also responsible for binding the SAM cofactor. The N-terminal subdomain contains several helices that may be viewed as a large lid protecting the substrate entrance, whereas many HR-iPKSs contain an inactive version of CMeT in which the N-terminal subdomain is completely absent or severely truncated, similar to the ψCMeT domain of mFAS.26 The conserved G-X-G-X-G motif is responsible for binding the cofactor SAM. Hydrogen bonds and hydrophobic interactions stabilize SAH, and the SAH homocysteine moiety also assists in the formation of the funnel-shaped substrate pocket that has an ∼25 Å SAS distance from the protein surface to the I1960-M2094-defined end (Fig. 9B and C). The following methylation process has been proposed: activated by Glu, the His acts as a general base and abstracts the proton from the α-carbon of the ACP pPant-tethered β-ketoacyl intermediate to form an enolate, which can subsequently perform a nucleophilic attack on the methyl donor of SAM to complete the reaction. The Tyr may also facilitate the methyl transfer process (Fig. 9E).
Two layers of programming contributed by CMeT domains have been revealed. First, a kinetic competition experiment on LovB in lovastatin biosynthesis was performed in vitro53 (Fig. 10A). The LovB DH domain had been mutated beforehand, leaving only the CMeT- and KR-modifying domains to be analyzed. In the presence of NADPH and SAM cofactors, CMeT is exceptionally selective toward the natural tetraketide intermediate. This suggests that CMeT has a higher kinetic efficiency on this particular tetraketide intermediate and therefore outcompetes the downstream KR domain within the iteration. Second, a rare case of a trans-acting CMeT domain of PsoF that acts as an essential checkpoint to maintain the correct acyl intermediate transfer has been reported54 (Fig. 10B). In azaspirene biosynthesis, the HR-iPKS-NRPS hybrid PsoA collaborates with the CMeT domain of PsoF to produce the aminoacylated polyketide. In the process, the programmed CMeT specifically methylates the tetraketide. Without the methylation pattern installed by the trans-acting CMeT, the polyketide intermediate cannot be transferred to the NRPS, and PsoA only synthesizes and releases the shunt α-pyrone product.
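The kinetic-competition argument can be expressed as a simple partition ratio. The sketch below assumes that CMeT and KR compete for the same ACP-tethered intermediate under pseudo-first-order conditions, so that the fraction methylated before reduction is set by the ratio of the two effective rate constants. The function name and the numerical values are hypothetical and are not measurements from the cited study.

```python
def methylation_fraction(k_cmet: float, k_kr: float) -> float:
    """Fraction of an intermediate processed by CMeT before KR.

    k_cmet, k_kr: effective first-order rate constants (same units) for the two
    competing domains acting on the same intermediate. This is a simplified
    two-branch model and ignores substrate shuttling and ACP docking.
    """
    if k_cmet < 0 or k_kr < 0:
        raise ValueError("rate constants must be non-negative")
    total = k_cmet + k_kr
    if total == 0:
        raise ValueError("at least one rate constant must be positive")
    return k_cmet / total


if __name__ == "__main__":
    # Hypothetical values: CMeT favoured at one stage, disfavoured at another.
    for stage, (k_m, k_r) in {"favoured": (9.0, 1.0), "disfavoured": (0.1, 9.9)}.items():
        print(stage, round(methylation_fraction(k_m, k_r), 3))
```

Under this simplification, a domain that is "exceptionally selective" for one intermediate is one whose effective rate constant dominates the sum only at that stage of the iteration.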
Fig. 11 Structures of ψKR/KR domains of LovB ((A), PDB: 7CPX), Pks5 ((B), PDB: 5BP4) and mFAS ((C), PDB: 2VZ9). The Lys-Ser-Tyr catalytic triad below the substrate pocket (yellow surface) and the interactions with the cofactor NADP+ (cyan) are labeled in (A). (D) Proposed catalytic mechanism of KR.
KR harbors the conserved Lys-Ser-Tyr catalytic triad (e.g., K2266-S2294-Y2307 in LovB-KR) and the substrate binding pocket. F2341 and L2246 of LovB narrow the pocket and constrict the substrate entry direction. The cofactor NADP+, located below the active site, is mainly stabilized by multiple hydrogen bonds, and F2341 and V2336 may also form hydrophobic interactions with the nicotinamide ring (Fig. 11A). It is suggested that the KR domain operates by a proton-relay mechanism56 (Fig. 11D): the hydride of NADPH nicotinamide ribose performs a nucleophilic attack on the β-carbon of the β-ketoacyl intermediate, activated by the hydroxyl groups of the Tyr and Ser side chains, and β-carbon oxygen abstracts a proton from Tyr. Proton transfer is also facilitated by Lys.
KRs are classified into A-type KRs that set a β-hydroxyl group product in the L-configuration, B-type KRs that set a hydroxyl in the D-configuration, and C-type KRs that are reduction-incompetent.57 Since a single KR domain is utilized in almost every iteration of the polyketide chain extension (e.g., LovB in Fig. 2A), type I iPKS KRs would be expected to display little substrate selectivity. However, an unusual example clearly shows that the KR domain can switch the stereochemical outcome, which is controlled by the substrate chain length58 (Fig. 12A–D). In hypothemycin biosynthesis, the KR domain of HR-iPKS Hpm8 reduces the β-ketone of the diketide into the L-configuration, whereas it reduces the β-ketone of the triketide exclusively to the common D-configuration (Fig. 12A). This also indicates that the diketide and triketide may enter the substrate tunnel from opposite directions.58,59 A series of SNAC-substrates with various chain lengths were analyzed in vitro, which clearly shows that the tri-, tetra-, penta- and hexaketides are all reduced to the common D-configuration, whereas the diketide is converted into the L-configuration (Fig. 12B). Furthermore, by a sequence swap between the KR domains of Hpm8 and Rdc5 (which reduces the diketide into the D-configuration in monocillin II biosynthesis),60 the catalytic triad-containing α4β5α5α6 motif was pinpointed as the site responsible for this substrate-tuned stereospecificity alteration (Fig. 12C and D). Recently, similar stereochemistry alterations of HR-iPKS KR domains have also been reported, including ApmlA in phaeospelide A biosynthesis,61 MpmlA in phaseolide A biosynthesis62 and KU42 from the basidiomycete fungal species Punctularia strigosozonata.63
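The chain-length-dependent behaviour reported for the Hpm8 KR domain can be summarized as a small lookup. The sketch below encodes only the outcomes stated above (diketide to the L-configuration; tri- to hexaketides to the D-configuration). The function names are hypothetical, and chain lengths outside the tested range are deliberately left undefined rather than extrapolated.

```python
def hpm8_kr_configuration(n_ketide: int) -> str:
    """Beta-hydroxyl configuration set by Hpm8 KR, as summarized in the text.

    n_ketide: number of ketide units in the substrate (2 = diketide, ...).
    """
    if n_ketide == 2:
        return "L"
    if 3 <= n_ketide <= 6:
        return "D"
    raise ValueError("outcome not described for this chain length")


def kr_type_like(configuration: str) -> str:
    """Map a product configuration onto the A-/B-type KR classification."""
    return {"L": "A-type-like", "D": "B-type-like"}[configuration]


if __name__ == "__main__":
    for n in range(2, 7):
        config = hpm8_kr_configuration(n)
        print(n, config, kr_type_like(config))
```

Written this way, a single KR domain behaves like an A-type KR toward one substrate and like a B-type KR toward the others, which is the sense in which the stereochemical outcome is substrate-tuned.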
Although the versatility and limited substrate specificity of KR domains suggest subtle contributions to programming, an insightful case of chain-length control by HR-iPKS KR was observed by rationally designed domain swaps64 (Fig. 12E). With the collaboration of the corresponding trans-acting ER domains, HR-iPKS-NRPS TENS and DMBS produce penta- and hexaketide-primed products, respectively. When the KR domains of the two PKS-NRPSs are exchanged, the reprogrammed TENS/DMBS-KR hybrid synthesizes the dominant hexaketide-primed product in which the chain length has been clearly altered.
Fig. 13 Structures and representative stereochemistry of DH domains. (A) HR-iPKS LovB DH domain, PDB: 7CPX. (B) MAS-like PKS Pks5 and mFAS DH domains, PDB: 5BP4 and 2VZ9, respectively. The 3.5 Å-distance-located His-Asp catalytic dyad and substrate pocket (yellow surface) are labeled in (A). H1037, which is crucial for the mFAS (pig) DH catalysis, and the equivalently positioned Q residues in LovB and Pks5 are also labeled. (C) Proposed catalytic mechanism of DH. (D) Substrate stereoselectivity of the SQTKS DH domain. (E) Stereochemistry determined by both the KR and DH domains. Note that the top scheme about the biosynthetic pathway for the conversion of diketides into triketides has been proposed, but not successfully characterized experimentally.
The intrinsic substrate stereoselectivity of an HR-iPKS DH domain has been investigated in vitro.65 A total of six potential SNAC substrates (diketides) with opposite S and R configurations and various complexities were analyzed by the isolated DH from squalestatin tetraketide synthase (SQTKS) (Fig. 13D). SQTKS DH only efficiently catalyzes 2R,3R-2-methyl-3-hydroxybutyryl-SNAC to yield a trans-olefin product, which clearly reveals the strict stereoselectivity of the DH domain at both the α- and β-carbon positions. The successful molecular docking of the substrate into the DH shows a satisfactory geometry in which the α-proton is at a distance of 3.6 Å from catalytic H1034, and the β-hydroxyl group is ∼3.0 Å from D1225.
Several cases have been reported in which the structure of polyketides synthesized by type I iPKSs contains both common trans- and “less common” cis-α,β-double bonds.63,66,67 It is proposed that the stereochemistry of the DH domain is mainly determined by the β-hydroxyl group configuration reduced by the previous KR-catalyzed step, which is exemplified by HR-iPKS KU4263 (Fig. 13E). The diketide is reduced by the KU42 KR domain to generate the D-configuration β-hydroxyl group, and the following DH forms the product with the trans-double bond; however, when the triketide is reduced to give the L-configuration, the cis-double bond is formed by the same DH domain. Additionally, only the trans-double bond-containing diketide substrate is reduced to form the triketide with the L-configuration (Fig. 13E, bottom), which indicates that the stereoselectivity of KU42 KR is strongly influenced by the unsaturation degree of the substrate. This study reveals that the KU42 KR and DH domains are reciprocally related in determining the stereochemistry of the final product.
Fig. 14 Regioselectivity of the aldol cyclization reactions catalyzed by five groups of NR-iPKS PT domains.
The crystal structures of the PksA PT domain have been reported in complex with palmitate, a bicyclic substrate mimic or a bis-isoxazole heptaketide, which better mimics the natural poly-β-ketone intermediate70,71 (Fig. 15A–D). The dimeric PksA PT domain adopts a double hot-dog fold similar to that of the DHs (Fig. 15A, 13A and B). However, the two monomers of PT interact with each other via the C-terminal hot-dogs, which are directly opposite the DHs of modPKSs, HR-iPKSs and mFAS, which all interact via the N-termini. The approximately 30 Å (an SAS distance of 38 Å) substrate tunnel of PksA PT can be divided into three regions: an ∼14 Å linear region for binding the pPant arm of ACP, which delivers the intermediate polyketide chain; an ∼8 × 13.5 Å cyclization reaction chamber containing the His1345-Asp1543 catalytic dyad for the two regiospecific cyclization reactions; and an ∼6 × 6 Å hydrophobic hexyl binding region for accommodating the hexyl starter unit of the polyketide chain. The C16 palmitate captured in this deep tunnel is stabilized via multiple hydrophobic interactions (Fig. 15C). G1491 defines the end of the tunnel and is otherwise replaced by a bulk-side-chain hydrophobic residue in the nonhexyl binding PTs (Fig. 15B).
Fig. 15 Structures and mechanism of PT domains. (A) Dimeric PT domains of NR-iPKS PksA and bacterial iterative AviM. PDB: 3HRQ and 7VWK, respectively. (B) Binding of palmitate (cyan) helps to identify the three-region-comprised substrate tunnel of ∼30 Å (yellow surface). The ∼3.0 Å distance-located H1345-D1543 catalytic dyad and G1491 are labeled. (C and D) Detailed views of the interactions between PksA PT active sites and palmitate and heptaketide substrate mimetic, respectively. The captured compounds are colored cyan. PDB: 3HRQ and 5KBZ, respectively. (E) Proposed catalytic mechanism of PksA-PT.
Important insights into the catalytic mechanism have been uncovered by docking simulations70 and the structure of PT in complex with a C14 heptaketide (a substrate mimetic)71 (Fig. 15D). The mimetic is optimally positioned in an extended conformation, where pPant is strongly stabilized by R1623 via hydrogen bonding and a salt bridge, and the hexyl moiety forms hydrophobic interactions with Y1492, L1508 and F1551. In the cyclization reaction chamber, the critical heptaketide C4 is located precisely near the catalytic H1345 (3 Å), which catalyzes the regiospecific cyclization between C4 and C9 to form the first ring. The overall size and shape of the substrate tunnel are also important in chain length control.
Recently, the crystal structure of a bacterial PT domain of AviM that catalyzes C2–C7 cyclization in orsellinic acid synthesis has been reported72 (Fig. 15A, bottom). The overall structure, the dimeric interface and the active site catalytic dyad of the AviM PT domain are more similar to the canonical modPKS DHs than the fungal NR-iPKS PTs. Phylogenetic analysis showed that the bacterial AviM PT domain represents an evolutionary intermediate between the modPKS DH domains and NR-iPKS PT domains.
Fig. 16 Structures and representative programming mechanism of ER domains. (A) Monomeric LovC, PDB: 3B70. The cofactor NADP+ (cyan), substrate tunnel (yellow surface) and active site residues are labeled. (B) Dimeric ψER of LovB, ERs of Pks5 and mFAS (PDB: 7CPX, 5BP4 and 2VZ9, respectively). (C) Proposed catalytic mechanism of ER (LovC). (D) LovB–LovC interface. (E) Gate-keeping function of trans-acting LovC.
Despite the common dimeric architecture of MDR enzymes, LovC uniquely exists as a monomer, both in the stand-alone state, as shown by the crystal structure and size-exclusion chromatography,76 and in the complex state revealed by the cryo-EM structure, in which LovC binds laterally to the LovB MAT domain through the C-terminal loops22 (Fig. 16A). LovC comprises the substrate-binding subdomain and the nucleotide-binding subdomain. The hydrophobic substrate binding pocket is in the two-subdomain interface, which also binds the cofactor NADP+. The LovC substrate pocket is size-limited, which may contribute to the specific tetra-, penta- and heptaketide substrate selectivity, unlike the traverse tunnels of mFAS and Pks5. As indicated by substrate docking, the shorter di- or triketide may bind in the pocket in a nonproductive conformation, and the pocket is unable to accommodate polyketide intermediates longer than the heptaketide. K54 (highly conserved in the trans-type ER) and T139 are crucial for the reduction activity of LovC, in which T139 is at a distance of 3.2 Å from C4 of the NADPH nicotinamide ring, where the hydride is transferred. The enoylreduction mechanism has been proposed76 (Fig. 16C): when the ACP-tethered substrate enters the pocket, the hydride of NADPH nicotinamide ribose is transferred to the β-carbon of the α,β-unsaturated intermediate, followed by a proton transfer from an acidic residue or water to the α-carbon to form the α,β-saturated product. The enoyl reduction process may be facilitated by an oxyanion hole formed by the side chains of T139, K54 and G282.
The correct programming fidelity controlled by trans-acting ERs has been firmly established in multiple polyketide synthetic pathways.17,77–79 In the biosynthesis of dihydromonacolin L (the core of lovastatin), LovB faithfully constructs the nonaketide with the assistance of LovC, which has strict substrate selectivity for an α-methyl-substituted chain at the tetraketide stage17 (Fig. 16E). Without the partner LovC, or when the LovB–LovC interactions are eliminated by interface mutation (Fig. 16D), LovB predominantly produces the shunt pyrone products with shorter chain lengths (hexa- and heptaketides) (Fig. 16E) due to the highly reactive polyunsaturated polyketide structure. The high substrate specificity of LovC toward the methylated tetraketide, not the unmethylated one, clearly provides evidence for the gate-keeping function of the trans-acting ER in maintaining the correct polyketide biosynthesis.
Fig. 17 Structures of the C-terminus-fused TEs. (A) TE structure of NR-iPKS PksA shown in both cartoon and surface representations, PDB: 3ILS. The S1937-D1964-H2088 catalytic triad is labeled. (B) hFAS TE domain, PDB: 1XKT. (C) Proposed catalytic mechanism of NR-iPKS PksA TE.
However, HR-iPKSs often do not contain a fused C-terminal TE domain and release the chain indirectly, although two rare cases of KU42-HR-iPKSs and KU43-HR-iPKSs show that the chains are released by their C-terminal TEs via thiolation and aminoacylation reactions,63 respectively. The polyketide chains are usually released by different stand-alone, trans-acting enzymes, including thioesterases (hydrolase fold and hot-dog fold), reductases, ATs, SATs, pyridoxal 5′-phosphate (PLP)-dependent enzymes (Fum8p)86 and even NRPSs.
Two crystal structures of trans-acting thioesterases that display the α/β-hydrolase fold have been reported: DcsB, in complex with a substrate mimic, which forms the 10-membered lactone during decarestrictine C1 biosynthesis,87 and GrgF, which fuses two polyketides of different chain lengths via C–C bond formation in gregatin A biosynthesis.88 Both are homodimeric in the asymmetric unit, and each monomer contains the Ser(Cys)-Asp-His catalytic triad (e.g., S114-D247-H276 in DcsB) (Fig. 18A–C). DcsB comprises the large α/β-hydrolase core region and the inserted small lid region containing three helices and two sheets. The capture of the pentaketide substrate mimic helps to identify an ∼151 Å³ substrate pocket located between the two regions. The mimic is stabilized by hydrogen bonds and hydrophobic interactions lining the pocket; however, the catalytic triad is 7.9 Å away from the thioester, which results in a nonproductive conformation. Subsequent docking simulations indicate that the substrate can be properly positioned near the triad in a ready-to-cyclize conformation, coordinated by hydrogen bonding interactions with the amides of F40 and F115.
Fig. 18 Structures and representative programming mechanism of α/β-hydrolase fold trans-acting TE. (A) Dimeric DcsB in complex with the pentaketide substrate mimic, PDB: 7D79. The substrate pocket (yellow surface) and Ser-Asp-His catalytic triad are labeled. A detailed view of the interaction between the active site and pentaketide mimic reveals a nonproductive conformation. (B) GrgF structure, PDB: 6LZH. Note that the conserved Ser in the triad is substituted by Cys (C115). (C) Proposed catalytic mechanism of α/β-hydrolase TE. (D) Chain length control of trans-acting TE revealed by Bref-PKSs and TH (thiohydrolase).
The programming contributed by the hydrolase has been investigated. Chain release from HR-iPKSs catalyzed by a trans-acting hydrolase is well demonstrated in lovastatin biosynthesis.89 LovG, a serine hydrolase, not only releases the correctly programmed dihydromonacolin nonaketide from LovB but also proofreads by removing incorrectly modified polyketide intermediates. Furthermore, in brefeldin A biosynthesis, the effect of the trans-acting hydrolase Bref-TH on the product chain length of the HR-iPKS Bref-PKS has been elucidated90 (Fig. 18D). Bref-PKS alone produces the acyclic nonaketide; however, through specific interactions with Bref-TH, the octaketide product is released before the chain can be elongated for an additional round. This hydrolase-mediated chain release indicates that Bref-TH contributes to the programming rules by controlling the chain length of the final polyketide product. A similar chain-length control mechanism by a trans-acting releasing enzyme is exhibited by the pair Fma-PKS and Fma-AT in the biosynthesis of fumagillin.91
The chain release catalyzed by trans-acting TEs is not limited to the α/β-hydrolase fold family. Three crystal structures of TEs that adopt the hot-dog fold have been solved (Fig. 19A and B): DynE7,92 CalE793 and SgcE10,94 which are involved in the biosynthesis of the enediynes dynemicin, calicheamicin and C-1027, respectively. They form a homotetrameric architecture, and each hot-dog contains the conserved Arg catalytic residue (e.g., R35 in DynE7). The highly conjugated polyene, the product of the partner type I iPKS DynE8, is observed in the hot-dog B subunit of DynE7, which reveals an L-shaped traverse tunnel formed by both hot-dogs B and D. The tunnel can be divided into two regions: an ∼11 Å region for binding the pPant arm of ACP and a linear ∼19 Å acyl pocket for accommodating the extended polyketide chain, in which the poly-carbon backbone is stabilized by multiple hydrophobic interactions. Intriguingly, although the enzyme is tetrameric, only one hot-dog's substrate tunnel is open and bound to the product, whereas the tunnels of the other three hot-dogs are closed, which supports an induced-fit conformational change upon product binding. The following hydrolysis mechanism has been proposed (Fig. 19C): the Arg residue could serve as an oxyanion hole, facilitating the nucleophilic attack of the water molecule on the thioester bond of the ACP-tethered intermediate.
Fig. 19 Structures of the tetrameric hot-dog fold trans-acting TE. (A) DynE7 in complex with the polyene product (cyan), PDB: 2XEM. The L-shaped substrate tunnel (yellow surface), residues interacting with the product and the Arg catalytic residue are labeled. (B) CalE7 and SgcE10 structures, PDB: 2W3X and 4I4J, respectively. (C) Proposed catalytic mechanism of hot-dog TE.
A trans-acting acyltransferase can intercept and thus release the intermediate polyketide chain from an HR-iPKS, as also exemplified in lovastatin biosynthesis. The acyltransferase LovD releases the diketide synthesized by the HR-iPKS LovF and then transfers the chain to monacolin J acid to produce lovastatin95 (Fig. 20A). The chain transfer process requires highly specific interactions between LovF-ACP and LovD. LovD exhibits broad substrate specificity toward various acyl-CoAs, acyl carriers, and variants of the monacolin J acceptor, and thus can be used as an effective biocatalyst.96 A series of crystal structures of LovD have been reported,97,98 including wild-type (WT) LovD and LovD mutants improved by directed evolution (representative LovD G5, LovD6 and LovD9), which exhibit higher catalytic efficiency than the WT. LovD comprises a large α/β-hydrolase core region and a small lid region (Fig. 20B). The substrate pocket and the Ser76-Lys79-Tyr188 catalytic triad are located at the interface of the two regions. Remarkably, directed evolution enhanced the catalytic efficiency by 1000-fold (LovD9), even though WT LovD already accepts a broad range of substrates. Comparison of the WT with the evolved mutants reveals the dynamics of the lid region and a reduction of the pocket, in which the active-site residues are more deeply buried98 (Fig. 20C).
Fig. 20 Structures of the α/β-hydrolase trans-acting AT. (A) Reaction of diketide release and transfer catalyzed by LovD. (B) LovD structure, PDB: 3HLB. The S76-K79-Y188 catalytic triad is labeled. (C) Superposition of the LovD WT and the evolved LovD6 and LovD9 on the core region reveals the dynamics of the lid regions, PDB: 3HLB, 4LCL and 4LCM, respectively.
The interplay of catalytic domains is a key feature that shapes the programming pattern of type I iPKSs. A representative example is the biosynthesis of dihydromonacolin L acid (Fig. 21). Correct programming fidelity is achieved in the presence of the HR-iPKS LovB–LovC complex, the substrate (malonyl-CoA) and the cofactors (SAM, NADPH), leading to formation of the nonaketide, which is released by LovG. LovB CMeT displays a higher kinetic efficiency at the tetraketide stage and thus outcompetes KR, and ER displays strict substrate selectivity toward this methylated tetraketide. If SAM, NADPH, and LovC are excluded in different combinations (i.e., the functions of CMeT, KR, DH and ER are disabled), the programming pattern goes off course and LovB can only produce pyrone shunt products. The production of pyrone shunt products is due to the highly reactive poly-β-carbonyl structure, which would be reduced by KR in the native context. In the absence of LovC, the non-native tetraketide is elongated but not further reduced by KR, which indicates that KR also has substrate selectivity. In addition, during the iterative LovB catalysis, LovG plays a proofreading role, removing the incorrectly modified polyketide intermediates from LovB, although it is not currently understood why the stalled aberrant intermediates cannot be further elongated by KS.
Fig. 21 Overall view of the programming fidelity illustrated by the LovB–LovC system in dihydromonacolin L acid biosynthesis.
The programming pattern of each catalytic domain can vary among different type I iPKS systems. In the case of kinetic competition between CMeT and KR, the TENS CMeT domain outcompetes KR at the triketide stage, resulting in the dimethylated polyketide (Fig. 22, left). However, in the closely related DMBS system, CMeT loses to KR and methylates the triketide only once. At the tetraketide stage (Fig. 22, right), CMeT of LovB is faster than its KR, whereas the TENS and DMBS KR domains beat their CMeT. With regard to ER substrate selectivity, trans-acting LovC shows high specificity toward the methylated tetraketide (Fig. 16E). However, the cis-acting SQTKS ER is broadly selective toward a wide range of substrates with different chain lengths and methylation patterns.99
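The kinetic competition described above can be illustrated with a simple partition ratio. As a hedged sketch, assume that CMeT and KR compete for the same ACP-tethered intermediate with effective first-order rate constants; the fraction of intermediate processed first by CMeT is then k_CMeT / (k_CMeT + k_KR). The function name and the rate values below are hypothetical placeholders chosen only to show the direction of the competition; they are not measured constants for LovB, TENS or DMBS.

```python
def fraction_methylated_first(k_cmet: float, k_kr: float) -> float:
    """Fraction of intermediate acted on by CMeT before KR.

    Assumes two parallel, irreversible first-order processes competing for
    the same intermediate (a deliberate simplification).
    """
    if k_cmet < 0 or k_kr < 0:
        raise ValueError("rate constants must be non-negative")
    total = k_cmet + k_kr
    if total == 0:
        raise ValueError("at least one rate constant must be positive")
    return k_cmet / total


# Hypothetical relative rates, for illustration only.
# "CMeT faster than KR" (as described for LovB at the tetraketide stage):
cmet_wins = fraction_methylated_first(k_cmet=10.0, k_kr=1.0)
# "KR beats CMeT" (as described for TENS and DMBS at the tetraketide stage):
kr_wins = fraction_methylated_first(k_cmet=1.0, k_kr=10.0)
```

Under this simplification, the outcome at each stage depends only on the ratio of the two rates, which is one way to picture how closely related synthases can diverge in methylation pattern.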
Although it has been gradually revealed in recent decades, the overall programming of a type I iPKS is still very difficult to predict. However, the future is promising. With the development of the artificial intelligence-powered AlphaFold2,100 one can now readily predict type I iPKS structures of either an excised domain or the complete architecture, which will rapidly provide fingerprints for rational engineering. Additionally, by using methods such as NMR, X-ray crystallography and cryo-EM single particle analysis, the capture of ACP pPant-tethered substrates bound in each domain will provide crucial information for uncovering the catalytic mechanism, which might otherwise be difficult, as seen in the many structures that adopt closed conformations without a bound substrate. Furthermore, time-resolved cryo-EM101 combined with continuous heterogeneous cryo-EM reconstruction (such as cryoDRGN)102 will likely resolve the transient, ever-changing intermediates inside the substrate tunnel of each domain during the time-ordered catalytic cycles. Visualizing these megaenzymes in continuous action will provide invaluable molecular-level information, just as it is much clearer how a car engine works from a series of photographs, or even a video, than from a single snapshot. With the further understanding of the programming mechanisms provided by combinations of new strategies and technologies, it is eagerly anticipated that type I iPKSs may eventually be designed as programmable molecular machines for the generation of desired products and novel pharmaceutical agents.
This journal is © The Royal Society of Chemistry 2023