Ryo Mizuuchi *ab and Norikazu Ichihashi cde
aDepartment of Electrical Engineering and Bioscience, Faculty of Science and Engineering, Waseda University, Shinjuku, Tokyo 162-8480, Japan. E-mail: mizuuchi@waseda.jp
bJST, FOREST, Kawaguchi, Saitama 332-0012, Japan
cKomaba Institute for Science, The University of Tokyo, Meguro, Tokyo 153-8902, Japan
dDepartment of Life Science, Graduate School of Arts and Science, The University of Tokyo, Meguro, Tokyo 153-8902, Japan
eUniversal Biology Institute, The University of Tokyo, Meguro, Tokyo 153-8902, Japan
First published on 20th June 2023
The emergence of RNA self-reproduction from prebiotic components would have been crucial in developing a genetic system during the origins of life. However, all known self-reproducing RNA molecules are complex ribozymes, and how they could have arisen from abiotic materials remains unclear. Therefore, it has been proposed that the first self-reproducing RNA may have been short oligomers that assemble their components as templates. Here, we sought such minimal RNA self-reproduction in prebiotically accessible short random RNA pools that undergo spontaneous ligation and recombination. By examining enriched RNA families with common motifs, we identified a 20-nucleotide (nt) RNA variant that self-reproduces via template-directed ligation of two 10 nt oligonucleotides. The RNA oligomer contains a 2′–5′ phosphodiester bond, which typically forms during prebiotically plausible RNA synthesis. This non-canonical linkage helps prevent the formation of inactive complexes between self-complementary oligomers while decreasing the ligation efficiency. The system appears to possess an autocatalytic property consistent with exponential self-reproduction despite the limitation of forming a ternary complex of the template and two substrates, similar to the behavior of a much larger ligase ribozyme. Such a minimal, ribozyme-independent RNA self-reproduction may represent the first step in the emergence of an RNA-based genetic system from primordial components. Simultaneously, our examination of random RNA pools highlights the likelihood that complex species interactions were necessary to initiate RNA reproduction.
RNA reproduction has been demonstrated for ligase and recombinase ribozymes.13–16 For example, a ligase ribozyme derived from the R3C ligase ribozyme17 catalyzes the joining of two RNA substrates as a template to form a sequence identical to itself.13 This ribozyme is the simplest self-reproducing RNA known to date in terms of its length (61 nt) and the number of components (two fragments: 13 and 48 nt). However, the ribozyme is still relatively large and was rationally designed, and it remains unclear how such a ribozyme and its components could have been prevalent in prebiotically accessible RNA mixtures, which were likely dominated by shorter (up to ∼20 nt) and random oligonucleotides.18,19 The ligase ribozyme also requires 5′-triphosphate activation, which necessitates an additional set of complex reactions.20 Consequently, it has been proposed that the template-based self-reproduction of short RNA molecules independent of complex ribozymes may have emerged first in the RNA World.21,22
Self-reproduction of short nucleic acids has been studied mainly using DNA. Previous studies demonstrated the autocatalytic reproduction of chemically modified DNA oligonucleotides through template-directed ligation, although the reproduction was severely hindered by two tightly bound templates (or a template and its identical product after ligation).23–25 A recent study employed temperature cycling to overcome such template inhibition in the reproduction of chemically activated DNA.26 Despite these efforts with DNA, the self-reproduction of short RNA oligomers has not yet been demonstrated. Moreover, temperature cycling would be incompatible with RNA because, unlike DNA, RNA easily degrades at high temperature, a process accelerated by divalent metal ions27,28 that commonly enhance ribozyme catalysis28,29 as well as template-directed RNA synthesis.30,31
Template-directed ligation of short RNA is achieved in the laboratory using terminal activation such as with 2′,3′-cyclic phosphate (>p),30 which readily forms in prebiotically plausible environments,32,33 while recombination occurs directly—or in combination with spontaneous >p formation through hydrolysis of RNA, termed α/α′ mechanisms.31,34 Notable are recent studies that demonstrated that pools of short random RNA can undergo diverse intermolecular ligation and recombination, presumably in a templated manner.31,35 In these populations, RNA products that form efficiently or that self-amplify are expected to be enriched. Thus, a close examination of the enriched products may lead to the discovery of efficient RNA reproduction via template-directed ligation or recombination. The identification of such reproducing RNA would also provide insights into the likelihood of the emergence of self-reproduction out of random chemistry.
In this study, we first examined spontaneous ligation and recombination reactions in pools of short random RNAs and found that they can be detected more quickly than previously demonstrated. We observed the enrichment of RNA families with common motifs in multiple RNA pools. Subsequent analyses of the most enriched products and their variants led us to find a short (20 nt) RNA oligomer that can self-reproduce via template-directed ligation of two 10 nt substrates. The RNA contains a 2′–5′ phosphodiester bond, a linkage usually generated during non-enzymatic RNA synthesis.36–38 Partly due to the non-canonical linkage, the RNA circumvented its dimerization and displayed a potential for exponential reproduction in an isothermal environment, although the restricted formation of an active complex with the substrates limited the amplification. Its autocatalytic properties and structures are somewhat similar to those of the previously developed self-reproducing ligase ribozyme.13 These results demonstrate the first example of minimal RNA self-reproduction independent of a ribozyme and also help understand the dynamics of primordial, random RNA pools.
Fig. 1 Reactivity of N20 and N20>p RNA pools. (A) Incubation of 50 μM N20 or N20>p in 100 mM MgCl2 at 22 °C and pH 8.0 for 2 days, analyzed by 20% denaturing PAGE. RNA products (ca. 21–45 nt) were excised and subjected to RT-PCR. Note that RNA with >p migrates slightly faster. Relative band intensities of the indicated region are shown on the right. The dependence of band intensity on RNA length was not calibrated. (B) RT-PCR products analyzed by 15% native PAGE. Different PCR cycles were applied to the N20 and N20>p samples. (C) Length distribution of 21–45 nt products detected multiple times in the HTS analyses. (D) Nucleotide compositions of the products with each length for N20 (left) and N20>p (right) pools. The frequency of each nucleotide was represented by a linear combination of RGB values as in the previous study.35 The compositions of the original 20 nt pools were displayed for comparison. Arrowheads indicate putative ligation junctions.
To examine sequences enriched in the random RNA pools, we excised the elongated products (ca. 21–45 nt) in both N20 and N20>p pools from a denaturing polyacrylamide gel and subjected them to RT-PCR and high-throughput sequencing (HTS). The RT-PCR was performed using the SMARTer technology, via poly-A tailing and subsequent template switching during reverse transcription. We detected PCR products for both N20 and N20>p pools only if they were pre-incubated for two days, confirming recombination and ligation during the incubation (Fig. 1B). From the HTS data, we analyzed 374357 and 412461 reads of 21–45 nt products that were detected at least twice for the N20 and N20>p pools, respectively. The majority of the products derived from the N20 pool were 24–39 nt (95%) with a sharp drop-off above 39 nt (Fig. 1C), indicating that they were generated primarily by recombination, because a single recombination of two 20 nt RNAs could lead to a 21–39 nt product. On the other hand, the products in the N20>p pool were predominantly 24–40 nt (98%) with a sharp peak at 40 nt (Fig. 1C), suggesting that both recombination and ligation operated in the pool. It should be noted that recombination could occur either directly or indirectly via ligation on >p of a hydrolyzed RNA.31 The nucleotide compositions in the <40 nt products of the random RNA pools displayed a slight enrichment in G on both sides of a putative ligation junction between a cleaved RNA>p (<20 nt) and a 20-mer (Fig. 1D and S1,† in the direction indicated by black arrowheads), which was more evident in the N20 pool than in the N20>p pool. The results contrast with the previous studies that incubated random RNAs in ice and without MgCl2, where cytosine and/or uracil were particularly enriched as putative phosphate donors.31,35 The predicted secondary structures of the products tended to be more stable than those of random sequences of the same sizes and nucleotide compositions (Fig. S2†), consistent with a previous study.35
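The length argument above can be made concrete with a short sketch. It assumes, as a simplification, that a single event either joins a hydrolyzed donor fragment of 1 to 19 nt (ending in >p) to an intact 20-mer, or ligates two intact 20-mers; the function names are hypothetical and not taken from the study.

```python
POOL_LEN = 20  # length of each oligomer in the N20 and N20>p pools


def recombination_lengths(pool_len=POOL_LEN):
    """Product lengths when a cleaved donor (1..pool_len-1 nt, ending in >p)
    is joined to an intact pool_len-mer acceptor (assumed single event)."""
    return [donor + pool_len for donor in range(1, pool_len)]


def ligation_length(pool_len=POOL_LEN):
    """Product length when two intact oligomers are ligated end to end."""
    return 2 * pool_len


recomb = recombination_lengths()
print(min(recomb), max(recomb))  # shortest and longest single-recombination products
print(ligation_length())         # product of direct ligation of two intact 20-mers
```

Under these assumptions, a 40 nt product can only arise from ligation of two intact 20-mers, which is consistent with the 40 nt peak reported for the N20>p pool and the drop-off above 39 nt for the N20 pool.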
Fig. 2 The most enriched RNA families. (A) Frequencies of the most enriched 20 RNA families in analyzed N20 or N20>p-derived products, sorted in descending order. Each panel represents families constructed based on sequence similarity around 5′ or 3′ terminus for N20 or N20>p. The arrowheads indicate N20-f1 and N20>p-f1. (B) Nucleotide compositions in sequences of N20-f1 (top) and N20>p-f1 (bottom). The sequence logos show the probability of each nucleotide at each position, calculated by ignoring the redundancy of each sequence (Fig. S3†). The black lines above indicate sites where a specific nucleotide is detected with a probability of >0.8. (C) A predicted secondary structure of f1-1, the most abundant sequence in N20-f1. Nucleotides detected with a probability of >0.8 are colored according to panel B. The commonly observed stem-loop structure is enclosed in the dotted line. The arrowhead indicates the putative recombination junction.
RNA sequences in N20-f1 and N20>p-f1 displayed a common stem-loop structure at positions 11–27 nucleotides from the 3′ end, with five consecutive base pairs and a seven-base loop (Fig. 2C). The stem-loop region contained the majority of the commonly observed nucleotides, as represented in the most dominant sequence in N20-f1, named f1-1. Secondary structural prediction showed the same stem-loop structure at the same positions in 68% and 52% of ≥27 nt sequences in N20-f1 and N20>p-f1, respectively. In addition, only 7% of the RNAs in either family could form more than five base pairs in the stem region, underscoring the dominance of the specific stem-loop structure.
The enrichment of RNA families with shared nucleotides and structures in the random RNA pools encouraged us to investigate how these sequences could have been synthesized. As they were observed in both N20 and N20>p pools, they should form via recombination. The conserved 3′ region in the RNA of varying lengths, in conjunction with the current understanding of recombination mechanisms, suggests a two-step α/α′ recombination, wherein hydrolysis forms >p at the 3′ end of one RNA, followed by ligation of the 5′-OH of another RNA to the >p.31 If the ligating RNA is 20 nt long, as in the original pools, the probable recombination junction was between the oft-observed C and U at positions 20 and 21 from the 3′ end. We first tested whether f1-1 (29 nt) can form through this mechanism by splitting f1-1 into the first 9 nt attached with >p (i.e., fragment A) and the remaining 20 nt (i.e., fragment B) (Fig. 3A) so they could undergo ligation, the second step of α/α′ recombination. In a 2 day incubation of A and B, we detected f1-1 with ∼0.2% yield (Fig. 3B and C). It is important to note that this reaction may not strictly reflect what happened in the original random RNA pools because other RNAs could have been involved.
We also tested recombination directly by attaching 11 nt random nucleotides to A (AN11). Incubation of AN11 with B did generate a distinguishable product whose length is similar to, but slightly longer than, f1-1 (Fig. S5A and S5B†). Sequence analysis of the product revealed that it was predominantly f1-1 with a G inserted between positions 20 and 21, named f1-1G (Fig. S5C†). We confirmed that the addition of a G at the 3′ end of A (AG) (Fig. 3A) significantly enhanced its ligation with B (Fig. 3B and C). We also examined the effect of other nucleotides A, U, or C at the same position (AA, AU, or AC) for ligation with B. The fragment AA exhibited improved ligation, though less efficiently than AG, whereas AU and AC did not show enhanced ligation (Fig. S6†). These variant RNAs were not detected in the products derived from the N20 and N20>p pools, despite differing from f1-1 by only a single nucleotide and having a high capacity for synthesis, highlighting the difficulty of understanding reactions in random RNA mixtures based on an examination of only a small number of isolated RNAs.
Ligation between >p of AG and BS could generate two possible phosphodiester bonds, either 3′–5′ or 2′–5′ linkages (Fig. 4A). Using ribonuclease (RNase) T1, which selectively cleaves G3′-p-5′N linkages of unpaired nucleotides, we determined that the ligation catalyzed by TG primarily formed a 2′–5′ linkage (Fig. S9†). Next, we prepared TG containing a 2′–5′ linkage at the ligation junction and named it TG′. We confirmed that TG′ catalyzed the same ligation reaction to generate more of itself (Fig. 4C and S9†), demonstrating true self-reproduction (Fig. 4B), although the extent of catalysis was approximately half that of TG (Fig. 4D). Whereas previous studies found that RNA containing a fraction of 2′–5′ linkages can assist non-enzymatic RNA polymerization39 and retain functions as aptamers or ribozymes,40 our study further showed that such RNA can also self-reproduce. A time course experiment revealed the gradual appearance of TG′, with the reaction slowing after a 2 day (48 h) incubation (Fig. 4E, F and S10†). The yield of TG′ increased with the initial concentration of TG′, demonstrating its autocatalytic ability. The ligation between >p of AG and BS was confirmed by control reactions performed in the absence of >p or BS, which showed negligible TG′ reproduction (Fig. S11†). We also found that the self-reproduction of TG′ was substantially enhanced at a high concentration of Mg2+ (100 mM MgCl2) and temperatures around 22 °C (Fig. S12†), the condition used for incubating the original random RNA pools (Fig. 1A).
Next, we examined the formation of higher-order complexes among AG, BS, and TG′ by native PAGE after co-incubating one, two, or three of these RNAs containing fluorescently labeled TG′ (FAM-TG) or AG (FAM-AG) for 6 h (Fig. 5A). In this experiment, AG contained a monophosphate (-p) instead of >p at the 3′ end to preclude ligation to BS (Fig. S11†). When incubating only TG′, we found that the majority of TG′ existed as a TG′ monomer, with only a fraction (∼11%) forming a TG′·TG′ dimer (Fig. 5B). A TG′·TG′ dimer is presumably a simple self-complementary template dimer (Fig. S14†), but two TG′ molecules may also interact by forming a kissing loop. The limited formation of the TG′·TG′ dimer was partly due to the 2′–5′ linkage, which significantly reduced the dimerization of TG′ (Fig. S13†), consistent with previous studies showing the diminished thermal stability of RNA duplexes in the presence of 2′–5′ linkages.40,41 The amount of TG′·TG′ increased to 23–27% in the presence of either AG or BS. However, in the presence of both AG and BS, the total amount of the TG′·TG′ dimer and a TG′·AG·BS ternary complex decreased to ∼3.8%. When incubating the three RNA molecules with FAM-AG, we detected the formation of a comparable amount of the TG′·AG·BS complex. In addition, we found that the majority (∼80%) of AG was bound to BS, and thus most of the substrates were not freely available, which could explain the low percentage of the TG′·AG·BS complex formation and the limited self-reproduction of TG′ (Fig. 4E).
The high availability of TG′ as a monomer implies its potential to undergo non-linear amplification by circumventing the strong association of two self-complementary TG′ molecules that form after ligation of AG and BS (Fig. 4B). A common way of examining such a possibility for a template (or an autocatalyst) is to fit the initial rate of its own production to the model of self-reproduction:13,14,23,24,42
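In the cited template-replication literature, this model is commonly written in the following form. The notation here is an assumption for illustration and may differ from the authors' own: $[T]$ is the initial template concentration, $k_a$ and $k_b$ are the apparent autocatalytic and template-independent rate constants, and $p$ is the autocatalytic reaction order.

```latex
\frac{\mathrm{d}[T]}{\mathrm{d}t} = k_a [T]^{p} + k_b
```

In this form, $p \approx 0.5$ indicates parabolic growth limited by template dimerization, whereas $p \approx 1$ is consistent with exponential growth.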
The RNA molecule TG′ shares many similarities with the previously engineered 61 nt self-reproducing ligase ribozyme,13 although they catalyzed different ligation chemistries (Fig. S14†). The ribozyme catalyzes the attack of the 3′-OH of an RNA substrate on a 5′ triphosphate of another substrate in a template-directed manner and generates a ligated product identical to the ribozyme. Its self-reproduction was limited because of the strong association of the two substrates, as is also observed in TG′ (Fig. 5A). Nevertheless, both systems exhibited high apparent autocatalytic reaction order (∼1) in an isothermal environment as a consequence of the weak self-binding of the templates, compared to other nucleotide-based template-directed self-reproduction systems that showed an order of ∼0.5.23–25 This could be partly attributed to the intramolecular structural formation of a template, G:U wobble pairs that can facilitate template-directed ligation while supporting dissociation of a duplex,43 and multiple thermodynamically unfavorable bulges in a dimer,44 all of which are commonly observed in both TG′ and the ligase ribozyme (Fig. S14†).
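The distinction between an order of ∼1 and ∼0.5 can be illustrated with a minimal sketch of how a reaction order might be estimated from initial rates. It assumes the widely used rate law v = k_a·[T]^p + k_b and a separately measured background rate k_b. All numbers below are synthetic placeholders, not data from this study, and the function names are hypothetical.

```python
import math


def initial_rate(template, k_a, p, k_b):
    """Assumed rate law: autocatalytic term plus template-independent background."""
    return k_a * template ** p + k_b


def estimate_order(templates, rates, background_rate):
    """Least-squares slope of log(v - k_b) versus log([T]).

    Returns (reaction order p, apparent autocatalytic constant k_a)."""
    xs = [math.log(t) for t in templates]
    ys = [math.log(v - background_rate) for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, math.exp(intercept)


# Synthetic initial template concentrations (arbitrary units) and rate constants.
templates = [0.5, 1.0, 2.0, 4.0]
K_A, K_B = 0.02, 0.005

exponential_rates = [initial_rate(t, K_A, 1.0, K_B) for t in templates]
parabolic_rates = [initial_rate(t, K_A, 0.5, K_B) for t in templates]

order_exp, ka_exp = estimate_order(templates, exponential_rates, K_B)
order_par, ka_par = estimate_order(templates, parabolic_rates, K_B)
print(round(order_exp, 3), round(order_par, 3))
```

With noise-free synthetic input, the log-log slope simply recovers the order used to generate the rates. With real measurements, a nonlinear fit of all three parameters would usually be preferable to subtracting a fixed background.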
The limited self-reproduction of TG′ resulted from multiple factors. First, the 2′–5′ linkage, while reducing the dimerization of TG′, decreased the ligation efficiency (Fig. 4D). Second, TG′ did not efficiently form an active complex with the substrates AG and BS because most of the two substrates bound to each other and were not freely available (Fig. 5A). These limitations may be overcome if strong chemical activation is adopted instead of >p or in environments that periodically experience low pH, high temperatures, or low MgCl2 concentrations, which destabilize RNA–RNA interactions (e.g., the association of substrates).45–47 Alternatively, as demonstrated for a self-reproducing ligase ribozyme,48 directed evolution with TG′ as the parent RNA may also identify highly efficient reproduction of oligonucleotides in a constant environment. It was shown that only a slight difference, including two critical mutations, was sufficient to convert the original ligase ribozyme13 (Fig. S14†) into a continuously self-reproducible RNA.49 Thus, it is conceivable that there may be a short RNA oligonucleotide capable of unlimited self-reproduction, in a sequence space accessible from TG′ by natural selection.
Our results also give insights into the dynamics of short random RNA mixtures. From a completely random pool of 20-mers, we identified a discrete class of related, enriched sequences of which f1-1 appeared to be a canonical representative. The fragment TG′ is a truncated version of f1-1G, a single-mutation variant of f1-1. Both f1-1G and f1-1 were accessible products in both N20 and N20>p pools explored in the present study. However, while f1-1 was highly enriched in both random RNA pools along with many related sequences (e.g., N20-f1 and N20>p-f1), f1-1G was undetected even at a low frequency. On the other hand, biochemical analyses revealed the superiority of f1-1G to f1-1 for its formation through simple ligation of two substrate fragments (Fig. 3B and C). This discrepancy may imply the involvement of other RNA species for the synthesis of f1-1 in the random RNA pools. In the chaos of primordial soup, it is without question that a complex ecology of chemical reactions must have given rise to enriched species sets.50,51 A previous study also reported the inefficient synthesis of some products isolated from random RNA pools.35 Altogether, our results highlight the difficulty of inferring dominant reactions in random RNA mixtures from the analyses of isolated sequences. Nevertheless, the information obtained from examining the random RNA products was valuable in the discovery of the minimal self-reproducing RNA, which exhibited its highest activity in the original environment to which the random RNA pools were exposed (Fig. S12†). Future experiments exploring the synthesis of f1-1, f1-1G, or TG′ in combination with random RNA mixtures would give more insights into the likelihood of the emergence of self-reproduction in a primordial RNA soup.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3sc01940c
This journal is © The Royal Society of Chemistry 2023