This Open Access Article is licensed under a Creative Commons Attribution 3.0 Unported Licence

Elucidating the solution structure of the monomolecular BCL2 RNA G-quadruplex: a new robust NMR assignment approach

Zenghui Wang,a Carla Ferreira Rodrigues,a Simon Jurt,a Alicia Domínguez-Martín,b Silke Johannsen*a and Roland K. O. Sigel*a
a Department of Chemistry, University of Zürich, 8057 Zürich, Switzerland. E-mail: silke.johannsen@chem.uzh.ch; roland.sigel@chem.uzh.ch
b Department of Inorganic Chemistry, Faculty of Pharmacy, University of Granada, 18071 Granada, Spain

Received 21st February 2025, Accepted 26th March 2025

First published on 26th March 2025


Abstract

5′ untranslated regions (UTRs) of mRNA commonly feature G-quadruplexes (G4s), crucial for translational regulation and promising as drug targets to modulate gene expression. While NMR spectroscopy is well-suited for studying these motifs' structure and dynamics, their guanine-rich nature complicates resonance assignment due to high signal overlap. Exploiting the inherent rigidity of G4 cores, we developed a universally applicable assignment strategy for uniformly isotopically enriched G4 structures, relying solely on through-bond correlations to establish the G-tetrads. Applying this approach, we resolved the solution structures of two triple mutants of the RNA G4 in the 5′ UTR of the human BCL2 proto-oncogene, one of the first natural monomolecular RNA G4 structures available to date. Comparative analysis with other RNA and DNA G4s reveals their notably compact and well-defined cores. Moreover, the sugar pucker geometries of the tetrad guanines are far less stringent than previously assumed, adeptly accommodating specific structural features. This contrasts with the canonical base pairing in RNA and DNA, in which the sugar pucker dictates the type of the double-helical structure. The strategy presented provides a direct path to uncovering G4 structural intricacies, advancing our grasp of their biological roles, and paving the way for RNA-targeted therapeutics.


Introduction

The discovery of G-quadruplexes (G4s) in guanine-rich DNA and RNA sequences in 1987 (ref. 1 and 2) constituted a revolutionary advance in nucleic acid research, opening up a new field of study into the structural diversity of these motifs and their individual functional roles.3–6 Originally identified in telomeres, G4 structures have since been found throughout the human genome, with over 700 000 putative DNA G4 structures reported.7 Despite extensive in vitro studies, the in vivo existence of DNA and RNA G4 structures was not confirmed until 2014 (ref. 8–10). While initial research focused mainly on DNA G4s, recent advancements have revealed the pivotal roles of mRNA G4s in regulating translation, mRNA processing, transcription termination, mRNA localisation, and alternative splicing.11–14 Over 13 000 putative RNA G4 sequences have been identified in the human transcriptome,15 of which 500 are in 5′ untranslated regions (5′ UTRs), impacting the translation efficiency of key proteins such as NRAS,16 human VEGF,17,18 TGFβ2,19 and Bcl-2.20

G4s consist of a rigid core and flexible interconnecting loops, as shown in Fig. 1B. The core is formed by π–π stacking of tetrads, each a coplanar cyclic arrangement of four Hoogsteen-paired guanines. The core is additionally stabilised by metal ions coordinating the partially negatively charged carbonyl groups of the guanines in the tetrad, thereby forming a metal ion channel in the centre.21 In particular, monovalent metal ions have a high stabilising effect, with potassium(I) having the highest stabilisation energy.22,23 Previously, RNA G4s were believed to be strictly parallel, with all four strands pointing in the same direction. Recent studies, however, have shown synthetic RNA G4 structures with antiparallel and hybrid connectivity,24–27 formerly ascribed only to DNA G4s.28 The guanine-rich regions frequently possess excess guanines beyond the requisite core formation, fostering highly dynamic systems in which guanines within the loops exchange with guanines in the core.29 This dynamic interplay of structures potentially plays a pivotal role in their regulatory function, but at the same time challenges structure elucidation.


Fig. 1 The BCL2 G-quadruplex. (A) Schematic representation of the BCL2 DNA sequence with the promoter regions P1, P2, and M. Exons are depicted in light grey, and introns are white. The BCL2 mRNA with the 5′ UTR G4 (light green), located 42 nucleotides upstream of the AUG start codon, is shown below. In addition, the sequences of the 25mer wt, the truncated 22mer wt and the two BCL2 G4 mutants A6A8U17 and A6U8U17 are shown. (B) Schematic representation of a G4 tetrad (left) and the G4 structure with sequence numbering of the core guanines and the mutated nucleotides (right). The nucleotides in the loops are depicted in grey, and the positions of the mutated nucleotides are framed in black. Metal ions are represented as purple spheres. (C) H1 region of the 1H NMR spectra (H2O/D2O (9:1), 2 mM KCl, pH 7, 298 K) of the different BCL2 G4 sequences, 25mer wt (0.1 mM), 22mer wt (0.4 mM) and the two triple mutants A6U8U17 (1 mM) and A6A8U17 (1.3 mM).

To date, roughly 500 G4 structures have been determined, of which around 90 are RNA G4s.30,31 These RNA structures were elucidated with X-ray crystallography and NMR (nuclear magnetic resonance) spectroscopy, and recently, the first cryo-EM structure of a synthetic aptamer sequence was released.32 Most determined RNA G4 structures are multi-molecular G4s,30,33 RNA aptamers or other synthetic constructs bound to proteins or ligands.26,32,34–46 Fewer than 10 of these roughly 90 structures are naturally occurring monomolecular G4s, one of which interacts with a large protein complex,47 while the others are stand-alone RNA G4s.48–51

The relatively small size of G4s (around 20–30 nt) makes NMR spectroscopy an ideal method for determining their structures and analysing their dynamics over different time scales.52,53 NMR structure determination of nucleic acids relies heavily on assigning 1H resonances and their through-space correlations.53,54 In G4 structures, however, the high guanine content leads to low signal dispersion. Furthermore, depending on the time scale of the dynamics, conformational polymorphism can cause severe line broadening and increased spectral overlap, complicating chemical shift assignment. For DNA G4s, this challenge is typically resolved through site-specific incorporation of isotope-enriched nucleotides via solid-phase synthesis until complete assignment is achieved.55 The corresponding isotope-enriched ribo-phosphoramidites have only been available for a few years, which prevented earlier application of this strategy to RNA.56 Nowadays, generating multiple RNA constructs for unambiguous assignment via solid-phase synthesis is more feasible, if still time-consuming, costly, laborious and highly dependent on the length of the studied RNA. In vitro transcription is a one-pot, length-independent, alternative method that provides high yields of in vitro folded RNA57 and the ability to insert a wide range of isotopically enriched or otherwise modified nucleotides in a nucleotide-uniform manner. However, NMR structural elucidation of G4s from in vitro transcribed RNA still lacks a robust assignment strategy to manage the high signal density and overlap effectively.

In this study, we present the NMR structure determination of two triple mutants (A6A8U17 and A6U8U17) of the monomolecular BCL2 RNA G4 structure, effectively capturing this G4 in a singular conformation. Employing in vitro transcribed and thus uniformly labelled RNA, we devised a novel assignment strategy that integrates intra- and inter-residual H1–H8 correlations based on a combination of NMR spectra to discern the Hoogsteen base pairing pattern of tetrads. This combination of well-established methods and newly adapted NMR pulse sequences enabled a straightforward and unambiguous assignment for non-site-specifically labelled G4s, a capability previously unavailable. Subsequent in-depth analysis and comparison with other triplanar RNA and DNA G4 structures reveal a unique feature of the core of the two BCL2 G4 mutants and provide new insights into the role of the sugar conformation in G4s. This study not only enhances our understanding of the BCL2 RNA G4 but also signifies a notable advancement in RNA G4 structural biology by facilitating NMR structure determination, a crucial step for obtaining more insights into these regulatory elements.

Results and discussion

Triple mutants trap the BCL2 G4 in a single conformation

In 2010, a highly sequence- and site-conserved guanine-rich sequence was discovered in the 5′ UTR mRNA of the human B-cell lymphoma gene 2 (BCL2) (Fig. 1A).20,58 This gene encodes Bcl-2, an anti-apoptotic protein intricately regulated both transcriptionally and post-transcriptionally.59 Misregulation of this protein has been associated with neurodegenerative diseases such as Parkinson's and Alzheimer's,60,61 as well as with multiple lymphomas and resistance to anticancer therapies.62,63 Furthermore, the BCL2 RNA G4 structure has been demonstrated to affect translation in vitro, making it a potential target for therapeutic intervention to regulate Bcl-2 protein levels.20 Indeed, three natural alkaloids (palmatine, nitidine, jatrorrhizine) with antiproliferative properties bind with high affinity to this G4,64 but their expensive synthesis poses a challenge for further research.65 Thus, a comprehensive understanding of the structural and dynamic features of this BCL2 RNA G4 is essential to identify additional potential drug molecules and to develop alternative treatments targeting this regulatory RNA.

Initial analysis of the 25-nucleotide guanine-rich sequence revealed the formation of a stable parallel G4.20 However, the study did not disclose the highly dynamic nature of this sequence, which is capable of adopting multiple G4 structures. In general, proton resonances at 10–12 ppm indicate characteristic G4 formation and originate from the H1 protons of the guanines inside the core.54 The lack of distinct signals in this region for the initial wild-type sequence (25mer wt, in Fig. 1C) suggests various interchanging conformers dependent on the incorporation of different guanines inside the core. With such dynamic behaviour, further analysis by NMR – let alone full structure elucidation – is not feasible. Therefore, we systematically restricted the intrinsic folding dynamics by reducing the number of excess guanines to obtain a single G4 conformation. An earlier study on a shortened wild-type sequence (22mer wt), in which two guanines at the 5′ end and one at the 3′ end were removed, also yielded a stable G4 structure.58 In contrast to the 25mer wt, the 22mer displays well-defined resonances in the respective region of the 1D 1H NMR spectrum (Fig. 1A and C), indicating a reduction in dynamics. The 22mer wt sequence consists of four segments with three to four consecutive guanines, likely forming a triplanar G4, for which twelve H1 proton signals are expected. The number of resonances observed, however, indicates the presence of at least two conformations based on the incorporation of alternative guanines from within either the four-guanine stretch or the loop regions (Fig. 1A). To prevent guanine exchange between the loops and the G4 core, the guanines in the loops (G6 to A6 and G17 to U17) and the first guanine of the four-guanine stretch (G8 to A8 and U8, respectively) were mutated, yielding the mutants A6A8U17 and A6U8U17 (Fig. 1A).
The 1H NMR spectra of the two triple mutant constructs indeed show well-resolved and sharp resonances corresponding to twelve H1, demonstrating that the G4 is trapped in a single triplanar conformation (Fig. 1C), an ideal case to establish a new assignment approach.
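The mutation design can be illustrated with a short sketch. The 22mer wild-type sequence used here is reconstructed from the description in the text (loop guanines at positions 6 and 17, a four-guanine stretch starting at G8) and should be read as an assumption, as should the helper names `mutate` and `excess_guanines`, which are purely illustrative.

```python
# Assumption: 22mer wt sequence rebuilt from the text, not copied from Fig. 1A.
WT_22MER = "GGGCCGUGGGGUGGGAGCUGGG"
CORE_GUANINES = 12  # three stacked tetrads x four guanines each


def mutate(seq, substitutions):
    """Apply point mutations given as {1-based position: new base}."""
    bases = list(seq)
    for pos, new_base in substitutions.items():
        if bases[pos - 1] != "G":
            raise ValueError(f"position {pos} is not a guanine")
        bases[pos - 1] = new_base
    return "".join(bases)


def excess_guanines(seq):
    """Number of guanines beyond the twelve needed for a triplanar core."""
    return seq.count("G") - CORE_GUANINES


A6A8U17 = mutate(WT_22MER, {6: "A", 8: "A", 17: "U"})
A6U8U17 = mutate(WT_22MER, {6: "A", 8: "U", 17: "U"})

for name, seq in [("22mer wt", WT_22MER), ("A6A8U17", A6A8U17), ("A6U8U17", A6U8U17)]:
    print(name, seq, "excess G:", excess_guanines(seq))
```

Under these assumptions the wild-type 22mer carries three guanines more than the core requires, whereas both triple mutants carry none, which is consistent with the single set of twelve H1 resonances.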

G4s tend to stack on top of each other (multimerisation), especially at high metal ion and nucleic acid concentrations. G-quadruplexes without flanking nucleotides (non-G nucleotides at 5′ or 3′ end) are even more prone to dimer formation.66 Here, RNA concentrations of ∼0.5–1.3 mM needed for high-resolution NMR spectra, together with a low K+ concentration of 2 mM, were applied. No indication of stable dimer formation upon varying RNA and salt concentration, or temperature (Fig. S2 and S3) was observed. Additionally, the hydrodynamic radius of both mutants matches that of the monomeric 22mer wt (Fig. S4).

CD spectra and melting curves confirm that at 2 mM KCl and pH 7, both mutants adopt parallel G4 structures with similar stability to the BCL2 G4 22mer and 25mer wild-type20,58 (Tm ∼ 60 °C, Fig. S1 and Table S1).

A universal assignment approach for G4 structures

The stacked tetrads are the central feature of all G4 structures, making them the ideal starting point for NMR assignment. While guanines in DNA G4s are typically unequivocally assigned by site-specific labelling, the assignment of guanines in RNA G4s relies primarily on intra- and interplanar H1–H8 NOE correlations. However, the limited signal dispersion poses a challenge, particularly in distinguishing between intra- and interplanar H1–H8 NOEs (see Fig. S10A and D). To overcome this problem, we directly link the guanines within a plane via their H1 and H8 protons using in vitro transcribed uniformly labelled 15N-G or 13C,15N-G RNA. We combine established JR-HMBC and H8N7N2-COSY spectra with one of the two newly adapted pulse sequences, H1N1N2-COSY or H1(N1C2)N2, enabling unambiguous assignment of all guanines in the core and a direct connection to the H1–H8 NOESY region. By focusing solely on through-bond correlations, ambiguity is significantly reduced as only intraplanar H1–H8 correlations are revealed – a major advance in NMR structure determination of RNA G4s, particularly those with intricate structures or those exhibiting polymorphism.

Here, we use the A6A8U17 mutant as an example: we start with a tentative assignment of the H1 protons in the 1D 1H spectrum from left to right (guanines Ga–Gm). Using the JR-HMBC spectrum (Fig. 2A, red line in top spectra), the H1 protons are subsequently connected to the intra-residue H8 protons via C5 through-bond correlations (top Fig. 2B), allowing the tentative H1 assignment to be transferred to the H8 protons.67–69 In a second step, we employed an H8N7N2-COSY (Fig. 2A, grey line to the bottom right spectrum) to correlate the H8 protons to the N2 nitrogens of neighbouring guanines via the N7(α)⋯H21(β)–N2(β) hydrogen bond (Fig. 2B).70 In principle, the O6(α)⋯H1(β)–N1(β) hydrogen bond could also be used to link two guanines. However, the small coupling constant across this connectivity pathway results in insufficient sensitivity, rendering it unsuitable for acquiring high-quality NMR spectra.54 To complete the assignment of the H8–H1 inter-nucleotide linkage, we adapted two pulse sequences to detect the intramolecular H1–N2 correlations:71,72 H1N1N2-COSY (Fig. S5) and H1(N1C2)N2 (Fig. S6) (see also Fig. 2A bottom and Fig. 2B middle). High-resolution spectra were obtained with 0.4 mM 15N-G-labelled and 13C,15N-G-labelled RNA samples, respectively, within 1 hour measuring time. H1N1N2-COSY high-resolution spectra were also obtained at 323 K (Fig. S7). Measuring at such high temperatures can substantially minimise the multimerisation effect, as G4 core stacking is reduced,73,74 which is particularly important when higher potassium(I) concentrations are required or when the G4 has a higher propensity for forming multimers. Using the H1N1N2-COSY (Fig. 2A, following the grey line to the bottom left spectrum), we established the intra-residual link between the H1–N2 and connected it to the inter-nucleotide N2–H8 correlation via the orange lines. The iterative process continues until all four nucleotides within each tetrad are assigned, as illustrated in detail in Fig. S8. In the case of the A6A8U17 BCL2 G4, the assignment of the three tetrads revealed the connections: Ga–Gf–Gb–Gd, Gc–Gh–Gg–Gi, and Ge–Gk–Gm–Gl (Fig. 3A, left). The intraplanar H1–H8 (Fig. S10A) and H1–H1 (Fig. S14) through-space correlations in the NOESY spectrum confirmed this assignment.
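The iterative walk described above amounts to following links until they close into a four-membered cycle. The sketch below is a minimal illustration and not the authors' software: the function name `trace_tetrads` and the input dictionary are hypothetical, and the walk direction within each tetrad is assumed to follow the order in which the tetrads are reported in the text. Each entry combines the three through-bond steps (H1 to H8 within a residue, H8 to N2 across the hydrogen bond, N2 to H1 within the neighbouring residue) into one link from a guanine to its Hoogsteen neighbour.

```python
def trace_tetrads(neighbour_of):
    """Group guanines into tetrads by following through-bond links.

    neighbour_of[x] = y means: the H8 of guanine x correlates, via N2,
    with the H1 of guanine y (y is the Hoogsteen neighbour of x).
    """
    assigned, tetrads = set(), []
    for start in neighbour_of:
        if start in assigned:
            continue
        cycle, current = [], start
        while current not in assigned:
            assigned.add(current)
            cycle.append(current)
            current = neighbour_of[current]
        # A valid tetrad returns to its starting guanine after four steps.
        if current != start or len(cycle) != 4:
            raise ValueError(f"links from G{start} do not close into a tetrad")
        tetrads.append(cycle)
    return tetrads


# Hypothetical link table, written to be consistent with the reported tetrads.
LINKS = {
    "a": "f", "f": "b", "b": "d", "d": "a",
    "c": "h", "h": "g", "g": "i", "i": "c",
    "e": "k", "k": "m", "m": "l", "l": "e",
}

for tetrad in trace_tetrads(LINKS):
    print("-".join("G" + g for g in tetrad))
```

Because only through-bond links enter the table, a walk can never jump to a neighbouring plane, which is the property that removes the intra- versus interplanar ambiguity of NOE-based assignment.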


Fig. 2 Unambiguous assignment of the tetrads. (A) The tetrad assignment is based solely on through-bond correlations, using three different spectra to establish both intra-residue (top panel, red lines) and inter-residue (bottom panel, orange lines) H1–H8 correlations. The intra-residue correlations are established via a JR-HMBC connecting the H1 (top left) and the H8 proton (top right) via the C5 carbon,67–69 and the inter-residue correlations via two HNN-COSY spectra. The H1N1N2-COSY (bottom left) connects H1 to N2 via a 2HJNN scalar coupling (Fig. S5), and the H8N7N2-COSY (bottom right) connects the H8 to N2 via J-coupling through the N7⋯H2–N2 hydrogen bond.70 The assignment is shown for A6A8U17 with the corresponding 1D 1H NMR spectra above, using a tentative proton assignment (a–m) for the twelve guanines of the core. Grey lines indicate the transfer of proton assignment to another spectrum. (B) Schematic representation of the consecutive steps required to establish the tetrad assignment. The protons circled in grey mark the detected protons, H1 and H8, respectively. The red lines mark the intra-residue correlations from H1 to H8 through C5, and the orange lines represent the inter-residue correlations from H1 to H8 through N2. The N7⋯H2–N2 hydrogen bond is marked in light blue. The JR-HMBC spectrum was recorded with 1.3 mM RNA, the HNN-COSYs using a 0.45 mM 15N-G-labelled RNA sample (H2O/D2O (9:1), 2 mM KCl, pH 7, 298 K).

Fig. 3 Sequential assignment and structural calculation of the A6A8U17 G-quadruplex construct. (A) H1–H1′ region of the [1H,1H]-NOESY (1.3 mM A6A8U17, H2O/D2O (9:1), 2 mM KCl, pH 7, 298 K) (left) used to assign the individual strands and subsequently to correctly order the tetrads in the core (right). Four line-types differentiate the individual strands, and tetrads are colour-coded with three shades of green. (B) Sequential walk region of the [1H,1H]-NOESY (1.3 mM A6A8U17, 100% D2O, 2 mM KCl, pD 7, 298 K), showing the correlation of the sugar H1′ with the aromatic H8/H6 protons (left). The assignment of the loop nucleotides (CCAUA) is shown in orange, which was used to assign the strands in the correct order and to replace the tentative assignment of the guanines with the correct sequence numbering. The G4 sequence of the A6A8U17 mutant and a schematic structure with the tetrad guanines in three shades of green and the loop nucleotides in orange and grey, respectively (right). (C) A graphical shorthand notation introduced by Banco and Ferré-D’Amaré89 without sugar pucker information to show the transition from tentative to sequential assignment (circled numbers indicate loop lengths) (top) and block view30 of the calculated A6A8U17 BCL2 mutant (PDB code: 7Q6L) (bottom).

In the next step, the individual tetrads are arranged to assemble the G4 core. Through H2O–D2O exchange experiments, we identified the middle tetrad (Ga–Gf–Gb–Gd) (Fig. S9) characterised by slower deuterium exchange of the H1 protons due to reduced solvent exposure.54 The rest of the core was compiled using H1–H1′ NOE correlations, akin to the sequential walk (Fig. 3A, black lines in the spectrum). Unlike the crowded sequential walk regions H6/H8–H1′ and H6/H8–H2′ (Fig. 3B, S10C and S11), which were used additionally to verify the assignment, this NOESY region only displays guanine signals from the tetrads. Applying this approach, we established the order of the guanines along the four edges (5′–Gc→Gd→Gl–3′, 5′–Gg→Gf→Gk–3′, 5′–Gh→Ga→Ge–3′ and 5′–Gi→Gb→Gm–3′) and arranged the tetrads accordingly to form the core. The tetrad arrangement was further corroborated by interplanar H1–H8 NOEs (Fig. S10D) and by correlations between the sugar H2′ and the H1 of the stacking guanine from the left neighbouring strand, indicated by dashed black lines in Fig. S10E and F. This latter correlation illustrates the right-handed twist of the G4 core, a structural characteristic also typical for A- and B-helical nucleic acid structures.75
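The stacking order follows once tetrads and strands are both known: every strand must contribute exactly one guanine to each tetrad. The sketch below checks this and derives the 5′ to 3′ order of the tetrads. It is an illustration only. The function name `order_tetrads` is hypothetical, and the strand starting at Gc is taken to end in Gl, the only guanine of the Ge–Gk–Gm–Gl tetrad not placed on another strand.

```python
def order_tetrads(tetrads, strands):
    """Return the tetrads ordered 5'->3' based on the strand walks."""
    ordered = []
    for layer in range(3):
        # Guanines found at the same position along each of the four strands.
        members = {strand[layer] for strand in strands}
        matches = [t for t in tetrads if set(t) == members]
        if len(matches) != 1:
            raise ValueError(f"layer {layer} does not match exactly one tetrad")
        ordered.append(matches[0])
    return ordered


TETRADS = [["a", "f", "b", "d"], ["c", "h", "g", "i"], ["e", "k", "m", "l"]]
STRANDS = [  # each strand listed 5'->3'
    ["c", "d", "l"],
    ["g", "f", "k"],
    ["h", "a", "e"],
    ["i", "b", "m"],
]

for position, tetrad in zip(("5' tetrad", "middle tetrad", "3' tetrad"),
                            order_tetrads(TETRADS, STRANDS)):
    print(position, "-".join("G" + g for g in tetrad))
```

The middle layer obtained this way is Ga–Gf–Gb–Gd, matching the tetrad independently identified by its slow H2O–D2O exchange.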

Finally, we replaced the tentative assignment with the correct sequence numbering by identifying the connections between the G-tracts and the unique loop sequences. A6A8U17 contains three loops with different lengths and sequences (CCAUA, U, and AUCU, respectively; Fig. 3B). In principle, a single connection, e.g. a G to two consecutive cytosines (C4 and C5), is sufficient to convert the tentative into the sequential assignment. In the present case, we could trace correlations over an entire loop with the sequential assignment of the first two G-tracts connected by the CCAUA loop (orange): 5′–Gh–Ga–Ge–C4–C5–A6–U7–A8–Gg–Gf–Gk–3′ (Fig. 3B). The base identities of the loop nucleotides were determined using [13C,1H]-HSQC experiments and then mapped to the NOESY sequential walk region (H6/H8–H1′ region), with the complete sequential assignment shown in Fig. S11. The conversion from the tentative to the sequential assignment is depicted in Fig. 3C with Gh = G1, Ga = G2, Ge = G3, etc. The remaining proton resonances were assigned using additional 2D [1H,1H]-TOCSY and [13C,1H]-HMBC experiments. This comprehensive assignment process was performed for both mutants. Subsequently, the structural models of A6A8U17 (PDB ID 7Q6L)76 and A6U8U17 (PDB ID 7Q48)77 were calculated based on NOE-derived distance restraints, dihedral angle restraints, and the definition of hydrogen bonds in the G4 core (Table S2).
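The conversion from tentative letters to sequence numbers can be sketched as a simple bookkeeping step. In the example below, the A6A8U17 sequence is an assumption assembled from the loops and G-tract boundaries named in the text, and `split_tracts_and_loops` is a hypothetical helper. Only the two G-tracts whose tentative labels are spelled out above are renumbered.

```python
import re

# Assumption: sequence assembled from the loops (CCAUA, U, AUCU) and the
# G-tract boundaries named in the text.
A6A8U17 = "GGGCCAUAGGGUGGGAUCUGGG"


def split_tracts_and_loops(seq):
    """Split a G4 sequence into (kind, bases, first, last), 1-based numbering.

    Suitable for the mutants discussed here, whose loops contain no guanine.
    """
    segments = []
    for match in re.finditer(r"G+|[^G]+", seq):
        kind = "G-tract" if match.group().startswith("G") else "loop"
        segments.append((kind, match.group(), match.start() + 1, match.end()))
    return segments


segments = split_tracts_and_loops(A6A8U17)
tracts = [s for s in segments if s[0] == "G-tract"]

# Tentative labels of the first two strands, 5'->3', as traced via the CCAUA loop.
numbering = {}
for labels, (_, _, first, _) in zip([("h", "a", "e"), ("g", "f", "k")], tracts):
    for offset, label in enumerate(labels):
        numbering["G" + label] = f"G{first + offset}"

print(segments)
print(numbering)
```

With these inputs the first strand maps to G1–G3 and the second to G9–G11, in line with Gh = G1, Ga = G2 and Ge = G3.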

Analysis of the BCL2 structures and comparison to other G4s reveals a particularly compact core

Superimposing the 20 lowest-energy models out of the 200 calculated for A6A8U17 and A6U8U17 reveals remarkably similar structures for both mutants (see Fig. 4A and B), characterised by well-defined G4 cores and flexible loops (RMSD: 3.29 ± 1.03 Å (A6A8U17) and 3.02 ± 0.68 Å (A6U8U17)). Notably, the rigid cores adopt highly similar structures, as evidenced by the low RMSD (0.57 ± 0.18 Å) upon superposition. The relatively high overall RMSDs are primarily attributed to the high structural diversity of the two longer loops, CCAUA(U) and AUCU. Nevertheless, several local structural features can be distinguished. The first loop CCAUA(U) connects G3 and G9, where the sequence difference at position 8 between the two mutants appears to exert only a local structural effect. In particular, A8 in A6A8U17 stacks on G1 and G9 from the 5′ tetrad (see Fig. 4A, top), which is well reflected by strong correlations between A8H2 and both G1H1 and G9H1 in the NOESY spectrum (red box in Fig. S10A). In contrast, U8 in the A6U8U17 mutant lacks these long-range NOE correlations and instead displays only interactions with both neighbouring nucleotides (U7 and G9). This leads U8 to prominently splay away from the G4 core (Fig. 4B, top), which is consistent with the weaker stacking interactions of pyrimidine rings compared to purines.
Fig. 4 Structural analysis of the BCL2 G-quadruplexes and comparison to other parallel G-quadruplex structures. (A and B) Top and side view of the overlay of the 20 lowest-energy structures of 7Q6L and 7Q48, respectively. Backbone and ribose sugars are depicted in grey, guanines in green, adenines in red, uracils in cyan, and cytosines in yellow. Some loop nucleotides are transparent for better visualisation of the core. (C) Histogram of the pseudorotation phase angle P of the twelve guanines in the core of the two mutants (7Q6L in grey; 7Q48 in dark grey). The preferred range of P for the endocyclic sugar conformations C3′-endo (0–36°) and C2′-endo (144–180°) is shown in light grey.76 (D) Close-up of the single-nucleotide loop U12 that bridges a C1′–C1′ distance of 8.7 ± 0.2 Å (dashed line) to connect the G9–G11 strand to the G13–G15 strand. The sugar pucker in G11 adopts a C2′-endo conformation (top inset), while G13 (bottom inset) and all other guanine sugar puckers adopt a C3′-endo conformation. Oxygen and phosphorus atoms are shown in red and orange, respectively, while the carbon atoms are grey. The colours of the bases are the same as given in A. (E) Tetrad areas of the two BCL2 mutants 7Q6L and 7Q48 in comparison to the other monomolecular RNA G4s (7PS8 and 2LA5), the bi- and tetramolecular RNA G4s (2KBP, 2M18, 2AWE) and the monomeric, parallel DNA G4s. (F) Schematic representation of the core with the defined diagonal (dashed black line) (left) and a jitter plot (right) comparing the diagonals of the two BCL2 mutants 7Q6L and 7Q48 (green) with the other mono-, bi- and tetramolecular RNA G4s (grey) and the monomeric parallel DNA G4s (black). Filled diamonds indicate NMR structures and unfilled diamonds represent crystal structures. The PDB codes of all structures are shown in the figure.

In the AUCU loop, connecting G15 and G20, a significant number of long-distance NOEs are detected in both mutants (A16 to U19, and A16 to C18 in the case of the A6U8U17 mutant). However, even with the inclusion of these long-distance NOEs, the structural diversity could not be reduced significantly. In contrast to the two longer loops, the single-nucleotide loop U12 is predominantly stabilised in one orientation (Fig. 4A and B, bottom), pointing away from the G4 core, which is reflected in the NOESY data, where U12 shows correlations only with G11 and G13.

Analysis of the sugar conformation of the G4 core nucleotides via the pseudorotation phase angle P (ref. 78) shows that all guanines in both constructs adopt the typical C3′-endo sugar conformation, except for G11, which is exclusively in the C2′-endo conformation (Fig. 4C). Sugar puckers were restrained according to the TOCSY data (intensity of H1′–H2′/H3′, Fig. S13B). In the case of G11, the TOCSY intensities were ambiguous. However, the HSQC showed a far upfield H1′–C1′ resonance characteristic for a C2′-endo sugar conformation79 (red box in Fig. S13A). This C2′-endo sugar conformation is rather atypical for base-paired RNA regions, as it is destabilised by about 6 kcal mol⁻¹ compared to the C3′-endo for steric and electronic reasons.80 Interestingly, G11 is immediately followed by the single-nucleotide loop U12, which spans a C1′–C1′ distance of 8.7 ± 0.2 Å to reach G13, the next guanine on the bottom tetrad (Fig. 4D). Generally, the C2′-endo conformation allows for a longer intra-strand P⋯P distance (C2′-endo: 7 Å; C3′-endo: 5.9 Å),75,81 which probably explains the C2′-endo sugar conformation of G11. The C2′-endo conformation of G11 also enables the downward orientation of its C3′ carbon toward G13 (Fig. 4D, top inset).
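For readers who want to reproduce such a pucker classification, the phase angle can be computed from the five endocyclic ribose torsions. The sketch below uses the commonly cited Altona–Sundaralingam expression and the C3′-endo (0–36°) and C2′-endo (144–180°) windows quoted in Fig. 4. It is not necessarily the procedure or software used for the published analysis, and all function names and the amplitude of 38° in the self-check are assumptions.

```python
import math


def pseudorotation_phase(nu0, nu1, nu2, nu3, nu4):
    """Phase angle P in degrees (0 <= P < 360) from torsions nu0..nu4 (degrees).

    tan P = ((nu4 + nu1) - (nu3 + nu0)) / (2 * nu2 * (sin 36 + sin 72));
    atan2 selects the correct half-circle when nu2 is negative.
    """
    numerator = (nu4 + nu1) - (nu3 + nu0)
    denominator = 2.0 * nu2 * (math.sin(math.radians(36.0)) + math.sin(math.radians(72.0)))
    return math.degrees(math.atan2(numerator, denominator)) % 360.0


def classify_pucker(p):
    """Assign the pucker family using the windows quoted in the Fig. 4 caption."""
    if 0.0 <= p <= 36.0:
        return "C3'-endo"
    if 144.0 <= p <= 180.0:
        return "C2'-endo"
    return "other"


def ideal_torsions(p, nu_max=38.0):
    """Idealised torsions nu_j = nu_max * cos(P + 144 deg * (j - 2)), j = 0..4."""
    return [nu_max * math.cos(math.radians(p + 144.0 * (j - 2))) for j in range(5)]


# Self-check with idealised rings (hypothetical amplitudes, not BCL2 data).
for target in (18.0, 162.0):
    p = pseudorotation_phase(*ideal_torsions(target))
    print(f"P = {p:.1f} deg -> {classify_pucker(p)}")
```

Structures whose models fall outside both windows would be reported as "other", which is relevant for the conformations discussed in the comparison with published G4s.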

We expanded our analysis to include additional parameters for a comprehensive G4 core characterisation, encompassing the tetrad area and planarity. The tetrads were designated as tetrad I, II and III for the 5′-, the middle and the 3′-end tetrad, respectively. Calculation of tetrad areas based on N9 distances (Fig. S16A) revealed an intriguing pattern for both BCL2 constructs: tetrad I > tetrad II > tetrad III (Fig. 4E), with a decrease of approximately 10 Å² between each tetrad. Since the hydrogen bond lengths remain consistent across all three tetrads, they cannot be responsible for the size difference. Overlaying the three tetrads indicates that the guanine bases in tetrad III are more tilted out of the plane than in tetrads I and II (Fig. S16C). The planarity of the tetrads was assessed using the x3DNA-DSSR software,30 quantifying their deviation from perfect planarity. Despite employing the same low weighting in structural calculations, the planarity significantly differs between the planes (Table S3). Specifically, the middle tetrad exhibits significantly higher planarity than the other two (planarity deviation: tetrad II < tetrad I < tetrad III). While this observation is anticipated, given its sandwich position, it fails to elucidate the specific trend observed for the tetrad area. Further analysis of backbone torsion angles and geometric parameters yielded no correlations, leaving the underlying reason for this behaviour unknown.
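One plausible way to turn N9 distances into a tetrad area is to split the quadrilateral spanned by the four N9 atoms into two triangles along a diagonal and apply Heron's formula to each. The exact definition used for Fig. S16A is not given in this section, so the sketch below is an assumption, as are the function names and the idealised coordinates.

```python
import math


def heron(a, b, c):
    """Triangle area from its three side lengths."""
    s = (a + b + c) / 2.0
    return math.sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0))


def tetrad_area(n9):
    """Area spanned by four N9 atoms given in cyclic order around the tetrad.

    The quadrilateral is split along the 0-2 diagonal; only inter-atomic
    distances enter, so the result does not depend on the coordinate frame.
    """
    def d(i, j):
        return math.dist(n9[i], n9[j])

    return heron(d(0, 1), d(1, 2), d(0, 2)) + heron(d(0, 2), d(2, 3), d(0, 3))


# Idealised square with 10 A edges (hypothetical coordinates, not BCL2 data).
square = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (10.0, 10.0, 0.0), (0.0, 10.0, 0.0)]
print(f"area = {tetrad_area(square):.1f} A^2")
```

Because the two triangles need not be coplanar, this measure responds to both the in-plane N9 spacing and any buckling of the tetrad, which should be kept in mind when comparing areas between tetrads of different planarity.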

To assess whether the observed trends are typical characteristics of G4s, we compared the two BCL2 G4 constructs with other mono-, bi-, and tetramolecular RNA G4s, as well as parallel monomolecular DNA G4 structures. Among approximately 90 known RNA G4s, only a few are triplanar. We selected two bi- (2KBP, 2M18), one tetra- (2AWE), and two non-canonical monomolecular RNA G4s. One of the monomolecular RNA G4s (7PS8) is naturally occurring and similar in length, while the other (2LA5) is stabilised by a duplex. Additionally, we chose seven triplanar DNA G4s with similar sequence lengths (5I2V, 6IP3, 7E5P, 1XAV, 2KQH, 2M27, and 6YY4), and one structure stabilised by a hairpin (7CLS).

Sugar pucker analysis revealed the expected pattern, with RNA G4s primarily adopting the C3′-endo conformation, whereas DNA G4s predominantly have the C2′-endo sugar pucker (Fig. S17).12,82 Despite this tendency, our analysis disclosed numerous instances of the C2′-endo conformation in RNA G4s, especially near single-nucleotide loops and bulges. In addition, several NMR structures contain models with different sugar conformations for certain guanines as well as conformations other than the C2′-endo and C3′-endo sugar pucker. These observations show that the occurrence of C2′-endo conformations in RNA G4s is not exceptional and suggest that the adaptation of the sugar pucker is a general mechanism to accommodate specific structural features.

In the next step, we investigated whether the molecularity of the G4s or the preference for a sugar pucker influences the core structure. Analysis of the tetrad areas revealed no differences between RNA and DNA G4s, nor a consistent trend across the tetrads, suggesting that the decreasing trend observed in our structures is an exception rather than a general feature (Fig. 4E). We also determined the tetrad planarity and found that DNA G4s tend to have slightly more planar tetrads (Tables S3 and S4). However, a direct comparison is not meaningful, as X-ray structures generally show higher compactness and planarities. In addition, in NMR structure calculations, the planarities are often set manually.

The overall size of a G4 core mainly depends on the helical rise and the helical twist (Fig. S19), with the diagonal serving as a key metric for this dimension. This diagonal corresponds to the distance between the N9 atoms of a guanine in the 5′ tetrad and a guanine on the right-hand neighbouring strand of the 3′ tetrad and is the largest distance in the G4 core (Fig. 4F, left). The four diagonals of each structure were analysed to evaluate the core size. This approach has the advantage that the diagonal, in contrast to the helical rise and the helical twist, is a measure that can also be used to compare G4s with bulges, different strand directions or other non-canonical features. While a general comparison suggests no significant differences between mono-, bi- and tetramolecular RNA G4s, the two BCL2 structures with minimal diagonal lengths and a narrow distribution (A6A8U17: 13.9 ± 0.4 Å; A6U8U17: 13.8 ± 0.3 Å) stand out as notably compact and symmetrical. The two other monomolecular RNA G4s, 7PS8 and 2LA5, exhibit a less compact core and a distinctively larger distribution of diagonal lengths, most likely due to their non-canonical G4 nature. Unlike canonical G4s, having only consecutive guanines and short loops (<7 nucleotides),83 non-canonical structures contain additional structural features such as interruptions by bulges,84 longer loops or elongated 5′ and 3′ ends that can promote additional secondary structures.85 Among the RNA G4s, 7PS8 is most similar to the BCL2 structures (Fig. S17–S19). However, the presence of a three-nucleotide bulge in the second G-strand seems to push the guanine coming after the loop slightly out of the tetrad, affecting the corresponding diagonal. Comparing the two BCL2 structures with canonical monomolecular parallel DNA G4s underlines the exceptional compactness and symmetry of their cores. 
In DNA G4s, greater variability in sugar conformation may account for the larger core and the generally wide distribution of diagonal lengths.
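The diagonal metric described above can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the analysis script used for this work: the N9 coordinates of the 5′ and 3′ tetrads are supplied as two lists ordered by strand, so that strand i + 1 is the right-hand neighbour of strand i, and the spread is summarised as a sample standard deviation. The function names `core_diagonals` and `summarise` are hypothetical.

```python
import math
from statistics import mean, stdev

def core_diagonals(n9_five_prime, n9_three_prime):
    """Return the diagonals of a G4 core.

    Each argument is a list of (x, y, z) N9 coordinates, one per G-strand,
    given in the same strand order for both tetrads. Strand i + 1 is assumed
    to be the right-hand neighbour of strand i (caller's responsibility).
    """
    n = len(n9_five_prime)
    if n != len(n9_three_prime):
        raise ValueError("both tetrads need the same number of strands")
    return [
        math.dist(n9_five_prime[i], n9_three_prime[(i + 1) % n])
        for i in range(n)
    ]

def summarise(diagonals):
    """Mean and sample standard deviation of the diagonal lengths."""
    return mean(diagonals), stdev(diagonals)

if __name__ == "__main__":
    # Hypothetical idealised coordinates, not taken from 7Q6L or 7Q48.
    top = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
    bottom = [(x, y, -2.0) for x, y, _ in top]
    avg, spread = summarise(core_diagonals(top, bottom))
    print(f"{avg:.3f} +/- {spread:.3f}")
```

Because the calculation needs only one atom per core guanine, it carries over unchanged to G4s with bulges or mixed strand directions, provided the strand ordering is defined consistently.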

Conclusions

Acute lymphoblastic leukaemia (ALL) is the most prevalent form of cancer and leukaemia in children. The remarkable efficacy of Bcl-2 inhibitors in ALL treatments underscores the importance of reducing Bcl-2 protein levels to facilitate apoptosis. Currently, treatments are limited to agents that exclusively act at the protein level.86–88 However, the regulatory involvement of the 5′ UTR BCL2 mRNA G4 in Bcl-2 protein expression presents a promising alternative target for future drug development. A crucial first step in this direction is to understand the structural and dynamic properties of these small regulatory elements, which is best achieved by NMR spectroscopy. However, the highly dynamic nature of the BCL2 RNA G4 (25mer wt) leads to extensive resonance overlap and signal broadening. By truncating and mutating the sequence, we have restricted the G4 to a single conformation and solved one of the first monomolecular RNA G4 structures by applying a new approach for the unambiguous assignment of the core guanines. This method requires no site-specific labelling; instead, it uses uniformly isotope-labelled RNA together with three aligned NMR spectra. It is applicable to any G4 type and also holds promise for other Hoogsteen hydrogen-bonding motifs, such as A(GGGG)A hexads and CGG or CGA+ triplexes.

Although structural diversity has been dramatically expanded in recent years by non-canonical motifs,36,89 structural comparison with other published parallel triplanar G4s revealed no recognisable core features that are specific to RNA or DNA G4s or that distinguish mono- from bi- or tetramolecular structures. Even though RNA G4s favour C3′-endo and DNA G4s C2′-endo sugar puckers, the sugar pucker is less rigidly defined than in canonical base pairing, where it dictates the overall structure of the double helix as A- or B-type. Instead, sugar pucker adaptation allows the inclusion of restrictive structural features such as single-nucleotide loops and bulges.

The coexistence of multiple G4 structures of the wild-type BCL2 RNA G4 sequence in vitro, together with the highly dynamic loop structures observed in the two triple mutants, suggests a complex scenario in vivo. While this complexity poses a challenge, it also presents intriguing opportunities for drug development, as the intrinsic folding dynamics of G4 structures may influence their function by creating distinct molecular recognition sites for natural or artificial biomolecules, including binding proteins, miRNAs, or drug molecules. Therefore, further structural studies of the BCL2 RNA G4 should focus on the elucidation of the different conformers of the wild-type sequence and their dynamic relationship. The assignment strategy presented here, which allows unambiguous assignment of the core guanines, has the potential to elucidate such complex RNA G4 systems in the future, thereby advancing insights into their intrinsic dynamics. This knowledge is crucial for a better understanding of their biological function and is a first step towards new targeted cancer treatments involving G4s.

Data availability

Structure coordinates, NMR chemical shifts and spectral peak lists for the BCL-2 RNA A6A8U17 and A6U8U17 mutants have been deposited in the RCSB Protein Data Bank under the PDB IDs 7Q6L and 7Q48.76,77 NMR chemical shifts and spectral peak lists are also available from the Biological Magnetic Resonance Data Bank (BMRB) under the IDs 34676 and 34674. Other data supporting this article have been included as part of the ESI.

Author contributions

Z. W. and C. F. R. contributed equally to this work. Z. W., C. F. R., A. D. M., S. Jo., and R. K. O. S. conceived the project; A. D. M. developed, isolated, and measured the first NMR constructs; Z. W. and S. Ju. developed the methodology; Z. W. and C. F. R. performed the formal analysis, investigation, and validation of the data with assistance from S. Ju. and S. Jo.; S. Jo. and R. K. O. S. administered and supervised the project and helped to interpret the results; C. F. R. and S. Jo. wrote the original manuscript; all authors reviewed and edited the manuscript.

Conflicts of interest

There are no conflicts to declare.

Acknowledgements

We are grateful to the Swiss National Science Foundation (165868 to R. K. O. S.), the Candoc Grant of UZH (FK-14-100, FK-15-096 to A. D. M., FK-23-097 to C. F. R.), the student fellowship from the China Scholarship Council (No. 201506190127 to Z. W.), and the UZH core funding for financial support. We thank the MS facility of the Department of Chemistry for the MALDI-TOF measurements. We also thank Dr Susann Zelger-Paulus for carefully reading this manuscript and providing helpful suggestions.

References

  1. E. Henderson, C. C. Hardin, S. K. Walk, I. Tinoco and E. H. Blackburn, Cell, 1987, 51, 899–908.
  2. D. Sen and W. Gilbert, Nature, 1988, 334, 364–366.
  3. J. L. Huppert and S. Balasubramanian, Nucleic Acids Res., 2007, 35, 406–413.
  4. J. L. Huppert, A. Bugaut, S. Kumari and S. Balasubramanian, Nucleic Acids Res., 2008, 36, 6260–6268.
  5. N. Maizels and L. T. Gray, PLoS Genet., 2013, 9, e1003468.
  6. H. L. Lightfoot, T. Hagen, N. J. Tatum and J. Hall, FEBS Lett., 2019, 593, 2083–2102.
  7. V. S. Chambers, et al., Nat. Biotechnol., 2015, 33, 877–881.
  8. G. Biffi, D. Tannahill, J. McCafferty and S. Balasubramanian, Nat. Chem., 2013, 5, 182–186.
  9. G. Biffi, M. Di Antonio, D. Tannahill and S. Balasubramanian, Nat. Chem., 2014, 6, 75–80.
  10. A. Henderson, et al., Nucleic Acids Res., 2014, 42, 860–869.
  11. J. Song, J.-P. Perreault, I. Topisirovic and S. Richard, Translation, 2016, 4, e1244031.
  12. P. Agarwala, S. Pandey and S. Maiti, Org. Biomol. Chem., 2015, 13, 5570–5585.
  13. D. Varshney, J. Spiegel, K. Zyner, D. Tannahill and S. Balasubramanian, Nat. Rev. Mol. Cell Biol., 2020, 21, 459–474.
  14. K. Lyu, E. Y.-C. Chow, X. Mou, T.-F. Chan and C. K. Kwok, Nucleic Acids Res., 2021, 49, 5426–5450.
  15. C. K. Kwok, G. Marsico, A. B. Sahakyan, V. S. Chambers and S. Balasubramanian, Nat. Methods, 2016, 13, 841–844.
  16. S. Kumari, A. Bugaut, J. L. Huppert and S. Balasubramanian, Nat. Chem. Biol., 2007, 3, 218–221.
  17. M. J. Morris, Y. Negishi, C. Pazsint, J. D. Schonhoft and S. Basu, J. Am. Chem. Soc., 2010, 132, 17831–17839.
  18. D. Bhattacharyya, P. Diamond and S. Basu, Biochemistry, 2015, 54, 1879–1885.
  19. P. Agarwala, S. Pandey, K. Mapa and S. Maiti, Biochemistry, 2013, 52, 1528–1538.
  20. R. Shahid, A. Bugaut and S. Balasubramanian, Biochemistry, 2010, 49, 8300–8306.
  21. N. H. Campbell and S. Neidle, Met. Ions Life Sci., 2012, 10, 119–134.
  22. H. Guiset Miserachs, D. Donghi, R. Börner, S. Johannsen and R. K. O. Sigel, JBIC, J. Biol. Inorg. Chem., 2016, 21, 975–986.
  23. A. Włodarczyk, P. Grzybowski, A. Patkowski and A. Dobek, J. Phys. Chem. B, 2005, 109, 3594–3605.
  24. R. J. Trachman, et al., Acta Crystallogr., Sect. F: Struct. Biol. Cryst. Commun., 2019, 75, 547–551.
  25. S. C. Y. Jeng, et al., RNA, 2021, 27, 433–444.
  26. M. Mieczkowski, et al., Nat. Commun., 2021, 12, 3549.
  27. Y. S. Zhang, G. N. Parkinson and D. G. Wei, 6XRQ, https://www.rcsb.org/.
  28. N. G. Dolinnaya, A. M. Ogloblina and M. G. Yakubovskaya, Biochemistry, 2016, 81, 1602–1649.
  29. R. W. Harkness and A. K. Mittermaier, Biochim. Biophys. Acta, Proteins Proteomics, 2017, 1865, 1544–1554.
  30. X.-J. Lu, Nucleic Acids Res., 2020, 48, e74.
  31. H. Berman, K. Henrick and H. Nakamura, Nat. Struct. Biol., 2003, 10, 980.
  32. E. McRae, N. S. Vallina, B. K. Hansen, A. Boussebayle and E. S. Andersen, 7ZJ5, https://www.rcsb.org/.
  33. M. Malgowska, K. Czajczynska, D. Gudanis, A. Tworak and Z. Gdaniec, Acta Biochim. Pol., 2016, 63, 609–621.
  34. R. J. Trachman, et al., Structure, 2020, 28, 776–785.
  35. K. D. Warner, et al., Nat. Struct. Mol. Biol., 2014, 21, 658–663.
  36. P. Fernandez-Millan, A. Autour, E. Ennifar, E. Westhof and M. Ryckelynck, RNA, 2017, 23, 1788–1795.
  37. R. J. Trachman, et al., Biochemistry, 2018, 57, 3544–3548.
  38. L. Sjekloća and A. R. Ferré-D’Amaré, Cell Chem. Biol., 2019, 26, 1159–1168.
  39. R. J. Trachman, et al., Nat. Chem. Biol., 2019, 15, 472–479.
  40. L. F. M. Passalacqua, et al., Nat. Commun., 2023, 14, 2969.
  41. T. Mashima, et al., Nucleic Acids Res., 2013, 41, 1355–1362.
  42. T. Hayashi, et al., Nucleic Acids Res., 2014, 42, 6861–6875.
  43. H. Huang, et al., Nat. Chem. Biol., 2014, 10, 686–691.
  44. N. Vasilyev, et al., Proc. Natl. Acad. Sci. U.S.A., 2015, 112, E5391–E5400.
  45. L. Yatime, et al., Nat. Commun., 2015, 6, 6481.
  46. D. Koirala, et al., Nucleic Acids Res., 2018, 46, 2624–2635.
  47. J. Song, et al., Science, 2023, 381, 1331–1337.
  48. S. Balaratnam, et al., Cell Chem. Biol., 2023, 30, 643–657.
  49. J. Marquevielle and S. Amrane, 7PS8, https://www.rcsb.org/.
  50. C. A. Escobar, R. Petersen and S. E. Butcher, 8TNS, https://www.rcsb.org/.
  51. M. Orehova, J. Plavec and V. Kocman, ACS Omega, 2024, 9, 7215–7229.
  52. M. Marušič, J. Schlagnitweit and K. Petzold, ChemBioChem, 2019, 20, 2685–2710.
  53. M. Marušič, M. Toplishek and J. Plavec, Curr. Opin. Struct. Biol., 2023, 79, 102532.
  54. M. Adrian, B. Heddi and A. T. Phan, Methods, 2012, 57, 11–24.
  55. R. A. Hughes and A. D. Ellington, Cold Spring Harbor Perspect. Biol., 2017, 9, a023812.
  56. L. T. Olenginski, K. M. Taiwo, R. M. LeBlanc and T. K. Dayie, Molecules, 2021, 26, 5581.
  57. S. Gallo, M. Furler and R. K. O. Sigel, Chimia, 2005, 59, 812–816.
  58. D.-H. Zhang, et al., Biochemistry, 2010, 49, 4554–4563.
  59. R. J. Youle and A. Strasser, Nat. Rev. Mol. Cell Biol., 2008, 9, 47–59.
  60. T. Satou, B. J. Cummings and C. W. Cotman, Brain Res., 1995, 697, 35–43.
  61. O. Bar-Am, O. Weinreb, T. Amit and M. B. H. Youdim, FASEB J., 2005, 19, 1899–1901.
  62. B. Desoize, Anticancer Res., 1994, 14, 2291–2294.
  63. C. A. Schmitt and S. W. Lowe, Blood Cells, Mol., Dis., 2001, 27, 206–216.
  64. W. Tan and G. Yuan, Rapid Commun. Mass Spectrom., 2013, 27, 560–564.
  65. P. Sengupta, S. Chattopadhyay and S. Chatterjee, Drug Discovery Today, 2017, 22, 1165–1186.
  66. N. Q. Do and A. T. Phan, Chem.–Eur. J., 2012, 18, 14752–14759.
  67. A. T. Phan, J. Biomol. NMR, 2000, 16, 175–178.
  68. H. Martadinata and A. T. Phan, J. Am. Chem. Soc., 2009, 131, 2570–2578.
  69. A. T. Phan, M. Guéron and J. L. Leroy, Methods Enzymol., 2001, 338, 341–371.
  70. A. Majumdar, A. Kettani, E. Skripkin and D. J. Patel, J. Biomol. NMR, 1999, 15, 207–211.
  71. K. Lu, Y. Miyazaki and M. F. Summers, J. Biomol. NMR, 2010, 46, 113–125.
  72. L. G. Scott and M. Hennig, Methods Mol. Biol., 2008, 452, 29–61.
  73. P. Tóthová, P. Krafčíková and V. Víglaský, Biochemistry, 2014, 53, 7013–7027.
  74. S. Kolesnikova, M. Hubálek, L. Bednárová, J. Cvacka and E. A. Curtis, Nucleic Acids Res., 2017, 45, 8684–8696.
  75. W. Saenger, Principles of Nucleic Acid Structure, Springer-Verlag, New York, 1984.
  76. Z. Wang, S. Jurt, A. Dominguez-Martin, S. Johannsen and R. Sigel, 7Q6L, https://www.rcsb.org/.
  77. Z. Wang, S. Jurt, A. Dominguez-Martin, S. Johannsen and R. Sigel, 7Q48, https://www.rcsb.org/.
  78. C. Altona and M. Sundaralingam, J. Am. Chem. Soc., 1972, 94, 8205–8212.
  79. M. Riad, et al., Nat. Protoc., 2021, 16, 5146–5170.
  80. L. Li and J. W. Szostak, J. Am. Chem. Soc., 2014, 136, 2858–2865.
  81. L. J. W. Murray, W. B. Arendall, D. C. Richardson and J. S. Richardson, Proc. Natl. Acad. Sci. U.S.A., 2003, 100, 13904–13909.
  82. D. Yang, Methods Mol. Biol., 2019, 2035, 1–24.
  83. J. L. Huppert and S. Balasubramanian, Nucleic Acids Res., 2005, 33, 2908–2916.
  84. B. Pan, K. Shi and M. Sundaralingam, Proc. Natl. Acad. Sci. U.S.A., 2006, 103, 3130–3134.
  85. A. T. Phan, et al., Nat. Struct. Mol. Biol., 2011, 18, 796–804.
  86. A. W. Roberts, et al., N. Engl. J. Med., 2016, 374, 311–322.
  87. C. M. Adams, S. Clark-Garvey, P. Porcu and C. M. Eischen, Front. Oncol., 2018, 8, 636.
  88. E. Coustan-Smith, et al., Blood, 1996, 87, 1140–1146.
  89. M. T. Banco and A. R. Ferré-D’Amaré, RNA, 2021, 27, 390–402.

Footnotes

Electronic supplementary information (ESI) available: Experimental section, Fig. S1–S19, Tables S1–S4, structural analysis raw data. See DOI: https://doi.org/10.1039/d5sc01416f
These authors contributed equally to this work.

This journal is © The Royal Society of Chemistry 2025