Ol'ha O. Brovarets' *(a), Alona Muradova (b) and Dmytro M. Hovorun (a, b)
(a) Department of Molecular and Quantum Biophysics, Institute of Molecular Biology and Genetics, National Academy of Sciences of Ukraine, 150 Akademika Zabolotnoho Street, 03680, Kyiv, Ukraine. E-mail: o.o.brovarets@gmail.com
(b) Department of Molecular Biotechnology and Bioinformatics, Institute of High Technologies, Taras Shevchenko National University of Kyiv, 2-h Akademika Hlushkova Avenue, 03022, Kyiv, Ukraine
First published on 27th July 2021
At the MP2/6-311++G(2df,pd)//B3LYP/6-311++G(d,p) level of quantum-mechanical theory, we provide for the first time a comprehensive investigation of the physico-chemical mechanisms of 55 conformational transformations of the biologically important G·C nucleobase pairs: the Watson–Crick (WC), reverse Watson–Crick (rWC), Hoogsteen (H), reverse Hoogsteen (rH), wobble (w) and reverse wobble (rw) base pairs formed by the G and C bases in the canonical and rare tautomeric forms (“r” denotes the reverse configuration of the base pair). It was established that all these G·C nucleobase pairs can conformationally transform into each other without changing the tautomeric status of the G and C bases. These transitions occur through significantly non-planar transition states via the mutual rotation of the G and C bases relative to each other within the G·C nucleobase pair around the upper, middle or lower intermolecular H-bond: WC ↔ rWC, WC ↔ rwWC, rWC ↔ WC, rWC ↔ wWC, wWC ↔ rwWC, H ↔ rH, H ↔ rwH, rH ↔ H, rH ↔ wH, wH ↔ rwH. The Gibbs free energies of activation for these conformational transformations are 2.96–19.04/3.58–13.36 kcal mol−1 for the WC-type/H-type configurations, respectively (in vacuum under normal conditions, T = 298.15 K), which indicates that these transformations proceed quite fast. The conformational transformations are accompanied by the disruption and subsequent re-formation of the specific intermolecular contacts in the G·C nucleobase pairs (H-bonds and attractive van der Waals contacts). As a result, 76 conformers of the G·C nucleobase pairs were established: 48 base pairs in the WC, rWC, wWC and rwWC configurations and 28 base pairs in the H, rH, wH and rwH configurations, with relative Gibbs free ΔG/electronic ΔE energies in the range ΔG/ΔE = 0.00–44.73/0.00–46.99 and ΔG/ΔE = 0.00–37.52/0.00–38.54 kcal mol−1, respectively (in vacuum under normal conditions). Experimental investigation and verification of the novel G·C nucleobase pairs are promising tasks for future research.
Based on the obtained data, biologically important conclusions were drawn about the role of the conformational mobility of the G·C nucleobase pairs in the functioning of the DNA and RNA molecules and in their transitions from parallel to anti-parallel duplexes and vice versa.
Shortly after James Watson and Francis Crick established the spatial organization of the DNA molecule, the rare tautomeric hypothesis of spontaneous point mutagenesis was formulated.2,3 It considers the transition of the nucleotide bases from the main (canonical) into the rare (mutagenic) tautomeric forms through proton transfer as a main source of spontaneous point mutations or structural variability. Since then, prototropic tautomerism has remained an important topic for decades.4–14
In general, prototropic tautomerism has attracted close attention in different areas of research: drug design, proton transfer processes, formation of mispairs, the origin of spontaneous point mutations and the study of biologically important molecules. It also plays significant roles in the functioning of the DNA and RNA biomolecules.15–29 This opens new possibilities for understanding the fundamental mechanisms by which biomolecules function in a living cell.
The view that tautomeric transformations in biological molecules are inseparable from conformational transformations has recently become increasingly popular.30–36 Thus, it was recently found37–42 that intrapair proton transfer in the DNA base pairs causes their mutagenic tautomerization as well as conformational changes due to the mutual shifting of the bases relative to each other into the minor or major DNA groove. These theoretical approaches have been realized for model objects, which nevertheless correctly reflect the real state of affairs and could be confirmed experimentally.42 Such an approach significantly extends the biological importance of tautomerism and indicates that it goes far beyond the framework of the classical Watson–Crick tautomeric hypothesis,2,3 which considers tautomerism as the source of the spontaneous point mutations arising during DNA replication.
To date, while the conformational-tautomeric mobility of the classical A·T DNA base pair has been exhaustively explored,42–47 the conformational variety of the classical G·C DNA base pair has been insufficiently investigated. Recently, comprehensive research48,49 has been carried out on the tautomerization pathways of the reverse Löwdin G*·C*(rWC), Hoogsteen G*t·C*(H) and reverse Hoogsteen G*t·C*(rH) nucleobase pairs formed by the bases in the rare, in particular mutagenic, tautomeric forms (marked with an asterisk “*”; “r” denotes the reverse orientation of the base pair; “t” denotes the trans-orientation of the O6H hydroxyl group of the G base). These pathways proceed via single (SPT) or double (DPT) proton transfer along the neighboring intermolecular H-bonds, with Gibbs free energies of activation in the range 3.64–31.65 kcal mol−1 in vacuum under normal conditions (T = 298.15 K), and lead to the novel G·C*O2(rWC), G*N2·C(rWC), G*tN2·C(rWC), G*N7·C(H) and G*tN7·C(rH) conformers involving the canonical and rare tautomers of the G and C bases. These studies were followed by our research,49 devoted to the physico-chemical mechanisms of the tautomeric wobblization of the biologically important G·C nucleobase pairs G*·C*(rWC), G*t·C*(H) and G*t·C*(rH), which leads to novel wobble G·C nucleobase pairs. However, these representations evidently cannot explain the full range of the biological and structural properties of the nucleic acids, which remain far from fully understood.
This work is intended to show, using the example of different biologically important G·C nucleobase pairs (the classical Watson–Crick G·C(WC), reverse Watson–Crick G·C(rWC), Löwdin G*·C*(WC), reverse Löwdin G*·C*(rWC), Hoogsteen G·C(H) and reverse Hoogsteen G·C(rH), as well as the wobble Watson–Crick G·C(wWC), reverse wobble Watson–Crick G·C(rwWC), wobble Löwdin G*·C*(wWC), reverse wobble Löwdin G*·C*(rwWC), wobble Hoogsteen G·C(wH) and reverse wobble Hoogsteen G·C(rwH) base pairs), which contain the G and C bases in the main or rare tautomeric forms, that these base pairs are conformationally mobile structures and that rotations of the bases around the individual intermolecular H-bonds are closely interconnected with the tautomeric status of the base pairs.
So, the aim of this study is to establish the conformational pathways for the different G·C nucleobase pairs through the mutual rotation of the G and C bases around the intermolecular H-bonds: WC ↔ rWC, WC ↔ rwWC, rWC ↔ WC, rWC ↔ wWC, wWC ↔ rwWC; H ↔ rH, H ↔ rwH, rH ↔ H, rH ↔ wH, wH ↔ rwH.
Such a formulation of the task is quite promising, since it makes it possible to discover novel interpretations of the functional role of the so-called excited states, that is, different conformers of the G·C nucleobase pair involving the canonical and rare tautomers, to which biologically important functions are now attributed, especially in the RNA molecule.15–20
As a result of this investigation, in addition to the classical Watson–Crick G·C(WC) and Hoogsteen G·C(H) base pairs, a wide variety of novel base pairs involving the canonical and rare tautomers of the G and C bases was revealed for the first time, together with the physico-chemical mechanisms of their mutual conformational transformations. Altogether, 76 conformers of the G·C nucleobase pairs were discovered: 48 base pairs in the WC, rWC, wWC and rwWC configurations with relative Gibbs free ΔG/electronic ΔE energies in the range ΔG/ΔE = 0.00–44.73/0.00–46.99 kcal mol−1, and 28 base pairs in the H, rH, wH and rwH configurations with ΔG/ΔE = 0.00–37.52/0.00–38.54 kcal mol−1, in vacuum under normal conditions (T = 298.15 K).
Based on these data, an assumption was made about their possible biological role in the conformational transformations of DNA and RNA15–20 from the parallel to the anti-parallel orientation and vice versa50 without proton transfer or a change in the tautomeric status of the nucleobases.
The results obtained extend the existing ideas about the microstructural mechanisms of these processes, as well as about their functional role.
We have confirmed the local minima and TSs, the latter localized by the Synchronous Transit-guided Quasi-Newton method,59 on the potential energy landscape by the absence or presence, respectively, of one imaginary frequency in the vibrational spectra of the complexes. All reaction pathways have been reliably confirmed by optimizing structures close to the TS in the forward and reverse directions at the B3LYP/6-311++G(d,p) level of theory.
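The frequency criterion described above can be illustrated with a minimal sketch. It assumes the common convention that quantum-chemistry programs report imaginary frequencies as negative numbers; the function name and the example spectra are hypothetical and are not taken from the calculations of this work.

```python
from typing import Sequence


def classify_stationary_point(frequencies_cm1: Sequence[float]) -> str:
    """Classify a stationary point by its number of imaginary modes.

    Assumption: imaginary frequencies are encoded as negative values (cm^-1).
    """
    n_imaginary = sum(1 for f in frequencies_cm1 if f < 0.0)
    if n_imaginary == 0:
        return "minimum"
    if n_imaginary == 1:
        return "transition state"
    return "higher-order saddle point"


# Hypothetical, truncated vibrational spectra for illustration only.
print(classify_stationary_point([25.1, 48.7, 93.2]))    # no imaginary mode
print(classify_stationary_point([-24.2, 31.0, 77.5]))   # one imaginary mode
```

A structure with more than one imaginary mode is neither a minimum nor a first-order TS and would need to be re-optimized.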
All calculations have been performed in the continuum with ε = 1, which adequately reflects the processes occurring in real biological systems without depriving the bases of their structural and functional properties within the DNA or RNA molecules and satisfactorily models the substantially hydrophobic recognition pocket of the DNA-polymerase machinery as a part of the replisome.60,61
The Gibbs free energy G for all structures was obtained in the following way:

$$G = E_{\mathrm{el}} + E_{\mathrm{corr}} \qquad (1)$$
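As a minimal sketch of how eqn (1) is used in practice, the snippet below combines the two terms and converts a difference of Gibbs free energies from hartree to kcal mol−1. The function names and the numerical inputs are hypothetical placeholders; they are not values from this work.

```python
HARTREE_TO_KCAL_MOL = 627.509474  # 1 hartree expressed in kcal/mol


def gibbs_energy(e_el: float, e_corr: float) -> float:
    """Eqn (1): G = E_el + E_corr, all quantities in hartree."""
    return e_el + e_corr


def relative_gibbs_kcal(g: float, g_ref: float) -> float:
    """Gibbs free energy of a structure relative to a reference, in kcal/mol."""
    return (g - g_ref) * HARTREE_TO_KCAL_MOL


# Hypothetical energies (hartree) of a reference pair and of a second conformer.
g_reference = gibbs_energy(e_el=-937.500000, e_corr=0.150000)
g_conformer = gibbs_energy(e_el=-937.490000, e_corr=0.149000)
print(f"Relative G = {relative_gibbs_kcal(g_conformer, g_reference):.2f} kcal/mol")
```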
The atomic numbering scheme for the DNA bases is conventional.36 In this study rare tautomeric forms of the G and C nucleobases are marked by an asterisk.10
Altogether, 62 novel G·C nucleobase pairs with wobble (w) and reverse wobble (rw) geometries were revealed from the side of the Watson–Crick and Hoogsteen edges (ΔG/ΔE = 0.00–33.62/0.00–35.41 and 0.00–35.23/35.50 kcal mol−1, respectively). They are planar or significantly non-planar pairs stabilized by at least two H-bonds, one of which could involve the amino nitrogen atom of the G or C base (Table 1). Such a configuration of the base pairs necessarily entails their tautomerization via the protonated amino group as a transition state. These pairs will be the subject of detailed consideration in further investigations.
Altogether, for the G·C nucleobase pairs in the Watson–Crick (WC) and Hoogsteen (H) configurations involving the basic and rare tautomeric forms of the G and C bases, we have localized 55 transition states (34 for the G·C base pairs from the WC side and 21 for the G·C base pairs from the H side), which define the conformational transformations through the mutual rotations of the bases around the individual intermolecular H-bonds (Tables 1 and 2). These conformational transformations occur with or without a change in the general geometry of the base pair and involve the breaking and subsequent re-formation of the specific intermolecular contacts (H-bonds and van der Waals contacts), leading to the reorientation of the base pairs with cis-oriented N1H/N9H glycosidic bonds into the base pairs with trans-oriented N1H/N9H glycosidic bonds and vice versa: WC ↔ rWC, WC ↔ rwWC, rWC ↔ WC, rWC ↔ wWC, wWC ↔ rwWC, H ↔ rH, H ↔ rwH, rH ↔ H, rH ↔ wH, wH ↔ rwH (Tables 1 and 2).
Table notes: (a) imaginary frequency at the TS of the conformational transformation, cm−1; (b) relative Gibbs free energy of the formed G·C nucleobase pair (T = 298.15 K), kcal mol−1; (c) relative electronic energy of the formed G·C nucleobase pair, kcal mol−1; (d) relative Gibbs free energy of the TS of the conformational transformation (T = 298.15 K), kcal mol−1; (e) relative electronic energy of the TS of the conformational transformation, kcal mol−1; (f) dipole moment of the TS, D.

Conformational transformation | νi (TS) (a), cm−1 | ΔG (b), kcal mol−1 | ΔE (c), kcal mol−1 | ΔΔGTS (d), kcal mol−1 | ΔΔETS (e), kcal mol−1 | μTS (f), D |
---|---|---|---|---|---|---|
Watson–Crick (WC), reverse Watson–Crick (rWC), wobble Watson–Crick (wWC) and reverse wobble Watson–Crick (rwWC) configurations | ||||||
G·C(WC) ↔ G·C(rwWC) | 24.2 | 11.53 | 13.09 | 12.63 | 13.30 | 7.94 |
G+·C−(rWC) ↔ G·C(WC) | 37.2 | −14.55 | −15.33 | 6.93 | 7.06 | 4.12 |
G*·C*(WC) ↔ G*·C*(rWC) | 37.2 | 2.03 | 2.07 | 13.11 | 14.53 | 4.12 |
G*·C*(WC) ↔ G*·C*(rwWC/H) | 17.7 | 7.63 | 10.21 | 10.05 | 11.17 | 6.47 |
G*·C*(WC) ↔ G*·C*(rwWC) | 30.3 | 11.95 | 14.30 | 12.98 | 14.53 | 5.21 |
G*·C*(rWC) ↔ G*·C*(wWC/H) | 11.8 | 8.40 | 11.16 | 10.29 | 11.67 | 6.09 |
G*·C*(rWC) ↔ G*·C*(wWC) | 33.3 | 8.76 | 11.00 | 9.95 | 11.28 | 5.86 |
G·C*O2(rWC) ↔ G·C*O2(wWC) | 43.4 | 16.74 | 17.33 | 19.04 | 19.92 | 7.45 |
G*·C*(rwWC/H) ↔ G*·C*(wWC/H) | 18.8 | 2.80 | 3.02 | 7.49 | 7.16 | 3.94 |
G*t·C*(rwWC/H) ↔ G*t·C*(wWC/H) | 11.2 | 4.37 | 4.76 | 8.48 | 8.35 | 3.30 |
G*t·C*(rWC) ↔ G*t·C*(WC) | 24.0 | 1.34 | 1.54 | 5.23 | 6.28 | 1.08 |
G*t·C*(rWC) ↔ G*t·C*(wWC) | 14.6 | 2.74 | 4.24 | 4.13 | 3.97 | 2.10 |
G·C*tO2(rWC) ↔ G·C*tO2(wWC) | 36.4 | 9.20 | 9.84 | 12.28 | 13.65 | 7.40 |
G*t·C*t(WC) ↔ G*t·C*t(rwWC)↑ | 21.7 | 3.69 | 5.37 | 5.05 | 6.01 | 3.51 |
G*t·C*t(WC) ↔ G*t·C*t(rwWC) | 27.3 | 5.36 | 7.03 | 6.69 | 6.91 | 1.92 |
G·C*(wWC)↑ ↔ G·C*(rwWC) | 18.1 | 3.20 | 3.88 | 4.97 | 5.50 | 9.37 |
G·C*(rwWC) ↔ G·C*(wWC) | 24.2 | 1.83 | 2.37 | 7.36 | 7.73 | 10.57 |
G·C*(rwWC)↑ ↔ G·C*(wWC)↑ | 26.5 | 1.38 | 1.55 | 9.98 | 11.47 | 6.10 |
G·C*(rwWC)↑ ↔ G·C*(wWC) | 37.4 | 2.75 | 3.05 | 4.34 | 5.80 | 8.68 |
G*·C(rwWC)↑ ↔ G*·C(wWC/H) | 38.1 | 2.29 | 4.06 | 3.73 | 4.71 | 7.39 |
G*·C(rwWC)↑ ↔ G*·C(wWC)↓ | 29.8 | 3.32 | 4.09 | 8.09 | 9.14 | 6.05 |
G*·C(wWC)↓ ↔ G*·C(rwWC) | 36.5 | 2.08 | 2.81 | 2.96 | 2.91 | 5.63 |
G*·C*O2(wWC)↑ ↔ G*·C*O2(rwWC)↓ | 42.8 | 3.09 | 3.82 | 12.20 | 11.29 | 4.64 |
G*·C*O2(wWC)↑ ↔ G*·C*O2(rwWC/H) | 55.3 | 8.74 | 9.01 | 11.52 | 11.00 | 6.00 |
G*·C*O2(rwWC)↓ ↔ G*·C*O2(wWC) | 87.3 | 7.07 | 6.14 | 9.10 | 7.84 | 4.48 |
G*N2·C*(wWC)↓ ↔ G*N2·C*(rwWC)↓ | 36.5 | 1.68 | 1.00 | 13.90 | 15.15 | 2.62 |
G*t·C*O2(rwWC)↓ ↔ G*t·C*O2(wWC) | 72.6 | 8.04 | 8.49 | 9.55 | 9.27 | 1.75 |
G*·C*tO2(wWC)↑ ↔ G*·C*tO2(rwWC/H) | 51.0 | 8.10 | 8.45 | 11.26 | 11.09 | 5.39 |
G*·C*tO2(wWC)↑ ↔ G*·C*tO2(rwWC)↓ | 39.5 | 5.73 | 5.65 | 11.27 | 10.80 | 5.74 |
G*·C*tO2(rwWC)↓ ↔ G*·C*tO2(wWC) | 101.9 | 4.58 | 4.75 | 6.87 | 6.58 | 4.03 |
G*t·C*tO2(rwWC)↓ ↔ G*t·C*tO2(wWC) | 92.9 | 6.12 | 6.78 | 7.82 | 7.96 | 4.60 |
G*tN2·C*(wWC)↓ ↔ G*tN2·C*(rwWC)↓ | 28.3 | 3.04 | 3.02 | 11.90 | 13.01 | 3.01 |
G*tN2·C*(rwWC)↓ ↔ G*tN2·C*(wWC)↑ | 36.4 | 7.28 | 9.50 | 8.06 | 8.02 | 3.67 |
G*tN2·C*t(rwWC)↓ ↔ G*tN2·C*t(wWC)↑ | 28.9 | 10.08 | 11.62 | 9.92 | 9.99 | 4.01 |
Hoogsteen (H), reverse Hoogsteen (rH), wobble Hoogsteen (wH) and reverse wobble Hoogsteen (rwH) configurations | ||||||
G*·C*(rH) ↔ G*·C*(H) | 11.3 | 0.76 | 0.75 | 3.58 | 2.90 | 1.45 |
G*t·C*(H) ↔ G*t·C*(rH) | 25.8 | 2.59 | 2.72 | 11.35 | 12.71 | 3.33 |
G*t·C*(H) ↔ G*t·C*(rwH) | 19.5 | 5.28 | 7.08 | 8.37 | 8.98 | 7.54 |
G*t·C*(rH) ↔ G*t·C*(wWC/H) | 12.0 | 7.05 | 9.12 | 8.79 | 9.31 | 6.73 |
G*t·C*O2(wH)↑ ↔ G*t·C*O2(rwH)↓ | 48.0 | 3.93 | 4.69 | 8.53 | 9.22 | 4.25 |
G*t·C*O2(wH)↑ ↔ G*t·C*O2(rwH) | 104.7 | 4.84 | 6.47 | 9.87 | 11.67 | 7.58 |
G*t·C*O2(wH)↑ ↔ G*t·C*O2(rwWC/H) | 42.0 | 8.47 | 9.73 | 9.67 | 9.96 | 7.55 |
G*t·C*tO2(wH)↑ ↔ G*t·C*tO2(rwH)↑ | 32.2 | 6.55 | 6.58 | 10.62 | 10.82 | 6.28 |
G*t·C*tO2(wH)↑ ↔ G*t·C*tO2(rwH)↓ | 45.2 | 6.31 | 7.84 | 9.15 | 9.55 | 5.92 |
G*t·C*tO2(rwH)↓ ↔ G*t·C*tO2(wH)↓ | 14.2 | 7.28 | 9.78 | 8.89 | 9.99 | 8.04 |
G*t·C(wH)↓ ↔ G*t·C(rwH) | 26.7 | 2.45 | 1.23 | 4.74 | 3.84 | 3.41 |
G*t·C(rwH)↑ ↔ G*t·C(wH)↓ | 25.5 | 2.73 | 4.24 | 5.72 | 3.92 | 4.50 |
G*t·C(rwH)↑ ↔ G*t·C(wH)↑ | 32.2 | 2.00 | 3.34 | 3.67 | 3.64 | 8.84 |
G*N7·C*(rwH)↑ ↔ G*N7·C*(wH)↑ | 34.7 | 3.25 | 3.12 | 13.36 | 13.06 | 9.49 |
G*N7·C*(rwH)↑ ↔ G*N7·C*(wH) | 19.5 | 4.62 | 4.71 | 8.70 | 7.53 | 4.58 |
G*N7·C*t(wH)↑ ↔ G*N7·C*t(rwH)↑ | 46.0 | 6.43 | 6.23 | 12.63 | 12.92 | 9.89 |
G*O6/N7·C*(wH)↓ ↔ G*O6/N7·C*(rwH)↓ | 20.2 | 2.29 | 2.42 | 8.82 | 9.66 | 1.52 |
G*O6/N7·C*(wH)↓ ↔ G*O6/N7·C*(rwH)↑ | 15.7 | 4.17 | 4.14 | 8.12 | 8.84 | 5.47 |
G*O6/N7·C*(rwH)↑ ↔ G*O6/N7·C*(wH)↑ | 16.3 | 1.66 | 1.97 | 6.72 | 7.98 | 4.79 |
G*O6/N7·C*(rwH)↓ ↔ G*O6/N7·C*(wH)↑ | 20.6 | 3.54 | 3.68 | 6.75 | 7.56 | 3.19 |
G*tO6/N7·C*(wH)↓ ↔ G*tO6/N7·C*(rwH)↓ | 18.1 | 2.25 | 2.28 | 10.40 | 11.67 | 2.71 |
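
The tabulated ΔG and ΔΔGTS values can be combined to estimate the barrier of the reverse transformation. The sketch below assumes that both quantities are given relative to the starting (left-hand) base pair, which the table notes suggest but do not state explicitly; under that assumption the reverse barrier is ΔΔGTS − ΔG. Only a few rows of the table are reproduced, and the function name is hypothetical.

```python
# (transformation, dG of the formed pair, ddG of the TS), kcal/mol, copied from the table.
ROWS = [
    ("G·C(WC) ↔ G·C(rwWC)", 11.53, 12.63),
    ("G·C*O2(rWC) ↔ G·C*O2(wWC)", 16.74, 19.04),
    ("G*·C(wWC)↓ ↔ G*·C(rwWC)", 2.08, 2.96),
    ("G*·C*(rH) ↔ G*·C*(H)", 0.76, 3.58),
    ("G*N7·C*(rwH)↑ ↔ G*N7·C*(wH)↑", 3.25, 13.36),
]


def reverse_barrier(dg_product: float, ddg_ts: float) -> float:
    """Barrier seen from the formed pair.

    Assumption: dg_product and ddg_ts share the same reference (the starting pair).
    """
    return ddg_ts - dg_product


for label, dg, ddg in ROWS:
    print(f"{label}: forward {ddg:.2f}, reverse {reverse_barrier(dg, ddg):.2f} kcal/mol")
```

A small reverse barrier indicates that the formed conformer is only weakly separated from the starting pair along this pathway.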
In those cases where the TSs are joined by a single intermolecular H-bond (Table 1), the length of this H-bond in most cases exceeds the length of the analogous H-bond in the starting base pair, which is stabilized by two anti-parallel H-bonds. This fact clearly indicates that the latter are cooperative and mutually reinforce each other. This approach could also be applied to the numerical estimation of the cooperativity of the intermolecular H-bonds.
A characteristic feature of all TSs is the quite low values of the imaginary frequencies, νi = 11.2–101.9/11.3–104.7 cm−1 for the WC-type/H-type configurations (Tables 1 and 2), which indicates the structural softness of these transformations, together with relatively low barriers (ΔΔGTS = 2.96–19.04/3.58–13.36 kcal mol−1 at T = 298.15 K), pointing to the quite high rate of these conformational transformations. These facts alone soundly testify that the G·C pairs of nucleotide bases in the basic and rare tautomeric forms are conformationally mobile structures. It seems quite evident that this observation can be transferred to other pairs of nucleotide bases, irrespective of their nature and structure, almost without qualitative restrictions.
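The link between barrier height and rate can be made semi-quantitative with the Eyring equation of transition-state theory, k = κ(k_BT/h)exp(−ΔΔG/RT). This estimate is an illustration added here, not a calculation reported in this work; it assumes a transmission coefficient κ = 1 and ignores tunnelling and environmental effects. The function name is hypothetical.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34  # Planck constant, J s
R_KCAL = 1.98720425864e-3  # molar gas constant, kcal/(mol K)


def eyring_rate(barrier_kcal: float, temperature: float = 298.15,
                kappa: float = 1.0) -> float:
    """First-order rate constant (s^-1) from a Gibbs free energy of activation.

    Assumption: simple transition-state theory with transmission coefficient kappa.
    """
    prefactor = kappa * K_B * temperature / H_PLANCK
    return prefactor * math.exp(-barrier_kcal / (R_KCAL * temperature))


# Lowest and highest barriers quoted in the text for the WC-type and H-type pairs.
for barrier in (2.96, 19.04, 3.58, 13.36):
    k = eyring_rate(barrier)
    print(f"{barrier:5.2f} kcal/mol: k = {k:.2e} s^-1, lifetime = {1.0 / k:.2e} s")
```

Within this simple model the lowest barriers correspond to lifetimes on the order of tens of picoseconds, whereas the highest barrier (19.04 kcal mol−1) corresponds to lifetimes on the order of seconds, so the spread of rates across the 55 transformations is very wide.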
Attention should be drawn to the unexpected results of this investigation, which are discussed comprehensively in the following points (Tables 1 and 2).
1. A novel mechanism was revealed for the conformational transformation of the classical Watson–Crick G·C(WC) nucleobase pair into the reverse wobble Watson–Crick G·C(rwWC) nucleobase pair. This G·C(WC) ↔ G·C(rwWC) conformational transformation occurs by the rotation of the G and C bases relative to each other around the middle intermolecular (G)N1H⋯N3(C) H-bond and is accompanied by the rebuilding of the intermolecular H-bonds: (C)N4H⋯O6(G), (G)N1H⋯N3(C), (G)N2H⋯O2(C) → (G)N1H⋯N3(C), (G)N2H⋯N3(C), (C)N4H⋯N2(G). Unusually, this transition changes the geometry from Watson–Crick-like to wobble-like; the resulting pair is significantly non-planar, with a significantly non-planar NH2 amino group of the G base. Notably, the nucleobases do not change their tautomeric status during this transition. These data complement the results of the previous works.42,48,49
2. It was established that rotation of the G and C bases in the reverse Hoogsteen G*·C*(rH) base pair around the middle intermolecular (C)N3H⋯N7(G) H-bond leads to the Hoogsteen G*·C*(H) base pair, causing the transformation of the intermolecular specific contacts: (G)O6⋯O2(C), (C)N3H⋯N7(G), (G)C8H⋯N4(C) → (G)O6⋯N4(C), (C)N3H⋯N7(G), (G)C8H⋯O2(C) (Table 1).
3. It was shown that the tight G+·C−(rWC) ion pair relaxes to the classical Watson–Crick G·C(WC) nucleobase pair by rotation around the middle (G)N1H⋯N3(C) H-bond, assisted by proton transfer along this H-bond from the G+ to the C− base. As a result of this transition, a non-ionic molecular structure is formed; the process is characterized by a low value of the imaginary frequency (νi = 37.2 cm−1) (Table 1).
4. It was established that the Watson–Crick G*·C*(WC) and reverse Watson–Crick G*·C*(rWC) base pairs, on the one hand, and the Hoogsteen G*t·C*(H) and reverse Hoogsteen G*t·C*(rH) base pairs, on the other hand, transform into one another by rotations around the middle intermolecular H-bond: the G*·C*(WC) ↔ G*·C*(rWC) and G*t·C*(H) ↔ G*t·C*(rH) conformational transitions (Table 1). The conformational transition of the so-called Löwdin G*·C*(WC) nucleobase pair, involving the G* and C* rare tautomers of the nucleobases, occurs through the rotation of the G* and C* tautomers around the intermolecular (C)N3H⋯N1(G) H-bond via the non-planar TSG*·C*(WC)↔G*·C*(rWC), stabilized by a single (C)N3H⋯N1(G) H-bond (Table 1). This G*·C*(WC) ↔ G*·C*(rWC) conformational transition leads to the so-called reverse Löwdin G*·C*(rWC) nucleobase pair with trans-oriented N1H and N9H glycosidic bonds. The geometry of the formed G*·C*(rWC) nucleobase pair is Watson–Crick-like. Both the Hoogsteen G*t·C*(H) and reverse Hoogsteen G*t·C*(rH) base mispairs conformationally transform into the reverse wobble G*t·C*(rwH) and wobble G*t·C*(wWC/H) base mispairs, respectively, via the mutual rotations of the bases around the intermolecular H-bonds: G*t·C*(H) ↔ G*t·C*(rwH) and G*t·C*(rH) ↔ G*t·C*(wWC/H).
5. Rotation of the G* and C* rare tautomers in the Löwdin's G*·C*(WC) nucleobase pair around the upper (G)O6H⋯N4(C) or lower (G)N2H⋯O2(C) H-bonds leads to the formation of the two reverse wobble G*·C*(rwWC/H) and G*·C*(rwWC) base mispairs, respectively: G*·C*(WC) ↔ G*·C*(rwWC/H) and G*·C*(WC) ↔ G*·C*(rwWC) conformational transformations, accordingly (Table 1).
6. The same situation is observed for the reverse Löwdin's G*·C*(rWC) nucleobase pair: G*·C*(rWC) ↔ G*·C*(wWC/H) and G*·C*(rWC) ↔ G*·C*(wWC) conformational transformations, which occur via the mutual rotations of the bases around the upper (G)O6H⋯O2(C) and lower (G)N2H⋯N4(C) H-bonds, respectively, leading to the formation of the wobble G*·C*(wWC/H) and G*·C*(wWC) base mispairs, accordingly (Table 1).
7. The G·C*O2(rWC) ↔ G·C*O2(wWC) conformational transformation of the reverse Watson–Crick G·C*O2(rWC) nucleobase pair by the participation of the G base and C*O2 tautomer of the C nucleobase proceeds through the rotation of the G and C*O2 bases around the middle (G)N1H⋯N3(C) H-bond and leads to the formation of the wobble G·C*O2(wWC) base pair, which has non-planar geometry. So, this conformational transformation greatly changes the geometry of the initial G·C*O2(rWC) base pair from reverse Watson–Crick geometry to wobble Watson–Crick geometry.
8. The G*·C*(rwWC/H) and G*·C*(wWC/H) base mispairs mutually transform into each other through the mutual rotation of the bases around the intermolecular lower (C)N3H⋯O6(G) H-bond: G*·C*(rwWC/H) ↔ G*·C*(wWC/H) (Table 1). Similar transformation occurs also for the G*t·C*(rwWC/H) and G*t·C*(wWC/H) base mispairs by the participation of the G*t base with trans-oriented O6H hydroxyl group: G*t·C*(rwWC/H) ↔ G*t·C*(wWC/H) (Table 1).
9. The G*t·C*(rWC) ↔ G*t·C*(WC) and G*t·C*(rWC) ↔ G*t·C*(wWC) conformational transformations of the reverse Watson–Crick G*t·C*(rWC) base pair occur via the mutual rotations of the bases around the middle (C)N3H⋯N1(G) and lower (G)N2H⋯N4(C) H-bonds, respectively, and lead to the Watson–Crick G*t·C*(WC) or wobble Watson–Crick G*t·C*(wWC) nucleobase mispairs, respectively (Table 1). The G·C*tO2(rWC) ↔ G·C*tO2(wWC) conformational transformation proceeds for the reverse Watson–Crick G·C*tO2(rWC) base mispair by the participation of the G base and C*tO2 tautomer with trans-oriented N4H imino group and leads to the wobble Watson–Crick G·C*tO2(wWC) base mispair (Table 1).
10. It is especially interesting to consider the conformational transformations of the novel, unusual Watson–Crick G*t·C*t(WC) and reverse Watson–Crick G*t·C*(rWC) base pairs by the participation of the G*t and C*t rare tautomers with trans-oriented hydroxyl group of the G*t base and trans-oriented imino group of the C*t base, respectively.
Thus, the G*t·C*t(WC) ↔ G*t·C*t(rwWC)↑ and G*t·C*t(WC) ↔ G*t·C*t(rwWC) conformational transformations of the G*t·C*t(WC) base mispair via the mutual rotations of the bases around the middle (C)N3H⋯N1(G) or lower (G)N2H⋯O2(C) H-bonds lead to the formation of the reverse wobble G*t·C*t(rwWC)↑ or G*t·C*t(rwWC) base pairs, respectively (Table 1). Mutual conformational transformations between the formed wobble base mispairs were also revealed.
11. In general, conformational transformations of the G·C base pairs in Watson–Crick (WC), reverse Watson–Crick (rWC), wobble Watson–Crick (wWC) and reverse wobble Watson–Crick (rwWC) configurations occur via the mutual rotation of the bases:
– around the upper H-bond: G*·C*(WC) ↔ G*·C*(rwWC/H), G*·C*(rWC) ↔ G*·C*(wWC/H), G*t·C*(rwWC/H) ↔ G*t·C*(wWC/H), G·C*(rwWC)↑ ↔ G·C*(wWC)↑, G*·C(rwWC)↑ ↔ G*·C(wWC/H), G*·C*O2(wWC)↑ ↔ G*·C*O2(rwWC/H), G*·C*tO2(wWC)↑ ↔ G*·C*tO2(rwWC/H), G*tN2·C*(rwWC)↓ ↔ G*tN2·C*(wWC)↑, G*tN2·C*t(rwWC)↓ ↔ G*tN2·C*t(wWC)↑;
– around the middle H-bond: G·C(WC) ↔ G·C(rwWC), G+·C−(rWC) ↔ G·C(WC), G*·C*(WC) ↔ G*·C*(rWC), G·C*O2(rWC) ↔ G·C*O2(wWC), G*t·C*(rWC) ↔ G*t·C*(WC), G·C*tO2(rWC) ↔ G·C*tO2(wWC), G*t·C*t(WC) ↔ G*t·C*t(rwWC)↑;
– around the lower H-bond: G*·C*(WC) ↔ G*·C*(rwWC), G*·C*(rWC) ↔ G*·C*(wWC), G*·C*(rwWC/H) ↔ G*·C*(wWC/H), G*t·C*(rWC) ↔ G*t·C*(wWC), G*t·C*t(WC) ↔ G*t·C*t(rwWC), G·C*(wWC)↑ ↔ G·C*(rwWC), G·C*(rwWC) ↔ G·C*(wWC), G·C*(rwWC)↑ ↔ G·C*(wWC), G*·C(rwWC)↑ ↔ G*·C(wWC)↓, G*·C(wWC)↓ ↔ G*·C(rwWC), G*·C*O2(wWC)↑ ↔ G*·C*O2(rwWC)↓, G*·C*O2(rwWC)↓ ↔ G*·C*O2(wWC), G*N2·C*(wWC)↓ ↔ G*N2·C*(rwWC)↓, G*t·C*O2(rwWC)↓ ↔ G*t·C*O2(wWC), G*·C*tO2(wWC)↑ ↔ G*·C*tO2(rwWC)↓, G*·C*tO2(rwWC)↓ ↔ G*·C*tO2(wWC), G*t·C*tO2(rwWC)↓ ↔ G*t·C*tO2(wWC), G*tN2·C*(wWC)↓ ↔ G*tN2·C*(rwWC)↓.
G·C nucleobase pairs in the WC, rWC, wWC and rwWC configurations are connected by the NH⋯O, NH⋯N, OH⋯N, OH⋯O H-bonds and attractive O⋯O/O⋯N van der Waals contacts.
Among the WC, rWC, wWC and rwWC G·C base pairs the following complexes have significantly non-planar geometry: G·C(rwWC), G*·C*(rwWC), G*·C*(wWC), G*t·C*t(rwWC), G*t·C*(rWC), G*t·C*(WC), G*t·C*(wWC), G·C*O2(wWC), G·C*tO2(rWC), G·C*tO2(wWC), G·C*(rwWC), G·C*(wWC), G*·C(rwWC)↑, G*·C(wWC/H), G*·C(wWC)↓, G*·C(rwWC), G*·C*O2(wWC), G*t·C*O2(wWC), G*·C*tO2(wWC) and G*t·C*tO2(wWC). This non-planarity of the base pairs is caused by the sp3 hybridization of the NH2 amino group of the G base and its non-planarity.
12. Conformational transitions of the G·C nucleobase pairs in H, rH, wH and rwH configurations proceed through the mutual rotations of the bases:
– around the upper H-bond: G*t·C*(H) ↔ G*t·C*(rwH), G*t·C*(rH) ↔ G*t·C*(wWC/H), G*t·C*O2(wH)↑ ↔ G*t·C*O2(rwH), G*t·C*O2(wH)↑ ↔ G*t·C*O2(rwWC/H), G*t·C(wH)↓ ↔ G*t·C(rwH), G*t·C(rwH)↑ ↔ G*t·C(wH)↑, G*N7·C*(rwH)↑ ↔ G*N7·C*(wH)↑, G*O6/N7·C*(wH)↓ ↔ G*O6/N7·C*(rwH)↑, G*O6/N7·C*(rwH)↑ ↔ G*O6/N7·C*(wH)↑, G*O6/N7·C*(rwH)↓ ↔ G*O6/N7·C*(wH)↑;
– around the middle H-bond: G*·C*(rH) ↔ G*·C*(H), G*t·C*(H) ↔ G*t·C*(rH), G*t·C*tO2(wH)↑ ↔ G*t·C*tO2(rwH)↑, G*N7·C*t(wH)↑ ↔ G*N7·C*t(rwH)↑;
– around the lower H-bond: G*t·C*O2(wH)↑ ↔ G*t·C*O2(rwH)↓, G*t·C*tO2(wH)↑ ↔ G*t·C*tO2(rwH)↓, G*t·C*tO2(rwH)↓ ↔ G*t·C*tO2(wH)↓, G*t·C(rwH)↑ ↔ G*t·C(wH)↓, G*N7·C*(rwH)↑ ↔ G*N7·C*(wH), G*O6/N7·C*(wH)↓ ↔ G*O6/N7·C*(rwH)↓, G*tO6/N7·C*(wH)↓ ↔ G*tO6/N7·C*(rwH)↓.
QTAIM analysis showed that the H, rH, wH and rwH conformers of the G·C nucleobase pair are bound by the CH⋯N, NH⋯C, NH⋯O, NH⋯N, OH⋯N and OH⋯O H-bonds.
Among the G·C nucleobase pairs in H, rH, wH and rwH configurations the following complexes have significantly non-planar geometry: G*t·C*O2(wH)↑, G*t·C*O2(rwWC/H), G*t·C*tO2(rwH)↑, G*t·C*tO2(wH)↓, G*t·C(rwH), G*t·C(rwH)↑, G*t·C(wH)↑ and G*N7·C*t(rwH)↑. This non-planarity of the complexes is also caused by the non-planarity of the NH2 amino groups of the G and C nucleobases.
13. It is especially interesting to note the cases, when rotation of the bases within the G·C nucleobase pairs leads to the changing of the geometry from Watson–Crick to Hoogsteen and vice versa through the intermediate wobble (wWC/H) or reverse wobble (rwWC/H) base pairs via the G*·C*(WC) ↔ G*·C*(rwWC/H), G*·C*(rWC) ↔ G*·C*(wWC/H) and G*t·C*(rH) ↔ G*t·C*(wWC/H) conformational transformations. This became possible due to the transformations of the Watson–Crick G·C(WC), reverse Watson–Crick G·C(rWC) and reverse Hoogsteen G·C(rH) base pairs into their tautomerised states – G*·C*(WC), G*·C*(rWC) and G*t·C*(rH), respectively.
14. All TSs of the conformational transformations are stabilized by specific intermolecular contacts, namely H-bonds and attractive van der Waals contacts (from 1 to 3 in number). In most cases the TSs of the conformational transformations are joined by a single intermolecular H-bond (Table 1): NH⋯N (TSG+·C-(rWC) ↔ G·C(WC), TSG*·C*(WC) ↔ G*·C*(rWC), TSG*t·C*(rWC) ↔ G*t·C*(WC), TSG*t·C*t(WC) ↔ G*t·C*t(rwWC)↑, TSG*·C(rwWC)↑ ↔ G*·C(wWC)↓, TSG*N2·C*(wWC)↓ ↔ G*N2·C*(rwWC)↓, TSG*tN2·C*(wWC)↓ ↔ G*tN2·C*(rwWC)↓; TSG*·C*(rH) ↔ G*·C*(H), TSG*t·C*(H) ↔ G*t·C*(rH), TSG*t·C(rwH)↑ ↔ G*t·C(wH)↓, TSG*O6/N7·C*(wH)↓ ↔ G*O6/N7·C*(rwH)↑); OH⋯N (TSG*·C*(WC) ↔ G*·C*(rwWC/H), TSG*·C(rwWC)↑ ↔ G*·C(wWC/H), TSG*·C*O2(wWC)↑ ↔ G*·C*O2(rwWC)↓, TSG*·C*O2(wWC)↑ ↔ G*·C*O2(rwWC/H), TSG*·C*tO2(wWC)↑ ↔ G*·C*tO2(rwWC/H), TSG*·C*tO2(wWC)↑ ↔ G*·C*tO2(rwWC)↓); OH⋯O (TSG*·C*(rWC) ↔ G*·C*(wWC/H)); NH⋯O (TSG·C*(rwWC)↑ ↔ G·C*(wWC)↑; TSG*O6/N7·C*(rwH)↓ ↔ G*O6/N7·C*(wH)↑); NH⋯C (TSG*O6/N7·C*(wH)↓ ↔ G*O6/N7·C*(rwH)↓, TSG*tO6/N7·C*(wH)↓ ↔ G*tO6/N7·C*(rwH)↓).
At the same time, there are also cases of the simultaneous co-existence of the two H-bonds at the TSs (NH⋯N, NH⋯O, OH⋯N, OH⋯O and CH⋯N) (Table 1): TSG*t·C*O2(wH)↑ ↔ G*t·C*O2(rwH)↓, TSG*t·C*tO2(wH)↑ ↔ G*t·C*tO2(rwH)↓, TSG*t·C*tO2(rwH)↓ ↔ G*t·C*tO2(wH)↓, TSG*N7·C*(rwH)↑ ↔ G*N7·C*(wH)↑ and TSG*O6/N7·C*(rwH)↑ ↔ G*O6/N7·C*(wH)↑, especially those involving NH2 amino groups of the G and C bases (Table 1): TSG*·C*(WC) ↔ G*·C*(rwWC), TSG*·C*(rWC) ↔ G*·C*(wWC), TSG·C*O2(rWC) ↔ G·C*O2(wWC), TSG·C*tO2(rWC) ↔ G·C*tO2(wWC), TSG*t·C*t(WC) ↔ G*t·C*t(rwWC), TSG·C*(wWC)↑ ↔ G·C*(rwWC), TSG·C*(rwWC)↑ ↔ G·C*(wWC), TSG*·C(wWC)↓ ↔ G*·C(rwWC), TSG*·C*O2(rwWC)↓ ↔ G*·C*O2(wWC), TSG*t·C*O2(rwWC)↓ ↔ G*t·C*O2(wWC), TSG*·C*tO2(rwWC)↓ ↔ G*·C*tO2(wWC), TSG*t·C*tO2(rwWC)↓ ↔ G*t·C*tO2(wWC); TSG*t·C(wH)↓ ↔ G*t·C(rwH).
Especially interesting are the cases of the following TSs – TSG·C*O2(rWC) ↔ G·C*O2(wWC), TSG·C*tO2(rWC) ↔ G·C*tO2(wWC), TSG·C*(wWC)↑ ↔ G·C*(rwWC), TSG·C*(rwWC)↑ ↔ G·C*(wWC), where the N1H⋯N3 and N2H⋯N3/N1H⋯N3 and N2H⋯N3/N1H⋯O2 and N2H⋯O2/N1H⋯N4 and N2H⋯N4 H-bonds are focused on one common N3/N3/O2/N4 atom, respectively.
Also, there are cases of the H-bonds (NH⋯O, NH⋯N and OH⋯N), which are combined with attractive van der Waals contacts (O⋯O, N⋯C, O⋯N and N⋯N): TSG*·C*(wWC/H) ↔ G*·C*(rwWC/H), TSG*t·C*(rwWC/H) ↔ G*t·C*(wWC/H), TSG*t·C*(rWC) ↔ G*t·C*(wWC), TSG·C*(rwWC) ↔ G·C*(wWC), TSG*tN2·C*(rwWC)↓ ↔ G*tN2·C*(wWC)↑, TSG*tN2·C*t(rwWC)↓ ↔ G*tN2·C*t(wWC)↑; TSG*t·C*(H) ↔ G*t·C*(rwH), TSG*t·C*(rH) ↔ G*t·C*(wWC/H), TSG*t·C*O2(wH)↑ ↔ G*t·C*O2(rwH), TSG*t·C*O2(wH)↑ ↔ G*t·C*O2(rwWC/H), TSG*t·C*tO2(wH)↑ ↔ G*t·C*tO2(rwH)↑, TSG*t·C(rwH)↑ ↔ G*t·C(wH)↑, TSG*N7·C*(rwH)↑ ↔ G*N7·C*(wH).
Finally, there are also TSs stabilized by three intermolecular H-bonds (Table 1): TSG·C(WC) ↔ G·C(rwWC) and TSG*N7·C*t(wH)↑ ↔ G*N7·C*t(rwH)↑.
It is especially interesting to note the cases of the TSs – TSG*t·C*(H) ↔ G*t·C*(rwH), TSG*t·C*(rH) ↔ G*t·C*(wWC/H), TSG*t·C*O2(wH)↑ ↔ G*t·C*O2(rwH), TSG*t·C*O2(wH)↑ ↔ G*t·C*O2(rwWC/H), TSG*t·C*tO2(wH)↑ ↔ G*t·C*tO2(rwH)↑, TSG*t·C(rwH)↑ ↔ G*t·C(wH)↑, TSG*N7·C*(rwH)↑ ↔ G*N7·C*(wH), TSG*N7·C*t(wH)↑ ↔ G*N7·C*t(rwH)↑, where the O6H⋯N4 and N7⋯N4/O6H⋯O2 and N7⋯O2/O6H⋯N3 and N7⋯N3/O6H⋯N3 and N7⋯N3/O6H⋯N3 and N7⋯N3/O6H⋯N3 and N7⋯N3/O6⋯N4 and N7H⋯N4/N4H⋯O6 and N3H⋯O6 H-bonds and attractive van der Waals contacts are focused on one common N4/O2/N3/N3/N3/N3/N4/O6 atom, respectively.
15. Moreover, careful analysis of the geometrical structures of the TSs shows that the term “rotation around the individual intermolecular H-bond” should not be taken literally, in the sense of rotation around some imaginary non-deformable fixed axis, but rather in the sense of rotation around a labile “axis” of rotation. Notably, it is exactly this circumstance that significantly complicates the localization of the corresponding transition states, which to some extent could be considered an art. Localization of the TSs is also complicated by the fact that the corresponding hypersurface of the electronic energy is quite diverse. In our view, this is evidenced by the cases in which rotations around one and the same H-bond, controlled of course by different TSs, lead to different structural consequences, that is, to different conformers of the G·C nucleobase pair.
16. The investigated conformational transformations are dipole active: the dipole moments of the TSs vary over a wide range, μ = 1.08–10.57/1.45–9.89 D (Table 1). In the vast majority of cases, the dipole moments of the TSs are smaller than the dipole moments of the starting and final structures (Table 1).
17. Of special interest are the conformational transformations in the G·C(wH)/G·C(rwH) base pairs in which G is in the ylidic tautomeric form. In all these cases, the conformational degree of freedom is associated with the unusual intermolecular C8⋯HN H-bond: G*O6/N7·C*(wH)↓ ↔ G*O6/N7·C*(rwH)↓, G*O6/N7·C*(wH)↓ ↔ G*O6/N7·C*(rwH)↑, G*O6/N7·C*(rwH)↓ ↔ G*O6/N7·C*(wH)↑ and G*tO6/N7·C*(wH)↓ ↔ G*tO6/N7·C*(rwH)↓ (Table 1).
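The classification by rotation axis given in points 11 and 12 can be cross-checked against the totals of 34 and 21 transition states quoted earlier. In the sketch below, the per-axis counts were obtained by counting the entries of the lists in those two points; the dictionary layout and names are illustrative only.

```python
# Number of transformations per rotation axis, counted from the lists in points 11 and 12.
TRANSFORMATIONS_PER_AXIS = {
    "WC-type (WC, rWC, wWC, rwWC)": {"upper": 9, "middle": 7, "lower": 18},
    "H-type (H, rH, wH, rwH)": {"upper": 10, "middle": 4, "lower": 7},
}

totals = {family: sum(axes.values()) for family, axes in TRANSFORMATIONS_PER_AXIS.items()}

for family, total in totals.items():
    print(f"{family}: {total} transition states")
print(f"All configurations: {sum(totals.values())} transition states")
```

The per-family sums reproduce the 34 and 21 transition states, and the overall sum reproduces the 55 conformational transformations reported in the text.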
The presented investigation of the conformational transformations of the biologically important G·C (WC, rWC, wWC, rwWC, H, rH, wH and rwH) nucleobase pairs, which are caused by the rotational mobility of the bases around the individual intermolecular H-bonds, significantly extends the existing pool of unusual conformers of the G·C base pairs and the ideas about the microstructural mechanisms of these processes, as well as about their functional role.
As a result of this investigation, we have revealed a wide set of surprising conformers-rotamers of the G·C nucleobase pairs, which could be incorporated into the double helix of parallel or anti-parallel50 DNA and RNA molecules. High-energy conformers of the G·C nucleobase pair, formed by interlinked non-dissociative conformational transformations, most likely, in our opinion, play an outstanding role in supporting the unique spatial structure of the nucleic acids, especially of the RNA molecule, and their functionally important rearrangements, which are usually caused by proteins.
Finally, special attention should be paid to the fact that localization of the TSs describing the conformational mobility of the G·C nucleobase pairs is in essence a quite delicate procedure, approaching an art. We therefore hope that the results presented here will simplify further work in this biologically promising direction.
This journal is © The Royal Society of Chemistry 2021