Mechanical codes of chemical-scale specificity in DNA motifs†
Abstract
In gene transcription, certain sequences of double-stranded (ds)DNA play a vital role in nucleosome positioning and expression initiation. That dsDNA is deformed to various extents in these processes leads us to ask: could genomic DNA also have sequence specificity in its chemical-scale mechanical properties? We approach this question using statistical machine learning to determine the rigidity between DNA chemical moieties. What emerges for the polyA, polyG, TpA, and CpG sequences studied here is a unique trigram that contains the quantitative mechanical strengths between bases and along the backbone. Such a sequence-dependent trigram could be viewed as a DNA mechanical code. Interestingly, we discover a compensatory competition between the axial base-stacking interaction and the transverse base-pairing interaction, and this reciprocal relationship constitutes the most discriminating feature of the mechanical code. Our results also provide a chemical-scale understanding of experimental observables. For example, polyA, known for its long persistence length, is shown to have strong base stacking, while its complement (polyAc) exhibits high backbone rigidity. The mechanical code concept enables a direct reading of the physical interactions encoded in the sequence and, with further development, is expected to shed new light on DNA allostery and DNA-binding drugs.