Can modified DNA base pairs with chalcogen bonding expand the genetic alphabet? A combined quantum chemical and molecular dynamics simulation study
Abstract
A comprehensive computational study (DFT and MD) is presented with the goal of designing and analyzing model chalcogen-bonded modified nucleobase pairs that replace one (i.e., AXY:T, G:CXY, GXY:C) or two (GXY:CX′Y′, X/X′ = S, Se and Y/Y′ = F, Cl, Br) Watson–Crick (WC) hydrogen bonds of the canonical A:T or G:C pair with chalcogen bond(s). DFT calculations on 18 base pair combinations that replace one WC hydrogen bond with a chalcogen bond reveal that the bases interact favorably in the gas phase (binding strengths up to −140 kJ mol⁻¹) and in water (up to −85 kJ mol⁻¹). Although the remaining hydrogen bond(s) exhibit similar characteristics to those in the canonical base pairs, the structural features of the (Y–X⋯O) chalcogen bond(s) change significantly with the identity of X and Y. The 36 doubly-substituted (GXY:CX′Y′) base pairs have structural deviations from canonical G:C similar to those of the singly-substituted modifications (G:CXY or GXY:C). Furthermore, despite the replacement of two strong hydrogen bonds with chalcogen bonds, some GXY:CX′Y′ pairs possess binding energies (up to −132 kJ mol⁻¹ in the gas phase and up to −92 kJ mol⁻¹ in water) comparable to those of the most stable G:CXY or GXY:C pairs, as well as canonical G:C. More importantly, G:C-modified pairs containing X = Se (high polarizability) and Y = F (high electronegativity) are the most stable, with binding energies comparable to or slightly larger (by up to 13 kJ mol⁻¹) than that of G:C. Further characterization of the chalcogen bonding in all modified base pairs (AIM, NBO and NCI analyses) reveals that the differences in the binding energies of modified base pairs are mainly dictated by the differences in the strengths of their chalcogen bonds.
Finally, MD simulations on DNA oligonucleotides containing the most stable chalcogen-bonded base pair from each of the four classifications (AXY:T, G:CXY, GXY:C and GXY:CX′Y′) reveal that the singly-modified G:C pairs retain the local helical structure and pairing stability to a greater extent than the modified A:T pair. Overall, our study identifies two promising pairs (G:CSeF and GSeF:C) that retain chalcogen bonding in DNA and should be synthesized and further explored in terms of their potential to expand the genetic alphabet.
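
The counts of 18 singly-substituted and 36 doubly-substituted pairs quoted above follow directly from the substituent choices (X/X′ = S, Se; Y/Y′ = F, Cl, Br) and the three singly-substituted classes. The short Python sketch below only enumerates the labels to make that bookkeeping explicit; the variable names and the string templates are illustrative assumptions and are not part of the study's workflow.

```python
from itertools import product

CHALCOGENS = ("S", "Se")         # X or X' (as listed in the abstract)
HALOGENS = ("F", "Cl", "Br")     # Y or Y' (as listed in the abstract)

# Every XY substituent label, e.g. "SeF" (2 chalcogens x 3 halogens = 6)
substituents = [x + y for x, y in product(CHALCOGENS, HALOGENS)]

# The three singly-substituted classes, written as hypothetical label templates
single_templates = ("A{s}:T", "G:C{s}", "G{s}:C")
single_pairs = [t.format(s=s) for t in single_templates for s in substituents]

# Doubly-substituted GXY:CX'Y' pairs: one substituent on G, one on C
double_pairs = [f"G{a}:C{b}" for a, b in product(substituents, repeat=2)]

print(len(single_pairs), len(double_pairs))  # 18 36
```

The two pairs highlighted in the abstract, G:CSeF and GSeF:C, both appear in `single_pairs`.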