Cheng-Jie Ma‡ a, Lin Li‡ b, Wen-Xuan Shao a, Jiang-Hui Ding a, Xiao-Li Cai c, Zhao-Rong Lun c, Bi-Feng Yuan* ad and Yu-Qi Feng ad
a Sauvage Center for Molecular Sciences, Department of Chemistry, Wuhan University, Wuhan 430072, China. E-mail: bfyuan@whu.edu.cn
b School of Pharmacy, Shanghai Jiao Tong University, Shanghai 200240, P. R. China
c Center for Parasitic Organisms, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-Sen University, Guangzhou 510275, P. R. China
d School of Public Health, Wuhan University, Wuhan 430071, China
First published on 4th October 2021
DNA 5-hydroxymethyluracil (5hmU) is a thymine modification existing in the genomes of various organisms. The post-replicative formation of 5hmU occurs via hydroxylation of thymine by ten-eleven translocation (TET) dioxygenases in mammals and J-binding proteins (JBPs) in protozoans, respectively. In addition, 5hmU can be generated through oxidation of thymine by reactive oxygen species or deamination of 5-hydroxymethylcytosine (5hmC) by cytidine deaminase. While the biological roles of 5hmU have not yet been fully explored, determining its genomic location will greatly assist in elucidating its functions. Herein, we report a novel enzyme-mediated bioorthogonal labeling method for selective enrichment of 5hmU in genomes. 5hmU DNA kinase (5hmUDK) was utilized to selectively install an azide (N3) group or alkynyl group onto the hydroxyl moiety of 5hmU, followed by incorporation of the biotin linker through click chemistry, which enabled the capture of 5hmU-containing DNA fragments via streptavidin pull-down. The enriched fragments were subjected to deep sequencing to determine the genomic distribution of 5hmU. With this established enzyme-mediated bioorthogonal labeling strategy, we achieved the genome-wide mapping of 5hmU in Trypanosoma brucei. The method described here will allow for a better understanding of the functional roles and dynamics of 5hmU in genomes.
5hmU is a thymine base modification present in the genomes of diverse organisms ranging from bacteriophages to mammals.14 5hmU has long been known as a DNA lesion formed from the oxidation of thymine by reactive oxygen species.15,16 Enzyme-mediated replicative incorporation of 5hmU into bacteriophage genomes indicates that 5hmU has functional importance.17 It has been reported that thymine in genomes of some bacteriophages and dinoflagellates is fully or partially replaced by 5hmU.18 It is worth noting that the loss of nucleotide salvage factor DNPH1 can give rise to aberrant incorporation of 5-hmdU into human cells.19 The post-replicative formation of 5hmU occurs via hydroxylation of thymine, which can be mediated by ten-eleven translocation (TET) dioxygenases in mammals20 and J-binding proteins (JBPs) in protozoan genomes,21,22 respectively. Another proposed mechanism for the formation of 5hmU is through the deamination of 5hmC by activation-induced cytidine deaminase (AID) or apolipoprotein B mRNA-editing catalytic polypeptide-like (APOBEC) family enzymes.23 Hydroxylation of thymine to 5hmU could generate a 5hmU:A base pair, while deamination of 5hmC would give rise to a 5hmU:G mispair.
It has been demonstrated that the majority of 5hmU within mouse embryonic stem cells (mESCs) is produced by mammalian TET dioxygenases, while only a minor fraction of 5hmU is generated through 5hmC deamination or reactive oxygen species.20 Thus, the majority of 5hmU in the genome is matched (5hmU:A) rather than mismatched (5hmU:G). The level of 5hmU was found to be dynamic throughout mESC differentiation, suggesting that 5hmU may have functional importance.20 Although it was reported that 5hmU could affect biological processes such as protein–DNA interactions and transcription factor binding,24,25 the consequences of 5hmU formation in genomes have not been fully explored.
Revealing the functions of 5hmU relies on the sensitive and precise detection and mapping of 5hmU within genomes. In recent years, several strategies have been developed to map 5hmU in DNA. One method localizes 5hmU by KRuO4 oxidation of 5hmU to generate 5fU, followed by biotinylation using a hydrazide-biotin probe.26,27 However, other aldehyde groups present in genomic DNA, such as those of 5fC and abasic sites (AP sites), may also react with the hydrazide-biotin probe, which could interfere with the accurate mapping of 5hmU. The chemical conversion of 5hmU to 5fU by KRuO4 oxidation could induce partial T-to-C base transition in polymerase extension owing to the ability of 5fU to form a 5fU:G mispair, which was then employed as the signature to map original 5hmU.28 The applicability of this method was demonstrated using synthetic oligonucleotides and part of the genome of a eukaryotic pathogen. However, only ∼40% of reads were cytosine at the 5hmU sites even under optimized conditions, which therefore required a sophisticated algorithm to analyze the sequencing data and identify the sites of 5hmU in DNA. In addition, β-glycosyltransferase (β-GT) was applied to tag a modified N3-glucose onto the hydroxyl group of 5hmU, followed by incorporation of the biotin linker through click chemistry.29 After the capture of 5hmU-containing fragments with streptavidin-coupled beads, the enriched DNA fragments can be subjected to deep sequencing to map the distribution of 5hmU. However, β-GT only works on mismatched 5hmU:G but not matched 5hmU:A.29 Thus, this method is not applicable to mapping 5hmU derived from thymine.
Herein, we report a novel enzyme-mediated bioorthogonal labeling method to selectively enrich genomic regions containing 5hmU. 5hmU DNA kinase (5hmUDK) was utilized to selectively install an azide (N3) group or alkynyl group into the hydroxyl group of 5hmU using γ-(2-azidoethyl)-adenosine 5′-triphosphate (N3-ATP) or γ-[(propargyl)-imido]-adenosine 5′-triphosphate (alkynyl-ATP) as the cofactor. The N3 group or alkynyl group in 5hmU was then utilized to incorporate the biotin linker through click chemistry. The enrichment of 5hmU-containing DNA fragments was performed via streptavidin pull-down. The enriched fragments were applied to deep sequencing to map the distribution of 5hmU (Fig. 1). With this established enzyme-mediated bioorthogonal labeling strategy, we achieved the genome-wide mapping of 5hmU in Trypanosoma brucei (T. brucei).
Shrimp alkaline phosphatase (SAP), calf intestine alkaline phosphatase (CIAP), E. coli C75 alkaline phosphatase (EAP), and TB Green® Premix Ex Taq™ II (Tli RNaseH Plus) were purchased from Takara Biotechnology Co., Ltd. (Dalian, China). γ-(2-Azidoethyl)-adenosine 5′-triphosphate (N3-ATP) and γ-[(propargyl)-imido]-adenosine 5′-triphosphate (alkynyl-ATP) were bought from Sigma-Aldrich (Beijing, China). DBCO-SS-biotin was bought from Confluore Biological Technology Co., Ltd. (Xi'an, China). 5-Hydroxymethyluridine DNA kinase (5hmUDK), NcoI-HF, hSMUG1 (single-strand-selective monofunctional uracil-DNA glycosylase 1), and NEBNext® Multiplex Oligos for Illumina® (Index Primers Set 1) were purchased from New England Biolabs (Beijing, China).
For the NcoI restriction enzyme digestion, the resulting DNA was incubated with 10 U NcoI-HF in 1× CutSmart buffer at 37 °C for 1 h, followed by polyacrylamide gel electrophoresis analysis. For the hSMUG1 cleavage assay, the resulting DNA was incubated with 5 U hSMUG1 in 1× NEBuffer 1 at 37 °C for 2 h. Then the DNA was treated with 100 mM NaOH at 95 °C for 10 min, followed by polyacrylamide gel electrophoresis analysis. The gel was visualized using a Tanon fluorescence imager (Shanghai, China).
Real-time qPCR was carried out to evaluate the enrichment efficiency. Briefly, the enriched DNA was dissolved in 20 μL H2O. Then 2 μL of enriched DNA, 1 μL of the forward primer (10 μM), 1 μL of the reverse primer (10 μM), and 6 μL of H2O were added into a Takara TB Green Premix Ex Taq™ II qPCR mix (10 μL), to give a final volume of 20 μL. The real-time qPCR program was performed for 45 cycles at 95 °C for 10 s, 54 °C for 30 s and 72 °C for 1 min.
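As a quick consistency check on the reaction setup described above, the component volumes can be summed and the final primer concentration derived. This is only an illustrative sketch: the dictionary and function names are hypothetical, and the volumes and stock concentrations are the ones stated in the text.

```python
# Component volumes in µL, taken from the qPCR setup described in the text.
QPCR_MIX_UL = {
    "enriched DNA": 2.0,
    "forward primer (10 µM stock)": 1.0,
    "reverse primer (10 µM stock)": 1.0,
    "H2O": 6.0,
    "TB Green Premix Ex Taq II": 10.0,
}


def total_volume_ul(mix):
    """Sum the component volumes (µL) of a reaction mix."""
    return sum(mix.values())


def final_concentration_um(stock_um, added_ul, total_ul):
    """Final concentration (µM) after diluting a stock into the total volume."""
    return stock_um * added_ul / total_ul


total = total_volume_ul(QPCR_MIX_UL)
primer_final = final_concentration_um(10.0, 1.0, total)
print(f"total volume: {total} µL, each primer: {primer_final} µM")
```

Under these stated volumes, the components add up to the 20 μL reaction and each primer ends up at 0.5 μM.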
The enrichment efficiency was calculated using the following equations.
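$$\Delta C_t^{\text{input}} = C_t(\text{input 5hmU-DNA}) - C_t(\text{input control DNA})$$
$$\Delta C_t^{\text{enrich}} = C_t(\text{enriched 5hmU-DNA}) - C_t(\text{enriched control DNA})$$
$$\Delta\Delta C_t = \Delta C_t^{\text{enrich}} - \Delta C_t^{\text{input}}$$
$$\text{Enrichment fold} = 2^{-\Delta\Delta C_t}$$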
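The ΔΔCt calculation can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' analysis script: the function name `enrichment_fold` and the Ct values in the usage example are hypothetical, and the base of 2 assumes that both amplicons amplify with close to 100% efficiency.

```python
def enrichment_fold(ct_input_5hmu, ct_input_ctrl, ct_enriched_5hmu, ct_enriched_ctrl):
    """Return the enrichment fold 2^(-ΔΔCt) from four Ct values."""
    # ΔCt of 5hmU-DNA relative to control DNA, before pull-down
    delta_ct_input = ct_input_5hmu - ct_input_ctrl
    # ΔCt of 5hmU-DNA relative to control DNA, after pull-down
    delta_ct_enrich = ct_enriched_5hmu - ct_enriched_ctrl
    # ΔΔCt = ΔCt(enrich) - ΔCt(input)
    delta_delta_ct = delta_ct_enrich - delta_ct_input
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values, for illustration only (not data from this study)
print(enrichment_fold(20.0, 20.0, 22.0, 29.0))
```

With these made-up values, ΔCt input is 0, ΔCt enrich is −7, and the enrichment fold is 2^7 = 128.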
To evaluate the PCR amplification efficiencies, the standard curves for amplification of 60-bp 5hmU-DNA and 129-bp control DNA were generated. Briefly, a series of 60-bp 5hmU-DNA and 129-bp control DNA with different concentrations were prepared and used as templates for real-time qPCR. The PCR amplification efficiency was calculated using the following equation.
$$\text{Amplification efficiency} = \left(10^{-1/\text{slope}} - 1\right) \times 100\%$$
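The efficiency formula can be applied to a standard curve as in the following sketch. It assumes the slope comes from an ordinary least-squares fit of Ct against log10 of template amount, which is the usual convention but is not spelled out in this section; the function names and the dilution-series values are hypothetical.

```python
def fit_slope(log10_amounts, ct_values):
    """Least-squares slope of Ct versus log10(template amount)."""
    n = len(log10_amounts)
    mean_x = sum(log10_amounts) / n
    mean_y = sum(ct_values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_amounts, ct_values))
    sxx = sum((x - mean_x) ** 2 for x in log10_amounts)
    return sxy / sxx


def amplification_efficiency(slope):
    """Amplification efficiency in percent: (10^(-1/slope) - 1) * 100."""
    return (10 ** (-1.0 / slope) - 1.0) * 100.0


# Hypothetical 10-fold dilution series (illustrative values, not from this study)
slope = fit_slope([0.0, 1.0, 2.0, 3.0], [30.0, 26.7, 23.4, 20.1])
print(f"slope = {slope:.2f}, efficiency = {amplification_efficiency(slope):.1f}%")
```

A slope of about −3.32 corresponds to perfect doubling per cycle (100% efficiency); shallower or steeper slopes give efficiencies above or below 100%, respectively.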
For the streptavidin pull-down assay, streptavidin-coupled beads were pre-washed three times with 1× binding buffer (5 mM Tris–HCl, pH 7.0, 0.5 mM EDTA, 1 M NaCl, and 0.05% Tween 20) and resuspended in 50 μL of 2× binding buffer (10 mM Tris–HCl, pH 7.0, 1 mM EDTA, 2 M NaCl, and 0.1% Tween 20). The biotin-labeled DNA was added to the resuspended streptavidin-coupled beads and incubated at 25 °C for 25 min with gentle rotation. Then the beads were washed five times with 1× binding buffer. To release the biotin-labeled DNA, 50 mM freshly prepared dithiothreitol (DTT) was added to the beads and incubated at 37 °C for 2 h. Then the supernatant was collected, and DNA was purified by ethanol precipitation. The enriched DNA was subjected to library preparation for high-throughput sequencing.
hSMUG1 can specifically catalyze the hydrolysis of the N-glycosidic bond of 5hmU to form an AP site, which can be cleaved to generate a gap by alkaline hydrolysis at high temperature.32 We therefore also used hSMUG1 to examine the phosphorylation of 5hmU. A duplex 60-bp 5hmU-DNA was processed with 5hmUDK followed by sequential hSMUG1 and NaOH treatment (Fig. 2D). The results showed that phosphorylation of 5hmU in duplex 5hmU-DNA by 5hmUDK could prevent the hydrolysis of the N-glycosidic bond of 5hmU by hSMUG1 (Fig. 2E, lane 2). However, the 60-bp 5hmU-DNA without 5hmUDK treatment was cleaved by sequential hSMUG1 and NaOH treatment (Fig. 2E, lane 3). These results further confirmed the successful phosphorylation of 5hmU. It was reported that 5hmU could form a 5hmU:A base pair (when 5hmU is generated from oxidation of thymine by ROS or TETs) and a 5hmU:G mispair (when 5hmU is generated from deamination of 5hmC).23 Using the same assay, we found that 5hmU in both the 5hmU:A base pair and the 5hmU:G mispair can be phosphorylated by 5hmUDK and thus resisted the subsequent hydrolysis by hSMUG1 and NaOH treatment (Fig. 2E, lanes 2 and 4). Interestingly, the hSMUG1 assay demonstrated that 5hmU in ssDNA could also be phosphorylated by 5hmUDK (Fig. 2E, lanes 6 and 7).
The reaction for phosphorylation of 5hmU was also carried out for different times (0.5 h and 12 h). Complete conversion of 5hmU to alkynyl-5hmU or N3-5hmU could be achieved within 0.5 h (Fig. 3D). In addition, the reaction at 37 °C for 12 h did not lead to obvious degradation of alkynyl- or N3-labeled DNA (Fig. 3D), suggesting that the alkynyl- or N3-labeled DNA was relatively stable. The treatment of 5pmU-DNA by calf intestine alkaline phosphatase (CIAP) and E. coli C75 alkaline phosphatase (EAP) resulted in the cleavage of 5pmU-DNA (Fig. S3 in the ESI†), indicating that the phosphate group in 5hmU could be removed by CIAP or EAP. Unlike CIAP and EAP, shrimp alkaline phosphatase (SAP) showed weak dephosphorylation activity toward 5pmU (Fig. S3 in the ESI†). However, all three alkaline phosphatases showed weak dephosphorylation activity toward alkynyl-5hmU and no dephosphorylation activity toward N3-5hmU (Fig. S3 in the ESI†).
The selective labeling of 5hmU with N3 or alkynyl groups endows the 5hmU-containing DNA with appropriate groups for the bioorthogonal reaction, which can be employed to incorporate the biotin linker for subsequent enrichment. To this end, we evaluated the bioorthogonal reaction between N3-5hmU and DBCO-Cy3. The results showed that the product of the reaction between N3-5hmU DNA and DBCO-Cy3 migrated more slowly in the gel than the unlabeled DNA (Fig. S4 in the ESI†), indicating the successful bioorthogonal reaction between N3-5hmU DNA and DBCO-Cy3.
Since the N3 or alkynyl group could be successfully added to 5hmU, we used two reagents, N3-biotin and DBCO-SS-biotin, to react with alkynyl-5hmU and N3-5hmU, respectively (Fig. 4A). The results showed that the reaction of N3-5hmU with DBCO-SS-biotin or alkynyl-5hmU with N3-biotin led to slower migration in gel electrophoresis (Fig. 4B, lanes 8 and 9), indicating the successful bioorthogonal reaction. In addition, liquid chromatography-ultraviolet detection also confirmed the successful installation of the N3-phosphate group on 5hmU and the bioorthogonal labeling with DBCO-SS-biotin (Fig. S5 in the ESI†). In contrast, neither the T-DNA nor the U-DNA substrate was labeled by N3-biotin or DBCO-SS-biotin (Fig. 4B, lanes 1–6). The bioorthogonal reaction between alkynyl-5hmU and N3-biotin required a 12 h incubation to achieve complete labeling, whereas the reaction between N3-5hmU and DBCO-SS-biotin required only a 2 h incubation. Thus, we chose N3-ATP to label 5hmU and incorporated the biotin tag with DBCO-SS-biotin in the subsequent experiments. We also used mixtures of ATP and N3-ATP at different ratios to label 5hmU-DNA, followed by biotinylation with DBCO-SS-biotin. The results demonstrated that N3-ATP could be proportionally transferred to 5hmU-DNA (Fig. S6 in the ESI†), indicating that 5hmUDK has no significant preference for ATP over N3-ATP.
We further used synthesized DNA (5hmU-containing 60-bp duplex DNA and 129-bp control duplex DNA, Table S1 in the ESI†) to evaluate the enrichment efficiency of the 5hmUDK-mediated bioorthogonal labeling assay. A mixture of 0.1 ng of 60-bp duplex 5hmU-DNA, 0.1 ng of 129-bp duplex control DNA, and 10 μg of fragmented HEK293T genomic DNA was processed with the 5hmUDK-mediated N3 incorporation and DBCO-SS-biotin bioorthogonal reaction. The resulting biotin-labeled DNA was enriched using streptavidin-coupled beads and then amplified using primers specific for the 5hmU-containing 60-bp duplex DNA or the 129-bp control duplex DNA. The real-time qPCR amplification efficiencies were evaluated with the constructed standard curves. The results showed that each standard curve had an R² value of at least 0.99 and the amplification efficiencies were between 90% and 103% (Fig. S7 in the ESI†), which are suitable for the enrichment evaluation. The quantification results showed that a 170-fold enrichment for 5hmU-DNA was obtained (Fig. 4C), demonstrating that 5hmU-DNA could be efficiently enriched by the 5hmUDK-mediated bioorthogonal labeling assay.
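To give a sense of scale for the spike-in experiment, the following sketch estimates the mass fraction of the 5hmU-DNA in the genomic background and the approximate number of duplex copies. The function names are hypothetical, and the conversion uses the common approximation of about 650 g/mol per base pair of double-stranded DNA, which is an assumption not stated in the text and which ignores the small mass contribution of the 5hmU modification.

```python
AVOGADRO = 6.02214076e23          # molecules per mole
GRAMS_PER_MOL_PER_BP = 650.0      # assumed average mass of one dsDNA base pair


def mass_fraction(spike_ng, background_ug):
    """Mass of spike-in DNA relative to background DNA (dimensionless)."""
    return spike_ng / (background_ug * 1000.0)


def duplex_copies(mass_ng, length_bp):
    """Approximate number of duplex molecules in a given mass of dsDNA."""
    grams = mass_ng * 1e-9
    moles = grams / (length_bp * GRAMS_PER_MOL_PER_BP)
    return moles * AVOGADRO


print(f"mass fraction of spike-in: {mass_fraction(0.1, 10.0):.0e}")
print(f"60-bp 5hmU-DNA copies:   {duplex_copies(0.1, 60):.2e}")
print(f"129-bp control copies:   {duplex_copies(0.1, 129):.2e}")
```

Under these assumptions, each spike-in makes up roughly one part in 100,000 of the total DNA by mass, and 0.1 ng of the 60-bp duplex corresponds to on the order of 10^9 molecules.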
We found 2348 peaks in the two pull-down replicates in comparison with the input samples, and an overlap of 679 peaks between the two pull-down replicates (Fig. 5A and S10 in the ESI†). The results showed that 62.15% of the peaks were located in gene regions, 27.69% in intergenic regions, 9.57% in promoter regions, and 0.59% in downstream regions (Fig. 5B). These peaks were mainly enriched between the transcription start sites (TSS) and transcription end sites (TES) (Fig. 5C). We used real-time qPCR to verify the 5hmU-rich regions obtained by high-throughput sequencing. The results showed that all 7 examined regions were enriched compared to the control (Fig. S11A and S12 in the ESI†), indicating that these regions should contain the 5hmU modification. In contrast, no enrichment was observed for the vicinal regions of these peaks (Fig. S11B in the ESI†). Moreover, use of ATP instead of N3-ATP in the assay led to no enrichment of 5hmU-containing DNA fragments (Fig. S13 in the ESI†), further confirming that the enriched DNA fragments should contain 5hmU. We could also clearly observe the enrichment of 5hmU-containing fragments in comparison with the input sample (Fig. 5D and S12 in the ESI†). The motif analysis showed that 5hmU preferentially occurred within the sequence motif AATATGCCA (Fig. 5E).
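The idea of counting peaks shared between two replicates can be illustrated with a simple interval-overlap check. This sketch is not the pipeline used in the study, which is not described in this section; the function name, the half-open `(chrom, start, end)` peak representation, the "any overlap" criterion, and the example coordinates are all assumptions made for illustration.

```python
def overlapping_peaks(rep1, rep2):
    """Return peaks from rep1 that overlap at least one peak in rep2.

    Peaks are (chrom, start, end) tuples with half-open coordinates,
    so peaks that merely touch end-to-start do not count as overlapping.
    """
    # Group replicate-2 intervals by chromosome for lookup
    by_chrom = {}
    for chrom, start, end in rep2:
        by_chrom.setdefault(chrom, []).append((start, end))

    shared = []
    for chrom, start, end in rep1:
        candidates = by_chrom.get(chrom, [])
        # Two half-open intervals overlap when each starts before the other ends
        if any(start < e2 and s2 < end for s2, e2 in candidates):
            shared.append((chrom, start, end))
    return shared


# Hypothetical peak coordinates, for illustration only
rep1 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
rep2 = [("chr1", 150, 250), ("chr2", 80, 120)]
print(overlapping_peaks(rep1, rep2))
```

In this toy example only the first chr1 peak is shared. The nested scan is quadratic per chromosome, which is fine for a few thousand peaks; larger peak sets would typically be handled with sorted intervals or dedicated interval tools.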
Given that 5hmUDK specifically catalyzes the phosphorylation of 5hmU, the pulled-down DNA is enriched in 5hmU and is ready for high-throughput sequencing. Based on the 5hmUDK-mediated bioorthogonal labeling and enrichment procedure presented here, we provide the first map of 5hmU in the T. brucei genome. Because 5hmUDK can modify both matched (5hmU:A) and mismatched (5hmU:G) 5hmU, the method should be useful in mapping 5hmU from both sources. Taken together, our study highlights the application of this method in mapping the genomic distribution of 5hmU and provides a useful tool for probing the functions of 5hmU. Future experiments will utilize this method for 5hmU profiling in other organisms, such as in mESC genomic DNA.
Footnotes
† Electronic supplementary information (ESI) available: Tables S1–S3; Fig. S1–S13. See DOI: 10.1039/d1sc03812e
‡ These authors contributed equally to this work.
This journal is © The Royal Society of Chemistry 2021