Anna Witte (a), Álvaro Muñoz-López (a), Malte Metz (b), Michal R. Schweiger (c, d), Petra Janning (b) and Daniel Summerer* (a)
a: Faculty of Chemistry and Chemical Biology, TU Dortmund University, Otto-Hahn Str. 4a, 44227 Dortmund, Germany
b: Max-Planck Institute for Molecular Physiology, Otto-Hahn Str. 4a, 44227 Dortmund, Germany
c: Institute for Translational Epigenetics, Medical Faculty, University of Cologne, Weyertal 115b, 50931 Köln, Germany
d: Center for Molecular Medicine Cologne, Robert-Koch-Str. 21, 50931 Cologne, Germany. E-mail: Daniel.summerer@tu-dortmund.de
First published on 20th October 2020
Enrichment of chromatin segments from specific genomic loci of living cells is an important goal in chromatin biology, since it enables establishing local molecular compositions as the basis of locus function. A central enrichment strategy relies on the expression of DNA-binding domains that selectively interact with a local target sequence followed by fixation and isolation of the associated chromatin segment. The efficiency and selectivity of this approach critically depend on the employed enrichment tag and the strategy used for its introduction into the DNA-binding domain or close-by proteins. Here we report chromatin enrichment by expressing programmable transcription-activator-like effectors (TALEs) bearing single strained alkynes or alkenes introduced via genetic code expansion. This enables in situ biotinylation at a defined TALE site via strain-promoted inverse electron demand Diels–Alder cycloadditions for single-step, high-affinity enrichment. By targeting human pericentromeric SATIII repeats, the origin of nuclear stress bodies, we demonstrate enrichment of SATIII DNA and SATIII-associated proteins, and identify factors enriched during heat stress.
A critical component of expressed DBDs is thus the employed tag and the affinity and selectivity of the interaction used for capturing the target chromatin segments. Moreover, the strategy used for introducing the tag into the DBD itself or into other proteins in proximity critically affects the selectivity and thus the successful enrichment and identification of locus-bound chromatin proteins. Besides the use of standard epitope tags for immunoprecipitation that are fused to the employed DBD, strategies to introduce biotin moieties into the target chromatin segment have become increasingly important, given the superior affinity and selectivity of streptavidin-based enrichment.2 This has so far been achieved by enzymatic approaches, such as the use of BirA biotin ligase that is either co-expressed as a wild-type enzyme to biotinylate target peptides contained in the DBD,4 or fused to the DBD itself as a mutant with nonspecific activity (BirA*) to biotinylate other proteins.5 Alternatively, biotinylation has been reported with fusions of the DBD to APEX2, an ascorbate peroxidase that can generate highly reactive biotin–phenoxyl radicals for nonspecific tagging of other proteins.6–8 These concepts differ in the extent of off-target biotinylation and thus overall selectivity due to differential inherent needs for overexpressing the biotinylating enzyme, to diffusion of reactive species, or to biotinylation of proteins that are in spatial proximity without being associated with the target locus.
We aimed to introduce a complementary chemical–biological biotinylation approach that substantially differs in concept and thus in off-target biotinylation behavior. We envisaged developing encoded programmable DNA binding domains that bear a single strained alkene or alkyne moiety for rapid and bioorthogonal click-biotinylation at defined sites for subsequent streptavidin-bead enrichment of target loci (Fig. 1a). We chose to install alkenes/alkynes in the form of noncanonical amino acids (ncAAs) via genetic code expansion.9–11 A range of lysine-based ncAAs bearing strained alkene and alkyne side chains is available that can be incorporated in response to the amber codon (TAG) with high efficiency and fidelity12 via co-expression of orthogonal pyrrolysyl-tRNA-synthetase (PylRS)/tRNAPyl pairs.13 These react with tetrazines in strain-promoted inverse electron demand Diels–Alder cycloadditions (SPIEDAC) with high bioorthogonality and rapid kinetics.12 For engineering, we employed transcription-activator-like effector (TALE) proteins that feature a central repeat domain (CRD) recognizing DNA in a one-repeat-to-one-nucleobase pairing mode via a variable di-residue in each repeat.14,15 This offers a high degree of programmability and selective targeting of genomic loci in vivo.16 Incorporation of suitable ncAAs via amber suppression should allow the expression and target-binding of click-functionalized TALEs in mammalian cells, followed by chromatin fixation, click-biotinylation with biotin–tetrazine conjugates, and single-step, high-affinity enrichment with streptavidin beads for downstream analysis (Fig. 1a and b).
Fig. 1 Enrichment of user-defined chromatin segments by encoded, click-reactive DNA binding domains. (a) Concept. (b) Workflow. FP: fluorescent protein.
We targeted repetitive pericentromeric satellite III DNA (SATIII) for enrichment. SATIII is the origin of nuclear stress bodies (nSB),17,18 a class of membrane-less organelles19,20 that is activated by recruitment of the transcription factor heat shock factor 1 (HSF1) upon stress conditions such as heat. This leads to transcription of long noncoding SATIII RNAs that seem to act as architectural RNAs in stress body formation. Besides the poorly understood mechanisms of nSB formation, the relatively high genomic abundance of repeats makes SATIII a realistic initial target for the development of a single-step enrichment methodology.21 We designed the construct “SATIII-TALE_wt”, bearing an N-terminal nuclear localization sequence, the TALE N-terminal region (NTR), the programmable CRD and a C-terminal mCherry domain (Fig. 2a). SATIII-TALE_wt targets the CpG-free SATIII sequence TGATTCCATTCCATTCCATT,22 allowing DNA-binding without potential interference by CpG cytosine 5-methylation.23 We initially confirmed selective binding of SATIII-TALE_wt to its target sequence in HEK293T cells by co-imaging with HSF1 after heat shock (Fig. S2†), and by activation of a luciferase reporter model (Fig. S3†).
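The one-repeat-per-nucleobase recognition mode and the CpG-free property of the target can be illustrated with a minimal sketch. The base-to-di-residue mapping below (A: NI, C: HD, G: NN, T: NG) is the commonly used canonical TALE code and is an assumption here, since the text does not list the di-residues of SATIII-TALE_wt. The sketch also assigns one repeat to every base and ignores that the 5′-T of TALE targets is usually recognized by the N-terminal region rather than by a repeat. The function names are hypothetical.

```python
# Canonical TALE di-residue code (assumed; not taken from the paper).
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}

# SATIII target sequence quoted in the text.
SATIII_TARGET = "TGATTCCATTCCATTCCATT"


def rvds_for_target(target: str) -> list[str]:
    """Return one di-residue per nucleobase of the target (one repeat per base)."""
    return [RVD_FOR_BASE[base] for base in target.upper()]


def has_cpg(target: str) -> bool:
    """True if the sequence contains a CpG dinucleotide.

    CpG is its own reverse complement, so checking one strand is sufficient.
    """
    return "CG" in target.upper()


if __name__ == "__main__":
    print(len(SATIII_TARGET), "repeats:", "-".join(rvds_for_target(SATIII_TARGET)))
    print("contains CpG:", has_cpg(SATIII_TARGET))
```

For the quoted 20 nt sequence the CpG check is negative, consistent with the statement that binding should not be affected by CpG cytosine 5-methylation.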
Fig. 2 Expression of click-reactive TALE proteins in HEK293T cells. (a) Overview of the employed TALE-mCherry fusion constructs used in this study. (b) Alkene- and alkyne-bearing ncAAs used in this study (Boc = Nε-Boc-L-lysine serving as a non-reactive control ncAA). (c) Amber codon positions within the TALE NTR studied for the efficiency and fidelity of ncAA incorporation. (d) Fluorescence images of HEK293T cells transfected with PylRS_AF and indicated pSATIII-TALE amber mutants grown in the presence or absence of ncAA TCO. The image on the right is from cells transfected with non-amber pSATIII-TALE_wt only (scale bar: 100 μm). (e) Cell images of indicated pSATIII-TALE amber mutants showing ability or inability to bind SATIII loci (scale bars: 10 μm). See Fig. S1† for Hoechst co-stain.
We selected three alkene- or alkyne-bearing lysine ncAAs for incorporation, bearing either a trans-cyclooctene (TCO24,25), a [6.1.0]-bicyclononyne (BCN25,26), or a cyclooctyne (SCO27) moiety attached to the lysine Nε-atom via a carbamate linker (Fig. 2b). These ncAAs are substrates for tRNAPyl aminoacylation by the polyspecific Methanosarcina mazei PylRS mutant Y306A, Y384F (PylRS_AF, bearing an additional nuclear export sequence28) for incorporation in response to the amber codon in mammalian cells, and can be employed for SPIEDAC reactions with a variety of tetrazines. Though arylazide ncAAs have successfully been used for enrichment and MS analysis of overexpressed proteins,29 the higher reaction rates of tetrazines should better match the particular sensitivity challenges associated with the enrichment of highly dilute chromatin segments from repetitive or even single loci.2 Since the efficiency and fidelity of ncAA-incorporation are critical for sensitive and selective enrichment, we initially evaluated a total of ten different amino acid positions in the TALE NTR (Fig. 2c).
We chose positions to be surface-exposed, and to not be involved in DNA-binding or stabilization of important secondary structures. We co-transfected HEK293T cells with vectors pPylRS_AF encoding the (PylRS)/tRNAPyl pair as well as pSATIII-TALE vectors encoding the respective TALE amber mutant, and grew the cells for 24 h in the presence or absence of 0.25 mM ncAA TCO. We imaged the cells, and recorded the fluorescence signals of the C-terminal mCherry domain to judge the efficiency and fidelity of suppression with TCO at each amber codon. We observed a high degree of context-dependence,30 with certain mutants exhibiting low efficiency (Fig. 2d, see V92, I98), or low fidelity of TCO incorporation (indicated by fluorescence in the absence of ncAA, see A104). Moreover, even for positions with good incorporation efficiency and fidelity, the ncAA was able to cause a loss of SATIII DNA binding, visible by the absence of the characteristic foci (i.e. position L80, Fig. 2e).
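The readout described above was judged visually from images. If one wished to express the same comparison numerically, a minimal sketch could look as follows. The two ratios and all names are hypothetical illustrations, not metrics used in this work: efficiency is taken as the mCherry signal with ncAA relative to the non-amber construct, and read-through as the signal without ncAA relative to the signal with ncAA (high read-through corresponding to low fidelity).

```python
def suppression_metrics(f_plus_ncaa: float, f_minus_ncaa: float,
                        f_wildtype: float, background: float = 0.0) -> dict:
    """Hypothetical efficiency and read-through ratios from mCherry intensities.

    f_plus_ncaa  : mean signal of the amber mutant grown with ncAA
    f_minus_ncaa : mean signal of the amber mutant grown without ncAA
    f_wildtype   : mean signal of the non-amber construct
    background   : signal of untransfected cells, subtracted from all values
    """
    plus = f_plus_ncaa - background
    minus = f_minus_ncaa - background
    wt = f_wildtype - background
    if plus <= 0 or wt <= 0:
        raise ValueError("background-corrected +ncAA and wild-type signals must be positive")
    return {
        "efficiency": plus / wt,        # near 1.0: expression comparable to wild type
        "readthrough": minus / plus,    # near 0.0: high incorporation fidelity
    }


if __name__ == "__main__":
    # Placeholder intensities in arbitrary units, not measured data.
    print(suppression_metrics(f_plus_ncaa=80.0, f_minus_ncaa=4.0, f_wildtype=100.0))
```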
We selected the SATIII-TALE_S36, _E50 and _S58 amber mutants for further analyses, since they exhibited correct binding and high incorporation fidelity, with SATIII-TALE_S36 amber even showing expression levels comparable to those of the non-amber SATIII-TALE_wt (Fig. 2d). We employed these mutants in SPIEDAC reactions with fluorescein (FAM)-bearing H-tetrazine conjugates (FAM-tetrazine) in fixed cells, since this allows simple visual assessment of correct SATIII binding after the treatment, but also of SPIEDAC reactivity and selectivity directly in the relevant, cross-linked chromatin context. We transfected HEK293T cells and grew them in the presence of TCO as before, fixed them with formaldehyde, and reacted them with FAM-tetrazine. Imaging of mCherry revealed the SATIII-typical genomic localization of all three TALEs (Fig. 3a). However, only SATIII-TALE_S36TCO showed a co-localized FAM signal, indicating that the TCO side chain was accessible for the tetrazine only at this incorporation site (Fig. 3a). We repeated the experiment with this TALE in cells grown either in the presence of TCO or the control ncAA Boc (Fig. 2b). Reactions with a biotin–tetrazine conjugate followed by staining with streptavidin-Cy7 and imaging revealed a Cy7 signal co-localized to mCherry only for the TCO cells (Fig. 3b). This shows that biotin can be introduced site-selectively at the trans-cyclooctene moiety of SATIII-TALE_S36TCO by SPIEDAC, and that the biotin is available for later streptavidin binding.
To evaluate the three alkene- and alkyne-bearing ncAAs TCO, SCO and BCN for their biotinylation efficiencies in cross-linked chromatin, we performed time-resolved measurements. We stained cells grown in the presence of each ncAA or the control ncAA Boc with Cy7-streptavidin after incubation with biotin–tetrazine for 5 min, 1 hour, or 3 hours. We observed a weak Cy7 signal for TCO already after 5 min, whereas signals for BCN (and faintly for SCO) were visible only after 3 hours (Fig. 3c; no signal was observed for the Boc negative control). In model reactions using ncAA-bearing GFP protein, SCO reacts comparatively slowly in SPIEDAC with H-tetrazines (0.67 × 10^3 M^−1 s^−1) in vitro.31 In contrast, under the same experimental conditions, both BCN and TCO have been reported to react much faster, and with similar rate constants (1.6 × 10^4 M^−1 s^−1 for BCN and 1.3 × 10^4 M^−1 s^−1 for TCO). The difference we observe between TCO and BCN compared to the in vitro model system could arise for various reasons, including differential ncAA accessibility in fixed chromatin, or different off-target reactivity with thiols.31,32
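The in vitro rate constants quoted above can be put into perspective with a pseudo-first-order estimate, in which the tetrazine is in large excess over the ncAA and the half-life is ln 2 / (k · [tetrazine]). The sketch below is only an illustration: the tetrazine concentration of 10 µM is an arbitrary placeholder and not a condition reported in this work, and the function name is hypothetical.

```python
import math

# Second-order rate constants quoted in the text (M^-1 s^-1, in vitro, GFP model).
RATE_CONSTANTS = {"SCO": 0.67e3, "BCN": 1.6e4, "TCO": 1.3e4}


def half_life_s(k: float, tetrazine_molar: float) -> float:
    """Pseudo-first-order half-life in seconds, assuming tetrazine in large excess."""
    if k <= 0 or tetrazine_molar <= 0:
        raise ValueError("rate constant and concentration must be positive")
    return math.log(2) / (k * tetrazine_molar)


if __name__ == "__main__":
    tetrazine = 10e-6  # 10 uM, assumed placeholder concentration
    for ncaa, k in RATE_CONSTANTS.items():
        print(f"{ncaa}: t1/2 = {half_life_s(k, tetrazine):.1f} s")
```

Under this idealized model BCN and TCO would be expected to behave almost identically and both considerably faster than SCO, which is why the slower BCN labeling observed in fixed chromatin points to factors beyond the intrinsic rate constants.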
We next asked whether SATIII loci could be selectively enriched by streptavidin bead capture with sheared chromatin from fixed cells expressing SATIII-TALE_S36TCO. We cultured, fixed, and reacted cells with biotin–tetrazine, purified the nuclei, and sheared them with ultrasound. We performed a capture/purification step with streptavidin magnetic beads, reversed the formaldehyde crosslinks with heat, and purified the released DNA. qPCR analyses showed a clear enrichment of SATIII DNA for cells expressing SATIII-TALE_S36TCO and reacted with biotin–tetrazine compared to identically treated cells expressing TALE_S36Boc (Fig. 3d). Similarly, cells expressing SATIII-TALE_S36TCO but not reacted with biotin–tetrazine showed only low enrichment (Fig. 3d). This indicates that enrichment of SATIII DNA depends on a successful click reaction between the installed TCO and biotin–tetrazine. We performed identical experiments with cells expressing a TALE_S36TCO with the same repeat composition, but randomized repeat sequence (scrambled TALE, sc) that does not bind to the SATIII target sequence in our in vivo luciferase reporter model (Fig. S3†). We again observed a low enrichment of SATIII DNA, indicating that the enrichment depends on the selective interaction of the SATIII TALE with its target sequence. Moreover, a potential background of endogenous, amber-terminated proteins bearing TCO via off-target amber suppression does not seem to contribute strongly to the enrichment of the SATIII target (this background should be identical for SATIII-TALE_S36TCO and scTALE_S36TCO). Finally, we analysed DNA captured from cells expressing SATIII-TALE_S36TCO and reacted or not reacted with biotin–tetrazine for enrichment of an off-target locus in the MGMT gene by qPCR. Whereas we observed a ∼100-fold enrichment of the target SATIII locus for biotin–tetrazine-reacted as compared to non-reacted cells, we observed only a ∼4-fold enrichment of the MGMT locus (Fig. 3e).
This again indicates enrichment via a selective interaction of the TALE with the SATIII target sequence. It should be noted that SATIII repeats are heterogeneous in sequence and our qPCR based on two (30 and 22 nt) primers is thus expected to amplify only a very small subfraction of SATIII DNA, whereas the single 20 nt sequence bound by our TALE is expected to occur much more frequently. Hence, our qPCR data are expected to strongly underestimate the actual enrichment by the TALE and should not be extrapolated directly to proteomics analyses of enriched chromatin, since these will measure all TALE-enriched chromatin segments.
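How fold enrichments of this kind relate to qPCR threshold cycles can be sketched as follows. The text does not state how the enrichment values were calculated, so the exponential model below (a fold change of efficiency^ΔCt, with an assumed amplification efficiency of 2 per cycle) is an assumption, and the function names and Ct values are hypothetical. The selectivity ratio simply divides the on-target by the off-target fold enrichment, using the approximate values reported above.

```python
def fold_enrichment(ct_enriched: float, ct_control: float,
                    efficiency: float = 2.0) -> float:
    """Fold enrichment of a locus in the enriched vs. the control sample.

    A lower Ct in the enriched sample means more template; an amplification
    efficiency of 2.0 (perfect doubling per cycle) is assumed by default.
    """
    return efficiency ** (ct_control - ct_enriched)


def selectivity(target_fold: float, off_target_fold: float) -> float:
    """Ratio of on-target to off-target fold enrichment."""
    if off_target_fold <= 0:
        raise ValueError("off-target fold enrichment must be positive")
    return target_fold / off_target_fold


if __name__ == "__main__":
    # Placeholder Ct values, not measured data: a 5-cycle shift is a 32-fold change.
    print(fold_enrichment(ct_enriched=20.0, ct_control=25.0))
    # Approximate values from the text: ~100-fold (SATIII) vs. ~4-fold (MGMT).
    print(selectivity(100.0, 4.0))
```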
To compare the enrichment selectivity of our approach with an alternative, established biotinylation approach for targeted chromatin enrichment, we designed fusion constructs consisting of the SATIII-TALE or the scTALE and a BirA* domain. In imaging studies with HEK293T cells, both TALEs exhibited expected localization under heat shock (HS) and normal (-HS) conditions, i.e. colocalization with HSF1 of the SATIII-TALE under HS conditions, and diffuse nuclear localization of the scTALE under both conditions (Fig. 3g). Enrichment experiments conducted under identical conditions to those used to generate the results shown in Fig. 3d showed successful enrichment of SATIII DNA with the SATIII-TALE-BirA* that depended on the presence of biotin in the growth medium (Fig. 3h). However, we observed an even higher enrichment of SATIII DNA for the scTALE-BirA*, indicating that this enrichment was not dependent on the selective binding of the TALE to the SATIII target sequence (Fig. 3h). Though the two approaches are not fully comparable because of their different biotinylation concepts, the data show a higher enrichment selectivity of our approach for this DNA target.
We next subjected enriched chromatin samples to on-bead tryptic digests and nano-high performance liquid chromatography mass spectrometry (nanoHPLC-MS/MS), and analysed the amount of enriched TALE by label-free quantification (LFQ). We detected TALE-specific peptides in chromatin of SATIII-TALE_S36TCO expressing cells only if they had been reacted with biotin–tetrazine (Fig. 3e). Similarly, no TALE peptides were detectable for HEK293T or HeLa cells expressing TALE_S36Boc (Fig. 3e, similar data were observed for cells grown under heat-shock conditions, Fig. S4†).
To discover proteins that are recruited to SATIII DNA during nuclear stress body formation, we next performed proteomics studies with cells grown under heat shock (HS) and normal (nHS) conditions. To take advantage of our two-step biotinylation approach compared to potential direct incorporation of a biotin ncAA,33 we blocked endogenous biotin sites before click-biotinylation to reduce background enrichment. Specifically, we followed the protocol as before, but before the addition of biotin–tetrazine, we incubated the cells with avidin, added an excess of free biotin to block remaining binding sites at the avidin, and washed the cells. HEK293T and HeLa cells expressing SATIII-TALE_S36TCO showed normal behaviour after HS, i.e. co-localization of endogenous, immuno-stained HSF1 and TALE at SATIII-typical foci (Fig. 4a). We first compared HEK293T and HeLa cells transfected with pSATIII-TALE_S36TCO or an empty vector (EV) negative control driving the expression of a non-TALE, mCherry-containing truncation product under nHS conditions. In both cases, the TALE was significantly enriched compared to the negative control, whereas mCherry was not (Fig. 4b). We further found enrichment of proteins known to be associated with centromeric and/or pericentromeric regions, including satellite DNA (XRCC5 for HEK293T cells; NAP1L4, YWHAH for HeLa cells, Fig. 4b).34–37 We next compared cells expressing SATIII-TALE_S36TCO under HS and nHS conditions. To our surprise, we did not observe enrichment of HSF1 under HS conditions, despite the co-localization with TALE-marked foci we observed in the immunostainings. To evaluate whether this co-localization corresponds to an actual interaction of HSF1 with SATIII DNA that can be stabilized by fixation as a prerequisite for enrichment and MS-detection, we conducted a chromatin-immunoprecipitation (ChIP) with an anti-HSF1 antibody after fixation, and analysed the enrichment by qPCR.
HSF1 was strongly enriched under HS conditions, suggesting that the particular detection limit of HSF1 in nanoHPLC-MS/MS analyses is higher than the amount of protein after enrichment (Fig. 4c). Apart from the particular case of HSF1, we found heat-shock-associated enrichment of other proteins. This included the nuclear ubiquitin-activating enzyme UBA1 that is involved in stress-induced DNA damage repair38 and was enriched in both cell types (Fig. 4d). Similarly, the purine biosynthesis enzyme ATIC, which is known to associate with a number of heat shock proteins as part of the purinosome,39 was enriched. For heat-stressed HeLa cells, we also found enrichment of LGALS3, which has been proposed to stabilize DNA interactions of CREB,40 a transcription factor associated with SATIII DNA18 (Fig. 4d). Finally, though not enriched for TCO HS versus TCO nHS cells, we found the nucleotide-exchange factor HSPH1 to be enriched in TCO HS versus EV HS cells (Fig. S5†). HSPH1 is known to be heat-induced and to control substrate turnover of the chaperones HSPA1A/B by ADP release, and its expression is strictly dependent on HSF1.41 These combined results suggest that protein enrichment by our click-mediated biotinylation and single-step, high-affinity capture is sufficient for MS-identification of proteins that are recruited to our target locus upon heat stress.
Footnote
† Electronic supplementary information (ESI) available. See DOI: 10.1039/d0sc02707c
This journal is © The Royal Society of Chemistry 2020