This is a method to identify novel DNA-interacting proteins at specific target loci, relying on sequence-specific capture of crosslinked chromatin for subsequent proteomic analyses. Neither prior knowledge of potential binding proteins nor cell modification is required. Initially developed for yeast, the technology has now been adapted for mammalian cells.
The hybridization capture of chromatin-associated proteins for proteomics (HyCCAPP) technology was initially developed to uncover novel DNA-protein interactions in yeast. It allows analysis of a target region of interest without the need for prior knowledge about the proteins likely to be bound there. This, in theory, allows HyCCAPP to be used to analyze any genomic region of interest, and it provides sufficient flexibility to work in different cell systems. This method is not meant to study binding sites of known transcription factors, a task better suited for Chromatin Immunoprecipitation (ChIP) and ChIP-like methods. The strength of HyCCAPP lies in its ability to explore DNA regions for which there is limited or no knowledge about the proteins bound to them. It can also be a convenient way to avoid the biases, present in ChIP-like methods, that are introduced by antibody-based chromatin enrichment. Potentially, HyCCAPP can be a powerful tool to uncover truly novel DNA-protein interactions. To date, the technology has been predominantly applied to yeast cells or to high copy repeat sequences in mammalian cells. To become the powerful tool we envision, HyCCAPP approaches need to be optimized to efficiently capture single-copy loci in mammalian cells. Here, we present our adaptation of the initial yeast HyCCAPP capture protocol to human cell lines, and show that single-copy chromatin regions can be efficiently isolated with this modified protocol.
During the past decade, there has been a dramatic improvement in sequencing technologies, allowing the study of a wide range of genomes in large numbers of samples, and with astonishing resolution. The Encyclopedia of DNA Elements (ENCODE) Consortium, a large-scale multi-institutional effort spearheaded by the National Human Genome Research Institute of the National Institutes of Health, has provided insights into how individual transcription factors and other regulatory proteins bind to and interact with the genome. The initial effort characterized specific DNA-protein interactions, as assessed by chromatin immunoprecipitation (ChIP) for over 100 known DNA-binding proteins1. Alternative methods such as DNase footprinting2 and formaldehyde-assisted isolation of regulatory elements (FAIRE)3 have also been used to locate specific regions of the genome interacting with proteins, but with the obvious limitation that these experimental approaches do not identify the interacting proteins. Despite extensive efforts over the past years, no technology has emerged that efficiently allows the comprehensive characterization of protein-DNA interactions in chromatin, and the identification and quantification of chromatin-associated proteins.
To address this need, we developed a novel approach, which we termed Hybridization Capture of Chromatin-Associated Proteins for Proteomics (HyCCAPP). Initially developed in yeast4,5,6, the approach isolates crosslinked chromatin regions of interest (with bound proteins) using sequence-specific hybridization capture. After isolation of the protein-DNA complexes, approaches such as mass spectrometry can be used to characterize the set of proteins bound to the sequence of interest. Thus, HyCCAPP can be considered an unbiased approach to uncover novel DNA-protein interactions, in the sense that it does not rely on antibodies and is completely agnostic about the proteins that might be found. There are other approaches capable of uncovering novel DNA-interacting proteins7, but most rely on ChIP-like methods8,9,10, plasmid insertions11,12,13,14, or regions with high copy numbers15. Some of these methods have valuable features, notably avoiding the need for DNA-protein crosslinking reactions. The unique feature of HyCCAPP, in contrast, is that it can be applied to multi- and single-copy regions in unmodified cells, without any prior knowledge about putative binding proteins and without the need for available antibodies.
At this point, HyCCAPP has predominantly been applied to the analysis of various genomic regions in yeast4,5,6, and was recently used to analyze protein-DNA interactions in alpha-satellite DNA, a repeat region in the human genome16. As part of our ongoing work, we have adapted the hybridization capture approach initially developed for yeast chromatin to be applicable to the analysis of human cells, and present here a modified protocol that allows the selective capture of single-copy target regions in the human genome with efficiencies similar to our initial studies in yeast. This new optimized protocol now allows the adaptation and utilization of the technology to interrogate protein-DNA interactions across the human genome, using mass spectrometry or other analytical approaches.
It is important to emphasize that the HyCCAPP method is meant for the analysis of specific target regions and is not yet suitable for genome-wide analyses. The technology is especially useful when dealing with regions for which there is scarce information about interacting proteins, or when a more comprehensive in-depth analysis of interacting proteins at a specific genome locus is desired. HyCCAPP is meant to uncover DNA-binding proteins, not to accurately characterize the specific protein binding sites in genomic DNA. In its current implementation, the methodology does not provide information about the DNA binding sequences or motifs for individual proteins. It therefore complements existing technologies such as FAIRE, and may allow the identification of novel binding proteins in genomic regions identified by an initial FAIRE analysis.
1. Capture Oligonucleotide Design
2. Cell Culture
3. Lysis and Shearing
4. Hybridization Capture
5. Evaluation of Capture Yield by qPCR with Primers for the Target Capture Region of Interest
Due to the need for large input amounts of chromatin for HyCCAPP to succeed, cells are grown to relatively high levels of confluency. Trypan blue staining is used to confirm that cell death rates are moderate (<10%). In single-copy experiments, chromatin content prior to hybridization capture needs to be in the femtomolar range, which usually requires at least 10^9 cells as starting material. Prior to full-scale experiments, it is recommended to test capture oligonucleotide performance in smaller batches. Hybridization captures using a titration of oligonucleotides are shown in Figure 1. The number of capture oligonucleotides is gradually increased while the total concentration remains constant. This experiment shows three key aspects. First, it helps to determine, for that particular region, how many oligonucleotides are needed to reach optimal (and maximal) hybridization efficiencies. Second, it reveals whether there are any obvious detrimental interactions within a particular set of oligonucleotides. Third, it shows whether any oligonucleotides have poor specificity and carry substantial amounts of background (in this case measured by qPCR analysis with primers targeting other regions in the genome).
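The input requirement and the titration design described above can be sketched numerically. The following is a minimal illustration, not part of the published protocol: it assumes two copies of a single-copy locus per cell (real cell lines may be aneuploid), and the total oligonucleotide concentration passed to `titration_plan` is a placeholder in arbitrary units. The function names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def locus_femtomoles(n_cells, copies_per_cell=2):
    """Amount of a target locus (in fmol) contained in n_cells cells.

    copies_per_cell=2 is an assumption (diploid, single-copy locus).
    """
    molecules = n_cells * copies_per_cell
    return molecules / AVOGADRO * 1e15


def titration_plan(total_conc, max_oligos):
    """Per-oligonucleotide concentration for 1..max_oligos oligonucleotides
    when the total oligonucleotide concentration is held constant."""
    return {n: total_conc / n for n in range(1, max_oligos + 1)}


if __name__ == "__main__":
    print(f"Locus amount in 1e9 cells: {locus_femtomoles(1e9):.2f} fmol")
    # total_conc=1.0 is a placeholder value in arbitrary units
    for n, conc in titration_plan(total_conc=1.0, max_oligos=4).items():
        print(f"{n} oligo(s): {conc:.3f} units each")
```

Under these assumptions, 10^9 cells carry a few femtomoles of the target locus; the resulting concentration depends on the hybridization volume, which is not specified here.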
The hybridization capture yields will appear modest due to the crosslinked nature of the chromatin sample. A large proportion of fragments in the crosslinked chromatin will not be amenable to hybridization capture. Serial hybridization experiments demonstrate this (Figure 2). If the hybridization capture is run successfully, a subsequent capture from the same chromatin material using a different set of capture oligonucleotides will give yields comparable to, though slightly lower than, those obtained with fresh chromatin (~11% lower on average). But if the same region is targeted again with the same set of oligonucleotides, using the chromatin sample remaining after the first capture, the yield will decrease by about 90%, confirming that most of the hybridization-amenable material has been captured. If such a decrease is not observed after subsequent captures with the same set of oligonucleotides, it indicates a technical problem in the hybridization capture step.
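The serial-capture logic above lends itself to a simple check. The sketch below is illustrative only: the function names are hypothetical, and the 0.5 cutoffs are arbitrary assumptions chosen to separate the roughly 90% drop (same oligonucleotide set) from the roughly 11% drop (different set) reported in the text. They are not validated thresholds.

```python
def relative_yield(second_capture, fresh_capture):
    """Yield of a second capture relative to a capture from fresh chromatin.

    Both inputs are qPCR-derived quantities in the same units.
    """
    return second_capture / fresh_capture


def serial_capture_check(same_set_rel, different_set_rel,
                         max_same=0.5, min_different=0.5):
    """Interpret serial-capture results.

    same_set_rel: relative yield when re-capturing with the SAME oligo set.
    different_set_rel: relative yield when capturing with a DIFFERENT set.
    max_same and min_different are assumed cutoffs, not published values.
    """
    depleted = same_set_rel <= max_same
    retained = different_set_rel >= min_different
    if depleted and retained:
        return "consistent with efficient capture"
    if not depleted:
        return "no depletion with same oligos: possible technical problem"
    return "different-set yield unexpectedly low: inspect the sample"


if __name__ == "__main__":
    # Values mirroring the averages described in the text
    print(serial_capture_check(same_set_rel=0.10, different_set_rel=0.89))
```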
In simple systems like yeast, we have observed capture efficiencies approaching 4%5, which led to the identification of 9 proteins differentially enriched in 2 regions when yeast was grown under different conditions. In human cells, however, whose genome is ~250 times larger than that of yeast, hybridization yields are typically below 1% (Figure 3). As Figure 3 shows, a single experiment might yield insufficient amounts for proteomic analyses. In that case, several captured samples will have to be pooled in order to perform a subsequent mass spectrometry analysis. Alternatively, lower concentrations of captured target region chromatin can be used for targeted analysis approaches such as western blotting and mass spectrometry-based selected reaction monitoring assays.
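The text does not spell out how capture yield is computed from the qPCR data. One common convention, assumed here, is the percent-of-input calculation used for ChIP-qPCR, which presumes roughly 100% amplification efficiency (a doubling per cycle). The sketch below applies that convention and estimates how many captures would need to be pooled; all names and example numbers are hypothetical.

```python
import math


def percent_of_input(ct_input, ct_capture, input_fraction):
    """Capture yield as a percentage of input chromatin.

    ct_input: Ct of the input aliquot for the target-region assay.
    ct_capture: Ct of the captured sample for the same assay.
    input_fraction: fraction of total input that the aliquot represents
                    (e.g. 0.01 for a 1% input aliquot).
    Assumes perfect doubling per cycle, which is an idealization.
    """
    return 100.0 * input_fraction * 2.0 ** (ct_input - ct_capture)


def captures_to_pool(required_amount, amount_per_capture):
    """Number of captures to pool to reach required_amount (same units)."""
    return math.ceil(required_amount / amount_per_capture)


if __name__ == "__main__":
    # Placeholder Ct values, for illustration only
    yield_pct = percent_of_input(ct_input=24.0, ct_capture=25.0,
                                 input_fraction=0.01)
    print(f"Capture yield: {yield_pct:.2f}% of input")
    print("Captures to pool:", captures_to_pool(10, 3))
```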
Figure 1: Box plot representation of capture oligonucleotide titration. Increasing numbers of capture oligonucleotides were tested and analyzed by qPCR. Background refers to qPCR assays targeting other regions in the genome. Values are presented as fold change from hybridization capture using a single oligonucleotide. Whiskers represent the minimum and maximum values.
Figure 2: Box plot representation of serial captures. Sequential hybridization captures with the same set of oligonucleotides (A_A) show a strong drop in enrichment compared to sequential hybridization captures using a different set of oligonucleotides (A_B) (p = 3.88e-5). Enrichment is measured by qPCR and values are shown relative to captures using fresh chromatin. Whiskers represent the minimum and maximum values.
Figure 3: Hybridization capture using human cells. Hybridization capture of regions located in different chromosomes. Each region serves as a negative control for the other. Additionally, a hybridization capture is performed using the scrambled (Scr) sequence as a true negative control. Error bars represent standard deviation.
The HyCCAPP method described here has several unique features that make it a powerful approach to uncover DNA-protein interactions that would otherwise remain elusive. The nature of the process gives HyCCAPP the flexibility to work in different organisms and regions of the genome. The method does, however, have several limitations that should be considered.
HyCCAPP avoids any cell modifications, so it can potentially be applied to primary cells, cell culture systems, or even tissue samples. Because of that, however, it requires samples to be crosslinked, which reduces sensitivity in the proteomic analysis and therefore requires large input amounts. There are emerging approaches that rely on CRISPR and biotinylation of surrounding proteins and that can function without crosslinking reactions12,13,14,18. These approaches can be very powerful if plasmid insertions do not represent a constraint, but can only be used in cell culture systems. Other approaches are very well suited to identify general protein associations based on a chromatin state or on the presence of particular transcription factors, but are not meant to study individual loci9,10,19.
HyCCAPP offers an alternative with broad applicability, but it is likely that applications in different cell systems or organisms will require optimization. The most critical optimization step involves the design and selection of capture oligonucleotides. A series of small-scale tests should show whether the designed oligonucleotides are appropriate and whether the sequence in the region is effectively captured. Several factors should be considered when designing and testing capture oligonucleotides, such as oligonucleotide length, degree of specificity to the target region, and interactions among the set of oligonucleotides to be used. All of these factors can be readily tested by using a qPCR assay specific to the target region. If desired, a ChIP-Seq-like workflow can be used to sequence the resulting DNA fragments and ensure that the specificity of the enrichment is satisfactory across the entire genome5.
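Before the wet-lab tests, some of the design factors listed above (length, base composition, and interactions within the oligonucleotide set) can be pre-screened computationally. The sketch below is a crude illustration under stated assumptions: it uses the longest contiguous antiparallel complementary stretch between two oligonucleotides as a rough proxy for potential interaction, with an arbitrary cutoff of 6 bases. It is not a thermodynamic model and does not replace dedicated tools or the qPCR tests. The sequences and names are invented for the example.

```python
from itertools import combinations_with_replacement

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_complementary_run(a, b):
    """Longest contiguous stretch of `a` that can pair antiparallel with `b`.

    Computed as the longest common substring of `a` and the reverse
    complement of `b`, using dynamic programming.
    """
    a = a.upper()
    rc_b = reverse_complement(b)
    best = 0
    prev = [0] * (len(rc_b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(rc_b) + 1)
        for j in range(1, len(rc_b) + 1):
            if a[i - 1] == rc_b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def flag_pairs(oligos, min_run=6):
    """Return (name_a, name_b, run) for pairs, including self-pairs,
    whose complementary run is at least min_run (an assumed cutoff)."""
    flagged = []
    for name_a, name_b in combinations_with_replacement(oligos, 2):
        run = longest_complementary_run(oligos[name_a], oligos[name_b])
        if run >= min_run:
            flagged.append((name_a, name_b, run))
    return flagged


if __name__ == "__main__":
    # Invented sequences, for illustration only
    oligos = {"o1": "AAAAGGGGGG", "o2": "CCCCCCTTTT"}
    for name, seq in oligos.items():
        print(name, len(seq), f"GC={gc_fraction(seq):.2f}")
    print(flag_pairs(oligos))
```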
Finally, the cross-linking conditions are also highly important in the HyCCAPP procedure, and we have observed noticeable differences between different systems, obtaining the best results in human cells by cross-linking with 1% formaldehyde, whereas in yeast, cross-linking with 3% formaldehyde yielded more reproducible results. It is possible that the more intense cross-linking with 3% formaldehyde in human cells with a more complex chromatin structure does not provide sufficient accessibility for hybridization capture oligonucleotides. It is also conceivable that the presence of a cell wall in yeast hinders formaldehyde access into the cells, increasing the amount of formaldehyde needed for successful crosslinking levels in the chromatin.
Performing ChIP and other capture approaches targeting crosslinked human chromatin, we have observed that only about 1% of the target DNA can actually be captured. For most DNA-based approaches and analyses, this is not a limiting factor. However, unlike ChIP, where the final readout is based on amplifiable DNA, HyCCAPP relies on protein content for the final readout. Due to this intrinsic limitation, large amounts of input material (cultured cells in the experiments presented here) are required. This constraint should be carefully considered before exploring this methodology, especially when it is applied to single-copy regions. Not all systems will be able to produce the number of cells needed, or the cost of growing the required cell numbers might be prohibitive. The input amounts that HyCCAPP requires (~10^9 cells) are comparable to those of other methods that rely on crosslinked material, with input amounts ranging between 10^8 and 10^11 cells9,11,15. Future modifications of the HyCCAPP technology will explore approaches to artificially increase copy numbers of the target regions, to make this method more broadly applicable. At the same time, we will continue working to increase the overall efficiency of the process, which, together with continuous advancements in mass spectrometry, should reduce the input amounts needed and make this technology feasible in more systems.
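A back-of-the-envelope calculation illustrates why a protein readout demands so much input. The sketch below assumes two copies of the locus per cell and, as a simplification, one molecule of a given protein bound per captured locus copy. Both are assumptions made for illustration, and the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def captured_attomoles(n_cells, capture_fraction, copies_per_cell=2):
    """Attomoles of target locus recovered after capture.

    capture_fraction: fraction of target copies captured (e.g. 0.01 for 1%).
    copies_per_cell=2 is an assumption (diploid, single-copy locus).
    With one bound protein molecule per locus copy (also an assumption),
    this equals the attomoles of that protein available for analysis.
    """
    molecules = n_cells * copies_per_cell * capture_fraction
    return molecules / AVOGADRO * 1e18


def cells_needed(target_attomoles, capture_fraction, copies_per_cell=2):
    """Cells required to recover target_attomoles of a locus."""
    molecules = target_attomoles * 1e-18 * AVOGADRO
    return molecules / (copies_per_cell * capture_fraction)


if __name__ == "__main__":
    amol = captured_attomoles(n_cells=1e9, capture_fraction=0.01)
    print(f"Recovered from 1e9 cells at 1% capture: {amol:.1f} amol")
```

Under these assumptions, 10^9 cells at 1% capture yield only tens of attomoles of a locus-bound protein, whereas a DNA readout can amplify the same number of captured molecules by PCR.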
Biotinylation-based approaches using CRISPR, like enChIP18 and others13,14, have been shown to be very effective by eliminating the need for crosslinked material, increasing yields, and significantly reducing input requirements. The elaborate processing of cells in these methods, however, does not allow these techniques to be applied to tissue samples, a direction that we have begun to pursue with HyCCAPP, and that is producing promising results.
The authors have nothing to disclose.
This work was supported by NIH Grants P50HG004952 and R01GM109099 to MO.
Name | Company | Catalog Number | Comments |
PrimerQuest tool | IDT | ||
OligoAnalyzer 3.1 | IDT | | Analyze capture oligonucleotides |
PairFold | UBC | | Analyze interactions |
RPMI 1640 media | Thermo Fisher | 11875093 | |
Penicillin-streptomycin | Thermo Fisher | 15140122 | |
L-Glutamine | Thermo Fisher | 25030081 | |
Fetal bovine serum | Thermo Fisher | 26140079 | |
Countess automated cell counter | Thermo Fisher | ||
850 cm2 roller bottle | Greiner Bio-one | 680058 | |
Roller system | Wheaton | 22-288-525 | |
37 % formaldehyde | Sigma-Aldrich | F8775 | |
Glycine | Sigma-Aldrich | ||
Igepal CA-630 | Sigma-Aldrich | I3021 | |
Protease inhibitor cocktail | Sigma-Aldrich | P4380 | |
HEPES | Thermo Fisher | 15630080 | |
Branson digital sonifier SFX 150 | Emerson | ||
Qubit 3.0 fluorometer | Thermo Fisher | ||
Qubit dsDNA BR assay kit | Thermo Fisher | Q32850 | |
Bioanalyzer 2100 | Agilent | ||
Agilent DNA 1000 kit | Agilent | 5067-1504 | |
MES sodium salt | Sigma-Aldrich | M3885 | |
NaCl 5M | Thermo Fisher | AM9760G | |
EDTA | Sigma-Aldrich | ||
DynaMag-2 magnet | Thermo Fisher | 12321D | |
DynaMag-15 magnet | Thermo Fisher | 12301D | |
Dynabeads M-280 streptavidin | Thermo Fisher | 60210 | |
Low binding tubes | Eppendorf | 22431081 | |
Hybridization oven | SciGene | ||
Tube shaker and rotator | Thermo Fisher | 415110Q | |
DNase I (RNase-free) | New England BioLabs | M0303 | |
SSC buffer 20× concentrate | Sigma-Aldrich | S6639 |