This protocol describes the Capture Hi-C method used to characterize the 3D organization of megabase-sized targeted genomic regions at high resolution, including boundaries of topologically associating domains (TADs) and long-range chromatin interactions between regulatory and other DNA sequence elements.
The spatial organization of the genome contributes to its function and regulation in many contexts, including transcription, replication, recombination, and repair. Understanding the exact causality between genome topology and function is therefore crucial and increasingly the subject of intensive research. Chromosome conformation capture technologies (3C) make it possible to infer the 3D structure of chromatin by measuring the frequency of interactions between any regions of the genome. Here we describe a fast and simple protocol to perform Capture Hi-C, a 3C-based target enrichment method that characterizes the allele-specific 3D organization of megabase-sized genomic targets at high resolution. In Capture Hi-C, target regions are captured by an array of biotinylated probes before downstream high-throughput sequencing. Thus, higher resolution and allele-specificity are achieved while improving the time-effectiveness and affordability of the technology. To demonstrate its strengths, the Capture Hi-C protocol was applied to the mouse X-inactivation center (Xic), the master regulatory locus of X-chromosome inactivation (XCI).
The linear genome holds all the information necessary for an organism to undergo embryonic development and survive throughout adulthood. However, instructing genetically identical cells to perform different functions requires accurate control over which information is used in specific contexts, including different tissues and/or developmental stages. The three-dimensional organization of the genome is thought to participate in this accurate spatio-temporal regulation of gene activity by facilitating or preventing the physical interaction between regulatory elements that can be separated by several hundred kilobases in the linear genome (for reviews1,2,3). In the last 20 years, our understanding of the interplay between genome folding and activity has rapidly increased, largely owing to the development of chromosome conformation capture technologies (3C) (for review4,5,6,7). These methods measure the frequency of interactions between any regions of the genome and rely on the ligation of DNA sequences that are in close 3D proximity within the nucleus. The most common 3C protocols start with the fixation of cell populations with a cross-linking agent such as formaldehyde. The cross-linked chromatin is then digested with a restriction enzyme, although MNase digestion has also been used8,9. After digestion, free DNA ends in close spatial proximity are re-ligated, and cross-linking is reversed. This step gives rise to the 3C 'library' or 'template', a mixed pool of hybrid fragments in which sequences that were in 3D proximity within the nucleus have higher chances of being ligated into the same DNA fragment. The downstream quantification of these hybrid fragments makes it possible to infer the 3D conformation of genomic regions that are located thousands of base pairs apart in the linear genome but might interact in 3D space.
Many different approaches have been developed to characterize the 3C library, differing both in terms of which subsets of ligation fragments are analyzed and which technology is used for their downstream quantification. The original 3C protocol relied on the selection of two regions of interest and the quantification of their 'one versus one' interaction frequency by PCR10,11. The 4C approach (circular chromosome conformation capture) measures the interactions between a single locus of interest (i.e., the 'view-point') and the rest of the genome ('one versus all')12,13,14. In 4C, the 3C library undergoes a second round of digestion and re-ligation to generate small circular DNA molecules that are PCR amplified by view-point specific primers15. 5C (chromosome conformation capture carbon copy) enables the characterization of 3D interactions across larger regions of interest, providing insights into higher-order chromatin folding within that region ('many versus many')16. In 5C, the 3C library is hybridized to a pool of oligonucleotides overlapping restriction sites that can be subsequently amplified by multiplex PCR with universal primers15. In both 4C and 5C, the informative DNA fragments were initially quantified by microarrays and later by next-generation sequencing (NGS)17,18,19. These strategies characterize targeted regions of interest but cannot be applied to map genome-wide interactions. This latter goal is achieved with Hi-C, a 3C-based high-throughput strategy in which massively parallel sequencing of the 3C template allows the unbiased characterization of chromatin folding at the genome-wide level ('all versus all')20. The Hi-C protocol includes the incorporation of a biotinylated residue at the digested fragments' ends, which is followed by pull-down of ligation fragments with streptavidin beads to increase the recovery of ligated fragments20.
Hi-C revealed that mammalian genomes are structurally organized at multiple scales in the 3D nucleus. At the megabase scale, the genome is divided into regions of active and inactive chromatin, the A and B compartments, respectively20,21. The existence of further subcompartments represented by different chromatin and activity states was also subsequently shown22. At higher resolution, the genome is further partitioned into sub-megabase self-interacting domains called topologically associating domains (TADs), first revealed by Hi-C and 5C analysis of the human and mouse genomes23,24. Unlike compartments which vary in a tissue-specific manner, TADs tend to be constant (although there are many exceptions). Importantly, TAD boundaries are conserved across species25. In mammalian cells, TADs frequently encompass genes sharing the same regulatory landscape and have been shown to represent a structural framework that facilitates gene co-regulation while limiting the interactions with neighboring regulatory domains (for review3,26,27,28). Furthermore, within TADs, interactions due to CTCF sites at the base of cohesin-extruded loops may increase the probability of promoter-enhancer or enhancer-enhancer interactions (for review29).
In Hi-C, compartments and TADs can be detected at 1 Mb to 40 kb resolution, but higher resolution can be achieved to characterize smaller-scale contacts such as looping interactions between distal elements at the scale of 5-10 kb. However, increasing the resolution to detect such loops efficiently by Hi-C requires a significant increase in sequencing depth and, therefore, sequencing costs. This is exacerbated if the analysis needs to be allele-specific. Indeed, an X-fold increase in resolution requires an X²-fold increase in sequencing depth, meaning that high-resolution and allele-specific genome-wide approaches can be prohibitively expensive30.
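The quadratic relationship can be made concrete by counting the bin pairs that a contact map has to fill. The following is a minimal sketch in Python, assuming that the same average read count per bin pair is wanted at every resolution; the function names are illustrative and are not part of any published pipeline.

```python
def n_bins(region_bp: int, bin_bp: int) -> int:
    """Number of bins needed to tile a region (ceiling division)."""
    return -(-region_bp // bin_bp)


def n_bin_pairs(region_bp: int, bin_bp: int) -> int:
    """Number of distinct bin pairs (upper triangle including the diagonal)."""
    n = n_bins(region_bp, bin_bp)
    return n * (n + 1) // 2


def depth_factor(old_bin_bp: int, new_bin_bp: int) -> float:
    """Approximate fold increase in reads needed when shrinking the bin size.

    Assumption: coverage per bin pair is held constant, so the required
    depth grows with the square of the fold change in resolution.
    """
    fold = old_bin_bp / new_bin_bp
    return fold ** 2


if __name__ == "__main__":
    region = 3_000_000  # a 3 Mb target, as an example
    for bin_bp in (40_000, 20_000, 10_000, 5_000):
        print(bin_bp, n_bins(region, bin_bp), n_bin_pairs(region, bin_bp))
    print(depth_factor(20_000, 5_000))
```

Under these assumptions, going from 20 kb to 5 kb bins is a fourfold gain in resolution and calls for roughly 16 times as many reads over the same region, which is why restricting sequencing to a captured target is attractive.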
To improve cost-effectiveness and affordability while maintaining high resolution, target regions of interest can be physically pulled down from genome-wide 3C or Hi-C libraries following their hybridization with complementary biotin-labeled oligonucleotide probes before downstream sequencing. These target enrichment strategies are referred to as Capture-C methods and allow the interrogation of interactions of hundreds of target loci scattered across the genome (i.e., Promoter Capture (PC) Hi-C; Next Generation (NG) Capture-C; Low Input (LI) Capture-C; Nuclear Titrated (NuTi) Capture-C; Tri-C)31,32,33,34,35,36,37,38,39,40, or across regions spanning up to several megabases (i.e., Capture Hi-C; HYbrid Capture Hi-C (Hi-C2); Tiled-C)41,42,43. Two aspects can vary in capture-based methods: (1) the nature and design of biotinylated oligonucleotides (i.e., RNA or DNA, single oligos capturing dispersed genomic targets or multiple oligos tiling a region of interest); and (2) the template that is used for pulling down targets, which can be the 3C or Hi-C library, the latter consisting of biotinylated restriction fragments pulled down from the 3C library.
Here, a Capture Hi-C protocol based on the enrichment of target contacts from the 3C library is described. The protocol relies on the design of a custom-tailored tiling array of biotinylated RNA probes and can be performed in 1 week from the 3C library preparation to the NGS sequencing. The protocol is fast, simple, and allows characterizing the higher order 3D organization of megabase-sized regions of interest at 5 kb resolution while improving time effectiveness and affordability in comparison with other 3C methods. The Capture Hi-C protocol was applied to the master regulatory locus of X-chromosome inactivation (XCI), the X-inactivation center (Xic), which hosts the Xist noncoding RNA. The Xic has previously been the subject of extensive structural and functional analyses (for review44,45). In mammals, XCI compensates for the dosage of X-linked genes between females (XX) and males (XY) and involves the transcriptional silencing of almost the entirety of one of the two X chromosomes in female cells. The Xic has represented a powerful, gold standard locus for studies in 3D genome topology and the interplay with gene regulation44. 5C analysis of the Xic in mouse embryonic stem cells (mESCs) led to the discovery and naming of TADs, providing the first insights into the functional relevance of topological partitioning and gene co-regulation24. The topological organization of the Xic was subsequently shown to be critically involved in the appropriate developmental timing of Xist upregulation and XCI46, and unsuspected cis-regulatory elements that can influence gene activity within and between TADs were also recently discovered within the Xic47,48,49. Applying Capture Hi-C to 3 Mb of the mouse X chromosome spanning the Xic demonstrates the power of this approach in dissecting large-scale chromatin folding at high resolution. 
A detailed and easy-to-follow protocol is provided, starting from the design of the array of biotinylated probes across every DpnII restriction site within the region of interest to the generation of the genome-wide 3C library, the hybridization and capture of target contacts, and downstream data analysis. An overview of the appropriate quality controls and expected results is also included, and both the strengths and limitations of the approach are discussed in light of similar existing methods.
The mouse embryonic stem cells (mESCs) used in this study were derived from a cross of a TX/TX R26rtTA/rtTA female50 with a Mus musculus castaneus male according to the animal care guidelines of Institut Curie (Paris)51.
1. Probe design
2. Experimental procedure
3. Data analysis
The described Capture Hi-C protocol is based on the preparation of the genome-wide 3C template using a four-base cutter (DpnII). The subsequent enrichment of ligation fragments across the genomic region of interest is obtained by hybridization of an array of tiling RNA probes and their streptavidin-based capture according to the target enrichment system used in this study (Figure 1). Biotinylated RNA probes were selected as they show tighter binding affinity to their targets compared to DNA probes52,60. Captured libraries are then indexed and pooled for multiplexed high-throughput sequencing. Capture Hi-C data can be visualized as high-resolution Hi-C interaction maps but also as 4C-like single view-point contact maps to specifically visualize the interactions of smaller sequences such as promoters or enhancers within the entire captured region. The workflow of the protocol is shown in Figure 4. Pre-sequencing quality controls are shown in Figure 2 and include the assessment of proper digestion and re-ligation of the 3C template and its efficient shearing and purification across the different steps of the protocol. The sheared 3C template DNA is expected to run between 150 and 700 bp, and no enrichment of fragments >2 kb should be detected. During the following steps, several bead-based DNA cleanup and size selection steps are performed, first after the shearing, then after the pre-capture and post-capture PCRs. Cleaned libraries show a distinct fragment enrichment profile as visualized on a high-sensitivity DNA bioanalyzer (Figure 2). The mean fragment size increases over the course of the library preparation due to the ligation of adaptors, sequencing, and indexing primers. Post-sequencing quality controls are obtained via HiC-Pro and shown in Figure 3. Many different bioinformatics software applications have been proposed for 3C-like data processing and analysis.
Among them, the HiC-Pro pipeline is one of the most popular solutions, allowing the processing of raw sequencing data to the final contact maps at various resolutions55. HiC-Pro uses a two-step mapping strategy to align the sequencing reads on the reference genome. The 3C products are then reconstructed and filtered to remove non-informative contact pairs, and the remaining valid pairs are used to generate the contact maps. In addition, HiC-Pro is able to use a list of known polymorphisms to perform allele-specific analysis and to separate the contacts coming from the two parental alleles into distinct contact maps. More recently, HiC-Pro has been included and extended into the nf-core framework (nf-core-hic), providing a highly scalable and reproducible community-driven pipeline61,62.
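The last step of such a pipeline, selecting valid pairs that overlap the captured region and binning them into a contact map, can be illustrated with a short Python sketch. This is not HiC-Pro code and does not read HiC-Pro output files: the input is assumed to be a list of already validated pairs given as hypothetical (chromosome, position, chromosome, position) tuples, and the capture coordinates are those of the 3 Mb target used in this study.

```python
from collections import Counter

# Coordinates of the captured region (3 Mb target spanning the Xic).
CAPTURE_CHROM = "chrX"
CAPTURE_START = 102_475_000
CAPTURE_END = 105_475_000


def in_capture(chrom: str, pos: int) -> bool:
    """True if a mapped position falls inside the captured region."""
    return chrom == CAPTURE_CHROM and CAPTURE_START <= pos < CAPTURE_END


def classify_pair(chrom1: str, pos1: int, chrom2: str, pos2: int) -> str:
    """Label a valid pair by how many of its ends lie in the captured region."""
    ends_in = in_capture(chrom1, pos1) + in_capture(chrom2, pos2)
    if ends_in == 2:
        return "capture-capture"
    if ends_in == 1:
        return "capture-reporter"
    return "off-target"


def bin_pairs(pairs, bin_bp: int = 5_000) -> Counter:
    """Count capture-capture pairs per bin pair (sparse, upper triangle)."""
    counts = Counter()
    for chrom1, pos1, chrom2, pos2 in pairs:
        if classify_pair(chrom1, pos1, chrom2, pos2) != "capture-capture":
            continue
        i = (pos1 - CAPTURE_START) // bin_bp
        j = (pos2 - CAPTURE_START) // bin_bp
        counts[(min(i, j), max(i, j))] += 1
    return counts


if __name__ == "__main__":
    toy_pairs = [
        ("chrX", 102_476_000, "chrX", 102_486_000),  # both ends captured
        ("chrX", 102_476_000, "chr1", 5_000_000),    # one end captured
        ("chrX", 103_000_000, "chrX", 103_001_000),  # same 5 kb bin
        ("chr2", 1, "chr3", 2),                      # off-target
    ]
    print(bin_pairs(toy_pairs))
```

A sparse counter keyed on bin pairs is used here only to keep the example dependency-free; real contact maps are normally stored in dedicated matrix formats and normalized before visualization.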
To capture the mouse Xic, an array of 28,913 RNA probes tiling 3 Mb of the X chromosome was designed. This region includes the key player in XCI, the long noncoding gene Xist, and its known ~800 kb regulatory landscape (Figure 5). This ~800 kb region is partitioned into two TADs: one including the Xist promoter and its known positive regulators (i.e., the noncoding transcripts Ftx, Jpx, and Xert and the protein-coding gene Rnf12), and the neighboring TAD encompassing the negative cis-regulators of Xist (i.e., its antisense transcript Tsix, the enhancer element Xite, and the noncoding transcript Linx) (for review44,45).
By applying the described Capture Hi-C protocol to the Xic, the topological organization of this locus was obtained at unprecedented resolution (Figure 6 and Figure 7). This is particularly clear when comparing the Capture Hi-C profile to previously published 5C47 (Figure 6 and Figure 7; Supplementary Table 1) and Hi-C64 (Figure 6 and Figure 7; Supplementary Table 1) profiles. For instance, sub-TAD structures are more evident: the TAD containing the Xist promoter (Xist-TAD) is clearly subdivided into two smaller domains (Figure 6A, blue arrowhead). Previously, this could only be visually "guessed" from the 5C profile (Figure 6B), even though a boundary was detected in this region by the insulation score algorithm. Likewise, the resolution of the Capture Hi-C profile allows the identification of two smaller domains in the neighboring TAD (Figure 6A, B), which contains the promoter of the Tsix locus (Tsix-TAD); this was not previously achieved with 5C (Figure 6B). Of note, topological boundaries determined by the insulation score from the Capture Hi-C and 5C data are generally detected at slightly different locations and with different relative strengths.
Moreover, other sub-TAD structures such as contact loops are clearly visible from the Capture Hi-C data, such as the loop between Xist and Ftx (Figure 7A), previously identified with Capture-C63, and the loop between Xist and Xert (Figure 7B), recently identified using a similar protocol for Capture Hi-C48. Other contacts can also be mapped more precisely due to the increased resolution of the Capture Hi-C profiles, such as those forming the known contact hotspots within the Tsix-TAD between the Linx, Chic1, and Xite loci (Figure 7A).
In comparison with the Hi-C data shown in Figure 7, Capture Hi-C allowed for a fourfold increase in resolution, yet it required only roughly one-fourth of the sequencing depth (i.e., 126 M reads versus 571 M) (Supplementary Table 1). This increase in resolution allows for the detection of sub-TADs and looping interactions that could not be detected by Hi-C at the sequencing depth shown in Figure 6 and Figure 7. The described protocol for Capture Hi-C thus allows for a much more detailed, high-resolution characterization of a large genomic region of interest when compared to previous approaches.
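The principle behind the insulation score used to call boundaries in Figure 6 and Figure 7 can be sketched in a few lines of Python. In this study the score was computed with cooltools; the code below is not that implementation but a simplified illustration that assumes a small, dense, symmetric contact matrix given as a list of lists, a square window of `window` bins on each side of the tested position, and no normalization or masking of low-coverage bins.

```python
import math


def insulation_scores(mat, window):
    """Log2 insulation score per bin; None where the window does not fit.

    For bin i, contacts between the `window` bins upstream (i - window .. i - 1)
    and the `window` bins starting at i (i .. i + window - 1) are averaged,
    then expressed as log2 of the ratio to the mean over all scored bins.
    """
    n = len(mat)
    raw = [None] * n
    for i in range(window, n - window + 1):
        total = 0.0
        for a in range(i - window, i):
            for b in range(i, i + window):
                total += mat[a][b]
        raw[i] = total / (window * window)
    valid = [v for v in raw if v is not None and v > 0]
    if not valid:
        return [None] * n
    mean = sum(valid) / len(valid)
    return [math.log2(v / mean) if v is not None and v > 0 else None for v in raw]


def local_minima(scores):
    """Indices whose score is strictly lower than both neighbors."""
    hits = []
    for i in range(1, len(scores) - 1):
        prev, cur, nxt = scores[i - 1], scores[i], scores[i + 1]
        if None in (prev, cur, nxt):
            continue
        if cur < prev and cur < nxt:
            hits.append(i)
    return hits


if __name__ == "__main__":
    # Toy map: two 6-bin domains with strong internal and weak mutual contacts.
    toy = [[10 if (a < 6) == (b < 6) else 1 for b in range(12)] for a in range(12)]
    print(local_minima(insulation_scores(toy, window=2)))
```

On the toy matrix, the only local minimum falls at the first bin of the second domain, that is, at the transition between the two domains. Because the position and depth of such minima depend on bin size and window size, boundaries called from maps at different resolutions are expected to differ slightly.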
Figure 1: Probe design. Schematic representation of the strategy used for probe design. Regions of 300 bp upstream and downstream of each DpnII restriction site across the 3 Mb target region were selected and tiled with overlapping biotinylated RNA probes. One of these selected regions is shown, chrX: 102,474,805-102,475,500. No more than 40 bases of repetitive sequences are allowed in each probe.
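The selection of target windows summarized in Figure 1 can be sketched as follows. This is an illustrative Python example rather than the design tool used in this study. It assumes that the target sequence is available as a string, that DpnII sites are found by searching for the recognition sequence GATC, that the 300 bp flanks are measured from the edges of the site, that overlapping windows are merged, and that repeats are soft-masked in lowercase; the function names are hypothetical.

```python
import re

DPNII_SITE = "GATC"  # DpnII recognition sequence


def dpnii_sites(seq: str):
    """0-based start positions of every DpnII site in the sequence."""
    return [m.start() for m in re.finditer(DPNII_SITE, seq.upper())]


def flanking_windows(seq: str, flank: int = 300):
    """Half-open (start, end) windows covering `flank` bp on both sides of
    each site, clipped to the sequence and merged where they overlap."""
    windows = []
    for site in dpnii_sites(seq):
        start = max(0, site - flank)
        end = min(len(seq), site + len(DPNII_SITE) + flank)
        if windows and start <= windows[-1][1]:
            # Overlaps (or touches) the previous window: extend it.
            windows[-1] = (windows[-1][0], max(windows[-1][1], end))
        else:
            windows.append((start, end))
    return windows


def passes_repeat_filter(probe_seq: str, max_repeat_bases: int = 40) -> bool:
    """Accept a probe with at most `max_repeat_bases` soft-masked bases.

    Assumption: repetitive bases are written in lowercase (soft-masking).
    """
    return sum(1 for base in probe_seq if base.islower()) <= max_repeat_bases


if __name__ == "__main__":
    toy = "A" * 500 + "GATC" + "A" * 1000 + "GATC" + "A" * 100 + "GATC" + "A" * 500
    print(dpnii_sites(toy))
    print(flanking_windows(toy))
```

Probes would then be tiled across each returned window and filtered for repeat content; probe length and tiling density are set by the target enrichment system and are deliberately left out of this sketch.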
Figure 2: Capture Hi-C pre-sequencing quality controls. (A) Representative example of 3C template quality controls. 200 ng of DNA were loaded on a 1% agarose gel. Lane 1: 1 kb ladder. Lane 2: Undigested, cross-linked, and intact chromatin runs as a sharp band at >10 kb. Lane 3: DpnII-digested cross-linked chromatin runs as a smear between 1 kb and 3 kb in size. Lane 4: Final 3C library or template; free ends of digested cross-linked DNA fragments are re-ligated. The DNA smear of lower molecular size is almost undetectable, and the ligation product is detected as a band of >10 kb. (B) Representative examples of high-sensitivity bioanalyzer DNA profiles. Top left: successfully sheared 3C library showing a distribution of fragment size between 150 bp and 700 bp. Top right: unsatisfactorily sheared 3C library. Unsheared DNA is detected as broad enrichment of fragments >2 kb. (C) Bottom left: sheared DNA sample following a 1:1 left side-size selection using SPRI beads. Fragments of ~300 bp are enriched. Bottom middle: Pre-capture PCR profile after ligation of paired-end adaptors according to the manufacturer's protocol. Bottom right: final Capture Hi-C library including adaptors, sequencing, and indexing primers for multiplexed sequencing. Abbreviations: bp = base pairs, FU = arbitrary fluorescence unit.
Figure 3: Capture Hi-C post-sequencing quality controls with HiC-Pro. (A) Example of mapping rate on the reference genome for the first mate of the sequencing pairs. The light blue fraction represents the reads aligned by HiC-Pro and spanning a ligation junction. This metric can thus be used to validate the experimental ligation step. (B) Once sequencing mates are aligned on the genome, only uniquely aligned read pairs are kept for analysis. (C) Non-valid pairs (in red) such as dangling-end, self-circle, or re-ligation are discarded from the analysis. The fraction of valid pairs is a good indicator of the ligation and pull-down efficiency. (D) The valid pairs can be further divided into intra/inter-chromosomal and short/long-range contacts. Duplicated read pairs that are likely to represent PCR artifacts are discarded from the analysis. (E) For allele-specific analysis, HiC-Pro reports the number of allelic reads supported by either one or two mates for each parental genome (i.e., C57BL/6J x CAST/EiJ). Equal fractions of reads are expected to be assigned to the maternal and paternal alleles. (F) Finally, only valid pairs overlapping the capture region are selected to build the contact maps. Capture-capture pairs represent contacts within the targeted region, while capture-reporter pairs involve an interaction between the targeted region and an off-target one.
Figure 4: Workflow of Capture Hi-C protocol. Schematic representation of different protocol steps. To generate the genome-wide 3C template, chromatin is first cross-linked with formaldehyde and then digested with the DpnII restriction enzyme. Free DNA ends are then re-ligated, cross-linking is reversed, and DNA is purified. To enrich fragments encompassing the target region, an array of biotinylated RNA probes is hybridized to the 3C template and captured by streptavidin-mediated pull-down. Capture libraries are processed for multiplexed sequencing, and valid ligation fragments are quantified to infer the frequency of chromatin contacts across the target, which are visualized as high-resolution interaction maps.
Figure 5: Overview of the region encompassing the Xic on the mouse X chromosome. Schematic representation of the mouse X chromosome and zoom-in on the 3 Mb captured region (chrX: 102,475,000-105,475,000). The targeted region includes ~800 kb of DNA corresponding to the Xic, the master regulatory locus of XCI. The Xic includes the long noncoding gene Xist, a key player in XCI, and its regulatory landscape. Positive regulators of Xist are shown in green, and negative regulators in purple.
Figure 6: Capture Hi-C, 5C, and Hi-C interaction maps across the 3 Mb captured region. (A) Capture Hi-C interaction map of the 3 Mb target encompassing the mouse Xic at 10 kb resolution (this study). (B) 5C interaction map of the same target region as in A at 6 kb resolution (data reprocessed from47). Repetitive regions not included in the analyses are masked in white. The 5C data require their own bioinformatics processing (see47). After cleaning and alignment, the 5C maps at the primer resolution are binned using a running median (window = 30 kb, step = 5) to reach a final resolution of 6 kb. (C) Hi-C interaction map of the same genomic region as in A and B at 40 kb resolution (data reprocessed from64). All interaction maps were generated from mouse ESCs. The insulation score was calculated using cooltools and is represented as histograms with insulation minima at TAD boundaries. TAD boundaries are shown as vertical lines below the map. The height of each line indicates boundary strength. Genes are shown as arrows pointing in the direction of transcription. Sub-TAD boundaries that are detected exclusively or more precisely in Capture Hi-C maps are indicated by magenta and blue arrowheads for sub-TADs in the Tsix and Xist TADs, respectively.
Figure 7: Capture Hi-C, 5C, and Hi-C interaction maps across 1 Mb within the captured region. (A) Capture Hi-C interaction map of the 1 Mb genomic region encompassing the mouse Xic at 5 kb resolution (this study). (B) 5C interaction map of the same genomic region as in A at 6 kb resolution (data reprocessed from47). Repetitive regions not included in the analyses are masked in white. Of note, the 5C data require their own bioinformatics processing (see47). After cleaning and alignment, the 5C maps at the primer resolution are binned using a running median (window = 30 kb, step = 5) to reach a final resolution of 6 kb. (C) Hi-C interaction map of the same genomic region as in A and B at 20 kb resolution (data reprocessed from64). All interaction maps were generated from mESCs. The insulation score was calculated using cooltools and is represented as histograms with insulation minima at TAD boundaries. TAD boundaries are shown as vertical lines below the map. The height of each line indicates boundary strength. Genes are shown as arrows pointing in the direction of transcription. Contact loops that are detected exclusively or more precisely in Capture Hi-C are indicated by magenta and blue asterisks for loops in the Tsix and Xist TADs, respectively.
Supplementary Table 1: Post-sequencing statistics for the datasets used in this manuscript: Capture Hi-C (this study), Hi-C64, and 5C47.
Here we describe a relatively quick and easy Capture Hi-C protocol to characterize the higher-order organization of megabase-sized genomic regions at 5-10 kb resolution. Capture Hi-C belongs to the family of Capture-C technologies that are designed to enrich targeted chromatin interactions from genome-wide 3C or Hi-C templates. To date, the large majority of Capture-C applications have been exploited to map chromatin contacts of relatively small regulatory elements scattered across the entire genome. In the first Capture-C protocol, multiple overlapping RNA biotinylated probes were used to capture >400 pre-selected promoters in 3C libraries prepared from erythroid cells31. The same strategy was subsequently improved in Next Generation (NG) and Nuclear Titrated (NuTi) Capture-C to achieve high-resolution interaction profiles of >8,000 promoters by using single 120 bp DNA baits spanning single restriction sites and two sequential rounds of Capture to maximize the enrichment of informative ligation fragments32,40. These strategies led to the functional dissection of cis-acting elements in many different contexts, including mouse embryonic development, cell differentiation, X-chromosome inactivation, and gene mis-regulation in pathological conditions46,63,65,66,67,68,69,70,71.
In Promoter Capture Hi-C (PCHi-C), >22,000 annotated promoter-containing restriction fragments were pulled down from Hi-C libraries by hybridization of single RNA 120-mer biotinylated probes at either or both ends of the restriction fragment34,72. This method allowed dissection of the interactome of thousands of promoters in a rapidly increasing number of cell types, including mouse embryonic stem cells, fetal liver cells, and adipocytes34,35,72,73, but also human lymphoblastoid lines, hematopoietic progenitors, epidermal keratinocytes, and pluripotent cells37,74,75,76,77.
In comparison with these target enrichment technologies, Capture Hi-C targets contiguous genomic regions up to the megabase scale, thereby spanning one or more TADs and encompassing regulatory landscapes of genes. The entire region of interest must be tiled with an array of biotinylated probes encompassing each DpnII restriction site within the target. The hybridization of the biotinylated array to the 3C template, its subsequent streptavidin-based capture, and processing for multiplexed sequencing is performed using a target enrichment system for Illumina Paired-End multiplexed sequencing. The entire protocol is fast, as it can be performed in 1 week from 3C library preparation through to NGS sequencing, and it requires only minor adaptations and/or custom-specific troubleshooting.
The protocol also provides advantages in comparison with other 3C-based methods. To obtain interaction maps at a resolution of 5-10 kb, we sequenced 100-120 M paired-end reads. As a comparison, we used here a Hi-C dataset of 571 M reads to reach a 20 kb resolution64 (GSM2053973), and at least 1 billion reads would be required to reach a 5 kb resolution with chromosome-wide Hi-C22.
Capture Hi-C as used in the present study reaches a much higher resolution than the previously published 5C based on a 6-bp cutter restriction enzyme47 (Supplementary Table 1). Importantly, the strategy designed to enrich and amplify targeted interactions in 5C does not allow for allele-specific analysis of chromatin interactions. On the contrary, Capture Hi-C data can be mapped allele-specifically, allowing the dissection of the 3D structural landscapes of pairs of homologous chromosomes, for example in human cells or in F1 hybrid cell lines derived by crossing genetically different mouse strains78. To generate allele-specific Capture Hi-C interaction maps at 5 kb resolution, we sequenced 150 bp paired-end reads to increase SNP coverage. Similar allele-specific approaches can be applied to human cell lines, for which the annotation of SNPs is available22.
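The logic of allele-specific assignment, and the reason longer reads help, can be illustrated with a simplified Python sketch. It is not the HiC-Pro implementation: reads are assumed to be aligned without gaps, SNPs are given as a hypothetical dictionary mapping a 0-based position to the bases carried by the two parental genomes, and the rule for combining the two mates (one assigned mate is sufficient, conflicting mates are set aside) is an assumption made for illustration.

```python
def assign_read(start: int, seq: str, snps: dict) -> str:
    """Assign one ungapped read to a parental genome from the SNPs it covers.

    snps maps a 0-based position to (genome1_base, genome2_base).
    """
    hits1 = hits2 = 0
    for offset, base in enumerate(seq.upper()):
        alleles = snps.get(start + offset)
        if alleles is None:
            continue  # position is not polymorphic between the two genomes
        if base == alleles[0]:
            hits1 += 1
        elif base == alleles[1]:
            hits2 += 1
    if hits1 and hits2:
        return "conflict"
    if hits1:
        return "genome1"
    if hits2:
        return "genome2"
    return "unassigned"


def assign_pair(tag1: str, tag2: str) -> str:
    """Combine the assignments of both mates of a read pair."""
    informative = {tag1, tag2} - {"unassigned"}
    if not informative:
        return "unassigned"
    if "conflict" in informative or len(informative) > 1:
        return "conflict"
    return informative.pop()


if __name__ == "__main__":
    toy_snps = {105: ("A", "G"), 230: ("C", "T")}
    mate1 = assign_read(100, "TTTTTATTTT", toy_snps)  # covers position 105
    mate2 = assign_read(300, "ACGT", toy_snps)        # covers no SNP
    print(mate1, mate2, assign_pair(mate1, mate2))
```

A read can only be assigned if it overlaps at least one informative SNP, so a longer read is more likely to be usable for allele-specific maps at a given SNP density.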
Importantly, although Capture Hi-C generally ensures high resolution while improving the affordability of sequencing costs, the production of custom-tailored biotinylated oligonucleotides does have an impact on the overall cost of this method. Therefore, the choice of the most suitable 3C method will differ for different applications, and will depend on the biological question that is being addressed and the resolution required, as well as the size of the region of interest. Other Capture Hi-C protocols developed share key features with the protocol described here. For example, a Capture Hi-C strategy was applied to characterize ~50 kb to 1 Mb genomic regions spanning noncoding variants associated with breast and colorectal cancer risk; in this protocol, target regions were pulled down from Hi-C libraries by hybridizing 120-mer RNA baits tiling the target regions at a 3x coverage33,38,79. Similarly, HYbrid Capture Hi-C (Hi-C2) was used to target interactions within regions of interest up to 2 Mb80. In both protocols, the use of a Hi-C template enriched for biotin pulled-down ligation fragments increased the percentage of total informative reads compared to our protocol. For example, in the Hi-C dataset we used here for comparison64 (GSM2053973), the percentage of valid pairs after the removal of duplicates is 4.8-fold higher than the valid pairs obtained in Capture Hi-C as described in Figure 3 and Supplementary Table 1. However, the consecutive pull-down of biotinylated ligated fragments and hybridized probes makes the protocol significantly more complex and time consuming while possibly decreasing the complexity of the captured region.
Another available method to enrich 3C templates with tiling probes is Tiled-C, which was applied to study chromatin architecture at high spatial and temporal resolution during mouse erythroid differentiation43. In Tiled-C, a panel of 70 bp biotinylated probes is used to enrich contacts within large-scale regions in two consecutive rounds of capture to generate very high-resolution maps of targeted interactions43,81. The double capture enrichment also makes the protocol longer and more complex when compared to Capture Hi-C. However, unlike in the Capture-C strategies targeting single restriction sites, in Tiled-C the second round of capture does not seem to significantly increase the capture efficiency, and therefore can probably be omitted43. Finally, a similar tiling approach based on the same target enrichment strategy used in this study was applied to the dissection of regulatory landscapes encompassing structural variants described in patients with congenital malformations and re-engineered in transgenic mice41,42. In this case, the tiling array of probes was designed across the entire target rather than in the proximity of DpnII restriction sites41. Nevertheless, this work was seminal in highlighting the sensitivity and power of this strategy to achieve high-resolution characterization of large genomic regions in different contexts41,42,48.
In conclusion, the protocol described here represents an easy, robust, and powerful strategy for the high-resolution 3D characterization of any genomic region of interest. The application of this approach to different model systems, cell types, developmentally regulated chromatin landscapes, and gene regulation in healthy and pathological conditions is likely to facilitate our understanding of the interplay and causality between genome topology and gene regulation, one of the fundamental open questions in the epigenetics field. Furthermore, applying Capture Hi-C to map long-range interactions and higher-order chromatin folding of risk variants identified by genome-wide association studies (GWAS) has the potential to reveal the functional relevance of noncoding genomic loci associated with human diseases in different contexts, thereby providing novel insights into the processes potentially underlying pathogenesis.
The authors have nothing to disclose.
Work in the Heard laboratory was supported by a European Research Council Advanced Investigator award (XPRESS – AdG671027). A.L. is supported by a European Union Marie Skłodowska-Curie Actions Individual Fellowship (IF-838408). A.H. is supported by the ITN Innovative and Interdisciplinary Network ChromDesign, under the Marie Skłodowska-Curie Grant agreement 813327. The authors are thankful to Daniel Ibrahim (MPI for Molecular Genetics, Berlin) for helpful technical advice, to the NGS platform at Institut Curie (Paris), and to Vladimir Benes and the Genomics Core Facility at EMBL (Heidelberg), for support and assistance.
Name | Company | Catalog Number | Comments |
10x PBS pH 7.4 | Gibco | 10010-023 | |
37% (vol/vol) paraformaldehyde solution | Electron Microscopy Sciences | 15686 | single use glass-vials; do not reuse |
50 mL PP conical tube | Falcon | 352070 | |
Agarose | Sigma | A9539-500g | |
Bioanalyzer | Agilent | G2939BA | |
Cell Scrapers – 25 cm Handle and 3.0 cm Blade | Falcon | 353089 | |
CHIR99021 | Axon Medchem BV | Axon 1386 | |
cOmplete Mini, Protease inhibitor cocktail (EDTA-free) | Merck | 11836170001 | |
Countess Cell Counting Chamber Slides | Invitrogen | C10228 | |
Countess II FL | Invitrogen | ZGEXSCCOUNTESS2FL | Automated cell counter |
Covaris S2 | Covaris | 500217 | Sonicator |
DNA LoBind tube, 1.5 mL | Eppendorf | 30108051 | |
DpnII (50000 units/mL) | New England Biolabs | R0543M | |
Dulbecco's Modified Eagle Medium (DMEM) | Merck | D6429 | |
Ethanol (100%) | Merck | 1.00983.2500 | |
Fetal Bovine Serum (FBS) | Thermo Scientific | 10270106 | |
gelatine from porcine skin | Sigma | G1890 | |
GeneRuler 1 kb Plus DNA Ladder | Thermo Scientific | SM0313 | |
GlycoBlue | Thermo Scientific | AM9516 | Coprecipitant |
High-Sensitivity Bioanalyzer chips | Agilent | 5067-4626 | |
Large Cooling Centrifuge 5920 R | Eppendorf | 5948000018 | |
leukaemia inhibitory factor (LIF) | Merck | ESG1107 | |
Liquiport | KNF | NF300 | Benchtop aspiration system |
Low-binding filter tips | Biozym | VT0260U, VT0240, VT0220, VT0200U | |
Molecular biology grade water | Merck | W3500-6x500ML | |
Next Seq 500 | Illumina | SY-415-1001 | |
Next Seq 500 High Output v2 Kit (300 cycles) | Illumina | FC-404-2004 | |
Nonidet P40 Substitute (NP40) | Merck | 11332473001 | |
PD0325901 | Axon Medchem BV | Axon 1408 | |
Protease inhibitor cocktail (EDTA-free) | Merck | 11873580001 | |
Proteinase K – recombinant, PCR-grade (20 mg/mL) | Thermo Scientific | EO0491 | |
Qubit 2.0 | Thermo Scientific | Q32871 | |
Qubit assay tubes | Thermo Scientific | Q32856 | |
Qubit dsDNA High Sensitivity kit | Thermo Scientific | Q32851 | |
RNase A (10 mg/mL) | Thermo Scientific | EN0531 | |
Sodium acetate pH 5.2 (3M) | Merck | S7899 | |
speed vacuum concentrator | Eppendorf | EP5305000100-1EA | |
Agencourt AMPureXP | Beckman Coulter | A63881 | SPRI beads |
SureSelect Target Enrichment Box 1 | Agilent | 5190-8645 | |
SureSelect Target Enrichment Kit ILM Indexing Hyb Module Box 2 | Agilent | 5190-4455 | |
SureSelect XT Library Prep Kit ILM | Agilent | 5500-0132 | |
T4 ligase (30 units/µL) | Thermo Scientific | EL0013 | |
table-top Centrifuge 5427 R | Eppendorf | 5409000012 | |
Triton-X-100 (500 mL) | Merck | X100-500ML | |
Trypan Blue | Invitrogen | T10282 | |
Trypsin | Thermo Scientific | 25300054 | |
UltraPure Glycine | Thermo Scientific | 15527013 | |
β-mercaptoethanol | Thermo Scientific | 31350010 | |