We report the application of quantitative chromosome conformation capture followed by high-throughput sequencing in embryoid bodies generated from embryonic stem cells. This technique allows the identification and quantification of contacts between putative enhancers and promoter regions of a given gene during embryonic stem cell differentiation.
During mammalian development, cell fates are determined through the establishment of regulatory networks that define the specificity, timing, and spatial patterns of gene expression. Embryoid bodies (EBs) derived from pluripotent stem cells have been a popular model to study the differentiation of the three main germ layers and to define regulatory circuits during cell fate specification. Although it is well known that tissue-specific enhancers play an important role in these networks by interacting with promoters, assigning them to their relevant target genes remains challenging. To make this possible, quantitative approaches are needed to study enhancer-promoter contacts and their dynamics during development. Here, we adapted a 4C method to define enhancers and their contacts with cognate promoters in the EB differentiation model. The method uses frequently cutting restriction enzymes, sonication, and a nested-ligation-mediated PCR protocol compatible with commercial DNA library preparation kits. Subsequently, the 4C libraries are subjected to high-throughput sequencing and analyzed bioinformatically, allowing detection and quantification of all sequences that have contacts with a chosen promoter. The resulting sequencing data can also be used to gain information about the dynamics of enhancer-promoter contacts during differentiation. The technique described for the EB differentiation model is easy to implement.
In mice, the inner cell mass (ICM) of 3.5-day-old embryos contains embryonic pluripotent stem cells. The ICM further develops into the epiblast at day 4.5, generating ectoderm, mesoderm, and endoderm cells, the main three germ layers in the embryo. Although pluripotent cells in the ICM exist only transiently in vivo, they can be captured in culture by the establishment of mouse embryonic stem cells (mESCs)1,2,3. The mESCs remain in an undifferentiated state and proliferate indefinitely, yet upon intrinsic and extrinsic stimuli they are also capable of exiting the pluripotency state and generating cells of the three developmental germ layers2,4. Interestingly, when cultured in suspension in small droplets, mESCs form three-dimensional aggregates (i.e., EBs) that differentiate into all three germ layers5. The EB formation assay is an important tool to study the early lineage specification process.
During lineage specification, cells of each germ layer acquire a specific gene expression program4. The precise spatiotemporal expression of genes is regulated by diverse cis-regulatory elements, including core promoters, enhancers, silencers, and insulators6,7,8,9. Enhancers, regulatory DNA segments typically spanning a few hundred base pairs, coordinate tissue-specific gene expression8. Enhancers are activated or silenced by binding of transcription factors and cofactors that regulate local chromatin structure8,10. Commonly used techniques to identify putative enhancers are genome-wide chromatin immunoprecipitation followed by sequencing (ChIP-seq) and the assay for transposase-accessible chromatin using sequencing (ATAC-seq) techniques. Thus, active enhancers are characterized by specific active histone marks and by increased local DNA accessibility11,12,13,14. In addition, developmental enhancers are believed to require physical interaction with their cognate promoter8,9. Indeed, it has been shown that enhancer variants and deletions that disrupt enhancer-promoter contacts can lead to developmental malformations15. Therefore, there is a need for novel techniques that provide additional information for the identification of functional enhancers that control developmental gene expression.
Since the development of the chromosome conformation capture (3C) technique16, the mapping of chromosomal contacts has been intensively used to assess physical distance between regulatory elements. Importantly, high-throughput variants of 3C techniques have recently been developed, providing different strategies for fixation, digestion, ligation, and recovery of contacts between chromatin fragments17. Among them, in situ Hi-C has become a popular technique allowing the sequencing of 3C ligation products genome-wide18. However, the high sequencing costs required to reach a resolution suitable for the analysis of enhancer-promoter contacts make this technique impractical for the study of specific loci. Therefore, alternative methods were developed to analyze targeted loci at higher resolution19,20,21,22. One of these methods, namely 4C, known as a one-versus-all strategy, allows detection of all sequences that contact a site selected as viewpoint. However, a disadvantage of the standard 4C technique is the inverse PCR required, which amplifies differently sized fragments, favoring small products and biasing quantification after high-throughput sequencing. Recently, UMI-4C, a new variant of the 4C technique using unique molecular identifiers (UMI), has been developed for quantitative and targeted chromosomal contact profiling that circumvents this problem23. This approach uses frequent cutters, sonication, and a nested-ligation-mediated PCR protocol, thereby involving amplification of DNA fragments with a relatively uniform length distribution. This homogeneity reduces the amplification bias caused by the PCR preference for shorter sequences and allows efficient recovery and accurate counting of spatially connected molecules/fragments.
Here we describe a protocol that adapts the UMI-4C technique to identify and quantify chromatin contacts between promoters and enhancers of lineage instructive transcription factors during EB differentiation.
1. Embryoid body generation from mouse embryonic stem cells
2. Dissociation of EBs
3. Fixation
4. Cell lysis and restriction enzyme digest
5. Proximity ligation and crosslink reversal
6. DNA shearing and size selection
7. Library preparation for sequencing
8. 4C chromatin interaction library amplification and purification
Six days after the induction of ESC differentiation in the hanging drops, we obtained a homogenous population of EBs that were used for further analyses (Figure 1). We adapted the UMI-4C method23 to quantify specific chromatin interaction at promoters of lineage specific genes in EBs24. A schematic overview of the protocol with representative quality control gels at different steps is shown in Figure 2A. The first quality control was carried out to determine the efficiency of the MboI restriction enzyme digestion. Efficient digestion showed a fragment size of less than 3 kbp (Figure 2B). Of note, mESC and EB chromatin digestion was difficult and sometimes residual undigested chromatin persisted. The second quality control was carried out after ligation to verify that most of the fragments were now > 3 kbp (Figure 2B). Then, chromatin fragments obtained after sonication were analyzed by gel electrophoresis. Fragment sizes of 400-500 bp were expected (Figure 2B).
After dephosphorylation and single-end adapter ligation, two rounds of PCR were performed to amplify the targets of interest. A nested approach was used to design a set of two primers for each locus, which helped improve specificity. Each target was amplified separately with two different primer pairs to optimize PCR conditions (i.e., primer pairs A and B for the Pou5f1 locus and primer pairs C and D for the T locus, respectively) and resulted in a DNA smear around 400 bp (Figure 2C). Alternatively, multiplex PCR was performed to amplify targets A and C simultaneously and resulted in a similar fragment size after purification (Figure 2D). Primers used for 4C library preparation (loci of Pou5f1 and T) can be found in Table 6.
For data analysis, raw sequencing reads were first aligned against the reference mm10 mouse genome, and all duplicated and low-quality (< 20) reads were removed. For each bait, the number of reads mapping to each restriction fragment was computed to obtain a raw contact profile. Next, the region of interest was defined as all restriction fragments located between 2 kbp and 250 kbp from the bait. To smooth the profiles, the size of each restriction fragment was increased by sequentially aggregating the adjacent restriction fragments until a threshold of 5% of the total number of raw contacts in the region of interest was reached. To integrate the replicates and compare conditions, we included both slopes and random intercepts at the restriction fragment level. The average profile per condition and the fold change between them were plotted as shown in Figure 3. During EB differentiation, the contacts between the enhancers and the promoter of the pluripotency gene Pou5f1 decreased, while enhancer-promoter contacts of the mesendoderm lineage instructive transcription factor T increased (Figure 3), providing functional insights about these developmental enhancers.
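The aggregation step described above can be illustrated with a minimal Python sketch. It is not the analysis code used for Figure 3. It assumes that the raw profile is a list of per-fragment contact counts ordered along the chromosome, and that the window around each fragment grows by one neighbor on each side per step. The function name `adaptive_windows` and the parameter `min_fraction` are hypothetical.

```python
from typing import List, Tuple


def adaptive_windows(counts: List[int], min_fraction: float = 0.05) -> List[Tuple[int, int, int]]:
    """Grow a window around each restriction fragment until it holds enough contacts.

    counts: raw contacts per restriction fragment in the region of interest,
            ordered by genomic position (assumed input layout).
    Returns one (first_index, last_index, contacts_in_window) tuple per fragment.
    """
    total = sum(counts)
    threshold = min_fraction * total  # e.g. 5% of all raw contacts in the region
    n = len(counts)
    windows = []
    for i in range(n):
        lo, hi = i, i
        window_sum = counts[i]
        # Add one neighbor on each side per step until the threshold is met
        # or the window already spans the whole region.
        while window_sum < threshold and (lo > 0 or hi < n - 1):
            if lo > 0:
                lo -= 1
                window_sum += counts[lo]
            if hi < n - 1:
                hi += 1
                window_sum += counts[hi]
        windows.append((lo, hi, window_sum))
    return windows


if __name__ == "__main__":
    profile = [0, 0, 10, 0, 0, 0, 90, 0, 0, 0]
    for index, window in enumerate(adaptive_windows(profile)):
        print(index, window)
```

Fragments that already carry many contacts keep their native size, whereas sparsely covered fragments are merged with their neighbors, so the effective resolution varies along the locus.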
Figure 1: Representative images of mESC and derived embryoid bodies. Day 0 mESC cultured in serum-free conditions (left) and homogenous day 6 EBs (right) observed by an inverted microscope. Scale bar = 500 μm.
Figure 2: 4C workflow and representative images of the main steps of the protocol. (A) Schematic workflow of the quantitative 4C. RS = restriction site; US = upstream; DS = downstream; UP = universal primer; D = the distance between RS and DS should ideally be 5-15 bp. (B) Examples of MboI-digested chromatin (I), in-nuclei ligated chromatin (II), and sonicated chromatin (III). The numbers on the left indicate the DNA sizes determined by the DNA ladder run for each sample. (C) Examples of PCR amplification at the two loci: Pou5f1 (primers A and B) and T (primers C and D). (D) Examples of multiplex PCR amplification at Pou5f1 and T loci using primers A and C. ES = embryonic stem cells; EB = embryoid bodies.
Figure 3: Examples of 4C profiles. Quantitative 4C profiles for baits located on the Pou5f1 and T gene promoters assayed in mESCs and day 6 EBs. The top panel shows plots of average contacts generated from two independent biological replicates; the bottom panel shows the average contact fold change of day 6 EBs versus mESCs (average of the two replicates). Light blue boxes indicate the location of enhancers with dynamic changes during differentiation. Figure adapted from Tian et al.24.
For 5 mL | |
1 M Tris-HCl, pH 8.0 | 50 µL |
5 M NaCl | 10 µL |
10% Igepal CA630 | 100 µL |
50x Roche complete protease inhibitors | 100 µL |
MilliQ Water | 4.74 mL |
Table 1: Lysis buffer.
For 1000 µL | |
MilliQ Water | 869 µL |
10X NEB T4 DNA Ligase Buffer | 120 µL |
20 mg/mL Bovine Serum Albumin | 6 µL |
2000 U/µL T4 DNA Ligase | 5 µL |
Table 2: Ligation master mix preparation.
For 15 µL | |
5X Quick Ligation Reaction Buffer | 10 µL |
NEBNext Adaptor | 3 µL |
Quick T4 DNA ligase | 2 µL |
Table 3: Adapter ligation reaction.
PCR setup | |
Adaptor-ligated library-on-beads | 10 µL |
PCR grade water | 20.25 µL |
10 µM Target specific primer | 3.75 µL |
10 µM NEB Index primer | 3.75 µL |
Herculase II 5X buffer | 10 µL |
10 mM dNTPs | 1.25 µL |
Herculase II polymerase | 1 µL |
Total volume | 50 µL |
PCR program | |
Step 1: 98 °C – 2 min | |
Step 2: 98 °C – 20 s | |
Step 3: 65 °C – 30 s | |
Step 4: 72 °C – 45 s | |
Step 5: go to step 2 to make a total of 15-18 cycles | |
Step 6: 72 °C – 3 min | |
Step 7: 4 °C – hold |
Table 4: 4C chromatin interaction library amplification, first PCR.
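When several libraries are amplified in parallel, the per-reaction volumes of Table 4 can be scaled into a master mix. The short Python sketch below shows only the arithmetic. It assumes that every reaction shares the same primers (otherwise the primers should be left out and added per tube) and it adds a 10% pipetting overage, which is an arbitrary choice and not part of the protocol. The names `PCR1_PER_REACTION_UL`, `TEMPLATE_UL`, and `master_mix` are hypothetical.

```python
# Per-reaction volumes (µL) taken from Table 4; the template is added per tube.
PCR1_PER_REACTION_UL = {
    "PCR grade water": 20.25,
    "10 µM target-specific primer": 3.75,
    "10 µM NEB index primer": 3.75,
    "Herculase II 5X buffer": 10.0,
    "10 mM dNTPs": 1.25,
    "Herculase II polymerase": 1.0,
}
TEMPLATE_UL = 10.0  # adaptor-ligated library-on-beads, added separately


def master_mix(n_reactions: int, overage: float = 0.10) -> dict:
    """Scale each component to n_reactions plus a fractional overage (assumed 10%)."""
    factor = n_reactions * (1 + overage)
    return {name: round(volume * factor, 2) for name, volume in PCR1_PER_REACTION_UL.items()}


if __name__ == "__main__":
    # Sanity check: components plus template add up to the 50 µL reaction volume.
    print("Per-reaction total (µL):", sum(PCR1_PER_REACTION_UL.values()) + TEMPLATE_UL)
    for name, volume in master_mix(4).items():
        print(f"{name}: {volume} µL")
```

Each tube then receives 40 µL of master mix and 10 µL of template.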
Nested PCR setup | |
DNA fragment from the first PCR | 10 µL |
PCR grade water | 20.25 µL |
10 µM specific primer+P5 Illumina primer | 3.75 µL |
10 µM P7 Illumina primer | 3.75 µL |
Herculase II 5X buffer | 10 µL |
10 mM dNTPs | 1.25 µL |
Herculase II polymerase | 1 µL |
Total volume | 50 µL |
Nested PCR program | |
Step 1: 98 °C – 2 min | |
Step 2: 98 °C – 20 s | |
Step 3: 65 °C – 30 s | |
Step 4: 72 °C – 45 s | |
Step 5: go to step 2 to make a total of 15-18 cycles | |
Step 6: 72 °C – 3 min | |
Step 7: 4 °C – hold |
Table 5: 4C chromatin interaction library amplification, nested PCR.
Name | Sequence (5'-3') |
DS-Oct4-A | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTTCTTGCAAAGATAACTAAGCACCAGGCCAG |
US-Oct4-A | TCTCTTGCAAAGATAACTAAGCACCAGGCC |
DS-Oct4-B | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTGTGATGGGTCAGCAGGGCTGGAGCCGGGCT |
US-Oct4-B | ACCAGGTGGGGGTGATGGGTCAGCAGGGCT |
DS-T-C | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTCCTGGGTCCCTGCACATTCGCCAAAGGAGC |
US-T-C | GATTACACCTGGGTCCCTGCACATTCGCCAA |
DS-T-D | AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTGGCTTTGGAGAGGTCAAGGAGACCCGGGAG |
US-T-D | GCTGAGGCTTTGGAGAGGTCAAGGAGACC |
UP-4C | CAAGCAGAAGACGGCATACGA |
Adap-i1 | CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATC |
Adap-i2 | CAAGCAGAAGACGGCATACGAGATACATCGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATC |
Adap-i3 | CAAGCAGAAGACGGCATACGAGATGCCTAAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATC |
Adap-i4 | CAAGCAGAAGACGGCATACGAGATTGGTCAGTGACTGGAGTTCAGACGTGTGCTCTTCCGATC |
Table 6: Primers used for 4C library preparation.
The hanging drop culture method does not need additional growth factors or cytokines and reproducibly generates homogeneous populations of EBs from a predetermined number of mESCs5. Here we describe a quantitative 4C protocol adapted from the UMI-4C approach to quantify enhancer-promoter contacts of lineage specific transcription factors in the EB differentiation model. We identified chromatin regions that contact promoters of Pou5f1 and T genes in a dynamic fashion during EB differentiation. Pou5f1 was downregulated during EB differentiation and the contact frequency between the Pou5f1 promoter and its distal enhancer decreased. Conversely, T was upregulated during EB differentiation and we identified three enhancers for which contact frequencies with their promoter increased (Figure 3). To confirm the identification, a chromatin immunoprecipitation (ChIP) assay of the active histone mark H3K27ac can be performed24, as this histone mark has been shown to be associated with enhancer activation and enhancers lose this mark during their inactivation11.
The standard 4C technique has been extensively used to survey the chromatin contact profile of specific genomic sites25. However, its results are difficult to interpret quantitatively even after extensive normalization26,27,28 because of the biases introduced by the heterogeneity of PCR fragment size and the inability to distinguish PCR duplicates. Our quantitative 4C method is largely identical to the UMI-4C technique, which allows the quantification of single molecules using sonication and a nested-ligation-mediated PCR step to bypass the limitation of the classic 4C approach23. However, unlike UMI-4C, which uses unique molecular identifiers, our quantitative 4C protocol allows the quantification of single molecules based on the specific DNA break produced by the sonication step. This makes our protocol compatible with commercial DNA library preparation kits, obviating the need for primers with unique molecular identifiers.
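The idea of counting molecules by their sonication break can be sketched in a few lines of Python. This is an illustration and not the pipeline used in this work. It assumes that each aligned read has already been reduced to a tuple of chromosome, start coordinate of the contacted restriction fragment, and coordinate of the sonication break (how the break coordinate is extracted from the alignment is left open). Reads sharing all three values are treated as PCR duplicates of one ligation product. The type alias `Read` and the function `count_unique_molecules` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# (chromosome, restriction_fragment_start, sonication_break_position); assumed layout
Read = Tuple[str, int, int]


def count_unique_molecules(reads: Iterable[Read]) -> Dict[Tuple[str, int], int]:
    """Count distinct sonication breaks per contacted restriction fragment.

    Reads with the same fragment and the same break position are collapsed,
    so each distinct break is counted as one original molecule.
    """
    breaks: Dict[Tuple[str, int], Set[int]] = defaultdict(set)
    for chrom, fragment_start, break_position in reads:
        breaks[(chrom, fragment_start)].add(break_position)
    return {fragment: len(positions) for fragment, positions in breaks.items()}


if __name__ == "__main__":
    example_reads = [
        ("chr17", 1000, 1450),
        ("chr17", 1000, 1450),  # same break: PCR duplicate
        ("chr17", 1000, 1432),  # different break: independent molecule
        ("chr17", 5000, 5210),
    ]
    print(count_unique_molecules(example_reads))
```

In this toy input, the fragment starting at 1000 is supported by three reads but only two distinct breaks, so it contributes two contacts to the profile.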
Our protocol involves several key steps that should be considered. As in the classical 4C method28, critical factors of our protocol are the efficiency of the digestion and the ligation during the preparation of the 3C molecules. Low digestion/ligation efficiencies can dramatically decrease the complexity of interaction with a fragment of interest, resulting in a reduced resolution. As previously described23, another critical step of the protocol is the design of the primers for the library amplification. The second PCR reaction primers should be located 5-15 nt from the interrogated restriction site. In a 75 nt sequencing read, this allows for at least 40 nt left of the capture length for mapping. The primer used in the first PCR reaction should be designed upstream of the second primer with no overlap and both should be specific enough to ensure efficient DNA amplification. For multiplexing, primers should be designed independently, aiming for a melting temperature (Tm) of 60-65 °C. Moreover, as for other 3C techniques, the resolution of the quantitative 4C method is determined by the restriction enzyme used in the protocol25. This protocol uses a restriction enzyme with a 4 bp recognition site, MboI. The maximum resolution with this enzyme is around 500 bp, but this is highly locus dependent and rarely achieved. Another limitation is that interactions that occur between elements located in the same restriction fragment are not detectable. In addition, interactions occurring at a distance of one restriction site cannot be distinguished from the undigested background. The use of a fill-in step prior to ligation might allow the detection of these interactions.
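The primer placement rule above amounts to simple arithmetic, sketched here in Python. The sketch assumes that the sequencing read starts at the 5' end of the locus-specific part of the second PCR primer, so that the primer and the gap up to the restriction site are both consumed from the read before the captured sequence begins. The 20 nt primer length in the example is an assumption chosen for illustration, and `PrimerCheck` and `check_second_primer` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PrimerCheck:
    gap_nt: int       # distance between primer 3' end and restriction site
    gap_ok: bool      # True if the gap lies in the recommended 5-15 nt range
    mappable_nt: int  # read length left for mapping the captured fragment


def check_second_primer(primer_length: int, gap_nt: int, read_length: int = 75) -> PrimerCheck:
    """Check the gap to the restriction site and the read length left for mapping."""
    gap_ok = 5 <= gap_nt <= 15
    # Primer and gap are read first; whatever remains covers the captured sequence.
    mappable_nt = max(read_length - primer_length - gap_nt, 0)
    return PrimerCheck(gap_nt=gap_nt, gap_ok=gap_ok, mappable_nt=mappable_nt)


if __name__ == "__main__":
    # Assumed 20 nt locus-specific primer at the maximum recommended gap.
    print(check_second_primer(primer_length=20, gap_nt=15))
    # A gap shorter than 5 nt is flagged.
    print(check_second_primer(primer_length=20, gap_nt=3))
```

Under these assumptions, a 20 nt primer placed 15 nt from the restriction site leaves 40 nt of a 75 nt read for mapping, and longer primers shorten the mappable part accordingly.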
Quantitative 4C is ideally suited to interrogate chromatin contacts of targeted loci. However, the specific PCR amplification step limits the number of loci that can be investigated simultaneously. One way to increase the number of targeted loci is to multiplex the PCR steps to simultaneously amplify several targets, but this requires compatibility of the primers used and testing each primer pair prior to implementation. If the goal is to study global changes in chromatin architecture at promoters, genome-wide approaches such as Hi-C, PC Hi-C, or HiChIP would be more appropriate29,30,31.
The authors have nothing to disclose.
We would like to thank F. Le Dily, R. Stadhouders and members of the Graf laboratory for their advice and discussions. G.S. was supported by a Marie Sklodowska-Curie fellowship (H2020-MSCA-IF-2016, miRStem), T.V.T by a Juan de la Cierva postdoctoral fellowship (MINECO, FJCI-2014-22946). This work was supported by the European Research Council under the 7th Framework Programme FP7 (ERC Synergy Grant 4D-Genome, grant agreement 609989 to T.G.), the Spanish Ministry of Economy, Industry and Competitiveness (MEIC) to the EMBL partnership, Centro de Excelencia Severo Ochoa 2013-2017 and CERCA Program Generalitat de Catalunya.
0.1% EmbryoMax gelatin | EMD Millipore | ES-006-B | Cell culture |
0.25% Trypsin-EDTA | | 25200072 | |
AMPure XP | Beckman Coulter | 10136224 | 4C/DNA purification |
B27 supplement | Gibco | 17504044 | Cell culture |
Beta-mercaptoethanol | Gibco | 31350010 | Cell culture |
Bioruptor Pico | Diagenode | B01060010 | 4C/sonication |
BSA | NEB | B9000S | 4C |
CHIR99021 | Selleck Chemicals | S1263 | Cell culture |
CIP | NEB | M0212 | 4C |
cOmplete Protease Inhibitor Cocktail | Roche | 4693116001 | 4C |
DMEM/F12 medium | Gibco | 11320033 | Cell culture |
dNTP | NEB | N0447S | 4C |
ESGRO Leukaemia Inhibitory Factor (LIF) | EMD Millipore | ESG1107 | Cell culture |
Formaldehyde solution (37%) | Sigma | 252549-25ML | 4C |
Glycine | Sigma | GE17-1323-01 | 4C |
Glycogen | ThermoFisher | R0551 | 4C |
Herculase II Fusion DNA polymerase | Agilent | 600675 | 4C |
IGEPAL CA-630 | Sigma | I3021-50ML | 4C |
Knockout DMEM | | 10829018 | |
L-glutamine | Gibco | 25030081 | Cell culture |
MboI | NEB | R0147M | 4C |
MEM non-essential amino acids | Gibco | 11140050 | Cell culture |
N2 supplement | Gibco | A1370701 | Cell culture |
NEBNext DNA Library prep | NEB | E6040 | 4C |
NEBuffer 2.1 | NEB | B7202S | 4C/digestion |
Neurobasal medium | Gibco | 21103049 | Cell culture |
PD0325901 | Selleck Chemicals | S1036 | Cell culture |
Penicillin Streptomycin | Gibco | 15140122 | Cell culture |
Proteinase K | NEB | P8107S | 4C |
Qubit 4 Fluorometer | ThermoFisher | Q33238 | 4C |
Qubit dsDNA HS Assay Kit | ThermoFisher | Q32851 | 4C |
RNase A | ThermoFisher | EN0531 | 4C |
Sodium pyruvate solution | Gibco | 11360070 | Cell culture |
StemPro Accutase Cell Dissociation Reagent | Gibco | A1110501 | Cell culture |
T4 DNA Ligase Reaction Buffer | NEB | B0202S | 4C |
T4 DNA Ligase | NEB | M0202M | 4C |