Here we provide an experimental workflow for identifying and experimentally validating active enhancers.
Embryonic development is a multistep process involving activation and repression of many genes. Enhancer elements in the genome are known to contribute to tissue- and cell-type-specific regulation of gene expression during cellular differentiation. Thus, their identification and further investigation are important for understanding how cell fate is determined. Integration of gene expression data (e.g., microarray or RNA-seq) with the results of chromatin immunoprecipitation (ChIP)-based genome-wide studies (ChIP-seq) allows large-scale identification of these regulatory regions. However, functional validation of cell-type-specific enhancers requires further in vitro and in vivo experimental procedures. Here we describe how active enhancers can be identified and validated experimentally. This protocol provides a step-by-step workflow that includes: 1) identification of regulatory regions by ChIP-seq data analysis, 2) cloning and experimental validation of the putative regulatory potential of the identified genomic sequences in a reporter assay, and 3) determination of enhancer activity in vivo by measuring enhancer RNA transcript levels. The protocol is detailed enough to allow this workflow to be set up in any lab, and it can be readily adapted to any cellular model system.
Development of a multicellular organism requires precisely regulated expression of thousands of genes across developing tissues. Regulation of gene expression is accomplished in large part by enhancers. Enhancers are short non-coding DNA elements that can be bound by transcription factors (TFs) and act from a distance to activate transcription of a target gene 1. Enhancers are generally cis-acting and are most frequently found just upstream of the transcription start site (TSS), but recent studies have also described enhancers located much further upstream, 3′ of the gene, or even within introns and exons 2.
There are hundreds of thousands of potential enhancers in vertebrate genomes 1. Recent methods based on chromatin immunoprecipitation (ChIP) provide high-throughput, genome-wide data that can be used for enhancer analysis 3-9. Although data obtained by ChIP-seq experiments greatly increase the likelihood of identifying cell- and tissue-specific enhancers, it is important to keep in mind that detected binding sites do not necessarily reflect direct DNA binding and/or functional enhancers. Thus, further functional analysis of newly identified enhancers is indispensable. In this work, we present a basic three-step process for the identification and validation of putative active enhancers. This includes: 1) selection of putative transcription factor binding sites by bioinformatics analysis of ChIP-seq data, 2) cloning and validation of these regulatory sequences in reporter constructs, and 3) measurement of enhancer RNA (eRNA).
Exposure of embryonic stem (ES) cells to retinoic acid (RA) is frequently used to promote neural differentiation of the pluripotent cells 10. RA exerts its effects by binding to RA receptors (RARα, β, γ) and retinoid X receptors (RXRα, β, γ). RARs and RXRs bind as heterodimers to DNA motifs called RA-response elements, which are typically arranged as direct repeats of the AGGTCA sequence (called a half site), and regulate transcription. Ligand-treatment experiments allowed the identification of several retinoic acid-regulated genes in ES cells 11,12. However, enhancer elements for many of these genes have not been described yet. To demonstrate how the workflow described here can be used for enhancer identification and validation, we show step-by-step the selection and characterization of two retinoic acid-dependent enhancers in embryonic stem cells.
1. Enhancer Selection Based on ChIP-seq Analysis
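As an illustration of the kind of motif the ChIP-seq analysis looks for, the following is a minimal Python sketch that scans a DNA sequence for direct repeats of the AGGTCA half site on both strands. It is not the motif-discovery method used in this protocol: it matches the consensus exactly, whereas motif tools typically score degenerate matches. The function names and the default maximum spacer of 5 nt are assumptions made for this example.

```python
import re

HALF_SITE = "AGGTCA"  # nuclear receptor half site described in the text
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_direct_repeats(seq: str, max_spacer: int = 5):
    """Find exact direct repeats of the half site separated by 0..max_spacer nt.

    Returns a sorted list of (start, strand, spacer) tuples, where start is
    the 0-based position of the whole repeat on the + strand.
    max_spacer=5 is an illustrative default, not a value from the protocol.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for spacer in range(max_spacer + 1):
            # Lookahead so that overlapping occurrences are also reported.
            pattern = re.compile(
                f"(?={HALF_SITE}[ACGTN]{{{spacer}}}{HALF_SITE})"
            )
            length = 2 * len(HALF_SITE) + spacer
            for m in pattern.finditer(s):
                if strand == "+":
                    start = m.start()
                else:
                    # Convert a position on the reverse complement back
                    # to + strand coordinates.
                    start = len(s) - m.start() - length
                hits.append((start, strand, spacer))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical test sequence containing a repeat with no spacer.
    print(find_direct_repeats("TTTAGGTCAAGGTCATTT"))
```

A repeat with spacer 0 corresponds to the AGGTCAAGGTCA arrangement; larger spacer values report the same half sites separated by that many nucleotides.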
2. Reporter Assay
NOTE: The luciferin/luciferase system is used as a very sensitive reporter assay for transcriptional regulation. Depending on the enhancer activity, luciferase enzyme is produced, which catalyzes the oxidation of luciferin to oxyluciferin; the resulting bioluminescence can be detected. As a first step, identified putative enhancer sequences should be subcloned into a reporter vector (e.g., TK-Luciferase 22, pGL3 or NanoLuc).
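Reporter readouts are usually interpreted after normalization to an internal control. The sketch below assumes a β-galactosidase reading is available for each well, as in the example experiment, and computes the normalized luciferase activity and the fold induction upon ligand treatment. The function names and all readings are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean, stdev


def normalized_activity(luciferase, bgal):
    """Divide each luciferase reading by the βGal reading of the same well."""
    if len(luciferase) != len(bgal):
        raise ValueError("luciferase and bgal must have the same length")
    return [luc / gal for luc, gal in zip(luciferase, bgal)]


def fold_induction(treated, vehicle):
    """Ratio of mean normalized activity: treated over vehicle."""
    return mean(treated) / mean(vehicle)


if __name__ == "__main__":
    # Hypothetical raw readings from four replicate wells per condition.
    veh = normalized_activity([50, 100, 200, 100], [0.5, 1.0, 2.0, 1.0])
    ra = normalized_activity([150, 300, 600, 300], [0.5, 1.0, 2.0, 1.0])
    print("vehicle mean:", mean(veh), "s.d.:", stdev(veh))
    print("RA mean:", mean(ra), "s.d.:", stdev(ra))
    print("fold induction:", fold_induction(ra, veh))
```

Normalizing each well before averaging keeps well-to-well differences in transfection efficiency from inflating the variance of the reporter signal.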
3. Characterization of Enhancer RNA
NOTE: A more direct indicator of enhancer activity has emerged from recent genome-wide studies that identified many short non-coding RNAs, ranging in size from 50 to 2,000 nt, which are transcribed from enhancers and are termed enhancer RNAs (eRNAs) 16,25,26 (Figure 4). eRNA induction correlates strongly with the induction of adjacent exon-coding genes. Thus, signal-dependent enhancer activity can be quantified in vivo by comparing eRNA production between various conditions by RT-qPCR.
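One common way to compare eRNA levels between conditions from RT-qPCR data is the 2^-ΔΔCt calculation against a reference transcript (36B4 in the example experiment). The sketch below assumes that this calculation is appropriate and that primer efficiency is close to 100%; neither assumption is stated in the protocol, and the function names and Ct values are hypothetical.

```python
def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Expression of the target relative to the reference: 2^-(ΔCt).

    Assumes ~100% amplification efficiency for both primer pairs.
    """
    return 2.0 ** -(ct_target - ct_reference)


def fold_change(ct_target_treated: float, ct_ref_treated: float,
                ct_target_vehicle: float, ct_ref_vehicle: float) -> float:
    """Fold change of the target in treated vs. vehicle samples (2^-ΔΔCt)."""
    treated = relative_expression(ct_target_treated, ct_ref_treated)
    vehicle = relative_expression(ct_target_vehicle, ct_ref_vehicle)
    return treated / vehicle


if __name__ == "__main__":
    # Hypothetical Ct values: eRNA amplicon vs. reference transcript.
    print(fold_change(ct_target_treated=26.0, ct_ref_treated=18.0,
                      ct_target_vehicle=29.0, ct_ref_vehicle=18.0))
```

If primer efficiencies differ substantially from 100%, an efficiency-corrected calculation or a standard curve would be more appropriate than this simplified form.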
We used a pan-specific RXR antibody to identify, genome-wide, which RA-regulated genes have receptor enrichment in their close proximity. Bioinformatics analysis of RXR ChIP-seq data obtained from ES cells treated with retinoic acid revealed enrichment of the nuclear receptor half site (AGGTCA) under the RXR-occupied sites (Figure 1). Using a bioinformatics algorithm, we mapped the motif search result for the half site back to the RXR ChIP-seq data (Figure 2). This analysis helped us to accurately identify those ChIP peaks that overlapped with canonical nuclear receptor binding sites. Visualization of these sites in IGV indicated enrichment of these transcription factors in the close proximity of Hoxa1, a previously characterized RAR/RXR target (Figure 2 and Figure 3). We also identified a novel RA target gene, namely PRMT8 27. This latter region contains a direct repeat with no spacer nucleotides between the two half sites (AGGTCAAGGTCA) (Figure 3). To functionally validate that the element identified for Hoxa1 can indeed bind RAR/RXR and is regulated by RA, we constructed TK-luciferase reporter vectors that contained ~300 bp genomic regions, including the identified elements. We also constructed vectors without the response element (Figure 5). ES cells were transfected, and luciferase activity was measured in the absence and presence of retinoic acid (Figure 5). Constructs containing the retinoic acid response element (RARE) showed increased luciferase signal intensity upon RA treatment, while the construct without the RARE was not inducible.
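The step of mapping motif occurrences back onto ChIP-seq peaks amounts to an interval overlap test. The following minimal Python sketch illustrates the idea, assuming that peaks and motif hits are available as BED-style (chromosome, start, end) tuples with half-open coordinates. It is not the algorithm used in the analysis above, and the function name and coordinates are hypothetical.

```python
from collections import defaultdict


def peaks_with_motif(peaks, motif_hits):
    """Return the peaks that overlap at least one motif occurrence.

    Both inputs are iterables of (chrom, start, end) tuples using
    BED-style half-open coordinates (start inclusive, end exclusive).
    """
    hits_by_chrom = defaultdict(list)
    for chrom, start, end in motif_hits:
        hits_by_chrom[chrom].append((start, end))

    selected = []
    for chrom, p_start, p_end in peaks:
        # Two half-open intervals overlap if each starts before the other ends.
        if any(m_start < p_end and p_start < m_end
               for m_start, m_end in hits_by_chrom[chrom]):
            selected.append((chrom, p_start, p_end))
    return selected


if __name__ == "__main__":
    # Hypothetical coordinates for illustration only.
    peaks = [("chr6", 100, 300), ("chr6", 1000, 1200), ("chr2", 100, 300)]
    motifs = [("chr6", 150, 162), ("chr6", 1200, 1212)]
    print(peaks_with_motif(peaks, motifs))
```

The linear scan per peak is adequate for a few thousand regions; for much larger data sets, a sorted sweep or an interval tree would scale better.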
We also studied the retinoic acid-dependent activity of the Hoxa1 enhancer. To ascertain whether the sense and anti-sense eRNA expression of the Hoxa1 RAR/RXR-bound enhancer correlates with the mRNA expression of the gene, we also measured the level of Hoxa1 mRNA (Figure 6). These results confirmed that the enhancer activity is induced by short-term RA treatment (3 hr), suggesting that the enhancer is likely involved in RA-dependent regulation.
Figure 1. Homer De Novo Motif Results. An output HTML (homerResults.html) generated by Homer for the example RXR ChIP-seq is shown. The sequence logos correspond to the top enriched transcription factor motifs identified by de novo motif discovery at RXR-bound loci. A total of 7,463 RXR-occupied genomic regions were used for the motif search. 77.66% of these regions contain an AGGTCA-like motif (ranked as #1). For a detailed description, visit http://homer.salk.edu/homer/motif/.
Figure 2. Finding Genomic Regions Containing Motif of Interest. Visualization of the aligned RXR ChIP-seq reads (mm_ES_RXR_24h_ATRA.bam) and the AGGTCA motif occurrences (mm_ES_RXR_24h_ATRA_homerpeaks_motif1_mm10s_200_remaped_mbed.bed) by Integrative Genomics Viewer (IGV). Alignments are colored by read strand (reads on + strand: red, – strand: blue). Genome used for the alignment: mm10.
Figure 3. Genomic Loci of Hoxa1 and PRMT8 Enhancers. An IGV snapshot of RXR ChIP-seq signals obtained from untreated and RA-treated (24 hr, 1 µM) embryonic stem cells 27 is shown. Identified binding sequences are colored red.
Figure 4. Experimental Design for Testing eRNA Coding Sequences. Enhancers chosen for enhancer RNA (eRNA) measurement should be located at least 1.5 – 2 kb away from the transcription start site (TSS) of the closest gene. ChIP-seq data of the transcription factor (TF) of interest and the motif analysis can be used to identify direct binding sites genome-wide. As binding of a transcriptional coactivator (e.g., P300) often marks distal enhancers, regions enriched for both the TF and P300 are good enhancer candidates. Expression of eRNAs correlates positively with enrichment of the active enhancer histone mark H3K27ac. Regions 200 – 1,000 bp away, in the sense or anti-sense direction, from the center of the transcription factor binding site are recommended for eRNA primer design.
Figure 5. Luciferase Activity of the Indicated Reporter Constructs in ES Cells. Indicated genomic regions of the Hoxa1 enhancer were cloned into the TK-Luc vector and transfected into embryonic stem cells. Constructs 1 and 2 contained the retinoic acid response element (RARE, underlined sequence). Cells were treated with RA (24 hr, 1 µM). βGal was used as an internal control for normalization. Bars represent mean normalized values from four biological replicates ± s.d. *** P < 0.005.
Figure 6. Retinoic Acid Induced Enhancer RNA Transcription. Hoxa1 mRNA and eRNA levels as measured by RT-qPCR. Total RNA was isolated from ES cells treated for 3 hr with retinoic acid (RA) or DMSO as vehicle (veh). Forward primers for the sense and anti-sense eRNA detection and the PCR amplicons are shown. The distance of the regions detected by eRNA RT-qPCR from the center of the RXR binding site is indicated. Values were normalized to the average of 36B4 mRNA. Data represent mean ± s.e.m. *** P < 0.005.
In recent years, advances in sequencing technology have allowed large-scale predictions of enhancers in many cell types and tissues 7-9. The workflow described above allows one to perform primary characterization of candidate enhancers chosen based on ChIP-seq data. The detailed steps and notes will help anyone to set up a routine enhancer validation in the lab.
The most critical factor in the luciferase reporter assay is transfection efficiency. It is recommended to include a GFP construct in every experiment in order to estimate the transfection efficiency. Using fresh, good-quality plasmid preparations can be critical and can significantly improve transfection. The presented protocol can be adapted to any cellular model system. It is highly recommended to use the appropriate cell type for the characterization of tissue-specific enhancers. However, the luciferase reporter assay requires that the cells chosen for the experiment be efficiently transfectable. Alternatively, easily transfectable cells in which the transcription factor of interest is (over)expressed should be used. This can be particularly important in the case of some tissue-specific enhancers.
In contrast, enhancer RNA measurement can be readily used in any cell type, without such limitations, to study enhancer activity under different conditions in vivo. Key parameters for efficient eRNA detection are listed above. Isolation of good-quality, non-degraded RNA is essential. DNase treatment is very important in order to avoid detecting contaminating genomic DNA instead of the eRNA. As the RNA is used in the subsequent reverse transcription reaction to obtain cDNA, the DNase should be appropriately inactivated. It is advisable to design primers for the detection of both sense and anti-sense eRNAs.
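The positional guidelines for eRNA primer design given in the legend of Figure 4 (enhancer at least 1.5 – 2 kb from the closest TSS, amplicons 200 – 1,000 bp from the center of the binding site on either side) can be expressed as a small helper. This is a minimal sketch: the function name, the use of 1,500 bp as the default cutoff, and the example coordinates are assumptions made for illustration.

```python
def erna_primer_windows(peak_center: int, tss: int,
                        min_tss_distance: int = 1500,
                        inner: int = 200, outer: int = 1000):
    """Suggest genomic windows for eRNA primer design around a binding site.

    Returns None if the binding site lies closer to the TSS than
    min_tss_distance; otherwise returns a dict with one window on each
    side of the peak center, each as a (start, end) tuple.
    Default distances follow the guidelines in the Figure 4 legend;
    1500 is the lower bound of the suggested 1.5 - 2 kb range.
    """
    if abs(peak_center - tss) < min_tss_distance:
        return None
    return {
        # Clamp at 0 so windows never extend past the chromosome start.
        "left": (max(0, peak_center - outer), max(0, peak_center - inner)),
        "right": (peak_center + inner, peak_center + outer),
    }


if __name__ == "__main__":
    # Hypothetical coordinates for illustration only.
    print(erna_primer_windows(peak_center=10_000, tss=15_000))
    print(erna_primer_windows(peak_center=10_000, tss=10_500))
```

Primers for the two windows can then be designed so that both the sense and the anti-sense eRNA are covered.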
Carrying out the above described enhancer trap experiment (transfection of a construct containing an enhancer element and a luciferase-based reporter) can provide strong functional evidence for enhancer activity. However, the ultimate proof, which is rarely presented in research articles, would be genetic elimination of the designated enhancer and thus determination of its role in vivo or at least in a cell 28.
The above-described workflow does not reveal which gene is regulated by the validated enhancer. Further research could use 3C (Chromosome Conformation Capture) experiments, which show whether the implicated enhancer interacts with the promoter of the gene of interest.
The authors have nothing to disclose.
The authors would like to acknowledge Dr. Bence Daniel, Matt Peloquin, Dr. Endre Barta, Dr. Balint L Balint and members of the Nagy laboratory for discussions and comments on the manuscript. L.N. is supported by grants from the Hungarian Scientific Research Fund (OTKA K100196 and K111941) and co-financed by the European Social Fund and the European Regional Development Fund and Hungarian Brain Research Program – Grant No. KTIA_13_NAP-A-I/9.
Name | Company | Catalog Number | Comments |
KOD DNA polymerase | Merck Millipore | 71085-3 | for PCR amplification of enhancer from gDNA |
DNeasy Blood & Tissue kit | Qiagen | 69504 | for genomic DNA isolation |
QIAquick PCR Purification kit | Qiagen | 28106 | for PCR product purification |
Gel extraction kit | Qiagen | 28706 | for gel extraction if there is more than one PCR product |
HindIII | NEB | R3104L | restriction enzyme |
BamHI | NEB | R3136L | restriction enzyme |
FastAP | Thermo Scientific | EF0651 | release of 5'- and 3'-phosphate groups from DNA |
T4 DNA ligase | NEB | M0202 | for ligation |
QIAprep Spin Miniprep kit | Qiagen | 27106 | for plasmid isolation |
DMEM | Gibco | 31966-021 | ES media |
FBS | Hyclone | SH30070.03 | ES media |
MEM Non-Essential Amino Acid | Sigma | M7145 | ES media |
Penicillin-Streptomycin | Sigma | P4333 | ES media |
Beta Mercaptoethanol | Sigma | M6250 | ES media |
FuGENE HD | Promega | E2311 | transfection reagent |
Opti-MEM® I Reduced Serum Medium | Life Technologies | 31985-062 | for transfection |
All-trans retinoic acid | Sigma | R2625 | ligand, for activation of RAR/RXR |
96-well clear plate | Greiner | 655101 | for Beta galactosidase assay |
96-well white plate | Greiner | 655075 | for Luciferase assay |
D-luciferin, potassium salt | Goldbio.com | 115144-35-9 | for Luciferase assay |
ATP salt | Sigma | A7699-1G | for Luciferase assay |
MgSO4 x 7H2O | Sigma | 230391-25G | for Luciferase assay |
HEPES | Sigma | H3375-25G | for Luciferase assay |
Na2HPO4 x 7H2O | Sigma | 431478-50G | for Beta galactosidase assay |
NaH2PO4 x H2O | Sigma | S9638-25G | for Beta galactosidase assay |
MgSO4 x 7H2O | Sigma | 230391-25G | for Beta galactosidase assay |
KCl | Sigma | P9541-500G | for Beta galactosidase assay |
ONPG (o-nitrophenyl-β-D-galactopyranoside) | Sigma | N1127-1G | for Beta galactosidase assay |
TRIzol® | Life Technologies | 15596-026 | RNA isolation |
High-Capacity cDNA Reverse Transcription Kit | Life Technologies | 4368814 | reverse transcription of eRNA |
RNase-free DNase | Promega | M6101 | DNase treatment |
SsoFast Eva Green | BioRad | 750000105 | RT-qPCR mastermix |
CFX384 Touch™ Real-Time PCR Detection System | BioRad | | qPCR machine |
BioTek Synergy 4 microplate reader | BioTek | | luminescent counter |