Experimental validation of enhancer activity is best approached by loss-of-function analysis. Presented here is an efficient protocol that uses CRISPR/Cas9 mediated deletion to study allele-specific regulation of gene transcription in F1 ES cells which contain a hybrid genome (Mus musculus129 x Mus castaneus).
Enhancers control cell identity by regulating tissue-specific gene expression in a position and orientation independent manner. These enhancers are often located distally from the regulated gene in intergenic regions or even within the body of another gene. The position independent nature of enhancer activity makes it difficult to match enhancers with the genes they regulate. Deletion of an enhancer region provides direct evidence for enhancer activity and is the gold standard to reveal an enhancer’s role in endogenous gene transcription. Conventional homologous recombination based deletion methods have been surpassed by recent advances in genome editing technology which enable rapid and precisely located changes to the genomes of numerous model organisms. CRISPR/Cas9 mediated genome editing can be used to manipulate the genome in many cell types and organisms rapidly and cost effectively, due to the ease with which Cas9 can be targeted to the genome by a guide RNA from a bespoke expression plasmid. Homozygous deletion of essential gene regulatory elements might lead to lethality or alter cellular phenotype whereas monoallelic deletion of transcriptional enhancers allows for the study of cis-regulation of gene expression without this confounding issue. Presented here is a protocol for CRISPR/Cas9 mediated deletion in F1 mouse embryonic stem (ES) cells (Mus musculus129 x Mus castaneus). Monoallelic deletion, screening and expression analysis is facilitated by single nucleotide polymorphisms (SNP) between the two alleles which occur on average every 125 bp in these cells.
Transcriptional regulatory elements are critical for spatio-temporal fine tuning of gene expression during development1 and modification of these elements can result in disease due to aberrant gene expression2. Many disease-associated regions identified by genome wide association studies are in non-coding regions and have features of transcriptional enhancers3-4. Identifying enhancers and matching them with the genes they regulate is complicated as they are often located several kilobases away from the genes they regulate and may be activated in a tissue-specific manner5-6. Enhancer predictions are commonly based on histone modification marks, mediator-cohesin complexes and binding of cell type-specific transcription factors7-10. Validation of predicted enhancers is most often done through a vector based assay in which the enhancer activates expression of a reporter gene11-12. These data provide valuable information about the regulatory potential of putative enhancer sequences but do not reveal their function in their endogenous genomic context or identify the genes they regulate. Genome editing serves as a powerful tool to study the function of transcriptional regulatory elements in their endogenous context by loss-of-function analysis.
Recent advances in genome editing, namely the CRISPR/Cas9 genome editing system, facilitate the investigation of genome function. The CRISPR/Cas9 system is easy to use and adaptable for many biological systems. The Cas9 protein is targeted to a specific site in the genome by a guide RNA (gRNA)13. The SpCas9/gRNA complex scans the genome for its target genomic sequence, which must be 5' to a protospacer adjacent motif (PAM) sequence, NGG14-15. Base pairing of the gRNA to its target, a 20 nucleotide (nt) sequence complementary to the gRNA, activates SpCas9 nuclease activity, resulting in a double strand break (DSB) 3 bp upstream of the PAM sequence. Specificity is achieved through complete base pairing in the gRNA seed region, the 6-12 nt adjacent to the PAM; conversely, mismatches 5' of the seed are usually tolerated16-17. The introduced DSB can be repaired by either the non-homologous end joining (NHEJ) or the homology directed repair (HDR) mechanism. NHEJ DNA repair often creates insertions/deletions (indels) of a few bp at the target site that can disrupt the open reading frame (ORF) of a gene. To generate larger deletions in the genome, two gRNAs that flank the region of interest can be used18-19. This approach is particularly useful for the study of transcriptional enhancers clustered into locus control regions or super-enhancers, which are larger than conventional enhancers9,18,20-22.
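The targeting rule described above (a 20 nt target immediately followed by an NGG PAM, with the break 3 bp upstream of the PAM) can be illustrated with a short Python sketch. This is not the design tool used in the protocol; the function names are hypothetical, only the given strand is searched, and no off-target scoring is attempted.

```python
import re

def find_spcas9_targets(seq):
    """List candidate SpCas9 sites on the given strand.

    Returns (protospacer, pam, cut_index) tuples, where cut_index is the
    0-based position before which the double strand break is expected,
    i.e. 3 bp upstream of the PAM.
    """
    seq = seq.upper()
    targets = []
    # A lookahead is used so that overlapping candidate sites are all reported.
    for match in re.finditer(r"(?=([ACGT]{20})([ACGT]GG))", seq):
        protospacer, pam = match.group(1), match.group(2)
        cut_index = match.start() + 17  # 3 nt of protospacer remain between cut and PAM
        targets.append((protospacer, pam, cut_index))
    return targets

def delete_between(seq, cut_5prime, cut_3prime):
    """Idealized product of a two-gRNA deletion, assuming clean rejoining with no indels."""
    return seq[:cut_5prime] + seq[cut_3prime:]
```

In practice NHEJ frequently adds or removes a few bp at the junction, so the idealized product from delete_between is only a guide for predicting the size of the outside-primer amplicon.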
Monoallelic deletions are a valuable model for studying cis-regulation of transcription. The observed change in transcript level after monoallelic deletion of an enhancer correlates to the role of that enhancer in gene regulation without the confounding effects that can occur when transcription of both alleles is affected potentially influencing cellular fitness. Evaluating reduced expression is difficult however without the ability to distinguish the deleted from the wild type allele. Furthermore, genotyping deletions at each allele without the ability to distinguish the two alleles is challenging, especially for large deletions of >10 kb to 1 Mb23 in which it is difficult to amplify the entire wild type region by PCR. The use of F1 ES cells generated by crossing Mus musculus129 with Mus castaneus allows the two alleles to be differentiated by allele-specific PCR18,24. The hybrid genome in these cells facilitates allele specific deletion screening and expression analysis. On average there is a SNP every 125 bp between these two genomes, providing flexibility in primer design for expression and genotyping analyses. The presence of one SNP can influence the primer melting temperature (Tm) and target specificity in real-time quantitative PCR (qPCR) amplification allowing for discrimination of the two alleles25. Furthermore a mismatch within the 3' end of the primer greatly influences the ability of DNA polymerase to extend from the primer preventing amplification of the undesired allele target26. Described in the following protocol is the use of F1 ES cells for allele specific enhancer deletions of greater than 1 kb and subsequent expression analysis using the CRISPR/Cas9 genome editing system (Figure 1).
Figure 1. Enhancer deletion using CRISPR/Cas9 to study cis-regulation of gene expression. (A) F1 ES cells generated by a cross between Mus musculus129 and Mus castaneus are used to allow for allele specific deletion. (B) Two guide RNAs (gRNA) are used to induce a large Cas9-mediated deletion of the enhancer region. (C) Primer sets are used to identify large mono- and bi-allelic deletions. The orange primers are the inside primers, the purple primers are the outside primers and the green primers are the gRNA flanking primers. (D) Changes in gene expression are monitored using allele-specific qPCR. RFU denotes relative fluorescence units.
1. Designing and Constructing the gRNA
2. Transfection
Note: Electroporation is an efficient method of transfecting plasmids into ES cells. The method described here uses microporator transfection technology.
3. FACS Sorting Transfected Cells
4. Culturing Clones for Genotyping, Expression Analysis and Freezing Cell Stocks
5. Allele-specific Primer Design
6. Genotyping the Deletion
7. Analyzing Expression with Allele Specific Primers
8. Freeze Stock Preparation for Long-term Storage of ES Cells
The protocol described here uses F1 ES cells to study cis-regulation of gene expression in monoallelic enhancer deleted cells generated using CRISPR/Cas9 genome editing (Figure 1). The gRNA and allele-specific primer design for genotyping and gene expression are the key factors in this approach. Each allele-specific primer set must be validated by qPCR to confirm allele specificity. Allele-specific primers that amplify only their respective genomic DNA target are ideal (Figure 2). Ideally these primers have a SNP at their 3' end. Primers with less allele-specificity can be used if they display a minimum of a 5 Ct value difference in their amplification of the correct vs. the incorrect genotype25. Primers that carry a SNP more 5' than the 4th base from the 3' end usually fail to exhibit allele specificity and amplify both genotypes with equal efficiency, revealing the importance of the SNP position in the primer34. In addition a purine/purine or pyrimidine/pyrimidine substitution has been shown to have a greater impact on the Tm difference of two primers compared to a purine/pyrimidine substitution25.
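The two primer criteria above (where the SNP sits relative to the 3' end, and a minimum 5 Ct difference between the correct and incorrect genotype) can be checked with a small Python sketch. The function names are hypothetical, and treating an undetermined Ct (no amplification) as fully specific is an assumption made here for illustration.

```python
def snp_positions_from_3prime(primer_a, primer_b):
    """Return 1-based mismatch positions, counted from the 3' end, between two
    equal-length allele-specific primers (position 1 is the 3'-terminal base)."""
    if len(primer_a) != len(primer_b):
        raise ValueError("primers must have the same length to be compared")
    length = len(primer_a)
    return [
        length - index
        for index, (a, b) in enumerate(zip(primer_a.upper(), primer_b.upper()))
        if a != b
    ]

def passes_ct_threshold(ct_correct, ct_incorrect, min_delta_ct=5.0):
    """True if the incorrect genotype amplifies at least min_delta_ct cycles later.

    ct_incorrect=None stands for no detectable amplification (assumption)."""
    if ct_incorrect is None:
        return True
    return (ct_incorrect - ct_correct) >= min_delta_ct
```

A primer pair whose only mismatch is reported at position 1 matches the ideal case described above, whereas a mismatch further than 4 bases from the 3' end would be expected to discriminate poorly.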
Primary screening of a 96-well plate of DNA isolated from ES cell clones is conducted with allele-specific inside primers to identify clones that carry a deletion on one or both alleles. These deleted clones are further screened with outside allele-specific primers to confirm each deletion. The example given here is from an efficient gRNA pair that resulted in 46% of clones carrying a deletion on the 129, Cast, or both alleles (Figure 3). Secondary screening is done for confirmed monoallelic deletion clones to identify indels around the gRNA target sites on the non-deleted allele as the frequency of indel occurrence at the gRNA target site is high23. Indels around the gRNA are not identifiable in the primary screening, as these clones show no deletion with inside primers and do not amplify with the outside primers (Figure 4). Monoallelic deletion clones that give amplification with both sets of left and right gRNA flanking primers have their other allele largely intact as they do not contain a deletion larger than the 5' or 3' gRNA flanking amplicons. Sequencing the amplicons from the gRNA flanking primers will identify indels smaller than the gRNA flanking amplicons which may not be noticeable in the qPCR. The monoallelic deletion clones that do not give amplification at both regions in the secondary screening are not included for further gene expression analysis. As gRNA target regions are chosen outside the region suspected to have enhancer function, indels smaller than the gRNA flanking amplicons are not likely to affect enhancer function and clones containing these can be included in further analysis.
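The relative amplification reported for each clone in the Figure 3 and Figure 5 legends (2^(-Ct) for each allele, expressed as a percentage of the sum of the two alleles) can be computed as in the following Python sketch. The function name is hypothetical, and representing an allele with no detectable amplification as None is an assumption made for illustration.

```python
def allele_percentages(ct_129, ct_cast):
    """Approximate the share of each allele from allele-specific qPCR Ct values.

    Each allele's starting quantity is approximated as 2**(-Ct); a Ct of None
    (no amplification detected) is treated as zero signal.
    Returns (percent_129, percent_cast).
    """
    quantity_129 = 2.0 ** (-ct_129) if ct_129 is not None else 0.0
    quantity_cast = 2.0 ** (-ct_cast) if ct_cast is not None else 0.0
    total = quantity_129 + quantity_cast
    if total == 0.0:
        raise ValueError("neither allele amplified; percentages are undefined")
    return 100.0 * quantity_129 / total, 100.0 * quantity_cast / total
```

With inside primers, a clone whose signal comes almost entirely from one allele is a candidate for a deletion on the other allele, while a clone with no amplification from either allele is a candidate biallelic deletion and has to be confirmed with the outside primers rather than by this ratio.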
To restrict a large deletion to one allele, a gRNA can be chosen which overlaps a SNP in the seed region or the PAM. Described here is an example of a high-efficiency deletion (53%) on the 129 allele due to a SNP in the PAM for the 3' gRNA on the Cast allele (Figure 5). This deletion removed the SCR, a recently described Sox2 specific enhancer in ES cells18,22. Although the introduction of the large deletion was greatly reduced on the Cast allele, three clones (1, 11, 75) were identified with a large deletion on the Cast allele (Figure 5). Of these three clones, two (1, 11) contained 3' break points within 50 bp of the 3' gRNA target region. For the third clone we were not able to identify the 3' break point and concluded that the deletion was greater than 11 kb18. Deletions with one or both of their break points located >100 bp from either the 5' or 3' gRNA target region are difficult to genotype, generally account for 15-30% of all clones, and remain uncharacterized in the primary screening as the deletion is not amplified with the outside primers.
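Whether a SNP removes the NGG PAM on one allele, and so biases the deletion towards the other allele, can be checked with a minimal Python sketch. The function names and the example PAM sequences are hypothetical and are not the actual Sox2 SCR gRNA sequences.

```python
def pam_is_intact(pam):
    """True if a 3-nt sequence fits the SpCas9 PAM consensus NGG."""
    pam = pam.upper()
    return len(pam) == 3 and pam[0] in "ACGT" and pam[1:] == "GG"

def alleles_with_intact_pam(pam_by_allele):
    """Return the alleles (dictionary keys) whose PAM still matches NGG."""
    return [allele for allele, pam in pam_by_allele.items() if pam_is_intact(pam)]

# Hypothetical example: a SNP converts AGG to AGA on the Cast allele only.
expected_targets = alleles_with_intact_pam({"129": "AGG", "Cast": "AGA"})
```

This check only predicts the favored allele. As the three Cast deletion clones described above show, a disrupted PAM greatly reduces but does not completely prevent deletion on that allele.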
Once clones with monoallelic deletion have been identified they are analyzed for allele-specific gene expression using absolute quantification by reverse transcription qPCR. Gene expression from clones carrying a monoallelic deletion of the critical Sox2 enhancer region, the SCR, is shown here compared with expression in wild-type F1 ES cells (Figure 6). Specifically, clones carrying a deletion of the enhancer region on the 129 allele displayed a decrease in 129 transcript levels whereas clones carrying a deletion on the Cast allele displayed a decrease in Cast transcript levels. The allele-specific gene expression analysis revealed that this distal enhancer region, SCR, is a critical cis-regulator of Sox2 in ES cells18.
Figure 2. Testing the allele-specificity of 129 and Cast allele-specific primers. (A) Amplification from primers for the 129 genotype. (B) Amplification from primers for the Cast genotype. In both A and B the lines display the amplification profiles for C57BL/6 genomic DNA which has the same genotype as 129 DNA at these specific SNPs; the lines with open circles display the amplification profiles for Cast genomic DNA. Amplification resulting from the primer set with the most allele-specificity is shown in green (Allele-specific primer-1), a primer set displaying a 5 Ct difference is shown in purple (Allele-specific primer-2) and a primer set displaying a minimal difference in Ct value is represented in red (Allele-specific primer-3, primer details in Table 7). The red primer set would not be appropriate for allele-specific screening. RFU denotes relative fluorescence units.
Figure 3. Results obtained after screening a 96-well plate with allele-specific inside and outside primers. (A) qPCR results from the screen with inside primers (Enh_del_IS_F1_129, Enh_del_IS_F1C, Enh_del_IS_R1). (B) qPCR results from the screen with outside primers (Enh_del_OS_F1, Enh_del_OS_R1_129, Enh_del_OS_F2C, Enh_del_OS_R3; primer details in Table 7). In both panels, grey bars represent amplification from Cast-specific primers and black bars represent amplification from 129-specific primers. Note that only clones with a deletion on one or both alleles are screened with the outside primers. The relative amplification of each allele was calculated using 2^(-Ct) to approximate the initial concentration and subsequently expressing each allele as a percentage of the sum of the two alleles.
Figure 4. Clones with a monoallelic deletion are screened to identify large indels at the gRNA target sites. Primers flanking the gRNA target regions are used to confirm that the non-deleted allele is intact. Only monoallelic clones without large indels at the target sites on the non-deleted allele are used in the subsequent expression analysis.
Figure 5. Clones obtained from the Sox2 SCR deletion. Shown are the qPCR results from a screen with inside primers (pr111R, pr111F_129, pr111F_Cast, details in Table 7). Grey bars represent amplification from Cast-specific primers and black bars represent amplification from 129-specific primers. Note that the deletion is heavily skewed towards the 129 allele due to the presence of a SNP in the PAM of the 3' gRNA target region on the Cast allele. The relative amplification of each allele was calculated using 2^(-Ct) to approximate the initial concentration and subsequently expressing each allele as a percentage of the sum of the two alleles.
Figure 6. SCR deletion dramatically reduces the expression of Sox2. Representative results from 129 or Cast SCR deleted clones. Red bars represent expression of the Sox2 Cast allele and blue bars represent expression of the Sox2 129 allele. Deletion of the SCR on the 129 allele reduced expression of Sox2 (129) whereas deletion of the SCR on the Cast allele reduced expression of Sox2 (Cast). Data displayed are an average of three technical replicates; error bars are not shown. Primers used for Sox2 expression analysis [Sox2_F, Sox2(129)_R, Sox2(Cast)_R] are listed in Table 7.
Reagent | Stock concentration | Volume | Final concentration |
GlutaMAX | 200 mM | 6 ml | 2 mM |
2-Mercaptoethanol | 10 mM | 6 ml | 0.1 mM |
MEM Non-essential amino acids (NEAA) | 10 mM | 6 ml | 0.1 mM |
Sodium Pyruvate | 100 mM | 6 ml | 1 mM |
Penicillin/Streptomycin | 10,000 U/ml | 3 ml | 50 U/ml |
FBS | | 90 ml | 15% |
CHIR99021* | 10 mM | | 3 µM |
PD0325901* | 10 mM | | 1 µM |
LIF* | 10^7 units/ml | | 1000 U/ml |
To prepare ES cell media, add the above components to 500 ml of high glucose DMEM. ES cell media should not be stored for more than 4 weeks, or for more than 2 weeks once the inhibitors (*) have been added.
Table 1. ES cell media.
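The final concentrations listed in Table 1 follow from the usual dilution relationship (final = stock x volume added / total volume). The Python sketch below checks a few of them under stated assumptions: the total volume is taken as the 500 ml of DMEM plus the listed supplement volumes, the Penicillin/Streptomycin stock is assumed to be 10,000 U/ml, and the small volumes of CHIR99021, PD0325901 and LIF are ignored. The function name is hypothetical.

```python
def final_concentration(stock_concentration, volume_added_ml, total_volume_ml):
    """Dilution arithmetic: the result is in the same unit as stock_concentration."""
    return stock_concentration * volume_added_ml / total_volume_ml

# Total volume assumed: 500 ml DMEM + 4 x 6 ml supplements + 3 ml Pen/Strep + 90 ml FBS.
total_ml = 500 + 4 * 6 + 3 + 90

glutamax_mM = final_concentration(200, 6, total_ml)        # listed as 2 mM
pen_strep_U_per_ml = final_concentration(10_000, 3, total_ml)  # listed as 50 U/ml
fbs_percent = 100 * 90 / total_ml                          # listed as 15%
```

The computed values come out slightly below the rounded figures in the table because the supplements themselves increase the total volume.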
Reagent | Stock concentration | Volume | Final concentration |
GlutaMAX | 200 mM | 5 ml | 2 mM |
Penicillin/Streptomycin | 10,000 U/ml | 5 ml | 100 U/ml |
FBS | | 50 ml | 10% |
To prepare Spin media, add the above components to 500 ml of high glucose DMEM.
Table 2. Spin media.
Reagent | Final concentration | Volume (50 ml) |
1x PBS without Ca2+/Mg2+ | | 42.0 ml |
BSA Fraction V (7.5%) | 15% (v/v) | 7.5 ml |
0.5 M EDTA | 5 mM | 0.5 ml |
Table 3. Collection buffer.
Reagent | Final concentration | Volume (50 ml) |
1x HBSS | | 47.25 ml |
1 M HEPES | 25 mM | 1.25 ml |
0.5 M EDTA | 5 mM | 0.5 ml |
BSA Fraction V (7.5%) | 1% (v/v) | 0.5 ml |
FBS | 1% (v/v) | 0.5 ml |
Table 4. Sorting buffer.
ES cell media | 60% |
FBS | 40% |
Table 5. Recovery media.
ES cell media | 60% |
FBS | 20% |
DMSO | 20% |
Table 6. 2x freezing media.
Table 7. List of primers.
CRISPR/Cas9 mediated genome editing technology provides a straightforward, fast and inexpensive method for genome modification. The method detailed here to generate and analyze monoallelic enhancer deletion for functional enhancer characterization takes advantage of SNPs in F1 mouse cells. The advantages of this type of approach are: 1) monoallelic enhancer deletions do not produce confounding effects that occur when a critical enhancer is deleted from both alleles, i.e., a great reduction in the protein levels of the regulated gene leading to cell lethality or altered phenotype; 2) if the frequency of monoallelic deletion is low obtaining a homozygous deletion is less likely; however, with the use of allele-specific primers in gene expression analysis one can analyze clones with a monoallelic deletion; 3) using four sets of primers to screen monoallelic deletions permits the elimination of clones containing partial or large deletions which confound the downstream analysis.
Allele-specific primer design is critical for genotyping mono/biallelic CRISPR deletions and analyzing the effect on gene expression in an allele-specific manner. This is more easily achieved when the F1 cells used contain more frequent SNPs that allow discrimination of the two alleles. Here ES cells generated from a Mus musculus129 x Mus castaneus cross are used; however, other cells could be used if SNPs between the two alleles allow for allele-specific deletion screening and expression analysis, and if sufficient data exist to predict active enhancer regions to be targeted in the chosen cell type. Therefore, this method can be adapted to any cell line where information about allelic SNPs is available. One of the limitations of the protocol is the dependence on SNPs at specific locations. Some target regions carry fewer SNPs which makes designing allele-specific outside primers that amplify a <800 bp fragment challenging. In such cases a PCR approach could be used as an alternative to qPCR screening allowing for a larger amplicon. In addition there may be SNP associated differences in the phenotype of F1 ES cells; to confirm the function of specific enhancers in additional genotypes homozygous deletions can be conducted in standard ES lines. The specificity of the SpCas9 nuclease is an important issue especially in considering potential applications in clinical approaches. Investigation into SpCas9 specificity has revealed that both the 6-12 nt seed region of the gRNA recognition sequence and the adjacent PAM are important for nuclease activity13-14,17. Off target mutations can be minimized by ensuring that the seed region and adjacent PAM are unique in the genome being modified14,16.
A monoallelic deletion approach described here coupled with allele specific RNA-seq can definitively reveal the gene or genes regulated by a specific enhancer18. These experiments are important for understanding genome function as deletion of even reporter-assay validated enhancers does not always affect gene expression18. Furthermore, enhancers may not regulate the closest gene in the genome or may regulate more than one gene1,20,35. As a result, loss-of-function analysis is the most informative approach to determine the function of an enhancer region. This can be rapidly achieved using CRISPR/Cas9-mediated monoallelic deletion.
The authors have nothing to disclose.
We would like to thank all the members of the Mitchell lab for helpful discussions. This work was supported by the Canadian Institutes of Health Research, the Canada Foundation for Innovation and the Ontario Ministry of Research and Innovation (operating and infrastructure grants held by JAM).
Name | Company | Catalog number | Comments |
Phusion High-Fidelity DNA Polymerase | NEB | M0530S | High fidelity DNA polymerase used in gRNA assembly |
Gibson Assembly Master Mix | NEB | E2611L | |
gRNA_Cloning Vector | Addgene | 41824 | A target sequence is cloned into this vector to create the gRNA plasmid |
pCas9_GFP | Addgene | 44719 | Codon-optimized SpCas9 and EGFP co-expression plasmid |
AflII | NEB | R0520S | |
EcoRI | NEB | R3101S | |
Neon Transfection System 100 µL Kit | Life Technologies | MPK10096 | Microporator transfection technology |
prepGEM | ZyGEM | PT10500 | genomic DNA extraction reagent |
Nucleo Spin Gel & PCR Clean-up | Macherey-Nagel | 740609.5 | |
High-Speed Plasmid Mini Kit | Geneaid | PD300 | |
Maxi Plasmid Kit Endotoxin Free | Geneaid | PME25 | |
SYBR select mix for CFX | Life Technologies | 4472942 | qPCR reagent |
iScript cDNA synthesis kit | Bio-rad | 170-8891 | Reverse transcription reagent |
0.25% Trypsin with EDTA | Life Technologies | 25200072 | |
PBS without Ca/Mg2+ | Sigma | D8537 | |
0.5M EDTA | Bioshop | EDT111.500 | |
HBSS | Life Technologies | 14175095 | |
1M HEPES | Life Technologies | 13630080 | |
BSA fraction V (7.5%) | Life Technologies | 15260037 | |
Max Efficiency DH5α competent cells | Invitrogen | 18258012 | |
FBS | ES cell qualified | FBS is tested in advance in mouse ES cells for maintenance of pluripotency | |
DMSO | Sigma | D2650 | |
Glutamax | Invitrogen | 35050 | |
DMEM | Life Technologies | 11960069 | |
Penicillin/Streptomycin | Invitrogen | 15140 | |
Sodium pyruvate | Invitrogen | 11360 | |
Non-essential amino acids | Invitrogen | 11140 | |
β-mercaptoethanol | Sigma | M7522 | |
96-well plate | Sarstedt | 83.3924 | |
Sealing tape | Sarstedt | 95.1994 | |
CoolCell LX | Biocision | BCS-405 | alcohol-free cell freezing container |
CHIR99021 | Biovision | 1748-5 | Inhibitor for F1 ES cell culture |
PD0325901 | Invivogen | inh-pd32 | Inhibitor for F1 ES cell culture |
LIF | Chemicon | ESG1107 | Inhibitor for F1 ES cell culture |