Methods for generating large-scale gRNA libraries should be simple, efficient and cost-effective. We describe a protocol for the production of gRNA libraries based on enzymatic digestion of target DNA. This method, CORALINA (comprehensive gRNA library generation through controlled nuclease activity), presents an alternative to costly custom oligonucleotide synthesis.
The popularity of the CRISPR/Cas9 system for both genome and epigenome engineering stems from its simplicity and adaptability. An effector (the Cas9 nuclease or a nuclease-dead dCas9 fusion protein) is targeted to a specific site in the genome by a small synthetic RNA known as the guide RNA, or gRNA. The bipartite nature of the CRISPR system enables its use in screening approaches since plasmid libraries containing expression cassettes of thousands of individual gRNAs can be used to interrogate many different sites in a single experiment.
To date, gRNA sequences for the construction of libraries have been almost exclusively generated by oligonucleotide synthesis, which limits the achievable complexity of sequences in the library and is relatively cost-intensive. Here, a detailed protocol for CORALINA (comprehensive gRNA library generation through controlled nuclease activity), a simple and cost-effective method for the generation of highly complex gRNA libraries based on enzymatic digestion of input DNA, is described. Since CORALINA libraries can be generated from any source of DNA, plenty of options for customization exist, enabling a large variety of CRISPR-based screens.
The adaptation of the bacterial CRISPR/Cas9 system as a molecular targeting tool caused the most recent revolution in molecular biology. Never before has it been so easy to manipulate chromatin at defined genomic locations. Common applications of CRISPR include targeted gene mutations1, genome engineering2, epigenome editing3, transcriptional activation and gene silencing4. One particular advantage of the CRISPR system is that its applications are not limited to well-studied candidate sites, as gRNA libraries make less biased screens possible. Such screens facilitate the discovery of functional loci in the genome without any prior experimental knowledge. However, gRNA library construction is currently based mostly on oligonucleotide synthesis, and there are few options to purchase gRNA libraries that are not of human or mouse origin or that target regions outside open reading frames. Thus, although CRISPR screens have already proven incredibly potent5,6,7,8, their full potential has not yet been exploited.
To overcome the limitations of classical gRNA generation methods, two strategies have recently been developed. Both are based on controlled enzymatic digestion of target DNA rather than on custom oligonucleotide synthesis. While CORALINA9 employs micrococcal nuclease, the only currently available alternative method, CRISPR-EATING10, makes use of restriction enzymes (HpaII, ScrFI, BfaI and MmeI). Importantly, both techniques can be applied to any input DNA, which serves as the source of gRNA protospacer sequences. While the CRISPR-EATING method employs a strategy to decrease the number of cloned gRNAs whose targeting sites are not followed by the required S. pyogenes PAM (protospacer adjacent motif), it generates only a small fraction of all possible functional gRNAs for a given region. CORALINA, on the other hand, is able to generate all potential gRNAs for the source sequence, but also incorporates a higher fraction of non-functional guides. Library generation through controlled nuclease activity thus enables the production of comprehensive gRNA libraries for any species and for any Cas9 protein or effector system in a simple and cost-effective manner. Moreover, CORALINA is adaptable to customization, as appropriate input and vector choices define the library type, size and content. Here, a detailed protocol is presented that can be used for the generation of comprehensive gRNA libraries from diverse sources of DNA (Figure 1), including bacterial artificial chromosomes (BACs) or genomic DNA9. The representative results accompanying this protocol were derived by applying the CORALINA protocol to BAC DNA.
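The trade-off between completeness and PAM-less guides can be illustrated with a minimal Python sketch. It is not part of the CORALINA protocol: it simply slides a fixed-length window over both strands of a sequence and flags the windows that are followed by an NGG PAM (the S. pyogenes Cas9 motif). The function names, the fixed 20 bp window and the toy sequence are assumptions chosen for illustration only.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def enumerate_protospacers(seq: str, length: int = 20, pam_len: int = 3):
    """Yield (strand, start, protospacer, has_pam) for every window on both strands.

    `start` is the 0-based offset on the strand being read (for "-", the offset
    within the reverse complement). `has_pam` is True when the three bases
    immediately 3' of the window match NGG. Windows that are too close to the
    end of the sequence to be followed by a full PAM are skipped.
    """
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - length - pam_len + 1):
            protospacer = s[i:i + length]
            pam = s[i + length:i + length + pam_len]
            yield strand, i, protospacer, pam[1:] == "GG"


if __name__ == "__main__":
    # Hypothetical toy input, not derived from any real locus.
    toy = "ACGTACGTACGTACGTACGTAGGTTTACGATCGATCGGAT"
    windows = list(enumerate_protospacers(toy))
    with_pam = sum(1 for *_, has_pam in windows if has_pam)
    print(f"{with_pam} of {len(windows)} windows are followed by an NGG PAM")
```

In such a sketch, every window corresponds to a fragment that random digestion could in principle produce, whereas only the flagged subset would yield functional S. pyogenes gRNAs.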
1. Digestion of DNA with Micrococcal Nuclease
2. Separation of DNA Fragments Using Polyacrylamide Gel Electrophoresis (PAGE)
3. Isolation of DNA Fragments from PAGE-gels Using the Crush and Soak Method
NOTE: This step has been adopted from Sambrook et al.11
4. End Repair of MNase-digested, Gel-purified Fragments
5. Linker Generation
NOTE: Linkers need to be amplified in parallel with section 3 to be able to proceed immediately with linker ligation. Primer sequences used below must be appropriate for the chosen gRNA expression vector. Those presented here have been designed for the pgRNA-pLKO.1 vector9. For amplification of the 5' linker from pgRNA-pLKO.1, use the primers 5'-linker-F (TTGGAATCACACGACCTGGA) and 5'-linker-R (CGGTGTTTCGTCCTTTCCAC), which yield a 689 bp amplicon. For amplification of the 3' linker from pgRNA-pLKO.1, use the primers 3'-linker-F (GTTTTAGAGCTAGAAATAGCAAGTTAAAATA) and 3'-linker-R (ACTCGGTCATGGTAAGCTCC), which yield an 848 bp amplicon.
6. Linker Ligation and Amplification of Inserts
7. Size Selection
NOTE: This step separates MNase-fragments with the correctly attached 5' and 3' linker from fragments with two 5' or two 3' linkers based on size.
8. Cloning of PCR-amplified Fragments into the gRNA Expression Vector by Gibson Assembly
9. Preparation of Electro-competent TG1 E. coli Cells
10. Electroporation of TG1 Electrocompetent E. coli Cells
NOTE: Electroporation is one of the bottlenecks in comprehensive library generation. To preserve the library representation, it is recommended to conduct as many individual electroporation reactions as necessary/practicable and to perform the quality control steps described below (10.6. and 10.8.).
11. Extraction of Plasmid DNA
Using the protocol at hand, CORALINA gRNA libraries have been generated from human and mouse genomic DNA9 and BAC DNA (Figure 1). To produce fragments of input DNA suitable for cloning into gRNA expression vectors, optimal conditions for controlled nuclease digestion have to be determined. A typical result for the optimization of micrococcal nuclease digestion is depicted in Figure 2A. Insufficient amounts of nuclease (0.1, 2, 3, 4, 4.5 or 5 units) produced no noticeable products in the required size range (10-100 bp), and 5.5-7.5 units still produced fragments that were on average too long. Larger amounts of enzyme (50 units) led to excessive degradation of input DNA after 10 min. Consequently, an intermediate amount was chosen (10 units). The digest was scaled up to produce enough digested fragments for subsequent purification and cloning (Figure 2B). While it is recommended to select DNA fragments blindly by size, relying only on the DNA ladder for orientation in order to minimize exposure of the fragments to UV light, gels can be stained afterwards for quality control of digestion and cutting. Figure 2B shows a representative example of a PAGE gel from which DNA fragments between 20 and 30 bp have been excised. Gel-purified MNase fragments were loaded onto a 20% PAGE gel to check successful size selection and purification of MNase-digested fragments (Figure 2C). The protocol at hand is compatible with the use of customized linker sequences, allowing the MNase-digested fragments to be cloned into gRNA expression vectors of choice. Here, the pgRNA-pLKO.1 vector9 was used as the backbone. The linkers are amplified from the gRNA expression vector using standard PCR. Figure 2D depicts a representative example of amplified linker sequences devoid of additional or incorrect amplicons and without amplification in the no-template controls. Next, linker amplicons are digested with restriction enzymes to ensure that linkers are ligated onto the MNase-digested fragments in the correct orientation.
Figure 2D shows agarose gels of 5' and 3' linkers before and after digestion with HindIII and SacII respectively, indicating complete digestion of the linkers to the predicted 637 and 295 bp. The right-hand portion of the gel image documents the excision of the digested linker fragments. Following gel extraction of digested linkers, the next step in the protocol is the ligation of linkers to the end-repaired MNase-digested fragments. Because linker sequences are generated by PCR using unphosphorylated primers, self-ligation of linkers should not occur. Only the end-repaired MNase-digested DNA fragments provide the phosphate groups necessary for ligation. Following nick translation, the ligation product is amplified by PCR. In order to avoid excessive PCR amplification bias that could skew the representation of gRNA sequences in the library, amplification is limited to fewer than 20 cycles in total. Following PCR, the amplification products are difficult to visualize on agarose gels. Separate control PCRs with 32 cycles are therefore performed to detect the products (but are not used for library preparation). Results from this control PCR are shown in Figure 2E. This control allows the ligation reactions to be optimized and confirms that reactions are devoid of PCR artefacts, which sometimes occur in "no fragment controls" (NFC). Figure 2E shows the desired amplicon (5' linker + DNA fragment + 3' linker, length: 869 bp) following amplification of ligation reactions using equimolar (1:1) ratios between fragments and linker sequences.
Figure 1: Suggested timeline for preparing a gRNA library. CORALINA offers a simple and cost-efficient strategy for the generation of comprehensive gRNA libraries from a plethora of different DNA sources from any organism. The protocol at hand can be brought to completion during one working week. Linker generation can be performed in parallel with DNA end-repair. Preparation of electrocompetent bacteria takes two days and includes an overnight growth step and should therefore be started before assembly reactions are set up.
Figure 2: Critical steps during the protocol. (A) Controlled digestion of BAC DNA enables generation of fragments of different sizes. Shown here is the optimization of MNase digestion. Purified BAC DNA was treated with different amounts of MNase for 10 min. 10 U of MNase generated DNA fragments of the desired length (20-30 bp). (B) Size selection of fragments between 20 and 30 bp using excision from polyacrylamide gels. Purified BAC DNA was treated with 10 U of MNase for 10 min. The image was recorded following excision. (C) Quality control of gel-purified fragments. After gel-purification, 1/6th of the purified MNase fragments was loaded onto a 20% PAGE gel to check successful size selection and purification. (D) Amplification of linker sequences for assembly and restriction enzyme digestion of linkers to ensure directional cloning. 5' and 3' linkers were amplified and cut with HindIII and SacII, respectively. No-template controls (NTC) were included to control for PCR artefacts and DNA contamination. Left: analytical sample application; right: preparative sample application. The image was recorded after gel excision. (E) Successful ligation of linkers to DNA fragments can be analyzed using PCR with an increased number of PCR cycles (32) and controlled by performing no-template controls with H2O (NTC) or using the NTC from the previous nick translation step as input (NTC NT). It is important to include a no fragment control (NFC), which is an amplification from a ligation and nick translation reaction from which the MNase fragments were omitted. Only samples in which MNase fragments have been combined with linker DNA produce the expected amplicon (869 bp).
CORALINA can be used to generate large-scale gRNA libraries by controlled nuclease digestion of target DNA and bulk cloning of the resulting double-stranded fragments. Statistical inference indicates that many more than 10^7 individual gRNA sequences have already been successfully cloned using the protocol at hand9. CORALINA can be customized in multiple ways. The choice of template DNA defines the target region and the maximal complexity of the generated library. Using this protocol, CORALINA libraries have previously been generated from human and mouse genomic DNA9. Representative results presented here depict the generation of a CORALINA library from purified BAC DNA. Further customization can be achieved by the choice of gRNA expression vector and linker sequences. We have previously tested three different pairs of linker lengths for Gibson assembly with little variation in efficiency9.
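How the template defines the maximal complexity can be made concrete with a simple upper bound. The sketch below is an illustration under stated assumptions rather than a calculation taken from the protocol: it treats the template as linear and non-repetitive and counts every possible window of a given length range on both strands. The function name and the default 20-30 bp range (taken from the excised size range) are choices made for this example.

```python
def max_library_complexity(template_len: int, min_len: int = 20, max_len: int = 30) -> int:
    """Upper bound on distinct protospacers obtainable from a linear template.

    Counts every window of min_len..max_len bases on both strands. Repeated
    sequence within the template lowers the true number of distinct guides,
    so this is a ceiling, not an estimate of the cloned library size.
    """
    if min_len < 1 or max_len < min_len:
        raise ValueError("require 1 <= min_len <= max_len")
    if template_len < max_len:
        raise ValueError("template is shorter than the longest window")
    # A linear sequence of length L contains L - k + 1 windows of length k per strand.
    return sum(2 * (template_len - k + 1) for k in range(min_len, max_len + 1))


if __name__ == "__main__":
    # 200,000 bp is a hypothetical template size used only for illustration.
    print(max_library_complexity(200_000))
```

Comparing this ceiling with the number of clones actually obtained gives a rough sense of whether a library can be saturating for the chosen input.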
Due to their origin from bulk-digested DNA, protospacers of CORALINA gRNAs are usually not exactly 20 bp in length, but show a length distribution with a mean that depends both on the parameters of the MNase digestion and on the size of the excision made from the PAGE gels. The representative example shown in Figure 2B and C depicts fragments with a median length between 19 and 27 bp. In our experience, the length of the fragments is faithfully preserved in the generated gRNA protospacers9. While fragments shorter than 20 bp should be avoided due to the higher off-target rate of the resulting gRNAs, longer fragments are likely much less of a problem for downstream applications, since it has been demonstrated that gRNAs with protospacers as long as 45 bp are still functional9.
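If protospacer sequences are available from sequencing a library, their length distribution can be summarized with a few lines of Python. The following is a minimal sketch, not part of the published workflow: the function name, the input format (a plain list of sequence strings) and the returned dictionary keys are assumptions made for illustration, and the 20 bp threshold follows the recommendation above.

```python
from statistics import mean, median


def summarize_lengths(protospacers, min_len: int = 20) -> dict:
    """Summarize protospacer lengths and the fraction shorter than min_len."""
    lengths = [len(p) for p in protospacers]
    if not lengths:
        raise ValueError("no protospacers given")
    n_short = sum(1 for n in lengths if n < min_len)
    return {
        "n": len(lengths),
        "mean": mean(lengths),
        "median": median(lengths),
        "fraction_short": n_short / len(lengths),
    }


if __name__ == "__main__":
    # Hypothetical reads; real input would come from sequencing the library.
    reads = ["ACGTACGTACGTACGTAC", "ACGTACGTACGTACGTACGT", "ACGTACGTACGTACGTACGTACGT"]
    print(summarize_lengths(reads))
```

A high fraction of short protospacers in such a summary would suggest revisiting the MNase digestion conditions or the excised gel region.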
The two most critical steps in the CORALINA protocol are the size selection of MNase-digested fragments and the cloning steps. Generation of fragments that are too short (e.g. average below 18 bp) or incorporation of too many empty gRNA expression vectors will render the library useless. Thus, it is important to optimize the MNase digestion step (Figure 2A), to monitor excision (Figure 2B, C), to check for complete digestion of the gRNA vector backbone and to include no fragment controls throughout the protocol. Special care must also be taken to preserve the representation of the gRNA library. One common bottleneck of library generation in general is the efficient transfer of plasmids into bacteria for amplification. Thus, large quantities of highly competent bacteria and a large number of individual electroporation events are necessary to achieve a high number of gRNA clones.
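The relationship between the number of independent transformants and library representation can be approximated with a standard Poisson sampling model. The sketch below is an idealized illustration and not a calculation from the protocol: it assumes that all sequences in the ligation are equally abundant and that each clone is an independent draw, which real libraries only approximate. The function names and the 95% default are choices made for this example.

```python
import math


def expected_coverage(n_clones: int, complexity: int) -> float:
    """Expected fraction of distinct sequences captured at least once.

    Under uniform sampling, the chance that a given sequence is missed by
    all n_clones draws is approximately exp(-n_clones / complexity).
    """
    if complexity <= 0 or n_clones < 0:
        raise ValueError("complexity must be > 0 and n_clones >= 0")
    return 1.0 - math.exp(-n_clones / complexity)


def clones_needed(complexity: int, target_fraction: float = 0.95) -> int:
    """Number of independent clones needed to reach target_fraction coverage."""
    if complexity <= 0 or not 0.0 < target_fraction < 1.0:
        raise ValueError("complexity must be > 0 and 0 < target_fraction < 1")
    return math.ceil(-complexity * math.log(1.0 - target_fraction))


if __name__ == "__main__":
    # 1,000,000 distinct sequences is a hypothetical library complexity.
    n = clones_needed(1_000_000, 0.95)
    print(n, round(expected_coverage(n, 1_000_000), 4))
```

Under these assumptions, roughly three clones per distinct sequence are needed for 95% representation; uneven fragment abundance raises the requirement, which is one reason to pool many electroporation reactions.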
New strategies for gRNA library production will be necessary to harness the full potential of CRISPR-based screening approaches over the next decades. There is a significant demand for cost-effective, simple and customizable methods to generate large-scale libraries, a prerequisite for making screening amenable to a larger number of model systems and different CRISPR-based engineering approaches. CORALINA provides a first step toward this. The potential uses are manifold and include, in particular, comprehensive libraries of genomes, cDNA-derived libraries of less common model systems, highly focused libraries and experimental set-ups in which different CRISPR proteins (with differing PAM requirements) are used in combination.
Unlike other methods, CORALINA generates all possible gRNAs from the input DNA. However, one drawback of the method is that gRNAs lacking the required PAM sequence are also included in the library, a feature that it shares with a second enzymatic method for gRNA library generation, CRISPR-EATING (Table 1). The choice of the ideal method for gRNA library generation depends on the specifications of the planned screening experiment, especially the nature (genic, regulatory, intergenic) and size of the target region (single locus, multiple regions, genome-wide). We see a particular advantage in using CORALINA when a large number of non-coding or regulatory regions are to be analyzed, when sequence information is incomplete or unreliable (exotic model systems, mixtures of species such as microbiomes, or experimentally obtained input), when different CRISPR endonucleases are combined or when saturating analysis is performed on a short and defined locus (e.g. represented by BACs).
The authors have nothing to disclose.
The authors would like to thank Prof. Dr. Stephan Beck and Prof. Dr. Magdalena Goetz for their input, help and support in developing the CORALINA method, and Maximilian Wiessbeck and Valentin Baumann for helpful comments. The work has been supported by DFG (STR 1385/1-1).
Name | Company | Catalog Number | Comments |
500 mM EGTA | Sigma Aldrich | 03777-10G | 1.4., inactivation of MNase |
Novex Hi-Density TBE Sample Buffer | Thermo Fisher Scientific | LC6678 | 2.1. |
Novex® TBE Gels, 20%, 10 well | Thermo Fisher Scientific | EC6315BOX | 2.1., pre-made 20 % PAGE gel |
O'RangeRuler 5 bp DNA Ladder | Thermo Fisher Scientific | SM1303 | 2.1. |
Novex® TBE Running Buffer | Thermo Fisher Scientific | LC6675 | 2.1., PAGE gel running buffer |
Disposable scalpel, sterile | VWR | 233-5363 | 2.3., other equivalent reagents may be used |
SYBR Green I nucleic acid stain (1000x concentrate in DMSO) | Sigma Aldrich | S9430 | 2.3. + 2.5., also available from Thermo Fisher Scientific (S7563) |
UltraPure Phenol:Chloroform:Isoamyl Alcohol (25:24:1) | Thermo Fisher Scientific | 15593-031 | 3.6.1. + 4.3., other equivalent reagents may be used |
Glycogen | Sigma | 10901393001 | 3.6.4., other equivalent reagents may be used |
3 M sodium acetate, pH 5.2 | Thermo Fisher Scientific | R1181 | 3.6.4., other equivalent reagents may be used |
Ethanol | | | 3.6.4. + 9.1.8., molecular biology grade |
Quick blunting kit | New England Biolabs | E1201 | 4.1. |
Ammonium acetate | Sigma | A1542 | 3.1., other equivalent reagents may be used |
Magnesium acetate | Sigma | M5661 | 3.1., other equivalent reagents may be used |
0.5 M EDTA (pH 8.0) | VWR | MOLEM37465520 (or Promega V4231) | 2.2. + 3.1., other equivalent reagents may be used |
Agencourt AMPure XP beads | Beckman Coulter | A63881 | 5.3. + 6.5. |
Gel extraction kit | QIAGEN | 28704 | 5.7. + 7.1. + 8.4., other equivalent reagents may be used |
Concentrated T4 DNA ligase | New England Biolabs | M0202T | 6.1. + 8.1.2. |
Long Amp Taq 2X Master Mix | New England Biolabs | M0287S | 6.3. |
Phusion High-Fidelity PCR Master Mix with HF Buffer | New England Biolabs | M0531S | 5.1. + 6.6., other equivalent reagents may be used |
HindIII | New England Biolabs | R0104S | 5.4.1. |
SacII | New England Biolabs | R0157S | 5.4.2. |
AgeI | New England Biolabs | R0552S | 8.2.1. |
Tris base | Sigma | 93362 | 8.1.1. |
2 M MgCl2 | Sigma | 93362 | 8.1.1. |
dGTP, dATP, dCTP, dTTP | New England Biolabs | N0446S | 8.1.1. |
DTT | Sigma | DTT-RO | 8.1.1. |
PEG-8000 | Sigma | P5413 | 8.1.1. |
NAD | Sigma | N6522 | 8.1.1. |
T5 exonuclease | New England Biolabs | M0363S | 8.1.2. |
Phusion DNA polymerase | New England Biolabs | M0530S | 8.1.2. |
Taq DNA ligase | New England Biolabs | M0208L | 8.1.2. |
rSAP | New England Biolabs | M0371S | 8.3.1. |
TG1 competent cells | Lucigen | 60502-1 | 9.1. |
1 mm gap electroporation cuvettes | VWR | 732-2267 | 10.2. |
Bio-Assay Dish (Polystyrene, 245 mm x 245 mm x 25 mm) | Fisher Scientific | DIS-988-010M | 9.4. |
NaCl | Sigma | S7653 | 9.3. |
Bacto-tryptone | BD | 211705 | 9.3. |
Yeast extract | BD | 212750 | 9.3. |
Agar | Sigma | A1296 | 9.4. |
Glycerol | Sigma | G5516 | 9.17. |
MNase | New England Biolabs | M0247S | 1.1. |
Nanodrop | Thermo Fisher Scientific | ND-2000 | throughout |
Micropulser | Bio-Rad | 165-2100 | 10.2. |
Electroporation cuvettes | Bio-Rad | 732-2267 | 10.2. |
250 mL centrifuge tubes | Corning | 430776 | 9.1.-9.9. |