The CRISPR-Cas9 genome editing system is an easy-to-use genome editor that has been used in model and non-model species. Here we present a protein-based version of this system that was used to introduce a premature stop codon into a mating gene of a non-model filamentous ascomycete fungus.
The CRISPR-Cas9 genome editing system is a molecular tool that can be used to introduce precise changes into the genomes of model and non-model species alike. This technology can be used for a variety of genome editing approaches, from gene knockouts and knockins to more specific changes like the introduction of a few nucleotides at a targeted location. Genome editing can be used for a multitude of applications, including the partial functional characterization of genes, the production of transgenic organisms and the development of diagnostic tools. Compared to previously available gene editing strategies, the CRISPR-Cas9 system has been shown to be easy to establish in new species and boasts high efficiency and specificity. The primary reason for this is that the editing tool uses an RNA molecule to target the gene or sequence of interest, making target molecule design straightforward, given that standard base pairing rules can be exploited. As with other genome editing systems, CRISPR-Cas9-based methods also require efficient and effective transformation protocols as well as access to good quality sequence data for the design of the targeting RNA and DNA molecules. Since the introduction of this system in 2013, it has been used to genetically engineer a variety of model species, including Saccharomyces cerevisiae, Arabidopsis thaliana, Drosophila melanogaster and Mus musculus. Subsequently, researchers working on non-model species have taken advantage of the system and used it for the study of genes involved in processes as diverse as secondary metabolism in fungi, nematode growth and disease resistance in plants, among many others. The protocol detailed below describes the use of the CRISPR-Cas9 genome editing system for the truncation of a gene involved in the sexual cycle of Huntiella omanensis, a filamentous ascomycete fungus belonging to the Ceratocystidaceae family.
The increasing availability of high quality, fully assembled genomes and transcriptomes has greatly improved the ability to study a wide variety of biological processes in an array of organisms1. This is true of both model and non-model species, many of which may offer a more diverse understanding of biological processes. These kinds of data can be used for gene discovery, the identification of transcription networks and both whole genome and transcriptome comparisons, each of which comes with its own set of applications. However, while genes are being predicted, annotated and putatively linked to different functional pathways at an unprecedented rate, the functional characterization of these genes lags behind, limited by the molecular toolkits available for many species. This is particularly the case for non-model species, where genomic data are relatively easy to generate but where further molecular characterization has been nearly impossible1,2.
Partial characterization of the functions of specific genes important to the biology of fungal species can be achieved by either knockout or knockin experiments followed by phenotypic analysis of the mutant strains3. These two systems rely entirely on the availability of genetic engineering protocols, including, at minimum, a transformation system and a genetic editing system. There are a number of different transformation systems that have been developed in a variety of filamentous fungi4. Physical systems like those that rely on biolistics and electroporation have been developed in Trichoderma harzianum5 and Aspergillus niger6, respectively. Systems that utilize chemicals such as calcium chloride or lithium acetate have been developed in Neurospora crassa7. Lastly, biological systems that rely on the use of Agrobacterium tumefaciens for transformation have been successfully used in Ceratocystis albifundus8.
In contrast to the availability of different transformation protocols, genome editing systems are less abundant. Many of the traditional functional characterization experiments conducted in filamentous fungi utilized a split marker knockout construct in the form of a selectable marker flanked by regions of homology to the target region or gene in the genome3. The method relies on homology-directed DNA repair (HDR), which facilitates homologous recombination between the knockout construct and the region of interest. This recombination event results in the replacement of the gene of interest with the sequence of the selectable marker. Unfortunately, while this was successful in many species including Cercospora nicotianae10, Aspergillus fumigatus11 and Grosmannia clavigera12, the rates of homologous recombination are highly variable across different fungal species, making this an inefficient and sometimes unusable protocol in certain species3.
Other genome editing systems, including those that make use of zinc-finger nucleases (ZFNs) and transcription activator-like effector nucleases (TALENs) represented a great improvement on the older systems, particularly given their abilities to make specific and targeted changes13. Both ZFNs and TALENs are comprised of a nuclease protein and a protein that is capable of recognizing specific nucleotide sequences13. Upon recognition, the nuclease induces a double stranded DNA break that can facilitate the introduction of specific mutations. In order to bring about genome changes, the protein region that recognizes the nucleotide sequence needs to be specifically designed for each experiment. Because of this reliance on protein-nucleic acid interactions to guide the editing, designing and producing the targeting molecules for each knockout or knockin experiment is difficult and labor intensive14,15. Illustrative of these challenges, very few filamentous fungi have been subjected to genome editing using these systems. One example is the TALENs-based system that was developed in the rice blast fungus, Magnaporthe oryzae16.
Arguably the greatest revolution in the field of genome editing was the discovery and subsequent development of the CRISPR-Cas9 system, a genome editor that allows for the targeted cleavage of a sequence of interest by an endonuclease that is guided by an RNA molecule. This was a major improvement on the previously developed genome editors that relied on protein-nucleic acid interactions: because the CRISPR-Cas9 system relies on an RNA-DNA interaction to target the region of interest, standard base pairing rules can be exploited when designing each experiment15.
The CRISPR-Cas9 system as detailed here is comprised of three major components: a single guide RNA (sgRNA), the Cas9 enzyme and a donor DNA (dDNA)17. The sgRNA is composed of a 20 nucleotide region called the protospacer as well as a longer region called the scaffold18. The protospacer region is used to guide the editing system to the target region and is thus redesigned for each experiment. The scaffold is the region of RNA that physically binds to the Cas9 enzyme to form the ribonucleoprotein (RNP) and is thus identical regardless of the region to be targeted. The Cas9 enzyme physically facilitates the cleavage of the target DNA, using the protospacer as a guide to identify this region19. The last component, the dDNA, is optional and its use depends on the particular experiment20. The dDNA harbors the sequence that should specifically be inserted into the region being cut by the Cas9 enzyme, and is thus ideal for gene knockin experiments where a gene is being introduced into the genome or for gene knockout experiments where an antibiotic resistance gene or other selectable marker is being introduced to replace the gene of interest. The dDNA can also be designed in such a way as to introduce novel sequences into the genome. For example, as detailed below, it is possible to introduce an in-frame stop codon into a particular region in the gene of interest when a gene truncation is required21. Other applications include the mutation of specific regions of the gene, such as a functional domain22, or the introduction of a tagging sequence23.
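As a minimal sketch of how candidate protospacers might be enumerated, the Python function below scans both strands of a target sequence for 20-nucleotide windows that are immediately followed by an NGG motif. The NGG protospacer adjacent motif is an assumption based on the Streptococcus pyogenes Cas9 and is not stated in the text; the function names and the toy sequence are hypothetical. Candidates found this way would still need to be checked for specificity and correct folding.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_protospacers(target: str, length: int = 20):
    """List (strand, start, protospacer, pam) for every NGG-adjacent window.

    `start` is the 0-based position of the protospacer on the scanned strand.
    The NGG motif is an assumption (S. pyogenes Cas9), not part of the protocol text.
    """
    hits = []
    strands = (("+", target.upper()), ("-", reverse_complement(target)))
    for strand, seq in strands:
        # Stop early enough that the 3-nt motif after the window still fits.
        for i in range(len(seq) - length - 2):
            pam = seq[i + length:i + length + 3]
            if pam[1:] == "GG":
                hits.append((strand, i, seq[i:i + length], pam))
    return hits


# Hypothetical toy target: one 20-nt window followed by TGG on the top strand.
toy_target = "A" * 20 + "TGG" + "CCCCC"
for hit in find_protospacers(toy_target):
    print(hit)
```

Only exact NGG motifs are reported here; ranking the candidates is left to the specificity and secondary structure criteria described for Figure 1.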
A major benefit of using the CRISPR-Cas9 system is its versatility24. One example of this adaptability is that the Cas9 enzyme can be introduced into the host cell in one of three forms (DNA, RNA or protein), depending on the particular transformation system being used. When introduced in DNA form, the cas9 gene is often included on a plasmid along with a selectable marker, a cassette to express the sgRNA and, if necessary, a cassette encoding the dDNA sequence25. The primary advantage of this system is that only a single construct needs to be transformed into the cell and successful transformation ensures that all the necessary components for CRISPR-Cas9-mediated genome editing are present. However, this method relies on the availability of an expression system for the host species. For Cas9 to successfully induce DNA damage, it needs to be expressed at high levels, and thus a suitable and potentially species-specific promoter is required. For non-model species where such promoters have not yet been developed, this may be a detracting factor and thus the ability to introduce Cas9 in RNA or protein form may be a more attractive option. The introduction of RNA into the cell brings its own challenges, particularly in that RNA is unstable and may not survive the transformation process. Furthermore, when introduced in either DNA or RNA form, the cas9 gene sequence may need to be codon-optimized for use in the particular host system17. For example, the cas9 gene from Streptococcus pyogenes may not work in a mammalian host cell and a cas9 gene that has been codon-optimized for use in a mammalian cell may not work in a plant cell. All of these challenges can be overcome using the protein form of Cas9, which, together with the sgRNA, can be assembled into an RNP and transformed into the host cell26,27. This system does not rely on any endogenous expression system or codon optimization and should thus work in the majority of non-model species.
The disadvantage of the protein-based system is that it is not compatible with DNA-based transformation systems like Agrobacterium-mediated transfer. Thus, for the protein-based method to work, a transformation protocol such as those that rely on protoplasts or biolistics needs to be available. This RNP-based system has been successfully used in the filamentous fungi, Fusarium oxysporum26 and Mucor circinelloides27.
Huntiella omanensis, a member of the Ceratocystidaceae family, is a cosmopolitan fungus often found on freshly wounded woody plants28. While high quality genome and transcriptome data are available for this species28,29,30, no transformation or genome editing protocols have been developed. To date, research on H. omanensis has focused on the underlying genetic components of its sexual cycle29,31. This fungus exhibits a typical heterothallic sexual cycle, with sexual reproduction occurring exclusively between isolates of the MAT1-1 and MAT1-2 mating types31. In contrast, MAT1-2 isolates of the closely related Huntiella moniliformis are capable of independent sexual reproduction and complete a sexual cycle in the absence of a MAT1-1 partner31. This difference in sexual capabilities is thought to be, at least partly, due to a major difference in the mating gene, MAT1-2-7, where H. omanensis harbors a full length and intact copy, while the gene is severely truncated in H. moniliformis29,31. To further characterize the role of this gene in sexual reproduction, the MAT1-2-7 gene of H. omanensis was truncated to mimic the truncation seen in H. moniliformis21.
The protocol below details the transformation of H. omanensis and the truncation of the MAT1-2-7 gene using a protein-based version of the CRISPR-Cas9 genome editing system. This protocol was developed after the approaches of homologous recombination-based gene replacement and plasmid-based CRISPR-Cas9 genome editing were unsuccessful.
1. Design and synthesis of the sgRNA
2. Testing the in vitro cleavage ability of the sgRNA
NOTE: This step is optional but is recommended.
3. Design and synthesis of the dDNA
4. Extraction of protoplasts
5. Protoplast and PEG-assisted transformation and transformant recovery
6. Confirmation of the integration and stability of the dDNA
7. Phenotypic analysis of the mutant strains
The protocol described above facilitated the introduction of a premature stop codon into a mating gene from the non-model ascomycete, H. omanensis. This process utilized a protein-based version of the CRISPR-Cas9 genome editing system and, as such, one of the most important steps in this protocol is the design and synthesis of a high quality sgRNA. Figure 1 shows how this molecule was designed in such a way that it A) specifically targets the gene of interest and shows little similarity to other regions in the genome and B) folds correctly in order to bind with the Cas9 protein. The sgRNA must also be capable of directing effective cleavage of the target region. This ability was tested in vitro, and cleavage yielded two products of the expected sizes.
Once successful transformation has taken place, it is important to ensure that the dDNA has integrated into the genome only once and in the expected place. Figure 2 illustrates the design of PCR primers that target the insertion sites, which can be used to screen the potential transformants for the correct integration site. By designing primers that flank the 5’ and 3’ insertion sites, amplification is only possible if the dDNA is inserted at the correct region. Figure 4 illustrates that the premature stop codon was introduced into the MAT1-2-7 gene in the correct reading frame, ensuring that the gene would be truncated in a similar manner to that of H. moniliformis. Furthermore, Southern blot analysis showed that the dDNA construct was only integrated at a single site in the genome.
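The logic of the junction screen can be illustrated with a small in silico PCR, given here as a sketch under stated assumptions rather than as the screening method itself. One primer sits in the genomic flank and its partner sits inside the inserted cassette, so a product is only predicted when the cassette is present at that locus. All sequences, primer positions and function names are hypothetical, and only exact primer matches are considered.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_length(template: str, fwd: str, rev: str):
    """Return the predicted product length for a primer pair, or None.

    `fwd` must match the top strand exactly and the reverse complement of
    `rev` must occur downstream of it; otherwise no product is predicted.
    """
    template = template.upper()
    start = template.find(fwd.upper())
    if start == -1:
        return None
    rev_site = reverse_complement(rev)
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return end + len(rev_site) - start


# Hypothetical toy sequences, not taken from the H. omanensis genome.
upstream = "ATGCCGTAAGCTTGACCTGA"
gene = "GGGTTTCCCAAAGGGTTTCC"
downstream = "CTAGGATCCGTTAACGGCAT"
cassette = "TCGATCGGCTAGCTAACGTTAGCCA"

wildtype = upstream + gene + downstream
edited = upstream + cassette + downstream

# 5' junction pair: forward primer in the flank, reverse primer in the cassette.
fwd_5 = upstream[:8]
rev_5 = reverse_complement(cassette[10:18])

print("wildtype:", amplicon_length(wildtype, fwd_5, rev_5))  # no cassette, no product
print("edited:  ", amplicon_length(edited, fwd_5, rev_5))
```

A matching pair for the 3’ junction would follow the same pattern, with the forward primer inside the cassette and the reverse primer in the downstream flank.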
The success of the protocol was confirmed upon the phenotypic analysis of the mutant strains. In the case of the MAT1-2-7 disruption experiment, two independent mutant strains were developed. In both isolates, the vegetative radial growth rate was significantly reduced, suggesting a pleiotropic effect of the novel mating gene (Figure 5). Furthermore, the mutant isolates were incapable of completing a sexual cycle, producing only immature sexual structures that did not produce sexual spores (Figure 5). This was in contrast to wildtype isolates, which completed the entire sexual cycle within a few days of incubation (Figure 5).
Figure 1: Choosing a suitable sgRNA candidate.
(A) A suitable sgRNA will only have similarity to the target region of the genome (in this case indicated by the MAT locus sequence). (B) A suitable sgRNA will have identical minimum free energy and centroid secondary structures, with three stem loops and five rings in the primary stem loop. Furthermore, the majority of the structure will have high binding probabilities (indicated in dark orange and red) while lower binding probabilities should be seen at the protospacer region (indicated by the black triangles).
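The specificity criterion in panel (A) can be approximated with a simple mismatch scan, sketched below in Python. The function counts every window on either strand of a sequence that differs from the protospacer by no more than a chosen number of mismatches; a count of one at zero mismatches suggests a unique target. The mismatch threshold, the function names and the toy sequences are assumptions for illustration, and this brute-force scan is not a substitute for a genome-wide similarity search.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_near_matches(protospacer: str, genome: str, max_mismatches: int) -> int:
    """Count windows on either strand within `max_mismatches` of the protospacer."""
    protospacer = protospacer.upper()
    n = len(protospacer)
    total = 0
    for seq in (genome.upper(), reverse_complement(genome)):
        for i in range(len(seq) - n + 1):
            window = seq[i:i + n]
            # Hamming distance between the protospacer and this window.
            mismatches = sum(a != b for a, b in zip(protospacer, window))
            if mismatches <= max_mismatches:
                total += 1
    return total


# Hypothetical toy data: one exact site and one site with a single mismatch.
toy_protospacer = "ACGGCAGCCA"
toy_genome = "TTTT" + "ACGGCAGCCA" + "TTTT" + "ACGGTAGCCA" + "TTTT"

print(count_near_matches(toy_protospacer, toy_genome, 0))
print(count_near_matches(toy_protospacer, toy_genome, 1))
```

Raising the threshold reveals near matches that an exact search would miss, which is the kind of similarity a suitable sgRNA should lack outside the target region.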
Figure 2: Design, amplification and assembly of the dDNA.
The first and second primer pairs (PP1 and PP2) are used to amplify approximately 800 bp upstream (5’) and 800 bp downstream (3’) of the gene of interest. The reverse primer of PP1 and the forward primer of PP2 include regions of homology to the hygromycin resistance cassette. The third primer pair amplifies the entire hygromycin resistance cassette. In a stepwise manner, the various amplicons are assembled until the entire dDNA, comprised of the 5’ region, the hygromycin resistance cassette and 3’ region, is assembled. When transformed into the cell, the dDNA should recombine at the region where the Cas9 enzyme will have been directed to cut, thereby replacing the gene of interest with the hygromycin resistance cassette. PP4 and PP5 can be used to determine if the dDNA has been correctly inserted into the genome at the appropriate location.
Figure 3: The different cell types important during the protoplast extraction protocol.
(A) Conidia are used as the starting material for the protocol. These conidia are allowed to germinate and grow until they are (B) young germlings. The ideal growth phase of the young germlings is indicated by the two black arrows. Other mycelial strands seen in (B) are too mature for degradation and should not be used. The final step of the protocol is the release of the (C) round protoplasts, indicated by the black, dotted circles. These cells no longer have cell walls and are thus very sensitive to mechanical disruption.
Figure 4: The successful integration of the TGA stop codon into the MAT1-2-7 gene of H. omanensis.
(A) The full-length H. omanensis MAT1-2-7 gene, with the sgRNA target site indicated by the green arrow. (B) A magnified schematic of the sgRNA target site within the H. omanensis MAT1-2-7 gene. (C) A magnified schematic of a region of the dDNA showing the stop codon flanked by arms homologous to the MAT1-2-7 gene of H. omanensis. (D) Sanger sequence chromatogram indicating the successful integration of the stop codon into the MAT1-2-7 gene. Modified from Wilson et al. (2020)21.
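Whether an introduced stop codon actually truncates the gene depends on the reading frame, which can be checked with a few lines of Python. The sketch below reports the first in-frame stop codon of a coding sequence; the function name and the toy sequences are hypothetical and are not the MAT1-2-7 sequence. Inserting TGA on a codon boundary shortens the reading frame, whereas the same insertion one nucleotide later shifts the frame without creating an in-frame stop at that position.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(cds: str):
    """Return the 0-based codon index of the first in-frame stop, or None."""
    cds = cds.upper()
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Hypothetical toy coding sequence: ATG GCT GAA CGT TTG TAA
toy_cds = "ATGGCTGAACGTTTGTAA"

# TGA inserted on a codon boundary (after codon 2) versus one nucleotide later.
in_frame = toy_cds[:6] + "TGA" + toy_cds[6:]
out_of_frame = toy_cds[:7] + "TGA" + toy_cds[7:]

print(first_in_frame_stop(toy_cds))       # native stop at the end
print(first_in_frame_stop(in_frame))      # premature stop
print(first_in_frame_stop(out_of_frame))  # no premature stop at the insertion
```

Note that the toy sequence already contains an out-of-frame TGA (spanning codons 2 and 3), which the function correctly ignores.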
Figure 5: The phenotypic differences between (A) wildtype isolates and (B) mutant isolates.
The first three images in each panel show the differences in the sexual capabilities of the two isolate types. While the wildtype isolates form mature ascomata during sexual reproduction, complete with the exudation of spores from the tips of the ascomatal necks, the mutant isolates form only immature sexual structures that do not produce any sexual spores. The fourth image in each panel shows the difference in growth rate and morphology of the two isolate types. While the wildtype isolate grows much faster and with more aerial mycelia, the mutant grows more slowly and is submerged within the agar. Modified from Wilson et al. (2020)21.
Reaction | Enzyme Concentration | Degradation Time |
A | 1.250 mg/mL | 180 min |
B | 1.875 mg/mL | 180 min |
C | 2.500 mg/mL | 150 min |
D | 3.750 mg/mL | 150 min |
E | 4.375 mg/mL | 120 min |
F | 5.000 mg/mL | 120 min |
Table 1: Degradation of the germling/mycelia solution with lysing enzymes from Trichoderma harzianum. The different enzyme concentrations correspond to different incubation periods, with lower concentrations requiring longer incubations.
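For bench planning, Table 1 can be captured as a small lookup, sketched here in Python. The concentrations and times are copied from the table; the reaction volume passed to the function is a hypothetical input, since the table does not specify one, and the function names are illustrative only.

```python
# Table 1 as data: reaction -> (enzyme concentration in mg/mL, degradation time in min)
TABLE_1 = {
    "A": (1.250, 180),
    "B": (1.875, 180),
    "C": (2.500, 150),
    "D": (3.750, 150),
    "E": (4.375, 120),
    "F": (5.000, 120),
}


def enzyme_mass_mg(reaction: str, volume_ml: float) -> float:
    """Mass of lysing enzymes (mg) needed for a given reaction volume (mL)."""
    concentration, _minutes = TABLE_1[reaction]
    return concentration * volume_ml


def degradation_time_min(reaction: str) -> int:
    """Degradation time (min) listed for a reaction in Table 1."""
    return TABLE_1[reaction][1]


# Example with a hypothetical 10 mL reaction volume.
for name in TABLE_1:
    print(name, enzyme_mass_mg(name, 10), "mg,", degradation_time_min(name), "min")
```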
The protocol for the successful transformation of H. omanensis and editing of the MAT1-2-7 gene was demonstrated by introducing an in-frame premature stop codon along with a gene for resistance to hygromycin B21. This was achieved using a protein-based version of the CRISPR-Cas9 genome editing system. The experiment entailed the in vitro transcription of the sgRNA, PCR-based assembly of the dDNA and the co-transformation of these two nucleic acids with a commercially available Cas9 enzyme into protoplasts extracted from H. omanensis.
Unlike other protocols that rely on the availability of many other molecular tools, the protocol described above can be successfully used in species for which the molecular toolbox is still fairly limited21. The protocol relies only on an established transformation system and the availability of NGS data, preferably whole genome sequence. While an effective transformation system may take some optimizing in a species for which this is not available, there are many different protocols available for a variety of species. Furthermore, genome data is becoming increasingly available for even the most obscure of species and is becoming easier to generate de novo if it does not already exist.
Given the length of the protocol, there are many steps at which modifications can be introduced and where troubleshooting may be necessary. This is particularly true of the steps that are considered species specific. For example, there are many incubation steps in this protocol that need to be conducted at specific temperatures and for specific lengths of time in order to generate cell types important for the experiment. These steps would thus require species-specific optimization. Where possible, micrographs of the particular cells or growth phases have been provided to assist in transferring this protocol to a different species (Figure 3). The type and concentration of enzymes used to degrade the cell walls of the fungal cells in order to release the protoplasts will also be specific to the species of fungus being studied. In this protocol, only one source of lysing enzymes, from Trichoderma harzianum, is used, while different enzyme combinations are required for the extraction of protoplasts in species like Fusarium verticillioides33. This step depends entirely on the chemical makeup of the cell wall and will thus need to be optimized on a species-by-species basis.
This method is particularly significant to those studying non-model species as there is no reliance on an expression system. A popular method of establishing the CRISPR-Cas9 genome editing system is to express the Cas9 protein, the sgRNA and the dDNA from one or two plasmids that are transformed into the cells of choice. In this case, Cas9 needs to be expressed from a promoter that is capable of high levels of expression in the particular organism being studied. General promoters have been developed for use in filamentous fungi and, while they are not compatible with all species, they do allow for low level expression and can successfully be used to express, for example, antibiotic resistance genes. These promoters, however, often do not allow for high levels of expression and thus cannot be used to express the Cas9 protein. Using a protein-based version of the CRISPR-Cas9 genome editing system overcomes this limitation and allows the sgRNA and dDNA to be co-transformed into the cell with an already produced Cas9 enzyme.
The development of this protein-based system for use in H. omanensis came after many unsuccessful attempts at genome editing using both the classical split marker approach as well as the plasmid-based CRISPR-Cas9 system. While efficiencies differ from species to species, the split marker approach has been successfully used with 100% efficiency in species as diverse as Alternaria alternata34,35, and C. nicotianae36. In contrast, the efficiency of this system in H. omanensis was zero, despite more than 80 independent transformation and integration events. Similarly, the plasmid-based CRISPR-Cas9 system has been successfully used with high efficiencies in Trichoderma reesei (>93%)17 and Penicillium chrysogenum (up to 100%)37. This is, again, in contrast to this system’s usefulness in H. omanensis. Sufficient expression of the Cas9 protein was not attainable in H. omanensis despite trying a number of potential promoters, including two species-specific promoters predicted from housekeeping genes. Thus, this system could not be used at all. Using the protein-based version of the CRISPR-Cas9 system, however, yielded many independent transformants, two of which harbored the integrated dDNA in the correct location. Furthermore, this experiment was attempted only once and was successful, further illustrating the ease with which this system can be used.
Future applications of this protocol include its optimization and use in other species of the Ceratocystidaceae. There is already a wealth of NGS data available for these species30,38,39 and studies regarding their host specificity40, growth rate and virulence41 have been conducted. These studies can be strengthened by the functional characterization of the genes that are thought to be involved in these processes, research which will now become possible due to the availability of a transformation and genome editing protocol.
In conclusion, thorough investigation into the genes underlying important biological processes in non-model species is becoming more accessible thanks to the availability of easy-to-use genome editing protocols that do not rely on the existence of extensive biological resources and molecular toolkits. Studying non-model species is becoming easier and will allow for the discovery of novel pathways and interesting deviations from the standard biological processes that have been elucidated in model species.
The authors have nothing to disclose.
This project was supported by the University of Pretoria, the Department of Science and Technology (DST)/National Research Foundation (NRF) Centre of Excellence in Tree Health Biotechnology (CTHB). The project was additionally supported by Prof BD Wingfield’s DST/NRF SARChI chair in Fungal Genomics (Grant number: 98353) as well as Dr AM Wilson’s NRF PhD bursary (108548). The grant holders acknowledge that opinions, findings and conclusions or recommendations expressed in this piece of work are that of the researchers and that the funding bodies accept no liability whatsoever in this regard.
EcoRI-HF | New England Biolabs, Ipswich, USA | R3101S | |
EnGen Spy Cas9 NLS protein | New England Biolabs, Ipswich, USA | M0646T | Used to assemble the RNP |
Eppendorf 5810 R centrifuge | Eppendorf, Hamburg, Germany | ||
FastStart Taq DNA Polymerase | Sigma, St Louis, USA | 12032902001 | Standard DNA polymerase |
GeneJET Gel Extraction Kit | ThermoFisher Scientific, Waltham, USA | K0691 | |
HindIII-HF | New England Biolabs, Ipswich, USA | R3104S | |
HiScribe T7 Quick High Yield RNA Synthesis Kit | New England Biolabs, Ipswich, USA | E2050S | |
Hygromycin B from Streptomyces hygroscopicus | Sigma, St Louis, USA | 10843555001 | |
Infors HT Ecotron Shaking Incubator | Infors AG, Bottmingen, Switzerland | ||
LongAmp Taq DNA Polymerase | New England Biolabs, Ipswich, USA | M0323S | Long-range, high-fidelity DNA polymerase |
Malt extract agar, 2% (MEA) | 20 g ME and 20 g agar in 1 L ddH2O | ||
Malt extract | Sigma, St Louis, USA | 70167-500G | |
Agar | Sigma, St Louis, USA | A5306 | |
Malt extract broth, 1% (MEB) | Sigma, St Louis, USA | 70167-500G | 2 g ME in 200 mL ddH2O |
Malt extract broth, 2% (MEB) | Sigma, St Louis, USA | 70167-500G | 4 g ME in 200 mL ddH2O |
Miracloth | Merck Millipore, New Jersey, USA | 475855 | |
Nylon membrane (positively charged) | Sigma, St Louis, USA | 11209299001 | |
Osmotic control medium (OCM) | 0.3% yeast extract, 20% sucrose, 0.3% casein hydrolysate | ||
Casein Hydrolysate | Sigma, St Louis, USA | 22090 | |
Sucrose | Sigma, St Louis, USA | 84097 | |
Yeast extract | Sigma, St Louis, USA | Y1625 | |
Osmotic control medium (OCM) agar | Osmotic control medium (OCM) + 1% agar | ||
Agar | Sigma, St Louis, USA | A5306 | |
PCR DIG Labeling Mix | Sigma, St Louis, USA | 11585550910 | |
Phusion High-Fidelity DNA Polymerase | ThermoFisher Scientific, Waltham, USA | F-530XL | High fidelity DNA polymerase |
Plasmid pcb1004 | N/A | N/A | From: Carroll et al., 1994 |
Presynthesized sgRNA | Inqaba Biotec, Pretoria, South Africa | Ordered as a synthesized dsDNA with specified sequence | |
Proteinase K | Sigma, St Louis, USA | P2308 | |
PTC Solution | 30% polyethylene glycol 8000 in STC buffer from above | ||
Polyethylene glycol 8000 | Sigma, St Louis, USA | 1546605 | |
RNase A | ThermoFisher Scientific, Waltham, USA | 12091021 | |
RNAfold Webserver | Institute for Theoretical Chemistry, University of Vienna | N/A | http://rna.tbi.univie.ac.at/cgi-bin/RNAWebSuite/RNAfold.cgi |
RNAstructure | Mathews Lab | N/A | https://rna.urmc.rochester.edu/RNAstructureWeb/Servers/Predict1/Predict1.html |
Sorbitol, 1 M | Sigma, St Louis, USA | 1617000 | 182.17 g sorbitol in 1 L ddH2O |
STC Buffer | 20% sucrose, 50 mM Tris-HCl pH 8.00 and 50 mM CaCl2 | ||
Calcium chloride | Sigma, St Louis, USA | 429759 | |
Tris-HCl pH 8.00 | Sigma, St Louis, USA | 10812846001 | |
Sucrose | Sigma, St Louis, USA | 84097 | |
Trichoderma harzianum lysing enzymes | Sigma, St Louis, USA | L1412 | |
Zeiss Axioskop 2 Plus Ergonomic Trinocular Microscope | Zeiss, Oberkochen, Germany |