This report describes a bioengineering method to design and construct novel artificial splicing factors, referred to here as Engineered Splicing Factors (ESFs), that specifically modulate the splicing of target genes in mammalian cells. This method can be further expanded to engineer various artificial factors to manipulate other aspects of RNA metabolism.
The processing of most eukaryotic RNAs is mediated by RNA Binding Proteins (RBPs) with modular configurations that include an RNA recognition module, which specifically binds the pre-mRNA target, and an effector domain. Previously, we have taken advantage of the unique RNA binding mode of the PUF domain in human Pumilio 1 to generate a programmable RNA binding scaffold, which was used to engineer various artificial RBPs to manipulate RNA metabolism. Here, a detailed protocol is described to construct Engineered Splicing Factors (ESFs) that are specifically designed to modulate the alternative splicing of target genes. The protocol includes how to design and construct a customized PUF scaffold for a specific RNA target, how to construct an ESF expression plasmid by fusing a designer PUF domain and an effector domain, and how to use ESFs to manipulate the splicing of target genes. The representative results also describe common assays of ESF activity using splicing reporters, the application of ESFs in cultured human cells, and the downstream effects of the splicing changes. By following the detailed protocols in this report, it is possible to design and generate ESFs for the regulation of different types of Alternative Splicing (AS), providing a new strategy to study splicing regulation and the function of different splicing isoforms. Moreover, by fusing different functional domains with a designed PUF domain, researchers can engineer artificial factors that target specific RNAs to manipulate various steps of RNA processing.
Most human genes undergo Alternative Splicing (AS) to produce multiple isoforms with distinct activities, which has greatly increased the coding complexity of the genome1,2. AS provides a major mechanism to regulate gene function, and it is tightly regulated through diverse pathways in different cellular and developmental stages3,4. Because splicing misregulation is a common cause of human disease5,6,7,8, targeting splicing regulation is becoming an attractive therapeutic route.
According to a simplified model of splicing regulation, AS is mainly controlled by Splicing Regulatory cis-Elements (SREs) in pre-mRNA that function as splicing enhancers or silencers of alternative exons. These SREs specifically recruit various trans-acting protein factors (i.e. splicing factors) that promote or suppress the splicing reaction3,9. Most trans-acting splicing factors have separate sequence-specific RNA binding domains to recognize their targets and effector domains to control splicing. The best-known examples are the members of the serine/arginine-rich (SR) protein family that contain N-terminal RNA Recognition Motifs (RRMs), which bind exonic splicing enhancers, and C-terminal RS domains, which promote exon inclusion10. Conversely, hnRNP A1 binds to exonic splicing silencers through the RRM domains and inhibits exon inclusion through a C-terminal glycine-rich domain11. Using such modular configurations, researchers should be able to engineer artificial splicing factors by combining a specific RNA-Binding Domain (RBD) with different effector domains that activate or inhibit splicing.
The key to such a design is an RBD that recognizes given targets with programmable RNA binding specificity, analogous to the DNA-binding mode of the TALE domain. However, most native splicing factors contain RRM or K Homology (KH) domains, which recognize short RNA elements with weak affinity and thus lack a predictive RNA-protein recognition "code"12. The RBD of PUF repeat proteins (i.e. the PUF domain) has a unique RNA recognition mode, allowing for the redesign of PUF domains to specifically recognize different RNA targets13,14. The canonical PUF domain contains eight repeats of three α-helices, each recognizing a single base in an 8-nt RNA target. The side chains of amino acids at certain positions of the second α-helix form specific hydrogen bonds with the Watson-Crick edge of the RNA base, which determines the RNA binding specificity of each repeat (Figure 1A). The code for RNA base recognition of the PUF repeat is surprisingly simple (Figure 1A), allowing for the generation of PUF domains that recognize any possible 8-base combination (reviewed by Wei and Wang15).
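The one-repeat-per-base logic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' design tool. The residue pairs in `ASSUMED_PUF_CODE` are taken from commonly reported PUF recognition codes rather than from this report, so Figure 1A remains the authoritative source. The antiparallel pairing (repeat 8 reads base 1, repeat 1 reads base 8) is background knowledge about PUF-RNA complexes. The function name `assign_repeats` is hypothetical.

```python
# Illustrative sketch only. The residue pairs below are an ASSUMPTION based on
# commonly reported PUF recognition codes; consult Figure 1A for the code
# actually used in this protocol.
ASSUMED_PUF_CODE = {
    "A": ("C/S", "Q"),
    "U": ("N", "Q"),
    "G": ("S", "E"),
    "C": ("S", "R"),
}


def assign_repeats(target):
    """Pair each base of an 8-nt RNA target with the PUF repeat that reads it."""
    target = target.upper().replace("T", "U")
    if len(target) != 8 or set(target) - set("ACGU"):
        raise ValueError("target must be 8 nt of A, C, G, or U")
    plan = []
    for position, base in enumerate(target, start=1):
        # Antiparallel binding: repeat 8 reads base 1, repeat 1 reads base 8.
        repeat = 9 - position
        plan.append((repeat, position, base, ASSUMED_PUF_CODE[base]))
    return sorted(plan)  # ordered by repeat number


# Example with the 8-nt sequence used as the example target in Figure 1B.
for repeat, position, base, residues in assign_repeats("UGUAUAUA"):
    print(f"repeat {repeat}: base {position} ({base}) -> residues {residues}")
```

A listing like this is only a planning aid for choosing which repeats to mutate. Binding of any designed PUF domain still has to be confirmed experimentally.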
This modular design principle allows for the generation of an Engineered Splicing Factor (ESF) that consists of a customized PUF domain and a splicing modulation domain (i.e. an SR domain or a Gly-rich domain). These ESFs can function as either splicing activators or as inhibitors to control various types of splicing events, and they have proven useful as tools to manipulate the splicing of endogenous genes related to human disease16,17. As an example, we have constructed PUF-Gly-type ESFs to specifically alter the splicing of the Bcl-x gene, converting the anti-apoptotic long isoform (Bcl-xL) to the pro-apoptotic short isoform (Bcl-xS). Shifting the ratio of the Bcl-x isoform was sufficient to sensitize several cancer cells to multiple anti-cancer chemotherapy drugs16, suggesting that these artificial factors may be useful as potential therapeutic reagents.
In addition to controlling splicing with known splicing effector domains (e.g., an RS or Gly-rich domain), the engineered PUF factors can also be used to examine the activities of new splicing factors. For example, using this approach, we have demonstrated that the C-terminal domain of several SR proteins can activate or inhibit splicing when binding to different pre-mRNA regions18, that the alanine-rich motif of RBM4 can inhibit splicing19, and that the proline-rich motif of DAZAP1 can enhance splicing20,21. These new functional domains can be used to construct additional types of artificial factors to fine-tune splicing.
1. Construction of a PUF Scaffold with Customized RNA-binding Specificity by Overlapping PCR
2. Construction of a Functional Module of ESFs
3. Construction of ESF Expression Plasmids
4. Construction of the Splicing Reporter
5. Construction of Lentiviral Expression Vectors for ESF
6. Specifically Modulating Exon Inclusion and the Alternative Use of Splice Sites with ESFs
7. Use ESF to Modulate Endogenous Bcl-x Splicing and Measure Its Effects on Apoptosis
8. Measure the Apoptosis of Different Cancer Cells Expressing ESF
This report describes the complete protocol for the design and construction of ESFs and splicing reporters. It also outlines the further application of ESFs in manipulating the AS of endogenous genes16. To illustrate typical results of ESF-mediated splicing changes, we use the data from our previous work as an example. The ESFs with different functional domains can be used to promote or inhibit the inclusion of the target cassette exon (Figure 1D & E). ESFs can also affect the usage of alternative 5' and 3' splice sites in the reporter system (Figure 1F & G).
The alternative splicing of the endogenous gene can also be specifically regulated with designer ESFs. We have demonstrated this application by specifically targeting Bcl-x, which can be spliced into two antagonistic isoforms with alternative 5' splice sites. We designed an ESF, Gly-PUF(531), that recognizes an 8-nt RNA element between the alternative 5' splice sites. This Gly-PUF(531) specifically shifted the splicing towards the production of Bcl-xS (Figure 2A). After transfecting the Gly-PUF(531) into HeLa cells, the level of Bcl-xS isoforms and Bcl-xS proteins increased in a dose-dependent manner, whereas the control ESF, Gly-PUF(WT), did not affect the ratio of Bcl-xS to Bcl-xL (Figure 2B & C). In addition, the designer ESF can induce the cleavage of caspase 3 and poly (ADP-ribose) polymerase (PARP), two well-known molecular markers of apoptosis (Figure 2D). As expected, the designer ESFs are predominantly localized in the nuclei of transfected cells, as demonstrated by immunofluorescence microscopy (Figure 2E). Consistently, the splicing shift by Gly-PUF(531) caused the fragmentation of nuclear DNA, indicating that these cells are undergoing apoptosis (Figure 2E). The increase of apoptotic cells was further confirmed by examining more than 200 cells from randomly selected fields and by quantifying the percent of cells with fragmented nuclear DNA (Figure 2F).
Figure 1: Design of ESFs and Their Activity in Modulating Exon Skipping. (A) Specific binding between the PUF domain and RNA targets is illustrated with the RNA-PUF structure and a schematic diagram. The PUF binding code for each of the four RNA bases, shown on the right with different colors, is used to design PUF mutations. (B) Flow chart to obtain a customized PUF domain. The PUF that recognizes "UGUAUAUA" was used as an example. A 4-round PCR strategy is used to assemble a PUF scaffold with customized RNA-binding specificity (color coded similarly to panel A). In the first round, a series of PCR primers that incorporate the desired RNA-recognition codes for two adjacent PUF repeats are used to generate four fragments that include the eight RNA-recognition codes of a full PUF protein (R1/R2, R3/R4, R5/R6, and R7/R8). Cap fragments encoding an N-terminal nuclear localization signal, a C-terminal stop codon, and bridge fragments are also produced separately (5'-end and 3'-end cap, bridge 2/3, bridge 4/5, and bridge 6/7). In the second round, new templates are generated by mixing overlapping fragments encoding adjacent repeats with the appropriate bridge (e.g., mixing R1/R2, R3/R4, and bridge 2/3 generates the template for R1-4) and then extending with DNA polymerase to fill the gaps. Similarly, the third round joins R1-4, R5-8, and bridge 4/5. Finally, the fourth round adds the 5'-end and 3'-end caps of the PUF domain together with the cloning sites for subsequent cloning to expression vectors. (C) Modular domain organization of ESFs. ESFs are driven by CMV promoters (arrow) and encode, from the N- to the C-terminus: a FLAG epitope (for the detection of ESFs), a functional module (a Gly-rich domain or an RS domain), an NLS (facilitating the nuclear localization of ESFs), and an RNA-recognition domain (a PUF domain). The NheI and BamHI sites are used to insert a functional module, while the XbaI and SalI sites are used to insert an RNA-recognition domain.
(D) Gly-PUF ESFs are co-expressed with exon skipping reporters, and the splicing pattern is assayed by RT-PCR. The modified PUFa and PUFb specifically bind to 8-mer targets A and B, respectively (in the same colors). All combinations are used, so the PUF-target pairs of different color serve as the controls. The effects of RS-PUF on exon skipping (E), the competing 5' splice site (F), or the competing 3' splice site reporter (G) were assayed by methods similar to panel D. The data of the RT-PCR are from Wang et al.16.
Figure 2: Regulation of Endogenous Bcl-x pre-mRNA Splicing with ESFs. (A) Schematic of the alternative splicing of endogenous Bcl-x pre-mRNA. Two alternative 5' splice sites in exon 2 of Bcl-x are used to generate two isoforms of different sizes, Bcl-xL and Bcl-xS. The sequence UGUGCGUG between the two 5' splice sites is selected as the ESF target, and WT PUF repeats 1, 3, and 5 (Q867E/Q939E/C935S/Q1011E/C1007S) are reprogrammed (asterisks) to recognize this target sequence. The resulting ESF containing a Gly-rich domain inhibits the use of the downstream 5' ss (indicated by the red arrow). (B) Modulation of Bcl-x 5' ss usage. Different amounts of the Gly-PUF(531) expression construct are transfected into HeLa cells. Gly-PUF(WT) is used as a control. Two isoforms of Bcl-x are detected with RT-PCR using primers corresponding to exons 1 and 3 of the Bcl-x gene. The percentage of the Bcl-xS isoform is quantified and shown at the bottom. (C) ESFs affect the expression levels of Bcl-xL and Bcl-xS. Samples are loaded in the same order as in panel B, and all proteins are detected by Western blots. The expression of ESFs is detected by the anti-FLAG antibody, and the tubulin level is used as a control. (D) Different amounts of ESF expression constructs are transfected into HeLa cells, resulting in the cleavage of PARP and caspase 3. Samples are detected by Western blot 24 h after transfection. The actin level is detected as a control. (E) The subcellular localization of ESFs in transfected HeLa cells is detected by immunofluorescence microscopy with the anti-FLAG antibody. The cells are co-stained with DAPI to show the nuclei. Some nuclei, especially in cells transfected with Gly-PUF(531), are fragmented due to apoptosis. Scale bar: 5 µm. (F) The percentage of apoptotic cells (i.e. cells with fragmented nuclear DNA) is measured from randomly chosen fields of fluorescence microscopy images. The bars indicate the mean, while the dots indicate the data from the two experiments.
The figures are modified from our earlier report by Wang et al.16 in accordance with the policy of Nature Publishing Group.
This report provides a detailed description for the design and construction of artificial splicing factors that can specifically manipulate the alternative splicing of a target gene. This method takes advantage of the unique RNA binding mode of PUF repeats to produce an RNA-binding scaffold with customized specificity. It can be used to either activate or repress splicing.
The critical step in this protocol is the generation of the reprogrammed PUF domain that defines the specificity of ESFs. A PCR stitching protocol has been developed and optimized for the rapid generation of the PUF scaffold. The key to its success is to adjust the ratio of different overlapping templates to 1:1:1. The purification of the PCR products after each round is also critical, because unpurified products may have primer contamination from the last round. Another important step is to assay the splicing ratio using semi-quantitative RT-PCR. Generally, too many amplification cycles should be avoided, as they may saturate the PCR reaction. In our experiments with the co-expression of a splicing reporter, 20 – 25 cycles were routinely used, but this may vary depending on the abundance of the mRNA when measuring the splicing of endogenous genes. When quantifying the splicing isoforms using a new pair of primers, we suggest calibrating the PCR experiment each time, as previously described16.
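As a minimal sketch of how a splicing ratio might be computed once RT-PCR bands have been quantified, the snippet below converts two band intensities into the percentage of one isoform. The function name `percent_isoform` and all intensity values are hypothetical and are not data from this report. The calculation assumes the PCR was stopped in its linear range and applies no correction for amplicon length.

```python
def percent_isoform(short_intensity, long_intensity):
    """Return the percentage of the short isoform from two band intensities."""
    total = short_intensity + long_intensity
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return 100.0 * short_intensity / total


# Hypothetical band intensities (arbitrary units), NOT data from this report.
lanes = {
    "control ESF": (120.0, 880.0),
    "designer ESF, low dose": (300.0, 700.0),
    "designer ESF, high dose": (550.0, 450.0),
}

for name, (short_band, long_band) in lanes.items():
    print(f"{name}: {percent_isoform(short_band, long_band):.1f}% short isoform")
```

If the reaction is saturated, both bands approach the same plateau and the ratio is compressed toward 50%, which is one reason the cycle number needs to be calibrated for each new primer pair.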
A potential limitation of designer ESFs is their off-target effects, because the specificity is limited by the number of repeats in the PUF scaffold. Wild-type PUF recognizes an 8-nt site, which is comparable to the specificity of an siRNA that recognizes its target through a "seed match." However, since any given 8-nt sequence is expected to occur by chance about once in every 65,000 nt of sequence (4^8 = 65,536), the designer PUF will also recognize off-target transcripts. The off-target effect can be reduced by using PUFs with additional repeats; however, it is still useful to evaluate the specificity and off-target effects of ESFs. To minimize potential off-target effects, a combination of multiple designer ESFs may also be expressed at a lower level. In such a case, the off-target genes may not be affected by the low amount of each ESF, whereas the splicing of the real target will be affected by the multiple ESFs that function synergistically. This solution is similar to the one used in gene silencing with RNAi, where pooled siRNAs (each at a reduced concentration) that target multiple sites of a single mRNA can decrease the off-target effects.
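The arithmetic behind this estimate can be written out as a short sketch. It assumes a random sequence with independent, equally frequent bases, which real transcripts only approximate. The function names `expected_chance_sites` and `count_sites` are hypothetical helpers introduced for illustration.

```python
def expected_chance_sites(transcript_length, site_length=8):
    """Expected number of exact matches to one site in a random sequence,
    assuming independent, equally frequent bases."""
    windows = max(transcript_length - site_length + 1, 0)
    return windows / 4 ** site_length


def count_sites(sequence, site):
    """Count (possibly overlapping) exact occurrences of site in sequence."""
    sequence, site = sequence.upper(), site.upper()
    last_start = len(sequence) - len(site) + 1
    return sum(sequence.startswith(site, i) for i in range(last_start))


# Effect of recognition-site length on chance matches in a 65,536-nt sequence.
for n in (8, 9, 10, 16):
    print(n, 4 ** n, expected_chance_sites(65_536, n))
```

Each added repeat multiplies the number of possible sites by four, so the expected number of chance matches in a sequence of fixed length drops roughly fourfold per repeat. A helper such as `count_sites` could be run over candidate transcripts to list exact matches to a chosen target before an ESF is built.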
The other main method to manipulate AS is to use antisense oligonucleotides that pair with certain regions of the pre-mRNA. Compared to this existing method, the ESFs can cause prolonged effects in stably transfected cells. In addition, the in vivo delivery of ESF can take advantage of the increasing arsenal of gene therapy vectors, whereas the in vivo delivery of antisense oligonucleotides is very hard to control. In addition, this method can avoid complicated and costly modifications of oligonucleotides. Using various inducible promoters, a more precise control of the ESF expression in the correct cell types and at the correct times may also be achieved. The main disadvantage of this method is the relatively low specificity (an 8-nt recognition site versus a 16- to 20-nt recognition site in typical antisense oligonucleotides).
The dysregulation of alternative splicing causes many diseases, including cancer26,27. Genome-wide studies have revealed more than 15,000 tumor-associated splice variants in various types of cancers28,29,30. For instance, intronic splice-site mutations of tumor-suppressor genes often cause exon-skipping events and produce aberrant proteins that may contribute to tumorigenesis26,31,32. Moreover, some splicing factors are found to be overexpressed in many cancer types, which contributes to cell transformation33,34, indicating that splicing factors can also play important roles in cancer development. Therefore, in addition to providing a useful tool to modulate gene function, manipulation of splicing with designer ESFs may restore misregulated splicing events in cancer, thus providing a potential therapeutic tool. In addition, by fusing a designer PUF domain with different functional domains, multiple artificial factors that manipulate various RNA metabolism processes can be designed. For example, the fusion of a translational activator (GLD2) or a translational repressor (CAF1) with the PUF domain produced novel factors that can activate or inhibit mRNA translation27. Using the same design principle, we combined a non-specific RNA endonuclease domain (PIN) with a series of designer PUF domains to generate artificial site-specific RNA endonucleases (ASREs) that function analogously to DNA restriction enzymes17.
The authors have nothing to disclose.
This work was supported by NIH grant R01-CA158283 and NSFC grant 31400726 to Z.W. Y.W. is funded by the Young Thousand Talents Program and the National Natural Science Foundation of China (grants 31471235 and 81422038). X.Y. is funded by the postdoctoral science foundation of China (2015M571612).
Name | Company | Catalog Number | Comments |
High-fidelity DNA polymerase (Phusion High-Fidelity) with PCR buffer | New England Biolabs | M0530L | |
DNA ligase (T4 DNA ligase) | New England Biolabs | M0202L | |
Liposomal transfection reagent (Lipofectamine 2000) | Invitrogen | 11668-019 | |
Reduced serum medium (Opti-MEM) | Gibco | 31985-062 | |
RNA extraction buffer (TRIzol Reagent) | Ambion | 15596018 | TRIzol reagent includes phenol, which can cause burns. Wear gloves when handling. |
BSA (Bovine Serum Albumin) | Sigma-Aldrich | A7638-5G | |
PBS (1X) | Life Technologies | 10010-031 | |
SuperScript III reverse transcriptase | Invitrogen | 18080044 | |
Caspase-3 antibody | Cell Signaling Technology | 9668 | |
PARP antibody | Cell Signaling Technology | 9542 | |
Bcl-x antibody | BD Bioscience | 610211 | |
beta-actin antibody | Sigma-Aldrich | A5441 | |
alpha-tubulin antibody | Sigma-Aldrich | T5168 | |
FLAG antibody | Sigma-Aldrich | F4042 | |
Nitrocellulose membrane | Amersham-Pharmacia | RPN203D | |
ECL Western Blotting detection reagents | Invitrogen | WP20005 | |
Cy5-dCTP | GE Healthcare | PA55021 | |
Fluorescence-activated cell sorter | BD Bioscience | FACSCalibur | |
Dulbecco’s Modified Eagle’s Medium (DMEM) | GE Healthcare | SH30243.01 | |
Fetal bovine serum | Invitrogen | 26140079 | |
Propidium iodide (PI) | Sigma | P4170 | |
Bovine Serum Albumin (BSA) | Sigma | A7638-5G | |
Triton-X100 | Promega | H5142 | |
Poly-lysine | Sigma | P-4832 | Filter sterilize and store at 4 °C |
Vector pWPXLd | Addgene | 12258 | |
Vector pMD2.G | Addgene | 12259 | |
Vector psPAX2 | Addgene | 12260 | |
DNase I (RNase-free) | New England Biolabs | M0303S | |
Oligo(dT)18 Primer | Thermo Scientific | SO131 | |
Anti-mouse secondary antibody (Anti-mouse IgG, HRP-linked Antibody) | Cell Signaling Technology | 7076S | |