Detailed protocols for both in vitro and in-cell selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) experiments to determine the secondary structure of pre-mRNA sequences of interest in the presence of an RNA-targeting small molecule are presented in this article.
In the process of drug development of RNA-targeting small molecules, elucidating the structural changes upon their interactions with target RNA sequences is desired. We herein provide a detailed in vitro and in-cell selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) protocol to study the RNA structural change in exon 7 of the pre-mRNA of the SMN2 gene in the presence of survival of motor neuron (SMN)-C2, an experimental drug for spinal muscular atrophy (SMA). In in vitro SHAPE, an RNA sequence of 140 nucleotides containing SMN2 exon 7 is transcribed by T7 RNA polymerase, folded in the presence of SMN-C2, and subsequently modified by a mild 2'-OH acylation reagent, 2-methylnicotinic acid imidazolide (NAI). This 2'-OH-NAI adduct is further probed by a 32P-labeled primer extension and resolved by polyacrylamide gel electrophoresis (PAGE). Conversely, 2'-OH acylation in in-cell SHAPE takes place in situ with SMN-C2 bound cellular RNA in living cells. The pre-mRNA sequence of exon 7 in the SMN2 gene, along with SHAPE-induced mutations in the primer extension, is then amplified by PCR and subjected to next-generation sequencing. Comparing the two methodologies, in vitro SHAPE is a more cost-effective method and does not require computational power to visualize results. However, the in vitro SHAPE-derived RNA model sometimes deviates from the secondary structure in a cellular context, likely due to the loss of all interactions with RNA-binding proteins. In-cell SHAPE does not need a radioactive material workplace and yields a more accurate RNA secondary structure in the cellular context. Furthermore, in-cell SHAPE is usually applicable to a larger range of RNA sequences (~1,000 nucleotides) by utilizing next-generation sequencing, compared to in vitro SHAPE (~200 nucleotides), which usually relies on PAGE analysis. In the case of exon 7 in SMN2 pre-mRNA, the in vitro and in-cell SHAPE-derived RNA models are similar to each other.
Selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE) is a method of measuring the kinetics of each nucleotide in an RNA sequence of interest and elucidating the secondary structure at single-nucleotide resolution1. SHAPE methodologies, both in in vitro conditions2,3,4 (purified RNA in a defined buffer system) and in living mammalian cells5,6, have been developed to investigate the secondary structure of medium length RNA sequences (typically <1,000 nucleotides for in-cell SHAPE and <200 nucleotides for in vitro SHAPE). It is particularly useful to evaluate structural changes in receptor RNA upon binding to RNA-interacting small molecule metabolites2,4,7,8 and to study mechanistic actions of RNA-targeting molecules during drug development9,10.
RNA-targeting drug discovery has recently drawn attention in academic laboratories and the pharmaceutical industry11,12 via different approaches and strategies13,14,15,16. Recent examples of RNA-targeting small molecules for clinical use include two structurally distinct experimental drugs, LMI-07017 and RG-791618,19, for spinal muscular atrophy (SMA), which showed promising results in phase II clinical trials20. Both molecules were demonstrated to target survival of motor neuron (SMN) 2 pre-mRNA and regulate the splicing process of the SMN2 gene6,17,21. We previously demonstrated the application of in vitro and in-cell SHAPE in an examination of the target RNA structural changes in the presence of an analog of RG-7916 known as SMN-C26.
In principle, SHAPE measures the 2'-OH acylation rate of each nucleotide of an RNA sequence in the presence of excess amounts of a self-quenching acylation reagent in an unbiased manner. The acylation reagent is not stable in water, with a short half-life (e.g., T1/2 = 17 s for 1-methyl-7-nitroisatoic anhydride, or 1M7; ~20 min for 2-methylnicotinic acid imidazolide, or NAI)22, and is insensitive to the identity of the bases23. This results in a more favorable acylation of the 2'-OH groups of flexible nucleotides, which can be transformed into an accurate assessment of the dynamics of each nucleotide. Specifically, a nucleotide in a base pair is usually less reactive than an unpaired one to a 2'-OH modifying reagent, such as NAI or 1M7.
Depending on the source of the RNA template and where 2'-OH acylation takes place, SHAPE can generally be categorized into in vitro and in-cell SHAPE. In vitro SHAPE uses purified T7-transcribed RNA and lacks a cellular context in experimental designs. In in-cell SHAPE, both the RNA template transcription and 2'-OH acylation occur within living cells; therefore, the results can recapitulate the RNA structural model in a cellular context. SHAPE carried out in living cells has also been referred to as in vivo SHAPE in the literature24. Since this experiment is not performed in an animal, we use the term in-cell SHAPE for accuracy.
The strategies for the primer extension stage of in vitro and in-cell SHAPE are also different. In in vitro SHAPE, reverse transcription stops at the 2'-OH acylation position in the presence of Mg2+. A 32P-labeled primer extension product therefore appears as a band in polyacrylamide gel electrophoresis (PAGE), and the intensity of the band is proportional to the acylation rate1. In in-cell SHAPE, reverse transcription generates random mutations at the 2'-OH adduct position in the presence of Mn2+. The mutational rate of each nucleotide can be captured by in-depth next-generation sequencing, and the SHAPE reactivity at single-nucleotide resolution can then be calculated.
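As a rough illustration of how a per-nucleotide mutation rate can be turned into a raw reactivity value, the sketch below divides mutation counts by read depth and subtracts an untreated (no NAI) control. This is a simplified assumption for illustration only; it is not the algorithm implemented in ShapeMapper, which also applies quality filtering and normalization. All function names and counts are hypothetical.

```python
from typing import List, Optional


def mutation_rates(mutation_counts: List[int],
                   read_depths: List[int]) -> List[Optional[float]]:
    """Per-nucleotide mutation rate = mutations / read depth.

    Positions without coverage are reported as None.
    """
    rates: List[Optional[float]] = []
    for muts, depth in zip(mutation_counts, read_depths):
        rates.append(muts / depth if depth > 0 else None)
    return rates


def raw_reactivity(treated: List[Optional[float]],
                   untreated: List[Optional[float]]) -> List[Optional[float]]:
    """Background-subtracted rate: NAI-treated minus untreated control."""
    profile: List[Optional[float]] = []
    for t, u in zip(treated, untreated):
        profile.append(None if t is None or u is None else t - u)
    return profile


# Hypothetical counts for a 4-nucleotide window (not data from this study).
nai = mutation_rates([50, 5, 200, 0], [10000, 10000, 10000, 0])
no_nai = mutation_rates([10, 5, 20, 3], [10000, 10000, 10000, 10000])
profile = raw_reactivity(nai, no_nai)
```

Positions with no read coverage are kept as `None` rather than zero so that missing data are not mistaken for unreactive nucleotides.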
A potential problem for in-cell SHAPE is the low signal-to-noise ratio (i.e., a majority of the 2'-OH groups are unmodified, and the unmodified sequences occupy most of the reads in next-generation sequencing). Recently, a method to enrich the 2'-OH modified RNA, referred to as in vivo click SHAPE (icSHAPE), was developed by the Chang laboratory25. This enrichment method may be advantageous in studying weak small molecule-RNA interactions, especially in a transcriptome-wide interrogation.
1. In Vitro SHAPE
NOTE: The protocol is modified from the published protocol1.
2. In-cell SHAPE
We previously demonstrated that an RNA splicing modulator, SMN-C2, interacts with the AGGAAG motif on exon 7 of the SMN2 gene's pre-mRNA, and used SHAPE to assess the RNA structural changes in the presence of SMN-C26. The binding site of SMN-C2 is distinct from that of the FDA-approved antisense oligonucleotide (ASO) for SMA, nusinersen, which binds and blocks the intronic splicing silencer (ISS) on intron 727,28 (Figure 1A). Most known splicing regulators of SMN2 exon 7 are within the ~100 nucleotide range of exon 7 in the pre-mRNA29; therefore, 140 and 276 nucleotide-long RNA sequences were used for in vitro and in-cell SHAPE, respectively, which cover exon 7 and the adjacent intron region (Figure 1A).
In this representative in vitro SHAPE analysis, the RNA sequence of interest is embedded into an optimized cassette developed by the Weeks laboratory, which is compatible with most RNA sequences1. Occasionally, the sequence of interest interacts or interferes with this cassette. In these cases, a modified cassette can be used with the following three characteristics: i) a 3'-end specific primer binding site with a more efficient hybridization affinity for the primer than any part of the RNA sequence, ii) a highly structured hairpin loop located directly upstream of the primer binding site that will show a specific and reproducible SHAPE signal (this will act as both an internal control for the experiment and a method to align the signal from experiment to experiment), and iii) a 5'-end hairpin structure element, which indicates the end of the SHAPE signal. The SHAPE cassette is further flanked with self-cleaving ribozyme sequences at both the 5'- and 3'-ends in order to generate a homogeneous RNA30. We found that hammerhead and hepatitis delta virus ribozymes are compatible with the SHAPE cassette and usually give a high yield of the desired RNA. The resulting template RNA has a 2',3'-cyclic phosphate at the 3'-end and a hydroxyl group at the 5'-end, which do not interfere with primer extension. A 140-nucleotide-long sequence covering exon 7 of SMN2 and the adjacent region in the pre-mRNA was synthesized within the SHAPE and ribozyme cassettes, as illustrated in Scheme 1.
In general, the sequence of interest ligated in the expression cassette should be long enough to cover the potential secondary or higher order of structure. In the case of SMN2 exon 7, a 140-nucleotide region contains two stem-loop structures6,31. The structure of the region of interest varies case-by-case and should be evaluated by trial-and-error.
A polyacrylamide sequencing gel with 32P-labeled primer extension products was chosen to visualize the in vitro SHAPE profile in this representative experiment. An alternative visualization method is to use capillary electrophoresis with a fluorescently labeled DNA primer32. In polyacrylamide sequencing gels, ~20 nucleotides near the 5'-end and ~10 nucleotides near the 3'-end of the RNA sequence of interest will not be visualized quantitatively, due to occasional stops at the initiation steps of reverse transcription and intense bands on PAGE gels for full-length transcripts1.
An alternative to preparative gel recovery of an RNA template (steps 1.1.6 to 1.1.12) is to overload a mini-gel if the yield of the desired RNA template is >90%. It is advised to remove the excess NTP from the RNA product with a desalting column and to elute the RNA in 50 µL of TE buffer. Measure the concentration of the RNA and load 5.0 µg in each well. During the purification step of the T7-transcribed RNA template, stop the PAGE when the xylene cyanol FF dye (light blue) has passed two-thirds of the 6% denaturing TBE-urea gel. The self-cleaved RNA fragments (<80 nucleotides) run off the gel, which leaves the desired RNA as the only major band in the gel upon staining (Figure 1B).
SMN-C2 (Figure 1C) was synthesized according to the published procedure33 and dissolved in DMSO to make a 10 mM stock solution. The stock solution is further diluted to 500 and 50 µM in 10% DMSO solution to achieve final concentrations of 50 and 5 µM, respectively. Snap-cooled RNA refolded to its equilibrium state in the presence of DMSO or SMN-C2 within 30 min at 37 °C. A longer incubation time did not change the outcome of the experiment. Two experimental samples (50 and 5 µM SMN-C2), two controls (DMSO and NAI- controls), and four markers (A, T, G, C) were treated for primer extension. After exposing the gel to a phosphor storage screen, a successful SHAPE experiment will show: i) a single and most-intense band at the top of the gel and ii) bands throughout the gel at single-nucleotide resolution without smear (Figure 1D). A common problem in PAGE analysis is that a smear region known as the "salt front" may appear in the middle of the gel (Figure 1E). This is probably due to a high concentration of salt, DMSO, or other unwanted substances in the loading sample, which can be removed by ethanol precipitation.
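The dilution arithmetic for the compound working solutions can be sketched with C1V1 = C2V2. The example below is a minimal sketch that assumes the 10 mM stock is in 100% DMSO, that 100 µL of each working solution is prepared, and that the working solution is diluted 10-fold into the folding reaction (as implied by the 500 to 50 µM and 50 to 5 µM steps). The 100 µL volume and the function name are assumptions for illustration.

```python
def working_solution(stock_um: float, target_um: float,
                     volume_ul: float, dmso_fraction: float = 0.10) -> dict:
    """Volumes needed for a working solution at a fixed total DMSO fraction.

    Assumes the stock is dissolved in 100% DMSO, so the stock volume counts
    toward the total DMSO and the remainder is topped up with neat DMSO.
    """
    if target_um > stock_um:
        raise ValueError("target concentration exceeds stock concentration")
    v_stock = target_um * volume_ul / stock_um        # C1*V1 = C2*V2
    v_dmso_total = dmso_fraction * volume_ul
    if v_stock > v_dmso_total:
        raise ValueError("stock volume alone exceeds the allowed DMSO volume")
    v_extra_dmso = v_dmso_total - v_stock
    v_aqueous = volume_ul - v_dmso_total
    return {"stock_ul": v_stock, "extra_dmso_ul": v_extra_dmso,
            "aqueous_ul": v_aqueous}


STOCK_UM = 10_000.0   # 10 mM stock
for working_um in (500.0, 50.0):
    plan = working_solution(STOCK_UM, working_um, volume_ul=100.0)
    final_um = working_um / 10.0   # 10-fold dilution into the folding reaction
    print(f"{working_um:g} uM working -> {final_um:g} uM final: {plan}")
```

Under these assumptions, every sample, including the DMSO control, ends up at the same final DMSO content (1%), so reactivity differences are not confounded by the solvent.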
In in vitro SHAPE, a pure RNA template is the key for a successful experiment. An impure RNA template is usually the cause of undesired results. If PAGE analysis clearly shows a pattern of 2 sets of markers, it indicates that the RNA template is not homogenous and needs to be repurified by preparative TBE-urea gel.
For in-cell SHAPE, a key to a successful experiment is to design an optimized primer set for amplification. With 0.10 µg of genomic DNA-free cDNA template, a single band should be observed within 25 PCR cycles in agarose gel analysis. Repetitive intron sequences should therefore be avoided. To study the structural impact of SMN-C2 on SMN2 pre-mRNA, three primer sets (Table 2) were tested, and all were satisfactory (Figure 2A).
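Before ordering candidate primers, simple sequence properties can be screened computationally. The sketch below reports length, GC fraction, and the longest single-base run for the SMN2E7-276 primer pair from Table 2. These three checks are illustrative assumptions, not the criteria used by the authors; the experimental test remains a single band within 25 PCR cycles.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of one repeated base (crude low-complexity flag)."""
    longest = current = 0
    previous = ""
    for base in seq.upper():
        current = current + 1 if base == previous else 1
        longest = max(longest, current)
        previous = base
    return longest


# Sequences copied from Table 2.
primers = {
    "SMN2E7-276-FW": "AATGTCTTGTGAAACAAAATGCT",
    "SMN2E7-276-RV": "AACCTTTCAACTTTCTAACATCT",
}

for name, seq in primers.items():
    print(name, len(seq), round(gc_fraction(seq), 2), longest_homopolymer(seq))
```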
A low copy number of a target RNA sequence is generally a problem for in-cell SHAPE. To enrich the RNA of interest, a minigene that contains the RNA sequence of interest under a strong CMV promoter was transfected into 293T cells. Because the splicing pattern of the SMN2 minigene recapitulates that of endogenous SMN2 with or without SMN-C219,34, we envision that the structure of SMN-C2 interacting RNA in the overexpressed SMN2 pre-mRNA likely was the same as the endogenous gene product. While the EC50 of SMN-C3 in the splicing assay was ~0.1 µM6,19, a concentration at the higher end (20 µM) was used to ensure the binding state of the small molecule and target RNA sequence.
Upon isolation of the RNA, both gene-specific and random primers can be used for extension. In the case of exon 7 of SMN2, we found that SMN2E7-338-RV (Table 2) yields a higher copy of desired cDNA than a random nonamer, evidenced by a more intense band in the PCR amplification with SMN2E7-276 primer set (Table 2) after 25 cycles. In the 2'-OH modification step, a 91 µM final NAI concentration was used for an incubation period of 15 min. If a different cell type or medium is used, the incubation time must be re-optimized. If the incubation time is too long, agarose gel analysis of the amplicon will sometimes fail to show a band.
A Python-based program, ShapeMapper, developed by the Weeks laboratory35, was used to analyze the data generated by next-generation sequencing [for raw data of SMN-C2 treated in-cell SHAPE of SMN2 exon 7 pre-mRNA, refer to the Sequence Read Archive (SRA) database]. Throughout the amplicon, the SHAPE reactivity did not significantly change (>1), which indicated that the secondary structure remains in the presence of SMN-C2 (Figure 2B). This is also confirmed by the arc plots generated by SuperFold35. The connection lines indicate the possible base pairing based on the SHAPE reactivity of each nucleotide. The RNA models for SMN-C2- and DMSO-treated samples are essentially the same (Figure 2C). Differential in-cell SHAPE reactivity was calculated for each nucleotide (Figure 2B), and the largest reactivity change occurred at TSL-1 (5'-end of exon 7) but not TSL-2 (3'-end of exon 7). This result agrees with the in vitro SHAPE, although the base pairing in the in-cell SHAPE analyses shifted from that of the in vitro SHAPE RNA model (Figure 2D).
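The per-nucleotide comparison between compound-treated and DMSO-treated samples can be sketched as a difference of reactivities judged against their combined standard errors. The cutoff used below (1.96 combined standard errors) and all input values are assumptions for illustration; they do not reproduce the significance criterion or the data behind Figure 2B.

```python
import math
from typing import List, Tuple


def differential_reactivity(treated: List[float], control: List[float],
                            treated_se: List[float], control_se: List[float],
                            z_cutoff: float = 1.96) -> List[Tuple[int, float, bool]]:
    """Return (position, delta, flagged) per nucleotide.

    delta is treated minus control; a position is flagged when |delta|
    exceeds z_cutoff times the combined standard error (assumed criterion).
    """
    results = []
    rows = zip(treated, control, treated_se, control_se)
    for position, (t, c, t_se, c_se) in enumerate(rows, start=1):
        delta = t - c
        combined_se = math.sqrt(t_se ** 2 + c_se ** 2)
        flagged = combined_se > 0 and abs(delta) > z_cutoff * combined_se
        results.append((position, delta, flagged))
    return results


# Hypothetical reactivities for three nucleotides.
calls = differential_reactivity(
    treated=[0.90, 0.20, 0.50],
    control=[0.30, 0.25, 0.50],
    treated_se=[0.10, 0.10, 0.05],
    control_se=[0.10, 0.10, 0.05],
)
```

Reporting the signed difference keeps increases and decreases in reactivity distinguishable, which matters when interpreting whether a compound stabilizes or loosens a region.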
A common problem of in-cell SHAPE is a low mutation rate throughout the amplicon. This is usually due to genomic DNA contamination. DNA does not contain a 2'-OH group; therefore, no acylation product is formed with NAI. PCR amplification thus merely reflects the low mutation rate of the DNA polymerase. Isolation of RNA by guanidinium thiocyanate followed by on-column DNase digestion is sufficient to remove all DNA in most cases. If DNA contamination persists, repeat step 2.3 for RNA isolation.
Scheme 1: DNA sequence for the template of RNA transcription. The 12 bp sequence within the Hammerhead virus ribozyme sequence in red is the reverse complement of the first 12 bp of the 5'-cassette. The RNA sequence of interest presented herein is a 140 bp pre-mRNA sequence containing exon 7 of the human SMN2 gene.
Figure 1: Experimental design and result of in vitro SHAPE for SMN2 exon 7 pre-mRNA in the presence of SMN-C2. (A) Overview of the sequence of interest for in vitro and in-cell SHAPE studies of the SMN2 gene. SMN-C2 binds to the AGGAAG motif on exon 7, a distinct location from the nusinersen binding site. (B) Purification of the T7 transcription product. Each of the three lanes contains 5.0 µg of crude RNA. The TBE-urea gel was stained with SYBR-Safe (1:10,000) for 5 min in 1x TBE buffer. Red dashed box indicates the edge of excision for RNA recovery. (C) The structure of SMN-C2. (D) In vitro SHAPE experiment with NAI and a 140 nucleotide-long RNA template containing exon 7. 1 = DMSO; 2 = SMN-C2 (50 µM); 3 = SMN-C2 (5 µM); 4 = no NAI; 5-8 = ladders generated by addition of ddATP, ddTTP, ddCTP, and ddGTP during primer extension. PAGE was carried out on a TBE-urea sequencing gel at 60 W for 3 h. Red asterisks indicate increased band intensity with 50 µM SMN-C26. (E) An example of a "salt front" region in the dashed red box from a separate experiment.
Figure 2: In-cell SHAPE-derived RNA modeling for SMN2 exon 7 pre-mRNA. (A) PCR amplification with all three primer sets (Table 2) yielded a single band in agarose gel analysis. 1 = SMN2E7-338, 2 = SMN2E7-276, 3 = SMN2E7-251, 4 = DNA ladder. (B) Differential in-cell SHAPE reactivity in SMN2 minigene-transfected 293T cells for 10 µM SMN-C2 and DMSO in TSL1. SHAPE reactivity is shown at single-nucleotide resolution, and its standard deviation was calculated by ShapeMapper software35. Green asterisks indicate significant SHAPE reactivity change induced by 10 µM SMN-C2. The numbering of the nucleotides in the 276 bp amplicon is shown on the x-axis. Error bars were estimated by ShapeMapper software26. (C) Arc plot generated by SuperFold35 for the most plausible RNA secondary structure modeling by in-cell SHAPE data. (D) In vitro and in-cell SHAPE-directed modeling of exon 7 and adjacent regions. For the in vitro RNA model, the SHAPE stabilizing cassette (orange) and nucleotides 1-19 (blue) are shown in sketch. For the in-cell RNA model, nucleotide numbering is aligned with the in vitro SHAPE template. Nucleotides 1-18 and 120-140 were omitted. Significant reactivity changes are indicated by red and green asterisks for in vitro and in-cell SHAPE, respectively. The secondary structures that were previously named TSL1 and TSL231 are enclosed in blue boxes.
Primer name | 5'-3' Sequence |
Ribozyme-FW | TAAAACGACGGCCAGTGAAT |
Ribozyme-RV | GCAGGTCGACTCTAGAGGAT |
RT primer | GAACCGGACCGAAGCCCG |
Table 1: The primer sequences used in the representative experiment.
Name | Sequence (5'->3') |
SMN2E7-338-FW | AAAGACTATCAACTTAATTTCTGA |
SMN2E7-338-RV | TGTTTTACATTAACCTTTCAACT |
SMN2E7-276-FW | AATGTCTTGTGAAACAAAATGCT |
SMN2E7-276-RV | AACCTTTCAACTTTCTAACATCT |
SMN2E7-251-FW | TGAAACAAAATGCTTTTTAACATCC |
SMN2E7-251-RV | TCAACTTTCTAACATCTGAACTTTT |
Table 2: The primer sets for amplification of the pre-mRNA sequence of interest. All three primer sets yield a single amplicon in a PCR reaction with genomic DNA-free cDNA template.
In in vitro SHAPE, it is critical to use a high-quality, homogeneous RNA template. T7 transcription, however, often yields heterogeneous sequences36. In particular, sequences with ±1 nucleotide at the 3'-terminus, which form in non-negligible yields36, are usually difficult to remove by polyacrylamide gel purification. A heterogeneous RNA template can result in more than one set of signals in the sequencing gel profile of the primer extension product, which sometimes makes it difficult to interpret the result. The ribozymes at both the 5'- and 3'-ends of the RNA template expression cassette make both ends homogeneous.
For both in vitro and in-cell SHAPE, the incubation time with the 2'-OH modification reagent is another critical factor. It was suggested by the Weeks group that at least five times the half-life of the water-quenching 2'-OH acylation reagent should be used1. In our hands, incubating with NAI (T1/2 = ~20 min) for >30 min in both in vitro and in-cell SHAPE results in overreaction. As shown in Figure 1D, a good in vitro SHAPE profile should have >50% of the total signal at the top of the gel as full-length transcript, ensuring that most of the modified RNA is acylated only once. Overreaction will render the double-acylated product non-negligible and bias the profile toward short extension products. In in-cell SHAPE, the amplicon for the library construction should appear as a single band in agarose gel analysis (step 2.6.3). A smeared band indicates an overreaction, and the incubation time should be reduced. In the representative experiments, a 15 min incubation time at 37 °C was optimal for in vitro and in-cell SHAPE. This should be used as a starting point for NAI modification in similar applications.
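To put the incubation times above in perspective, the fraction of acylation reagent still unhydrolyzed after a given time can be estimated from its half-life. The sketch below assumes simple first-order decay in water and ignores consumption by RNA or other cellular nucleophiles; the half-lives are the approximate values quoted in this article, and the function name is hypothetical.

```python
def fraction_remaining(time_min: float, half_life_min: float) -> float:
    """Fraction of reagent left after time_min, assuming first-order decay."""
    if half_life_min <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (time_min / half_life_min)


NAI_HALF_LIFE_MIN = 20.0          # ~20 min, as quoted in the text
ONE_M7_HALF_LIFE_MIN = 17.0 / 60  # 17 s, as quoted in the text

for minutes in (15.0, 30.0, 5 * NAI_HALF_LIFE_MIN):
    left = fraction_remaining(minutes, NAI_HALF_LIFE_MIN)
    print(f"NAI after {minutes:g} min: {left:.3f} of reagent remaining")
```

Under this assumption, a substantial fraction of NAI is still active at 15 min, so the reaction has not run to completion at that point and the chosen incubation time directly sets the extent of modification.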
Other 2'-OH modification reagents22, such as 1M7, are widely used for SHAPE. Compared to NAI, 1M7 is more reactive toward the 2'-OH group and has a shorter half-life in water22. In our hands, 1M7 formed a large amount of yellow precipitate in cell culture media in an in-cell SHAPE experiment, which complicated the RNA isolation. To allow a direct comparison, both in vitro and in-cell SHAPE used NAI as the 2'-OH modification reagent for the study of SMN-C2 and SMN2 pre-mRNA interactions. If only in vitro SHAPE is required, 1M7 is an alternative option, as demonstrated by various studies in riboswitch structural determination7,8.
In general, in-cell SHAPE is preferred over in vitro SHAPE, especially if the molecule is presumed to act in the nucleus of a eukaryotic cell. RNA-binding proteins in the nucleus are abundant, and it is almost impossible to recapitulate the cellular context under in vitro conditions.
In the past decade, SHAPE has become the standard method for studying the secondary structure of RNA. Compared to traditional RNA footprinting with RNase37, it is more suitable for studying small molecule-RNA interactions, as these interactions can sometimes be weak and insensitive to RNase challenges.
The major limitation of using SHAPE to study small molecule-induced RNA structural changes is that its results do not reveal the binding site. In in vitro and in-cell SHAPE for SMN-C2 bound pre-mRNA, the SHAPE reactivity did not change at the putative binding site (Figure 1D, Figure 2B); rather, the reactivity at 2 to 3 remote sites in the loop or bulge regions was altered. SMN-C2 presumably binds to an RNA double-helix region (Figure 1D, Figure 2D), which usually has a low SHAPE reactivity. Therefore, a further decrease in SHAPE reactivity caused by stabilizing the structure would probably not be observable. To identify a putative binding site, other methods such as ChemCLIP38, which involves a crosslinking chemical probe, should be used.
A common alternative approach to map RNA secondary or higher-order structure and nucleotide dynamics is NMR spectroscopy39,40. It has been demonstrated that RNA dynamics derived from quantitative NMR analysis strongly correlate with SHAPE reactivity39. In the context of small molecule binding, chemical shift perturbations can reveal the interacting nucleotides21.
As RNA-targeting small molecules become a new modality of drug development11, we envision that SHAPE will be established as a standard methodology to evaluate the structural impacts of target RNA in the presence of small molecules. In the future, a transcriptome-wide interrogation method for RNA-targeting small molecule drugs is desired.
The authors have nothing to disclose.
This work was made possible by the NIH R01 grant (NS094721, K.A.J.).
Name | Company | Catalog Number | Comments |
DNA oligonucleotide | IDT | gBlock for > 200 bp DNA synthesis | |
Phusion Green Hot Start II High-Fidelity PCR Master Mix | Thermo Fisher | F566S | |
NucleoSpin gel and PCR clean-up kit | Takara | 740609.50 | |
MegaScript T7 transcription kit | Thermo Fisher | AM1333 | Contains 10X reaction buffer, T7 enzyme, NTP and Turbo DNase |
DEPC-treated water | Thermo Fisher | 750023 | |
2X TBE-urea sample buffer | Thermo Fisher | LC6876 | |
40% acrylamide/ bisacrylamide solution (29:1) | Bio-Rad | 1610146 | |
10X TBE buffer | Thermo Fisher | 15581044 | |
Nalgene Rapid-Flow™ Filter Unit | Thermo Fisher | 166-0045 | |
Kimwipe | Kimberly-Clark | 34133 | |
TEMED | Thermo Fisher | 17919 | |
SYBR-Safe dye | Thermo Fisher | S33102 | |
6 % TBE-urea mini-gel | Thermo Fisher | EC6865BOX | |
ChemiDoc | Bio-Rad | ||
T4 PNK | NEB | M0201S | |
γ-32P-ATP | Perkin Elmer | NEG035C005MC | |
Hyperscreen™ Intensifying Screen | GE Healthcare | RPN1669 | calcium tungstate phosphor screen |
phosphor storage screen | Molecular Dynamics | BAS-IP MS 3543 E | |
Amersham Typhoon | GE Healthcare | ||
NAI (2M) | EMD Millipore | 03-310 | |
GlycoBlue | Thermo Fisher | AM9515 | |
SuperScript IV Reverse Transcriptase | Thermo Fisher | 18090010 | Contains 5X RT buffer, SuperScript IV |
dNTP mix (10 mM) | Thermo Fisher | R0192 | |
ddNTP set (5mM) | Sigma | GE27-2045-01 | |
large filter paper | Whatman | 1001-917 | |
Gel dryer | Hoefer | GD 2000 | |
QIAamp DNA Blood Mini Kit | Qiagen | 51104 | Also contains RNase A and protease K |
SMN2 minigene34 | Addgene | 72287 | |
Heat inactivated FBS | Thermo Fisher | 10438026 | |
Pen-Strep | Thermo Fisher | 15140122 | |
Opti-MEM I | Thermo Fisher | 31985062 | |
FuGene HD | Promega | E2311 | |
TrypLE | Thermo Fisher | 12605010 | |
DPBS without Ca/ Mg | Thermo Fisher | 14190250 | |
TRIzol | Thermo Fisher | 15596018 | |
RNeasy mini column | Qiagen | 74104 | Also contains RW1, RPE buffer |
RNase-Free DNase Set | Qiagen | 79254 | Contains DNase I and RDD buffer |
Deionized formamide | Thermo Fisher | AM9342 | |
MnCl2•4H2O | Sigma-Aldrich | M3634 | |
random nonamer | Sigma-Aldrich | R7647 | |
SuperScript First-Strand Synthesis System | Thermo Fisher | 11904-018 | Contains 10X RT buffer, SuperScript II reverse transcriptase |
AccuPrime pfx DNA polymerase | Thermo Fisher | 12344024 | |
NextSeq500 | Illumina | ||
NucAway column | Thermo Fisher | AM10070 | for desalting purpose |