The two different 3' rapid amplification of cDNA ends (3' RACE) protocols described here make use of two different DNA polymerases to map sequences that include a segment of the open reading frame (ORF), the stop codon, and the entire 3' UTR of a transcript using RNA obtained from different cancer cell lines.
Maturation of eukaryotic mRNAs involves 3' end formation, which includes the addition of a poly(A) tail. In order to map the 3' end of a gene, the traditional method of choice is 3' rapid amplification of cDNA ends (3' RACE). Protocols for 3' RACE require the careful design and selection of nested primers within the 3' untranslated region (3' UTR) of the target gene of interest. However, with a few modifications the protocol can be used to include the entire 3' UTR and sequences within the open reading frame (ORF), providing a more comprehensive picture of the relationship between the ORF and the 3' UTR. This is in addition to identification of the polyadenylation signal (PAS), as well as the cleavage and polyadenylation site provided by conventional 3' RACE. Expanded 3' RACE can detect unusual 3' UTRs, including gene fusions within the 3' UTR, and the sequence information can be used to predict potential miRNA binding sites as well as AU-rich destabilizing elements that may affect the stability of the transcript.
The formation of the 3' end is a critical step in mRNA maturation that comprises the cleavage of the pre-mRNA downstream of a PAS followed by the addition of ~250 untemplated adenines, which make up the poly(A) tail1,2. The poly(A) binding protein (PABP) binds to the poly(A) tail, and this protects the mRNA transcript from degradation, and facilitates translation1.
Current estimates suggest that 70% of human genes have multiple PASs, and thus undergo alternative polyadenylation, resulting in multiple 3' ends3. Thus, it is important to identify where the poly(A) tail attaches to the rest of the 3' UTR, as well as identify the PAS used by any given transcript. The advent of next-generation sequencing has resulted in the simultaneous identification of the 3' UTRs and the PASs of thousands of genes. This increase in sequencing capability has required the development of bioinformatic algorithms to analyze data involving alternative polyadenylation of the 3' end. For the de novo detection or validation of the PAS and hence mapping of the 3' end of individual genes from large scale sequencing data, 3' RACE remains the method of choice4,5. The sequences included in cDNA products of 3' RACE normally include only a portion of the 3' UTR that contains the poly(A) tail, the cleavage site, the PAS, and the sequences upstream of the PAS. Unlike PCR, which requires the design and use of gene specific forward and reverse primers, 3' RACE only requires two gene specific nested forward primers. Hence, PCR requires a more detailed knowledge of the nucleotide sequence of a large region of the gene being amplified4,6. Since 3' RACE uses the same reverse primer that targets the poly(A) tail for all polyadenylated RNA transcripts, only the forward primers need to be gene specific, thus, only requiring knowledge of a significantly smaller region of the mRNA. This enables the amplification of regions whose sequences are not fully characterized4,7. This has allowed 3' RACE to be used not only to determine the 3' end of a gene, but to also determine and characterize large regions upstream of the PAS that form a significant portion of the 3' UTR. 
By combining 5' RACE with the modified 3' RACE that includes larger portions of the 3' UTR and flanking regions, it is possible to fully sequence or clone an entire mRNA transcript from the 5' end to its 3' end8.
An example of this application of modified 3' RACE is the recent identification of a novel CCND1-MRCK fusion gene transcript from Mantle Cell Lymphoma cell lines and cancer patients. The 3' UTR consisted of sequences from both the CCND1 and MRCK genes and was recalcitrant to miRNA regulation9. The two nested CCND1 specific forward primers were complementary to the region immediately adjacent and downstream of the CCND1 stop codon. Although whole transcriptome sequencing together with specific bioinformatic tools can be used to detect gene fusions within the 3' UTR10, many labs may lack the financial resources or bioinformatic expertise to make use of this technology. Hence, 3' RACE is an alternative for de novo identification and validation of novel fusion genes involving the 3' UTR. Considering the drastic increase in the number of reported fusion genes as well as read through transcripts, 3' RACE has become a powerful tool in characterizing gene sequences11,12. In addition, recent studies have shown that different sequences within the 3' UTR as well as the length of the 3' UTR can affect mRNA transcript stability, localization, translatability, and function13. Due in part to an increased interest in mapping the transcriptome, there has been an increase in the number of different DNA polymerases being developed for use in the lab. It is important to determine what types of modifications can be made to the 3' RACE protocol in order to utilize the available repertoire of DNA polymerases.
This work reports adapting 3' RACE to map the entire 3' UTR, the PAS, and the 3' end cleavage site of the ANKHD1 transcript by using nested primers within the ANKHD1 section of the transcript and two different DNA polymerases.
Wear a lab coat, gloves, and safety glasses at all times while performing all procedures in this protocol. Ensure that containers/tubes containing the phenol and guanidine isothiocyanate reagent are only opened in a certified hood, and dispose of phenol waste in a designated container. Use DNase/RNase-free sterile tubes, tips, and reagents.
1. Cell Culture
2. RNA Extraction
3. DNase Treatment
4. cDNA Synthesis
For a final reaction volume of 50 µL:
5. Primer Search for Fusion Gene Transcript
6. Optimizing 3' RACE to Map the 3' UTR Using Two Different Enzymes
NOTE: There has been an increase in the diversity of DNA polymerases used for PCR; therefore, we wanted to determine standard conditions that can be applied even when using different enzymes for 3' RACE PCR reactions. The reverse primers for any transcript are kept constant; the only changes are in the nested forward primers that are specific for the target transcript.
7. Verify the second PCR product of 3' RACE.
8. Product Purification and Sequencing
Nested Forward Primer Search:
The agarose gel from Figure 1 shows two distinct PCR gel products (Lanes 1 and 2) which use the same forward primer but different reverse primers. Lane 3 has a distinct PCR product and has a distinct forward and reverse primer. The ideal primers to use for the PCR based reaction are those that give one distinct PCR product (Lane 3). The forward primer used in Lanes 1 and 2 gives the strongest band and gives the largest PCR products (expected), and is used as the first forward primer. The second nested primer for the second set of PCR reactions in 3' RACE is the forward primer from Lane 3.
Different DNA Polymerases Used in 3' RACE Produce Similar Results:
Two different DNA polymerases produced the same sized PCR products despite having different PCR cycling conditions for the 3' RACE (Figure 2). There is only one distinct band for the expected PCR product. All of the minus reverse transcriptase negative controls (-) do not have a band, showing no genomic contamination of the RNA used in cDNA synthesis as well as in subsequent downstream reactions.
Sanger Sequencing Results:
The PCR products were gel purified and sent for Sanger sequencing. Figure 3A shows a portion of a sequence chromatogram from Sanger sequencing. A representative sequence identifies the location of the stop codon, the putative polyadenylation signal, the cleavage site, and the poly(A) tail in the 3' RACE product (Figure 3B).
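The mapping step described above can be illustrated with a minimal Python sketch. It is not the analysis used in this work (the chromatograms were inspected in sequence viewing software); it only shows one way to locate the start of a terminal poly(A) run and the nearest upstream PAS hexamer in a base-called read. The function name, the toy read, the minimum tail length, and the 40-nucleotide search window are all assumptions made for illustration; only the hexamers AATAAA and ATTAAA are standard PAS motifs.

```python
import re

# Canonical PAS hexamer and its most common variant (DNA alphabet).
PAS_HEXAMERS = ("AATAAA", "ATTAAA")


def map_three_prime_end(seq, min_tail=10, window=40):
    """Return the poly(A) tail start and the closest upstream PAS hexamer.

    seq      -- base-called read in mRNA orientation (hypothetical input)
    min_tail -- minimum run of A's at the read end to call a tail (assumption)
    window   -- nucleotides upstream of the tail searched for a PAS (assumption)
    """
    seq = seq.upper()
    # The tail must run to the very end of the read.
    match = re.search("A{%d,}$" % min_tail, seq)
    if match is None:
        return None
    tail_start = match.start()  # 0-based index of the first tail A
    offset = max(0, tail_start - window)
    upstream = seq[offset:tail_start]
    best = None
    for hexamer in PAS_HEXAMERS:
        pos = upstream.rfind(hexamer)  # right-most hit is closest to the tail
        if pos != -1 and (best is None or pos + offset > best[1]):
            best = (hexamer, pos + offset)
    return {"cleavage_site": tail_start, "pas": best}


# Toy read (not ANKHD1 sequence): ORF fragment ending in TGA, 3' UTR, tail.
toy_read = "ATGGCCTGA" + "GCTTGCAATAAAGTCTGTTTCTTGC" + "A" * 15
print(map_three_prime_end(toy_read))
```

If the last templated nucleotide before the tail is itself an A, the cleavage position is ambiguous by that many bases, and this sketch reports the earliest possible position.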
Figure 1: Identification of two nested gene specific forward primers to use in 3' RACE. Shown is an example of PCR products from different combinations of gene specific forward and reverse primers using HeLa cDNA run on an ethidium bromide stained agarose gel with the molecular weight (mw) ladder. Lanes 1 and 2 are PCR products from the same gene specific forward primer but different reverse primers. The arrows point to the expected PCR product for each lane. In addition to the band for the expected PCR product, there is an additional band. The single PCR product in Lane 3 is from a different gene specific nested forward and reverse primer and shows the single expected band.
Figure 2: Verification of 3' RACE products from the second set of PCR reactions. Products from the second PCR reaction with (A) Pfu DNA Polymerase and (B) Chimeric DNA polymerase were run on an agarose gel with ethidium bromide. Lanes depicted by (-) are from the negative control sample for each cell line when cDNA synthesis from RNA was carried out in the absence of the reverse transcriptase enzyme. Lanes 1 and 2 are PCR products from biological replicates of each cell line.
Figure 3: Mapping the PCR products of 3' RACE. (A) A portion of a sequence chromatogram of a gel purified 3' RACE product showing the predicted polyadenylation signal and the location of the poly(A) tail. (B) Representative sequences from Sanger sequencing data showing the stop codon, the polyadenylation signal (PAS), the cleavage and attachment site of the poly(A) tail, as well as a section of the poly(A) tail sequence.
Cycle step | Temperature and Duration | Number of cycles |
Initial denaturation | 95 °C for 3 min | 1 |
Initial annealing and extension | 50 °C for 5 min and 72 °C for 10 min | 1 |
Subcycles (denaturation, annealing and extension) | 95 °C for 40 s, 50 °C for 1 min, and 72 °C for 3 min | 20 |
Final cycle (denaturation, annealing and extension) | 95 °C for 40 s, 50 °C for 1 min, and 72 °C for 15 min | 1 |
Hold | 4 °C, ∞ |
Table 1: Pfu DNA Polymerase PCR cycling conditions for 3' RACE.
Cycle step | Temperature and Duration | Number of cycles |
Initial denaturation | 98 °C for 3 min | 1 |
Initial annealing and extension | 50 °C for 5 min and 72 °C for 10 min | 1 |
Subcycles (denaturation, annealing and extension) | 98 °C for 30 s, 50 °C for 1 min, and 72 °C for 4 min | 20 |
Final cycle (denaturation, annealing and extension) | 98 °C for 30 s, 50 °C for 1 min, and 72 °C for 15 min | 1 |
Hold | 4 °C, ∞ |
Table 2: Chimeric DNA polymerase PCR cycling conditions.
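For readers who script their thermocycler programs, the two cycling tables can be written as data and compared. The following is a minimal Python sketch, not part of the original protocol: the class and function names are illustrative, and the computed times exclude ramping between temperatures and the 4 °C hold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    steps: tuple  # each step is (temperature in °C, duration in seconds)
    cycles: int = 1

    def seconds(self):
        # Programmed time only; ramp time between steps is not modelled.
        return self.cycles * sum(duration for _, duration in self.steps)


def race_program(denat_temp, denat_s, extend_s):
    """Build the four-stage 3' RACE program shared by both tables."""
    return (
        Stage("initial denaturation", ((denat_temp, 180),)),
        Stage("initial annealing and extension", ((50, 300), (72, 600))),
        Stage("subcycles", ((denat_temp, denat_s), (50, 60), (72, extend_s)), cycles=20),
        Stage("final cycle", ((denat_temp, denat_s), (50, 60), (72, 900))),
    )


PFU = race_program(denat_temp=95, denat_s=40, extend_s=180)       # Table 1
CHIMERIC = race_program(denat_temp=98, denat_s=30, extend_s=240)  # Table 2


def total_minutes(stages):
    return sum(stage.seconds() for stage in stages) / 60


print(total_minutes(PFU), total_minutes(CHIMERIC))  # 128.0 144.5
```

Written this way, the only differences between the two programs are the denaturation temperature, the denaturation time, and the subcycle extension time.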
Despite the advent of massively parallel sequencing technologies, on a gene-by-gene basis, 3' RACE remains the easiest and most economical method to identify the PAS and the nucleotides adjacent to the poly(A) tail. The adaptation described here extends 3' RACE to both amplify and map sequences that include a portion of the ORF, the stop codon, and the entire 3' UTR of the ANKHD1 mRNA transcript. A major advantage of 3' RACE is that, with a few minor adaptations, its products can be cloned into other vectors to facilitate downstream interrogation of 3' UTR function, including miRNA targeting, stability assays, and other mechanistic assays. This can be done by including restriction enzyme sites within the nested primer sequences9,10. Adjustments can be made to include only the 3' UTR, without any sequences from the ORF, when cloning for 3' UTR functional assays5.
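As a rough illustration of adding restriction enzyme sites to nested primers for cloning, the Python sketch below prepends a recognition site to a gene specific primer and checks that the chosen enzyme does not also cut inside the expected insert. The recognition sequences for EcoRI, BamHI, and XhoI are standard; the primer and insert sequences, the function names, and the four-base 5' clamp are hypothetical choices for this example and are not taken from the protocol.

```python
# Recognition sequences of three common restriction enzymes (all palindromic,
# so scanning one strand of the insert is sufficient).
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "XhoI": "CTCGAG"}


def add_site(primer, enzyme, clamp="GCGC"):
    """Return clamp + restriction site + gene specific primer (5' to 3').

    The clamp length and sequence are assumptions; extra 5' bases are
    commonly added so the enzyme can cut close to the end of a PCR product.
    """
    return clamp + SITES[enzyme] + primer.upper()


def cuts_insert(insert, enzyme):
    """True if the enzyme's site occurs inside the expected insert."""
    return SITES[enzyme] in insert.upper()


# Hypothetical primer and insert, for illustration only.
primer = "acgtacgtacgtacgtac"
insert = "AAGGATCCTT"
print(add_site(primer, "EcoRI"))
print(cuts_insert(insert, "BamHI"), cuts_insert(insert, "XhoI"))
```

An enzyme that also cuts within the 3' UTR would fragment the insert during cloning, so a check of this kind is worth running before ordering primers.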
A major determinant of the success of 3' RACE is the development of nested primers that target the gene of interest. The cDNA sequences from the Ensembl genome browser are recommended to best identify regions without polymorphisms in order to develop optimal primers. Ideally, several nested forward primers as well as at least two reverse primers within the gene of interest can be screened using standard PCR to identify the best two nested gene specific primers to use. In this study the first gene specific forward primer selected for 3' RACE initially gave two bands with standard PCR routinely used in the lab. Unfortunately, the primer design was limited by the significant number of SNPs in the transcript of interest. SNPs in the target transcript lead to mismatch between the primers and the target, which can result in inefficient annealing and amplification, and should be reduced to a minimum19. The presence of at least four SNPs in one primer, or the occurrence of a combination of five mismatches (three in one primer and two in the other) may result in complete inhibition of the PCR reaction, and hence the only option is to design more primers20. Instead of designing more primers to get one single band for the standard PCR used for the primer search in experiments reported here, the PCR cycling conditions and the DNA polymerases subsequently used in 3' RACE were altered in order to optimize the probability of getting one specific band. The denaturation temperature was increased to 98 °C for one of the DNA polymerases (an increase from 95 °C used in the primer search). For both DNA polymerases used in the 3' RACE PCR, the initial denaturation duration was increased to 3 min instead of the recommended 30 s to 2 min in order to fully denature the cDNA template. The T7 oligo dT25 primer for example is predicted to potentially form weak secondary structures, which may reduce the availability of primers21. 
Increasing the initial denaturation length and temperature breaks any secondary structures and helps yield the single specific band after the final cycle of 3' RACE. In addition, the non-specific band was lighter than the expected bands, suggesting that it appeared at higher PCR cycle numbers (up to 35 cycles); limiting the PCR to only 20 cycles reduces the amplification of the non-specific band.
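The mismatch thresholds quoted above can be turned into a quick screen for candidate primers. The sketch below is a simplification under stated assumptions: it counts position-by-position mismatches between a primer and its binding site (both written in the primer's orientation and of equal length), and it reads "a combination of five mismatches" as five or more in total across the primer pair. The function names and sequences are illustrative only.

```python
def count_mismatches(primer, site):
    """Count mismatched positions between a primer and its binding site.

    Both sequences are given 5' to 3' in the primer's orientation and must
    be the same length (assumption made to keep the comparison simple).
    """
    if len(primer) != len(site):
        raise ValueError("primer and binding site must be the same length")
    return sum(a != b for a, b in zip(primer.upper(), site.upper()))


def likely_inhibited(forward_mm, reverse_mm):
    """Apply the thresholds from the text: at least four mismatches in one
    primer, or at least five across the pair (our reading of the text)."""
    return max(forward_mm, reverse_mm) >= 4 or forward_mm + reverse_mm >= 5


# Hypothetical example: one SNP under the forward primer, none under the reverse.
fwd = count_mismatches("ACGTACGT", "ACGTACGA")
rev = count_mismatches("TTGGCCAA", "TTGGCCAA")
print(fwd, rev, likely_inhibited(fwd, rev))
```

The position of a mismatch also matters in practice (mismatches near the 3' end of a primer are generally more disruptive), which this count does not capture.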
Another potential pitfall of 3' RACE, as with other PCR based reactions, is incomplete amplification of the target region22. Hence, for both thermophilic enzymes the initial annealing step was set at 50 °C for 5 min to optimize annealing of the primers to the target, and this was followed by an initial extension step at 72 °C for 10 min. The final PCR cycle for each enzyme had a final extension time of 15 min at 72 °C. These extension steps were longer than those recommended by the manufacturers of the DNA polymerases. The long extension (elongation) step allows complete synthesis of incomplete amplicons, enabling full extension of the initial and final amplification products16.
PCR steps that occur in 3' RACE are highly sensitive, so they may detect genomic DNA (or other DNA contamination) instead of the target mRNA transcript, resulting in several unexpected bands on the gel22. One way to prevent contamination is to perform the PCR work in a dedicated PCR hood, wear gloves, and use clean reagents and autoclaved sterile tubes and pipette tips. In regular PCR, forward and reverse primers can be designed so they are located on different exons (across intron primers); in this way, an abnormally large PCR product would signify that a region containing an intron has been amplified. However, this may not always be feasible when only the 3' UTR sequence, located within the same terminal exon, is the target for 3' RACE amplification. Alternatively, the RNA can be pre-treated with DNase before cDNA synthesis. The use of the T7 oligo dT25 primer to prime the reverse transcriptase reaction instead of a random hexanucleotide for first strand cDNA synthesis from RNA also decreases amplification of genomic products. Last but not least, setting up a negative control (the "–" lanes in Figure 2), where cDNA synthesis is set up in the absence of reverse transcriptase, is highly recommended. The control undergoes identical steps to the other samples in the subsequent 3' RACE procedures. The presence of a band in this control signifies potential genomic DNA contamination.
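The reasoning behind across intron primers is simple arithmetic, sketched below in Python: an amplicon from spliced cDNA spans only exon sequence, whereas the same primer pair on genomic DNA also spans the intron. The function names, the example lengths, and the 10% sizing tolerance for reading a band off a gel are assumptions for illustration.

```python
def expected_sizes(upstream_exon_bp, downstream_exon_bp, intron_bp):
    """Return (cDNA amplicon size, genomic amplicon size) in base pairs.

    upstream_exon_bp / downstream_exon_bp are the exonic bases covered by
    the amplicon on each side of the exon junction (hypothetical inputs).
    """
    cdna = upstream_exon_bp + downstream_exon_bp
    return cdna, cdna + intron_bp


def classify_band(observed_bp, cdna_bp, genomic_bp, tolerance=0.1):
    """Label a gel band by the expected product it is closest to, within a
    relative tolerance (an assumed allowance for gel sizing error)."""
    if abs(observed_bp - cdna_bp) <= tolerance * cdna_bp:
        return "cDNA"
    if abs(observed_bp - genomic_bp) <= tolerance * genomic_bp:
        return "genomic"
    return "unexpected"


cdna_bp, genomic_bp = expected_sizes(120, 180, 1500)
print(cdna_bp, genomic_bp)                         # 300 1800
print(classify_band(1750, cdna_bp, genomic_bp))    # genomic
```

When both primers and the target lie within a single terminal exon, the two expected sizes are identical, which is why the minus reverse transcriptase control is needed in that case.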
Despite the aforementioned challenges, 3' RACE is a powerful tool for mapping 3' ends on an individual gene basis. The advent of next-generation sequencing technology has led to an increase in the repertoire of transcripts known to have alternative 3' ends resulting from alternative polyadenylation, which may or may not involve alternative splicing. To add to the complexity, in cancer cells, chromosomal rearrangements are frequent and can result in oncogenic gene fusion products23. The fusion between the two genes may involve the ORF or, in some cases, involve only sequences within the 3' UTR9,10. Furthermore, there are also conjoined/co-transcribed genes, which are transcribed simultaneously but may be translated into different proteins11. 3' RACE can be tailored to map all these unique, abnormal transcripts as well as normal transcripts. This is important because these abnormal transcripts may end up serving as disease specific biomarkers or specific drug targets, e.g., the drug Imatinib targets the BCR-ABL1 fusion gene in cancer.
Hence, 3' RACE is a highly versatile technique which can be used to amplify normal transcripts, identify novel 3' UTRs, and novel gene fusions within the 3' UTR5,9. In this report, 3' RACE was used to amplify and map the stop codon, the PAS, and the entire 3' UTR of the ANKHD1 transcript using different DNA polymerases. The described approach can be used to map the 3' end of any polyadenylated transcript.
The authors have nothing to disclose.
We would like to acknowledge Bettine Gibbs for her technical help.
HeLa cells | ATCC | CCL-2 | Cervical cancer cell line. |
Jeko-1 cells | ATCC | CRL-3006 | Mantle cell lymphoma cell line. |
Granta-519 cells | DSMZ | ACC-342 | Mantle cell lymphoma cell line. |
Fetal Bovine Serum | Sigma Aldrich | F6178 | Fetal bovine serum for cell culture. |
Penicillin/Streptomycin | ThermoFisherScientific | 15140122 | Antibiotic and antimycotic. |
GlycoBlue | Ambion | AM9545 | Coprecipitant. |
DMEM | ThermoFisherScientific | 10569044 | Gibco brand cell culture media with GlutaMAX. |
Nuclease-Free Water | ThermoFisherScientific | AM9938 | Ambion DNase- and RNase-free water (non-DEPC treated). |
Dulbecco’s Phosphate-Buffered Solution | Corning | 21-030 | 1X PBS. |
Chloroform | Sigma Aldrich | C7559-5VL | |
2-propanol | Sigma Aldrich | I9516 | |
Reagent Alcohol | Sigma Aldrich | 793175 | Ethanol |
Ethidium Bromide solution | Sigma Aldrich | E1510 | |
TRIzol Reagent | ThermoFisherScientific | 15596026 | Monophasic phenol and guanidine isothiocyanate reagent. |
2X Extender PCR-to-Gel Master Mix | Amresco | N867 | 2X PCR-to-Gel Master Mix containing loading dye used in routine quick PCR assays and primer search. |
10mM dNTP | Amresco | N557 | |
RQ1 RNase-Free DNase | Promega | M6101 | DNase treatment kit. |
Gel Loading Dye Orange (6X) | New England BioLabs | B7022S | |
2X Phusion High-Fidelity PCR Master Mix with HF buffer | ThermoFisherScientific | F531S | 2X PCR MasterMix containing a chimeric DNA polymerase consisting of a DNA binding domain fused to a Pyrococcus-like proofreading polymerase and other reagents. |
PfuUltra II Fusion HS DNA polymerase | Agilent Technologies | 600670 | Modified DNA Polymerase from Pyrococcus furiosus (Pfu). |
RevertAid RT Reverse Transcription Kit | ThermoFisherScientific | K1691 | Used for reverse transcription of mRNA into cDNA. Kit includes Ribolock RNase inhibitor, RevertAid M-MuLV reverse transcriptase and other reagents listed in manuscript. |
Perfect size 1Kb ladder | 5 Prime | 2500360 | Molecular weight DNA ladder. |
Alpha Innotech FluorChem Q MultiImage III | Alpha Innotech | | Used to visualise ethidium bromide stained agarose gel. |
Low Molecular Weight Ladder | New England BioLabs | N3233L | Molecular weight DNA ladder. |
Vortex Mixer | MidSci | VM-3200 | |
Mini Centrifuge | MidSci | C1008-R | |
Dry Bath | MidSci | DB-D1 | |
NanoDrop 2000C | ThermoFisherScientific | ND-2000C | Spectrophotometer. |
Wide Mini-Sub Cell GT Horizontal Electrophoresis System | BioRad | 1704469 | Electrophoresis equipment-apparatus to set up gel |
PowerPac Basic Power Supply | BioRad | 1645050 | Power supply for gel electrophoresis. |
Agarose | Dot Scientific | AGLE-500 | |
Mastercycler Gradient | Eppendorf | 950000015 | PCR thermocycler. |
Centrifuge | Eppendorf | 5810 R | |
Centrifuge | Eppendorf | 5430R | |
Wizard SV Gel and PCR Fragment DNA Clean-Up System | Promega | A9281 | |
Zero Blunt TOPO PCR Cloning Kit, with One Shot TOP10 Chemically Competent E. coli cells | ThermoFisherScientific | K280020 | |
MyPCR Preparation Station Mystaire | MidSci | MY-PCR24 | Hood dedicated to PCR work. |
Hamilton SafeAire II fume hood | ThermoFisherScientific | | Fume hood. |
Beckman Coulter Z1 Particle Counter | Beckman Coulter | 6605698 | Particle counter. For counting cells before plating for RNA extraction. |
Applied Biosystems Sequence Scanner Software v2.0 | Applied Biosystems (through ThermoFisherScientific) | | Software to analyze Sanger sequencing data. |