Tandem splicing events occur at sites less than 12 nucleotides apart. Quantifying ratios of such splice variants is feasible using an absolute quantitative PCR approach. This manuscript describes how splice variants of the gene STAT3, in which two splicing events result in Serine-701 inclusion/exclusion and α/β C-termini, can be quantified.
Human signal transducer and activator of transcription 3 (STAT3) is one of many genes containing a tandem splicing site. Alternative donor splice sites 3 nucleotides apart result in either the inclusion (S) or exclusion (ΔS) of a single residue, Serine-701. Further downstream, splicing at a pair of alternative acceptor splice sites results in transcripts encoding either the 55 terminal residues of the transactivation domain (α) or a truncated transactivation domain with 7 unique residues (β). As outlined in this manuscript, measuring the proportions of STAT3's four spliced transcripts (Sα, Sβ, ΔSα and ΔSβ) was possible using absolute qPCR (quantitative polymerase chain reaction). The protocol therefore distinguishes and measures highly similar splice variants. Absolute qPCR makes use of calibrator plasmids and thus specificity of detection is not compromised for the sake of efficiency. The protocol necessitates primer validation and optimization of cycling parameters. A combination of absolute qPCR and efficiency-dependent relative qPCR of total STAT3 transcripts allowed a description of the fluctuations of STAT3 splice variants' levels in eosinophils treated with cytokines. The protocol also provided evidence of a co-splicing interdependence between the two STAT3 splicing events. The strategy based on a combination of the two qPCR techniques should be readily adaptable to investigation of co-splicing at other tandem splicing sites.
Short-range (tandem) alternative splicing, where alternate acceptor or donor sites are in close proximity to one another, is common in mammals1, invertebrates2 and plants3. It is estimated that 20% of mammalian genes contain alternative splice sites 2-12 nucleotides apart4. Many of these sites are 3 nucleotides apart and result in inclusion or exclusion of a single codon. There is debate about the nature of splicing regulation at these sites5,6 with some arguing that the splicing motif differences are so subtle that selection is stochastic7, while others infer regulation based on tissue specificity8.
Tandem splice site selection has been analyzed semi-quantitatively using modified capillary electrophoresis7, and high-resolution gel electrophoresis8. RNA-Seq (RNA sequencing) reads can be used to quantify the splicing ratios at each splice site. In this way, RNA-Seq data has provided insight into the regulation of tandem splice sites9. It has also enabled prediction of expected splice variant ratios based on nucleotide motif10. Most of the emphasis on splicing that includes or excludes a single codon has been on the more commonly occurring tandem acceptor splice sites, known as NAGNAGs (where N = any nucleotide).
Tandem donor alternative splice sites including or excluding a single codon (GYNGYN recognition motif, where Y = pyrimidine) are less common than tandem acceptors. Signal transducer and activator of transcription 3 (STAT3) is a key gene undergoing tandem donor alternative splicing1,11. The tandem donor splice sites join exons 21 and 22 and result in the inclusion or exclusion of the codon for Serine-701 (S or ΔS respectively)1,11. Downstream alternative acceptor sites (40 nucleotides apart from each other) joining exons 22 and 23a/b result in the inclusion of either the 55 terminal residues of the transactivation domain (α) or a truncated transactivation domain with 7 unique C-terminal residues (β)11. Therefore, four splice variants are possible.
STAT3 protein is a transcription factor and major signal integrator in numerous cell types12 and when mutated its constitutive activation contributes to several cancer phenotypes (reviewed in reference13). Job's Syndrome, an immunodeficiency disorder characterized by high levels of IgE, is also caused by mutations in STAT3 (reviewed in reference14). Distinct roles for STAT3 α and β splice variant proteins have been previously described15. Initially, STAT3 β was thought to act in a dominant-negative manner16, antagonizing STAT3 α's transcriptional activity, but subsequent work suggested it has independent target genes17. Despite the subtlety of tandem splicing, there is reason to believe the absence or presence of Serine-701 (Ser701) influences function. Not only is Ser701 in close proximity to Tyrosine-705 (the residue phosphorylated in STAT3 activation18), but a recent study suggests that STAT3 S and ΔS splice variants are both necessary for viability of STAT3-addicted Diffuse Large B Cell Lymphoma (DLBCL) cells19. The biological relevance remains to be explored. Given that splice variant composition could influence function, we endeavored to discover whether the ratio was perturbed by cytokine stimulation in eosinophils.
We initially attempted to explore the linkage between the two splicing events by using PCR specific for STAT3 α and β splice variants, followed by cleavage of products with AfeI, a restriction enzyme specific for the S splice variants. Densitometry of products indicated inclusion of Ser701 was roughly ten times more common than its omission (ΔS) in both STAT3 α and β (data not shown). However, this semi-quantitative approach was not sufficiently reproducible, and could not be used effectively to measure all four splice variants simultaneously. To analyze proportions of each of the four splice variants, it was necessary to establish a quantitative PCR (qPCR) protocol that yielded tight technical replicates (several assays of a given sample).
Relative qPCR relies on comparison of a gene of interest to a standard or housekeeping gene known to be expressed at a particular level20 and is appropriate when the gene of interest and housekeeping gene are amplified with similar efficiency. A double stranded (ds) DNA-binding fluorescent (cyanine) dye binds to PCR amplicons21, and after a certain number of cycles, sufficient amplification has occurred for fluorescence to be detectable. The higher the initial level of the transcript, the lower the threshold cycle (Ct) value. Since the concentration of cDNA preparations differs, one needs to compare the transcript's concentration with the concentration of a transcript known to be expressed at a consistent level in all samples, like glucuronidase-β (GUSB) in eosinophils22.
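The comparison to a housekeeping gene described above can be sketched in a few lines. The function below is a minimal illustration of the widely used ΔΔCt calculation, not necessarily the exact computation used in this protocol; the function name, the `efficiency` parameter and all Ct values are hypothetical and chosen only for illustration.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_control, ct_reference_control,
                        efficiency=1.0):
    """Fold change of a target transcript in a sample versus a control sample,
    normalized to a reference (housekeeping) transcript.

    efficiency=1.0 assumes perfect doubling each cycle (100% efficiency);
    the same efficiency is assumed for target and reference.
    """
    base = 1.0 + efficiency
    delta_sample = ct_target - ct_reference            # dCt in the sample
    delta_control = ct_target_control - ct_reference_control  # dCt in the control
    return base ** -(delta_sample - delta_control)     # base^(-ddCt)


# Hypothetical Ct values: pan-STAT3 and GUSB in stimulated vs. resting cells.
fold = relative_expression(ct_target=22.0, ct_reference=24.0,
                           ct_target_control=23.5, ct_reference_control=24.0)
print(f"pan-STAT3 fold change vs. control: {fold:.2f}")
```

A lower Ct for the target in the stimulated sample, with an unchanged housekeeping Ct, translates into a fold change above 1.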
Relative qPCR is not feasible for highly similar sequences, as seen in splice variants resulting from tandem splicing. The stringent conditions required to specifically amplify the splice variants result in decreased efficiency. Instead, absolute quantification must be used23. This entails preparing a standard curve with known concentrations of the spliced transcript of interest, and ensuring PCR conditions optimize specificity24. As described, absolute and relative qPCR data for a particular gene can be merged to inform understanding of the gene's expression in a particular cell type, in this case STAT3 in variously stimulated eosinophils25.
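As a rough sketch of how a standard curve supports absolute quantification, the code below fits Ct against log10(copy number), derives amplification efficiency from the slope using the standard relation E = 10^(-1/slope) - 1, and interpolates the copy number of an unknown. The dilution series is synthetic: it was generated from a slope and intercept resembling those reported for the STAT3 Sα curve (Figure 3) and is not measured data. All function names are hypothetical.

```python
import math


def fit_standard_curve(copies, cts):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def efficiency_from_slope(slope):
    """Amplification efficiency as a fraction (1.0 = perfect doubling)."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_ct(ct, slope, intercept):
    """Interpolate the copy number of an unknown from its Ct."""
    return 10 ** ((ct - intercept) / slope)


# Synthetic ten-fold plasmid dilution series (illustrative values only).
plasmid_copies = [1e3, 1e4, 1e5, 1e6, 1e7]
plasmid_cts = [30.86, 27.08, 23.30, 19.52, 15.74]

slope, intercept = fit_standard_curve(plasmid_copies, plasmid_cts)
efficiency = efficiency_from_slope(slope)
unknown_copies = copies_from_ct(23.30, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.1f}, efficiency={efficiency:.1%}")
```

Because each splice variant has its own calibrator plasmid and curve, a sub-optimal efficiency under stringent conditions does not bias the interpolated copy number, provided samples and standards are run under the same conditions.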
Herein, STAT3 splice variant quantification is described with the expectation that the method can be adapted to targeted studies of other tandem splicing events. Optimization was a lengthy process, where several primer pairs at various concentrations and numerous iterations of cycling parameters were tested over the course of a few months. The key features of the protocol are primer specificity validation and quantification based on standard curves with known concentrations of the splice variants. Relative qPCR in conjunction proved helpful for our application but is not necessary.
NOTE: Peripheral blood eosinophils were received without identifying information in accord with a protocol approved (#2013-1570) by the University of Wisconsin-Madison Center for Health Sciences Institutional Review Board. Signed informed consent from the donor was obtained for the use of each sample in research.
1. Creating Plasmids as Template Standards
2. Analyzing Primer Specificity for Absolute qPCR
3. Assessing Absolute qPCR Assay Specificity and Repeatability
4. Performing Relative qPCR Assays
5. Analyzing Absolute qPCR Data for Unknown Samples
6. Merging Absolute and Relative qPCR Data
Good quality qPCR data will generate a sigmoidal amplification plot (Figure 2a), signifying exponential increase in transcripts over the course of cycling. The presence of too much template can result in a high fluorescence background, meaning an inappropriate baseline is established in the first few cycles. If the data do not provide an exponential curve (Figure 2b), further optimization is necessary (outlined in steps 3.1 and 3.4). For further information about troubleshooting qPCR results, refer to reference32. The standard curves generated for the template calibrator plasmids will indicate the efficiency of amplification (curve for STAT3 Sα shown in Figure 3). Efficiencies between 83 and 95% were observed under the conditions described. The equation for specificity (Step 3.4) assumes equal efficiency, which is unlikely25, so the specificity is likely to be greater than the specificity factor suggests.
To evaluate the congruency of STAT3 levels, the absolute values of each of the four splice variants were measured as well as the level of total STAT3, the latter using primers amplifying a region common to all four splice variants (Figure 4). Ideally, the R2 value of the linear regression (an indication of correlation) and the slope (the ratio of pan-STAT3 to summed splice variants) should both be close to 1.
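The congruency check can be sketched as an ordinary least-squares regression of pan-STAT3 copy number on the sum of the four variant copy numbers. The copy numbers below are invented for illustration, and the function name is hypothetical.

```python
def regress(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept, with R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical copy numbers per sample (not measured data).
summed_variants = [1200, 3400, 5600, 9100]   # S-alpha + S-beta + dS-alpha + dS-beta
pan_stat3 = [1250, 3300, 5800, 9000]         # primers common to all four variants

slope, intercept, r_squared = regress(summed_variants, pan_stat3)
print(f"slope={slope:.3f}, R^2={r_squared:.3f}")
```

A slope far from 1 would suggest that one or more variant assays over- or under-count relative to the pan-STAT3 assay; a low R2 would point to poor reproducibility across samples.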
The absolute qPCR data are presented as pie charts to show the proportions of the four splice variants over time post-stimulation with cytokines (Figure 5a). Resting eosinophils (0 hr) had the smallest proportion of STAT3 Sα, although this variant was always the most abundant. If the two splicing events were independent, multiplying the fraction of STAT3 β splice variants (Sβ+ΔSβ) by the fraction of STAT3 ΔS splice variants (ΔSα+ΔSβ) would give a value that agrees with the experimentally determined fraction of ΔSβ. Instead, this product was consistently lower than the measured value for ΔSβ. Higher levels of ΔSβ than expected from independent splicing events suggest that a co-splicing bias exists.
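The independence argument amounts to comparing an observed joint fraction with the product of two marginal fractions. A minimal sketch follows; the copy numbers and dictionary keys are hypothetical and serve only to show the arithmetic.

```python
def cosplicing_check(copies):
    """Compare the observed dS-beta fraction with the fraction expected
    if the S/dS and alpha/beta splicing choices were independent.

    `copies` maps variant name to absolute copy number.
    """
    total = sum(copies.values())
    frac = {name: n / total for name, n in copies.items()}
    frac_beta = frac["S_beta"] + frac["dS_beta"]       # all beta variants
    frac_delta_s = frac["dS_alpha"] + frac["dS_beta"]  # all dS variants
    expected = frac_beta * frac_delta_s                # product of marginals
    observed = frac["dS_beta"]
    return expected, observed, observed / expected


# Hypothetical copy numbers for one sample (not measured data).
sample = {"S_alpha": 7500, "S_beta": 1500, "dS_alpha": 600, "dS_beta": 400}
expected, observed, ratio = cosplicing_check(sample)
print(f"expected={expected:.3f}, observed={observed:.3f}, ratio={ratio:.2f}")
```

A ratio consistently above 1 across samples is what would indicate a bias toward co-occurrence of the ΔS and β splicing choices.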
Merging absolute and relative qPCR data demonstrated that levels of all STAT3 splice variants increased post-stimulation with cytokines IL3 and TNFα, with levels peaking 6 hr post-stimulation (Figure 5b–e). For three of the four splice variants, transcript levels were roughly 3 times higher in IL3+TNFα treated eosinophils (6 hr) compared to eosinophils in media at the same time point. STAT3 Sα levels were 3.5 times higher in IL3+TNFα treated eosinophils compared to eosinophils in media at this time point. The greatest uncertainty (largest standard error of measurement) was seen in ΔSβ (Figure 5e), which comprises the smallest fraction of total STAT3 in all samples. This was not surprising, as lower levels are associated with higher Ct values. Requiring more cycles to reach the threshold cycle will compound uncertainty due to cycle-to-cycle variation in efficiency of amplification.
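One plausible way to merge the two data types, shown here as an assumption rather than as the exact procedure of Step 6, is to multiply each variant's fraction (from absolute qPCR) by the relative pan-STAT3 expression (from relative qPCR) and to propagate the standard errors of both factors in quadrature. The function name and all numbers are hypothetical.

```python
import math


def merge_absolute_and_relative(fraction, fraction_sem, pan_fold, pan_fold_sem):
    """Variant level = variant fraction x relative pan-STAT3 expression.

    For a product of two independent quantities, relative errors are
    combined in quadrature (standard propagation of error).
    """
    level = fraction * pan_fold
    relative_error = math.sqrt((fraction_sem / fraction) ** 2
                               + (pan_fold_sem / pan_fold) ** 2)
    return level, level * relative_error


# Hypothetical values: dS-beta is 4% of STAT3; pan-STAT3 is up 3-fold.
level, level_sem = merge_absolute_and_relative(
    fraction=0.04, fraction_sem=0.004, pan_fold=3.0, pan_fold_sem=0.3)
print(f"dS-beta level: {level:.3f} +/- {level_sem:.3f} (arbitrary units)")
```

The sketch also illustrates why the least abundant variant carries the largest relative uncertainty: a small fraction with a given absolute error contributes a large relative error to the product.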
Figure 1: Schematic of primer pairs used to perform qPCR of STAT3 splice variants and pan-STAT3. Primers used to specifically amplify each of the STAT3 splice variants (Sα, Sβ, ΔSα and ΔSβ respectively) are shown. Forward primers (STAT3 "S" and "ΔS") span the junction between exons 21 and 22.
Figure 2: Amplification plots of qPCR data. (a) Sigmoidal amplification plots indicate reliable amplification. These data were obtained from qPCR of two serial dilutions of plasmid containing STAT3 Sα, with each pair of colored lines representing the fluorescence levels of duplicate diluted samples over the course of 40 cycles (x-axis). The most concentrated sample (green-grey) was sufficiently amplified by cycle 17 (with dsDNA-binding dye proportional to fluorescence, shown on the y-axis) to exceed the threshold fluorescence value (baseline shown as green arrow). Its Ct value would be 17. (b) Non-exponential plots suggest that the background-fluorescence threshold was not correctly established in the first few cycles. This could be due to the presence of an inhibitor, or highly concentrated template or primers.
Figure 3: Standard curve of log (copy number of STAT3 Sα) vs Ct. There is a linear relationship between the log of each STAT3 splice variant's copy number and threshold cycle (Ct). Creating a standard curve from plasmid DNA imitates the cDNA of samples and thus provides a better measurement than a curve created from diluted PCR amplicons. The data presented are Ct values obtained from qPCR of two serial dilutions of plasmid containing STAT3 Sα. From this curve, the copy number present in each sample can be interpolated, and amplification efficiency calculated (83.9%). Although y-intercepts are less reproducible than slope, the intercept suggests 42.2 cycles would be necessary to be certain no target DNA is present. Error bars indicate SEM, n = 3. Comparable curves were constructed for STAT3 Sβ, ΔSα and ΔSβ (not shown).
Figure 4: Comparison of quantified pan-STAT3 vs cumulative STAT3 splice variants. The regression of added STAT3 splice variants vs total STAT3 should have slope (ratio of pan- to cumulative) and R2 value (correlation) close to 1. Values from 17 samples (eosinophils and DLBCL) included. Error bars indicate SEM of x-to-y determinations, n ≥ 2 for each. Figure adapted from reference20.
Figure 5: STAT3 splice variant levels fluctuated over the course of cytokine treatment. (a) Pie charts indicating percentages of each STAT3 splice variant in eosinophils during treatment with IL3 and TNFα. (b–e) Changes in STAT3 splice variants in eosinophils treated with various combinations of cytokines, measured by combining relative and absolute qPCR data. STAT3 Sα (b), Sβ (c), ΔSα (d), and ΔSβ (e) levels fluctuated over time post-stimulation. Levels initially increased, peaking 6 hr post-stimulation. The IL3+TNFα combination elicited higher expression of all four STAT3 splice variants than IL3 alone. SEM calculated for each data point accounting for propagation of error.
Table 1: Primers used for amplification (a), absolute (b) and relative (c) quantitative PCR. Cloning primers have restriction sequences and 5'-extensions for efficient cutting. KpnI and NheI restriction sites are indicated in bold.
Table 2: PCR cycling parameters (left) and reagent volumes (right) for amplification (a), relative (b) and absolute (c) quantitative PCR.
Table 3: Template for absolute qPCR plasmid calibration assay. This assay is necessary to assess reproducibility and efficiency, as well as generating standard curves from which to interpolate data. The "non-target" mixes will give an estimate of specificity. Optimization may be necessary to achieve consistency.
Table 4: Template for relative qPCR (pan-STAT3 and housekeeping gene GUSB) calibration assay. Unlike absolute qPCR, the point of this assay is to determine conditions at which amplification efficiency is ~100%.
Table 5: Template for relative qPCR sample assay for measuring pan-STAT3 and housekeeping gene GUSB. Standard curves are repeated together with samples to ensure comparable efficiency of the assays.
Table 6: Template for absolute qPCR sample assay for measuring S variants. Standard curves are repeated together with samples to ensure comparable efficiency of the assays.
Table 7: Template for absolute qPCR sample assay for measuring ΔS variants. Standard curves are repeated together with samples to ensure comparable efficiency of the assays.
Table 8: Template for absolute qPCR sample assay for measuring pan-STAT3.
We developed this protocol in order to assess the levels and proportions of STAT3 splice variant transcripts in eosinophils and lymphoma cells and learn whether cytokine stimulation affected the levels and proportions. STAT3 is of particular interest because of its pleiotropic and uncertain functionality, with conflicting reports on whether it acts as an oncoprotein or tumor suppressor in cancer (reviewed in reference33). Differences in STAT3 α and β splice variant function had been characterized previously34,35, and our protocol facilitated a knock-down/re-expression analysis that suggests a need for an optimal ratio of S and ΔS transcripts19.
Accurate quantification of distinct splice variants will facilitate further investigations relating heterogeneous STAT3 function to splice variant composition. The protocol integrates absolute and relative qPCR data, combining the ability of absolute qPCR to calculate splice variant proportions, and relative qPCR to measure changes in overall STAT3 expression. This approach allows one to distinguish subtle differences in sequence and simultaneously measure splicing ratios at two alternative splice sites more than 50 nucleotides apart. Determining the ratios of the splicing events individually would not have yielded the remarkable finding that a co-splicing bias existed such that ΔSβ levels are higher than would be anticipated if the two sites were spliced independently25.
Critically, absolute qPCR with the use of plasmid calibration curves enables quantification (at sub-optimal efficiency) of splice variations that result in highly similar sequences. We anticipate that a novel qPCR assay for subtle splicing should take roughly two months to optimize. Key steps in assay development are the creation of STAT3 plasmids used in generating standard curves for absolute qPCR; experimentally determining optimal primer sequences and cycling parameters to ensure specificity and reproducibility; and the integration of relative qPCR data derived from quantifying pan-STAT3 expression relative to GUSB expression. The correlation of copy number quantified by pan-STAT3 versus cumulative quantification (Sα+ΔSα+Sβ+ΔSβ) shows that the protocol produces reliable results.
A caveat of the technique is the extensive validation process. It is necessary to assess intra-assay variability (repeatability), interassay variability (reproducibility) and specificity. The protocol outlines ways to get numeric outputs for these parameters. We deemed efficiency ≥75%, specificity factor ≥4, coefficient of variation (reproducibility) ≤10% and Ct standard deviation (repeatability) ≤0.2 as suitable thresholds30. Mutations or deletions in the STAT3 sequence amino acids 1-690 will not be discovered by this protocol, although they may influence splicing ratios. Transcript proportions might not be proportional to proteoform proportions36.
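The acceptance thresholds listed above lend themselves to a simple automated check. The sketch below assumes that the specificity factor has already been computed as described in Step 3.4 and is supplied as a number, that repeatability is the sample standard deviation of replicate Ct values within one assay, and that reproducibility is the coefficient of variation of copy numbers across assays. The function name and example values are hypothetical.

```python
import statistics


def assay_qc(efficiency, specificity_factor, ct_replicates, interassay_copies):
    """Check an assay against the thresholds stated in the text:
    efficiency >= 75%, specificity factor >= 4, CV <= 10%, Ct SD <= 0.2.
    """
    ct_sd = statistics.stdev(ct_replicates)  # intra-assay repeatability
    cv = statistics.stdev(interassay_copies) / statistics.mean(interassay_copies)
    return {
        "efficiency": efficiency >= 0.75,
        "specificity": specificity_factor >= 4,
        "repeatability": ct_sd <= 0.2,
        "reproducibility": cv <= 0.10,
    }


# Hypothetical values for one variant-specific assay.
checks = assay_qc(efficiency=0.84,
                  specificity_factor=4.5,
                  ct_replicates=[21.05, 21.10, 21.20],
                  interassay_copies=[9800, 10400, 10100])
print(checks)
```

Returning one flag per criterion, rather than a single pass/fail value, makes it easier to see which parameter needs further optimization.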
Since samples have differing starting amounts of total cDNA, absolute qPCR is suitable for comparing copy numbers of splice variants within a sample but not for inter-sample comparison unless coupled with relative qPCR using an established housekeeping gene. The method described conforms to MIQE qPCR guidelines for reproducibility30. PCR cycling parameters and primer concentrations may need to be modified to obtain reproducible data if other equipment is used. Perfect specificity is not possible without drastically compromising efficiency, but the target was amplified more efficiently than non-target templates by more than four orders of magnitude.
Linear DNA is more easily amplified than circular. If an alternate plasmid does not provide satisfactory standard curves (R2 < 0.95), consider linearizing the plasmid by single site restriction prior to quantification. Optimizing qPCR is crucial for obtaining good quality data (Figure 2). Most qPCR protocols rely on two-step cycling, and machines are optimized accordingly. Non-uniform heating of the heating block may be exacerbated in three-step cycling, contributing to poor repeatability. Because contaminants can lead to inconsistent results, assays should be set up under sterile conditions with filter pipette tips and ultrapure water, ideally in a dedicated laminar flow hood. For more information about qPCR optimization, refer to Bustin et al.32
Quantifying STAT3 may lead to greater insight in a number of contexts. STAT3 auto-regulates its own expression37, and the protocol described above may help to elucidate whether ratios of STAT3 splice variants contribute to regulating this positive feedback loop. The protocol could be used to study shifts in splice variant ratios as observed in cells at differing densities38 or over the course of development: it is known that the STAT3 α/β ratio changes at the protein level during hematopoiesis16. Sundin et al. found that an intronic single nucleotide polymorphism biased splicing of exon 12 in STAT3 of a patient with Job's syndrome39. It is conceivable that one of the many SNPs present in the introns between exon 21 and 22, or exon 22 and 23 may contribute to splicing ratios of ΔS/S and α/β respectively. The assay could be used to quantify STAT3 transcripts in cancerous cells, where mutations or changes in splicing regulation may introduce bias to the splicing process40. Mutations in splicing factors (like SF3B1), as observed in myelodysplastic syndromes41 may also lead to changes that can be measured by this protocol.
More broadly, this approach specifically detects co-association in splicing, which is not feasible with conventional RNA-Seq, nor standard qPCR. While the phenomenon of mutually exclusive exon splicing demonstrates coordination of splicing decisions, the co-association of other splicing events has not been well-researched. A recently described alternative method, in which RNA-Seq was modified so as to interrogate full-length cDNA, suggests distant splicing events are more co-dependent than previously thought42.
STAT3 contains a donor tandem splice site. Acceptor tandem splice sites are more frequent43 and the principles of the outlined protocol could serve as a starting point for developing assays for coincidence detection of NAGNAG splicing and other splicing events within 200 nucleotides. Other potential applications include quantification of coincidence of other subtle sequence differences, like indels or double/triple nucleotide polymorphisms44.
The authors have nothing to disclose.
The authors would like to acknowledge the National Institutes of Health-NHLBI for the Program Project Grant on the Role of Eosinophils in Airway Inflammation and Remodeling: P01HL088584 (PI: N. Jarjour), and the University of Wisconsin Carbone Cancer Center and Department of Medicine for intramural funding. We thank Douglas Annis for cloning the four STAT3 variants.
Name | Company | Catalog Number | Comments |
MJ Research PTC-200 Thermal Cycler | GMI | N/A | Used for standard PCR |
7500 Real-Time PCR System | Applied Biosystems | N/A | qPCR machine |
GS-6R | Beckman Coulter | N/A | centrifuge for 96-well plates |
Nanodrop 2000 spectrophotometer | ThermoFisher Scientific | N/A | |
RPMI-1640 medium | Sigma Aldrich | R8758 | cell culture medium |
PfuTurbo DNA Polymerase | Agilent Technologies | 600410 | DNA polymerase for standard PCR |
KpnI | New England Biolabs | R0142S | |
NheI | New England Biolabs | R0131S | |
SYBR Green PCR Master Mix | Qiagen | 330523 | qPCR, DNA polymerase/dsDNA-binding dye mix |
RNeasy Mini Kit | Qiagen | 74204 | RNA extraction kit |
SuperScript III First-Strand Synthesis System | Invitrogen (ThermoFisher Scientific) | 18080-044 | cDNA synthesis kit |
Primers | Integrated DNA Technologies | N/A | |
NEBuffer 1.1 | New England Biolabs | B7201S | |
GenePure LE Agarose | ISC BioExpress | E-3120-500 | component of TAE gel |
Pipettors | Major lab suppliers (MLS) | N/A | |
Filter pipette tips | Neptune Scientific | BT10XL, BT20, BT200 | |
EU One Piece Thin Wall Plate | MidSci | ABI7501 | |
ThermalSeal A Sealing Film | MidSci | TSA-100 | 96 well plate seal |
pET-Elmer (variant of pET-28a) | Novagen; modified in Mosher lab | N/A | Details in PMID: 20947497 |
Wizard Plus SV Minipreps DNA purification system | Promega | A1460 | Plasmid purification |
BigDye Terminator v3.1 Cycle Sequencing Kit | ThermoFisher Scientific | 4337455 | Sequencing kit |
QIAEX II Gel Extraction kit | Qiagen | 20021 | Amplicon purification |
DH5α competent cells | ThermoFisher Scientific | 18265-017 | available from several providers, see PMID: 2162051 |
kanamycin | Research Products International Corp. | K22000-5.0 | |
Tris base | ThermoFisher Scientific | BP152-5 | component of TAE buffer |
Acetic acid, glacial | ThermoFisher Scientific | A38C-212 | component of TAE buffer |
EDTA (Ethylenediaminetetraacetic acid) | Sigma Chemical Company (Sigma Aldrich) | E-5134 | component of TAE buffer |
Bacto Tryptone | BD Biosciences | 211705 | component of Luria Broth |
Bacto Yeast extract, technical | BD Biosciences | 288620 | component of Luria Broth |
Sodium chloride | ThermoFisher Scientific | S271-10 | component of Luria Broth |
Sodium hydroxide | ThermoFisher Scientific | SS255-1 | component of Luria Broth |
Bacto Agar | BD Biosciences | 214010 | component of Luria Broth plate |
Lasergene SeqBuilder | DNASTAR | N/A | Figure 1 generated using Lasergene SeqBuilder software version 12.2.0 (DNASTAR) |