Here, we present a protocol optimized for the processing of coding (mRNA) and non-coding (ncRNA) globin-reduced RNA-seq libraries from a single whole blood sample.
The advent of innovative and increasingly powerful next generation sequencing techniques has opened new avenues for examining the gene expression underlying biological processes of interest. These innovations allow researchers to observe expression not only from the mRNA sequences that code for genes that affect cellular function, but also from the non-coding RNA (ncRNA) molecules that remain untranslated yet still have regulatory functions. Although researchers have the ability to observe both mRNA and ncRNA expression, it has been customary for a study to focus on one or the other. When studies are interested in both mRNA and ncRNA expression, they often use separate samples to examine coding and non-coding RNAs because of the differences in library preparation. This can lead to the need for more samples, which can increase time, consumables, and animal stress. Additionally, it may cause researchers to prepare samples for only one analysis, usually the mRNA, limiting the number of biological questions that can be investigated. However, ncRNAs span multiple classes with regulatory roles that affect mRNA expression. Because ncRNAs are important to fundamental biological processes and to disorders of these processes during infection, they may make attractive targets for therapeutics. This manuscript demonstrates a modified protocol for the generation of mRNA and non-coding RNA expression libraries, including viral RNA, from a single sample of whole blood. Optimization of this protocol improved RNA purity, increased ligation for recovery of methylated RNAs, and omitted size selection to allow capture of more RNA species.
Next generation sequencing (NGS) has emerged as a powerful tool for investigating the changes that occur at the genomic level of biological organisms. Sample preparation for NGS methods varies depending on the organism, tissue type, and, more importantly, the questions the researchers are keen to address. Many studies have turned to NGS as a means of studying differences in gene expression between states such as healthy and sick individuals1,2,3,4. The sequencing takes place on a whole genome basis and allows a researcher to capture most, if not all, of the genomic information for a particular genetic marker at a time point.
The most common markers of expression observed are the messenger RNAs (mRNAs). The most widely used procedures for preparing RNA-seq libraries are optimized for the recovery of mRNA molecules through a series of purifications, fragmentations, and ligations5,6. However, how a protocol is performed depends heavily on the sample type and the questions being posed about that sample. In most cases total RNA is extracted; yet not all RNA molecules are of interest, and in cases such as mRNA expression studies, overly abundant RNA species such as ribosomal RNAs (rRNAs) need to be removed to increase the number of detectable transcripts associated with the mRNAs. The most popular and widely used method for removing the abundant rRNA molecules is the reduction of polyadenylated RNA transcripts referred to as polyA depletion7. This approach works well for the analysis of mRNA expression, as it does not affect the mRNA transcripts. However, in studies that are interested in non-coding or viral RNAs, polyA depletion also removes these molecules.
Many studies choose to focus the RNA sequence library preparation on either mRNA expression (coding) or a particular class of small or large non-coding RNA. Although there are other procedures8 like ours that allow for dual sample preparation, many studies prepare libraries from separate samples for separate studies when available. For a study like ours, this would normally require multiple blood samples, increasing time, consumables, and animal stress. The goal of our study was to use whole blood from animals to identify and quantify the different classes of both mRNA and non-coding RNA expressed between healthy and highly pathogenic porcine reproductive and respiratory syndrome virus (HP-PRRSV) challenged pigs9,10, despite having only a single whole blood sample (2.5 mL) from each pig. To do this, we needed to optimize the typical extraction and library creation protocols to generate the data needed for analysis of both mRNA and non-coding RNA (ncRNA) expression11 from a single sample.
This prompted a need for a protocol that allowed for both mRNA and non-coding RNA analysis, because the available standard kits and methods for RNA extraction and library creation were intended chiefly for mRNA and use a polyA depletion step12. This step would have made it impossible to recover non-coding RNA or viral transcripts from the sample. Therefore, an optimized method was needed that allowed for total RNA extraction without sample polyA depletion. The method presented in this manuscript has been optimized to allow for the use of whole blood as a sample type and to build sequencing libraries for both mRNA and ncRNAs of small and large sizes. It also allows for the analysis of all detectable non-coding RNAs and retains viral RNAs for later investigation13. In all, our optimized library preparation protocol allows for the investigation of multiple RNA molecules from a single whole blood sample.
The overall goal behind the use of this method was to develop a process that allowed for the collection of both non-coding RNA and mRNA from one sample of whole blood. This allows us to have mRNA, ncRNA, and viral RNA for each animal in our study sourced from a single sample9. Ultimately, this allows for more scientific discovery without additional animal costs and gives a more complete picture of the expression of each individual sample. The described method allows for the examination of the regulators of gene expression as well as the completion of correlative studies comparing mRNA and non-coding RNA expression using a single whole blood sample. Our study used this protocol to examine the changes in gene expression and possible epigenetic regulators in virally infected 9-week-old male commercial pigs.
Animal protocols were approved by the National Animal Disease Center (USDA-ARS-NADC) Animal Care and Use Committee.
1. Collection of Swine Blood Samples
2. Processing of Swine Blood Samples
3. Organic Extraction for Total RNA and Small RNA (miRNA Isolation Kit)
4. Total RNA Isolation Procedure
5. Globin Reduction (based on a protocol optimized for porcine whole blood samples)14,15
NOTE: Globin reduction is performed so that libraries are not overpopulated with reads mapping to globin genes, which would lower the number of reads available to map to other genes of greater interest14,15.
6. Assessment of RNA
7. Stranded Total RNA Sample Preparation for the mRNA and Long ncRNA Libraries16
8. Small RNA Library Preparation for the sncRNAs17
NOTE: Protocol steps based on manufacturer's instructions17.
9. Sample Pooling for Sequencing
The representative samples in our study are the globin and ribo-depleted whole blood samples. The representative outcome of the protocol consists of a globin-depleted library sample with an RNA integrity number (RIN) above 7 (Figure 1a) and 260/280 nm absorbance ratios at or above 2 (Figure 1b and 1c). Validation of the sample outcome was performed using a spectrophotometer to give the final concentration of each library and chip-based electrophoresis to give the RIN along with a graph of peaks that show which molecules (mRNA or ncRNA) were captured, based on insert size, in the library sample prior to pooling and sequencing. For the current study, focus was placed on the mRNA and small ncRNAs only. For the mRNA libraries, the representative result is an electropherogram peak at ~280 bp (Figure 1d). For the small ncRNAs, representative results consist of a range of peaks from ~100-400 bp (Figure 1e), with peaks at ~143 and ~153 bp corresponding to miRNAs and piRNAs, respectively. Our sample results showed that our optimized technique resulted in libraries with RIN values that ranged from 6.3 (sub-optimal) to 9.2 (above optimal). This proved to be an improvement compared to other studies that used globin depletion methods and were only able to achieve RIN values at or near 6. The chip-based electrophoresis results also showed that from a single blood sample it is possible to capture peaks representing both mRNA and ncRNA (Figures 1d and 1e), with sample insert sizes that covered both small and large non-coding RNA molecules. These results are representative of the optimal RIN scores and insert sizes needed to ensure that quality transcript reads can be sequenced from the prepared libraries on nearly all NGS RNA-seq platforms.
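The acceptance criteria described above (a RIN above 7 and a 260/280 ratio at or above 2) can be expressed as a simple screening step. The following Python sketch is illustrative only and is not part of the published protocol; the sample identifiers and the 260/280 values in the example are hypothetical, while the RIN values of 6.3 and 9.2 are the extremes reported above.

```python
from dataclasses import dataclass

# Thresholds taken from the representative results in the text.
RIN_MIN = 7.0        # RIN must be above this value
A260_280_MIN = 2.0   # 260/280 ratio must be at or above this value


@dataclass
class SampleQC:
    sample_id: str    # hypothetical identifier, for illustration only
    rin: float        # RIN from chip-based electrophoresis
    a260_280: float   # 260/280 ratio from the spectrophotometer


def passes_qc(sample: SampleQC) -> bool:
    """Return True when the sample meets both criteria described in the text."""
    return sample.rin > RIN_MIN and sample.a260_280 >= A260_280_MIN


def summarize(samples):
    """Map each sample identifier to its pass/fail status."""
    return {s.sample_id: passes_qc(s) for s in samples}


if __name__ == "__main__":
    # The 260/280 values below are made up for the example.
    demo = [
        SampleQC("pig_A", rin=9.2, a260_280=2.05),
        SampleQC("pig_B", rin=6.3, a260_280=2.01),
    ]
    for sample_id, ok in summarize(demo).items():
        print(sample_id, "pass" if ok else "review")
```

A sample flagged for review is not necessarily unusable; the text above reports libraries with RIN values as low as 6.3, described as sub-optimal.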
Figure 1: Representative results for mRNA and non-coding RNA expression libraries from single samples of porcine whole blood. (a) Electrophoresis file run summary representation of RIN numbers pre-globin reduction and post-globin reduction. Representative electropherograms (b) pre-globin reduction and (c) post-globin reduction of a single porcine whole blood sample. (d) Representative electropherograms of globin and ribo-depleted whole blood mRNA library samples prior to pooling and sequencing. (e) Representative electropherograms of globin depleted sncRNA whole blood libraries of the same samples featured in panel d prior to pooling and sequencing.
The first critical step in optimizing the protocol was the addition of globin depletion steps, which made it possible to obtain quality reads from whole blood samples. One of the largest limitations on using whole blood in sequencing studies is the high number of reads in the sample that map to globin molecules and reduce the reads that could map to other molecules of interest18. Therefore, in optimizing the protocol for our sample type, we needed to incorporate a globin depletion step to ensure the highest possible mRNA and non-coding RNA capture through sequencing. All of the samples were globin depleted to account for high levels of globin transcripts using porcine-specific hemoglobin A and B (HBA and HBB) oligonucleotides based on the procedure from Choi et al., 201414. Another critical step was the use of a ribo-depletion extraction kit, which allowed us to remove unwanted RNA molecules from our samples while retaining the polyadenylated non-coding and viral RNA molecules of experimental interest within our libraries for sequencing19. Additionally, we were able to work with single samples to create both coding (mRNA) and non-coding (small and long) RNA libraries by removing the small RNA enrichment and size selection steps. By doing this, we were able to maximize the RNA aliquots for pooling and to ensure that we sent enough volume for sequencing. All of the manufacturers' protocols were optimized to increase mRNA and non-coding RNA recovery for downstream library creation. We also optimized the library preparation for the small non-coding RNA portion of our downstream analysis by adding time to the PCR incubation, per the manufacturer's instructions, to increase the ligation efficiency of methylated RNAs and the likelihood of recovering piwi-interacting RNAs (piRNAs).
The limitations of the optimized protocol are rooted in the same lack of specificity that makes it ideal for multiple analyses from a single sample. By removing the enrichment and size selection portions of the non-coding RNA library preparation, we limit novel small RNA discovery20 in exchange for acquiring both small and long non-coding RNA molecules. Hence, the optimized protocol gives a good overview of the types of non-coding expression but requires extra processing during the mapping stage of analysis to gain more specific information on the classes and sizes of non-coding RNA captured in the sample. Another limitation of the method stems from the globin depletion steps, which lower the overall RIN values. We troubleshot this by first examining the RIN values before and after globin depletion to understand how large a reduction in RNA integrity we would experience. The results showed that we could expect a drop of ~1-2 points. To account for this drop, we further optimized the globin depletion protocol to use 6 µg of sample instead of the recommended 10 µg. This helped to improve our RIN values as well as conserve sample. Additionally, we used thin-walled MicroAmp reaction tubes and iced the tubes immediately after the first denaturing step. This allowed us to quench the reaction more quickly, improving the RIN values of the globin-depleted samples.
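The input adjustment and RIN comparison described above involve simple arithmetic that can be sketched in a few lines. The Python example below is a minimal illustration, not part of the published protocol: the function names and the stock concentration are hypothetical, while the 6 µg input and the ~1-2 point RIN drop are taken from the text.

```python
# Optimized total RNA input for globin depletion, from the text (recommended: 10 µg).
TARGET_INPUT_UG = 6.0
# Upper end of the RIN drop observed after globin depletion, from the text (~1-2 points).
MAX_EXPECTED_RIN_DROP = 2.0


def input_volume_ul(conc_ng_per_ul: float, target_ug: float = TARGET_INPUT_UG) -> float:
    """Volume of RNA stock (µL) that supplies target_ug of total RNA."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return target_ug * 1000.0 / conc_ng_per_ul  # µg -> ng, then ng / (ng/µL)


def rin_drop(pre_rin: float, post_rin: float) -> float:
    """RIN points lost between pre- and post-globin-reduction measurements."""
    return pre_rin - post_rin


def drop_within_expected(pre_rin: float, post_rin: float,
                         max_drop: float = MAX_EXPECTED_RIN_DROP) -> bool:
    """True when the observed drop does not exceed the expected maximum."""
    return rin_drop(pre_rin, post_rin) <= max_drop


if __name__ == "__main__":
    # Hypothetical stock concentration of 300 ng/µL.
    print(f"{input_volume_ul(300.0):.1f} µL of stock for 6 µg input")
    print("drop within expected range:", drop_within_expected(8.5, 7.0))
```

A drop larger than the expected range may indicate that the reaction was not quenched quickly enough after the denaturing step, which is the issue the iced thin-walled tubes are intended to address.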
The modified protocol used in this study has several advantages over the base library preparation methods used in other studies, where mRNA and non-coding RNA are studied separately. The changes we made for optimization allowed for efficient use of whole blood as our sample type. This is advantageous to future studies because whole blood requires fewer steps for collection and processing than the leukocyte portion of blood. Whole blood also has the added advantages of allowing researchers to examine systemic responses and of being repeatedly collectable from the animal. However, there is a caveat to the use of whole blood samples in that the amount of blood collected is dependent on the age and size of the organism in question. Smaller amounts of blood will lead to lower RNA yields; however, the method presented here allows dual library creation from at least 2.5 mL of whole blood. This allowed us to use one sample to investigate both mRNA and non-coding expression and to correlate both to an individual at a snapshot in time, effectively allowing us to collect more information than traditional library preparation methods. Also, by performing the ribo-depletion, we made it possible to reduce transcripts that could deplete sequencing reads while still retaining the non-coding and viral transcripts that can be lost during traditional library creation to study mRNA. In this way, our optimized protocol triples the amount of information that can be gained from a single sample. Other significant changes we made to the method to facilitate the capture of more RNA species were: changing the type of PCR plate to quickly stop the reaction, which improved RNA purity; a longer thermocycler incubation to increase ligation for possible recovery of methylated RNAs; and not employing a size selection gel, to allow all detectable ncRNAs to be identified regardless of length.
For the downstream analysis, this allowed for the capture of multiple non-coding RNAs between 18 and 200 nt in length.
By employing this optimized method, our group was able to put forward a protocol that can be applied to whole blood transcriptomic analyses and that allows for the analysis of both mRNA and non-coding RNA from a single sample.
The authors have nothing to disclose.
This work was mainly supported by USDA NIFA AFRI 2013-67015-21236, and in part by USDA NIFA AFRI 2015-67015-23216. This study was supported in part by an appointment to the Agricultural Research Service Research Participation Program administered by the Oak Ridge Institute for Science and Education (ORISE) through an interagency agreement between the US Department of Energy (DOE) and the US Department of Agriculture. ORISE is managed by Oak Ridge Associated Universities under DOE contract no. DE-AC05-06OR2310.
We would like to thank Dr. Kay Faaberg for the HP-PRRSV infectious clones, Dr. Susan Brockmeier for her help with animals involved in the experiment, and Sue Ohlendorf for secretarial assistance in preparation of the manuscript.
Name | Company | Catalog Number | Comments |
PAXgene Tubes | PreAnalytiX | 762165 | |
Molecular Biology Grade Water | ThermoFisher | 10977-015 | |
mirVana miRNA Isolation Kit | ThermoFisher | AM1560 | |
RNeasy MinElute Clean Up Kit | QIAGEN | 74204 | |
100% Ethanol | Decon Labs, Inc. | 2716 | |
0.2 mL thin-walled tubes | ThermoFisher | 98010540 | |
1.5 mL RNase/DNase-free tubes | Any supplier | ||
Veriti 96-well Thermocycler | ThermoFisher | 4375786R | |
Globin Reduction Oligo (α 1) | Any supplier | Sequence GAT CTC CGA GGC TCC AGC TTA ACG GT | |
Globin Reduction Oligo (α 2) | Any supplier | Sequence TCA ACG ATC AGG AGG TCA GGG TGC AA | |
Globin Reduction Oligo (β 1) | Any supplier | Sequence AGG GGA ACT TAG TGG TAC TTG TGG GT | |
Globin Reduction Oligo (β 2) | Any supplier | Sequence GGT TCA GAG GAA AAA GGG CTC CTC CT | |
10X Oligo Hybridization Buffer | |||
-Tris-HCl, pH 7.6 | Fisher Scientific | BP1757-100 | |
-KCl | Millipore Sigma | 60142-100ML-F | |
10X RNase H Buffer | |||
-Tris-HCl, pH 7.6 | Fisher Scientific | BP1757-100 | |
-DTT | ThermoFisher | Y00147 | |
-MgCl2 | Promega | A351B | |
-Molecular Biology Grade Water | ThermoFisher | 10977-015 | |
RNase H | ThermoFisher | AM2292 | |
SUPERase-IN | ThermoFisher | AM2694 | RNase inhibitor |
EDTA | Millipore Sigma | E7889 | |
Microcentrifuge | Any supplier | ||
2100 Electrophoresis BioAnalyzer Instrument | Agilent Technologies | G2938C | |
Agilent RNA 6000 Nano Kit | Agilent Technologies | 5067-1511 | |
Agilent High Sensitivity DNA Kit | Agilent Technologies | 5067-4626 | |
TruSeq Stranded Total RNA Library Prep Kit with Ribo-Zero | Illumina | RS-122-2201 | mRNA kit; Human/Mouse/Rat Set A (48 samples, 12 indexes) |
TruSeq Stranded Total RNA Sample Preparation Guide | Illumina | Available on-line | |
RNAClean XP Beads | BeckmanCoulter | A63987 | |
AMPure XP Beads | BeckmanCoulter | A63880 | |
MicroAmp Optical 8-tube Strip | ThermoFisher | N8010580 | 0.2 mL thin-walled tubes |
MicroAmp Optical 8-tube Strip Cap | ThermoFisher | N801-0535 | |
RNase/DNase-free reagent reservoirs | Any supplier | ||
SuperScript II Reverse Transcriptase | ThermoFisher | 18064-014 | |
MicroAmp Optical 96-well plates | ThermoFisher | N8010560 | These were used in place of 0.3 mL plates as needed |
MicroAmp Optical adhesive film | ThermoFisher | 4311971 | |
NEBNext Multiplex Small RNA Library Prep Set for Illumina® (Set 1) | New England Biolabs | E7300S | small RNA kit |
NEBNext Multiplex Small RNA Library Prep Set for Illumina® (Set 2) | New England Biolabs | E7580S | small RNA kit |
QIAQuick PCR Purification Kit | QIAGEN | 28104 | |
96S Super Magnet Plate | ALPAQUA | A001322 |
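
The four globin reduction oligo sequences listed in the table above can be sanity-checked before ordering from a supplier. The Python sketch below is illustrative only and is not part of the published protocol; it assumes the sequences are written 5' to 3' as DNA, and the dictionary keys and function names are hypothetical. It reports the length, GC fraction, and reverse complement of each oligo.

```python
# Oligo sequences copied from the materials table (spaces are only for readability).
GLOBIN_OLIGOS = {
    "alpha_1": "GAT CTC CGA GGC TCC AGC TTA ACG GT",
    "alpha_2": "TCA ACG ATC AGG AGG TCA GGG TGC AA",
    "beta_1": "AGG GGA ACT TAG TGG TAC TTG TGG GT",
    "beta_2": "GGT TCA GAG GAA AAA GGG CTC CTC CT",
}

# Translation table for DNA base complements.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq: str) -> str:
    """Remove spacing and normalize to upper case."""
    return seq.replace(" ", "").upper()


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    s = clean(seq)
    return (s.count("G") + s.count("C")) / len(s)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (assumes only A, C, G, T)."""
    return clean(seq).translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in GLOBIN_OLIGOS.items():
        s = clean(seq)
        print(name, len(s), round(gc_fraction(s), 2), reverse_complement(s))
```

Under the 5' to 3' assumption, the reverse complement corresponds to the DNA-alphabet form of the region each oligo would hybridize to, which can be compared against the porcine HBA and HBB reference transcripts.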