Improving Small RNA-seq: Less Bias and Better Detection of 2'-O-Methyl RNAs

Erwin L. van Dijk; Evangelia Eleftheriou; Claude Thermes

doi:10.3791/60056

Published: September 16, 2019

Erwin L. van Dijk¹, Evangelia Eleftheriou¹, Claude Thermes¹

¹Institute for Integrative Biology of the Cell, UMR9198, CNRS CEA Univ Paris-Sud, Université Paris-Saclay

Summary

We present a detailed small RNA library preparation protocol with less bias than standard methods and an increased sensitivity for 2'-O-methyl RNAs. This protocol can be followed using homemade reagents to save cost or using kits for convenience.

Abstract

The study of small RNAs (sRNAs) by next-generation sequencing (NGS) is challenged by bias issues during library preparation. Several types of sRNA such as plant microRNAs (miRNAs) carry a 2'-O-methyl (2'-OMe) modification at their 3' terminal nucleotide. This modification adds another difficulty as it inhibits 3' adapter ligation. We previously demonstrated that modified versions of the 'TruSeq (TS)' protocol have less bias and an improved detection of 2'-OMe RNAs. Here we describe in detail protocol 'TS5', which showed the best overall performance. TS5 can be followed either using homemade reagents or reagents from the TS kit, with equal performance.

Introduction

Small RNAs (sRNAs) are involved in the control of a diversity of biological processes¹. Eukaryotic regulatory sRNAs are typically between 20 and 30 nt in size; the three major types are microRNAs (miRNA), piwi-interacting RNAs (piRNA) and small interfering RNAs (siRNA). Aberrant miRNA expression levels have been implicated in a variety of diseases². This underscores the importance of miRNAs in health and disease and the requirement for accurate, quantitative research tools to detect sRNAs in general.

Next-generation sequencing (NGS) is a widely used method to study sRNAs. The main advantages of NGS over other approaches, such as quantitative PCR (qPCR) or microarray techniques, are that it does not require a priori knowledge of the sRNA sequences and can therefore be used to discover novel RNAs, and that it suffers less from background signal and saturation effects. Further, it can detect single nucleotide differences and has a higher throughput than microarrays. However, NGS also has some drawbacks: the cost of a sequencing run remains relatively high, and the multistep process required to convert a sample into a library for sequencing may introduce biases. In a typical sRNA library preparation process, a 3' adapter is first ligated to the sRNA (often gel-purified from total RNA) using a truncated version of RNA ligase 2 (RNL2) and a preadenylated 3' adapter (Figure 1) in the absence of ATP. This increases the efficiency of sRNA-adapter ligation and reduces side reactions such as sRNA circularization or concatemerization. Subsequently, a 5' adapter is ligated by RNA ligase 1 (RNL1), followed by reverse transcription (RT) and PCR amplification. All these steps may introduce bias³,⁴. Consequently, read numbers may not reflect actual sRNA expression levels, leading to artificial, method-dependent expression patterns. Specific sRNAs may be either over- or underrepresented in a library, and strongly underrepresented sRNAs may escape detection. The situation is particularly complicated with plant miRNAs, siRNAs in insects and plants, and piRNAs in insects, nematodes and mammals, in which the 3' terminal nucleotide has a 2'-O-methyl (2'-OMe) modification¹. This modification strongly inhibits 3' adapter ligation⁵, making library preparation for these types of RNA a difficult task.

Previous work demonstrated that adapter ligation introduces serious bias, due to RNA sequence/structure effects⁶,⁷,⁸,⁹,¹⁰,¹¹. Steps downstream of adapter ligation such as reverse transcription and PCR do not significantly contribute to bias⁶,¹¹,¹². Ligation bias is likely due to the fact that adapter molecules with a given sequence interact with sRNA molecules in the reaction mixture to form co-folds that may lead to either favorable or unfavorable configurations for ligation (Figure 2). Data from Sorefan et al.⁷ suggest that RNL1 prefers a single-stranded context, while RNL2 prefers a double strand for ligation. The fact that the adapter/sRNA co-fold structures are determined by the specific adapter and sRNA sequences explains why specific sRNAs are over- or underrepresented with a given adapter set. It is also important to note that the same adapter sequences should be used within a series of sRNA libraries that are to be compared. Indeed, it has previously been observed that changing adapters by the introduction of different barcode sequences alters miRNA profiles in sequencing libraries⁹,¹³.

Randomization of adapter sequences near the ligation junction likely reduces these biases. Sorefan and colleagues⁷ used adapters with 4 random nucleotides at their extremities, designated "High Definition" (HD) adapters, and showed that the use of these adapters leads to libraries that better reflect true sRNA expression levels. More recent work confirmed these observations and revealed that the randomized region does not need to be adjacent to the ligation junction¹¹. This novel type of adapter was named the "MidRand" adapter. Together, these results demonstrate that improved adapter design can reduce bias.

Instead of modifying the adapters, bias can also be suppressed by optimizing the reaction conditions. Polyethylene glycol (PEG), a macromolecular crowding agent known to increase ligation efficiency¹⁴, has been shown to significantly reduce bias¹⁵,¹⁶. Based on these results, several "low bias" kits appeared on the market. These include kits that use PEG in the ligation reactions, either in combination with classical adapters or HD adapters. Other kits avoid ligation altogether, and use 3' polyadenylation and template switching for 3' and 5' adapter addition, respectively¹². In yet another strategy, 3' adapter ligation is followed by a circularization step, thus omitting 5' adapter ligation¹⁷.

In a previous study, we searched for a sRNA library preparation protocol with the lowest possible levels of bias and the best detection of 2'-OMe RNAs¹². We tested some of the above-mentioned 'low bias' kits, which had a better detection of 2'-OMe RNAs than the standard protocol (TS). Surprisingly however, upon modification (the use of randomized adapters, PEG in the ligation reactions and removal of excess 3' adapter by purification) the latter outperformed the other protocols for the detection of 2'-OMe RNAs. Here, we describe in detail a protocol based on the TS protocol, 'TS5', which had the best overall detection of 2'-OMe RNAs. The protocol can be followed using reagents from the TS kit and one reagent from the 'Nf' kit or, to save money, using homemade reagents, with equal performance. We also provide a detailed protocol for the purification of sRNA from total RNA and the preparation of preadenylated 3' adapter.

Protocol

1. Isolation of small RNAs

1.1. Extract total RNA using phenol-based reagents or any other method. Verify that the RNA is of good quality.
1.2. Pre-run a 15% TBE-urea gel (see Table of Materials) for 15 min at 200 V.
1.3. While the gel is pre-running, mix 5-20 µg of total RNA in a 5-15 µL volume with an equal volume of formamide loading dye (see Table of Materials; 95% deionized formamide, 0.025% bromophenol blue, 0.025% xylene cyanol, 5 mM EDTA pH 8) in a 200 µL PCR tube. Likewise, mix 10 µL (200 ng) of small-RNA ladder (see Table of Materials) with an equal volume of formamide loading dye. Incubate for 5 min at 65 °C in a thermocycler with heated lid, then place the tubes immediately on ice.
1.4. Load the ladder and the sample on the same gel with at least one lane between them and run at 200 V until the bromophenol blue (dark blue) has migrated about two-thirds of the gel length (approximately 40 min).
1.5. Prepare a system to elute the RNA from the gel as follows: puncture the bottom of a nuclease-free 0.5 mL microcentrifuge tube with a 21-gauge needle. Place the punctured 0.5 mL tube in a nuclease-free round-bottom 2 mL microcentrifuge tube.
1.6. Remove the gel and incubate it at room temperature with 3 µL of nucleic acid gel stain (10,000x concentrate; see Table of Materials) in 30 mL of water for 10-15 min.
1.7. View the gel on a 'Dark Reader' transilluminator (it is strongly recommended to avoid UV as this might damage the RNA) and cut out the sample RNA between the 17 nt and the 29 nt ladder bands. Transfer the gel piece to the 0.5 mL tube from step 1.5.
1.8. Centrifuge the 0.5 mL tube in the 2 mL tube in a microcentrifuge at maximum speed for 2 min. Remove the 0.5 mL tube, which should be empty now.
1.9. Add 300 µL of nuclease-free 0.3 M NaCl to the 2 mL tube containing the crushed gel and rotate for at least 2 h at room temperature or at 4 °C overnight (16 h).
1.10. Transfer the suspension of crushed gel pieces to a spin column (see Table of Materials) and centrifuge for 2 min at maximum speed in a microcentrifuge.
1.11. Add 1 µL of glycogen (20 µg/µL; see Table of Materials) and 950 µL of room temperature 100% ethanol. Incubate for at least 30 min at -80 °C.
1.12. Centrifuge for 20 min at maximum speed in a microcentrifuge at 4 °C. Remove the supernatant and wash the pellet with 800 µL of cold 80% ethanol. Centrifuge again for 5 min at 4 °C, and carefully remove all supernatant. Resuspend the RNA pellet in 15 µL of nuclease-free water. Typically, ~5-20 ng of small RNA should be recovered, depending on the amount of input total RNA (~1 ng of small RNA per 1 µg of input total RNA).
1.13. Recommended additional step: Check the quantity and quality of the recovered sRNA (e.g., by capillary gel electrophoresis using a small RNA kit; see Table of Materials).

2. Preparation of preadenylated 3' HD adapter

NOTE: Preadenylation of 3' HD adapter was done in a manner similar to the protocol described by Chen et al¹⁸. Note that preadenylated adapter can be ordered directly (/5rApp/ modification), but this is quite expensive.

2.1. Order 5' phosphorylated, 3' blocked 3' HD adapter oligonucleotide. See Table 1 for sequence and modifications. Note that '3AmMO' is a 3' amino modifier group; most suppliers can produce oligonucleotides with this modification. Dilute in nuclease-free water to 100 µM.
2.2. Set up a 100 µL reaction containing the following reagents: 10 µL of oligo (100 µM), 10 µL of T4 RNA ligase buffer (10x), 10 µL of ATP (10 mM), 40 µL of 50% PEG 8000, 5 µL of T4 RNA ligase 1 (50 units), 25 µL of nuclease-free water. Incubate overnight at 20 °C.
2.3. Perform a classical phenol-chloroform extraction of the preadenylated oligonucleotide followed by ethanol precipitation. Add 100 µL of acid (pH 4.5) phenol:chloroform and vortex. Spin for 5 min at room temperature at maximum speed. Carefully transfer 90 µL of the upper phase to a new tube; add 10 µL of 3 M sodium acetate pH 5.2, 1 µL of ultra-pure glycogen and 250 µL of cold 100% ethanol.
2.4. Keep at -20 °C for at least 30 min. Centrifuge for 30 min at 4 °C at maximum speed. Remove the supernatant and wash the pellet once with 500 µL of cold 80% ethanol. Resuspend in 25 µL of water. As an alternative to phenol-chloroform extraction and ethanol precipitation, a nucleotide removal kit can be used (see Table of Materials).
2.5. Measure the concentration of the adapter using a kit for specific detection of single-stranded DNA (see Table of Materials). Dilute to 80 ng/µL (10 µM).
2.6. Recommended additional step: Verify the efficacy of preadenylation by migrating 1 µL of 10 µM adapter on a 15% TBE-urea gel (Table of Materials) along with untreated oligonucleotide. The preadenylated adapter should migrate slightly slower than untreated oligonucleotide. If desired, the preadenylated adapter can be gel-purified; proceed as described above (steps 1.2-1.12).
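
The equivalence "80 ng/µL (10 µM)" stated above for the 25-nt 3' HD adapter can be checked, and a dilution volume worked out, with a short shell sketch. It assumes an average mass of roughly 320 g/mol per nucleotide, which is an approximation and not a value given in this protocol; the exact molecular weight from the supplier's data sheet should be preferred. The helper names `ng_per_ul_to_um` and `water_to_add` are illustrative only.

```shell
#!/usr/bin/env bash
# Sketch: convert a measured oligonucleotide concentration (ng/uL) to uM and
# compute how much water to add to reach a target concentration.
# ASSUMPTION: ~320 g/mol per nucleotide is a rough average, not a value
# from the protocol; pass the real per-nucleotide mass as a third argument.

# ng_per_ul_to_um <ng/uL> <length in nt> [g/mol per nt]
ng_per_ul_to_um() {
  awk -v c="$1" -v n="$2" -v m="${3:-320}" \
    'BEGIN { printf "%.1f\n", c * 1000 / (n * m) }'
}

# water_to_add <stock ng/uL> <target ng/uL> <stock volume uL>
water_to_add() {
  awk -v s="$1" -v t="$2" -v v="$3" \
    'BEGIN { printf "%.1f\n", v * (s / t - 1) }'
}

ng_per_ul_to_um 80 25      # 25-nt adapter at 80 ng/uL -> 10.0 (uM)
water_to_add 200 80 24     # hypothetical 24 uL stock at 200 ng/uL -> 36.0 (uL)
```

The second call uses made-up stock values purely to show the dilution arithmetic.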

3. Library preparation – Protocol TS5

NOTE: We present here the modified TS protocol 'TS5' that we described previously¹² and that can be performed either with reagents from the kit or with self-provided reagents. It should be noted that we obtained similar or even slightly better results with a different protocol, 'TS7'. However, with TS7 it is more difficult to eliminate adapter dimers. We have therefore preferred to describe TS5 in detail, but TS7 can be followed by simply replacing the adapters. For TS7 use the 'MidRand-Like (MRL)' adapter sequences (Table 1). Note that here the randomized regions are in the middle of the adapters. Primers for reverse transcription and PCR will hybridize to the sequences downstream of the randomized region in the 3' adapter and upstream of the randomized region in the 5' adapter. Sequencing will start from the first randomized nucleotide in the 5' adapter.

3' adapter ligation.
1. Combine 1 µL of preadenylated 3' HD adapter (10 µM) with 1 µL of purified small RNA (~0.1-1 µM) in a 0.2 mL PCR tube. Incubate for 2 min at 72 °C in a thermocycler, then put directly on ice.
2. Add 4 µL of 50% PEG 8000 (viscous solution; pipet slowly), 1 µL of RNA ligase buffer (10x), 1 µL of H₂O, 1 µL of truncated T4 RNA ligase 2, and 1 µL of RNase inhibitor. Incubate overnight at 16 °C.
Elimination of unligated 3' adapter
1. Add 10 µL of nuclease-free water and mix well. Add 6 µL of 3 M NaOAc pH 5.2 or 'Adapter Depletion Solution' from the Nf kit (Table of Materials) and mix well. Add 40 µL of magnetic purification beads (Table of Materials) and 60 µL of isopropanol and mix well. Incubate for 5 min at room temperature.
2. Put the sample in a magnetic rack until the solution appears clear. Remove and discard the supernatant.
3. Add 180 µL of freshly prepared 80% ethanol. Incubate for ~30 s, then remove. Take care to use freshly prepared 80% ethanol and do not incubate with 80% ethanol for extended periods.
4. Briefly spin the tube and remove residual liquid that may have collected at the bottom of the well. Let the beads dry for 2 min, then resuspend in 22 µL of 10 mM Tris pH 8 or resuspension buffer from the Nf kit. Incubate for 2 min, then magnetize the sample until the solution appears clear.
5. Add 6 µL of 3 M NaOAc pH 5.2 or ‘Adapter Depletion Solution’ from the Nf kit (Table of Materials) to a new tube. Transfer 20 µL of the supernatant from the previous step to this new tube and mix by pipetting. Add 40 µL of magnetic beads and 60 µL of 100% isopropanol and mix well by pipetting. Incubate for 5 min.
6. Magnetize the sample until the solution appears clear, then remove and discard the supernatant.
7. Add 180 µL of freshly prepared 80% ethanol. Incubate for ~30 s, then remove. Take care to use freshly prepared 80% ethanol and do not incubate with 80% ethanol for extended periods.
8. Briefly spin the tube and remove residual liquid that may have collected at the bottom of the well. Let the beads dry for 2 min, and resuspend in 10 µL of nuclease-free water. Incubate for 2 min, then magnetize the sample until the solution appears clear.
9. Transfer 9 µL of supernatant to a new tube. Add 1 µL of T4 RNA ligase buffer (10x) and 1 µL of water. Alternatively, add 2 µL of ligase buffer from the TS kit.
Ligation of 5' adapter.
1. Add 1 µL of 5' HD adapter (10 µM; Table 1) to a 200 µL PCR tube. Incubate for 2 min at 70 °C in a thermocycler with heated lid, then put the tube directly on ice.
2. Add 1 µL of 10 mM ATP and 1 µL of T4 RNA ligase 1. Mix well by gently pipetting. Add 3 µL of this mix to the 3' ligated RNA from step 3.2.9 and mix by pipetting. Incubate for 1 h at 28 °C.
Reverse transcription (RT)
1. Transfer 6 µL of 3' and 5' adapter ligated RNA to a new 200 µL PCR tube (keep the remaining ~8 µL at -80 °C for later use if necessary). Add 1 µL of RT primer (10 µM; Table 1) and mix by pipetting. Incubate for 2 min at 70 °C, then put the tube directly on ice.
2. Add the following reagents for RT: 2 µL of 5x first strand buffer, 0.5 µL of 12.5 mM dNTP mix, 1 µL of 100 mM DTT, 1 µL of RNase inhibitor, and 1 µL of reverse transcriptase (Table of Materials). Incubate for 1 h at 50 °C.
PCR amplification
1. Add the following reagents to the 12.5 µL of RT reaction mixture: 10 µL of PCR polymerase buffer, 2 µL of universal P5 primer (10 µM; Table 1), 2 µL of P7-index primer (10 µM; Table 1), 1 µL of 12.5 mM dNTPs, 0.5 µL of DNA polymerase (Table of Materials), and 22 µL of water. Keep the reaction on ice until use.
2. Run the following PCR program: 98 °C for 30 s, 11 cycles (98 °C for 10 s, 60 °C for 30 s and 72 °C for 15 s) and 72 °C for 10 min. Keep the reaction at 4 °C when finished.
  NOTE: Concerning the number of PCR cycles: we typically perform 11 cycles, but this should be optimized by the user. Try to perform the smallest possible number of PCR cycles.
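
The component volumes listed above for the 3' adapter ligation, RT and PCR reactions can be cross-checked with a few lines of shell arithmetic. The sketch below only sums the volumes stated in those steps and derives the final PEG concentration in the 3' ligation; the helper names `sum_ul` and `final_percent` are illustrative and not part of the protocol.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check reaction volumes taken from the protocol steps.

# sum_ul <v1> <v2> ... : total volume (uL) of the listed components
sum_ul() {
  printf '%s\n' "$@" | awk '{ s += $1 } END { printf "%.1f\n", s }'
}

# final_percent <stock %> <stock volume uL> <total volume uL>
final_percent() {
  awk -v p="$1" -v v="$2" -v t="$3" 'BEGIN { printf "%.1f\n", p * v / t }'
}

# 3' ligation: adapter, sRNA, 50% PEG, buffer, water, ligase, RNase inhibitor
ligation_total=$(sum_ul 1 1 4 1 1 1 1)
echo "3' ligation: ${ligation_total} uL, PEG $(final_percent 50 4 "$ligation_total") %"

# RT: ligated RNA, RT primer, buffer, dNTPs, DTT, RNase inhibitor, enzyme
echo "RT: $(sum_ul 6 1 2 0.5 1 1 1) uL"

# PCR: RT reaction, buffer, P5 primer, P7-index primer, dNTPs, polymerase, water
echo "PCR: $(sum_ul 12.5 10 2 2 1 0.5 22) uL"
```

Run as written, this reports a 10.0 µL ligation at 20.0 % PEG, a 12.5 µL RT reaction (matching the volume carried into the PCR step) and a 50.0 µL PCR.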
Gel purification
1. Run 5, 10 and 20 µL of PCR product on a native 6% TBE gel (see Table of Materials) along with a suitable ladder (see Table of Materials). Run the gel for about 1 h at 145 V (until the bromophenol blue reaches the bottom; this dye migrates at the 65 bp position).
2. Remove the gel, and incubate with nucleic acid gel stain (see Table of Materials) in water for 10-15 min.
3. View the gel on a "Dark Reader" transilluminator (it is strongly recommended to avoid UV as this might damage the DNA) and cut out the library band at 150 bp. Prepare a system to elute the DNA from the gel as described in step 1.5 and transfer the gel piece to the 0.5 mL tube.
4. Centrifuge in a microcentrifuge at maximum speed for 2 min. Remove the 0.5 mL tube, which should be empty now.
5. Add 300 µL of nuclease-free water to the 2 mL tube containing the crushed gel and rotate for at least 2 h at room temperature or at 4 °C overnight.
6. Transfer the suspension of crushed gel pieces in water to a spin column and centrifuge for 2 min at maximum speed.
7. Add 1 µL of glycogen (20 µg/µL), 30 µL of 3 M NaOAc and 975 µL of ice-cold 100% ethanol. Centrifuge for 20 min at maximum speed at 4 °C.
8. Resuspend the pellet in 20 µL of 10 mM Tris pH 8. Use 1 µL for concentration measurement and 1 µL for quality control.

4. Data analysis

NOTE: The data analysis procedure described below is based on the Linux operating system Ubuntu 16.04 LTS.

Treatment of raw sequence files
1. Download the FASTQ sequence file(s) generated during the sequencing run. If required, perform demultiplexing with bcl2fastq2 (version V2.2.18.12; a manual can be downloaded from the following link: http://emea.support.illumina.com/sequencing/sequencing_software/bcl2fastq-conversion-software/documentation.html).
  Use the following command:
  nohup Pathway_of_bcl2fastq/bcl2fastq --runfolder-dir Pathway_of_Run --ignore-missing-bcl --output-dir Pathway_of_Output_Directory --barcode-mismatches 1 --aggregated-tiles AUTO -r 16 -d 16 -p 16 -w 16
2. Remove adapter sequences using Cutadapt¹⁹ version 1.15. A manual can be downloaded here: https://cutadapt.readthedocs.io/en/stable/guide.html.
  Use the following command:
  Pathway_to_cutadapt/cutadapt -a TGGAATTCTCGGGTGCCAAGG -n 5 -O 4 -m 10 -j 0 --nextseq-trim 10 -o Output_File_Read1_cutadapt.fastq.gz Input_File_Read1.fastq.gz
  Note that the sequence in the command corresponds to the 3’ HD adapter without the 4 random nucleotides; these will therefore not be removed during this step.
3. Use seqtk (https://github.com/lh3/seqtk) for the removal of the terminal random nucleotides in the sequencing reads. Use the following command:
  seqtk trimfq -b 4 -e 4 Output_File_Read1_cutadapt.fastq.gz > Output_File_Read1_trimmed.fastq
4. Use the following awk command in order to discard the sequences shorter than 10 nt:
  awk 'BEGIN {FS = "\t" ; OFS = "\n"} {header = $0 ; getline seq ; getline qheader ; getline qseq ; if (length(seq) >= 10) {print header, seq, qheader, qseq}}' Output_File_Read1_trimmed.fastq > Length_Filtered.fastq
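
To see what steps 3 and 4 do to an individual read, the following toy example reproduces their effect using awk only. It is a sketch for illustration, not a replacement for seqtk: the functions `make_toy_fastq` and `trim_and_filter` are hypothetical helpers, the two reads are invented, and the behavior of seqtk on reads shorter than 8 nt may differ from this simplified version.

```shell
#!/usr/bin/env bash
# Sketch: mimic "seqtk trimfq -b 4 -e 4" followed by the >= 10 nt length
# filter on a toy FASTQ file. Helper names and reads are hypothetical.

# Build a toy FASTQ with two reads; one "I" quality character per base.
# read1 = 4 random nt + 22-nt insert + 4 random nt; read2 has a 5-nt insert.
make_toy_fastq() {
  awk 'BEGIN { n = split("ACGTTGAGGTAGTAGGTTGTATAGTTTTCA ACGTTGAGGTTCA", s, " ")
               for (i = 1; i <= n; i++) {
                 q = s[i]; gsub(/./, "I", q)
                 printf "@read%d\n%s\n+\n%s\n", i, s[i], q } }'
}

# trim_and_filter <fastq>: drop 4 nt from each end of sequence and quality,
# then keep only records whose remaining sequence is at least 10 nt long.
trim_and_filter() {
  awk 'NR % 4 == 1 { header = $0 }
       NR % 4 == 2 { seq = substr($0, 5, length($0) - 8) }
       NR % 4 == 3 { plus = $0 }
       NR % 4 == 0 { qual = substr($0, 5, length($0) - 8)
                     if (length(seq) >= 10)
                       printf "%s\n%s\n%s\n%s\n", header, seq, plus, qual }' "$1"
}

toy=$(mktemp)
make_toy_fastq > "$toy"
trim_and_filter "$toy"   # prints only read1, with sequence TGAGGTAGTAGGTTGTATAGTT
rm -f "$toy"
```

The second read is discarded because only 5 nt remain once the random nucleotides are removed.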
Mapping of the trimmed sequences
1. Download the database corresponding to the organism of study from miRBase as follows. Go to http://www.mirbase.org/ftp.shtml and download the ‘mature.fa’ file. Note that the sequences are indicated in RNA notation. Replace the U residues by T with the following command:
  sed -i '/^>/! s/U/T/g' mature.fa
  NOTE: This will yield a complete list of all miRNAs in miRBase, originating from a variety of organisms.
2. Select the miRNA sequences of your organism of interest with the following command:
  awk '/name_of_the_organism/{print; nr[NR+1]; next}; NR in nr' mature.fa > mature_name_of_the_organism_mirs.fa
3. Map the reads to the above-created file using Bowtie2²⁰ (version 2.3.0) allowing no mismatches. First, build an index for your file with the following command:
  bowtie2-build mature_name_of_the_organism_mirs.fa mature_name_of_the_organism_mirs
4. Align the sequencing reads to the database, requiring that a read maps entirely to a miRNA of the database, without any mismatches. To this end, use the following command:
  bowtie2 -N 0 -L 10 --score-min C,0,0 --end-to-end --time -x mature_name_of_the_organism_mirs -U Length_Filtered.fastq -S Length_Filtered_ALIGNMENT.sam
  NOTE: The option --score-min C,0,0 ensures that alignment is without any mismatches. For an explanation of the various parameters of the tool, please visit the following website: http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml
5. To discard the reads that did not align, use the following command:
  samtools view -F 4 Length_Filtered_ALIGNMENT.sam > Reads_aligned_to_Mirs.sam
  NOTE: As a result of these steps, you should now have obtained the aligned reads, corresponding to miRNAs.
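
A common follow-up, which is not part of the protocol described here, is to tabulate the number of aligned reads per miRNA. The sketch below assumes a tab-separated SAM file such as Reads_aligned_to_Mirs.sam, in which the third column holds the name of the reference sequence; `count_per_mirna` is a hypothetical helper and the three alignment records are invented for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch: count aligned reads per reference sequence (column 3 of a SAM file).
# The helper name and the toy records are hypothetical.

# count_per_mirna <sam file>: prints "<miRNA name><TAB><read count>",
# most abundant first; SAM header lines (starting with "@") are skipped.
count_per_mirna() {
  grep -v '^@' "$1" | cut -f 3 | sort | uniq -c | sort -k1,1nr -k2,2 |
    awk '{ print $2 "\t" $1 }'
}

# Toy SAM body: read name, flag, reference name, then placeholder fields.
sam=$(mktemp)
printf '%s\t0\t%s\t1\t42\t20M\t*\t0\t0\t*\t*\n' \
  r1 ath-miR156a-5p r2 ath-miR156a-5p r3 ath-miR166a-3p > "$sam"
count_per_mirna "$sam"
rm -f "$sam"
```

On the toy input this lists ath-miR156a-5p with 2 reads followed by ath-miR166a-3p with 1 read.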

Representative Results

Critical steps are the isolation of the small RNA fraction of the starting total RNA material (Figure 3) and the desired final library product (Figure 4). Both steps involve polyacrylamide gel purification; small RNA is isolated from 15% TBE urea gels, while the final libraries are isolated from 6% native TBE gels. Small RNA isolated from gel can be analyzed on a small RNA capillary electrophoresis chip (Table of Materials; Figure 3B). This will allow users to estimate the amount of small RNA recovered and the proportion of miRNA in the preparation.

Gel purification of the final library product is a delicate step as a number of additional products are formed that migrate close to the desired library. It is important not to overload the gel, as this increases the risk of contaminating the library with other species such as adapter dimers. As an example (Figure 4), increasing amounts of PCR-amplified library (from B. napus RNA) were loaded on the gel and the product corresponding to the expected size (150 bp) was cut out (Figure 4A). After elution, the purified library was checked on a capillary gel electrophoresis chip; in addition to the expected 150 bp product, an increasing proportion of a 130 bp species, corresponding to adapter dimers, was observed as increasing amounts of PCR product were loaded (Figure 4B,C).

We tested whether protocol TS5 performs similarly with homemade reagents as with reagents from the kits. To this end, we prepared libraries from a mix of synthetic small RNAs 1-6, each present without or with a 2'-OMe modification, as done in our previous study¹². Figure 5 shows the proportion of reads corresponding to each of these RNAs obtained previously and with the new libraries made using reagents from the TS and Nf kits or from other suppliers. As can be seen, very similar results were obtained.

Figure 6 shows a comparison of the performance of protocol TS5 with the standard TS protocol for the detection of plant (A. thaliana and B. napus) miRNAs, which are 2'-OMe modified, and of unmodified human miRNAs. We also tested the detection of piRNAs, which are 2'-OMe modified, in human samples. As can be seen, TS5 performs significantly better than TS for the detection of 2'-OMe RNAs but not for unmodified RNAs. However, for unmodified sRNAs as well, even though a larger number of RNAs is not detected, the read numbers obtained with TS5 probably reflect true expression levels better than those obtained with TS, owing to lower levels of bias.

Figure 1. Schematic representation of the sRNA library preparation workflow for Illumina sequencing. First, sRNAs are isolated by a gel purification step. Here, the size range of miRNAs is indicated but any other size range could be selected, depending on the RNAs of interest. A quality control (QC) step is then performed to check the quality and quantity of isolated sRNA. During library preparation, a preadenylated (App) 3' adapter is first ligated to the sRNA. Then, a 5' adapter is ligated. Subsequently reverse transcription is performed using a primer complementary to the 3' adapter, followed by PCR amplification, during which the Illumina P5, P7, and index ('In') sequences are added. The resulting library is gel purified, followed by a quality control step. Then the library is sequenced, and data are analyzed.

Figure 2. Bias due to sequence-dependent adapter-sRNA co-folding. In the adapter ligation mixture, adapters will cofold with sRNAs in a manner that depends on the sequence of the adapter and the sRNA. Thus, different sRNAs (illustrated by examples a, b, and c; indicated by different shades of green) will cofold differently with a given adapter (the 3' adapter is indicated in red, the 5' adapter is indicated in blue). The black arrows indicate ligation junctions. This may lead to a favorable or unfavorable context for ligation. As RNL2 appears to prefer a double stranded environment, 3' ligation is expected to be more efficient for RNAs a and b than for RNA c. With RNL1 having a preference for a single stranded region around the ligation junction, 5' adapter ligation may be most efficient for RNA a, followed by b and least efficient for c. Together this may result in an overrepresentation of sRNA a, an intermediate representation of RNA b and an underrepresentation of RNA c in the final library, even if the three RNAs are present at equal amounts in the original RNA sample. Note that in a similar fashion, a given sRNA will be represented differently when changing the adapter sequence (e.g., by adding different barcodes).

Figure 3. Isolation of small RNA and quality control. (A). Electrophoretic separation of Brassica napus total RNA (10 µg) on a 15% TBE urea denaturing polyacrylamide gel. A small RNA ladder (see Table of Materials) was migrated alongside as a molecular size marker. After migration the gel was stained and the RNA was visualized on a transilluminator. The region from 17 to 29 nucleotides was cut out (indicated by a red rectangle) and RNA was eluted. (B). The quality of the purified RNA was checked by capillary gel electrophoresis. Note that this analysis provides information on the proportion of miRNA in the sample (93% in this case).

Figure 4. Gel purification of a B. napus small RNA library prepared following protocol TS5 and quality control. (A). Increasing amounts of a PCR amplified library from B. napus small RNA were loaded on a 6% native TBE gel; 2.5 µL (a), 5 µL (b), 10 µL (c) or 20 µL (d) PCR product. A 50 bp ladder was migrated alongside. PCR products migrating at the expected 150 bp position were isolated (red rectangle), and DNA was eluted and purified. (B). Quality control of the purified library; gel representation. (C). Electropherogram representation of the same analysis. As can be seen, the 150 bp product is increasingly contaminated with adapter dimers (~130 bp) as larger amounts of PCR product are migrated on the gel.

Figure 5. Protocol TS5 performs similarly with reagents from the TS and Nf kits or with reagents from other suppliers. Histograms representing the percentage of the total numbers of raw reads (before trimming) corresponding to RNA(OMe)1-6 with protocol TS5 followed with reagents from the TS and Nf kits (blue bars), or reagents from other suppliers (orange bars). For comparison, results from our previous study are shown by grey bars. The total numbers of reads corresponding to RNA1-6 (total RNA) or RNA-OMe1-6 (total RNA-OMe) are shown as well. Shown are the mean values of at least two independent experiments. Error bars represent standard deviations. Part of this figure has been modified from Dard-Dascot et al., 2018¹².

Figure 6. Comparison of miRNA detection of protocol TS5 with the classical TS method. (A). The proportion of reads mapping to A. thaliana, B. napus, or H. sapiens miRNAs in miRBase was determined. We also mapped the reads of the human libraries to piRBase for piRNA detection. Note that A. thaliana and B. napus miRNAs, as well as human piRNAs, are 2'-OMe modified, in contrast to human miRNAs. Shown are the mean values of at least two independent experiments and the error bars represent standard deviations. (B). Numbers of miRNAs (or piRNAs) identified. We determined the numbers of known miRNAs from the different species identified with protocols TS or TS5. Note that for A. thaliana, B. napus, or H. sapiens, 427, 92, and 2588 miRNAs have been registered in miRBase, respectively. 0.5 million reads from the TS or TS5 libraries were mapped to B. napus miRNAs in miRBase; 1 million reads were mapped to the A. thaliana or human databases. Shown are the mean values of at least two independent experiments with standard deviations represented by error bars. This figure has been modified from Dard-Dascot et al., 2018¹².

Name oligonucleotide	5' modification	3' modification	sequence 5' to 3'	purification
5' HD (TS5) adapter	5AmMC6		[5AmMC6]GTTCAGAGTTCTACAGTCCGACGATCNrNrNrN (note that this oligo is a DNA-RNA chimeric; the three 3' terminal nts are RNA)	HPLC
3' HD (TS5) adapter	phosphate	3AmMO	[Phos]rNrNrNrNTGGAATTCTCGGGTGCCAAGG[3AmMO] (note that this oligo is a DNA-RNA chimeric; the four 5' terminal nts are RNA)	HPLC
5' MRL (TS7) adapter	5AmMC6		[5AmMC6]GTTCAGAGTTCTACAGTCCGACGATCNNNNrArCrGrArUrArC (note that this oligo is a DNA-RNA chimeric; the seven 3' terminal nts are RNA)	HPLC
3' MRL (TS7) adapter	phosphate	3AmMO	[Phos]GTATCGTNNNNNNTGGAATTCTCGG[3AmMO]	HPLC
RT primer			GCCTTGGCACCCGAGAATTCCA	HPLC
Universal P5 primer			AATGATACGGCGACCACCGAGATCTACACGTTCAGAGTTCTACAGTCCGA	HPLC
P7-index primer			CAAGCAGAAGACGGCATACGAGATNNNNNNGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA (NNNNNN = index)	HPLC
			index 1: CGTGAT
			index 2: ACATCG
			index 3: GCCTAA
			index 4: TGGTCA
			index 5: CACTGT
			index 6: ATTGGC
			index 7: GATCTG
			index 8: TCAAGT
			index 9: CTGATC
			index 10: AAGCTA
			index 11: GTAGCC
			index 12: TACAAG

Table 1. Oligonucleotides used with this protocol.

Discussion

Small RNA library preparation remains challenging due to bias, mainly introduced during adapter ligation steps. RNAs with a 2'-OMe modification at their 3' end such as plant miRNAs, piRNA in insects, nematodes and mammals, and small interfering RNAs (siRNA) in insects and plants are particularly difficult to study because the 2'-OMe modification inhibits 3' adapter ligation. A number of solutions have been proposed in the literature to improve sRNA library preparation protocols, but most commercially available kits are still based on the classical TS protocol, which has severe bias. A few 'low bias' kits exist, however, including the Nf kit with randomized adapters and PEG in the ligation reactions, and a few kits appeared recently that avoid adapter ligation altogether¹². We reported previously that Nf detects more different miRNAs than the standard TS protocol, but protocol 'S' (without adapter ligation) performed relatively poorly due to a significant formation of side-products¹². Surprisingly, upon modification the TS protocols had a more sensitive detection of 2'-OMe sRNAs than the other protocols, but not of normal sRNAs. We chose here to describe in detail the TS5 protocol, in which the adapters are randomized at their extremities, PEG is used in the ligation reactions and excess 3' adapter is eliminated by purification on beads. It should be noted here that a different protocol (TS7), using MRL adapters may perform slightly better than the TS5 protocol. However, as there are only minor differences between the two and because with TS7, it is more difficult to separate the desired library product from adapter dimers, we preferred here to describe in detail the TS5 protocol. However, users can, if desired, replace the TS5 adapters by TS7 adapters. Note that these are slightly longer leading to a final library product of ~170 bp rather than ~150 bp.

Performing protocol TS5 or TS7 with homemade materials can substantially reduce cost. However, quality may be more variable with homemade materials. In particular, the quality of the homemade pre-adenylated 3' adapter may vary because the efficiency of pre-adenylation varies. We therefore recommend preparing a large stock and, whenever a new stock is prepared, comparing its performance with that of the previous one. A control RNA sample can be used for this purpose.
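One possible way to compare two adapter stocks is to prepare a library from the same control RNA with each stock and compare the resulting sRNA read counts. The Python sketch below is an illustration under stated assumptions, not a procedure from this protocol. It normalizes counts to counts per million, log-transforms them, and computes the Pearson correlation between the two libraries. The function names, the pseudocount and the example counts are hypothetical, and no acceptance threshold is implied.

```python
import math


def log2_cpm(counts, pseudocount=1.0):
    """Convert raw read counts to log2 counts per million (CPM)."""
    total = sum(counts.values())  # must be > 0
    return {name: math.log2(c / total * 1e6 + pseudocount)
            for name, c in counts.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def compare_stocks(old_counts, new_counts):
    """Correlate log2 CPM values of libraries made with two adapter stocks."""
    # Use the union of sRNA names; an sRNA missing from one library counts as 0.
    names = sorted(set(old_counts) | set(new_counts))
    old = log2_cpm({n: old_counts.get(n, 0) for n in names})
    new = log2_cpm({n: new_counts.get(n, 0) for n in names})
    return pearson([old[n] for n in names], [new[n] for n in names])


# Hypothetical counts for a control RNA sample (made-up numbers).
old_stock = {"miR-a": 100, "miR-b": 900, "miR-c": 9000}
new_stock = {"miR-a": 200, "miR-b": 1800, "miR-c": 18000}
r = compare_stocks(old_stock, new_stock)
```

Normalizing to CPM before the comparison removes differences in sequencing depth, so that only changes in the relative representation of individual sRNAs lower the correlation.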

A disadvantage of the protocols described herein is the relatively strong formation of side-products and the difficulty of separating the desired library from them. Care must be taken not to overload the acrylamide gel used for purification. Recently, modified adapters were developed that have a strongly reduced tendency to form dimers²¹. A 2'-OMe modification at the 3' end of the 5' adapter, combined with a methylphosphonate modification at the 5' extremity of the 3' adapter, efficiently suppressed adapter dimers. It will be interesting to test such modified adapters in the protocol described here.

The gel purification steps in protocol TS5 are relatively labor-intensive. If the use of modified adapters efficiently reduces the formation of adapter dimers, gel purification of the final library may no longer be necessary. In addition, as an alternative to the gel purification step used to isolate small RNA (step 1), a strategy that uses magnetic beads to enrich for small RNAs exists (https://ls.beckmancoulter.co.jp/files/appli_note/Supplemental_Protocol_for_miRNA_.pdf). We have not used this method ourselves, but it is worth testing; if it works well, it could significantly simplify the protocol.

In conclusion, while protocol TS5 could be further improved, it performs better than commercially available kits, at least those tested in our previous comparative analysis, for the detection of 2'-OMe sRNAs. It can be followed using homemade materials, allowing a significant cost reduction. For convenience, and perhaps more consistent performance, reagents from the TS and Nf kits can be used.

Disclosures

The authors have nothing to disclose.

Acknowledgements

This work was supported by the National Center for Scientific Research (CNRS), The French Alternative Energies and Atomic Energy Commission (CEA) and Paris-Sud University. All library preparation, Illumina sequencing and bioinformatics analyses for this study were performed at the I2BC Next-Generation Sequencing (NGS) facility. The members of the I2BC NGS facility are acknowledged for critical reading of the manuscript and helpful suggestions.

Materials

2100 Bioanalyzer Instrument	Agilent	G2939BA
Acid-Phenol:Chloroform, pH 4.5 (with IAA, 125:24:1)	ThermoFisher	AM9720
Adenosine 5'-Triphosphate (ATP)	New England Biolabs	P0756S
Agencourt AMPure XP beads	Beckman Coulter	A63880
Bioanalyzer High Sensitivity DNA Kit	Agilent	5067-4626
Bioanalyzer Small RNA Kit	Agilent	5067-1548
Corning Costar Spin-X centrifuge tube filters	Sigma Aldrich	CLS8162-96EA
Dark Reader transilluminator	various suppliers
HotStart PCR Kit, with dNTPs	Kapa Biosystems	KK2501
NEXTflex small RNA-seq V3 kit	BIOO Scientific	NOVA-5132-05	optional
Novex TBE gels 6%	ThermoFisher	EC6265BOX
Novex TBE Urea gels 15%	ThermoFisher	EC6885BOX
QIAquick Nucleotide Removal Kit	Qiagen	28304
Qubit 4 Quantitation Starter Kit	ThermoFisher	Q33227
Qubit ssDNA Assay Kit	ThermoFisher	Q10212
RNA Gel Loading Dye (2X)	ThermoFisher	R0641
RNase Inhibitor, Murine	New England Biolabs	M0314S
SuperScript IV Reverse Transcriptase	ThermoFisher	18090200
SYBR Gold Nucleic Acid Gel Stain	ThermoFisher	S11494
T4 RNA Ligase 1 (ssRNA Ligase)	New England Biolabs	M0204S
T4 RNA Ligase 2, truncated	New England Biolabs	M0242S
TrackIt 50 bp DNA ladder	ThermoFisher	10488043
TruSeq Small RNA Library Prep Kit	Illumina	RS-200-0012/24/36/48	optional; the last two numbers correspond to the set of indexes
UltraPure Glycogen	ThermoFisher	10814010
XCell SureLock Mini-Cell	ThermoFisher	EI0001
ZR small RNA ladder	Zymo Research	R1090

References

Ghildiyal, M., Zamore, P. D. Small silencing RNAs: an expanding universe. Nature Reviews Genetics. 10, 94-108 (2009).
Chang, T. C., Mendell, J. T. microRNAs in vertebrate physiology and human disease. Annual Review of Genomics and Human Genetics. 8, 215-239 (2007).
Zhuang, F., Fuchs, R. T., Robb, G. B. Small RNA expression profiling by high-throughput sequencing: implications of enzymatic manipulation. Journal of Nucleic Acids. 2012, 360358 (2012).
van Dijk, E. L., Jaszczyszyn, Y., Thermes, C. Library preparation methods for next-generation sequencing: tone down the bias. Experimental Cell Research. 322, 12-20 (2014).
Munafo, D. B., Robb, G. B. Optimization of enzymatic reaction conditions for generating representative pools of cDNA from small RNA. RNA. 16, 2537-2552 (2010).
Hafner, M., et al. RNA-ligase-dependent biases in miRNA representation in deep-sequenced small RNA cDNA libraries. RNA. 17, 1697-1712 (2011).
Sorefan, K., et al. Reducing ligation bias of small RNAs in libraries for next generation sequencing. Silence. 3, 4 (2012).
Sun, G., et al. A bias-reducing strategy in profiling small RNAs using Solexa. RNA. 17, 2256-2262 (2011).
Jayaprakash, A. D., Jabado, O., Brown, B. D., Sachidanandam, R. Identification and remediation of biases in the activity of RNA ligases in small-RNA deep sequencing. Nucleic Acids Research. 39, 141 (2011).
Zhuang, F., Fuchs, R. T., Sun, Z., Zheng, Y., Robb, G. B. Structural bias in T4 RNA ligase-mediated 3′-adapter ligation. Nucleic Acids Research. 40, 54 (2012).
Fuchs, R. T., Sun, Z., Zhuang, F., Robb, G. B. Bias in ligation-based small RNA sequencing library construction is determined by adaptor and RNA structure. PLoS One. 10, 0126049 (2015).
Dard-Dascot, C., et al. Systematic comparison of small RNA library preparation protocols for next-generation sequencing. BMC Genomics. 19, 118 (2018).
Van Nieuwerburgh, F., et al. Quantitative bias in Illumina TruSeq and a novel post amplification barcoding strategy for multiplexed DNA and small RNA deep sequencing. PLoS One. 6, 26969 (2011).
Harrison, B., Zimmerman, S. B. Polymer-stimulated ligation: enhanced ligation of oligo- and polynucleotides by T4 RNA ligase in polymer solutions. Nucleic Acids Research. 12, 8235-8251 (1984).
Song, Y., Liu, K. J., Wang, T. H. Elimination of ligation dependent artifacts in T4 RNA ligase to achieve high efficiency and low bias microRNA capture. PLoS One. 9, 94619 (2014).
Zhang, Z., Lee, J. E., Riemondy, K., Anderson, E. M., Yi, R. High-efficiency RNA cloning enables accurate quantification of miRNA expression by deep sequencing. Genome Biology. 14, 109 (2013).
Barberan-Soler, S., et al. Decreasing miRNA sequencing bias using a single adapter and circularization approach. Genome Biology. 19, 105 (2018).
Chen, Y. R., et al. A cost-effective method for Illumina small RNA-Seq library preparation using T4 RNA ligase 1 adenylated adapters. Plant Methods. 8, 41 (2012).
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal. 17, 10-12 (2011).
Langmead, B., Trapnell, C., Pop, M., Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology. 10, 25 (2009).
Shore, S., et al. Small RNA Library Preparation Method for Next-Generation Sequencing Using Chemical Modifications to Prevent Adapter Dimer Formation. PLoS One. 11, 0167009 (2016).

Cite this Article

van Dijk, E. L., Eleftheriou, E., Thermes, C. Improving Small RNA-seq: Less Bias and Better Detection of 2′-O-Methyl RNAs. J. Vis. Exp. (151), e60056, doi:10.3791/60056 (2019).
