The role of RNA modifications in viral infections is just starting to be explored and could highlight new viral-host interaction mechanisms. In this work, we provide a pipeline to investigate m6A and m5C RNA modifications in the context of viral infections.
The role of RNA modifications in biological processes has been the focus of an increasing number of studies in the last few years and is known nowadays as epitranscriptomics. Among others, N6-methyladenosine (m6A) and 5-methylcytosine (m5C) RNA modifications have been described on mRNA molecules and may have a role in modulating cellular processes. Epitranscriptomics is thus a new layer of regulation that must be considered in addition to transcriptomic analyses, as it can also be altered or modulated by exposure to any chemical or biological agent, including viral infections.
Here, we present a workflow that allows analysis of the joint cellular and viral epitranscriptomic landscape of the m6A and m5C marks simultaneously, in cells infected or not with the human immunodeficiency virus (HIV). Upon mRNA isolation and fragmentation from HIV-infected and non-infected cells, we used two different procedures: MeRIP-Seq, an RNA immunoprecipitation-based technique, to enrich for RNA fragments containing the m6A mark, and BS-Seq, a bisulfite conversion-based technique, to identify the m5C mark at single-nucleotide resolution. Upon methylation-specific capture, RNA libraries are prepared for high-throughput sequencing. We also developed a dedicated bioinformatics pipeline to identify differentially methylated (DM) transcripts independently of their basal expression profile.
Overall, the methodology allows exploration of multiple epitranscriptomic marks simultaneously and provides an atlas of DM transcripts upon viral infection or any other cell perturbation. This approach offers new opportunities to identify novel players and novel mechanisms of cell response, such as cellular factors promoting or restricting viral replication.
It has long been known that RNA molecules can be modified, and more than 150 post-transcriptional modifications have been described to date1. They consist of the addition of chemical groups, mainly methyl groups, to virtually any position of the pyrimidine and purine rings of RNA molecules2. Such post-transcriptional modifications have already been shown to be highly enriched in transfer RNA (tRNA) and ribosomal RNA (rRNA) and have recently been described on mRNA molecules as well.
The rise of new technologies, such as Next Generation Sequencing (NGS), and the production of specific antibodies recognizing definite chemical modifications allowed, for the first time, the investigation of the location and the frequency of specific chemical modifications at a transcriptome-wide level. These advancements have led to a better understanding of RNA modifications and to the mapping of several modifications on mRNA molecules3,4.
While epigenetics investigates the role of DNA and histone modifications in transcriptome regulation, epitranscriptomics similarly focuses on RNA modifications and their role. The investigation of epitranscriptomic modifications provides new opportunities to highlight novel mechanisms of regulation that may tune a variety of cellular processes (e.g., RNA splicing, export, stability and translation)5. It was thus no great surprise that recent studies uncovered many epitranscriptomic modifications upon viral infection in both cellular and viral RNAs6. Viruses investigated so far include both DNA and RNA viruses; among them, HIV can be considered a pioneering example. Altogether, the discovery of RNA methylation in the context of viral infections may allow the investigation of yet undescribed mechanisms of viral expression or replication, thus providing new tools and targets to control them7.
In the field of HIV epitranscriptomics, modifications of viral transcripts have been widely investigated, and these studies showed that their presence was beneficial for viral replication8,9,10,11,12,13. To date, various techniques can be used to detect epitranscriptomic marks at the transcriptome-wide level. The most used techniques for m6A identification rely on immunoprecipitation, such as MeRIP-Seq and miCLIP. While MeRIP-Seq relies on RNA fragmentation to capture fragments containing methylated residues, miCLIP is based on the generation of α-m6A antibody-specific signature mutations upon RNA-antibody UV crosslinking, thus allowing a more precise mapping.
Detection of the m5C modification can be achieved by antibody-based technologies similar to m6A detection (m5C RIP), by bisulfite conversion, by Aza-IP or by miCLIP. Both Aza-IP and m5C miCLIP use a specific methyltransferase as bait to capture its RNA targets during the methylation reaction. In Aza-IP, target cells are exposed to 5-azacytidine, resulting in the random introduction of the cytidine analog 5-azacytidine into nascent RNA. In miCLIP, the NSun2 methyltransferase is genetically modified to harbor the C271A mutation14,15.
In this work, we focus on the dual characterization of m6A and m5C modifications in infected cells, using HIV as a model. Upon methodological optimization, we have developed a workflow that combines methylated RNA immunoprecipitation (MeRIP) and RNA bisulfite conversion (BS), allowing the simultaneous exploration of m6A and m5C epitranscriptomic marks at a transcriptome-wide level, in both cellular and viral contexts. This workflow can be implemented on cellular RNA extracts as well as on RNA isolated from viral particles.
The Methylated RNA ImmunoPrecipitation (MeRIP)16 approach, which allows investigation of m6A at the transcriptome-wide level, is well established, and an array of m6A-specific antibodies is commercially available to date17. This method consists of the selective capture of m6A-containing RNA fragments using an m6A-specific antibody. The two major drawbacks of this technique are (i) the limited resolution, which is highly dependent on the size of the RNA fragments and thus provides only an approximate location of the region containing the methylated residue, and (ii) the large amount of material needed to perform the analysis. In the following optimized protocol, we standardized the fragment size to about 150 nt and reduced the amount of starting material from 10 µg of poly-A-selected RNA, which is currently the advised amount of starting material, to only 1 µg of poly-A-selected RNA. We also maximized the recovery efficiency of m6A RNA fragments bound to specific antibodies by eluting through competition with an m6A peptide instead of the more conventional and less specific elution methods based on phenol or proteinase K. The main limitation of this RIP-based assay, however, remains the suboptimal resolution, which does not allow the identification of the precise modified A nucleotide.
Analysis of the m5C mark can currently be performed using two different approaches: a RIP-based method with m5C-specific antibodies and RNA bisulfite conversion. As RIP offers only limited resolution for the identification of the methylated residue, we used bisulfite conversion, which can offer single-nucleotide resolution. RNA exposure to bisulfite (BS) leads to cytosine deamination, thereby converting the cytosine residue into uracil. Thus, during the RNA bisulfite conversion reaction, every non-methylated cytosine is deaminated and converted to uracil, while the presence of a methyl group in position 5 of the cytosine has a protective effect, preventing the BS-induced deamination and preserving the cytosine residue. The BS-based approach allows for the detection of an m5C-modified nucleotide at single-base resolution and for assessment of the methylation frequency of each transcript, providing insights into m5C modification dynamics18. The main limitation of this technique, however, lies in the false positive rate of methylated residues. Indeed, BS conversion is effective on single-stranded RNA with accessible C residues. The presence of a tight RNA secondary structure could mask C residues and hamper BS conversion, resulting in non-methylated C residues that are not converted to U residues, and thus in false positives. To circumvent this issue and minimize the false positive rate, we applied 3 rounds of denaturation and bisulfite conversion cycles19. We also introduced two controls in the samples to assess the bisulfite treatment: we spiked in ERCC sequencing controls (non-methylated, standardized and commercially available sequences)20 to estimate the bisulfite conversion rate, and poly-A-depleted RNAs to verify by RT-PCR the presence of a known and well-conserved methylated site, C4447, on 28S ribosomal RNA21.
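The conversion logic described above can be illustrated with a minimal in silico sketch. It is not part of the protocol: the function names, the toy sequence and the methylated position are hypothetical, and the model assumes complete conversion of every accessible, non-methylated C.

```python
def bisulfite_convert(rna, methylated_positions=frozenset()):
    """Return the sequence expected after complete bisulfite conversion.

    rna: RNA sequence (A, C, G, U), written 5'->3'.
    methylated_positions: 0-based indices of m5C residues (hypothetical input).
    """
    converted = []
    for i, base in enumerate(rna.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("U")   # non-methylated C is deaminated to U
        else:
            converted.append(base)  # m5C, A, G and U are left unchanged
    return "".join(converted)


def call_m5c_candidates(reference, converted):
    """List 0-based positions where a reference C is still read as C."""
    return [
        i
        for i, (ref, obs) in enumerate(zip(reference.upper(), converted.upper()))
        if ref == "C" and obs == "C"
    ]


# Toy example: only the C at index 3 carries a methyl group.
reference = "ACGCUCCAG"
converted = bisulfite_convert(reference, methylated_positions={3})
print(converted)                                  # AUGCUUUAG
print(call_m5c_candidates(reference, converted))  # [3]
```

In real data, a C that resists conversion because of secondary structure would be indistinguishable from an m5C in this comparison, which is why repeated denaturation cycles and conversion controls are needed.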
In the field of virology, coupling these two epitranscriptomic investigation methods with next generation sequencing and accurate bioinformatic analysis allows for the in-depth study of m6A and m5C dynamics (i.e., RNA modification temporal changes that could occur upon viral infection) and could uncover an array of new therapeutically relevant targets for clinical use.
1. Cell Preparation
NOTE: Depending on the cell type and its RNA content, the starting number of cells can vary.
2. RNA Extraction
3. mRNA Isolation by poly-A Selection with Oligo(dT)25
NOTE: Due to the presence of highly methylated ribosomal RNA in cellular extracts, it is highly recommended to isolate poly-A RNA either by rRNA depletion or, preferably, by poly-A positive selection. This step is optional and should be performed for cellular RNA samples only, to obtain sequencing results at higher resolution. If analyzing methylation of non-poly-adenylated viral RNAs, favor rRNA depletion rather than poly-A selection, or alternatively perform the analysis on total RNA.
4. RNA workflow
5. RNA Fragmentation
NOTE: RNA fragmentation is carried out with the RNA fragmentation reagent and is intended for MeRIP-Seq and control RNA samples. This is a critical step that requires careful optimization in order to obtain fragments ranging between 100 and 200 nt.
6. RNA Purification
NOTE: This step can be carried out by ethanol precipitation or with any kind of column-based RNA purification and concentration method (e.g., RNA Clean and Concentrator).
7. MeRIP
NOTE: A minimum of 2.5 µg of fragmented mRNA is required for each immunoprecipitation (IP), either using a specific anti-m6A antibody (test condition) or using an anti-IgG antibody (negative control).
8. RNA Bisulfite Conversion
9. Library Preparation and High-Throughput Sequencing
10. Bioinformatics Analyses
This workflow has proven useful to investigate the role of m6A and m5C methylation in the context of HIV infection. For this, we used a CD4+ T cell line model (SupT1) that we either infected with HIV or left untreated. We started the workflow with 50 million cells per condition and obtained an average of 500 µg of total RNA with an RNA quality number of 10 (Figure 1A-B). Upon poly-A selection, we retrieved between 10 and 12 µg of mRNA per condition (representing about 2% of total RNA) (Figure 1B). At this point, we used 5 µg of poly-A-selected RNA for the MeRIP-Seq pipeline and 1 µg for the BS-Seq pipeline. Since HIV RNA is poly-adenylated, no further action is needed and the MeRIP-Seq and BS-Seq procedures can be directly applied.
Figure 1: RNA preparation for downstream applications. A) Workflow depicting RNA preparation and distribution for simultaneous MeRIP-Seq and BS-Seq pipelines. Every filled hexagonal shape represents an RNA modification type, such as m6A (green) or m5C (pink). Amounts of RNA material needed to carry out the experiment are indicated. B) Representative results depicting expected RNA distribution profiles (size and amount) upon total RNA extraction (upper panel) and poly-A selection (lower panel). Samples were loaded on the fragment analyzer with standard sensitivity kit in order to assess RNA quality before entering specific MeRIP-Seq and BS-Seq procedures. RQN: RNA quality number; nt: nucleotides.
The MeRIP-Seq pipeline is an RNA immunoprecipitation-based technique that allows investigation of the m6A modification along RNA molecules. For this, RNA is first fragmented and then incubated with m6A-specific antibodies coupled to magnetic beads for immunoprecipitation and capture. MeRIP-enriched RNA fragments and the untouched (input) fraction are then sequenced and compared to identify m6A-modified RNA regions and thus m6A-methylated transcripts (Figure 2A). The resolution of the technique relies on the efficiency of RNA fragmentation: shorter fragments allow for a more precise localization of the m6A residue. Here, cellular poly-A-selected RNAs and viral RNAs were subjected to ion-based fragmentation with RNA fragmentation buffer for 15 min in a 20 µL final volume to obtain RNA fragments of 100-150 nt. Starting with 5 µg of mRNA, we recovered 4.5 µg of fragmented RNA, corresponding to a recovery rate of 90% (Figure 2B). We used 100 ng of fragmented, purified RNA as input control, subjected directly to library preparation and sequencing. The remaining RNA (~4.4 µg) was processed according to the MeRIP-Seq pipeline, which starts with incubation of fragmented RNA with beads bound either to anti-m6A specific antibodies or to anti-IgG antibodies as control. m6A-specific RIP (MeRIP) of 2.5 µg of fragmented RNA yielded around 15 ng of m6A-enriched material that underwent library preparation and sequencing (Figure 2B). RIP with the anti-IgG control, as expected, did not yield enough RNA to allow further analysis (Figure 2B).
Figure 2: MeRIP-Seq pipeline. A) Schematic representation of MeRIP-Seq workflow and input control. Upon poly-A selection, samples were fragmented into 120-150 nt pieces and, either directly subjected to sequencing (100 ng, input control), or used for RNA immunoprecipitation (2.5 µg, RIP) with anti-m6A specific antibody or anti-IgG antibody as negative control prior to sequencing. B) Representative results showing expected RNA distribution profiles (size and amount) upon fragmentation (upper panel) and RIP (lower panels, MeRIP: left, IgG control: right). Samples were loaded on fragment analyzer to evaluate RNA quality and concentration before further processing to library preparation and sequencing. Fragmented RNA analysis was performed using the RNA standard sensitivity kit while immunoprecipitated RNA used the high sensitivity kit.
The BS-Seq pipeline allows exploration of the m5C RNA modification at nucleotide resolution and leads to the identification of m5C-methylated transcripts. Upon bisulfite conversion, non-methylated cytosines are converted into uracil, while methylated cytosines remain unchanged (Figure 3A). Due to the harsh conditions of the bisulfite conversion procedure (i.e., high temperature and low pH), converted mRNAs are highly degraded (Figure 3B); however, this does not interfere with library preparation and sequencing. Bisulfite conversion is efficient only on single-stranded RNA and can thus potentially be hindered by secondary double-stranded RNA structures. To evaluate the efficiency of C-U conversion, we introduced two controls. As a positive control, we took advantage of the previously described presence of a highly methylated cytosine in position C4447 of the 28S rRNA23. Upon RT-PCR amplification and sequencing of a 200 bp fragment surrounding the methylated site, we observed that all cytosines were successfully converted to uracils, thereby appearing as thymidines in the DNA sequence, except the cytosine in position 4447, which remained unchanged (Figure 3C). As a control for the bisulfite conversion rate, we used commercially available synthetic ERCC RNA sequences. This mixture consists of a pool of known, non-methylated and poly-adenylated RNA sequences, with a variety of secondary structures and lengths. Upon library preparation and sequencing, we focused on these ERCC sequences to calculate the conversion rate, which is obtained by counting the number of converted C among the total C residues in all the ERCC sequences and in each sample. We obtained a conversion rate of 99.5%, confirming the efficiency and the success of the bisulfite conversion reaction (Figure 3D).
Figure 3: BS-Seq pipeline. A) Schematic representation of BS-Seq workflow. Upon poly-A selection, samples are exposed to bisulfite, resulting in C to U conversion (due to deamination) for non-methylated C residues. In contrast, methylated C residues (m5C) are not affected by bisulfite treatment and remain unchanged. B) Representative result of bisulfite converted RNA distribution profile (size and amount) upon analysis on fragment analyzer with a standard sensitivity kit. C) Electropherogram showing representative sequencing result of RT-PCR amplicon of the region surrounding the 100% methylated C at position 4447 in 28S rRNA (highlighted in blue). In contrast, C residues of the reference sequence were identified as T residues in the amplicon sequence due to successful bisulfite conversion. D) Evaluation of C-U conversion rate by analysis of ERCC spike-in sequences in HIV-infected and non-infected cells. The average conversion rate is 99.5%.
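The conversion rate calculation on the ERCC spike-ins can be sketched as follows. This is an illustrative example, not the authors' pipeline: it assumes ungapped read-to-reference alignments supplied as pairs of equal-length strings in cDNA space (converted C read as T), and the function name and toy alignments are hypothetical.

```python
def conversion_rate(alignments):
    """Fraction of reference C positions that are read as T.

    alignments: iterable of (reference_segment, read) pairs of equal length,
    ungapped, in the DNA alphabet (hypothetical simplified input).
    """
    converted = 0
    informative = 0
    for ref, read in alignments:
        if len(ref) != len(read):
            raise ValueError("reference segment and read must have equal length")
        for ref_base, read_base in zip(ref.upper(), read.upper()):
            if ref_base != "C":
                continue
            if read_base == "T":      # converted cytosine
                converted += 1
                informative += 1
            elif read_base == "C":    # non-converted cytosine
                informative += 1
            # any other base at a reference C (error, N) is ignored
    if informative == 0:
        raise ValueError("no informative cytosine positions")
    return converted / informative


# Toy example: 5 of 6 reference cytosines are read as T.
toy_alignments = [("ACCGTC", "ATTGTT"), ("CCAC", "TCAT")]
print(f"{conversion_rate(toy_alignments):.1%}")  # 83.3%
```

Because the ERCC sequences are non-methylated, every C that is still read as C in these spike-ins counts as a conversion failure, so one minus this rate gives an estimate of the false positive rate per cytosine.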
m6A-enriched samples, bisulfite-converted samples and input controls are further processed for library preparation, sequencing and bioinformatic analysis (Figure 4). According to the experimental design and biological question(s) addressed, multiple bioinformatic analyses can be applied. As proof of principle, we show here representative results from one potential application (i.e., differential methylation analysis), which focuses on the identification of differentially methylated transcripts induced upon HIV infection. Briefly, we investigated the m6A or m5C methylation level of transcripts, independently of their gene expression level, in both non-infected and HIV-infected cells, in order to further understand the role of RNA methylation during the viral life cycle. Upon gene expression normalization, we identified that the ZNF469 transcript was differentially m6A-methylated according to the infection status: this transcript was not methylated in non-infected cells, while it displayed several methylated peaks upon HIV infection (Figure 5A). A similar differential methylation analysis on m5C revealed that the PHLPP1 transcript contained several methylated residues, which tend to be more frequently methylated in the HIV condition (Figure 5B). In this context, both analyses suggest that HIV infection impacts the cellular epitranscriptome.
Figure 4: Schematic representation of the bioinformatic workflow for the analysis of m6A and m5C data.
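To illustrate what assessing m6A signal independently of expression can mean in practice, here is a minimal sketch of an input-subtracted signal computed per genomic window. It is an assumption-laden illustration rather than the pipeline of Figure 4: the counts-per-million scaling, the clipping at zero, the function names and the toy counts are all hypothetical choices.

```python
def counts_per_million(counts, library_size):
    """Scale raw window counts by library size (reads per million)."""
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return [c * 1_000_000 / library_size for c in counts]


def input_subtracted_signal(ip_counts, input_counts, ip_library_size, input_library_size):
    """Per-window MeRIP signal after subtracting the scaled input coverage.

    Negative differences are clipped to 0 (illustrative choice), so that
    windows that are merely highly expressed do not appear as peaks.
    """
    if len(ip_counts) != len(input_counts):
        raise ValueError("IP and input must cover the same windows")
    ip_cpm = counts_per_million(ip_counts, ip_library_size)
    input_cpm = counts_per_million(input_counts, input_library_size)
    return [max(0.0, ip - inp) for ip, inp in zip(ip_cpm, input_cpm)]


# Toy example with three windows along one transcript.
signal = input_subtracted_signal(
    ip_counts=[10, 200, 30],
    input_counts=[40, 100, 80],
    ip_library_size=1_000_000,
    input_library_size=2_000_000,
)
print(signal)  # [0.0, 150.0, 0.0]
```

Only the second window keeps a signal after subtraction; the other two are explained by expression alone. Dedicated peak callers use more elaborate statistical models than this subtraction.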
Figure 5: Example of differentially methylated transcripts upon infection. A) Representative result showing m6A methylation of the ZNF469 transcript in HIV-infected (green) and non-infected (grey) cells. Peak intensity (upon input expression subtraction) is shown on the y-axis and position in the chromosome along the x-axis. Differential methylation analysis reveals that the ZNF469 transcript is hypermethylated upon HIV infection. B) Representative result of m5C methylated gene in HIV-infected (upper lane) and non-infected (lower lane) cells. The height of each bar represents the number of reads per nucleotide and allows coverage assessment. Each C residue is represented in red, and the proportion of methylated C is represented in blue. The exact methylation rate (%) is reported above each C residue. Arrows highlight statistically significant differentially methylated C. Samples were visualized using IGV viewer.
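The per-cytosine methylation rate and the comparison between conditions can be sketched as below. The source does not state which statistical test was applied; the two-sided Fisher's exact test used here is an assumption chosen for illustration, and the read counts are invented.

```python
from math import comb


def methylation_rate(c_reads, t_reads):
    """Fraction of reads supporting a non-converted (methylated) C at one site."""
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("site has no coverage")
    return c_reads / total


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Rows: conditions (e.g., infected, non-infected).
    Columns: reads with C (methylated) and reads with T (converted).
    """
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def probability(x):
        # Hypergeometric probability of observing x methylated reads in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = probability(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables at least as extreme as the observed one.
    p_value = sum(
        probability(x)
        for x in range(low, high + 1)
        if probability(x) <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Hypothetical site: 30/100 methylated reads in infected cells, 10/100 in non-infected cells.
print(methylation_rate(30, 70), methylation_rate(10, 90))  # 0.3 0.1
print(fisher_exact_two_sided(30, 70, 10, 90) < 0.01)       # True
```

In a transcriptome-wide analysis, such a test would be repeated at every covered C, so the resulting p-values would need correction for multiple testing.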
The role of RNA modifications in viral infection is still largely unknown. A better understanding of the role of epitranscriptomic modifications in the context of viral infection could contribute to the quest for new antiviral treatment targets.
In this work, we provide a complete workflow that allows investigation of the m6A and m5C epitranscriptomes of infected cells. Depending on the biological question, we advise using poly-A-selected RNA as starting material. Although this step is optional, as the pipeline can be used with total RNA, it is important to keep in mind that rRNAs as well as small RNAs are highly modified and contain a large number of methylated residues. This could result in a decreased quality and quantity of meaningful sequencing data.
However, if the focus of the study is non-poly-adenylated RNA, the RNA extraction step should be adapted in order to avoid discarding small RNA (in the case of column-based RNA extraction) and to favor ribosomal RNA depletion techniques rather than poly-A selection to enter the pipeline.
In order to ensure high-quality RNA, correct fragmentation, and suitable quality of the m6A-enriched and BS-converted RNA for library preparation, we strongly advise using a fragment analyzer or a bioanalyzer. However, this equipment is not always available. As an alternative, the quality of RNA and mRNA and the size of fragmented RNA can also be assessed by visualization on an agarose gel. Library preparation can also be performed without prior assessment of RNA quantity.
We used the antibody-based MeRIP-Seq16 technique to explore the m6A epitranscriptomic landscape. This technique is based on RNA immunoprecipitation and is well proven; however, some critical steps need careful optimization. Although m6A methylation has been described to occur mainly within the consensus sequence RRA*CH, this motif is highly frequent along mRNA molecules and does not allow precise identification of the methylated site. It is thus critical to achieve a reproducible and consistent RNA fragmentation, generating small RNA fragments, to improve the RIP-based resolution. In this protocol, we recommend an optimized procedure that provides reproducible and consistent results in our experimental setting; however, this fragmentation step may need further optimization according to specific sample features.
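The frequency of the RRACH motif can be checked on any sequence with a short scan such as the sketch below (R = A or G, H = A, C or U). The function name and the toy sequence are hypothetical; a lookahead is used so that overlapping motif occurrences are all reported.

```python
import re

# R = A or G; H = A, C or U. The central A (third base) is the candidate m6A site.
# The lookahead makes matches zero-width, so overlapping motifs are all found.
RRACH = re.compile(r"(?=([AG][AG]AC[ACU]))")


def find_rrach_sites(rna):
    """Return 0-based positions of the central A of every RRACH occurrence."""
    sequence = rna.upper().replace("T", "U")  # accept cDNA-style input as well
    return [match.start() + 2 for match in RRACH.finditer(sequence)]


# Toy example: three candidate sites within a 15 nt sequence.
print(find_rrach_sites("GGACUAAACAGGACC"))  # [2, 7, 12]
```

With several candidate adenosines falling within a single 100-150 nt fragment, a MeRIP peak alone cannot designate which one is methylated, which is the resolution limit discussed above.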
Recently, a new technique allowing direct m6A sequencing was described. It is based on the use of specific reverse transcriptase variants that exhibit unique RT signatures in response to encountering the m6A RNA modification24. This technology, upon careful optimization, could circumvent the major limitations of MeRIP-Seq by decreasing the amount of initial material required and allowing a higher resolution. To explore the m5C modification, we decided to use the bisulfite conversion technique in order to detect the modified C residues at nucleotide resolution. In order to reduce the false positive rate due to the presence of RNA secondary structures, we performed 3 cycles of denaturation/bisulfite conversion and further monitored the bisulfite conversion rate using ERCC spike-in controls. One of the limitations linked to this technique is that bisulfite conversion is very harsh, and three cycles of denaturation/bisulfite conversion could degrade some RNA and hence reduce resolution. However, in our setting, we chose to settle for a potentially slightly lower resolution in order to increase the quality of the dataset.
Thanks to these optimizations and controls, we were able to provide a reliable and sound workflow that can be exploited to investigate the epitranscriptomic landscape and its alteration in the context of viral infections, host-pathogen interactions, or any exposure to specific treatments.
The authors have nothing to disclose.
This work was supported by the Swiss National Science Foundation (grants 31003A_166412 and 314730_188877).
AccuPrime Pfx SuperMix | Invitrogen | 12344-040 | |
anti-m6A antibody, clone 17-3-4-1 | Millipore | MABE1006 | |
Chloroform | Merck | 67-66-3 | |
ERCC | Invitrogen | 4456740 | |
EZ RNA Methylation Kit | Zymo Research | EZR5001 | |
Fragment analyzer RNA Kit – HS RNA Kit | Agilent | DNF-472-0500 | |
Fragment analyzer RNA Kit – RNA Kit | Agilent | DNF-471-0500 | |
High-Capacity cDNA Reverse Transcription Kit | Applied Biosystems | 4368814 | |
Illumina TruSeq Stranded mRNA | Illumina | 20020594 | |
Magnetic Beads A/G Blend | Merck | 16-663 | |
N6-Methyladenosine, 5′-monophosphate sodium salt (m6A) | Sigma Aldrich | M2780-10MG | |
Normal Mouse IgG | Merck | 12371 | |
Oligo(dT)25 | Life Technologies | 61005 | |
PCRapace | Stratec | 1020220300 | |
Quick RNA Viral Kit | Zymo Research | 1034 | |
RNA Clean & Concentrator | Zymo Research | R1015 | |
RNA Fragmentation Reagent | Ambion | AM8740 | |
RNase Inhibitor | Ambion | AM2684 | |
Trizol | TRIzol Reagent | 15596026 |