Here we present a modified CLIP-seq protocol called FbioCLIP-seq with FLAG-biotin tandem purification to determine the RNA targets of RNA-binding proteins (RBPs) in mammalian cells.
RNAs and RNA-binding proteins (RBPs) control multiple biological processes. The spatial and temporal arrangement of RNAs and RBPs underlies the delicate regulation of these processes. A strategy called CLIP-seq (cross-linking and immunoprecipitation) has been developed to capture endogenous protein-RNA interactions with UV cross-linking followed by immunoprecipitation. Despite the wide use of the conventional CLIP-seq method in RBP studies, the CLIP method is limited by the availability of high-quality antibodies, potential contaminants from copurified RBPs, the requirement for isotope handling, and potential loss of information during a tedious experimental procedure. Here we describe a modified CLIP-seq method called FbioCLIP-seq that uses FLAG-biotin tag tandem purification. Through tandem purification and stringent wash conditions, almost all the interacting RNA-binding proteins are removed. Thus, indirect RNA interactions mediated by these copurified RBPs are also reduced. Our FbioCLIP-seq method allows efficient detection of direct protein-bound RNAs without SDS-PAGE and membrane transfer procedures in an isotope-free and protein-specific antibody-free manner.
RNAs and RNA-binding proteins (RBPs) control diverse cellular processes including splicing, translation, ribosome biogenesis, epigenetic regulation, and cell fate transition1,2,3,4,5,6. The delicate mechanisms of these processes depend on the unique spatial and temporal arrangement of RNAs and RBPs. Therefore, an important step towards understanding RNA regulation at the molecular level is to reveal the positional information about the binding sites of RBPs.
A strategy referred to as cross-linking and immunoprecipitation followed by high-throughput sequencing (CLIP-seq) has been developed to capture protein-RNA interactions with UV cross-linking followed by immunoprecipitation of the protein of interest7. The key feature of the methodology is the induction of covalent cross-links between an RNA-binding protein and its directly bound RNA molecules (within ~1 Å) by UV irradiation8. The RBP footprints can be determined by CLIP tag clustering and peak calling, which usually have a resolution of 30−60 nt. Alternatively, the reverse transcription step of CLIP can introduce indels (insertions or deletions) or substitutions at the cross-linking sites, which allows identification of protein binding sites on the RNAs at single-nucleotide resolution. Tools such as Novoalign and CIMS have been developed for the analysis of the high-throughput sequencing results of CLIP-seq8. Several modified CLIP-seq methods have also been proposed, including individual-nucleotide resolution cross-linking and immunoprecipitation (iCLIP), enhanced CLIP (eCLIP), irCLIP, and photoactivatable ribonucleoside-enhanced cross-linking and immunoprecipitation (PAR-CLIP)9,10,11,12.
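The tag clustering idea mentioned above can be illustrated with a minimal Python sketch. It merges overlapping mapped tags into candidate footprint regions and keeps regions supported by a minimum number of tags. This is only an illustration under stated assumptions, not the CIMS pipeline: the function name `cluster_tags`, the `min_tags` threshold, the toy coordinates, and the decision to ignore strand are all hypothetical simplifications.

```python
from typing import List, Tuple

Tag = Tuple[str, int, int]  # (chromosome, start, end), 0-based half-open


def cluster_tags(tags: List[Tag], min_tags: int = 2) -> List[Tuple[str, int, int, int]]:
    """Merge overlapping tags on the same chromosome into clusters.

    Returns (chromosome, start, end, tag_count) for every cluster that
    contains at least `min_tags` tags. Strand is ignored for brevity.
    """
    clusters = []
    current = None  # [chromosome, start, end, tag_count]
    for chrom, start, end in sorted(tags):
        if current and chrom == current[0] and start < current[2]:
            # Tag overlaps the open cluster: extend it and count the tag.
            current[2] = max(current[2], end)
            current[3] += 1
        else:
            # Close the previous cluster (if well supported) and open a new one.
            if current and current[3] >= min_tags:
                clusters.append(tuple(current))
            current = [chrom, start, end, 1]
    if current and current[3] >= min_tags:
        clusters.append(tuple(current))
    return clusters


# Toy coordinates for illustration only (not data from this study).
toy_tags = [
    ("chr1", 100, 140),
    ("chr1", 120, 165),
    ("chr1", 150, 190),
    ("chr1", 500, 540),
    ("chr2", 100, 140),
    ("chr2", 130, 170),
]
clusters = cluster_tags(toy_tags)
for cluster in clusters:
    print(cluster)
```

In practice, dedicated peak callers also model background and strand; the sketch only shows why overlapping tags yield footprints wider than a single cross-linking site.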
Despite the wide use of traditional CLIP-seq methods in the study of RBPs, the CLIP methods have several drawbacks. First, the tedious denaturing gel electrophoresis and membrane transfer procedure may lead to loss of information and limit sequence complexity. Second, the protein-specific antibody-based CLIP method may pull down a protein complex instead of a single target protein, which may lead to false positive protein-RNA interactions from the copurified RBPs. Third, the antibody-based strategy requires a large amount of high-quality antibodies, which makes these methods unsuitable for the study of RBPs for which no high-quality antibodies are available. Fourth, the traditional CLIP method requires radiolabeled ATP to label the protein-bound RNAs.
The high affinity of streptavidin for biotinylated proteins makes biotin-based purification a very powerful approach to purify specific proteins or protein complexes. The efficient biotinylation of proteins carrying an artificial peptide sequence by ectopically expressed bacterial BirA biotin ligase in mammalian cells makes it an efficient strategy to perform biotin purification in vivo13. We developed a modified CLIP-seq method called FbioCLIP-seq (FLAG-Biotin-mediated Cross-linking and Immunoprecipitation followed by high-throughput sequencing) using FLAG-biotin tag tandem purification14 (Figure 1). Through tandem purification and stringent wash conditions, almost all the interacting RBPs are removed (Figure 2). The stringent wash conditions also make it possible to skip the SDS-PAGE and membrane transfer steps, which are labor-intensive and technically challenging. Similar to eCLIP and irCLIP, the FbioCLIP-seq method is isotope-free. Skipping the gel running and transfer steps avoids the loss of information, keeps authentic protein-RNA interactions intact, and increases the library complexity. Moreover, the high efficiency of the tagging system makes it a good choice for RBPs for which no high-quality antibodies are available.
Here we provide a step-by-step description of the FbioCLIP-seq protocol for mammalian cells. Briefly, cells are cross-linked by 254 nm UV, followed by cell lysis and FLAG immunoprecipitation (FLAG-IP). Next, the protein-RNA complexes are further purified by biotin affinity capture and RNAs are fragmented by partial digestion with MNase. Then, the protein-bound RNA is dephosphorylated and ligated with a 3’ linker. A 5’ RNA linker is added after the RNA is phosphorylated with PNK and eluted by proteinase K digestion. After reverse transcription, the protein-bound RNA signals are amplified by PCR and purified by agarose gel purification. Two RBPs were chosen to exemplify the FbioCLIP-seq result. LIN28 is a well-characterized RNA-binding protein involved in microRNA maturation, protein translation, and cell reprogramming15,16,17. WDR43 is a WD40 domain-containing protein thought to coordinate ribosome biogenesis, eukaryotic transcription, and embryonic stem cell pluripotency control14,18. Consistent with previously reported results for LIN28 with CLIP-seq, FbioCLIP-seq reveals binding sites of LIN28 on “GGAG” motifs in the microRNA mir-let7g and mRNAs16,19 (Figure 3). WDR43 FbioCLIP-seq also identified the binding preference of WDR43 with 5' external transcribed spacers (5'-ETS) of pre-rRNAs20 (Figure 4). These results validate the reliability of the FbioCLIP-seq method.
1. Cell line construction
2. Cross-linking
3. Cell lysate preparation
4. FLAG beads preparation
5. Immunoprecipitation
6. Elution with 3x FLAG peptide
7. Streptavidin beads preparation
8. Biotin affinity purification
9. Partial RNA digestion
10. Dephosphorylation of RNA
11. 3’ linker ligation
12. PNK treatment
13. RNA isolation
14. 5’ RNA linker ligation
15. Reverse transcription
16. PCR amplification
17. Bioinformatics analysis
The schematic representation of the FbioCLIP-seq procedure is shown in Figure 1. Compared with FLAG-mediated or streptavidin-mediated one-step affinity purification, FLAG-biotin tandem purification removed almost all the copurified proteins, avoiding the contamination of indirect protein-RNA interactions (Figure 2). Representative results of FbioCLIP-seq for LIN28 and WDR43 are depicted in Figure 3 and Figure 4. We performed LIN28 or WDR43 FbioCLIP-seq with mESCs. Figure 3A shows the track view of LIN28 and WDR43 FbioCLIP-seq in pre-let-7g. The reported GGAG motif in pre-let-7g and the cross-linked sites identified by FbioCLIP-seq are shown in Figure 3B. The classification of the mutation sites called by the CIMS algorithm showed that LIN28 prefers to bind and cross-link to G nucleotides (Figure 3C). The enriched RNA motifs in LIN28 FbioCLIP-seq binding sites are shown in Figure 3D. More representative tracks of FbioCLIP-seq on LIN28 and HNRNPU are shown in Figure 3E. Comparison of WDR43 and LIN28 FbioCLIP-seq showed their different binding patterns in the Rn45S pre-rRNA locus (Figure 4). Figure 5 shows the representative results of FLAG and biotin tag validation after cell line construction. Figure 6 shows representative results of efficient (Figure 6A) or unsuccessful (Figure 6B) FLAG-IP and FLAG-biotin tandem purification.
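The nucleotide preference summarized for Figure 3C can be expressed as a simple tally: the share of called mutation sites that fall on each reference nucleotide, compared with a random expectation. The Python sketch below is a hedged illustration only. The function name `mutated_base_percentages`, the toy site list, and the assumption that "random" means a uniform 25% per nucleotide are hypothetical and are not taken from the CIMS output of this study.

```python
from collections import Counter
from typing import Dict, Iterable


def mutated_base_percentages(ref_bases: Iterable[str]) -> Dict[str, float]:
    """Percentage of mutation sites that fall on each reference nucleotide."""
    # Treat DNA-style T as U so that read-derived calls map onto RNA bases.
    counts = Counter(base.upper().replace("T", "U") for base in ref_bases)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no mutation sites supplied")
    return {base: 100.0 * counts.get(base, 0) / total for base in "ACGU"}


# Hypothetical reference bases at called mutation sites (toy data only).
toy_sites = ["G", "G", "U", "G", "A", "G", "C", "G", "G", "T"]
percentages = mutated_base_percentages(toy_sites)

# Assumption: the random baseline is a uniform distribution over 4 bases.
RANDOM_EXPECTATION = 25.0
enriched = [base for base, pct in percentages.items() if pct > RANDOM_EXPECTATION]

print(percentages)
print("Enriched over random:", enriched)
```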
Figure 1: Schematic representation of FbioCLIP. UV-cross-linked cells (step 1) are lysed in lysis buffer (step 2). The FBRBP-RNA complex is immunoprecipitated using anti-FLAG resin (steps 3−4). The eluted protein-RNA complex is further purified with streptavidin beads and stringent wash conditions are used to remove protein-protein interactions (steps 5−6). RNAs are partially digested by MNase and non-protein-bound RNA fragments are removed by further wash (steps 7−8). The RNAs are then dephosphorylated and ligated with 3’ linker (step 9). Then the RNAs are phosphorylated and eluted by proteinase K treatment (step 10). Purified RNAs are ligated with a 5’ RNA linker containing a 6 nt random barcode (step 11). After reverse transcription (step 12), the cDNA library is amplified with PCR and high-throughput sequencing is performed (steps 13−15).
Figure 2: FLAG-biotin tandem purification removes copurified proteins. Silver staining of FBWDR43 showed that almost all the interacting proteins present in FLAG- or biotin-mediated one-step purification were eliminated after stringent wash in tandem purification. FLAG: FLAG-mediated affinity purification, SA: streptavidin-mediated biotin purification, tandem: FLAG-biotin tandem purification.
Figure 3: Representative results of LIN28 FbioCLIP-seq. (A) Track view of LIN28 FbioCLIP-seq tags and mutation reads called by CIMS in pre-let-7g RNA. The tags shown are unique reads of FbioCLIP-seq. In total, ~7.7 million unique reads for LIN28 FbioCLIP-seq and ~2.2 million unique reads for WDR43 FbioCLIP-seq were retrieved after removing redundant reads. (B) Cross-linked sites on the GGAG motif of pre-let-7g RNA by LIN28 FbioCLIP-seq. The arrows indicate the cross-linked sites. (C) Percentage of mutated nucleotides in different types of mutations. G is the most frequently cross-linked and mutated nucleotide. Random: random distribution of the four nucleotides. (D) Predicted LIN28 binding motifs by FbioCLIP-seq with the HOMER algorithm. (E) Track views of LIN28 FbioCLIP-seq on LIN28 and HNRNPU mRNAs.
Figure 4: Representative results of WDR43 and LIN28 FbioCLIP-seq on Rn45S locus. Track view of WDR43 and LIN28 FbioCLIP-seq on Rn45S locus. WDR43 and LIN28 showed different binding patterns on pre-rRNA.
Figure 5: Representative results of FLAG and biotin tag validation. FLAG or biotin Western blot validates the tagging of the cell lines. Clones #1, #2, and #4 showed efficient expression and biotinylation of the tag while #3 showed poor expression of the tagged protein.
Figure 6: Representative results of FLAG immunoprecipitation and tandem purification efficiency validation. (A) Example of efficient purification of tagged protein by FLAG and biotin affinity purification. (B) Example of inefficient purification of tagged protein by FLAG and biotin affinity purification.
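The random barcode in the 5’ RNA linker (Figure 1, step 11) and the removal of redundant reads (Figure 3A) suggest a duplicate-collapsing step during analysis. The Python sketch below shows one simple way this could work, assuming that each read begins with the random barcode followed by the RNA-derived insert. The function name `collapse_duplicates`, the default `barcode_len` of 6 (taken from the Figure 1 legend), and the toy reads are illustrative assumptions and do not reproduce the pipeline used in this study.

```python
from typing import Iterable, List, Tuple


def collapse_duplicates(reads: Iterable[str], barcode_len: int = 6) -> List[Tuple[str, str]]:
    """Split each read into (barcode, insert) and keep one copy per unique pair.

    Reads sharing both barcode and insert are treated as PCR duplicates.
    Reads with the same insert but different barcodes are kept, because
    they likely derive from independent RNA molecules.
    """
    seen = set()
    unique = []
    for read in reads:
        if len(read) <= barcode_len:
            continue  # no insert left after removing the barcode
        key = (read[:barcode_len], read[barcode_len:])
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Toy reads for illustration only (not data from this study).
toy_reads = [
    "ACGTACGGAGTTTC",  # barcode ACGTAC + insert GGAGTTTC
    "ACGTACGGAGTTTC",  # exact duplicate, removed
    "TTGCAAGGAGTTTC",  # same insert, different barcode, kept
    "ACG",             # too short, skipped
]
unique_reads = collapse_duplicates(toy_reads)
for barcode, insert in unique_reads:
    print(barcode, insert)
```

Collapsing on the mapped position plus barcode, rather than on the raw sequence, is a common alternative that tolerates sequencing errors in the insert.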
Buffer | Composition |
SDS loading buffer | 50 mM Tris pH 6.8, 2% SDS, 0.1% bromophenol blue, 10% glycerol, 100 mM DTT |
Wash buffer A | 1x PBS, 0.5% NP-40, 0.5% sodium deoxycholate, 0.1% SDS |
Wash buffer B | 5x PBS, 0.5% NP-40, 0.5% sodium deoxycholate, 0.1% SDS |
Wash buffer C | 50 mM Tris pH 7.4, 2% SDS |
Wash buffer D | 5x PBS, 0.5% NP-40, 0.5% SDS, 1 M urea |
PNK buffer | 50 mM Tris pH 7.4, 0.5% NP-40, 10 mM MgCl2 |
MNase reaction buffer | 10 mM Tris pH 8.0, 1 mM CaCl2 |
PNK+EGTA buffer | 50 mM Tris pH 7.4, 0.5% NP-40, 10 mM EGTA |
Proteinase K digestion buffer | 50 mM Tris pH 7.4, 10 mM EDTA, 50 mM NaCl, 0.5% SDS, 20 µg of proteinase K |
Table 1: Composition of buffers used in this study.
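Preparing the buffers in Table 1 from concentrated stocks follows the dilution relation C1 × V1 = C2 × V2. The Python sketch below works through wash buffer B for a 50 mL batch. The 10x PBS stock matches the reagent listed in the materials table, whereas the 10% stock concentrations for NP-40, sodium deoxycholate, and SDS, the batch volume, and the helper name `stock_volume` are assumptions made only for illustration.

```python
def stock_volume(final_conc: float, stock_conc: float, final_volume_ml: float) -> float:
    """Volume of stock (mL) needed, from C1 * V1 = C2 * V2.

    Both concentrations must use the same unit (for example x or %).
    """
    if final_conc > stock_conc:
        raise ValueError("final concentration exceeds stock concentration")
    return final_conc * final_volume_ml / stock_conc


FINAL_VOLUME_ML = 50.0  # assumed batch size

# Wash buffer B (Table 1): component -> (final concentration, stock concentration).
# Stock concentrations other than 10x PBS are assumed for this example.
wash_buffer_b = {
    "PBS (x)": (5.0, 10.0),
    "NP-40 (%)": (0.5, 10.0),
    "sodium deoxycholate (%)": (0.5, 10.0),
    "SDS (%)": (0.1, 10.0),
}

volumes = {
    name: stock_volume(final, stock, FINAL_VOLUME_ML)
    for name, (final, stock) in wash_buffer_b.items()
}
water_ml = FINAL_VOLUME_ML - sum(volumes.values())

for name, volume in volumes.items():
    print(f"{name}: {volume:.2f} mL of stock")
print(f"RNase-free water: {water_ml:.2f} mL")
```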
Oligo name | Sequence | Notes |
3' linker | rAppAGATCGGAAGAGCACACGTCT-NH2 | |
5' RNA linker | GUUCAGAGUUCUACAGUCCGACGAUCNNNNN | |
RT primer | AGACGTGTGCTCTTCCGATCT | |
Forward primer 1 | GTTCAGAGTTCTACAGTCCGACGATC | |
Reverse primer 1 | AGACGTGTGCTCTTCCGATCT | |
Forward primer 2 | AATGATACGGCGACCACCGAGATCTACACGTTCAGAGTTCTACAGTCCGAC | |
Reverse primer 2 | CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTCAGACGTGTGCTCTTCCGATC-s-T | CGTGAT (red and underlined in the original table) is the Illumina index sequence. |
Table 2: Oligonucleotides used in this study.
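The oligonucleotides in Table 2 must be mutually compatible: the RT primer has to anneal to the 3' linker, and the second-round PCR primers have to extend the first-round primer sequences. The Python sketch below checks these relationships for the sequences listed above. The variable names and the `checks` dictionary are illustrative, the rApp and NH2 modifications are omitted, and the "-s-" in reverse primer 2 is assumed to mark a phosphorothioate bond and is removed before comparison.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return dna.translate(COMPLEMENT)[::-1]


# Sequences copied from Table 2 (chemical modifications omitted).
LINKER_3 = "AGATCGGAAGAGCACACGTCT"
RT_PRIMER = "AGACGTGTGCTCTTCCGATCT"
FORWARD_PRIMER_1 = "GTTCAGAGTTCTACAGTCCGACGATC"
REVERSE_PRIMER_1 = "AGACGTGTGCTCTTCCGATCT"
FORWARD_PRIMER_2 = "AATGATACGGCGACCACCGAGATCTACACGTTCAGAGTTCTACAGTCCGAC"
# "-s-" (assumed phosphorothioate bond marker) removed for comparison.
REVERSE_PRIMER_2 = (
    "CAAGCAGAAGACGGCATACGAGATCGTGATGTGACTGGAGTTC"
    "AGACGTGTGCTCTTCCGATCT"
)

checks = {
    # The RT primer should be the reverse complement of the 3' linker.
    "RT primer anneals to 3' linker": reverse_complement(LINKER_3) == RT_PRIMER,
    # Reverse primer 2 should end with the reverse primer 1 sequence.
    "reverse primer 2 extends reverse primer 1": REVERSE_PRIMER_2.endswith(REVERSE_PRIMER_1),
    # Forward primer 2 should end with the first 22 nt of forward primer 1.
    "forward primer 2 overlaps forward primer 1": FORWARD_PRIMER_2.endswith(FORWARD_PRIMER_1[:22]),
}

for description, passed in checks.items():
    print(f"{description}: {'OK' if passed else 'MISMATCH'}")
```

A check of this kind is a quick safeguard against transcription errors when ordering oligonucleotides.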
Here we introduce a modified CLIP-seq method called FbioCLIP-seq, taking advantage of the FLAG-biotin double tagging system to perform tandem purification of protein-RNA complexes. The FLAG-biotin double tagging system has been shown to be powerful in identifying protein-protein and protein-DNA interactions13,21. Here we demonstrate the high specificity and convenience of this system in identifying the RNAs interacting with proteins. Through tandem purification and stringent wash conditions, we skipped the challenging SDS-PAGE and membrane transfer steps, so that more protein-RNA interactions were preserved and contaminating RNA signals mediated by RBPs in the same complex were avoided. In addition, bypassing these steps avoids labeling the RNA with radiolabeled ATP, which makes the procedure much easier. Note that during the preparation of this publication, a similar method called uvCLAP was also proposed24.
Several steps are important for the success of the protocol. First, the efficiency of the biotinylation of the tagged protein should be confirmed by Western blot before the experiment (Figure 5). Second, efficient elution of the purified protein by 3x FLAG peptide is required for successful amplification of the RNA signals. The beads from step 12.4 should be analyzed by Western blot or silver staining to confirm that a sufficient amount of target protein is retrieved (Figure 6). To increase the efficiency, either increase the 3x FLAG peptide concentration in step 6 up to 500 ng/mL or repeat the elution step multiple times to improve the yield. Third, it is best to titrate the MNase concentration for each protein in the first trial. Overdigestion or insufficient digestion may lead to RNA fragments of inappropriate sizes. Last, because the protocol contains two rounds of affinity purification and highly stringent washes, only a small amount of purified RNA is retrieved. All reagents should therefore be RNase-free to avoid RNA degradation after the MNase treatment step.
Despite the ease of this method, some aspects can still be improved. For example, ligating a 5’ adaptor directly to the RNA may limit signal recovery because a significant portion of reverse transcription events terminate at residual cross-linked peptides. Our current study is mainly based on ectopic expression of tagged proteins, which may lead to artifacts due to protein overexpression. The method could be improved by introducing the tag into the endogenous locus, and the CRISPR/Cas9 system makes such cell line construction much easier. Some studies have applied eCLIP-seq to mouse tissues25. Deriving knock-in mice carrying the double tag is therefore a promising direction for improving and applying the FbioCLIP-seq method. Furthermore, ectopically expressed bacterial BirA ligase may lead to unexpected biotinylation events in vivo. Tagging of the protein may also affect its biological functions.
A large number of studies have shown that the gene expression and epigenome of cells are heterogeneous rather than homogeneous, suggesting that it is valuable to study the interaction network at the single-cell level. Several strategies have been developed to study the transcriptome and epigenome of single cells. However, no strategies have been reported to analyze the protein-RNA interactome at the single-cell level. Without the denaturing gel running and membrane transfer steps, FbioCLIP-seq can preserve more signals, which makes it a potential strategy to study protein-RNA interactions at the single-cell level.
The authors have nothing to disclose.
Grant support is from the National Basic Research Program of China (2017YFA0504204, 2018YFA0107604), the National Natural Science Foundation of China (31630095), and the Center for Life Sciences at Tsinghua University.
Equipment | |||
UV crosslinker | UVP | HL-2000 HybriLinker | |
Affinity Purification Beads | |||
ANTI-FLAG beads | Sigma-Aldrich | A2220 | |
Streptavidin beads | Invitrogen | 112.06D | |
Reagents | |||
10x PBS | Gibco | 70013032 | |
3 M NaOAc | Ambion | AM9740 | |
3x FLAG peptide | Sigma-Aldrich | F4799 | |
ATP | Sigma-Aldrich | A6559 | |
Calcium chloride (CaCl2) | Sigma-Aldrich | C1016 | |
CIP | NEB | M0290S | CIP buffer is in the same package. |
DTT | Sigma-Aldrich | D0632 | |
EDTA | Sigma-Aldrich | E9884 | |
EGTA | Sigma-Aldrich | E3889 | |
Gel purification kit | QIAGEN | 28704 | |
Glycogen | Ambion | AM9510 | |
Magnesium chloride (MgCl2) | Sigma-Aldrich | 449172 | |
MNase | NEB | M0247S | |
NP-40 | Amresco | M158-500ML | |
PMSF | Sigma-Aldrich | 10837091001 | |
Proteinase K | TAKARA | 9033 | |
Protease inhibitor cocktail | Sigma-Aldrich | P8340 | |
Q5 High-Fidelity 2X Master Mix | NEB | M0492S | |
Reverse transcriptase (SuperScript III) | Invitrogen | 18080093 | |
RNA isolation reagent (Trizol) | Invitrogen | 15596018 | |
RNase Inhibitor | ThermoFisher | EO0381 | |
RNaseOUT | Invitrogen | 10777019 | |
RQ1 DNase | Promega | M6101 | |
SDS | Sigma-Aldrich | 1614363 | |
Sodium chloride | Sigma-Aldrich | S9888 | |
Sodium deoxycholate | Sigma-Aldrich | D6750 | |
T4 PNK | NEB | M0201S | PNK buffer is in the same package. |
T4 RNA ligase | ThermoFisher | EL0021 | T4 RNA ligase buffer and BSA are in the same package. |
T4 RNA ligase 2, truncated | NEB | M0242S | T4 RNA ligase buffer and 50% PEG are in the same package. |
Trypsin-EDTA | ThermoFisher | 25200072 | |
Urea | Sigma-Aldrich | 208884 | |
mESC culture medium | |||
DMEM (80%) | Gibco | 11965126 | |
2-Mercaptoethanol | Gibco | 21985023 | |
FCS (15%) | Hyclone | ||
Glutamax (1%) | Gibco | 35050061 | |
LIF | | | purified recombinant protein; 10,000 fold dilution |
NEAA (1%) | Gibco | 11140050 | |
Nucleoside mix (1%) | Millipore | ES-008-D | |
Penicillin-Streptomycin (1%) | Gibco | 15140122 | |
Kit | |||
DNA gel extraction kit | QIAGEN | 28704 |