This protocol describes the manual sorting procedure to isolate single fluorescently labeled neurons followed by in vitro transcription-based mRNA amplification for high-depth single-cell RNA sequencing.
Single-cell RNA sequencing (RNA-seq) is now a widely implemented tool for assaying gene expression. Commercially available single-cell RNA-sequencing platforms process all input cells indiscriminately. Sometimes, fluorescence-activated cell sorting (FACS) is used upstream to isolate a specifically labeled population of interest. A limitation of FACS is the need for high numbers of input cells with significantly labeled fractions, which is impractical for collecting and profiling rare or sparsely labeled neuron populations from the mouse brain. Here, we describe a method for manually collecting sparse fluorescently labeled single neurons from freshly dissociated mouse brain tissue. This process allows for capturing single labeled neurons with high purity and subsequent integration with an in vitro transcription-based amplification protocol that preserves endogenous transcript ratios. We describe a double linear amplification method that uses unique molecular identifiers (UMIs) to generate individual mRNA counts. Two rounds of amplification result in a high degree of gene detection per single cell.
Single-cell RNA sequencing (RNA-seq) has transformed transcriptomic studies. While large-scale single-cell RNA-seq can be performed using a variety of techniques, such as droplets1,2, microfluidics3, nanogrids4, and microwells5, most methods cannot sort defined cell types that express genetically encoded fluorophores. To isolate a select cell population, fluorescence-activated cell sorting (FACS) is often used to sort labeled cells in a single-cell mode. However, FACS has some restrictions and requires meticulous sample processing steps. First, a large number of input cells is typically needed (often several million cells per mL), with a significant fraction (>15–20%) containing the labeled population. Second, cell preparations may require multiple rounds of density gradient centrifugation to remove the glial fraction, debris, and cell clumps that might otherwise clog the nozzle or flow cell. Third, FACS usually employs staining and destaining steps for live/dead staining (e.g., 4′,6-diamidino-2-phenylindole (DAPI), propidium iodide (PI), and Cytotracker dyes), which take up additional time. Fourth, as a rule of thumb for two-color sorting (such as DAPI and green/red fluorescent protein (GFP/RFP)), usually two samples and one control are needed, requiring an unlabeled sample to be processed in addition to the desired mouse strain. Fifth, filtering is often performed multiple times before and during sample sorting to proactively prevent clogged sample lines in a FACS machine. Sixth, time must be allotted in most commonly used FACS setups to initialize and stabilize the fluid stream and perform droplet calibration. Seventh, control samples are typically run in sequence prior to the actual sample collection to set up compensation matrices, doublet rejection, gates, etc. Users either perform steps six and seven themselves ahead of time or require the assistance of a technician in parallel.
Finally, post-FACS, there are often steps to ensure that only labeled single cells are present in each well; for example, by checking samples in a high-content screening setup such as a fast plate imager.
To circumvent the steps outlined above and facilitate a relatively quick, targeted sequencing of a small population of single fluorescently labeled neurons, we describe a manual sorting procedure followed by two rounds of a highly sensitive in vitro amplification protocol, called double in-vitro transcription with absolute counts sequencing (DIVA-Seq). The RNA amplification and cDNA library generation are adapted from Eberwine et al.6 and Hashimshony et al.7, with certain modifications to suit mouse interneurons that have smaller cellular volumes; furthermore, we have also found that it is equally useful for excitatory pyramidal neurons.
All procedures involving animal subjects have been approved by the IACUC at Cold Spring Harbor Laboratory, NY (IACUC #16-13-09-8).
1. Manual Sorting of Fluorescently Labeled Mouse Neurons
2. First Round RNA Amplification
NOTE: The following procedure is for a single strip of eight 0.2 mL microfuge tubes. Scale the reactions as needed.
3. Second Round Amplification
4. Amplified RNA Fragmentation and Cleanup
5. Library Preparation
NOTE: IVTs can be pooled at this point if there is no overlap in the barcodes used. The phosphatase treatment time is 40 min. The polynucleotide kinase treatment time is 1 h.
6. PCR Product Cleanup and Size Selection
7. Determination of Library Amount and Quality
8. Sample Submission
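The note in section 5 allows IVT reactions to be pooled only when their sample barcodes do not overlap. A minimal sketch of such a check is shown here. The function name, the reaction labels, and the barcode assignments are hypothetical and serve only as illustration; the barcode names simply follow the N10B1-N10B96 primer naming used in this protocol.

```python
from itertools import combinations


def overlapping_barcodes(reactions):
    """Return {(name_a, name_b): [shared barcodes]} for every pair of
    reactions that share at least one sample barcode."""
    conflicts = {}
    for (name_a, codes_a), (name_b, codes_b) in combinations(reactions.items(), 2):
        shared = set(codes_a) & set(codes_b)
        if shared:
            conflicts[(name_a, name_b)] = sorted(shared)
    return conflicts


# Hypothetical barcode assignments for three IVT reactions.
reactions = {
    "strip_1": ["N10B1", "N10B2", "N10B3"],
    "strip_2": ["N10B4", "N10B5", "N10B6"],
    "strip_3": ["N10B3", "N10B7"],
}

conflicts = overlapping_barcodes(reactions)
for (a, b), shared in conflicts.items():
    print(f"Do not pool {a} with {b}: shared barcodes {shared}")
```

An empty result means that no pair of reactions shares a barcode, so all of them could be pooled under the rule stated in the note.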
Using the protocol described above, GABAergic neurons were manually sorted (Figure 1) and RNA was amplified, then made into a cDNA library (Figure 2) and sequenced at high depth8. The amplified RNA (aRNA) products ranged between 200–4,000 bp in size, with a peak distribution slightly above 500 bp (Figure 3A). The bead-purified cDNA library was further size-restricted by a second round of bead purification that eliminated fragments shorter than 200 bp (Figure 3B and 3C), giving a peak at ~350 bp. Shorter fragments lead to empty reads (no mRNA inserts, only adapter and primer sequences), whereas longer fragments occupy more space on the flow cell, reducing total read output. Upon sequencing, we routinely obtained a median of 4.8 × 10⁵ and an average of 6.9 × 10⁵ mapped reads per cell (Figure 4A). After duplicate removal using UMIs, each single cell had a median of 1.0 × 10⁵ and an average of 1.4 × 10⁵ unique reads (Figure 4B). In each single cell, External RNA Controls Consortium (ERCC) spike-in RNA was used as an internal control, for which the absolute number of molecules added to the sample can be calculated. There was a linear relationship of input to observed counts, with a slope of 0.92 and adjusted R² = 0.94 (Figure 4C). We detected on average ~10,000 genes per single neuron (ranging from ~7,500 to 12,000 genes/cell), with >95% of the single cells yielding >6,000 detected genes (Figure 4D-4F). This number compares favorably against published data from similar mouse brain-derived single neurons (e.g., 1,865-4,760 genes in Zeisel et al.9, 7,250 genes in Tasic et al.10, and 8,000 genes in Okaty et al.11). Readers are directed to Poulin et al.12 for a detailed comparison.
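The ERCC comparison above reports a slope and an adjusted R² for input versus observed counts. The sketch below shows one way such values could be computed with ordinary least squares. It assumes the fit is done on log10-transformed values, which the text does not state, and the function name and the numbers in the usage example are made up for illustration; they are not data from this study.

```python
import math


def loglog_fit(expected, observed):
    """Least-squares fit of log10(observed) on log10(expected).

    Returns (slope, intercept, adjusted_r2). Pairs containing a zero or
    negative value are dropped because their logarithm is undefined.
    """
    pts = [(math.log10(e), math.log10(o))
           for e, o in zip(expected, observed) if e > 0 and o > 0]
    n = len(pts)
    if n < 3:
        raise ValueError("need at least 3 usable points for adjusted R^2")
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    if sxx == 0:
        raise ValueError("all expected values are identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in pts)
    ss_tot = sum((y - mean_y) ** 2 for _, y in pts)
    r2 = 1 - ss_res / ss_tot
    # Adjusted R^2 for a model with one predictor.
    adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)
    return slope, intercept, adjusted_r2


# Made-up example: spike-in molecules added vs. UMI counts observed.
added = [1, 10, 100, 1000, 10000]
counted = [1, 8, 70, 600, 5000]
slope, intercept, adjusted_r2 = loglog_fit(added, counted)
print(f"slope={slope:.2f}, adjusted R^2={adjusted_r2:.3f}")
```

A slope close to 1 on the log-log scale indicates that observed counts scale proportionally with input across the tested range.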
Figure 1: Workflow of manual sorting of neurons followed by DIVA-Seq. Fresh mouse brains were collected and sliced, and the region of interest was microdissected. Single neurons expressing fluorescent proteins were collected manually and amplified using two rounds of linear amplification by in vitro transcription.
Figure 2: Schematic workflow of DIVA-Seq with two rounds of amplification while incorporating unique molecular identifiers (UMIs) or varietal-tags. Sample bar code (SBC) allows each single cell to be identified by its 9-nucleotide code (teal). UMI is a stretch of random nucleotides 10 bp in length that is different for each primer used. During the bioinformatics step, two mapped transcripts having the same UMI sequence will be counted only once, thus eliminating amplification duplicates and allowing for absolute transcript counting. RA5 and RA3 are sequencing primers, and T7-RA5 primer is needed to add the T7 sequences back to the first-round aRNA products so that the T7 RNA polymerase can rebind and perform a second round of linear amplification by in vitro transcription.
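The UMI counting rule described in the Figure 2 legend can be sketched as follows. This is a minimal illustration, not the pipeline used in this study: it assumes reads have already been mapped and reduced to (sample barcode, gene, UMI) triples, it collapses only exact UMI matches within the same cell and gene, and it applies no correction for sequencing errors in the UMI. The function name, cell labels, and example reads are hypothetical.

```python
from collections import defaultdict


def count_unique_molecules(reads):
    """Collapse amplification duplicates.

    reads: iterable of (sample_barcode, gene, umi) tuples, one per mapped read.
    Returns {sample_barcode: {gene: number of distinct UMIs}}.
    """
    umis_seen = defaultdict(set)
    for sample_barcode, gene, umi in reads:
        # Reads with the same cell, gene, and UMI are copies of one molecule.
        umis_seen[(sample_barcode, gene)].add(umi)

    counts = defaultdict(dict)
    for (sample_barcode, gene), umis in umis_seen.items():
        counts[sample_barcode][gene] = len(umis)
    return dict(counts)


# Hypothetical mapped reads with 10-nt UMIs.
reads = [
    ("cell_A", "Gad1", "AACCGGTTAC"),
    ("cell_A", "Gad1", "AACCGGTTAC"),   # duplicate of the read above
    ("cell_A", "Gad1", "TTGGCCAATG"),
    ("cell_A", "Pvalb", "AACCGGTTAC"),  # same UMI, different gene
    ("cell_B", "Gad1", "AACCGGTTAC"),   # same UMI, different cell
]

molecule_counts = count_unique_molecules(reads)
print(molecule_counts)
```

Keying on the cell and gene as well as the UMI keeps identical random tags from unrelated transcripts from being merged.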
Figure 3: Example bioanalyzer plots. (A) aRNA size distributions should be between 200-4,000 bp with a peak at around 500 bp. X-axis has arbitrary fluorescence unit [FU]. (B) Size distribution of cDNA library products after bead cleanup. (C) Size distribution after 0.7x SPRI size selection (step 6.6) with a peak around 350 bp.
Figure 4: Example sequence read distributions and gene detection from manually-sorted neurons after DIVA-Seq8. (A) Total mapped read distribution. (B) Total unique reads distribution. (C) ERCC reads show linear relationship over 4 orders of magnitude. (D) Genes detected vs. read counts shows that >95% single cells have >6000 genes/cell. (E) Genes detected amongst 6 interneuron types are comparable. (F) Distribution of genes detected per cell, GEO accession #GSE92522. This figure has been adapted from Paul et al.8.
Buffer | Item | Concentration | Amount (µL) |
Sample collection buffer | Recombinant ribonuclease inhibitor | | 55 |
 | ERCC | 1:50K diluted | 110 |
 | Nuclease-free water | | 605 |
Aliquot 43.75 µL of the above into 16 tubes (2 strips of 8; 200 µL PCR tubes); add the following per tube | | | |
 | T7-UMI primers (e.g., N10B1-N10B16) | 1 ng/µL | 6.25 µL/tube |
Final volume in each tube: 50 µL (each tube can be split into 25 µL aliquots and frozen at -80 °C) | | | |
Solutions: | Item | Concentration | Amount |
ACSF: to make 5 L, dissolve | NaCl | 126 mM | 36.8 g |
 | KCl | 3 mM | 1.15 g |
 | NaH2PO4 | 1.25 mM | 0.75 g |
 | NaHCO3 | 20 mM | 8.4 g |
To 500 mL of ACSF, bubble oxygen for 10-15 min, then add the following fresh each time: | | | |
 | D-glucose | 20 mM | 1.8 g |
 | MgSO4 | 2 mM | 0.5 mL from 4 M stock |
 | CaCl2 | 2 mM | 0.5 mL from 4 M stock |
Keep ACSF oxygenated throughout | | | |
Solutions: | Item | Concentration | |
Activity blocker cocktail: make a 100x stock | | | |
To 100 mL ACSF add | | | |
 | APV | 0.05 mM | |
 | CNQX | 0.02 mM | |
 | TTX | 0.0001 mM (0.1 µM) | |
Solutions: | Item | Amount | |
Protease solution (100 mL) | Protease from Streptomyces griseus | 100 mg | |
Fetal bovine serum: commercial source, aliquot in 500 µL for each use. |
Table 1: List of solutions and buffers.
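The salt masses listed for ACSF in Table 1 follow from mass = concentration × volume × molecular weight. The sketch below reproduces that arithmetic. The molecular weights are standard values for the anhydrous compounds, which is an assumption on our part; if a hydrate is used, the weight must be adjusted. The function and variable names are illustrative. Under these assumptions, the computed masses fall within a few percent of the tabulated ones.

```python
# Molecular weights in g/mol, assuming anhydrous compounds.
MOLECULAR_WEIGHT = {
    "NaCl": 58.44,
    "KCl": 74.55,
    "NaH2PO4": 119.98,
    "NaHCO3": 84.01,
    "D-glucose": 180.16,
}


def grams_needed(compound, conc_mM, volume_L):
    """Mass (g) of a compound needed for a given molarity and volume."""
    moles = conc_mM / 1000 * volume_L
    return moles * MOLECULAR_WEIGHT[compound]


# Base ACSF salts for a 5 L batch (concentrations from Table 1).
acsf_recipe_mM = {"NaCl": 126, "KCl": 3, "NaH2PO4": 1.25, "NaHCO3": 20}
acsf_5l = {salt: round(grams_needed(salt, mM, 5.0), 2)
           for salt, mM in acsf_recipe_mM.items()}

# D-glucose is added fresh to each 500 mL portion.
glucose_500ml = round(grams_needed("D-glucose", 20, 0.5), 2)

for salt, grams in acsf_5l.items():
    print(f"{salt}: {grams} g per 5 L")
print(f"D-glucose: {glucose_500ml} g per 500 mL")
```

The same function can be reused to rescale the recipe for a different batch volume.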
Table 2: List of primers and sequences. N10B1-N10B96 are first strand primers and T7-RA5 is the second-round primer.
The manual sorting protocol is suitable for supervised RNA sequencing of neuron populations that are either sparsely labeled in the mouse brain or represent a rare cell population that is otherwise not feasible to study using current high-throughput cell sorting and amplification methods. Cells subjected to FACS usually undergo sheath and sample line pressures in the range of ~9–14 psi, depending on nozzle size and desired event rates. In addition, upon being ejected from the nozzle, the cells can land hard on the surface of the collection tube or wells coated with sample buffer, causing impact stress. During manual sorting, such high pressures are never applied, as the cells are drawn into the pipette by capillary action and expelled by gently blowing them out and simply breaking the tips of the glass pipettes. The DIVA-Seq protocol is useful for RNA amplification from cells with small cellular volumes (<8 µL) and low starting material and consistently yields large numbers of detectable genes (8–10 K), which, when coupled with deep sequencing, allows for detailed reconstruction of a coherent molecular picture of the cellular functions underlying cell identity8,13,14. Owing to the purity of the cell collection steps, the high sensitivity of gene detection, and the ability to perform absolute molecule counts, this method is useful for studying cellular states and disease pathophysiology with high depth and precision.
While the yield of amplified RNA and the degree of gene detection are relatively high in this protocol, certain procedural measures help maintain consistency. During second-strand synthesis, the reaction must be assembled on ice, the thermal cycler must be pre-cooled before the tubes are transferred to it, and the reaction must be run strictly at 16 °C (or slightly below) to avoid the formation of hairpins that may reduce aRNA yield. It is also advised not to exceed 2 h at 16 °C during second-strand synthesis, and it is important to move to the cDNA purification step as soon as possible. During the IVT steps, incubation for less than 12 h might reduce aRNA yield, whereas exceeding 14 h of IVT time may result in some aRNA degradation.
We did not perform a comparative study in which the same input sample from littermates was subjected to FACS and to manual sorting followed by DIVA-Seq; hence, we do not claim that any particular gene category is misregulated in FACS and not in manual sorting. Both FACS and manual sorting will likely introduce some degree of gene expression artifacts. In differential gene expression analyses, any such effect should in theory cancel out, as it will be manifested in both the control and sample groups. Recently, a cocktail of transcription inhibitors has been used to prevent the activation of immediate early gene expression, and such steps can also be incorporated into this protocol15.
The manual sorting process is gentler and quicker (usually 90–160 min) than FACS (excluding sample preparation time), which requires density gradient centrifugation, staining with viability and Cytotracker dyes, and post-sorting visualization. Manual sorting does not subject the cells to high sheath pressure or to impact stress upon sorting into wells. It also allows near-constant access to oxygenated ACSF and overall provides a hospitable and less stressful sorting environment, which may be crucial for cells that are sensitive to stress, such as fast-spiking cells with high metabolic demands. Currently, up to 96 cells can be multiplexed in DIVA-Seq to save reagent costs while providing absolute mRNA counting with high gene counts per cell.
However, there are drawbacks to this method; for example, manual sorting needs reliably fluorescently labeled cells as a starting population. It is inherently a low-throughput process, with each sorting session yielding at most 32–64 cells, which is considerably fewer than with FACS. Manual sorting also requires fine motor skills and some practice to manipulate glass pipettes under a dissection microscope and capture single cells in microcapillary pipettes. The DIVA-Seq amplification is 3'-biased; hence, it cannot be used for whole transcriptome amplification and is also not suitable for splice isoform detection.
The authors have nothing to disclose.
This work was supported by grants from the NIH (5R01MH094705-04 and R01MH109665-01 to Z.J.H.), by the CSHL Robertson Neuroscience Fund (to Z.J.H.), and by a NARSAD Post-Doctoral Fellowship (to A.P.).
ERCC RNA Spike-In Control Mixes | Thermo Fisher | Cat# 4456740 | |
SuperScript III | Thermo Fisher | Cat# 18080093 | |
RNaseOUT Recombinant Ribonuclease Inhibitor | Thermo Fisher | Cat# 10777019 | |
RNA fragmentation buffer | New England Biolabs | Cat# E6105S | |
RNA MinElute kit | Qiagen | Cat# 74204 | |
Antarctic phosphatase | New England Biolabs | Cat# M0289 | |
Polynucleotide kinase | New England Biolabs | Cat# M0201 | |
T4 RNA ligase 2, truncated | New England Biolabs | Cat# M0242 | |
AMPure XP magnetic beads | Beckman Coulter | Cat# A63880 | |
SPRIselect size selection magnetic beads | Thermo Fisher | Cat# B23317 | |
DL-AP5 | Tocris | Cat# 0105 | |
CNQX | Tocris | Cat# 1045 | |
TTX | Tocris | Cat# 1078 | |
Protease from Streptomyces griseus | Sigma-Aldrich | Cat# P5147 | |
MessageAmp II kit | Thermo Fisher | Cat# AM1751 | |
Carbogen | Airgas | Cat# UN3156 | |
Sylgard 184 | Sigma-Aldrich | Cat# 761036 | |
Illumina TruSeq Small RNA kit | Illumina | Cat# RS-200-0012 | |
Bioanalyzer RNA Pico chip | Agilent | Cat# 5067-1513 | |
Bioanalyzer High Sensitivity DNA chip | Agilent | Cat# 5067-4626 | |
Bioanalyzer 2100 | Agilent | ||
Dissection microscope with fluorescence and bright field illumination with DIC optics. (Leica model MZ-16F). | Leica | Model MZ-16F | |
Glass microcapillary: Borosilicate capillary tubes 500/pk. OD= 1 mm, ID=0.58 mm, wall= 0.21 mm, Length= 150 mm. | Warner instruments | Model GC100-15, Order# 30-0017 | |
Capillary pipette puller | Sutter Instruments Co | P-97 | |
Vibratome | Thermo Microm | HM 650V | |
Vibratome tissue cooling unit | Thermo Microm | CU 65 |