Here we present a multiplexed single cell mRNA sequencing method to profile gene expression in mouse embryonic tissues. The droplet-based single cell mRNA sequencing (scRNA-Seq) method, combined with multiplexing strategies, can profile single cells from multiple samples simultaneously, which significantly reduces reagent costs and minimizes experimental batch effects.
Single cell mRNA sequencing has made significant progress in the last several years and has become an important tool in the field of developmental biology. It has been successfully used to identify rare cell populations, discover novel marker genes, and decode spatial and temporal developmental information. The single cell method has also evolved from the microfluidic-based Fluidigm C1 technology to droplet-based solutions in the last two to three years. Here we use the heart as an example to demonstrate how to profile mouse embryonic tissue cells using the droplet-based scRNA-Seq method. In addition, we have integrated two strategies into the workflow to profile multiple samples in a single experiment. Using one of the integrated methods, we simultaneously profiled more than 9,000 cells from eight heart samples. These methods will be valuable to the developmental biology field by providing a cost-effective way to simultaneously profile single cells from different genetic backgrounds, developmental stages, or anatomical locations.
The transcriptional profile of each single cell varies among cell populations during embryonic development. Although single-molecule in situ hybridization can be used to visualize the expression of a small number of genes1, single cell mRNA sequencing (scRNA-Seq) provides an unbiased approach to illustrate genome-wide expression patterns of genes in single cells. After it was first published in 20092, scRNA-Seq has been applied to study multiple tissues at multiple developmental stages in recent years3,4,5. Also, as the Human Cell Atlas has recently launched its development-focused projects, more single cell data from human embryonic tissues are expected to be generated in the near future.
The heart, as the first organ to develop, plays a critical role in embryonic development. The heart consists of multiple cell types, and the development of each cell type is tightly regulated temporally and spatially. Over the past few years, the origin and cell lineage of cardiac cells at early developmental stages have been characterized6, which provides a tremendously useful navigation tool for understanding congenital heart disease pathogenesis, as well as for developing more technologically advanced methods to stimulate cardiomyocyte regeneration7.
scRNA-Seq has undergone a rapid expansion in recent years8,9,10. With the newly developed methods, the design and analysis of single cell experiments have become more achievable11,12,13,14. The method presented here is a commercial procedure based on the droplet solutions (see Table of Materials)15,16. This method captures cells and sets of uniquely barcoded beads in oil-water emulsion droplets under the control of a microfluidic controller system. The rate of cell loading into the droplets is extremely low, so that the majority of cell-containing droplets contain only one cell17. The strength of the procedure's design is that single cell separation into droplet emulsions occurs simultaneously with barcoding, which enables the parallel analysis of individual cells from a heterogeneous population using RNA-Seq.
The incorporation of multiplexing strategies is an important addition to the traditional single cell workflow13,14. This addition is very useful for discarding cell doublets, reducing experimental costs, and eliminating batch effects18,19. A lipid-based barcoding strategy and an antibody-based barcoding strategy (see Table of Materials) are the two most commonly used multiplexing methods. In both methods, a specific barcode is used to label each sample, and the labeled samples are then mixed for single cell capture, library preparation, and sequencing. Afterwards, the pooled sequencing data can be separated by analyzing the barcode sequences (Figure 1)19. However, significant differences exist between the two methods. The lipid-based barcoding strategy relies on lipid-modified oligonucleotides, which have not been found to have any cell type preference, whereas the antibody-based barcoding strategy can only detect cells expressing the antigen proteins19,20. In addition, it takes about 10 min to stain with the lipids but 40 min to stain with the antibodies (Figure 1). Furthermore, the lipid-modified oligonucleotides are cheaper than antibody-conjugated oligonucleotides but were not commercially available at the time of writing this article. Finally, the lipid-based strategy can multiplex 96 samples in one experiment, but the antibody-based strategy can currently multiplex only 12 samples.
The recommended number of cells to multiplex in a single experiment is lower than 2.5 × 10^4; otherwise, a high percentage of cell doublets and potential ambient mRNA contamination will result. With the multiplexing strategies, the cost of single cell capture, cDNA generation, and library preparation for multiple samples is reduced to the cost of one sample, but the sequencing cost remains the same.
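The pooling limit and the cost argument above can be sketched as a small planning calculation. This is only an illustration: the function names and the cost figures are hypothetical placeholders and not values from the protocol; only the cap of 2.5 × 10^4 pooled cells comes from the text.

```python
# Cap from the text: keep the pooled cell number below 2.5 x 10^4.
MAX_POOLED_CELLS = 25_000


def cells_per_sample(n_samples: int, pooled_cells: int) -> int:
    """Split a pooled cell number evenly across samples, enforcing the cap."""
    if n_samples < 1:
        raise ValueError("need at least one sample")
    if pooled_cells >= MAX_POOLED_CELLS:
        raise ValueError("pooled cell number should stay below 2.5 x 10^4")
    return pooled_cells // n_samples


def per_sample_cost(n_samples: int, prep_cost: float, seq_cost_per_sample: float) -> float:
    """Cost per sample when one capture/library prep is shared by all samples.

    prep_cost and seq_cost_per_sample are hypothetical placeholder values;
    sequencing is not shared, so it is added in full for every sample.
    """
    if n_samples < 1:
        raise ValueError("need at least one sample")
    return prep_cost / n_samples + seq_cost_per_sample


# Example: eight samples pooled into one reaction of 20,000 cells.
print(cells_per_sample(8, 20_000))
print(per_sample_cost(8, prep_cost=1600.0, seq_cost_per_sample=500.0))
```

The shared preparation cost shrinks with the number of pooled samples, while the sequencing term stays constant, which mirrors the statement above.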
The animal procedure is in accordance with the University of Pittsburgh Institutional Animal Care and Use Committee (IACUC).
1. Mouse Embryonic Heart Dissection and Single Cell Suspension Preparation
NOTE: This step could take a few hours depending on the number of embryos to dissect.
2. Single Cell Multiplexing Barcoding
NOTE: This step takes at least 40 min which varies based on the number of samples processed. A clean bench area treated with RNase decontamination solution is required for pre-amplification steps (step 2.11 to 3.11), and a separate clean bench area is required for the post-amplification steps (the steps after 3.11).
3. Droplet Generation and mRNA Reverse Transcription
NOTE: This step takes about 90 min for one multiplexed reaction.
4. cDNA Amplification
NOTE: This step takes about 150 min.
5. Endogenous Transcript Library Preparation
NOTE: This step takes about 120 min.
6. Preparation of Multiplexing Sample Barcode cDNA Libraries
NOTE: This step takes at least 120 min.
7. Library Sequencing
NOTE: Multiple next generation sequencing platforms such as HiSeq 4000 and NovaSeq can be used to sequence the endogenous transcript libraries and multiplexing barcode libraries.
8. Data Analysis
NOTE: De-multiplex the sequencing data using the cloud-based resource BaseSpace or by running the bcl2fastq package on a UNIX server.
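For the UNIX server route, the bcl2fastq call can be assembled as in the sketch below. The folder and file names are hypothetical placeholders, and the helper only builds the command rather than running it; `--runfolder-dir`, `--output-dir`, and `--sample-sheet` are standard bcl2fastq options, but the settings appropriate for a given run should be checked against the installed version.

```python
import shlex


def bcl2fastq_command(run_folder: str, output_dir: str, sample_sheet: str) -> list:
    """Build (but do not run) a bcl2fastq command line as an argument list."""
    return [
        "bcl2fastq",
        "--runfolder-dir", run_folder,
        "--output-dir", output_dir,
        "--sample-sheet", sample_sheet,
    ]


# Hypothetical paths, for illustration only.
cmd = bcl2fastq_command("run_folder", "fastq_out", "SampleSheet.csv")
print(shlex.join(cmd))

# To execute on a server where bcl2fastq is installed:
#   import subprocess
#   subprocess.run(cmd, check=True)
```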
In this study, we used the mouse embryonic heart as an example to show how multiplexed single cell mRNA sequencing can be performed to process samples from separate parts of an organ simultaneously. E18.5 CD1 mouse hearts were isolated and dissected into left atrium (LA), right atrium (RA), left ventricle (LV) and right ventricle (RV). The atrial and ventricular cells were then barcoded independently using a lipid-based barcoding procedure and mixed together before GEM generation and reverse transcription. The schematic overview is shown in Figure 1. We quantified the cDNA concentration before library construction (Figure 2A). One distinction between multiplexed scRNA-Seq and standard scRNA-Seq is that the endogenous cDNA library and the sample barcode cDNA library are acquired separately after cDNA amplification and purification (Step 4.2.1 and 4.2.2.2). The quality of the two libraries was also checked in our experiment (Figure 2B,C). Next generation sequencing and data analysis were performed after library construction and QC.
We used the HiSeq X platform to sequence both libraries in the same sequencing lane. With the sequencing data, we first separated the endogenous transcript data and barcode data using the BaseSpace program. Then we analyzed barcode expression in each single cell and found 8 groups of single cells that each uniquely express one barcode, representing cells from 8 different samples (Figure 3A). In addition, we found that some cells do not express any barcode, which we defined as negative cells, and some cells express two different barcodes, which represent doublets (Figure 3B). In summary, we found that around 70% of cells are singlets, 25% are negative, and 5% are doublets.
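The singlet, negative, and doublet categories described above can be illustrated with a deliberately simplified classifier. This is a minimal sketch, not the algorithm used in this study: it assumes a cells × barcodes count matrix and a single fixed count threshold (`min_count`, a hypothetical parameter), and the toy matrix is invented purely for demonstration.

```python
import numpy as np


def classify_cells(counts: np.ndarray, min_count: int = 10) -> list:
    """Label each cell from a (cells x barcodes) count matrix.

    A barcode is called "present" in a cell when its count reaches
    min_count (a hypothetical fixed threshold chosen for illustration).
    """
    present = counts >= min_count
    labels = []
    for row in present:
        n_present = int(row.sum())
        if n_present == 0:
            labels.append("negative")          # no barcode detected
        elif n_present == 1:
            # argmax on a boolean row returns the index of the single True
            labels.append(f"barcode_{int(np.argmax(row)) + 1}")
        else:
            labels.append("doublet")           # two or more barcodes detected
    return labels


# Invented toy data: 4 cells x 3 sample barcodes.
toy_counts = np.array([
    [120, 2, 0],
    [1, 0, 3],
    [90, 75, 1],
    [0, 4, 200],
])
labels = classify_cells(toy_counts)
print(labels)
```

Real data usually need per-barcode thresholds and normalization, because staining efficiency and sequencing depth differ between barcodes; a fixed cutoff is used here only to keep the logic visible.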
With the singlet cells, we can perform further downstream analyses to understand the cellular heterogeneity and molecular regulation. Possible analyses include cell type annotation (Figure 4A), novel/rare cell type identification (Figure 4B), anatomical zone comparative analysis (Figure 4C), and gene ontology pathway analysis such as cell cycle phase separation (Figure 4D).
Figure 1: Multiplexed single cell mRNA sequencing workflow. Embryonic day 18.5 stage hearts were analyzed using a multiplexed droplet-based single cell sequencing procedure. RT = reverse transcription.
Figure 2: Representative QC results at different steps. (A) QC analysis of cDNA from step 4.1.7. The target fragment size is 200 to 9000 bp. (B) Endogenous library and (C) barcode library were analyzed with an automated electrophoresis instrument. The target fragment size for the endogenous library is 300-600 bp, and the barcode library DNA size is around 172 bp.
Figure 3: Demultiplexing the sequencing data from the lipid-based barcoding strategy. (A) Unsupervised analysis of the barcode expression. The x-axis represents single cells, and the y-axis represents barcodes. Each of the 8 single cell populations was identified as uniquely expressing one of the 8 barcodes. Note that some cells express more than one barcode, and some cells do not express any barcode. (B) t-SNE plot of the singlet cells, doublet cells, and negative cells.
Figure 4: Advanced analysis of single cell transcriptional data. (A-D) Single cell data can be analyzed in different ways to understand the cellular heterogeneity and molecular pathways. We have listed several applications here as examples. Single cells were loaded into an R package to identify cell types (A), rare cell populations (B), cell anatomical zones (C), and cell cycle phases (D).
Mixture Name | Composition |
Collagenase mixture | 10 mg/mL collagenase A and 10 mg/mL collagenase B, dissolved in HBSS++ with 40% FBS. |
2 μM Anchor/Barcode stock solution | Mix 50 μM anchor and 10 μM barcode strand in 1:1 molar ratio in PBS (without FBS or BSA) for a total volume of 25 μL. |
2 μM Co-Anchor stock solution | Dilute 1 μL 50 μM Co-Anchor with 24 μL PBS (without FBS or BSA). |
Staining buffer | PBS containing 2% BSA, 0.01% Tween 20 |
Master Mixture | 20 μL RT Reagent, 3.1 μL Oligo, 2 μL Reducing Agent B, 8.3 μL RT Enzyme C. |
Beads Cleanup Mixture | 182 μL Cleanup Buffer, 8 μL Selection Reagent, 5 μL Reducing Agent B, 5 μL Nuclease-free Water. |
Amplification Reaction Mixture | 1 μL of 10 μM Lipid-tagged additive primer, 15 μL cDNA primer, 50 μL Amp Mix |
Elution Solution | 98 μL Buffer EB, 1 μL 10% Tween 20, 1 μL Reducing Agent B. |
Fragmentation Mixture | 5 μL Fragmentation Buffer, 10 μL Fragmentation Enzyme. |
Adaptor Ligation mixture | 20 μL Ligation Buffer, 10 μL DNA Ligase, 20 μL Adaptor Oligos. |
Sample Index PCR Mixture | 50 μL Amp Mix, 10 μL SI Primer |
Lipid barcode library mixture | 26.25 μL of 2× Hot Start master mix, 2.5 μL of 10 μM RPIX primer, 2.5 μL of 10 μM TruSeq Universal Adapter primer (see Table of Materials) |
Antibody barcode library mixture | 50 μL of 2× Hot Start master mix, 2.5 μL of 10 μM RPIX primer, 2.5 μL of 10 μM P5-smart-pcr hybrid oligo |
Table 1: The reagent mixtures used in the protocol.
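The stock dilutions in Table 1 follow from C1 × V1 = C2 × V2. The sketch below works through the 2 μM Anchor/Barcode stock, assuming that "1:1 molar ratio" means both strands end up at 2 μM in the 25 μL total volume; that reading, and the helper name, are assumptions rather than statements from the protocol.

```python
def stock_volume_ul(stock_um: float, final_um: float, final_volume_ul: float) -> float:
    """Volume of stock needed, from C1 * V1 = C2 * V2."""
    return final_um * final_volume_ul / stock_um


FINAL_UM = 2.0    # target concentration of each strand (assumed)
TOTAL_UL = 25.0   # total volume given in Table 1

anchor_ul = stock_volume_ul(50.0, FINAL_UM, TOTAL_UL)    # 50 uM anchor stock
barcode_ul = stock_volume_ul(10.0, FINAL_UM, TOTAL_UL)   # 10 uM barcode stock
pbs_ul = TOTAL_UL - anchor_ul - barcode_ul               # PBS to make up the volume

print(f"anchor {anchor_ul} uL, barcode {barcode_ul} uL, PBS {pbs_ul} uL")
```

The same helper reproduces the Co-Anchor row: `stock_volume_ul(50.0, 2.0, 25.0)` gives the 1 μL of 50 μM Co-Anchor that Table 1 dilutes with 24 μL PBS.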
Incubating Procedure | Temperature(1) | Time |
GEM-RT Incubation | Lid Temperature 53 °C | |
Step 1 | 53 °C | 45 min |
Step 2 | 85 °C | 5 min |
Step 3 | 4 °C | Hold |
10x Genomics cDNA Amplification | Lid Temperature 105 °C | |
Step 1 | 98 °C | 3 min |
Step 2 | 98 °C | 15 s |
Step 3 | 63 °C | 20 s |
Step 4 | 72 °C | 1 min |
Step 5 | Repeat steps 2 to 4 for 12 cycles in total (2) | |
Step 6 | 72 °C | 1 min |
Step 7 | 4 °C | Hold |
Library construction | Lid Temperature 65 °C | |
Pre-cool block | 4 °C | Hold |
Fragmentation | 32 °C | 5 min |
End Repair and A-tailing | 65 °C | 30 min |
Hold | 4 °C | Hold |
Adaptor ligation | Lid Temperature 30 °C | |
Step 1 | 20 °C | 15 min |
Step 2 | 4 °C | Hold |
Sample index PCR | Lid Temperature 105 °C | |
Step 1 | 98 °C | 45 s |
Step 2 | 98 °C | 20 s |
Step 3 | 54 °C | 30 s |
Step 4 | 72 °C | 20 s |
Step 5 | Repeat steps 2 to 4 for 12 cycles in total (3) | |
Step 6 | 72 °C | 1 min |
Step 7 | 4 °C | Hold |
Lipid barcode library PCR | ||
Step 1 | 95 °C | 5 min |
Step 2 | 98 °C | 15 s |
Step 3 | 60 °C | 30 s |
Step 4 | 72 °C | 30 s |
Step 5 | Repeat steps 2 to 4 for 10 cycles in total (4) | |
Step 6 | 72 °C | 1 min |
Step 7 | 4 °C | Hold |
Antibody barcode library PCR | ||
Step 1 | 95 °C | 3 min |
Step 2 | 95 °C | 20 s |
Step 3 | 60 °C | 30 s |
Step 4 | 72 °C | 20 s |
Step 5 | Repeat steps 2 to 4 for 8 cycles in total (5) | |
Step 6 | 72 °C | 5 min |
Step 7 | 4 °C | Hold |
Table 2: The incubating procedures used in the protocol. (1) Pay attention to the different lid temperature used in each procedure. (2) Set total cycle numbers according to the cell load: 13 cycles for <500 cell load; 12 cycles for 500-6,000 cell load; 11 cycles for >6,000 cell load. (3) Set total cycle numbers according to the cDNA input: 14-16 cycles for 1-25 ng cDNA; 12-14 cycles for 25-150 ng cDNA; 10-12 cycles for 150-500 ng cDNA; 8-10 cycles for 500-1,000 ng cDNA; 6-8 cycles for 1,000-1,500 ng cDNA. (4) Set total cycle numbers according to the cDNA input: 8-12 cycles. (5) Set total cycle numbers according to the cDNA input: 6-10 cycles.
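The cycle-number rules in footnotes (2) and (3) can be written as simple lookups. This is a sketch with hypothetical function names; where the footnote ranges share a boundary value (for example, exactly 25 ng), the code assigns the value to the lower-input bin, which is an assumption and not something the table specifies.

```python
def cdna_amplification_cycles(cell_load: int) -> int:
    """Total cDNA amplification cycles, following Table 2 footnote (2)."""
    if cell_load < 500:
        return 13
    if cell_load <= 6000:
        return 12
    return 11


def sample_index_pcr_cycles(cdna_ng: float) -> tuple:
    """Recommended (min, max) sample index PCR cycles, following footnote (3)."""
    # (upper bound of cDNA input in ng, recommended cycle range)
    bins = [
        (25, (14, 16)),
        (150, (12, 14)),
        (500, (10, 12)),
        (1000, (8, 10)),
        (1500, (6, 8)),
    ]
    if cdna_ng < 1:
        raise ValueError("cDNA input below the range covered by the table")
    for upper_ng, cycles in bins:
        if cdna_ng <= upper_ng:   # boundary values go to the lower-input bin (assumed)
            return cycles
    raise ValueError("cDNA input above the range covered by the table")


print(cdna_amplification_cycles(3000))
print(sample_index_pcr_cycles(100))
```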
Lipid based barcoding Oligonucleotides | |
Anchor LMO | 5'-TGGAATTCTCGGGTGCCAAGGGTAACGATCCAGCTGTCACT-Lipid-3' |
Co-Anchor LMO | 5'-Lipid-AGTGACAGCTGGATCGTTAC-3' |
Barcode Oligo | 5'-CCTTGGCACCCGAGAATTCCANNNNNNNNA30-3' |
Lipid barcoding Additive Primer | 5'-CTTGGCACCCGAGAATTCC-3' |
RPIX Primer | 5'-CAAGCAGAAGACGGCATACGAGATNNNNNNGTGACTGGAGTTCCTTGGCACCCGAGAATTCCA-3' |
Universal Adapter Primer | 5'-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT-3' |
Antibody based barcoding Oligonucleotides | |
Antibody barcoding oligo | 5'-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTNNNNNNNNNNNNBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA*A*A-3' |
HTO additive Primer | 5'-GTGACTGGAGTTCAGACGTGTGCTC-3' |
ADT additive Primer | 5'-CCTTGGCACCCGAGAATTCC-3' |
P5-smart-pcr hybrid oligo | 5'-AATGATACGGCGACCACCGAGATCTACACGCCTGTCCGCGGAAGCAGTGGTATCAACGCAGAGT*A*C-3' |
Table 3: Oligonucleotide sequences used in this protocol. N = Barcode or index sequence; * = Phosphorothioate bond
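The lipid-based oligonucleotides in Table 3 can be checked for the base pairing that the design implies. The sketch below tests whether the 5' end of the Anchor LMO is the reverse complement of the constant 5' region of the Barcode Oligo, and whether the Co-Anchor LMO is the reverse complement of the 3' end of the Anchor LMO. Which segments are meant to hybridize is an interpretation of the sequences, not a statement from the protocol; the variable names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences copied from Table 3 (5' to 3', lipid moieties omitted).
ANCHOR = "TGGAATTCTCGGGTGCCAAGGGTAACGATCCAGCTGTCACT"
CO_ANCHOR = "AGTGACAGCTGGATCGTTAC"
BARCODE_HANDLE = "CCTTGGCACCCGAGAATTCCA"   # constant region before the N8 barcode
ADDITIVE_PRIMER = "CTTGGCACCCGAGAATTCC"

# Does the 5' end of the anchor pair with the barcode oligo handle?
anchor_pairs_barcode = (
    reverse_complement(ANCHOR[:len(BARCODE_HANDLE)]) == BARCODE_HANDLE
)
# Does the co-anchor pair with the 3' end of the anchor?
anchor_pairs_co_anchor = (
    reverse_complement(CO_ANCHOR) == ANCHOR[-len(CO_ANCHOR):]
)
# Is the additive primer contained in the barcode handle?
primer_in_handle = ADDITIVE_PRIMER in BARCODE_HANDLE

print(anchor_pairs_barcode, anchor_pairs_co_anchor, primer_in_handle)
```

A check like this is a quick way to catch transcription errors when ordering custom oligonucleotides.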
In this study, we have demonstrated a protocol to analyze single cell transcriptional profiles. We have also provided two optional methods to multiplex samples in the scRNA-Seq workflow. Both methods have been shown to be feasible in various labs and provide solutions for running a cost-effective and batch effect-free single cell experiment18,26.
Several steps should be followed carefully when going through the protocol. An ideal single cell suspension should have >90% viable cells, and the cell density should be within a specific range27. It is critical to obtain good quality cells and to minimize the presence of cellular aggregates, debris, and fibers. Cellular aggregates have a negative impact on sample multiplexing and may clog the droplet generating machine17. Generally speaking, a 30-40 µm cell strainer is ideal for removing large clumps and debris while preserving the cell samples, because most cells shrink below 30 µm after dissociation. If the cell diameter is larger than 30 µm, using single nuclei instead is recommended. At early embryonic stages, all types of mouse cells should be smaller than 30 µm. However, at later stages, the cardiomyocytes in the heart, neurons in the brain, muscle cells in limbs, and some fat cells may be larger than 30 µm. Cell size should be measured for these types of cells before starting the single cell experiments.
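The two checks described above (viability above 90% and cell diameter relative to 30 µm) can be expressed as a short pre-run check. This is a minimal sketch with hypothetical function names and invented example counts; it assumes live and dead cells have been counted, for example by trypan blue exclusion, and it does not cover the cell density requirement.

```python
def viability_percent(live_cells: int, dead_cells: int) -> float:
    """Percentage of viable cells from live/dead counts."""
    total = live_cells + dead_cells
    if total == 0:
        raise ValueError("no cells counted")
    return 100.0 * live_cells / total


def suspension_advice(live_cells: int, dead_cells: int, max_diameter_um: float) -> str:
    """Apply the >90% viability and 30 um diameter guidelines from the text."""
    if max_diameter_um > 30.0:
        return "consider single nuclei"
    if viability_percent(live_cells, dead_cells) > 90.0:
        return "proceed"
    return "improve viability"


# Invented example counts, for illustration only.
print(viability_percent(460, 40))
print(suspension_advice(460, 40, max_diameter_um=18.0))
```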
The multiplexing strategies provide a way to simultaneously analyze a large number of samples in a cost-effective way. In addition, by profiling multiple samples together, we can largely avoid batch effects and identify cell doublets. These advantages will be very attractive to the single cell field. However, some factors may limit their usage. As more cells are multiplexed in a single experiment, the cell doublet ratio also increases. Although those doublets can be identified and removed by analyzing the multiplexing barcode data, this wastes a large number of sequencing reads. In addition, as more cells are pooled together, the cells are more likely to break and increase the ambient mRNA, which is captured into droplets together with cells and interferes with the detection sensitivity. We expect that further optimization of the experimental workflow or bioinformatics analysis pipeline will resolve these two issues in the near future.
The authors have nothing to disclose.
We thank David M. Patterson and Christopher S. McGinnis from Dr. Zev J. Gartner's lab for their kind supply of the lipid-based barcoding reagents and suggestions on the experimental steps and data analysis. This work was funded by the National Institutes of Health (HL13347202).
10% Tween-20 | Bio-Rad | 1610781 | |
10x Chip Holder | 10x Genomics | 120252 330019 | |
10x Chromium Controller | 10x Genomics | 120223 | |
10x Magnetic Separator | 10x Genomics | 120250 230003 | |
10x Vortex Adapter | 10x Genomics | 330002, 120251 | |
10x Vortex Clip | 10x Genomics | 120253 230002 | |
4200 TapeStation System | Agilent | G2991AA | |
Agilent High Sensitivity DNA Kit | Agilent | 5067-4626 | University of Pittsburgh Health Sciences Sequencing Core |
Barcode Oligo | Integrated DNA Technologies | Single-stranded DNA | 25 nmol |
Buffer EB | Qiagen | 19086 | |
CD1 mice | Charles River | Strain Code 022 | ordered pregnant mice |
Centrifuge 5424R | Eppendorf | 2231000214 | |
Chromium Chip B Single Cell Kit, 48 rxns | 10x Genomics | 1000073 | Store at ambient temperature |
Chromium i7 Multiplex Kit, 96 rxns | 10x Genomics | 120262 | Store at -20 °C |
Chromium Single Cell 3' GEM Kit v3, 4 rxns | 10x Genomics | 1000094 | Store at -20 °C |
Chromium Single Cell 3' Library Kit v3 | 10x Genomics | 1000095 | Store at -20 °C |
Chromium Single Cell 3' v3 Gel Beads | 10x Genomics | 2000059 | Store at -80 °C |
Collagenase A | Sigma/Millipore | 10103578001 | Store powder at 4 °C, store at -20 °C after it dissolves |
Collagenase B | Sigma/Millipore | 11088807001 | Store powder at 4 °C, store at -20 °C after it dissolves |
D1000 ScreenTape | Agilent | 5067-5582 | University of Pittsburgh Health Sciences Sequencing Core |
DNA LoBind Tube Microcentrifuge Tube, 1.5 mL | Eppendorf | 022431021 | |
DNA LoBind Tube Microcentrifuge Tube, 2.0 mL | Eppendorf | 022431048 | |
Dynabeads MyOne SILANE | 10x Genomics | 2000048 | Store at 4 °C, used in Beads Cleanup Mix (Table 1) |
DynaMag-2 Magnet | Thermo Scientific | 12321D | |
Ethanol, Pure (200 Proof, anhydrous) | Sigma | E7023-500mL | |
Falcon 15mL High Clarity PP Centrifuge Tube | Corning Cellgro | 14-959-70C | |
Falcon 50mL High Clarity PP Centrifuge Tube | Corning Cellgro | 14-959-49A | |
Fetal Bovine Serum, qualified, United States | Fisher Scientific | 26140079 | Store at -20 °C |
Finnpipette F1 Multichannel Pipettes, 10-100μl | Thermo Scientific | 4661020N | |
Finnpipette F1 Multichannel Pipettes, 1-10μl | Thermo Scientific | 4661000N | |
Flowmi Cell Strainer | Sigma | BAH136800040 | Porosity 40 μm, for 1000 uL Pipette Tips, pack of 50 each |
Glycerin (Glycerol), 50% (v/v) | Ricca Chemical Company | 3290-32 | |
HBSS, no calcium, no magnesium | Thermo Fisher Scientific | 14170112 | |
Human TruStain FcX (Fc Receptor Blocking Solution) | BioLegend | 422301 | Add 5 µl of Human TruStain FcX per million cells in 100 µl staining volume |
Isopropanol (IPA) | Fisher Scientific | A464-4 | |
Kapa HiFi HotStart ReadyMix (2X) | Fisher Scientific | NC0295239 | Store at -20 °C, used in Lipid-tagged barcode library mix (Table 1) |
Lipid Barcode Primer (Multi-seq Primer) | Integrated DNA Technologies | Single-stranded DNA | 100 nmol |
Low TE Buffer (10 mM Tris-HCl pH 8.0, 0.1 mM EDTA) | Thermo Fisher Scientific | 12090-015 | |
MasterCycler Pro | Eppendorf | 950W | |
Nuclease-Free Water (Ambion) | Thermo Fisher Scientific | AM9937 | |
PCR Tubes 0.2 ml 8-tube strips | Eppendorf | 951010022 | |
Phosphate-Buffered Saline (PBS) 1X without calcium & magnesium | Corning Cellgro | 21-040-CV | |
Phosphate-Buffered Saline (PBS) with 10% Bovine Albumin (alternative to Thermo Fisher product) | Sigma-Aldrich | SRE0036 | |
Pipet 4-pack (0.1–2.5μL, 0.5-10μL, 10–100μL, 100–1,000μL variable-volume pipettes) | Fisher Scientific | 05-403-151 | |
Selection reagent (SPRIselect Reagent Kit) | Beckman Coulter | B23318 (60ml) | |
Template Switch Oligo | 10x Genomics | 3000228 | Store at -20 °C, used in Master Mix (Table 1) |
The antibody based barcoding strategy is also known as Cell Hashing | |||
The cell browser is Loupe Cell Browser | 10x Genomics | https://support.10xgenomics.com/single-cell-gene-expression/software/visualization/latest/what-is-loupe-cell-browser | |
The commercially available analysis pipeline in step 8.1 is Cell Ranger | 10x Genomics | https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/what-is-cell-ranger | |
The lipid based barcoding strategy is also known as MULTI-seq | |||
The well maintained R platform is Seurat V3 | satijalab | https://satijalab.org/seurat/ | |
TipOne RPT 0.1-10/20 ul XL ultra low retention filter pipet tip | USA Scientific | 1180-3710 | |
TipOne RPT 1000 ul XL ultra low retention filter pipet tip | USA Scientific | 1182-1730 | |
TipOne RPT 200 ul ultra low retention filter pipet tip | USA Scientific | 1180-8710 | |
TotalSeq-A0301 anti-mouse Hashtag 1 Antibody | BioLegend | 155801 | 0.1 – 1.0 µg of antibody in 100 µl of staining buffer for every 1 million cells |
TotalSeq-A0302 anti-mouse Hashtag 2 Antibody | BioLegend | 155803 | 0.1 – 1.0 µg of antibody in 100 µl of staining buffer for every 1 million cells |
TotalSeq-A0303 anti-mouse Hashtag 3 Antibody | BioLegend | 155805 | 0.1 – 1.0 µg of antibody in 100 µl of staining buffer for every 1 million cells |
TruSeq RPI primer | Integrated DNA Technologies | Single-stranded DNA | 100 nmol, used in Lipid-tagged barcode library mix (Table 1) |
Trypan Blue Solution, 0.4% | Fisher Scientific | 15250061 | |
Trypsin-EDTA (0.25%), phenol red | Fisher Scientific | 25200-056 | |
Universal I5 | Integrated DNA Technologies | Single-stranded DNA | 100 nmol |