Here, we present a protocol to generate high-quality, large-scale transcriptome data of single cells from isolated human pancreatic islets using a droplet-based microfluidic single-cell RNA sequencing technology.
Pancreatic islets comprise endocrine cells with distinctive hormone expression patterns. The endocrine cells show functional differences in response to normal and pathological conditions. The goal of this protocol is to generate high-quality, large-scale transcriptome data of each endocrine cell type with the use of a droplet-based microfluidic single-cell RNA sequencing technology. Such data can be utilized to build the gene expression profile of each endocrine cell type in normal or specific conditions. The process requires careful handling, accurate measurement, and rigorous quality control. In this protocol, we describe detailed steps for human pancreatic islet dissociation, sequencing, and data analysis. The representative results of about 20,000 human single islet cells demonstrate the successful application of the protocol.
Pancreatic islets release endocrine hormones to regulate blood glucose levels. Five endocrine cell types, which differ functionally and morphologically, are involved in this essential role: α-cells produce glucagon, β-cells insulin, δ-cells somatostatin, PP cells pancreatic polypeptide, and ε-cells ghrelin1. Gene expression profiling is a useful approach to characterize the endocrine cells in normal or specific conditions. Historically, whole-islet gene expression profiles were generated using microarrays and next-generation RNA sequencing2,3,4,5,6,7,8. Although the whole-islet transcriptome is informative for identifying organ-specific transcripts and disease candidate genes, it fails to uncover the molecular heterogeneity of each islet cell type. The laser capture microdissection (LCM) technique has been applied to obtain specific cell types directly from islets9,10,11,12, but it falls short in the purity of the targeted cell population. To overcome these limitations, fluorescence-activated cell sorting (FACS) has been used to select specific endocrine cell populations, such as α- and β-cells13,14,15,16,17,18. Moreover, Dorrell et al. used an antibody-based FACS sorting approach to classify β-cells into four subpopulations19. FACS-sorted islet cells can also be plated for RNA sequencing of single cells; however, the plate-based methods face challenges in scalability20,21,22.
To generate high-quality, large-scale transcriptome data of each endocrine cell type, we applied microfluidic technology to human islet cells. The microfluidic platform generates transcriptome data from a large number of single cells in a high-throughput, high-quality, and scalable manner23,24,25,26,27. In addition to revealing the molecular characteristics of cell types captured in large quantities, the highly scalable microfluidic platform enables identification of rare cell types when enough cells are provided. Hence, application of the platform to human pancreatic islets allowed profiling of ghrelin-secreting ε-cells, a rare endocrine cell type whose function is little known due to its scarcity28. In recent years, we and others have published several studies reporting large-scale transcriptome data of human islets using the technology29,30,31,32,33. The data are publicly available and are useful resources for the islet community to study endocrine cell heterogeneity and its implications in disease.
Here, we describe a droplet-based microfluidic single-cell RNA sequencing protocol, which has been used to produce transcriptome data of approximately 20,000 human islet cells including α-, β-, δ-, PP, ε-cells, and a smaller proportion of non-endocrine cells32. The workflow starts with isolated human islets and covers the steps of islet cell dissociation, single-cell capture, and data analysis. The protocol requires the use of freshly isolated islets and can be applied to islets from humans and other species, such as rodents. Using this workflow, an unbiased and comprehensive islet cell atlas under baseline and other conditions can be built.
1. Human islet dissociation
2. Single cell suspension quality control
3. Single cell partitioning using a microfluidic chip. Follow the protocol from the microfluidic chip manufacturer35.
4. Single cell cDNA amplification. Follow the protocol from the microfluidic chip manufacturer35.
5. Sequencing library construction
6. Library sequencing
7. Read alignment (Supplemental File 1)
8. Data analysis (Supplemental File 2)
The single-cell RNA sequencing workflow consists of three steps: dissociating intact human islets into a single-cell suspension, capturing single cells using a droplet-based technology, and analyzing RNA-seq data (Figure 1). First, the acquired human islets were incubated overnight. The intact islets were examined under the microscope (Figure 2A). The integrity of dissociated islet cells was validated using RNA fluorescence in situ hybridization (RNA-FISH). As shown in Figure 2B, dissociated α- and β-cells were visualized using GCG and INS mRNA probes, respectively.
Cell count and viability need to be determined before the single-cell capture step. Cells with low viability or high debris are not suitable for further processing. A good cell concentration usually ranges from 400 to 500 cells/µL. Approximately 6000 cells were loaded onto the microfluidic chip in the single-cell partitioning step, and 100 μL of gel beads in emulsion were removed from the chip. Figure 3A shows an example of a successful emulsion following the partitioning step. The liquid in each pipette tip is uniformly pale and cloudy, with minimal partitioning oil separated from the gel beads. In contrast, Figure 3B shows a poor-quality emulsion with clear phase separation between the gel beads and oil. This could be due to a clog during the chip run.
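The relationship between the measured cell concentration and the number of cells loaded is simple arithmetic, sketched below. The function name and structure are illustrative assumptions; the actual loading volume and any dilution should follow the chip manufacturer's protocol, which this sketch does not replace.

```python
def suspension_volume_ul(target_cells: int, cells_per_ul: float) -> float:
    """Return the volume of cell suspension (in µL) containing `target_cells` cells.

    Hypothetical helper for illustration only; it does not account for
    dilution with water or master mix.
    """
    if cells_per_ul <= 0:
        raise ValueError("cell concentration must be positive")
    return target_cells / cells_per_ul


# Approximately 6000 cells are loaded per run in this protocol.
TARGET_CELLS = 6000

# The text quotes 400 to 500 cells/µL as a good concentration range.
for concentration in (400, 500):
    volume = suspension_volume_ul(TARGET_CELLS, concentration)
    print(f"{concentration} cells/µL -> {volume:.1f} µL of suspension")
```

At the quoted concentration range, roughly 12 to 15 µL of suspension contains the approximately 6000 cells loaded per run.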
Following single cell partitioning, cDNA amplification was performed. Figure 4A illustrates a representative fragment size distribution after cDNA amplification. The typical peak for a good quality cDNA sample resided near 1000-2000 bp. Interestingly, a spike near 600 bp was specific to islet cDNA. The fragment size distribution for the RNA-seq libraries was between 300 and 500 bp (Figure 4B).
After sequencing, we employed a set of read alignment metrics to evaluate single-cell RNA-seq data quality (Table 7). The first three metrics summarized single-cell sequencing library quality well. On average, 92% of reads were derived from intact cells and 72% of reads were mapped to exons. Of all exon reads captured in droplets, 90% were produced by intact cells, and the rest were likely ambient RNAs in cell-free droplets. These alignment metrics suggest good data quality. The ratio between exon reads and UMIs was an empirical measure of sequencing saturation; a ratio of about 10:1 was usually a good indicator. Additionally, the number of detected genes (UMI > 0) was a useful feature to characterize different cell types. For human islet cells, the number of detected genes was about 1,900 per cell.
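The averages quoted above can be recomputed from the per-sample values in Table 7. The following is a minimal sketch, not the authors' analysis code; the dictionary layout and key names are assumptions made for illustration. The per-sample ratio divides mean exon reads per cell by median UMI per cell, so it is only a rough proxy for the exon-read-to-UMI ratio discussed in the text.

```python
from statistics import mean

# Values copied from Table 7. Tuple order:
# (% reads with valid cell barcodes, % exon reads in captured cells,
#  % exon reads among total reads, mean exon reads per cell,
#  median UMI per cell, median genes per cell)
TABLE7 = {
    "Sample-1": (92, 93, 76, 142015, 10310, 1747),
    "Sample-2": (92, 91, 74, 151395, 11350, 1754),
    "Sample-3": (94, 92, 75, 120538, 19604, 2180),
    "Sample-4": (95, 93, 67, 160657, 11870, 2111),
    "Sample-5": (94, 92, 62, 177809, 13821, 2288),
    "Sample-6": (95, 89, 67, 138208, 8235, 1296),
    "Sample-7": (94, 89, 72, 147484, 13606, 2272),
    "Sample-8": (94, 91, 69, 159793, 9505, 1865),
    "Sample-9": (95, 92, 72, 168436, 12794, 2389),
    "Sample-10": (83, 83, 74, 88067, 13323, 1805),
    "Sample-11": (82, 88, 77, 67752, 9295, 1278),
    "Sample-12": (91, 85, 74, 194781, 14877, 1746),
}


def column_mean(index: int) -> float:
    """Average one Table 7 column across all samples."""
    return mean(row[index] for row in TABLE7.values())


summary = {
    "valid_barcode_pct": column_mean(0),
    "exon_reads_in_cells_pct": column_mean(1),
    "exon_reads_of_total_pct": column_mean(2),
    "median_genes_per_cell": column_mean(5),
}

# Rough saturation proxy per sample: mean exon reads / median UMI.
reads_per_umi = {name: row[3] / row[4] for name, row in TABLE7.items()}

for key, value in summary.items():
    print(f"{key}: {value:.1f}")
for name, ratio in reads_per_umi.items():
    print(f"{name}: {ratio:.1f} exon reads per UMI")
```

The rounded column means agree with the 92%, 90%, 72%, and roughly 1,900 genes per cell reported in the text.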
We sequenced a total of 20,811 islet cells from 12 non-diabetic donors. Expression of more than one hormone was detected in about 6% of the cells. These multi-hormonal cells are most likely doublets because our previous work showed that less than 0.1% of single islet cells co-expressed more than one endocrine hormone33. We removed all the identified multi-hormonal cells. It is also important to exclude low-quality cells based on total UMI, detected genes, and cell viability33. After these quality control steps, 19,174 cells remained for further analysis. The clustering analysis revealed 12 cell types: α-, β-, δ-, PP, ε-cells, acinar, ductal, quiescent stellate, activated stellate, endothelial, macrophage, and mast cells (Figure 5). As expected, endocrine cells were the majority (Table 8). The top enriched genes in α-cells (i.e., GCG, TTR, CRYBA2, TM4SF4, TMEM176B) and β-cells (i.e., IAPP, INS, HADH, DLK1, RBP4) are consistent with other studies13,15,16,17,18,20,21,22,29,30,31,33. Interestingly, both α- and β-cells consisted of several subpopulations. Three β-cell subpopulations, Beta sub1, sub2, and sub3, were similar, with small numbers of subpopulation-enriched genes (18 in Beta sub1, 33 in Beta sub2, and 18 in Beta sub3). The fourth subpopulation had 488 enriched genes. The small α-cell subpopulation (Alpha sub3) comprised proliferating cells, characterized by enriched expression of MKI67, CDK1, and TOP2A.
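The cell numbers above can be cross-checked against Table 8 with a few lines of arithmetic. This is an illustrative sketch only; the cell-type labels are copied from Table 8 as printed, and the variable names are assumptions.

```python
# Cell counts per type, copied from Table 8.
CELL_COUNTS = {
    "Alpha": 6546,
    "Beta": 7361,
    "Delta": 922,
    "PP": 545,
    "Epsilon": 11,
    "Acinar": 836,
    "Ductal": 1313,
    "Quiescent stellate": 225,
    "Activated stellate": 890,
    "Endothelial": 408,
    "Macrophage": 80,
    "Schwann": 37,
}
ENDOCRINE_TYPES = {"Alpha", "Beta", "Delta", "PP", "Epsilon"}

SEQUENCED_CELLS = 20811  # total cells sequenced before quality control

cells_after_qc = sum(CELL_COUNTS.values())
endocrine_cells = sum(CELL_COUNTS[name] for name in ENDOCRINE_TYPES)
removed_by_qc = SEQUENCED_CELLS - cells_after_qc
endocrine_pct = 100 * endocrine_cells / cells_after_qc

print(f"Cells after QC: {cells_after_qc}")
print(f"Cells removed by QC: {removed_by_qc}")
print(f"Endocrine cells: {endocrine_cells} ({endocrine_pct:.1f}%)")
```

The Table 8 counts sum to the 19,174 cells retained after quality control, and endocrine cells account for about 80% of them.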
Figure 1: Schematic diagram of single-cell RNA sequencing workflow.
Figure 2: Representative images of intact and dissociated human islets. (A) An image of islets taken after overnight incubation. (B) Dissociated islet cells visualized by RNA-FISH staining for INS (white) and GCG (red).
Figure 3: Examination of the quality of single-cell emulsion prior to reverse transcription. (A) A single-cell emulsion of good quality. The liquid in each pipette tip was homogeneously cloudy. (B) A single-cell emulsion of poor quality. The liquid in the pipette tip was not homogeneous and showed separation between oil and the gel beads.
Figure 4: Examination of the quality of single-cell cDNA and library. (A) A representative cDNA trace. This cDNA was of good quality and yield, with the main peak for the sample occurring near 1000-2000 bp. The spike in the trace around 600 bp was typical and distinctive of islet cDNA. (B) A representative final sequencing library trace. This library was of good quality and yield, with the main peak occurring between 300 and 500 bp.
Figure 5: Cell types and subpopulations identified in single-cell RNA sequencing of human pancreatic islets. Cells were clustered by distinctive cell types in the space of t-distributed stochastic neighbor embedding (tSNE) dimensions. The analysis also revealed three subpopulations in α-cells and four in β-cells.
Reagent Name | Volume to Use (μL) per reaction |
RT Reagent Mix | 50 |
RT Primer | 3.8 |
Additive A | 2.4 |
RT Enzyme Mix | 10 |
Total | 66.2 |
Table 1: Reverse transcription mix.
Reagent Name | Volume to Use (μL) per reaction |
Nuclease-free water | 9 |
Buffer Sample Clean Up 1 | 182 |
Dynabeads MyOne Silane | 4 |
Additive A | 5 |
Total | 200 |
Table 2: Cleanup mix.
Reagent Name | Volume to Use (μL) per reaction |
Buffer EB | 98 |
10% Tween 20 | 1 |
Additive A | 1 |
Total | 100 |
Table 3: Elution solution.
Reagent Name | Volume to Use (μL) per reaction |
Nuclease-free water | 8 |
Amplification Master Mix | 50 |
cDNA Additive | 5 |
cDNA Primer Mix | 2 |
Total | 65 |
Table 4: cDNA amplification mix.
Reagent Name | Volume to Use (μL) per reaction |
Tagmentation Enzyme | 5 |
Tagmentation Buffer | 25 |
Total | 30 |
Table 5: Tagmentation mix.
Reagent Name | Volume to Use (μL) per reaction |
Nuclease-free water | 8 |
Amplification Master Mix | 50 |
SI-PCR Primer | 2 |
Total | 60 |
Table 6: Sample index PCR master mix.
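When several samples are processed together, the per-reaction volumes in Tables 1 to 6 are typically scaled into a master mix. The sketch below uses the Table 1 volumes as an example. The `scale_mix` helper and the 10% pipetting overage are illustrative assumptions, not values from this protocol; follow the manufacturer's recommendation for the actual overage.

```python
# Per-reaction volumes (µL) copied from Table 1.
RT_MIX_UL = {
    "RT Reagent Mix": 50.0,
    "RT Primer": 3.8,
    "Additive A": 2.4,
    "RT Enzyme Mix": 10.0,
}


def scale_mix(per_reaction: dict, n_reactions: int, overage: float = 0.10) -> dict:
    """Scale per-reaction volumes to a master mix for `n_reactions`.

    `overage` is the extra fraction prepared to cover pipetting loss.
    The 10% default is an assumption for illustration.
    """
    if n_reactions < 1:
        raise ValueError("need at least one reaction")
    factor = n_reactions * (1 + overage)
    return {name: round(volume * factor, 2) for name, volume in per_reaction.items()}


# Sanity check: the per-reaction volumes should add up to the table total.
per_reaction_total = round(sum(RT_MIX_UL.values()), 1)
print(f"Per-reaction total: {per_reaction_total} µL")

master_mix = scale_mix(RT_MIX_UL, n_reactions=4)
for name, volume in master_mix.items():
    print(f"{name}: {volume} µL")
```

The same helper can be applied to any of the other reagent tables by swapping in that table's per-reaction volumes.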
Sample ID | % Reads with Valid Cell Barcodes | % Exon Reads in Captured Cells among Total Cells | % Exon Reads among Total Reads | Mean Exon Reads per Cell | Median UMI per Cell | Median Genes per Cell |
Sample-1 | 92% | 93% | 76% | 142,015 | 10,310 | 1,747 |
Sample-2 | 92% | 91% | 74% | 151,395 | 11,350 | 1,754 |
Sample-3 | 94% | 92% | 75% | 120,538 | 19,604 | 2,180 |
Sample-4 | 95% | 93% | 67% | 160,657 | 11,870 | 2,111 |
Sample-5 | 94% | 92% | 62% | 177,809 | 13,821 | 2,288 |
Sample-6 | 95% | 89% | 67% | 138,208 | 8,235 | 1,296 |
Sample-7 | 94% | 89% | 72% | 147,484 | 13,606 | 2,272 |
Sample-8 | 94% | 91% | 69% | 159,793 | 9,505 | 1,865 |
Sample-9 | 95% | 92% | 72% | 168,436 | 12,794 | 2,389 |
Sample-10 | 83% | 83% | 74% | 88,067 | 13,323 | 1,805 |
Sample-11 | 82% | 88% | 77% | 67,752 | 9,295 | 1,278 |
Sample-12 | 91% | 85% | 74% | 194,781 | 14,877 | 1,746 |
Table 7: Read alignment metrics.
Cell type | Number of cells | Ave. cells per donor (standard deviation) |
Alpha | 6546 | 546 (258) |
Beta | 7361 | 613 (252) |
Delta | 922 | 77 (37) |
PP | 545 | 45 (25) |
Epsilon | 11 | 1 (1) |
Acinar | 836 | 70 (71) |
Ductal | 1313 | 109 (95) |
Quiescent stellate | 225 | 19 (14) |
Activated stellate | 890 | 74 (58) |
Endothelial | 408 | 34 (21) |
Macrophage | 80 | 7 (6) |
Schwann | 37 | 3 (3) |
Table 8: Cell type composition. Total number of cells for each cell type and average cells for each donor in each cell type.
Supplemental File 1: Commands used for sequence alignment.
Supplemental File 2: R scripts to perform cell quality control, cell clustering, and to identify cell-type enriched genes.
Single-cell technologies developed in recent years provide a new platform to characterize cell types and study molecular heterogeneity in human pancreatic islets. We adopted a protocol of droplet-based microfluidic single-cell isolation and data analysis to study human islets. Our protocol successfully produced RNA sequencing data from over 20,000 single human islet cells with relatively small variations in sequence quality and batch effects.
In particular, two steps are critical in this protocol for high-quality outcomes. The first is islet dissociation: caution needs to be taken when dissociating human islets, and it is important not to over-digest them. Single-cell partitioning is the other key step for a successful single-cell experiment. We showed examples of good- and poor-quality emulsions in Figure 3. A clear emulsion usually indicates that an inadequate number of cells was collected in the partitioning step.
Access to isolated primary human islets is a rate-limiting step in generating large-scale human islet single-cell transcriptomes. Isolated islets from individual cadaver donors are usually processed at different times; thus, potential sample-dependent batch effects should be carefully examined during data analysis. Integrative analysis can be used to identify common cell types and subpopulations across individual batches41. The batch effect can also be adjusted by batch-corrected expression quantification42. Another challenge in analyzing single-cell RNA-seq data is identifying doublets. In the data pre-processing, we took measures to remove endocrine doublets by identifying cells expressing multiple hormone genes (GCG, INS, SST, PPY, and GHRL). Identifying doublets formed by two different cell types is relatively easy because of the extremely high expression of endocrine hormones. The real challenge is to identify within-cell-type doublets, e.g., doublets formed by two α-cells. Because a higher UMI count and a higher number of detected genes are suggestive of potential doublets, one solution is to remove outliers with a high number of genes and UMIs during the cell QC step. Additionally, tools to detect doublets are available43,44,45.
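The two doublet heuristics described above can be sketched as follows. This is not the authors' R code from Supplemental File 2. The function names, the 10% hormone-fraction cutoff, and the 3-MAD outlier cutoff are all hypothetical values chosen for illustration, and real thresholds should be tuned on the data.

```python
from statistics import median

HORMONE_GENES = ("GCG", "INS", "SST", "PPY", "GHRL")


def is_multi_hormonal(umi_counts: dict, min_fraction: float = 0.10) -> bool:
    """Flag a cell in which more than one hormone gene contributes at least
    `min_fraction` of the cell's total hormone UMIs (hypothetical cutoff)."""
    hormone_umis = [umi_counts.get(gene, 0) for gene in HORMONE_GENES]
    total = sum(hormone_umis)
    if total == 0:
        return False
    dominant = sum(1 for count in hormone_umis if count / total >= min_fraction)
    return dominant > 1


def high_outliers(values: list, n_mads: float = 3.0) -> list:
    """Flag values more than `n_mads` median absolute deviations above the
    median (hypothetical cutoff), e.g. total UMI or detected genes per cell."""
    center = median(values)
    mad = median(abs(value - center) for value in values)
    if mad == 0:
        return [False] * len(values)
    return [(value - center) / mad > n_mads for value in values]


# Toy example with made-up counts (not data from this study).
cells = {
    "cell_a": {"GCG": 900, "INS": 5},    # mostly glucagon: single alpha-cell
    "cell_b": {"GCG": 400, "INS": 500},  # two strong hormones: likely doublet
}
flags = {name: is_multi_hormonal(counts) for name, counts in cells.items()}
print(flags)

total_umi = [10000, 11000, 12000, 9000, 40000]
print(high_outliers(total_umi))
```

A fraction-based rule is used here instead of a simple "count > 0" rule because ambient hormone mRNA can contribute a few UMIs to almost any droplet; this design choice is an assumption of the sketch, not a statement of how the published analysis was done.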
A major limitation of single-cell RNA sequencing is low sensitivity. Using spike-in External RNA Controls Consortium (ERCC) RNAs, we estimated that only 10% of all expressed genes were detected using the current protocol and that the detected genes were biased toward high-abundance genes46. Pancreatic endocrine cells express hormone genes (i.e., GCG, INS, SST, and PPY) at extremely high levels. As a result, the mRNAs of these genes are at risk of becoming ambient RNA. Such background noise cannot be entirely avoided. However, this step-by-step protocol will help researchers minimize undesired experimental noise. The current protocol is designed for freshly isolated tissues. Other technologies, such as single-nucleus RNA sequencing47,48, are available for RNA-seq of fresh, frozen, or lightly fixed tissues. Additionally, a recently developed cell hashing technology49 can be considered an advanced microfluidic single-cell protocol that allows sample multiplexing.
The authors have nothing to disclose.
30 µm Pre-Separation Filters | Miltenyi Biotec | 130-041-407 | Cell strainer |
8-chamber slides | Chemometec | 102673-680 | Cell counting assay slides |
Bioanalyzer High Sensitivity DNA Kit | Agilent | 5067-4626 | for QC |
Bovine Serum Albumin | Sigma-Aldrich | A9647 | Single cell media |
Chromium Single Cell 3' Library & Gel Bead Kit v2, 16 rxns | 10X Genomics | 120237 | Single cell reagents |
Chromium Single Cell A Chip Kit v2, 48 rx (6 chips) | 10X Genomics | 120236 | Microfluidic chips |
CMRL-1066 | ThermoFisher | 11530-037 | Complete islet media |
EB Buffer | Qiagen | 19086 | Elution buffer |
Eppendorf twin-tec PCR plate, 96-well, blue, semi-skirted | VWR | 47744-112 | Emulsion plate |
Fetal Bovine Serum | ThermoFisher | 16000-036 | Complete islet media |
Human islets | Prodo Labs | HIR | Isolated human islets |
L-Glutamine (200 mM) | ThermoFisher | 25030-081 | Complete islet media |
Nextera DNA Library Preparation Kit (96 samples) | Illumina | FC-121-1031 | Library preparation reagents |
NextSeq 500/550 High Output Kit v2.5 (75 cycles) | Illumina | FC-404-2005 | Sequencing |
Penicillin-Streptomycin (10,000 U/mL) | ThermoFisher | 15140-122 | Complete islet media |
Qubit High Sensitivity dsDNA Kit | Life Technologies | Q32854 | for QC |
Solution 18 | Chemometec | 103011-420 | Cell counting assay reagent |
SPRISelect Reagent | Fisher Scientific | B23318 | Purification beads |
Tissue Culture Dishes (10 cm) | VWR | 10861-594 | for islet culture |
TrypLE Express | Life Technologies | 12604-013 | Cell dissociation solution |
Zymo DNA Clean & Concentrator-5, 50 reactions | VWR | 77001-152 | Library clean up columns |