Here, the authors showcase the utility of MULTI-seq for phenotyping, followed by paired scRNA-seq and scATAC-seq, in characterizing the transcriptomic and chromatin accessibility profiles of the retina.
Powerful next-generation sequencing techniques offer robust and comprehensive analysis of how retinal gene regulatory networks function during development and in disease states. Single-cell RNA sequencing (scRNA-seq) comprehensively profiles the gene expression changes observed in retinal development and disease at the cellular level, while single-cell ATAC-seq (scATAC-seq) profiles chromatin accessibility and transcription factor binding at similar resolution. Here the use of these techniques in the developing retina is described, and MULTI-seq is demonstrated, in which individual samples are labeled with a modified oligonucleotide-lipid complex, enabling researchers to both increase the scope of individual experiments and substantially reduce costs.
Understanding how genes can influence cell fate plays a key role in interrogating processes such as disease and embryonic progression. The complex relationships between transcription factors and their target genes can be grouped in gene regulatory networks. Mounting evidence places these gene regulatory networks at the center of both disease and development across evolutionary lineages1. While previous techniques such as qRT-PCR focused on a single gene or set of genes, the application of high-throughput sequencing technology allows for the profiling of complete cellular transcriptomes.
RNA-seq offers a glimpse into large-scale transcriptomics2,3. Single-cell RNA sequencing (scRNA-seq) gives investigators the ability not only to profile transcriptomes but also to link specific cell types with gene expression profiles4. This is achieved bioinformatically by feeding individual cell profiles into sorting algorithms using known gene markers5. Multiplexing using lipid-tagged indices sequencing (MULTI-seq) greatly expands the number of samples whose scRNA-seq profiles can be collected in one experiment6. This lipid-based technique integrates barcodes into the plasma membrane, whereas other sample indexing techniques such as cell hashing rely on the presence of surface antigens and high-affinity antibodies7. Not only can gene expression profiles now be assigned to cell types, but different experiments can also be combined into a single sequencing library, dramatically lowering the cost of an individual scRNA-seq experiment6. The cost of scRNA-seq may seem prohibitive for phenotyping experiments in which many different genotypes, conditions, or patient samples are analyzed, but multiplexing allows up to 96 samples to be combined in a single library6.
Profiling gene expression via scRNA-seq is not the only high-throughput sequencing-based technique to have revolutionized the current understanding of how molecular mechanisms dictate cell fate. While knowing which gene transcripts are present in a cell enables the identification of cell type, it is equally important to understand how genomic organization regulates development and disease progression. Early studies relied on detecting DNase-mediated cleavage of sequences not bound to histones, followed by sequencing of the resulting DNA fragments to identify regions of open chromatin. In contrast, the single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) allows researchers to probe DNA with a domesticated transposon to readily profile open chromatin at the single-nucleotide level8. This technique has undergone scaling similar to that of scRNA-seq, and investigators can now identify individual cell types and profile phenotypes across thousands of individual genomes8.
The pairing of scRNA-seq and scATAC-seq has enabled researchers to profile thousands of cells to determine cell populations, genomic organization, and gene regulatory networks in disease models and developmental processes9,10,11,12. Here the authors outline how to first use MULTI-seq to condense the phenotyping of many animal models and then employ paired scRNA-seq and scATAC-seq to gain a better understanding of the chromatin landscape and regulatory networks in these models.
All animal studies were conducted using protocols approved by the Johns Hopkins Animal Care and Use Committee, in compliance with ARRIVE guidelines, and were performed in accordance with relevant guidelines and regulations.
1. MULTI-seq
2. Paired scRNA-seq and scATAC-seq
This workflow lays out a strategy for investigation of developmental phenotypes and regulatory processes using single cell sequencing. MULTI-seq sample multiplexing enables an initial low-cost phenotyping assay while paired collection and fixation of samples for scRNA-seq and scATAC-seq allows for more in-depth investigation (Figure 1).
MULTI-seq barcoding enables the combined sequencing of multiple samples and their subsequent computational deconvolution. The sample of origin can be determined for each cell based on its barcode expression (Figure 2A). These combined samples can be analyzed as a single dataset for the purposes of cell clustering and cell type identification (Figure 2B). Because each cell is barcoded before GEM generation, cell doublets have a high probability of showing expression of multiple MULTI-seq barcodes, so a majority of doublets can be identified and removed prior to clustering and cell type identification (Figure 2C). Increasing the number of cells used in the GEM generation step will increase the proportion of doublets found. scATAC-seq can be used to generate a dataset with cell types that match those found by scRNA-seq (Figure 2D). The pairing of scRNA-seq gene expression and scATAC-seq DNA accessibility information enables the reconstruction of gene regulatory networks.
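The barcode-based deconvolution described above can be illustrated with a minimal Python sketch. It is not the classification procedure used in this protocol; it assumes a simple, hypothetical fixed cutoff on each cell's share of barcode counts, whereas dedicated MULTI-seq analysis tools determine thresholds from the data. The function name `classify_cells`, both cutoffs, the sample labels, and the toy count matrix are illustrative assumptions.

```python
import numpy as np

def classify_cells(counts, sample_names, min_total=10, min_fraction=0.25):
    """Assign each cell to a sample from its MULTI-seq barcode counts.

    counts: (n_cells, n_barcodes) array of barcode counts per cell.
    A barcode is called "present" in a cell when it holds at least
    min_fraction of that cell's barcode counts (hypothetical cutoff).
    """
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1)
    # Guard against division by zero for cells with no barcode counts.
    fractions = counts / np.maximum(totals, 1.0)[:, None]
    present = fractions >= min_fraction
    n_present = present.sum(axis=1)

    calls = []
    for i in range(counts.shape[0]):
        if totals[i] < min_total or n_present[i] == 0:
            calls.append("Negative")   # too little barcode signal to call
        elif n_present[i] > 1:
            calls.append("Doublet")    # more than one sample barcode detected
        else:
            calls.append(sample_names[int(np.argmax(present[i]))])
    return calls

# Toy data: three hypothetical samples, four cells.
samples = ["WT", "Het", "KO"]
toy_counts = [
    [120, 3, 2],   # dominated by the first barcode
    [4, 95, 1],    # dominated by the second barcode
    [60, 2, 70],   # two strong barcodes
    [1, 0, 2],     # too few barcode counts
]
print(classify_cells(toy_counts, samples))
```

Cells called "Doublet" or "Negative" in such a scheme would be removed before clustering, which mirrors the doublet removal step shown in Figure 2C.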
Figure 1: Schematic demonstrating the use of MULTI-seq in initial analysis, followed by separate scRNA-seq and scATAC-seq analysis for in-depth characterization of phenotypes, treatments, or disease states of interest.
Figure 2: UMAP dimensional reduction representations of MULTI-seq data for an allelic series of P0 Sstr2 knockout mice demonstrating (A) the deconvolution of genotype for each cell in the dataset and (B) the identification of cell types in the dataset. Overloading cells during the GEM generation and barcoding step will result in an increase in cell doublets like those seen in (C), which shows the data from (A) and (B) before doublet removal and reclustering. (D) scATAC-seq data from GFP-positive cells obtained at E16 from retinal explants electroporated with a GFP-expressing control plasmid at E14. Cell types are annotated based on accessibility of cell type-specific genes. This figure has been modified from Weir, K., Kim, D. W., Blackshaw, S. Regulation of retinal neurogenesis by somatostatin signaling. bioRxiv 2020.09.26.314104 (2020), doi:10.1101/2020.09.26.314104 (ref. 18), and original, unpublished data.
The power of MULTI-seq stems from the seamless integration of data from multiple experimental conditions or models, with substantial benefits in cost and in limiting batch effects. Utilizing MULTI-seq offers a laboratory unprecedented phenotyping depth. Non-genetic multiplexing methods such as cell hashing or nuclei hashing opened the door to multiplexed samples through the use of barcoded antibodies7,19,20. However, this approach relies on high-affinity antibodies that recognize surface proteins expressed on cells or their nuclei, so it is not feasible when such antibodies are unavailable or the cells do not express appropriate cell surface or nuclear antigens7. Because MULTI-seq uses lipid-modified oligo barcodes that stably incorporate into cells or their nuclei, it allows researchers to gather transcriptomic data from up to 96 fresh or fixed samples in a cheaper and more broadly applicable manner6.
Following up on the initial MULTI-seq phenotyping with paired scRNA-seq and scATAC-seq is suggested and showcased here to gain an understanding of the genomic organization that coincides with the transcriptomic data1. This reveals not only heterochromatin and euchromatin regions but also the transcription factor networks driving gene expression. A multi-omic approach can be used to reveal the dynamic chromatin changes that take place at key points of cell fate decisions and to determine cellular trajectories in development and disease1,21,22. This can be accomplished bioinformatically by subjecting the data to pseudotime, cis-regulatory interaction, and footprinting analyses5,23,24,25. Fixing the samples and sequencing multiple samples in a multiplexed run reduces sources of batch effect, enabling comparison across samples, such as in a time course experiment.
The number of tissue samples required depends on the scope of the experiment. When examining the phenotypes associated with embryonic time points, single retinas are often sufficient and can provide hundreds of thousands of cells. A single retina may not provide enough cells in more complex experimental schemes: ex vivo electroporations of retinal explants, probing for rare cell populations, or genetic models with insufficient Cre activation. In these cases, a single retina may provide only a few hundred cells, which will not capture the full tissue complexity. For such experiments, optimization will be required based on a researcher's needs. Utilizing these techniques, one can gain insight into how the genes analyzed in MULTI-seq are regulated during dynamic processes such as development and disease.
Gene regulatory networks lie at the heart of understanding cellular processes and how they contribute to development and disease1,26,27. The workflow presented here can be used to identify these gene regulatory networks in specific cell types. However, this protocol has been optimized for use with mouse retinal tissue. Optimization of various steps, such as lysis or dissociation time, centrifugation speeds and times, and the number of filtration steps, may be required to maximize the number of cells or nuclei and minimize the cell debris in cell suspensions from other tissue types, ex vivo samples, or species, whether samples are fresh, frozen, or fixed in methanol. Methanol fixation time may need to be increased if samples show high levels of viable cells with trypan blue staining. The MULTI-seq technique introduces many additional wash steps over traditional scRNA-seq. To avoid the accidental disposal of valuable cells or DNA, it is prudent to optimize centrifugation speeds and to keep on ice the supernatants that would normally be discarded in the cell barcoding, post-GEM RT cleanup, and barcode library construction steps until each step has been verified to be successful. One limitation of MULTI-seq is the number of cells, and therefore samples, that can be sequenced from a single well in the GEM generation step. Overloading cells during this step is not recommended, because it substantially increases doublet cell capture. Avoid loading a GEM well with more than 20,000 cells. Instead, multiple wells can be prepared from a single combined suspension and multiplexed during sequencing. This requires preparing a large enough volume of cell suspension for GEM generation and adding replicates to steps 5, 6, and 7, and it increases the cost of sequencing because more total reads are needed for more cells.
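The loading guidance above reduces to simple arithmetic, sketched here in Python. The 20,000-cell ceiling per GEM well comes from the text; the function names are hypothetical, and the estimate of barcode-detectable doublets assumes that samples are pooled in equal proportions and that cells pair at random, which real preparations only approximate.

```python
import math

MAX_CELLS_PER_WELL = 20_000  # loading ceiling recommended in the text

def wells_needed(total_cells, max_per_well=MAX_CELLS_PER_WELL):
    """Number of GEM wells required to stay at or below the per-well ceiling."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    return math.ceil(total_cells / max_per_well)

def detectable_doublet_fraction(n_samples):
    """Fraction of doublets that carry two different sample barcodes.

    Assumes equal pooling and random pairing: the second cell of a doublet
    comes from the same sample with probability 1 / n_samples, and such
    same-sample doublets cannot be detected from barcodes alone.
    """
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    return 1.0 - 1.0 / n_samples

print(wells_needed(50_000))                      # 3
print(round(detectable_doublet_fraction(8), 3))  # 0.875
```

Under these assumptions, pooling more samples raises the share of doublets that barcodes can flag, but it does not lower the doublet rate itself, which is why the per-well ceiling still applies.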
With proper optimization, this workflow will enable cost and time efficient identification of cell type-specific phenotypes and gene regulatory networks.
The authors have nothing to disclose.
We thank Linda Orzolek from the Johns Hopkins Transcriptomics and Deep Sequencing Core for help in sequencing the produced libraries and Lizhi Jiang for performing the ex vivo retinal explants.
Name | Company | Catalog Number | Comments |
10 µL, 200 µL, 1000 µL pipette filter tips | | | |
10% Tween 20 | Bio-Rad | 1662404 | |
100 µM Barcode Solution | Request from Gartner lab | https://docs.google.com/forms/d/1bAzXFEvDEJse_cMvSUe_yDaPrJpAau4IPx8m5pauj3w/viewform?ts=5c47a897&edit_requested=true | |
100% Ethanol | Millipore Sigma | E7023-500ML | |
100% Methanol | Millipore Sigma | 322415-100ML | |
10x Chip Holder | 10x Genomics | 1000195 | |
10x Chromium controller & Accessory Kit | 10x Genomics | PN-120263 | |
15 mL Centrifuge Tube | Quality Biological | P886-229411 | |
40 µm FlowMi Cell Strainer | Bel-Art | H13680-0040 | |
50 µM Anchor Solution | Sigma or request from Gartner lab | https://docs.google.com/forms/d/1bAzXFEvDEJse_cMvSUe_yDaPrJpAau4IPx8m5pauj3w/viewform?ts=5c47a897&edit_requested=true | |
50 µM Co-Anchor Solution | Sigma or request from Gartner lab | https://docs.google.com/forms/d/1bAzXFEvDEJse_cMvSUe_yDaPrJpAau4IPx8m5pauj3w/viewform?ts=5c47a897&edit_requested=true | |
5200 Fragment Analyzer system | Agilent | M5310AA | |
70 µm FlowMi Cell Strainer | Bel-Art | H13680-0070 | |
Allegra X-12R Centrifuge | VWR | BK392302 | |
Bovine Serum Albumin | Sigma-Aldrich | A9647 | |
Chromium Next GEM Chip G | 10x Genomics | PN-1000120 | |
Chromium Next GEM Chip H | 10x Genomics | PN-1000161 | |
Chromium Next GEM Single Cell ATAC Reagent Kit v1.1 | 10x Genomics | PN-1000175 | |
Chromium Single Cell 3' GEM, Library & Gel Bead Kit v3.1 | 10x Genomics | PN-1000121 | |
Digitonin | Fisher Scientific | BN2006 | |
Dissection microscope | Leica | | |
DNA LoBind Tubes, 1.5 mL | Eppendorf | 22431021 | |
Dry Ice | | | |
EVA Foam Ice Pan | Tequipment | 04393-54 | |
FA 12-Capillary Array Short, 33 cm | Agilent | A2300-1250-3355 | |
Fisherbrand Isotemp Water Bath | Fisher Scientific | 15-460-20Q | |
Forma CO2 Water Jacketed Incubator | ThermoFisher Scientific | 3110 | |
Glycerol 50% Aqueous solution | Ricca Chemical Company | 3290-32 | |
Hausser Scientific Bright-Line Counting Chamber | Fisher Scientific | 02-671-51B | |
Illumina NextSeq or NovaSeq | Illumina | | |
Kapa Hifi Hotstart ReadyMix | HiFi | 7958927001 | |
Low TE Buffer | Quality Biological | 351-324-721 | |
Magnesium Chloride Solution 1 M | Sigma-Aldrich | M1028 | |
Magnetic Separator Rack for 1.5 mL tubes | Millipore Sigma | 20-400 | |
Magnetic Separator Rack for 200 µL tubes | 10x Genomics | NC1469069 | |
MULTI-seq Primer | Sigma or IDT | See sequence list | |
MyFuge Mini Centrifuge | Benchmark Scientific | C1008 | |
Nonidet P40 Substitute | Sigma-Aldrich | 74385 | |
Nuclease-free water | Fisher Scientific | AM9937 | |
P2, P10, P20, P200, P1000 micropipettes | Eppendorf | | |
Papain Dissociation System | Worthington Biochemical Corporation | LK003150 | |
PBS pH 7.4 (1X) | Fisher Scientific | 10010-023 | |
Qiagen Buffer EB | Qiagen | 19086 | |
Refrigerated Centrifuge 5424 R | Eppendorf | 2231000655 | |
RNase-free Disposable Pellet Pestles | Fisher Scientific | 12-141-368 | |
RNasin Plus RNase Inhibitor | Promega | N2615 | |
RPI primer | Sigma or IDT | See sequence list | |
Single Index Kit N, Set A | 10x Genomics | PN-1000212 | |
Single Index Kit T Set A | 10x Genomics | PN-1000213 | |
Sodium Chloride Solution 5 M | Sigma-Aldrich | 59222C | |
SPRIselect Reagent Kit | Beckman Coulter | B23318 | |
Standard Disposable Transfer Pipettes | Fisher Scientific | 13-711-7M | |
TempAssure PCR 8-tube strip | USA Scientific | 1402-4700 | |
Trizma Hydrochloride Solution, pH 7.4 | Sigma-Aldrich | T2194 | |
Trypan Blue Solution, 0.4% (w/v) | Corning | 25-900-CI | |
Universal I5 primer | Sigma or IDT | See sequence list | |
Veriti Thermal Cycler | Applied Biosystems | 4375786 | |
Vortex Mixer | VWR | 10153-838 |