Here, we present a protocol to perform an assay for transposase-accessible chromatin sequencing (ATAC-seq) on activated CD4+ human lymphocytes. The protocol has been modified to reduce contaminating mitochondrial DNA reads from an average of 50% to an average of 3% through the introduction of a new lysis buffer.
ATAC-seq has become a widely used methodology in the study of epigenetics due to its rapid and simple approach to mapping genome-wide accessible chromatin. In this paper we present an improved ATAC-seq protocol that reduces contaminating mitochondrial DNA reads. While previous ATAC-seq protocols have struggled with an average of 50% contaminating mitochondrial DNA reads, the optimized lysis buffer introduced in this protocol reduces mitochondrial DNA contamination to an average of 3%. This improved ATAC-seq protocol allows for a near 50% reduction in the sequencing cost. We demonstrate how these high-quality ATAC-seq libraries can be prepared from activated CD4+ lymphocytes, providing step-by-step instructions from CD4+ lymphocyte isolation from whole blood through data analysis. This improved ATAC-seq protocol has been validated in a wide range of cell types and will be of immediate use to researchers studying chromatin accessibility.
The assay for transposase-accessible chromatin sequencing (ATAC-seq) has rapidly become the leading method for interrogating chromatin architecture. ATAC-seq can identify regions of accessible chromatin through the process of tagmentation, the fragmenting and tagging of DNA by the same enzyme, to produce libraries with which sequencing can determine chromatin accessibility across an entire genome. This tagmentation process is mediated by the hyperactive Tn5 transposase, which only cuts open regions of chromatin due to nucleosomic steric hindrance. As it cuts, the Tn5 transposase also inserts sequencing adapters that allow for rapid library construction by PCR and next-generation sequencing of genome-wide accessible chromatin1,2.
ATAC-seq has become the preferred method to determine regions of chromatin accessibility due to the relatively simple and fast protocol, quality and range of information that can be determined from its results, and small amount of starting material required. Compared to DNase-seq3 (which also measures genome-wide chromatin accessibility), MNase-seq4 (which determines nucleosome positions in open genome regions), and the formaldehyde-mediated FAIRE-seq5, ATAC-seq is faster, cheaper, and more reproducible1. It is also more sensitive, working with starting material of as few as 500 nuclei, compared to the 50 million nuclei required for DNase-seq3. ATAC-seq also has the ability to provide more information about chromatin architecture than other methods, including regions of transcription factor binding, nucleosome positioning, and open chromatin regions1. Effective, single-cell ATAC-seq protocols have been validated, providing information on chromatin architecture at the single-cell level6,7.
ATAC-seq has been used to characterize chromatin architecture across a wide spectrum of research and cell types, including plants8, humans9, and many other organisms. It has also been critical in identifying epigenetic regulation of disease states7. However, the most widely used ATAC-seq protocol has the major drawback of contaminating sequencing reads from mitochondrial DNA. In some data sets, this contamination level can be as high as 60% of sequencing results1. There is a concerted effort in the field to reduce these contaminating mitochondrial reads in order to allow for the more efficient application of ATAC-seq7,10,11. Here we present an improved ATAC-seq protocol that reduces the mitochondrial DNA contamination rate to just 3%, allowing a reduction of around 50% in sequencing costs10. This is made possible by a streamlined process of CD4+ lymphocyte isolation and activation and an improved lysis buffer that is critical in minimizing mitochondrial DNA contamination.
This modified ATAC-seq protocol has been validated with a wide range of primary cells, including human primary peripheral blood mononuclear cells (PBMCs)10, human primary monocytes, and mouse dendritic cells (unpublished). It has also been used successfully to interrogate melanoma cell lines in a clustered regularly interspaced short palindromic repeats (CRISPR) screen of non-coding elements11. Additionally, the data analysis package described in this protocol and provided on GitHub gives new and experienced researchers tools to analyze ATAC-seq data. ATAC-seq is the most effective assay to map chromatin accessibility across an entire genome, and the modifications to the existing protocol that are introduced here will allow researchers to produce high-quality data with low mitochondrial DNA contamination, reducing sequencing costs and improving ATAC-seq throughput.
This improved protocol provides step-by-step instructions for performing ATAC-seq of CD4+ lymphocytes, from the starting material of whole blood through data analysis (Figure 1).
1. Isolation of CD4+ T Cells from Whole Blood
NOTE: The starting material for this protocol is 15 mL of fresh whole blood collected using standard procedures, allowing the source of the starting material to be selected based on research requirements. Scale the protocol as needed. Pre-warm phosphate buffered saline (PBS) + 2% fetal calf serum (FCS) to room temperature (RT) and adjust the centrifuge to RT before starting the CD4+ T cell enrichment procedure.
2. Activate and Purify CD4+ T Cells
NOTE: This rapid protocol for activating and purifying CD4+ T cells only requires 48 h and results in 95% viable, activated CD4+ T cells. Cool the centrifuge to 4 °C before beginning the protocol.
3. ATAC-seq
NOTE: In this step, nuclei are isolated from activated CD4+ T cells for ATAC-seq. The lysis buffer used in this protocol has been improved to be gentler on the nuclei, resulting in more efficient digestion and higher quality results. All centrifugation steps in section 3 are performed with a fixed-angle centrifuge maintained at 4 °C. Cool the centrifuge before beginning the protocol.
4. ATAC-seq Library Quality Analysis
NOTE: It is important to validate the quality and quantity of ATAC-seq libraries before next-generation sequencing. The quality and quantity of the libraries should be assessed using commercially available kits (see Table of Materials).
5. Sequencing and Data Analysis
NOTE: This analysis pipeline allows users to control the quality of the read mapping procedure, adjust coordinates for the experimental design, and call peaks for downstream analysis. The following are the command lines and an explanation of their execution. The data analysis package is available at https://github.com/s18692001/bulk_ATAC_seq.
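A common convention in ATAC-seq analysis is to offset aligned reads to account for the 9 bp duplication created by Tn5 insertion: reads on the plus strand are shifted by +4 bp and reads on the minus strand by -5 bp. The sketch below illustrates that convention on BED-style intervals. Whether the coordinate adjustment in the linked package uses exactly these offsets is an assumption, and `Read` and `shift_tn5` are hypothetical names introduced for illustration, not functions from the package.

```python
from typing import NamedTuple


class Read(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


# Offsets conventionally applied to ATAC-seq reads (assumed, not taken from the package).
PLUS_STRAND_OFFSET = 4
MINUS_STRAND_OFFSET = 5


def shift_tn5(read: Read) -> Read:
    """Shift a read by +4 bp (plus strand) or -5 bp (minus strand)."""
    if read.strand == "+":
        delta = PLUS_STRAND_OFFSET
    elif read.strand == "-":
        delta = -MINUS_STRAND_OFFSET
    else:
        raise ValueError(f"unknown strand: {read.strand!r}")
    # Clamp at zero so reads near the chromosome start stay valid.
    new_start = max(0, read.start + delta)
    new_end = max(new_start, read.end + delta)
    return read._replace(start=new_start, end=new_end)
```

In practice this adjustment is usually applied to a whole alignment file by an established tool rather than read by read; the function only makes the arithmetic explicit.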
From 15 mL of fresh whole blood, this protocol generates an average of 1 million CD4+ T cells. These can be frozen for later processing or used immediately. Viability of the CD4+ T cells, fresh or thawed, was consistently >95%. This method of CD4+ T cell isolation allows for flexibility in source material and collection time. This improved ATAC-seq protocol produces a final library of greater than 1 ng/µL for sequencing. Quality control performed using commercially available systems should demonstrate DNA fragments between 200 and 1,000 bp (Figure 2). Sequencing should only be performed with high quality libraries.
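Sequencing facilities often ask for library concentration in molar units rather than ng/µL. The following sketch converts between the two using the standard approximation of 660 g/mol per base pair of double-stranded DNA, and checks a library against the two thresholds stated above (more than 1 ng/µL, fragments between 200 and 1,000 bp). The function names and the example mean fragment length are hypothetical and serve only as an illustration.

```python
AVG_MASS_PER_BP = 660.0        # g/mol per base pair of dsDNA (standard approximation)
MIN_CONC_NG_PER_UL = 1.0       # threshold stated in the text
SIZE_RANGE_BP = (200, 1000)    # fragment range stated in the text


def library_molarity_nm(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a dsDNA concentration in ng/uL to nM for a given mean fragment length."""
    if mean_fragment_bp <= 0:
        raise ValueError("mean fragment length must be positive")
    return conc_ng_per_ul * 1e6 / (AVG_MASS_PER_BP * mean_fragment_bp)


def passes_basic_qc(conc_ng_per_ul: float, mean_fragment_bp: float) -> bool:
    """Check concentration and mean fragment size against the stated thresholds."""
    low, high = SIZE_RANGE_BP
    return conc_ng_per_ul > MIN_CONC_NG_PER_UL and low <= mean_fragment_bp <= high


# Example with a hypothetical library: 1 ng/uL at a mean length of 500 bp is about 3 nM.
example_nm = library_molarity_nm(1.0, 500)
```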
All libraries were sequenced to an average depth of greater than 40 million reads per sample. While commonly used ATAC-seq protocols have been challenged by contaminating sequencing reads from mitochondrial DNA that can account for 50%-60% of the total sequencing reads1, this improved protocol largely resolves the issue. Libraries prepared following this protocol contain on average only 3% mitochondrial reads (Figure 3A). The high percentage of usable reads is consistent across biological replicates (Figure 3B). The protocol provided highly reproducible results across technical replicates (Figure 4A, B) as well as biological replicates (Figure 4C, D). Additionally, the protocol for CD4+ T cell activation presented here takes 48 h rather than one week or more and results in consistent and efficient activation, as demonstrated by reproducible sequencing results (Figure 4A, B). Predicted ATAC-seq peaks are accurately called by the analysis pipeline (Figure 4E). Analysis of sequencing results identified clear changes in chromatin state during human T cell activation. Differentially accessible regions of open chromatin were identified between six samples before and after 48 h of activation (Figure 5).
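The link between the mitochondrial read fraction and sequencing cost can be made explicit with a short calculation. The sketch below assumes that cost scales with total sequencing depth and that only mitochondrial reads are lost (duplicates and unmapped reads are ignored), which is a simplification. The target of 20 million non-mitochondrial reads is a hypothetical figure chosen for illustration.

```python
def total_reads_needed(target_nuclear_reads: float, mito_fraction: float) -> float:
    """Total reads required so that the non-mitochondrial portion reaches the target."""
    if not 0.0 <= mito_fraction < 1.0:
        raise ValueError("mito_fraction must be in [0, 1)")
    return target_nuclear_reads / (1.0 - mito_fraction)


def depth_reduction(old_mito_fraction: float, new_mito_fraction: float) -> float:
    """Fractional saving in total depth when the mitochondrial fraction drops."""
    return 1.0 - (1.0 - old_mito_fraction) / (1.0 - new_mito_fraction)


# Hypothetical target of 20 million non-mitochondrial reads per sample.
reads_at_50_percent = total_reads_needed(20e6, 0.50)  # 40 million total reads
reads_at_3_percent = total_reads_needed(20e6, 0.03)   # about 20.6 million total reads
saving = depth_reduction(0.50, 0.03)                  # about 0.48
```

Under these assumptions, moving from 50% to 3% mitochondrial reads lowers the required depth by roughly 48%, in line with the near 50% reduction in sequencing cost described in the text.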
Figure 1: Experimental overview of the modified ATAC-seq protocol. (A) Sample acquisition and processing, from 15 mL of patient whole blood through CD4+ T cell isolation, plating and activation of the T cells, and nuclei isolation with the improved lysis buffer. (B) The transposase reaction and PCR amplification of the sequencing library. (C) Quality analysis, sequencing, and data analysis.
Figure 2: Representative high quality ATAC-seq libraries from a microfluidics-based platform for sizing, quantification and quality control of DNA. (A) Electronic gel image of samples B1 and D1 with banding between 200 and 1,000 base pairs. (B and C) Electropherogram trace result of samples B1 (B) and D1 (C), with peaks between 200 and 1,000 base pairs.
Figure 3: Decreased mitochondrial DNA contamination with the improved ATAC-seq protocol results in an increase in usable DNA sequencing reads. (A) Comparison of usable reads (purple), duplicate reads (green), and mitochondrial reads (red) from CD4+ T cell ATAC-seq profiling in the literature. (B) Comparison of usable reads (purple), duplicate reads (green), mitochondrial reads (red), and unmapped reads (blue) of CD4+ T cell ATAC-seq profiling from multiple healthy individuals (n = 22). This figure has been modified from Cheng et al.10.
Figure 4: Improved ATAC-seq protocol reproducibility and accuracy. Scatter plots of chromatin accessibility (ATAC-seq signal, x- and y-axes) for two replicate experiments of unstimulated (A; 36,486 Th peaks) or activated (B; 52,154 Thstim peaks) T cells demonstrate technical reproducibility. Chromatin accessibility for activated T cells from individuals IGTB1191 (y-axis) and IGTB1190 (x-axis) (C) and a histogram of correlations between every pair of individuals for the 52,154 Thstim peaks (D) demonstrate reproducibility between individuals. (E) ATAC-seq peaks called with our improved ATAC-seq protocol at chromosome 19q13.12. This figure has been modified from Cheng et al.10.
Figure 5: Representative ATAC-seq results of changes in chromatin state after CD4+ T-cell activation. (A) Experimental overview (left) and nomenclature (right). (B) Differentially accessible regions of open chromatin (columns) in six samples (rows) before (top, Th) and after (bottom, Thstim) 48 h activation of primary CD4+ T cells. This figure has been modified from Cheng et al.10.
COMPONENT | VOLUME |
2x TD Buffer | 25 µL |
Tn5 Enzyme | 5 µL |
Nuclease Free Water | 20 µL |
Total Volume | 50 µL |
Table 1: Step 3.2 transposase reaction components.
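When several samples are processed together, the per-reaction volumes in Table 1 can be scaled into a master mix. The sketch below does this arithmetic; the 10% overage for pipetting loss is an assumption rather than a value from this protocol, and `master_mix` is a hypothetical helper name.

```python
# Per-reaction volumes in microliters, copied from Table 1.
TRANSPOSASE_REACTION_UL = {
    "2x TD Buffer": 25.0,
    "Tn5 Enzyme": 5.0,
    "Nuclease Free Water": 20.0,
}


def master_mix(n_samples: int, overage: float = 0.10) -> dict:
    """Scale Table 1 volumes for n_samples, adding a fractional overage (assumed 10%)."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    if overage < 0:
        raise ValueError("overage must not be negative")
    factor = n_samples * (1.0 + overage)
    return {name: round(volume * factor, 1)
            for name, volume in TRANSPOSASE_REACTION_UL.items()}


# Four samples with the assumed 10% overage.
mix_for_four = master_mix(4)
```

Each reaction still receives 50 µL of the mix, as in Table 1. The Tn5 amount per reaction is unchanged by this scaling.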
PRIMER | SEQUENCE |
Ad1_noMX | AATGATACGGCGACCACCGAGATCTACACTCGTCGGCAGCGTCAGATGTG |
Ad2.1_TAAGGCGA | CAAGCAGAAGACGGCATACGAGATTCGCCTTAGTCTCGTGGGCTCGGAGATGT |
Ad2.2_CGTACTAG | CAAGCAGAAGACGGCATACGAGATCTAGTACGGTCTCGTGGGCTCGGAGATGT |
Ad2.3_AGGCAGAA | CAAGCAGAAGACGGCATACGAGATTTCTGCCTGTCTCGTGGGCTCGGAGATGT |
Ad2.4_TCCTGAGC | CAAGCAGAAGACGGCATACGAGATGCTCAGGAGTCTCGTGGGCTCGGAGATGT |
Ad2.5_GGACTCCT | CAAGCAGAAGACGGCATACGAGATAGGAGTCCGTCTCGTGGGCTCGGAGATGT |
Ad2.6_TAGGCATG | CAAGCAGAAGACGGCATACGAGATCATGCCTAGTCTCGTGGGCTCGGAGATGT |
Ad2.7_CTCTCTAC | CAAGCAGAAGACGGCATACGAGATGTAGAGAGGTCTCGTGGGCTCGGAGATGT |
Ad2.8_CAGAGAGG | CAAGCAGAAGACGGCATACGAGATCCTCTCTGGTCTCGTGGGCTCGGAGATGT |
Ad2.9_GCTACGCT | CAAGCAGAAGACGGCATACGAGATAGCGTAGCGTCTCGTGGGCTCGGAGATGT |
Ad2.10_CGAGGCTG | CAAGCAGAAGACGGCATACGAGATCAGCCTCGGTCTCGTGGGCTCGGAGATGT |
Ad2.11_AAGAGGCA | CAAGCAGAAGACGGCATACGAGATTGCCTCTTGTCTCGTGGGCTCGGAGATGT |
Ad2.12_GTAGAGGA | CAAGCAGAAGACGGCATACGAGATTCCTCTACGTCTCGTGGGCTCGGAGATGT |
Ad2.13_GTCGTGAT | CAAGCAGAAGACGGCATACGAGATATCACGACGTCTCGTGGGCTCGGAGATGT |
Ad2.14_ACCACTGT | CAAGCAGAAGACGGCATACGAGATACAGTGGTGTCTCGTGGGCTCGGAGATGT |
Ad2.15_TGGATCTG | CAAGCAGAAGACGGCATACGAGATCAGATCCAGTCTCGTGGGCTCGGAGATGT |
Ad2.16_CCGTTTGT | CAAGCAGAAGACGGCATACGAGATACAAACGGGTCTCGTGGGCTCGGAGATGT |
Ad2.17_TGCTGGGT | CAAGCAGAAGACGGCATACGAGATACCCAGCAGTCTCGTGGGCTCGGAGATGT |
Ad2.18_GAGGGGTT | CAAGCAGAAGACGGCATACGAGATAACCCCTCGTCTCGTGGGCTCGGAGATGT |
Ad2.19_AGGTTGGG | CAAGCAGAAGACGGCATACGAGATCCCAACCTGTCTCGTGGGCTCGGAGATGT |
Ad2.20_GTGTGGTG | CAAGCAGAAGACGGCATACGAGATCACCACACGTCTCGTGGGCTCGGAGATGT |
Ad2.21_TGGGTTTC | CAAGCAGAAGACGGCATACGAGATGAAACCCAGTCTCGTGGGCTCGGAGATGT |
Ad2.22_TGGTCACA | CAAGCAGAAGACGGCATACGAGATTGTGACCAGTCTCGTGGGCTCGGAGATGT |
Ad2.23_TTGACCCT | CAAGCAGAAGACGGCATACGAGATAGGGTCAAGTCTCGTGGGCTCGGAGATGT |
Ad2.24_CCACTCCT | CAAGCAGAAGACGGCATACGAGATAGGAGTGGGTCTCGTGGGCTCGGAGATGT |
Table 2: ATAC-seq oligo designs used for PCR.
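In Table 2, the name of each Ad2 primer carries an 8-nt index, and the primer sequence contains the reverse complement of that index between two constant flanking regions. The sketch below checks this relationship for three entries copied from the table, which can help catch transcription errors when ordering oligos or filling in a sample sheet. The helper names are hypothetical, and which index orientation a given sequencer expects is not addressed here.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Three entries copied from Table 2 (primer name -> primer sequence).
TABLE2_SUBSET = {
    "Ad2.1_TAAGGCGA": "CAAGCAGAAGACGGCATACGAGATTCGCCTTAGTCTCGTGGGCTCGGAGATGT",
    "Ad2.2_CGTACTAG": "CAAGCAGAAGACGGCATACGAGATCTAGTACGGTCTCGTGGGCTCGGAGATGT",
    "Ad2.24_CCACTCCT": "CAAGCAGAAGACGGCATACGAGATAGGAGTGGGTCTCGTGGGCTCGGAGATGT",
}


def index_matches_primer(name: str, primer: str) -> bool:
    """True if the reverse complement of the index in the name occurs in the primer."""
    index = name.split("_", 1)[1]
    return reverse_complement(index) in primer


# Map each primer name to whether its sequence agrees with its index.
checks = {name: index_matches_primer(name, seq) for name, seq in TABLE2_SUBSET.items()}
```

The check is a simple substring test and does not verify the flanking sequences themselves.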
COMPONENT | VOLUME |
Nuclease Free Water | 11.9 µL |
100 µM Custom Nextera Primer 1 (Table 2) | 0.6 µL |
NEBNext High-Fidelity 2x PCR Master Mix | 25 µL |
ATAC-Seq Library | 10 µL |
25 µM Custom Nextera Primer 2 (Table 2) | 2.5 µL |
Total Volume | 50 µL |
Table 3: Step 3.5 initial PCR reaction mix.
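The two primers in Table 3 are added from stocks of different concentration (100 µM and 25 µM) and in different volumes. A quick dilution calculation, sketched below, shows that they end up at nearly the same final concentration in the 50 µL reaction. The data structure and function name are hypothetical; the volumes and stock concentrations are those listed in Table 3.

```python
# (component, volume in uL, stock concentration in uM or None), copied from Table 3.
PCR_MIX = [
    ("Nuclease Free Water", 11.9, None),
    ("Custom Nextera Primer 1", 0.6, 100.0),
    ("NEBNext High-Fidelity 2x PCR Master Mix", 25.0, None),
    ("ATAC-Seq Library", 10.0, None),
    ("Custom Nextera Primer 2", 2.5, 25.0),
]


def final_concentrations_um(mix):
    """Final concentration (uM) of each component that has a stock concentration."""
    total_ul = sum(volume for _, volume, _ in mix)
    return {name: stock * volume / total_ul
            for name, volume, stock in mix if stock is not None}


total_volume_ul = sum(volume for _, volume, _ in PCR_MIX)  # 50 uL, matching Table 3
primer_um = final_concentrations_um(PCR_MIX)               # about 1.2 uM and 1.25 uM
```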
CYCLE STEP | TEMPERATURE | TIME | CYCLES |
Extension | 72 °C | 5 min | 1 |
Initial Denaturation | 98 °C | 30 s | 1 |
Denaturation | 98 °C | 10 s | 10 |
Annealing | 63 °C | 30 s | |
Extension | 72 °C | 1 min | |
Hold | 4 °C | Infinity | 1 |
Table 4: Step 3.5 initial PCR amplification cycling program.
COMPONENT | VOLUME |
PCR Reaction Aliquot | 5 µL |
PCR Cocktail from Table 3 with 0.6x SYBR Green | 10 µL |
Table 5: Step 3.6 qPCR reaction mix.
CYCLE STEP | TEMPERATURE | TIME | CYCLES |
Initial Denaturation | 98 °C | 30 s | 1 |
Denaturation | 98 °C | 10 s | 20 |
Annealing | 63 °C | 30 s | |
Extension | 72 °C | 1 min | |
Hold | 4 °C | Infinity | 1 |
Table 6: Step 3.6 qPCR cycling program.
CYCLE STEP | TEMPERATURE | TIME | CYCLES |
Initial Denaturation | 98 °C | 30 s | 1 |
Denaturation | 98 °C | 10 s | As Determined |
Annealing | 63 °C | 30 s | |
Extension | 72 °C | 1 min | |
Hold | 4 °C | Infinity | 1 |
Table 7: Step 3.7 PCR cycling program for final PCR amplification.
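Table 7 leaves the number of final amplification cycles to be determined from the qPCR run. A heuristic commonly used with ATAC-seq libraries is to take the cycle at which the qPCR fluorescence reaches about one-third of its maximum. The sketch below implements that heuristic; the one-third fraction is an assumption, since the threshold is not stated in this section, and the fluorescence values are invented for illustration only.

```python
def additional_cycles(fluorescence, fraction=1.0 / 3.0):
    """Return the first 1-based cycle whose signal reaches `fraction` of the
    baseline-subtracted maximum of the qPCR amplification curve."""
    if not fluorescence:
        raise ValueError("fluorescence curve is empty")
    baseline = min(fluorescence)
    span = max(fluorescence) - baseline
    if span <= 0:
        raise ValueError("fluorescence curve is flat")
    threshold = baseline + fraction * span
    for cycle, value in enumerate(fluorescence, start=1):
        if value >= threshold:
            return cycle


# Hypothetical raw fluorescence readings, one per qPCR cycle.
curve = [100, 102, 105, 115, 140, 200, 320, 520, 760, 940, 1000, 1010]
n_extra = additional_cycles(curve)
```

The function returns whole cycles without interpolation. When the threshold falls between two cycles it returns the later one, so rounding down may be preferable if over-amplification is a concern.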
The modified ATAC-seq protocol presented in this article provides reproducible results with minimal mitochondrial DNA contamination. The protocol has been used to successfully characterize chromatin architecture of human primary PBMCs10, human monocytes, mouse dendritic cells (unpublished), and cultured melanoma cell lines11. We anticipate this improved lysis condition has the potential to work for other cell types as well. It is also anticipated that this nuclei isolation protocol will be compatible with single nuclei ATAC-seq protocols, minimizing mitochondrial DNA contamination to improve sequencing results.
An additional benefit of this modified protocol is the ability to freeze batches of isolated CD4+ T cells from PBMCs at different times depending on the availability of patient samples. As ATAC-seq can then be performed concurrently on all samples, potential batch effect bias in the transposase reaction and sequencing is minimized10. It is critical that freezing medium be made fresh with each use, in order to maintain the high viability that is achieved with this freeze-thaw protocol. Viability of isolated CD4+ T cells should remain above 90% in order to avoid non-specific digestion in the transposase reaction1.
Please note that further optimization may be required in order to use this protocol with other cell types and quantities. Since the Tn5 transposase-to-nuclei ratio has been optimized for 500,000 nuclei in this protocol, the amount of Tn5 transposase should be adjusted accordingly when performing ATAC-seq with a different number of nuclei. Over-digestion of chromatin due to an excess of Tn5 may lead to high background from closed chromatin and low complexity of sequencing libraries, while under-digestion may not provide a complete PCR-amplified library1. In order to avoid these complications, it is advised to count nuclei carefully and optimize the Tn5 ratio as necessary. To further improve data quality, it is advised to optimize the number of PCR amplification cycles by qPCR monitoring. If the final library undergoes too many amplification cycles, bias may be introduced in the sequencing data2. It is recommended to perform proper quality control of the ATAC-seq libraries prior to next-generation sequencing in order to save time and money.
As we have demonstrated, a modified lysis buffer is key to the reduction of mitochondrial DNA contamination and is effective on a wide range of cell types. Other protocols have addressed the issue of mitochondrial contamination with alternative lysis buffers7,19 or by the intensive process of a CRISPR-mediated mitochondrial DNA depletion20. Our alternative ATAC-seq protocol contributes to the effort to decrease mitochondrial DNA contamination of sequencing reads with an improved lysis buffer and conservation of the simplicity of ATAC-seq that makes it such an accessible technique. Together with RNA-seq and single-cell sequencing, ATAC-seq is a powerful tool for exploring epigenetic regulation. This improved ATAC-seq protocol and data analysis package will help decrease sequencing costs and produce higher quality results.
The authors have nothing to disclose.
We thank Atsede Siba for technical support. C.S.C. is supported by NIH grant 1R61DA047032.
Name | Company | Catalog Number | Comments |
1X PBS, Sterile | Gibco | 10010023 | Can use other comparable products. |
5810/5810 R Swing Bucket Refrigerated Centrifuge with 50 mL, 15 mL, and 1.5 mL Tube Buckets | Eppendorf | 22625501 | Can use other comparable products. |
96 Well Round Bottom Plate | Thermo Scientific | 163320 | Can use other comparable products. |
Agilent 4200 Tape Station System | Agilent | G2991AA | Suggested for quality assessment. |
Cryotubes | Thermo Scientific | 374081 | Can use other comparable products. |
DMSO | Sigma | D8418 | Can use other comparable products. |
Dynabeads Human T-Activator CD3/CD28 | Invitrogen Life Technologies | 11131D | Critical Component |
Dynabeads Untouched Human CD4 T Cells Kit | Invitrogen Life Technologies | 11346D | Critical Component |
Dynamagnet | Invitrogen Life Technologies | 12321D | Critical Component |
FCS | Gemini Bio Products | 100-500 | Can use other comparable products. |
High Sensitivity DNA Kit | Agilent | 50674626 | Suggested for quality assessment. |
Magnesium Chloride, Hexahydrate, Molecular Biology Grade | Sigma | M2393 | Can use other comparable products. |
MinElute PCR Purification Kit (Buffer PB, Buffer PE, Elution Buffer) | Qiagen | 28004 | Critical Component |
Mr. Frosty | Thermo Scientific | 5100-0001 | Can use other comparable products. |
NaCl, Molecular Biology Grade | Sigma | S3014 | Can use other comparable products. |
NEBNext High Fidelity 2X PCR Master Mix | New England BioLabs | M0541 | Critical Component |
Nextera DNA Library Preparation Kit (2X TD Buffer, Tn5 Enzyme) | Illumina | FC1211030 | Critical Component |
Nuclease Free Sterile dH2O | Gibco | 10977015 | Can use other comparable products. |
Polysorbate 20 (Tween20) | Sigma | P9416 | Can use other comparable products. |
PowerUp SYBR Green Master Mix | Applied Biosystems | A25780 | Critical Component |
Precision Water Bath | Thermo Scientific | TSGP02 | Can use other comparable products. |
QIAquick PCR Purification Kit | Qiagen | 28104 | Critical Component |
Qubit dsDNA HS Assay Kit | Invitrogen Life Technologies | Q32851 | Suggested for quality assessment. |
Qubit Fluorometer | Invitrogen Life Technologies | Q33226 | Suggested for quality assessment. |
Rosette Sep Human CD4+ Density Medium | Stem Cell Technologies | 15705 | Critical Component |
Rosette Sep Human CD4+ Enrichment Cocktail | Stem Cell Technologies | 15022 | Critical Component |
RPMI-1640 | Gibco | 11875093 | Can use other comparable products. |
Sterile Reservoir | Thermo Scientific | 8096-11 | Can use other comparable products. |
T100 Thermocycler with Heated Lid | BioRad | 1861096 | Can use other comparable products. |
Tris-HCl | Sigma | T5941 | Can use other comparable products. |