The current protocol describes a method for DNA isolation from blood samples and intestinal biopsies, generation of TCRβ and IGH PCR libraries for next-generation sequencing, performance of an NGS run and basic data analysis.
Immunological memory, the hallmark of adaptive immunity, is orchestrated by T and B lymphocytes. In circulation and different organs, there are billions of unique T and B cell clones, and each one can bind a specific antigen, leading to proliferation, differentiation and/or cytokine secretion. The vast heterogeneity in T and B cells is generated by random recombination of different genetic segments. Next-generation sequencing (NGS) technologies, developed in the last decade, enable an unprecedented in-depth view of the T and B cell receptor immune repertoire. Studies in various inflammatory conditions, immunodeficiencies, infections and malignancies have demonstrated marked changes in clonality, gene usage, and biophysical properties of the immune repertoire, providing important insights about the role of adaptive immune responses in different disorders.
Here, we provide a detailed protocol for NGS of the immune repertoire of T and B cells from blood and tissue. We present a pipeline starting from DNA isolation, through library preparation and sequencing on an NGS sequencer, and ending with basic analyses. This method enables exploration of specific T and B cells at the nucleotide or amino acid level, and thus can identify dynamic changes in lymphocyte populations and diversity parameters in different diseases. This technique is slowly entering clinical practice and has the potential for identification of novel biomarkers, risk stratification and precision medicine.
The adaptive immune system, composed of T and B lymphocytes, utilizes immunological memory to recognize a previously encountered antigen and initiate a rapid response. Lymphocytes are generated in the bone marrow and mature in the thymus (T cells) or bone marrow (B cells). Both the T cell receptor (TCR) and B cell receptor (BCR) display unique configurations that allow recognition of specific antigens. In homeostasis, T and B cells constantly circulate and survey the trillions of different peptides presented on antigen-presenting cells. TCR or BCR ligation of a specific antigen with high affinity, together with appropriate co-stimulation, leads to cell activation, resulting in cytokine secretion, clonal expansion and, in the case of B cells, generation of antibodies.
The enormous array of different T or B cells is collectively termed the immune repertoire, and it enables recognition of countless different epitopes. To generate such a vast repertoire, a complex process of random assembly of different gene segments takes place, creating nearly endless combinations of receptors that can bind unique antigens1. This process, called V(D)J recombination, includes rearrangements of different variable (V), diversity (D) and joining (J) genes, accompanied by random deletions and insertions of nucleotides in the junctions2.
The architecture of the adaptive immune system has interested scientists in different fields for many decades. In the past, Sanger sequencing, complementarity-determining region 3 (CDR3) spectratyping, and flow cytometry were used to characterize the immune repertoire, but provided low resolution. In the last decade, advances in next-generation sequencing (NGS) methods have enabled in-depth insight into the characteristics and composition of an individual’s TCR and BCR repertoires3,4. These high-throughput systems (HTS) sequence and process millions of rearranged TCR or BCR products simultaneously and permit a high-resolution analysis of specific T and B cells at the nucleotide or amino acid level. NGS provides a new strategy to study the immune repertoire in both health and disease. Studies utilizing HTS have demonstrated altered TCR and BCR repertoires in autoimmune diseases5, primary immunodeficiencies6,7, and malignancies such as acute myeloid leukemia8. Using NGS, we and others have shown oligoclonal expansion of specific T and B cell clones in patients with inflammatory bowel disease (IBD), including ulcerative colitis and Crohn’s disease9,10,11,12,13,14. Overall, studies from different fields suggest that changes in the repertoire have a crucial role in the pathogenesis of immune-mediated disorders.
The current protocol describes a method for isolation of DNA from intestinal biopsies and blood, generation of TCRβ and IGH PCR libraries for NGS, and performance of a sequencing run. We also provide basic steps in immune repertoire data analysis. This protocol can be applied to the generation of TCRα, TCRγ, and IGL libraries as well. The method is also compatible with other organs and tissues (e.g., lymph nodes, tumors, synovial fluid, fat tissue) as long as tissue-specific digestion protocols are used.
This study was approved by the institutional review board at Sheba Medical Center, and informed written consent was obtained from all participating subjects.
1. DNA isolation and quantification
2. Library preparation
NOTE: The current protocol utilizes a multiplex, PCR-based assay kit compatible with NGS sequencers. The kit contains 24 different indices, each targeting conserved regions within the Vβ and the Jβ regions. This enables a one-step PCR reaction and pooling of different samples. See the Table of Materials for details.
3. Amplicon purification and quantification
4. Next-generation sequencing
5. Sequencing analysis
Herein, we describe a method for DNA isolation from intestinal tissue and blood, preparation of libraries for NGS, and basic steps of a sequencing run for immune repertoire sequencing. The run generates fastq files, which can be converted to fasta files for use in the international ImMunoGeneTics (IMGT)/HighV-QUEST platform. This high-throughput platform performs and manages many analyses of tens of thousands of rearranged TCRβ and IGH sequences at the nucleotide level15. IMGT/HighV-QUEST enables analysis of different TCR and IGH repertoires in both health and disease. This can lead to identification of new "disease-specific" clones, analysis of clonal expansion and diversity parameters, delineation of differential V(D)J usage, analysis of somatic hyper-mutations, and more. IMGT/HighV-QUEST provides CSV files, which contain specific sequences and their abundance. Using genomic DNA as a starting material yields sequence numbers that are representative of cell numbers. Thus, if the original DNA quantity is equal, the percentage of T cells in a given sample can also be calculated.
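As an illustration only, the fastq-to-fasta conversion mentioned above could be sketched in Python as follows. This is a minimal sketch and not part of the published protocol: it assumes the common four-line-per-record fastq layout (header, sequence, '+', quality), and the function name fastq_to_fasta is hypothetical.

```python
import io


def fastq_to_fasta(fastq_handle, fasta_handle):
    """Copy records from a 4-line-per-record FASTQ stream into FASTA.

    Assumption: sequences are not wrapped over several lines.
    Returns the number of records written.
    """
    count = 0
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of input
        header = header.strip()
        if not header:
            continue  # tolerate blank lines, e.g. at the end of the file
        if not header.startswith("@"):
            raise ValueError(f"Malformed FASTQ header: {header!r}")
        sequence = fastq_handle.readline().strip()
        fastq_handle.readline()  # '+' separator line, not needed for FASTA
        fastq_handle.readline()  # quality string, not needed for FASTA
        fasta_handle.write(">" + header[1:] + "\n" + sequence + "\n")
        count += 1
    return count


# Tiny in-memory demonstration with two made-up reads.
demo_in = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n")
demo_out = io.StringIO()
n_records = fastq_to_fasta(demo_in, demo_out)
```

The same function accepts ordinary file handles, and a gzip-compressed file can be read by opening it with gzip.open(path, "rt") from the Python standard library. Quality scores are discarded here, so any quality filtering would need to happen before this step.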
We include a basic analysis from representative, autologous blood and rectal samples of a patient with IL10 receptor deficiency and a history of severe infantile-onset IBD, resulting from a deleterious IL10RA mutation. Samples were assayed for both TCRβ and IGH repertoires. The intestinal TCRβ sample yielded a total of 12,450 sequences, of which 9,050 were unique. The blood TCRβ sample had a total of 54,880 sequences, of which 35,110 were unique. In the intestinal IGH sample, a total of 49,070 sequences were obtained, of which 23,670 were unique. In the blood IGH sample, a total of 13,710 sequences were obtained, of which 13,540 were unique. All clones can be identified by their unique sequence at either the nucleotide or amino acid level. These sequences can be compared between different patients, in search of shared clones, or in the same patient between different anatomical sites (e.g., blood vs. intestine). We present the 5 most frequent clones for each of the samples (Table 1).
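The counts described above (total sequences, unique sequences, most frequent clones, and clones shared between sites) can be tallied with a few lines of Python. The sketch below is illustrative only: it assumes each sample has already been reduced to a plain list of clone sequences, the function names are hypothetical, and the CDR3 strings are made-up toy data rather than patient results.

```python
from collections import Counter


def summarize_clones(sequences, top_n=5):
    """Return total reads, number of unique clones, and the top_n clones.

    Each entry of "top" is (sequence, read count, fraction of total reads).
    """
    counts = Counter(sequences)
    total = sum(counts.values())
    top = [(seq, n, n / total) for seq, n in counts.most_common(top_n)]
    return {"total": total, "unique": len(counts), "top": top}


def shared_clones(sample_a, sample_b):
    """Return the set of clone sequences present in both samples."""
    return set(sample_a) & set(sample_b)


# Toy amino acid CDR3 sequences (illustrative, not real data).
blood = ["CASSLGQGYEQYF"] * 3 + ["CASSPDRGNTEAFF"] * 2 + ["CASSQETQYF"]
intestine = ["CASSPDRGNTEAFF", "CASRTGGNQPQHF", "CASRTGGNQPQHF"]

blood_summary = summarize_clones(blood)
overlap = shared_clones(blood, intestine)
```

Clones are matched here by exact string identity, so the comparison can be run on either nucleotide or amino acid sequences, depending on which column is supplied.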
To quantify the degree of clonal expansion, different indices can be used, including Shannon’s H, Gini-Simpson, entropy and clonality. As an example, Shannon's H, which takes into account the number of unique sequences (richness of the repertoire) and how evenly they are distributed, was found to be decreased in the patient's intestinal IGH repertoire compared with blood (8.3 vs. 9.5), suggesting clonal expansion of B cells in the inflamed gut.
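For readers who wish to compute such indices themselves, a minimal Python sketch is given below. It uses the textbook definitions, H = -Σ p_i log p_i for Shannon's H and 1 - Σ p_i² for Gini-Simpson, where p_i is the frequency of clone i. Two points are assumptions rather than statements about the analysis reported here: the logarithm base (natural log by default, since the base behind the values above is not specified) and the definition of clonality as 1 - H / log(number of unique clones), which is one common convention among several.

```python
import math
from collections import Counter


def diversity_indices(sequences, log_base=math.e):
    """Compute Shannon's H, Gini-Simpson and clonality for one sample."""
    counts = Counter(sequences)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no sequences supplied")
    freqs = [n / total for n in counts.values()]

    # Shannon's H: sum of p * log(1/p), equivalent to -sum of p * log(p).
    shannon = sum(p * math.log(1.0 / p, log_base) for p in freqs)
    # Gini-Simpson: probability that two random reads come from different clones.
    gini_simpson = 1.0 - sum(p * p for p in freqs)

    richness = len(freqs)
    if richness > 1:
        # Assumed convention: 1 minus normalized Shannon entropy.
        clonality = 1.0 - shannon / math.log(richness, log_base)
    else:
        # Convention chosen for this sketch: a single clone is fully clonal.
        clonality = 1.0
    return {
        "richness": richness,
        "shannon": shannon,
        "gini_simpson": gini_simpson,
        "clonality": clonality,
    }


# Toy samples: four equally frequent clones vs. one dominant clone.
even_sample = diversity_indices(["A", "B", "C", "D"])
skewed_sample = diversity_indices(["A"] * 8 + ["B", "C", "D"])
```

In the toy data, the skewed sample has a lower Shannon's H and a higher clonality than the even sample, which is the same direction of change that is interpreted above as clonal expansion.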
For a broad overview of the repertoire, Treemap images (www.treemap.com) were generated (Figure 2). Each colored square represents a different clone, and the size correlates with its frequency. These Treemap images demonstrate clonal expansion in the intestinal IGH repertoire compared with the blood. In contrast, in the TCRβ repertoire, marked clonal expansion is observed in the blood, in comparison to autologous intestine.
At the gene level, NGS provides information regarding V, D and J usage at the level of family, gene, or allele, as shown in Table 1. Moreover, specific V(D)J combinations can be inferred from the data for TCRβ and IGH repertoires (Figure 3A,B), which can reveal differential gene usage patterns in various conditions. Biophysical properties of the CDR3 region such as length (Table 1) or hydrophobicity can also be analyzed. Importantly, CDR3 length distribution is altered in different immune-mediated disorders10,16,17. For example, in IL10R-deficient patients, blood-derived T cells, but not B cells, have shorter CDR3 length and differential hydrophobicity in comparison to healthy controls7.
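A simple way to tabulate V-J usage and CDR3 length distribution is sketched below in Python. This is a hedged illustration: the record layout (dictionaries with the keys "v", "j" and "cdr3") is hypothetical and does not reproduce the column names of any particular output file, the example records are toy data, and the name-collapsing rule assumes the usual gene*allele notation (e.g., TRBV20-1*01), so it will not handle every irregular gene name.

```python
from collections import Counter


def collapse(name, level="gene"):
    """Reduce an allele name such as 'TRBV20-1*01' to gene or family level."""
    if level == "allele":
        return name
    gene = name.split("*")[0]  # drop the allele suffix: 'TRBV20-1'
    if level == "gene":
        return gene
    if level == "family":
        return gene.split("-")[0]  # drop the gene number: 'TRBV20'
    raise ValueError(f"unknown level: {level!r}")


def vj_usage(records, level="gene"):
    """Count sequences for every (V, J) combination at the requested level."""
    return Counter(
        (collapse(r["v"], level), collapse(r["j"], level)) for r in records
    )


def cdr3_length_distribution(records):
    """Count sequences by CDR3 length (in residues of the supplied string)."""
    return Counter(len(r["cdr3"]) for r in records)


# Toy records (illustrative only).
records = [
    {"v": "TRBV20-1*01", "j": "TRBJ2-7*01", "cdr3": "CASSLGQGYEQYF"},
    {"v": "TRBV20-1*02", "j": "TRBJ2-7*01", "cdr3": "CASSQETQYF"},
    {"v": "TRBV5-1*01", "j": "TRBJ1-1*01", "cdr3": "CASSPDRGNTEAFF"},
]

usage_by_gene = vj_usage(records, level="gene")
usage_by_allele = vj_usage(records, level="allele")
length_counts = cdr3_length_distribution(records)
```

Counting at the gene level merges the two TRBV20-1 alleles in the toy data into one V-J combination, whereas counting at the allele level keeps them separate, which illustrates why the chosen level should be stated when usage plots are compared.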
Figure 1: Representative bioanalyzer data. Representative images of intestinal TCRβ samples showing optimal results (A) with the desired peak at 400 bp. An example of a low-quality library (B) with additional peaks at 179 bp and 114 bp (marked *). Peaks seen at the ends are the upper and lower markers of the electrophoresis strip.
Figure 2: Overview of T and B cell immune repertoire. Treemap diagrams of TCRβ and IGH blood and intestinal samples. Each colored square represents a different clone, and the size correlates with its frequency.
Figure 3: Representative graphs of TCRβ and IGH V-J usage. A representative intestinal sample depicting total number of sequences for each V-J combination in TCRβ (A) or IGH (B). Data are not shown for D usage.
Changes in abundance and function of B and T lymphocytes are often encountered in different malignancies18, chronic inflammatory disorders (e.g., ulcerative colitis and rheumatoid arthritis)10,19, and in various immunodeficiencies17,20. The current method utilizes NGS to facilitate an in-depth view of TCR and BCR repertoires, enabling detection of subtle changes in T and B cell clonality, sharing of clones, V(D)J gene usage, and information on the degree of somatic hyper-mutations in the case of B cells.
The method of DNA isolation was described for intestinal biopsies and blood samples. However, with modifications of lysis and DNA extraction, the method can be applied to other tissues such as tumors, lymph nodes and synovial fluid. During library preparation, it is important not to cross-contaminate the different primer sets. At bead clean-up, care should be taken to leave the tubes on the magnet throughout the incubation. For DNA samples of low concentration, stocks can be concentrated at 65 °C for 10-20 minutes. Moreover, amplicons can be eluted from beads in as little as 20 µL of elution buffer.
Different techniques used in the past to characterize the landscape of T and B cell composition provided only a superficial overview of the immune repertoire. One of the biggest advantages of NGS for immune repertoire analysis is the ability to identify unique clones, and consequently track them in different anatomical sites (e.g., intestine vs. blood vs. lymph node) or in different individuals. The clinical implications of such a technique are significant, and go beyond enhancing the understanding of the role of T or B cells in different immune-mediated diseases or malignancies21. In oncology, NGS is used to identify specific clones that can serve as predictive biomarkers for the patients who will most likely benefit from current immunotherapies22,23,24. In leukemia, NGS is used to assess whether cancerous cells have been fully eradicated, by detecting any residual leukemic clones that remain in a patient after treatment25.
One of the main limitations of immune repertoire analysis of whole tissue or blood samples is the inability to identify repertoire changes in less frequent populations. For example, if regulatory T cells, which comprise only a few percent of total CD4+ T cells, express a unique repertoire profile, this would be missed when conducting these studies on whole blood. This problem could be addressed by conducting these studies on sorted immune populations. Alternatively, in recent years technology has been developed to couple single-cell RNA data with TCR or IGH repertoire profiles26. This provides another level of functional information on the T or B cells, since the transcriptional landscape of each clone can be characterized. As an example, Zemmour et al. used scRNAseq and TCRseq to demonstrate that regulatory T cells from human blood display broad heterogeneity, with an activated subpopulation that is transcriptionally related to conventional T cells27. Similarly, in patients with hepatocellular carcinoma this method identified 11 distinct T cell subsets that infiltrate the tumor, each one with a unique transcriptional and repertoire profile28. Studies like these will be helpful especially when studying rare populations, and will provide important functional information about specific clones.
NGS of the immune repertoire is a new research field that provides novel insights about adaptive immune function and the role of T and B cells in various diseases. In addition, it is rapidly entering the clinical world in different disciplines, by tracking disease-associated clonotypes. There are currently several available kits for immune repertoire sequencing. These are based on similar methodology, regardless of whether they use RNA or DNA as the source. They use similar techniques for calibration, bias correction, and analysis, and a kit therefore needs to be chosen according to the specific demands of the study. One of the great advantages of the current kit is that it is ready-to-use and optimized for the human immune repertoire, without need for calibration. In the future, we expect to see more studies using repertoire features as biomarkers for different diseases, potentially also leading to development of targeted therapies against these clones. We believe researchers should consider applying these methods for studying the adaptive immune landscape in different disorders.
The authors have nothing to disclose.
None.
Name of Material/Equipment | Company | Catalog Number | Comments |
2-propanol | Sigma | I9516-500ML | |
1.7 mL micro-centrifuge tubes | Axygen | 8187631104051 | |
15 mL centrifuge tubes | Greiner | 188261 | |
Absolute ethanol | Merck | 1.08543.0250 | |
Amplitaq Gold | Thermo Fisher | N8080241 | |
AMPure XP Beads | Beckman Coulter | A63881 | |
Heat block | Bioer | Not applicable | |
High Sensitivity D1000 Sample Buffer | Agilent | 5067-5603 | For Tapestation |
High Sensitivity D1000 ScreenTape | Agilent | 5067-5584 | For Tapestation. Tubes sold separately |
Lymphotrack Assay kit | Invivoscribe | TRB: 70-91210039 IGH: 70-92250019 | Each includes 24 indexes |
MiSeq Reagent Kit v2 (500 cycle) | Illumina | MS-102-2003 | Includes standard flow cell type and all reagents required |
MiSeq Sequencer | Illumina | SY-410-1003 | |
PCR strips | 4titude | 4ti-0792 | |
Proteinase K | Invitrogen | EO0491 | |
Qubit 4 Fluorometer | Thermo Fisher | Q33226 | |
Qubit dsDNA HS Assay Kit | Thermo Fisher | Q32854 | Includes buffer, dye, standards, and specialized tubes |
Shaker | Biosan | Not applicable | |
Tapestation 2100 Bioanalyzer | Agilent | G2940CA | |
Ultrapure water | Bio-lab | 7501 | |
Wizard DNA isolation kit | Promega | A1120 | Includes cell lysis solution, nuclei lysis solution, and protein precipitation buffer |