Somatic mutation patterns in cells reflect previous mutagenic exposure and can reveal developmental lineage relationships. Presented here is a methodology to catalogue and analyze somatic mutations in individual hematopoietic stem and progenitor cells.
Hematopoietic stem and progenitor cells (HSPCs) gradually accumulate DNA mutations during a lifespan, which can contribute to age-associated diseases such as leukemia. Characterizing mutation accumulation can improve understanding of the etiology of age-associated diseases. Presented here is a method to catalogue somatic mutations in individual HSPCs, which is based on whole-genome sequencing (WGS) of clonal primary cell cultures. Mutations that are present in the original cell are shared by all cells in the clonal culture, whereas mutations acquired in vitro after cell sorting are present in a subset of cells. Therefore, this method allows for accurate detection of somatic mutations present in the genomes of individual HSPCs, which accumulate during life. These catalogues of somatic mutations can provide valuable insights into mutational processes active in the hematopoietic tissue and how these processes contribute to leukemogenesis. In addition, by assessing somatic mutations that are shared between multiple HSPCs of the same individual, clonal lineage relationships and population dynamics of blood populations can be determined. As this approach relies on in vitro expansion of single cells, the method is limited to hematopoietic cells with sufficient replicative potential.
Exposure of hematopoietic stem and progenitor cells (HSPCs) to endogenous or extrinsic mutagenic sources contributes to the gradual accumulation of mutations in the DNA during a lifespan1. Gradual mutation accumulation in HSPCs1 can result in age-related clonal hematopoiesis (ARCH)2,3, which is a non-symptomatic condition driven by HSPCs carrying leukemia-driver mutations. Initially, it was thought that individuals with ARCH have an increased risk for leukemia2,3. However, recent studies have shown an ARCH incidence of 95% in elderly individuals4, making the association with malignancies less clear and raising the question of why some individuals with ARCH eventually develop malignancies while others do not. Nonetheless, somatic mutations in HSPCs can pose serious health risks, as myelodysplastic disorders and leukemia are characterized by the presence of specific cancer driver mutations.
To identify the mutational processes and study blood clonality, mutation accumulation in individual HSPCs needs to be characterized. Mutational processes leave characteristic patterns in the genome, so-called mutational signatures, which can be identified and quantified in genome-wide collections of mutations5. For instance, exposure to UV light, alkylating agents, and defects in DNA repair pathways have each been associated with a different mutational signature6,7. In addition, due to the stochastic nature of mutation accumulation, most (if not all) of the acquired mutations are unique between cells. If mutations are shared between multiple cells of the same individual, this indicates that these cells share a common ancestor8. Therefore, by assessing shared mutations, lineage relationships can be determined between cells and a developmental lineage tree can be constructed branch by branch. However, cataloguing rare somatic mutations in physiologically normal cells is technically challenging due to the polyclonal nature of healthy tissues.
Presented here is a method to accurately identify somatic mutations in the genomes of individual HSPCs. This involves the isolation and clonal expansion of HSPCs in vitro. These clonal cultures reflect the genetic makeup of the original cell (i.e., mutations in the original cell will be shared by all other cells in the culture). This approach allows us to obtain sufficient DNA for whole genome sequencing (WGS). We have previously shown that mutations accumulated in vitro during clonal culture will be shared by a subset of cells. This enables the filtering of all in vitro mutations, as these will be present in a smaller fraction of reads compared to in vivo acquired mutations9. Previous methods have obtained sufficient DNA from a single cell for WGS using whole-genome amplification (WGA)10. However, the main disadvantage of WGA is its relatively error-prone and unbalanced amplification of the genome, which can result in allele dropouts11. Nonetheless, because the approach presented here relies on in vitro expansion of single cells, it is limited to blood cells with sufficient replicative potential, a restriction that does not apply to WGA-dependent methods. Earlier efforts sequencing clonal cultures have relied on using feeder layers to ensure clonal amplification of single HSPCs12. However, DNA from the feeder layers can potentially contaminate the DNA of the clonal cultures, confounding the subsequent mutation calling and filtering. The method presented here solely relies on specified medium to clonally expand single HSPCs, and therefore avoids the issue of DNA contamination. Until now, we have successfully applied this method to human bone marrow, cord blood, viably frozen bone marrow, and peripheral blood.
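The filtering principle described above can be illustrated with a minimal Python sketch. It is not the SNVFI implementation. It only shows how a variant allele frequency (VAF) is derived from read counts and how a cutoff separates clonal (in vivo) from subclonal (in vitro) variants. The `Variant` record, the function names, and the example read counts are hypothetical. The cutoff of 0.3 follows the final SNVFI filtering step mentioned in the legend of Figure 5.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    chrom: str
    pos: int
    ref_reads: int  # reads supporting the reference allele
    alt_reads: int  # reads supporting the variant allele


def vaf(variant):
    """Fraction of reads at this position that support the variant allele."""
    depth = variant.ref_reads + variant.alt_reads
    return variant.alt_reads / depth if depth else 0.0


def split_by_vaf(variants, min_vaf=0.3):
    """Separate variants above the VAF cutoff (clonal) from the rest (subclonal)."""
    clonal, subclonal = [], []
    for v in variants:
        (clonal if vaf(v) > min_vaf else subclonal).append(v)
    return clonal, subclonal


# Hypothetical read counts, for illustration only.
variants = [
    Variant("1", 1000, ref_reads=16, alt_reads=14),  # heterozygous, clonal
    Variant("2", 2000, ref_reads=27, alt_reads=3),   # low VAF, likely in vitro
    Variant("3", 3000, ref_reads=15, alt_reads=15),  # heterozygous, clonal
]
clonal, subclonal = split_by_vaf(variants)
print([(v.chrom, v.pos) for v in clonal])
print([(v.chrom, v.pos) for v in subclonal])
```

In a clonal culture derived from a karyotypically normal cell, heterozygous in vivo mutations are expected at a VAF of about 0.5, whereas mutations acquired after sorting appear at lower values.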
Samples must be obtained in accordance with appropriate ethics protocols, and donors must give informed consent prior to the procedure.
1. Preparation of Sample Material
NOTE: When working with freshly obtained material, start with step 1.1. When working with frozen material, start with step 1.2.
2. Cell Culture
NOTE: To obtain catalogues of somatically acquired mutations, donor-specific germline variation needs to be filtered out. When starting with bone marrow biopsies or umbilical cord blood, mesenchymal stromal cells (MSCs) can be used as matched control to filter for germline variation. In this case, follow section 2.1. When using (mobilized) peripheral blood follow step 2.2 to isolate and use T-cells as matched control sample to filter for germline variation (Figure 1). The bulk T-cell population will share the same lineage relationship as HSPCs.
3. HSPC Isolation, Sorting, and Culture
Antibody | volume [μL] |
BV421-CD34 | 5 |
FITC-Lineage mix (CD3/14/19/20/56) | 5 |
PE-CD38 | 2 |
APC-CD90 | 0.5 |
PerCP/Cy5.5-CD45RA | 5 |
PE/Cy7-CD49f | 1 |
FITC-CD16 | 1 |
FITC-CD11c | 5 |
FACS Buffer | 25.5 |
Table 1: HSC sorting mix. Shown are the volumes of the antibodies and FACS buffer in the mix used to sort the HSCs.
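The volumes in Table 1 add up to 50 μL. The following Python sketch scales the mix to several staining reactions. It assumes that Table 1 lists the volumes for a single reaction, which is not stated explicitly above. The function name and the 10% pipetting overage are hypothetical choices for illustration.

```python
# Volumes in microliters, copied from Table 1.
SORT_MIX_UL = {
    "BV421-CD34": 5.0,
    "FITC-Lineage mix (CD3/14/19/20/56)": 5.0,
    "PE-CD38": 2.0,
    "APC-CD90": 0.5,
    "PerCP/Cy5.5-CD45RA": 5.0,
    "PE/Cy7-CD49f": 1.0,
    "FITC-CD16": 1.0,
    "FITC-CD11c": 5.0,  # listed as "CD11c FITC" in the materials table
    "FACS Buffer": 25.5,
}


def scale_mix(n_reactions, overage=0.1):
    """Scale every component to n_reactions plus a fractional overage.

    Assumption: Table 1 gives volumes for one reaction; the default
    overage of 10% is an illustrative value, not part of the protocol.
    """
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in SORT_MIX_UL.items()}


print(sum(SORT_MIX_UL.values()))  # total volume of one mix
for name, vol in scale_mix(3).items():
    print(f"{name}: {vol} uL")
```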
4. Harvesting HSPC Clones
5. DNA Isolation
6. Sequencing
7. Mapping and Somatic Mutation Calling
8. Indel Calling
9. Mutational Profile Inspection
10. Construction of a Developmental Lineage Tree Using Base Substitutions
Experimental procedure
The experimental workflow is depicted in Figure 1. Based on the type of input material, different steps must be followed. In Figure 2 a flow cytometric output of a cord blood cell sort is depicted. First, all mononuclear cells are selected by loosely drawing a gate around this population. Then, singlets are isolated by selecting for cells with a linear FSC-H/FSC-A ratio, as a lower FSC-H/FSC-A ratio includes doublets or cell clumps. The unstained control sample is used to define cell sorting gates for lineage–, CD34+, CD38–, CD45RA–. Additionally, CD90 and CD49f can be used to distinguish between progenitor cells or self-renewing stem cells17 (Figure 2). Index sorting enables the re-tracing of individual cells, and the sorted cells are depicted as brown dots. During cell culture, individual clones can expand at a different pace, with some clones expanding within 3 weeks, while other clones are not fully expanded until the fifth week of culture. See Figure 3A,B for representative colony outgrowth. A representative picture is shown of a nearly confluent MSC bulk culture at 11 days after plating (Figure 3C).
Checking quality after sequencing and mutation analysis
Shown is an example output of the copy number analysis generated by Control-FREEC14 to check for copy number alterations (Figure 4). Karyotypic information can indicate which chromosomes to exclude during a SNVFI run (step 7.6). The VAF plot created by SNVFI (Figure 5) is a histogram of variant allele frequencies in the sample. A peak in the histogram at a VAF of 0.5 indicates the sample is clonal. To get more insight into the underlying biological causes behind mutations, these can be analyzed using the R package MutationalPatterns15. Depicted here is a typical analysis producing a 96-trinucleotide plot (Figure 6). In addition to quantification of different mutation types, signature extraction can also be performed with this tool.
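To make the 96-trinucleotide classification concrete, the following Python sketch assigns each base substitution to one of the 96 classes and computes the relative contribution of each class. It is not the MutationalPatterns API, which is an R package. It only illustrates the common convention of reporting the mutated base as a pyrimidine (C or T) by taking the reverse complement when the reference base is a purine. All function names and the example mutations are hypothetical.

```python
from collections import Counter
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def classify(context, alt):
    """Return a class label such as 'A[C>T]G'.

    context: reference trinucleotide with the mutated base in the middle.
    alt: the variant base. Purine references are reverse complemented.
    """
    context, alt = context.upper(), alt.upper()
    if len(context) != 3 or len(alt) != 1 or alt == context[1]:
        raise ValueError(f"not a base substitution: {context!r} > {alt!r}")
    if context[1] in "AG":
        context, alt = reverse_complement(context), reverse_complement(alt)
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


# 6 substitution types x 16 flanking-base combinations = 96 classes.
ALL_CLASSES = [
    f"{five}[{ref}>{alt}]{three}"
    for ref, alts in (("C", "AGT"), ("T", "ACG"))
    for alt in alts
    for five, three in product("ACGT", repeat=2)
]


def spectrum(mutations):
    """Relative contribution of each of the 96 classes to the total."""
    counts = Counter(classify(ctx, alt) for ctx, alt in mutations)
    total = sum(counts.values())
    return {cls: counts[cls] / total for cls in ALL_CLASSES} if total else {}


# Hypothetical mutations as (reference trinucleotide, variant base).
example = [("ACG", "T"), ("CGT", "A"), ("TTA", "C"), ("GAT", "G")]
profile = spectrum(example)
print({cls: frac for cls, frac in profile.items() if frac > 0})
```

In this example, ACG>T and its reverse-complement equivalent CGT>A fall into the same class, A[C>T]G, which is why the spectrum has 96 rather than 192 categories.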
Constructing a developmental lineage tree
Mutations that are shared among clones, or that are present in a clone and at low VAF in the germline control, are validated using IGV. Mutations are considered true when present in the sample and not at high VAF levels in the germline (Figure 7A). Mutations are considered false when not present in IGV, which can happen in poorly mapped regions (Figure 7B). In other cases, events detected by SNVFI are missed germline mutations (Figure 7C). Independent validation of these mutations by targeted re-sequencing in selected clones is highly recommended. After detection of shared somatic mutations between clones, a binary matrix is generated (step 10.8). A heatmap is constructed containing cells with and without the shared mutations A-M. Above this heatmap the developmental lineage tree is indicated (Figure 8).
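The step from a binary matrix to lineage branches can be sketched in Python as follows. This is not necessarily the clustering used to draw Figure 8. It illustrates a simple alternative: mutations carried by the same set of clones are grouped into one branch, and the branches are checked for tree compatibility (any two sets of clones must be either nested or disjoint). The clone names, mutation labels, and function names are hypothetical.

```python
def clades_from_matrix(matrix):
    """Group shared mutations by the set of clones that carry them.

    matrix: dict mapping mutation -> dict mapping clone -> 0 or 1.
    Mutations found in fewer than two clones define no shared branch.
    """
    clades = {}
    for mutation, row in matrix.items():
        carriers = frozenset(clone for clone, present in row.items() if present)
        if len(carriers) >= 2:
            clades.setdefault(carriers, []).append(mutation)
    return clades


def is_tree_compatible(clades):
    """True if every pair of clone sets is nested or disjoint."""
    sets = list(clades)
    for i, a in enumerate(sets):
        for b in sets[i + 1:]:
            if a & b and not (a <= b or b <= a):
                return False
    return True


# Hypothetical binary matrix: 1 = mutation present in the clone.
clones = ["c1", "c2", "c3", "c4", "c5"]
calls = {
    "A": [1, 1, 1, 0, 0],
    "B": [1, 1, 0, 0, 0],
    "C": [1, 1, 0, 0, 0],
    "D": [0, 0, 0, 1, 1],
}
matrix = {mut: dict(zip(clones, row)) for mut, row in calls.items()}

clades = clades_from_matrix(matrix)
print(is_tree_compatible(clades))
# List branches from the largest (earliest) to the smallest clade.
for carriers in sorted(clades, key=len, reverse=True):
    print(sorted(carriers), clades[carriers])
```

A matrix that fails the compatibility check contains at least one pair of conflicting mutations, which would be candidates for manual inspection in IGV or for targeted re-sequencing.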
Figure 1: Flowchart depicting experimental procedure based on input material.
Figure 2: Cell sorting strategy. First, gating is performed on small mononuclear cells. Second, single cells are gated by selection of the linear fraction. Lineage negative cells are gated. All CD34+ CD38– CD45RA– cells are single cell-sorted. The fraction of cells in brown should be noted, which are the sorted cells highlighted by the option “index sorting”.
Figure 3: Representative cell culture results. Representative HSPC clones in a 384 well plate at (A) 2 weeks after plating and (B) 4 weeks after plating. (C) MSC culture after 2 weeks of medium replacement. Scale bar = 100 μm.
Figure 4: Karyotypes. (A) Clonal HSPC culture and (B) MSC bulk sample. The karyotypes were determined by read-depth analysis. Both graphs indicate a karyotypically normal sample.
Figure 5: Histogram of variant allele frequencies. Histogram of variant allele frequencies of the variants in a clone before the last filtering step of SNVFI (VAF >0.3). A peak at VAF = 0.5 indicates that the sample is clonal. The subclonal mutations with low VAF are excluded during the last filtering step.
Figure 6: Representative mutational spectrum analysis of somatic mutations in a HSPC sample. Depicted is the relative contribution of each trinucleotide change (of which the middle base is mutated) to the total spectrum.
Figure 7: Manual inspection of mutations using IGV16. (A) Mutations are considered true when present in the clone and not in the bulk sample. (B) Mutations are considered as false positives when present in a poorly mapped region. (C) Mutations are considered as false positives when present in a germline control. The vertical line indicates the position of a called mutation.
Figure 8: Construction of a developmental lineage tree. Depicted is a dendrogram indicating developmental lineages splitting off during development. The heatmap under the dendrogram indicates the presence of mutations in different clones.
Presented here is a method to detect mutations that accumulated during life in individual HSPCs and to construct an early developmental lineage tree using these mutation data.
Several critical requirements must be met in order to successfully perform these assays. First, the viability of the sample must be ensured. Quick handling of the sample is key to ensure the efficiency of the procedure. Second, loss of growth factor potency will negatively affect the clonal expansion of HSPCs. To ensure high growth factor potency, it is important to avoid freeze-thaw cycles and prepare single-use aliquots. Third, after performing WGS, mutation calling, and filtering, it is crucial to validate the clonality of the clonal culture. To confirm the clonality of the culture, the VAF of the mutations should cluster around 0.5 in a karyotypically normal sample (Figure 5). In cells with a low mutational load, such as cord blood HSPCs, it is more difficult to determine clonality due to the low mutation numbers.
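The clonality criterion can be expressed as a simple check, sketched here in Python. This is not part of SNVFI. It tests whether the median VAF lies close to 0.5 and declines to give a verdict when too few mutations are available, reflecting the difficulty noted above for samples with a low mutational load. The tolerance of 0.1 and the minimum of 20 variants are hypothetical values chosen for illustration, not validated thresholds.

```python
from statistics import median


def check_clonality(vafs, expected=0.5, tolerance=0.1, min_variants=20):
    """Return True or False for clonality, or None if undetermined.

    Assumptions: a karyotypically normal sample (expected VAF of 0.5);
    tolerance and min_variants are illustrative, not validated cutoffs.
    """
    if len(vafs) < min_variants:
        return None  # too few mutations to judge clonality
    return abs(median(vafs) - expected) <= tolerance


# Hypothetical VAF values.
clonal_sample = [0.45, 0.5, 0.55] * 10
mixed_sample = [0.2, 0.25, 0.3] * 10
sparse_sample = [0.5] * 5

print(check_clonality(clonal_sample))
print(check_clonality(mixed_sample))
print(check_clonality(sparse_sample))
```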
Our approach relies on in vitro expansion of single cells to allow for WGS. Therefore, our approach is restricted to cells that have the replicative potential to clonally expand, such as HSPCs. In our hands, about 5%-30% of all single-sorted cells are able to expand adequately. Reduced outgrowth rates can potentially result in a selection bias. As discussed previously, methods using WGA can overcome this selection bias, as this technique does not rely on the expansion of cells. However, WGA has its own shortcomings, and clonal amplification remains the only method to accurately determine the number of mutations in the whole genome without allelic dropouts and with equal coverage along the genome, especially in samples with low true somatic mutation numbers.
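The reported outgrowth rate of 5%-30% can be used to estimate how many single cells to sort for a desired number of clones. The Python sketch below shows this arithmetic. The function name and the target of 10 clones are hypothetical, and the estimate is an expected value that ignores sample-to-sample variation.

```python
def cells_to_sort(target_clones, outgrowth_percent):
    """Smallest number of sorted cells expected to yield target_clones.

    outgrowth_percent is a whole-number percentage (e.g., 5 for 5%).
    Integer arithmetic is used to round up without floating-point error.
    """
    if target_clones < 0 or outgrowth_percent <= 0:
        raise ValueError("target must be >= 0 and outgrowth rate > 0")
    return -(-target_clones * 100 // outgrowth_percent)


# Bracket the estimate with the lower and upper outgrowth rates.
print(cells_to_sort(10, 5))   # pessimistic case
print(cells_to_sort(10, 30))  # optimistic case
```

Planning for the lower end of the range reduces the risk of ending up with too few expanded clones from a given donor.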
The data generated using this approach can be used to determine phylogenies of the hematopoietic system, as the mutations detected in single cells can be used to dissect cell lineages, as depicted in Figure 8. Typically, one or two mutations can define each branch in a healthy donor1. Since lineages branch early after conception, mutations defining these first branches will also be present with a low VAF in the matched normal sample that was used for filtering the germline variants1,18,19. In this case, the use of non-hematopoietic cells, such as MSCs, is preferred, as they are expected to separate very early during development from the hematopoietic system. As T-cells are of hematopoietic origin, the use of these cells as a matched normal sample to filter germline variants could therefore confound the construction of the earliest branching of the developmental lineage tree. Subclonal presence of branch-specific mutations in certain mature blood populations, which can be measured by targeted deep sequencing, will indicate that the progeny of that branch can give rise to that mature cell type. In addition, our approach allows for assessing the mutational consequences of mutagenic exposure in vivo and ultimately how this may contribute to leukemia development.
The authors have nothing to disclose.
This study was supported by a VIDI grant of the Netherlands Organization for Scientific Research (NWO) (no. 016.Vidi.171.023) to R. v. B.
0.20 µm syringe filter | Corning | 431219 | |
50 mL Syringe, Luer lock | BD | 613-3925 | |
Bovine Serum Albumin (BSA) | Sigma-Aldrich | A9647-50G | |
CD11c FITC | BioLegend | 301603 | Clone 3.9 |
CD16 FITC | BioLegend | 302005 | Clone 3G8 |
CD3 BV650 | BioLegend | 300467 | Clone UCHT1 |
CD34 BV421 | BioLegend | 343609 | Clone 561 |
CD38 PE | BioLegend | 303505 | Clone HIT2 |
CD45RA PerCP/Cy5.5 | BioLegend | 304121 | Clone HI100 |
CD49f PE/Cy7 | BioLegend | 313621 | Clone GoH3 |
CD90 APC | BioLegend | 328113 | Clone 5E10 |
Cell Strainer 5 mL tube | Corning | 352235 | |
CELLSTAR plate, 384w, 130 µL, F-bottom, TC, cover | Greiner | 781182 | |
Cryogenic vial | Corning | 430487 | |
Dimethyl sulfoxide (DMSO) | Sigma-Aldrich | D2650 | |
DMEM/F12 | ThermoFisher | 61965059 | |
EDTA | Sigma-Aldrich | E4884-500G | |
Fetal Bovine Serum | ThermoFisher | 10500 | |
GlutaMAX | ThermoFisher | 25030081 | |
Human Flt3-Ligand, premium grade | Miltenyi Biotec | 130-096-479 | Reconstitute in single-use aliquots (25 μL) at 100 μg/mL in 0.1% BSA in PBS |
Human Recombinant IL-3 (E. coli-expressed) | Stem Cell Technologies | 78040.1 | Reconstitute in single-use aliquots (2.5 μL) at 100 μg/mL in 0.1% BSA in PBS |
Human Recombinant IL-6 (E. coli-expressed) | Stem Cell Technologies | 78050.1 | Reconstitute in single-use aliquots (5 μL) at 100 μg/mL in 0.1% BSA in PBS |
Human SCF, premium grade | Miltenyi Biotec | 130-096-695 | Reconstitute in single-use aliquots (25 μL) at 100 μg/mL in 0.1% BSA in PBS |
Human TPO, premium grade | Miltenyi Biotec | 130-095-752 | Reconstitute in single-use aliquots (12.5 μL) at 100 μg/mL in 0.1% BSA in PBS |
Integrative Genomics Viewer 2.4 | Broad Institute | https://software.broadinstitute.org/software/igv/download | |
Iscove's Modified Dulbecco's Medium | ThermoFisher | 12440061 | |
Lineage (CD3/14/19/20/56) FITC | BioLegend | 348701 | Clones: UCHT1, HCD14, HIB19, 2H7, HCD56 |
Lymphoprep | Stem Cell Technologies | #07861 | Used for Density gradient separation |
PBS | | | Made at the institute's facility. Commercially available PBS can also be used |
Penicillin-Streptomycin | ThermoFisher | 15140122 | |
Primocin | Invivogen | ant-pm-1 | Antibiotic formulation |
QIAamp DNA Micro Kit | Qiagen | 56304 | |
Qubit 2.0 fluorometer | ThermoFisher | Q32866 | |
Qubit dsDNA HS Assay Kit | ThermoFisher | Q32854 | |
RNAse A | Qiagen | 19101 | |
SH800S Cell Sorter | Sony | SH800S | |
StemSpan SFEM, 500mL | Stem Cell Technologies | 9650 | |
TE BUFFER PH 8.0, LOW EDTA | G-Biosciences | 786-151 | |
TrypLE Express | ThermoFisher | 12605-10 |