Herein, we describe a procedure for genome-wide analysis of DNA methylation in gastrointestinal cancers. The procedure is of relevance to studies that investigate relationships between methylation patterns of genes and factors contributing to carcinogenesis in gastrointestinal cancers.
DNA methylation is a biologically meaningful epigenetic change and a frequent focus of cancer research. Genome-wide DNA methylation analysis provides an accurate assessment of the methylation status of gastrointestinal (GI) malignancies. Given the multiple potential translational uses of DNA methylation analysis, practicing clinicians and others new to DNA methylation studies need to understand, step by step, how these genome-wide analyses are performed. The goal of this protocol is to provide a detailed description of how this method is used for biomarker identification in GI malignancies. Importantly, we describe three critical steps that are needed to obtain accurate results during genome-wide analysis. Clear and concise descriptions of these three steps are often lacking, so they can go unnoticed by those new to epigenetic studies. We used 48 samples of a GI malignancy (gastric cancer) to illustrate in practical terms how genome-wide DNA methylation analysis can be performed for GI malignancies.
Epigenetics refers to heritable changes in gene function without alteration of the sequence of DNA1. Such changes may be due to DNA methylation, in which methyl groups on a DNA base may alter gene expression through changes in chromatin packing. Cancer development and progression may occur if this effect results in altered expression of tumor suppressor genes2. Aging and chronic inflammation are both causes of cancer and the main reasons for changes in DNA methylation in humans3,4,5. Consequently, DNA methylation can be used as a biomarker in cancer diagnosis and as a target for treatment and prevention. For early detection and cancer prognosis, DNA methylation is being measured in tumor, blood, urine, and stool samples6, while demethylating agents are now being used to treat hematologic malignancies such as myelodysplastic syndrome7.
Genome-wide DNA methylation analysis using an array platform evaluates DNA methylation at individual CpG loci in the human genome and can be used to examine the methylation status of more than 450,000 CpG sites in genomic DNA8,9, which permits exploration of cancer epigenetics (see Table of Materials). Whole genome bisulfite sequencing (WGBS) technologies have changed our approaches in the field of epigenetics10,11. However, these technologies have disadvantages in terms of substantial cost and processing time for epigenetic analysis of a large number of samples10,11. Therefore, the array platform is more feasible for evaluation of DNA methylation across the human genome when many samples are analyzed. The availability of approaches for genome-wide methylation analyses has improved in the past few years and allows us to expand our knowledge of how DNA methylation contributes to cancer development and progression12. Recent progress in microarray-platform approaches provides the rationale for genome-wide methylation analysis to identify a novel epigenetic signature in gastrointestinal cancers13. The goal of this protocol is to provide a detailed description of how this method is used for biomarker identification in GI malignancies.
All procedures followed were in accordance with the ethical standards of the institutions’ human research ethics committee. The study was approved by the Institutional Review Board at the Juntendo University Shizuoka Hospital, and written informed consent was waived because of the retrospective design.
1. Washing the slides
2. Scraping the slides
3. Bisulfite treatment
4. Array platform for evaluation of DNA methylation at a CpG locus in the human genome
5. Quantitative Methylation-Specific PCR (qMSP)
The characteristics of the 48 patients with gastric cancer in the training cohort are as follows (Table 2): the median age was 74 years (range, 52–89 years), and the cohort included 38 males (79.2%) and 10 females (20.8%). There were 35 patients (72.9%) with primary gastric cancer and 13 patients (27.1%) with remnant gastric cancer (primary gastric cancer: first occurrence of a non-metastatic malignancy in the stomach; remnant gastric cancer: cancer in the remnant stomach that developed more than 5 years after distal gastrectomy, regardless of the reason for the original resection19). There were 23 patients (47.9%) with lymph node metastasis and 25 patients (52.1%) without. First, all 48 samples (the training cohort) were loaded for identification of outliers (Figure 1A). Two samples gave peaks that were displaced by more than two standard deviations from those of the other samples, and these samples were removed (Figure 1B). The remaining 46 samples were then clustered by DNA promoter hypermethylation. The resultant heatmap was divided into two groups based on high and low methylation (Figure 2). This heatmap shows the top 50 probes within 1,500 bp of the transcriptional start site (TSS) in the differential methylation analysis. The high and low methylation groups differed in clinicopathological factors related to an aggressive malignant phenotype. Specifically, the type of cancer (primary gastric cancer: PGC) (p = 0.01, odds ratio = 9.09, 95% confidence interval 1.67–50.00) and the presence of lymph node metastasis (p = 0.03, odds ratio = 6.82, 95% confidence interval 1.16–40.08) emerged as significant independent predictive factors when the clinicopathological factors were used as covariates in multivariate analysis (Table 3). Finally, we identified the EPB41L3 gene20,21 (primer and probe sequences shown in Table 1) as strongly associated with the classification of the training cohort into high and low methylation groups in the microarray analysis.
The results of the microarray analysis for EPB41L3 were validated in the test cohort (126 samples) using qMSP. The characteristics of the patients in the test cohort are shown in Table 4. Relative methylation values (RMVs) of EPB41L3 in PGC tissues were significantly higher than those in remnant gastric cancer (RGC) tissues in univariate analysis (p = 0.01) (Figure 3A). Similarly, RMVs in samples with lymph node metastasis were significantly higher than those in samples without lymph node metastasis (p = 0.03) (Figure 3B). In this way, genome-wide DNA methylation analysis can help identify specific genes that characterize a particular clinical status in patients with GI malignancies.
Figure 1: Beta values in 48 samples (training cohort). All 48 samples (training cohort) were loaded and outliers were examined (A). Two samples had peaks that were outliers, and these were removed (B).
Figure 2: The resultant heatmap. The remaining 46 samples were clustered by DNA promoter hypermethylation. The heatmap was divided into high and low methylation groups. This heatmap allows visualization of the top 50 probes within 1,500 bp of the transcriptional start site (TSS) in the differential methylation analysis.
Figure 3: Relative methylation values (RMVs) for EPB41L3 in primary gastric cancer (PGC) vs. remnant gastric cancer (RGC), and in cases with and without lymph node metastasis. The results of microarray analysis for EPB41L3 in the test cohort (126 samples) were validated using qMSP. (A) In univariate analysis, RMVs of EPB41L3 in PGC tissues were significantly higher than those in RGC (p = 0.01). (B) Similarly, RMVs in samples with lymph node metastasis were significantly higher than in those without lymph node metastasis (p = 0.03).
Gene | Forward 5' – 3' | Reverse 5' – 3' | Probe |
B-ACTIN | TAG GGA GTA TAT AGG TTG GGG AAG TT | AAC ACA CAA TAA CAA ACA CAA ATT CAC | 56-FAM TGT GGG GTG ZEN GTG ATG GAG GAG GTT TAG 3IABkFQ |
EPB41L3 | GGG ATA GTG GGG TTG ACG C | ATA AAA ATC CCG ACG AAC GA | AAA TTC GAA AAA CCG CGC GAC GCC GAA ACC A |
Table 1: Primer and probe sequences.
Clinicopathological factors | Variables | Value |
Age (years) | Median (minimum – maximum) | 74 (52 – 89) |
Gender | Male / Female | 38 (79.2%) / 10 (20.8%) |
Type | PGC / RGC | 35 (72.9%) / 13 (27.1%) |
Lymph node metastasis | (+) / (-) | 23 (47.9%) / 25 (52.1%) |
PGC: Primary gastric cancer, RGC: Remnant gastric cancer
Table 2: Characteristics of the 48 patients with gastric cancer in the training cohort.
Factor | P-value | Odds ratio | 95% Confidence interval |
Type (PGC) | 0.01 | 9.09 | 1.67 – 50.00 |
Lymph node metastasis (+) | 0.03 | 6.82 | 1.16 – 40.08 |
PGC: Primary gastric cancer
Table 3: Predictive factors for the high methylation group (Cluster B).
Clinicopathological factors | Variables | Value |
Age (years) | Median (minimum – maximum) | 71 (33 – 86) |
Gender | Male / Female | 96 (76.2%) / 30 (23.8%) |
Type | PGC / RGC | 87 (69.0%) / 39 (31.0%) |
Lymph node metastasis | (+) / (-) | 50 (39.7%) / 76 (60.3%) |
PGC: Primary gastric cancer, RGC: Remnant gastric cancer
Table 4: Characteristics of the 126 patients with gastric cancer in the test cohort.
There are three critical steps in obtaining accurate results from genome-wide DNA methylation analysis. The first is macrodissection of the tumor area, preferably by two qualified pathologists, based on representative H&E-stained sections. Inaccurate macrodissection can cause contamination with adjacent non-cancerous tissue, which leads to unreliable results; thus, careful macrodissection is required. The second is assessment of DNA quality (quality check: QC). Samples that fail the QC (∆Cq > 5.0) may give poor-quality data. Therefore, samples with ∆Cq > 5.0 should be excluded, and only the remaining samples should be used. The third step is calculation of the β-value, which the data analysis software for the array platform determines as the methylated signal divided by the total (methylated + unmethylated) signal17. The β-value ranges from 0 to 1 (or 0% to 100%), which is simple to interpret biologically17. The main problem with this value is its poor statistical properties: it is highly heteroscedastic, meaning that the variance across samples is strongly reduced at the extremes of the methylation range (β = 0 or β = 1)17. In addition, when sample quality is poor, β-values may not show reproducible biphasic peaks22, and samples without such peaks should be excluded from further study. Finally, the target gene should be chosen on the basis of having larger β-values, being related to CpG islands in the promoter region, and being suitable for primer and probe design for qMSP.
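The QC cutoff and the β-value definition given above can be written out as a short Python sketch. The function names and example numbers are hypothetical and are not part of the array software. The optional offset argument is also an assumption: array software commonly adds a small constant to the denominator to stabilize low-intensity probes, but with the default of 0 the function reproduces the definition as stated in the text.

```python
def beta_value(methylated, unmethylated, offset=0.0):
    """Return methylated / (methylated + unmethylated + offset).

    With offset=0 this is the beta-value as defined in the text. A nonzero
    offset is an assumed stabilizing constant, not taken from the protocol.
    """
    total = methylated + unmethylated + offset
    if methylated < 0 or unmethylated < 0 or total <= 0:
        raise ValueError("intensities must be non-negative with a positive total")
    return methylated / total


def passes_ffpe_qc(delta_cq, threshold=5.0):
    """A sample fails QC when its delta-Cq exceeds the threshold (> 5.0)."""
    return delta_cq <= threshold


def keep_qc_passed(delta_cq_by_sample, threshold=5.0):
    """Return the names of samples that pass QC, in input order."""
    return [name for name, dcq in delta_cq_by_sample.items()
            if passes_ffpe_qc(dcq, threshold)]


# Hypothetical values for illustration only.
print(beta_value(3000, 1000))                           # methylated fraction of one probe
print(keep_qc_passed({"T01": 2.1, "T02": 6.3, "T03": 5.0}))
```

A sample with ∆Cq exactly equal to 5.0 is kept here, because the text defines failure as ∆Cq > 5.0.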
Evaluation of DNA methylation at CpG loci in the human genome is performed using microarray-based technology with a fixed number of probes to survey specific genomic loci. It is the most widely used method in epigenome-wide association studies (EWAS) because of its low cost, the small amount of DNA required, and its markedly shorter sample processing time, which allows high-throughput processing of many clinical samples23. However, the array platform is limited by the number and specificity of probes for epigenetically altered loci, which prevents exploration of some genomic regions. WGBS is generally viewed as the gold standard method because of its wider genomic coverage10,11. However, this method has a substantial cost and a relatively long processing time for analysis of a large number of samples10,11. Thus, it is not always feasible. In comparison, the array platform offers a reasonable balance of cost and genomic coverage. Recently, upgraded bead chips have become available24. These assays measure nearly twice as many CpG sites, which can support genome-wide association studies (GWAS) in large sample populations.
In summary, genome-wide DNA methylation analysis with the array platform, which evaluates DNA methylation at individual CpG loci in the human genome, can provide important information on epigenetic biomarkers in gastrointestinal cancer. Compared with WGBS, this method is cost-effective and reduces sample-processing time. Therefore, this method for detection of DNA methylation at CpG loci is likely to be widely used in epigenetic biomarker research.
The authors have nothing to disclose.
We are especially grateful to all members of the Department of Surgery, The Sidney Kimmel Comprehensive Cancer Center at the Johns Hopkins University School of Medicine for useful discussions and technical support. We also thank Kristen Rodgers for generous technical guidance on the procedures for bisulfite treatment and qMSP.
Name | Company | Catalog number | Comments |
(NH4)2SO4 | Sigma-Aldrich | 14148 | Step 5.2. |
10% Sodium dodecyl sulfate (SDS) | Quality Biological | 351-032-721 EA | Step 2.1. |
100% Ethanol | Sigma-Aldrich | 24194 | Step 1.7. |
ABI StepOnePlus Real-Time PCR System | Applied Biosystems | 4376600 | Step 5.2. 96-well Real-Time PCR instrument |
CT conversion reagent | Zymo Research | D5001-1 | Step 3.2.3. |
Deoxynucleotide triphosphate (dNTP) | Invitrogen | 10297-018 | Step 5.2. |
DEPC-treated water | Quality Biological | 351-068-131 | Step 2.1. |
Ethylenediaminetetraacetic acid (EDTA) | Corning | 46-034-CL | Step 2.1. |
EZ DNA Methylation Kit | Zymo Research | D5002 | Step 3.2. |
Fluorescein | Bio-Rad | #1708780 | Step 5.2. |
GenomeStudio | omicX | OMICS_00854 | Step 4.3. Data analysis tool for the array platform; reports methylation as a β-value ranging from 0 to 1 |
Human genomic DNA | New England Biolabs | N4002S | Step 5.3. |
Infinium HD FFPE QC Kit | Illumina | WG-321-1001 | Step 4.1. FFPE QC assay based on real-time PCR amplification |
Infinium HumanMethylation450 assay | Illumina | WG-314 | Step 4.2. Array platform for complex evaluation of DNA methylation to assess the methylation status of >450,000 CpG sites in the genome |
LightCycler 480 | Roche | 5015278001 | Step 4.1. |
M-Binding Buffer | Zymo Research | D5002-3 | Step 3.2.6. |
M-Desulphonation Buffer | Zymo Research | D5002-5 | Step 3.2.9. |
M-Dilution Buffer | Zymo Research | D5002-2 | Step 3.2.1. |
Minfi package | Bioconductor | N/A | Step 4.4. |
M-Wash Buffer | Zymo Research | D5002-4 | Step 3.2.10. |
Platinum Taq polymerase | ThermoFisher Scientific | 10966-034 | Step 5.2. |
Proteinase K | New England Biolabs | P8107S | Step 2.8. |
Single-use polypropylene (Eppendorf) tube | Eppendorf | 24533495 | Step 2.5.2. |
Tris hydrochloride (Tris-HCl) 2 M pH 8.8 | Quality Biological | 351-092-101 | Step 2.1. |
Xylene | Sigma-Aldrich | 214736 | Step 1.3. |
Zymo Spin 1 Column | Zymo Research | C1003 | Step 3.2.6. |
β-Mercaptoethanol | Sigma-Aldrich | M3148 | Step 5.2. |