Discovery and Validation of Hypermethylated Markers for Colorectal Cancer

Colorectal carcinoma (CRC) is one of the most prevalent malignant tumors worldwide. Screening and early diagnosis are critical for the clinical management of this disease. DNA methylation changes have been regarded as promising biomarkers for CRC diagnosis. Here, we profile DNA methylation in six CRCs and paired normal samples using a 450 K bead array. Further analysis confirms the methylation status of candidates in two data sets from the Gene Expression Omnibus. Receiver operating characteristic (ROC) curves are calculated to determine diagnostic performance. We identify 1549 differentially methylated regions (DMRs) between CRC and normal tissue. Two genes (ADD2 and AKR1B1), related to the DMRs, are selected for further validation. ROC curves show that the areas under the curves of ADD2 and AKR1B1 are higher than that of SEPT9, which has been clinically used as a screening biomarker of CRC. Our data suggest that aberrant DNA methylation of ADD2 and AKR1B1 could serve as a potential screening marker of CRC.


Introduction
Colorectal carcinoma (CRC) is one of the most prevalent malignant tumors worldwide. Global statistics showed that in 2012 alone, an estimated 1.36 million new cases of CRC were diagnosed, and approximately 694,000 people died from this disease [1]. Screening and early diagnosis are critical for the clinical management of CRC. Traditional screening tools include the fecal occult blood test (FOBT) and colonoscopy. However, the effectiveness of FOBT is limited by its test performance, while colonoscopy is invasive, and it is therefore impractical to screen all patients for CRC in this manner. The identification of highly specific, noninvasive biomarkers is a top priority for screening and early diagnosis of CRC.
Aberrant DNA methylation is a well-recognized epigenetic feature of cancer and has been discovered in most tumors; it is thus gaining increasing attention as a potential biomarker [2][3][4]. Abnormally methylated genes can be used as biomarkers for early detection as well as tumor classification of CRC [5][6][7]. Some of these alterations have also been detected in stool or peripheral blood, suggesting that they can be candidates for noninvasive biomarkers of CRC. Epi proColon, a blood-based assay for measuring methylated SEPT9, has become available for clinical application and has been approved in China and Europe. However, its sensitivity and specificity are still not satisfactory [5]. Novel biomarkers are needed to improve the accuracy of CRC diagnosis.
In this study, the genome-wide methylation pattern of CRC was compared with that of adjacent normal tissues using the Illumina 450 K microarray, thus revealing differentially methylated regions (DMRs) in CRC. Among the DMRs that we identified, potential biomarkers were validated in two independent data sets. We also established the sensitivity and specificity of the new molecular markers, which showed a higher area under the curve (AUC) than SEPT9. These biomarkers could improve the accuracy of CRC screening and diagnosis.

Differential Methylation Region Analysis.
Infinium Methylation data were processed with the Methylation Module of the GenomeStudio software. Methylation levels of CpG sites were calculated as β-values (0-1). We removed unreliable probes that were detected with a P value > 0.05. In addition, CpG sites on the X and Y chromosomes and those containing single-nucleotide polymorphisms were removed. The methylation data were deposited in the NCBI Gene Expression Omnibus (GEO): GSE75546. DMRs were analyzed using the ChAMP package, according to the instruction manual. To help identify regions of realistic length, the search was only conducted in regions where the distance between consecutive probes was less than 1 kb. The average β-value of the probes in a DMR was used to represent the DMR methylation level. To screen the candidate DMRs, the following criteria were used: β-difference > 0.4, β-value in normal tissue < 0.15, and P value < 1E-4.
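The screening step can be illustrated with a minimal Python sketch. It is not the ChAMP workflow used in the study: it only averages probe-level values into one methylation level per DMR and applies the three cutoffs. The β-difference is assumed to be tumor minus normal, the P value cutoff is assumed to be 1E-4, and the DMR names and numbers are hypothetical.

```python
from statistics import mean

# Cutoffs taken from the screening criteria in the text
# (the P value cutoff of 1E-4 is an assumption of this sketch).
MIN_BETA_DIFF = 0.4
MAX_NORMAL_BETA = 0.15
MAX_P_VALUE = 1e-4


def dmr_beta(probe_betas):
    """Average the probe beta-values to get one methylation level per DMR."""
    return mean(probe_betas)


def is_candidate(tumor_probe_betas, normal_probe_betas, p_value):
    """Return True if a DMR passes all three screening criteria."""
    tumor = dmr_beta(tumor_probe_betas)
    normal = dmr_beta(normal_probe_betas)
    return (
        tumor - normal > MIN_BETA_DIFF  # assumed direction: tumor minus normal
        and normal < MAX_NORMAL_BETA
        and p_value < MAX_P_VALUE
    )


# Hypothetical DMRs: (tumor probe betas, normal probe betas, P value).
example_dmrs = {
    "dmr_A": ([0.70, 0.65, 0.72], [0.08, 0.10, 0.09], 2e-6),
    "dmr_B": ([0.55, 0.60, 0.58], [0.30, 0.28, 0.33], 1e-7),  # normal level too high
    "dmr_C": ([0.40, 0.45, 0.42], [0.05, 0.06, 0.04], 3e-5),  # difference too small
}

candidates = [
    name for name, (tumor, normal, p) in example_dmrs.items()
    if is_candidate(tumor, normal, p)
]
print(candidates)
```

Requiring a low β-value in normal tissue together with a large positive difference restricts the output to regions that are essentially unmethylated in normal tissue and clearly methylated in tumor.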

Data Set for Validation of Candidate Biomarkers.
Methylation of candidate markers was evaluated in the data sets GSE48684 (147 samples containing CRC, adenoma, and normal tissues) and GSE68060 (118 samples containing CRC and normal tissues) from the GEO. The methylation status of these samples was determined using the same version of the 450 K methylation array.

Statistical Analysis.
Statistical analysis was conducted using the GraphPad Prism 6 software (La Jolla, CA, USA) and MedCalc version 10.1.6 (MedCalc Software, Mariakerke, Belgium). The Mann-Whitney test was used to compare methylation levels between CRC, adenoma, and normal tissue. All reported P values were two-sided, with P < 0.05 being considered statistically significant. ROC analysis was performed with MedCalc.

(Table S2). We calculated the hypermethylated and hypomethylated DMRs as shown in Figure 1. Hypermethylated DMRs were mainly located in the promoter region and CpG islands (Figure 1(a)), while hypomethylated DMRs were mainly located in the intergenic region and open sea (Figure 1(b)). Interestingly, most of the hypermethylated DMRs were shorter than 400 bp, but most of the hypomethylated DMRs were longer than 1200 bp (Figure 1(c)).

Identification of Candidate DNA Methylation Markers.
To identify the candidate DNA methylation markers, the following criteria were used: β-difference > 0.4, β-value in normal tissue < 0.15, and P value < 1E-4. Identification was restricted to hypermethylated DMRs, as these can be easily transferred to clinical application with Methylation-Specific Polymerase Chain Reaction (MSP). After evaluating all DMRs, three DMRs were identified that met all the criteria. Information regarding these three DMRs is shown in Table 2. Of the three DMR-related genes, SEPT9 has been clinically used as a screening biomarker. In the present study, SEPT9 was used as a reference.

In Silico Validation of Selected Candidates.
To investigate the selected DNA methylation candidates, we used two independent data sets, namely, GSE48684 and GSE68060, from GEO. GSE48684 contained 41 normal tissues, 42 adenomas, and 64 CRCs. GSE68060 contained 36 normal tissues and 82 CRCs. The data set GSE48684 was generated from the 450 K methylation array. Technical and biological validation studies were conducted to demonstrate that the data were reproducible and robust [7]. The methylation levels of the three DMRs in normal tissue, adenoma, and CRC in the two data sets are shown in Figure 2. In CRCs and adenomas, all candidates had significantly higher methylation levels than in normal tissues (P < 0.0001). However, there were no significant differences in methylation levels of the three candidate DMRs between CRCs and adenomas.
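A group comparison of this kind can be sketched in plain Python as a two-sided Mann-Whitney U test. The sketch uses the normal approximation with a tie correction and no continuity correction, which is a simplification and not necessarily what the statistical software used in the study computes. The β-values below are hypothetical.

```python
import math


def rank_with_ties(values):
    """Assign 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def mann_whitney_u(group_a, group_b):
    """Return (U for group_a, two-sided P value) via the normal approximation."""
    n_a, n_b = len(group_a), len(group_b)
    n = n_a + n_b
    ranks, tie_sizes = rank_with_ties(list(group_a) + list(group_b))
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    tie_term = sum(t**3 - t for t in tie_sizes) / (n * (n - 1))
    sd_u = math.sqrt(n_a * n_b / 12 * ((n + 1) - tie_term))
    if sd_u == 0:  # every value identical: no evidence of a difference
        return u_a, 1.0
    z = (u_a - n_a * n_b / 2) / sd_u
    return u_a, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical DMR beta-values for illustration only.
crc_betas = [0.55, 0.62, 0.48, 0.70, 0.66]
normal_betas = [0.05, 0.08, 0.10, 0.07, 0.06]

u_stat, p_value = mann_whitney_u(crc_betas, normal_betas)
print(u_stat, p_value)
```

With groups this small an exact test would normally be preferred; the approximation is more appropriate for sample sizes like those in the two validation data sets.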

Discussion
Screening and early diagnosis are crucially important in the clinical management of CRC. Currently, colonoscopy and FOBT are the main approaches for CRC detection [8]. However, half of all CRCs are only detected at the advanced stages.
The widespread DNA methylation changes in CRC have major potential as molecular markers; alterations in DNA methylation in CRC were described by Fearon and Vogelstein over 20 years ago [9]. Compared with normal tissues, even adenomas showed apparent aberrant DNA methylation. Many aberrantly methylated genes have been reported as potential markers of CRC, such as SEPT9, NDRG4, and VIM [5,10,11]. To date, a blood-based assay named Epi proColon (Epigenomics AG, Berlin, Germany), which detects methylated SEPT9, has been applied clinically in several countries [12][13][14]. However, the sensitivity and specificity of SEPT9 detection are still unsatisfactory. In a prospective clinical trial, sensitivity was 68% for all stages of CRC and 64% for CRC stages I-III, and much lower (22%) for advanced adenoma. In the present investigation, we performed a biomarker discovery and validation study to find new DNA methylation markers that can be used for screening and diagnosing CRC.
Initially, we mapped the genome-wide methylation pattern of CRC compared with adjacent normal tissues using a 450 K bead chip and performed DMR analysis; this revealed that hypermethylation mainly occurred in CpG islands and promoter regions, while hypomethylation mainly occurred in the open sea and intergenic regions. These observations are in accordance with previous studies [15,16]. We also found that most hypermethylated regions were short fragments (<400 bp), whereas most hypomethylated regions were long fragments (>1200 bp). These results suggest that hypermethylation occurs on a small scale and hypomethylation occurs on a large scale in CRC.
From the list of DMRs, we selected the candidates that most closely matched our criteria, which were set on the premise that hypermethylation candidates are better suited than hypomethylation ones for further clinical application. One of the candidates is SEPT9, which has already been applied clinically. The protein ADD2 is a subunit of adducin, a cytoskeletal protein that caps and stabilizes the fast-growing end of actin filaments. ADD2 is usually expressed in the nervous system and erythroid tissues [17][18][19]. To our knowledge, the present study is the first to describe hypermethylation in the promoter region of the ADD2 gene in malignancy. The aberrant methylation of AKR1B1 in CRC has been previously reported [20,21]. However, the role of AKR1B1 as a potential biomarker has not yet been demonstrated. Therefore, the two candidates (ADD2, AKR1B1) were compared with SEPT9, and the performances of ADD2 and AKR1B1 were further evaluated for their potential as biomarkers.
The Infinium Methylation 450 K bead array is the new generation of the Methylation 27 K bead array and contains high-density methylation probes distributed over the entire genome. Many investigations have demonstrated the accuracy and reproducibility of this technology and have shown that the results of the Infinium Methylation 450 K Bead Chip correlate well with bisulfite sequencing [22,23]. In the present study, two independent 450 K bead array data sets from GEO were used for in silico validation. ROC curves were constructed to determine the performance of the selected candidates. In the data set GSE48684, ADD2 and AKR1B1 have similar AUCs to SEPT9 when CRCs are compared to normal tissues. When adenoma + CRC is compared with normal tissues, ADD2 and AKR1B1 have higher AUCs than SEPT9. When CRC is compared with normal tissues in GSE68060, ADD2 and AKR1B1 also have higher AUCs than SEPT9. These results suggest that aberrant methylation of ADD2 and AKR1B1 may have better screening and diagnostic performance in the early detection of CRC than SEPT9 alone. These findings should be confirmed by additional studies, for example, by carrying out tests in stool or blood.
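To make the AUC comparison concrete, the following Python sketch computes an AUC as the probability that a randomly chosen tumor sample has a higher β-value than a randomly chosen normal sample, counting ties as one half, and reports sensitivity and specificity at a single cutoff. It is an illustration rather than the MedCalc procedure used in the study; the β-values and the cutoff of 0.15 are hypothetical.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the chance that a random case scores above a random control."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties count as half a win
    return wins / (len(case_scores) * len(control_scores))


def sensitivity_specificity(case_scores, control_scores, cutoff):
    """Call a sample positive when its beta-value exceeds the cutoff."""
    true_pos = sum(score > cutoff for score in case_scores)
    true_neg = sum(score <= cutoff for score in control_scores)
    return true_pos / len(case_scores), true_neg / len(control_scores)


# Hypothetical beta-values for one marker, for illustration only.
crc_betas = [0.62, 0.48, 0.12, 0.70]
normal_betas = [0.05, 0.08, 0.10, 0.20]

auc = roc_auc(crc_betas, normal_betas)
sens, spec = sensitivity_specificity(crc_betas, normal_betas, cutoff=0.15)
print(auc, sens, spec)
```

Because the AUC summarizes performance over all possible cutoffs, two markers can be ranked by AUC without first fixing a clinical threshold, which is how the candidates are compared with SEPT9 here.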

Conclusions
In summary, we conducted a tissue biomarker discovery study to identify DNA methylation markers associated with CRC and then replicated the findings in two independent data sets from GEO. Further studies are required to confirm these results and to understand the role of these genes in colorectal carcinogenesis.

Disclosure
Guodong Li is a co-first author.