Genomic Analysis of Abnormal DNAM Methylation in Parathyroid Tumors

Background Parathyroid tumors are common endocrine neoplasias associated with primary hyperparathyroidism. Although numerous studies have studied the subject, the predictive value of gene biomarkers nevertheless remains low. Methods In this study, we performed genomic analysis of abnormal DNA methylation in parathyroid tumors. After data preprocessing, differentially methylated genes were extracted from patients with parathyroid tumors by using t-tests. Results After refinement of the basic differential methylation, 28241 unique CpGs (634 genes) were identified to be methylated. The methylated genes were primarily involved in 7 GO terms, and the top 3 terms were associated with cyst morphogenesis, ion transport, and GTPase signal. Following pathway enrichment analyses, a total of 10 significant pathways were enriched; notably, the top 3 pathways were cholinergic synapses, glutamatergic synapses, and oxytocin signaling pathways. Based on PPIN and ego-net analysis, 67 ego genes were found which could completely separate the diseased group from the normal group. The 10 most prominent genes included POLA1, FAM155 B, AMMECR1, THOC2, CCND1, CLDN11, IDS, TST, RBPJ, and GNA11. SVM analysis confirmed that this grouping approach was precise. Conclusions This research provides useful data to further explore novel genes and pathways as therapeutic targets for parathyroid tumors.


Introduction
Parathyroid cancer is a common endocrine disease characterized by excessive secretion of the parathyroid hormone [1]. e molecular pathogenesis of parathyroid tumors has already been partly elucidated; Heppner and Carling have reported that inactivating somatic mutations of tumor suppressor genes multiple endocrine neoplasia type 1 (MEN1), RET, and HRPT2/CDC73 have been identified in parathyroid tumors [2,3]. Loss of menin, the protein encoded by the oncosuppressor gene MEN1, is characterized by a genetic background of parathyroid tumors [4]. is is called MEN1 syndrome [4].
It has been demonstrated that epigenetic modifications are not only involved in embryogenesis but also in cell fate reprogramming [5][6][7]. As epigenetic modifications are present in all human cancers, they cooperate with genetic alterations to drive given cancer phenotypes [8]. Furthermore, the abnormal methylation of cytosine phosphate guanine (CpG) in gene-promoter islands has been investigated in MEN1 syndrome cancer [9]. In benign and malignant parathyroid tumors, despite the clear usefulness in performing large-scale DNA methylation analysis, few studies have been published that describe abnormal global methylation on CpG gene-promoter islands. Global promoter methylation by LINE-1 does not differ from levels detected in normal glands.
is differs from most other cancers, which display global DNA hypomethylation [10,11]. Different CpG sites located near promoter regions of more than 14,000 genes have been screened by Starker and Collaborators, using a model composed of normal, benign, and malignant parathyroid tissues (3 normal parathyroid tissues, 14 adenomas, and 7 carcinomas) [12]. A gradient of CpG hypermethylation levels, ranging from normal tissues to adenomas and carcinomas, was identified in a subset of the genes [12]. ese genes were involved in key pathways linked to parathyroid tumors. Specific genes such as RIZ1, APC, RASSF1A, CDKN2A/p16 and CDKN2B/p15, RB1, WT1, GATA4, PYCARD, SFRP1, SFRP2, and SFRP4 were found to be hypermethylated; this was in line with the hypothesis that cell cycle, transcription, and WNT pathways would represent biological processes that were deranged [12]. ey would, in turn, be found in the onset of parathyroid neoplasms [12].
It is important to note that gene markers based solely on expression are still not reliable [13]. However, using cutting-edge computational methods and genomics data, significant markers can be identified. is can be done by integrating gene expression profiles with protein-protein interaction maps.
us, in this study, we developed a workflow to identify significant methylation of genes that were functionally associated with diseases. We did this so that feature selection could maximize prediction performance [14].

Data Preprocessing and Identification of Differentially
Methylated Genes. A methylation-identification algorithm found in Genelibs (http://www.genelibs.com) was utilized. Raw microarray data containing 1954997 CpGs sites were filtered down in order to better focus our study. Sites were eliminated when they met the following criteria: (i) distance from CpG to single-nucleotide polymorphism (SNP) was less than or equal to 2; (ii) minor allele frequency (MAF) was less than 0.05; (iii) cross-hybridized probes were found, or on sex chromosomes. 1925786 CpGs were kept for further study.
In this study, β values have been represented graphically, both the diseased group and the normal group. e percentage of methylation used the following formula: methylated/(methylated + unmethylated); results range between 0 and 1, where zero represented fully unmethylated genes and one represented fully methylated. Subsequently, the absolute value of the difference in mean β values was calculated, termed A. T-tests were then employed to identify the differentially methylated CpGs and identified at the threshold of P < 0.05 and A>0.01.
In order to decrease the number of nonvariable sites, further filtering steps were performed. All sites with methylation scores were >50; the differential expression p value was <0.001, and the absolute value of >3 remained. Only CpGs termed A with a value of ≥17 (and corrected p value <0.05) were applied. is was done in order to detect substantial methylation differences.

Hierarchical Clustering Analysis.
Hierarchical clustering is an analytical tool applied to discover the closest associations that exist between gene profiles and specimens [19]. Generally, cancers have similar methylation profiles that tend to cluster together. To analyze whether these differentially methylated CpGs would segregate into two distinct clusters, unsupervised hierarchical clustering was conducted using Euclidian distance and average linkage criteria [20]. e matrix of mean β-value levels was then formed between the parathyroidoma and normal samples.

Gene Ontology (GO) Analysis of Differentially Methylated
Genes. Gene ontology (GO) was applied to analyze the main function of differentially expressed genes (DEGs), the key functional classification used by the National Center of Biotechnology Information (NCBI). In the present study, Fisher's exact test was used to ascertain the GO category; furthermore, P values were corrected using the false discovery rate (FDR) along with the Benjamini & Hochberg method [21]. Functional terms with an FDR <0.01 and gene count >10 were considered to be statistically significant.

Pathway Enrichment Analysis of Differentially Methylated
Genes. Pathway analysis was used to find significant pathways of the DEGs according to the Kyoto Encyclopedia of Genes and Genomes (KEGG) [22]. In the present study, Fisher's exact test was used to extract significant pathways, and the threshold of significance was defined by FDR. Significant pathways were selected according to the thresholds of FDR <0.01 and gene count >10.

Identification of the Ego Genes in the Methylation Correlation Network.
e Search Tool for the Retrieval of Interacting Genes (STRING) database (http://www.bork. embl-heidelberg.de/STRING) is a global resource used for analyzing gene-gene interaction [23]. In the present study, STRING was utilized to identify the correlations between the differentially methylated genes.
First, a network of all human gene interactions was obtained from the STRING database; a special subnetwork that contained differentially methylated genes and parathyroid adenoma genes (CDC73, MEN1, CCND1, RET) was then used for further analysis. e absolute value of the Pearson coefficient was calculated: interaction pairs with a p value of 0.05 defined the difference expression network [24]. Finally, the weight value of interaction in the network was calculated using the following formula (EgoNet algorithm) [25]: 2 International Journal of Endocrinology where V represents the node set in the network. e topological analysis of the differential expression network was measured by four indicators: degree, closeness, betweenness, and transitivity (a measure of the clustering coefficient). In order to identify the methylation markers, the context likelihood of relatedness (CLR) was applied to the algorithm to square weight and then to form an adjacency matrix. Genes were sorted in descending order based on the importance of genes in the netweb (g (i) � z-score).

Support Vector Machine Analysis.
e support vector machine (SVM) is a machine learning method commonly used in pattern recognition for binary classification [26]. In this study, SVM was used to test whether ego genes could completely separate the diseased group from the normal group and to further verify the feasibility of our analysis methods.
e normal group and the diseased group were together termed the total population. In accordance with the ratio of 6 : 4, the total population was randomly divided into an experimental group and a testing group. SVM C-classification was then performed using linear kernel and 5-fold cross-validation. e experimental group served as the basis for classification, and the testing group as the basis for regression [27].

Identification of Differentially Methylated Genes.
Following quality control and normalization to (i) remove probes with an SNP-CpG distance of ≤2, (ii) remove probes with a minimum allelic frequency of <0.05, and (iii) demonstrate cross-hybridization, a total of 1925786 methylated CpGs remained. A volcano plot exhibiting the distribution of the 1925786 methylated CpGs is presented in Figure 1. Among these 1925786 methylated CpGs, 192613 of them (representing 26926 genes) were differentially methylated, where the absolute value of the mean β-value was >0.05, and the P value <0.05. A total of 192340 of the CpGs were hypermethylated, and 273 CpGs were hypomethylated in the disease group.
eir P values are demonstrated in Table 1.

Hierarchical Clustering Analysis.
To further explore the changes in the methylation levels, cluster analysis was conducted. e cluster heat map is demonstrated in Figure 2. In this figure, it is observed that there were distinctive methylation patterns in the parathyroidoma and normal samples, further segregating the samples into two distinct groups.

GO Enrichment.
To determine the primary functions of the differentially methylated genes, a comprehensive gene ontology (GO) analysis was performed. As shown in Figure 3, 7 signaling pathways were significantly enriched by the differentially methylated genes, including regulation of small  GTPase mediated signal transduction, regulation of cell morphogenesis, calcium ion transmembrane transport, glutamate receptor signaling pathway, regulation of cell morphogenesis involved in differentiation, regulation of ion transmembrane transport, and ERBB signaling pathway. ese GO terms were sorted in ascending order based on FDR value. Overall, the GO terms were mainly involved in cell morphogenesis, ion transport, and GTPase signal.

KEGG Pathway Analysis.
Pathway enrichment analysis of the differentially methylated genes was conducted given the KEGG pathway database. is analysis yielded 48 pathways. Table 2 lists the top 10 pathways. ese pathways were sorted in ascending order based on FDR value. e top 3 significantly enriched pathways were cholinergic synapses, glutamatergic synapses, and the oxytocin signaling pathway. e oxytocin signaling pathway was associated with CCND1.

Ego Analysis.
Given the goal of analyzing the association between the differentially methylated genes, STRING software was used to establish the PPI network (PPIN). A network of all human gene interactions containing 787896 interaction pairs (16730 genes) was obtained from STRING. A total of the 793 interaction pairs (the target-PPI) were extracted from the PPIN. e target-PPI included differentially methylated genes as well as parathyroid adenoma susceptible genes (CDC73, MEN1, CCND1, RET). is covered 299 genes. e weight value of the interaction in the network was calculated, and genes were sorted in descending order based on topological degree. e top 30% were selected as ego genes. A total of 67 ego genes were obtained (Figure 4). e differential expression of the 67 genes ensures complete separation between the normal and parathyroid tumor populations. Genes with higher z-scores were given greater importance. e top 10 of the 67 ego genes are shown in Table 3.   e use of machine learning in medical diagnosis is gaining currency. is is mainly because the effectiveness of classification and recognition systems has significantly improved, helping medical experts in diagnosing diseases [32,33]. In this paper, C-classification was performed using linear kernel and 5-fold cross-validation. Assessment indicators were AUC (the area under receiver operating characteristic curve), accuracy (classification accuracy), MCC (Mathews' correlation coefficient), specificity (TNR, true-negative rates), and sensitivity (TPR, true-positive rates). e results of the analysis are as follows: AUC � 0.9123, accuracy � 99.29%, MCC � 0.8549, specificity � 83%, and sensitivity � 92%.
us, the methods used in this study could completely separate the diseased group from the normal one.

Discussion
Analysis of DNA methylation data has been widely used to identify abnormally methylated genes associated with tumors. It has also enabled the extraction of targets for therapeutic strategies. In this study, the pathogenesis of parathyroid tumors was analyzed by means of bioinformatics.
is included the detection of differentially methylated genes, gene ontology (GO), pathway enrichment, PPI construction, and ego genes identification. Potential mechanisms of parathyroid tumors have been thus revealed, providing novel insights into parathyroid diagnosis and therapy.
In general, the genetic background of parathyroid tumors is characterized by the loss of two oncosuppressor genes. First, a loss of menin, the protein encoded by the MEN1 gene, takes place.
is occurs in the parathyroid lesions found in multiple endocrine neoplasia type 1 (MEN1), [34]. Second is a loss of parafibromin, encoded by the CDC73/HRPT2 gene.
Our model aimed at finding the diseased genes by means of the following underlying assumption: if the majority of neighbors of a central disease gene are also disease genes, then its other neighbors are likely to be involved in the disease pathway.
is model's benefit is that it can find hidden genes that would otherwise show no significance by themselves but are nevertheless clustered in a subnetwork module. It is in this subnetwork where collectively the genes are highly predictive of disease status. Furthermore, machine-learning techniques can be used to assess the association between ego-networks and clinical outcomes.
Based on pathway analyses, pathways of cholinergic synapses, glutamatergic synapses, and oxytocin signaling were considered the most important. e oxytocin signaling pathway in particular was associated with CCND1. On the other hand, cholinergic and glutamatergic synapses were closely related to the regulation of nerve excitement and inhibition. In GO analysis, the differentially methylated genes were enriched in 7 GO terms, which were mainly involved in cell morphogenesis, ion transport, and GTPase signal. Recent studies suggest that there is a cross-regulation between morphogenesis and EMT processes. Aberrant activation of these coregulators can drive different stages of cancer progression, including tumor invasion, cell spread, and metastatic colonization and growth [45] e ion (such as Ca 2+ , K + , Fe 3+ , and Cl − ) and channel proteins play important roles in tumor progression. ese membrane ion transport systems have great potential as diagnostic biomarkers and therapeutic targets in tumor treatment [46][47][48]. GTPase-related signaling pathways are widely abnormally activated in tumors and are involved in the regulation of tumor cell growth, metastasis, and drug resistance [49]. However, there are few studies on the relationship between these three cell functions and the tumorigenesis of parathyroid tumors, suggesting that further research in this area should be conducted in the future.
Following the implementation of the PPIN and ego-net, 67 ego genes were found, which completely separates normal from parathyroid tumor populations. e top 10 of the 67 ego genes were POLA1, FAM155 B, AMMECR1, THOC2, CCND1, CLDN11, IDS, TST, RBPJ, and GNA11. At present, there have been studies conducted on CCND1, but little research on the other genes. It is reported that AMMECR1 is associated with growth, bone, and heart alterations; CLDN11 is an epigenetic biomarker for malignancy, and THOC2 mutations are related to X-linked intellectual disability [50][51][52]. POLA1 is the catalytic subunit of DNA polymerase and plays an essential role in the initiation of DNA replication [53]. FAM155 B is predicted to be a multipass membrane protein with unknown function. IDS is involved in the lysosomal degradation of heparan sulfate and dermatan sulfate [54]. TST is mainly localized to mitochondria and catalyzes the conversion of thiosulfate and cyanide to thiocyanate and sulfite [55]. As molecular switches in regulating the notch response, transcription factor RBPJ involves in the regulation of the progression of a variety of tumors, such as colorectal cancer [56], lung cancer [57],  [58], glioblastoma [59,60], and prostate cancer [61]. As a member of the G protein family, GNA11 plays an important role in tumor progression [62].
Research shows that CCND1 and POLA1 are potential targets of miR-206 and may share a common regulatory network [63]. e interrelationship between these genes has not been studied. Global promoter methylation by CCND1 often displays differentially expressed DNA hypomethylation, especially when compared to normal tissue [64]. Different CpG sites located near promoter regions of more than 14,000 predicted genes were screened. Both CCND1 and CCND2 have been previously reported as deregulated in many cancers [65,66]. Furthermore, they represent an alternative marker to study epigenetic modifications [66]. e reversibility of these changes makes them prime targets for therapeutic manipulations. A number of small molecules targeting chromatin-based mechanisms are currently being tested in clinical trials [67]. However, the results of this study were based on a limited clinical sample, which limited the accuracy of the data. e role of the above biomarkers still needed to be verified with more samples. In addition, we did not analyze the difference of MEN1 in adenomas, carcinomas, or sporadic tumors due to the limitation of sample size. Importantly, our conclusions still require experimental confirmation, not just bioinformatics data analysis. e expression of MEN1 in subgroups of parathyroid tumors is still worthy of further analysis.
In conclusion, we have evaluated the performance of EgoNet in human protein-protein interaction networks.
is method has not only successfully identified CCND1 from significant ego-networks but has also detected several novel targets, such as POLA1, FAM155 B, AMMECR1, and THOC2. We expect that EgoNet can be widely used to infer novel biomarkers for phenotypic prediction of many human diseases.

MEN1:
Multiple endocrine neoplasia type 1 CpG: Cytosine phosphate guanine GEO: Gene expression omnibus SNP: Single-nucleotide polymorphism MAF: Minor allele frequency GO: Gene ontology DEGs: Differentially expressed genes NCBI: National Center of Biotechnology Information FDR: False discovery rate KEGG: Kyoto Encyclopedia of Genes and Genomes STRING: Search Tool for the Retrieval of Interacting Genes CLR: Context likelihood of relatedness SVM: Support vector machine GO: Gene ontology PPIN: PPI network AUC: e area under receiver operating characteristic curve Accuracy: Classification accuracy MCC: Mathews' correlation coefficient TNR: True-negative rates TPR: True-positive rates HPT-JT: Hyperparathyroidism-jaw tumors.
Data Availability e data supporting the conclusions of this paper are included within the manuscript.

Ethical Approval
is article does not contain any studies with human participants or animals performed by any of the authors.

Conflicts of Interest
e authors declare that they have no conflicts of interest.

Authors' Contributions
Qing Li and Yonghao Li contributed equally to this research.