DNA Methylation Modification Regulator-Mediated Molecular Clusters and Tumor Metabolic Characterization in Prostate Cancer

Background An increasing number of studies have indicated a close link between DNA methylation and tumor metabolism. However, the overall influence of DNA methylation on tumor metabolic characteristics in prostate cancer (PCa) remains unclear. Methods We first explored the subtypes of DNA methylation modification regulators and tumor metabolic features of 1,205 PCa samples using clustering analysis and gene set variation analysis based on the mRNA levels of DNA methylation modification regulators. A DNA methylation-related score (DMS) was calculated using principal component analysis and the DNA methylation modification-related gene signatures to quantify DNA methylation characteristics. We then performed a meta-analysis to identify the hazard ratio of DMS in the six cohorts. In addition, a nomogram was drawn using univariate and multivariate Cox analyses based on the DMS and clinical variables. Finally, a drug sensitivity analysis of the DMS was performed based on the genomics of drug sensitivity in cancer datasets. Results Three PCa clusters showing different DNA methylation modification patterns and tumor metabolic features were identified. A DMS system was established to quantify the characteristics of DNA methylation modification. PCa samples showed a differential metabolic landscape between the high and low DMS groups. The prognostic value of the DMS and nomogram was independently validated in multiple cohorts. A high DMS was associated with increases in the tumor mutation burden, copy number variation, and microsatellite instability; high tumor heterogeneity; and poor prognosis. Finally, DMS was closely related to different types of antitumor treatment. Conclusion Improving the understanding of tumor metabolism by characterizing DNA methylation modification patterns and using the DMS may help clinicians predict prognosis and aid in more personalized antitumor therapy strategies for PCa.


Introduction
DNA methylation is one of the earliest discovered and most widespread epigenetic modifications in cells [1]. Among the major forms of DNA methylation in humans is 5 mC, which occurs when DNA within CpG nucleotides is methylated at the fifth carbon atom of cytosine residues [2]. Prostate cancer (PCa) is the most common cancer in men [3]. Although major efforts have been devoted to improving the treatment of PCa, effective individualized therapeutic strategies require further analysis [4]. Several genes with essential roles in the initiation and progression of PCa have been reported to be regulated by promoter hypermethylation or hypomethylation. For example, cyclin-dependent kinase inhibitor 2A (CDKN24) is a suppressor gene that encodes the protein p16 and affects the cell cycle and hypermethylation in PCa [5,6]. Decoy receptors 1 and 2, which are associated with cell apoptosis, maintain an abnormal DNA methylation status and promote the progression of PCa [7].
Cell development and multiplication are driven by energy metabolism. Based on the infinite proliferation ability of tumor cells, the initiation and growth of tumors are closely associated with the transformation of cell metabolic states [8]. Recent studies showed that the mechanism of PCa emergence is related to tumor metabolism, including citric acid and choline metabolism [9]. Some studies indicated that tumor metabolism associated with the androgen receptor (AR) leads to the occurrence and castration resistance of PCa [10,11]. us, it may be possible to prevent the transformation to PCa by inhibiting these metabolic pathways. Studies are needed to explore tumor heterogeneity and the mechanism underlying metabolism in PCa.
Recent studies revealed that many key genes involved in tumor metabolism can be regulated by DNA methylation to promote tumor progression [12]. For example, DNA methylation can inhibit the expression of MCT1 and impact glycolysis to render cells vulnerable to MCT4 inhibition [13]. However, the overall impact of all DNA methylation regulators on tumor metabolism in PCa is unclear.
In this study, we collected 1,205 patients from six independent PCa cohorts and divided them into training and validation cohorts. We then performed unsupervised clustering based on DNA methylation modification (DMM) regulators and identified three DMM patterns in the training cohort. Because of the impact of upstream gene regulation based on the genomic context, we selected biomarkers that regulate DNA methylation rather than those that perform DNA methylation. Using single-sample gene set enrichment analysis (ssGSEA) based on metabolic pathways, we identified three DMM patterns and DMM-related gene clusters with different tumor metabolic characteristics and prognoses. Finally, we constructed a DNA methylation-related score (DMS) system to quantify DNA methylation characteristics.
rough independent validation and metaanalysis, the DMS and nomogram based on the DMS can efficiently predict disease-free survival (DFS) in patients with PCa.

Collection of PCa Datasets.
e workflow for this study is shown in Figure S1A. We collected data on PCa samples from the cBioPortal, Gene Expression Omnibus (GEO), and e Cancer Genome Atlas (TCGA) databases. First, we calculated the average mRNA expression data in multiple samples from the same patients. For prognostic analysis, we excluded patients without DFS information. We also collected six PCa cohorts (MSKCC, DKFZ, GSE116918, GSE54460, GSE70768, and TCGA) containing a total of 1,205 patients for calculation and analysis. For the TCGA prostate adenocarcinoma (PRAD) cohort, RNA sequencing data (fragments per kilobase transcript per million mapped reads [FPKM] values), clinical information, and DNA methylation data (450k methylation microarrays) were obtained from the Genomic Data Commons Data Portal (https://portal.gdc.cancer.gov). e MSKCC and DKFZ datasets were downloaded from cBioPortal for Cancer Genomics (http://www.cbioportal.org/); GSE54460, GSE70768, and GSE116918 were obtained from GEO (https://www.ncbi.nlm.nih.gov/geo/). e FPKM values were transformed into transcripts per kilobase million (TPM) values and log2(n+1) which was more consistent with the microarray data. Somatic mutation and copy number variation data of TCGA PRAD cohort were obtained from the University of California Santa Cruz (UCSC) Xena browser (https://xenabrowser.net). To validate our results in an independent cohort, we first combined the GSE54460, GSE70768, GSE116918, MSKCC, and DKFZ cohorts into one group which was designated as a meta (training) cohort to explore the characteristics of DNA methylation subtypes and construct the DMS. Using the R package sva, the "ComBat" algorithm was employed to remove batch effects (Figures S1B and S1C) [14]. We then used the TCGA PRAD cohort as the testing cohort to evaluate the prognostic value of the DMS and nomogram. All clinical information of the six PCa cohorts is presented in Table S1.

Unsupervised Clustering Analysis.
To extract the characteristics of mRNA expression data and verify the DMM patterns and DMM-related gene clusters, we performed unsupervised clustering methods based on 20 DMM regulators and DMM-related genes in the meta cohort. e Partitioning Around Medoidclustering algorithm was applied and repeated 1,000 times using the R package Con-sensusClusterPlus to ensure the stability of subtype analysis [15].

Differentially Expressed Gene Analysis.
Differentially expressed genes (DEGs) between three DMM patterns with adjusted P value <0.001 were selected using the R package limma as DMM-related genes for further analysis [18].

Construction of DMS and Nomogram.
We performed further statistical analysis of the prognostic DMM-related genes by performing univariate Cox regression analysis on 2259 DEGs, and 337 DMM-related genes associated with DFS were selected for further analysis. To quantify the characteristics of DNA methylation modification, we performed principal component analysis (PCA) of the meta cohort to generate a DMS system. PC1 and PC2 were extracted to calculate the DMS as shown in the equation below. e advantage of this approach is that it concentrates the score on the largest set of highly correlated (or unrelated) gene blocks in the set while downweighting the contribution of genes that are not tracked by other set members [19,20].
where i is the expression of 337 prognostic DMM-related genes. e correlation between 2 Journal of Oncology clinical variates, DMM regulators, DMS, and DFS of patients with PCa was analyzed by performing univariate analysis. e prognostic model and nomogram were constructed using multivariate Cox regression analysis. Kaplan-Meier (K-M) survival curves were used for prognostic analysis, and log-rank tests were performed to calculate P values. To test the precision of the risk model and nomogram, time-dependent receiver operating characteristic (ROC) analysis was performed using the R package survival ROC 1.0.3. An area under the ROC curve >0.60 indicates that the prediction ability of the model was meaningful, whereas a value of >0.75 indicates an outstanding predictive value of the model. e C-index was calculated using the R package pec. Decision curve analysis was performed using the R package ggDCA.

Gene Ontology and Kyoto Encyclopedia of Genes and
Genomes Functional Enrichment Analysis. DEGs between the high and low DMS groups were ranked according to their logFC values. e top 1,000 DEGs were selected for subsequent analyses. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis were performed to identify enriched GO and KEGG pathways of selected genes using the R package clusterProfiler with a cutoff of P value < 0.05 [21].

Statistical
Analysis. Unpaired Student's t-test was used to compare two groups with normally distributed variables whereas the Mann-Whitney U test was used to compare two groups with nonnormally distributed variables. To compare the three groups, one-way analysis and Kruskal-Wallis tests of variance were used as parametric and nonparametric methods, respectively. Contingency table variables were analyzed using the chi-square test or Fisher's exact test. A combined analysis of the hazard ratio of DMS between the six cohorts was performed by meta-analysis (fixed-effect model). Fisher's exact test was used to calculate the difference for contingency table variables, and the correlation coefficient of two variables was obtained using Spearman's correlation analysis. Statistical significance was defined as a two-tailed P value < 0.05. All statistical analyses were performed using R software (version 3.6.3; e R Project for Statistical Computing, Vienna, Austria).

Prognosis Value of DMM Regulators.
We next examined the relationship between DMM regulators and DFS through prognostic analysis of PCa. First, we collected data from 1,205 patients with DFS and other clinical information from six independent cohorts (Table S1). We then divided these patients into a training cohort (meta cohort) to determine the characteristics of the DNA methylation and a testing cohort (TCGA PRAD cohort) to validate our conclusion. Univariate Cox regression and K-M survival analysis were used to analyze 20 DMM regulators in the meta and TCGA PRAD cohort, which showed that DNMT3B, UHRF1, UNG, DNMT1, DNMT3A, MBD4, ZBTB38, MBD3, SMUG1, UHRF2, TDG, NTHL1, TET1, ZBTB4, and ZBTB33 were closely related to the DFS of patients with PCa (log-rank test: P < 0.05; Table S4). Figures 1(f ) and S2B present the correlation and prognostic value of these regulators in a network plot.

Features of DMM-Related Gene Clusters and Establishment of DMS.
To identify the heterogeneity and biomarkers of each pattern, we detected DEGs among DMM patterns through differential analysis. A total of 2,259 DNA DMMrelated genes were selected from the meta cohort (P < 0.001; Figure 3(a) and Table S6). Univariate Cox regression analysis based on the meta-cohort confirmed that 337 DMM-related genes were associated with the DFS of patients with PCa (Table S7; univariate cox: P < 0.001). We also detected three DMM-related gene clusters based on the expression characteristics of these 337 genes using unsupervised cluster analysis ( Figure  3(b) and Figures S5A-S5H).
Moreover, we compared the DFS of patients with PCa from different DMM-related gene clusters using K-M survival analysis. Patients in DMM-related gene cluster C had a longer DFS and better prognosis than those in DMM-related gene clusters A and B (log-rank test, P < 0.001; Figure 3(c)). We also found significant differences in the expression abundance of regulators, and 35/ 41 of the active levels of these pathways were significantly different between the three clusters (Wilcoxon test: P < 0.05; Figures 3(d) and 3(e)). e metabolic heat map of the PCa samples showed that each gene cluster had unique metabolic characteristics ( Figure S6A). DMM-related gene cluster A showed elevated metabolic pathways, including pyrimidine, fructose and mannose, glycine, serine and threonine, tyrosine, glycerophospholipid, arachidonic acid, linoleic acid, alpha-linolenic acid, porphyrin and chlorophyll, sulfur, and other enzymes (drug) and exhibited the worst prognosis. erefore, these pathways may be regulated by DNA methylation and promote PCa progression. In contrast, gene cluster C, with high active levels of fatty acid, ascorbate and aldarate, cysteine and methionine, histidine, tryptophan, beta-alanine, selenoamino acid, glutathione, starch and sucrose, pyruvate, propanoate, butanoate, cytochrome p450 (xenobiotics), and cytochrome p450 (drug) metabolic pathways showed the best prognosis, suggesting that these pathways suppress PCa progression. In addition, DMM-related gene cluster B is an intermediate metabolic subtype between DMM-related gene clusters A and C and had a medium prognosis. ese findings demonstrate that DNA methylation is the source of tumor heterogeneity, leading to tumor metabolism disorders.
To identify a quantitative biomarker of DNA methylation modification in PCa samples, we used the PCA algorithm to calculate the DMS, which is the value of PCA2 minus PCA1 from prognostic DMM-related genes (Figure 4(a) and Table S8). Differential analysis of DMS between different DMM patterns and DMM-related gene clusters indicated that DMS can describe the DNA methylation characteristics in PCa samples (Wilcoxon test: P < 0.05, Figures 4(b) and 4(c)). e distribution of patients in the meta cohort in the three patterns and three DMMrelated gene clusters is shown in Figure 4(d). Finally, the GO and KEGG gene function enrichment analysis of DEGs between the high and low DMS groups divided by the median DMS (DMS � 106.32) in the meta cohort also indicated that DMS is related to carbon, pyruvate, propanoate, cysteine and methionine, and amino sugar and nucleotide sugar metabolisms (P < 0.05; Figure 4(e) and Table S9).

Relationship between DMS, DMM Regulators, DNA Methylation Level, and Metabolic Status of PCa Samples.
To explore the relationship between DMS and DMM regulators, we calculated the coefficient between the DMS and DMM regulators and compared the expression levels of these regulators between the high and low DMS groups (Figures 4(f ) and 4(g)).
We also obtained TCGA PRAD 450 K data. First, we calculated the DMS of TCGA PRAD cohort samples and divided these samples into high and low DMS groups based on the median DMS (DMS � 72.10). Unsupervised clustering (single-center method) was performed based on the 500 genes showing the most variable DNA methylation levels in turn based on DNA methylation characteristics. Remarkably, the heat map indicated that the characteristics of DNA methylation were closely associated with the DMS ( Figure S6B). We then compared the average DNA methylation levels of the top 500 genes between the high and low DMS groups and the coefficient between the average DNA methylation level and the DMS. e low DMS group had a higher DNA methylation level than the high DMS group, confirming that the DMS accurately reflected the DNA methylation status of PCa samples (Wilcoxon test: P < 0.05; Spearman correlation: P < 0.05; Figures S6C  and S6D).
is result was further examined by performing risk stratification analysis between the DMS and clinical factors, including age, Gleason score, and T stage, in both the meta and TCGA PRAD cohorts ( Figures S8 and S9). e results confirmed that DMS is an independent prognostic biomarker with clinical factors and can accurately predict the DFS in patients with PCa. To improve the predictive value of DMS, we selected clinical variables with independent prognostic value to obtain a nomogram through univariate and multivariate Cox analysis in the meta cohort (P < 0.05; Figure 6(f ) and Table S11). We used C-index calculation, ROC analysis, and decision curve analysis to assess the clinical significance of DMS, clinical variate, and nomogram. Superior results were observed for the DMS as a more accurate and reliable prognostic biomarker compared to clinical variates, whereas the nomogram had a better net benefit than the clinical variate or DMS-only models (Figures 6(g) and S10A-S10F). Furthermore, to independently validate the predictive value of the nomogram, we calculated the total points of each sample in the TCGA PRAD cohort and performed ROC and K-M survival curve analysis. e nomogram was found to predict the DFS of patients with PCa in both the meta and TCGA PRAD cohorts (Figures 6(h)-6(i), and Figures S10G and S10H).

Relationship between DMS, Tumor Mutation Burden, Microsatellite Instability, and Tumor Microenvironment.
In previous studies, the tumor mutation burden (TMB) has been reported to reflect tumor heterogeneity [30]. Because nuclear excision repair showed functional enrichment for the DMS and to further examine the relationship between PCa tumor heterogeneity and DMS, we compared the TMB between the high and low DMS groups in the TCGA PRAD cohort. We found that tumors in the high DMS group had higher mutation counts than those in the low DMS group (Wilcoxon test: P < 0.001; Figure 7(a)). is further indicated that tumors in the high DMS group had greater heterogeneity than those in the low DMS group. We explored the relationship between the TMB, DMS, and DFS in prognosis by comparing the DFS between patients with a high and low TMB and found that patients with a high TMB had a shorter DFS than those with a low TMB (log-rank test: P < 0.001, Figure 7(b)). We further compared the DFS of high and low DMS groups in the high and low TMB groups. In the high or low TMB group, patients in the high DMS group had a shorter DFS than those in the low DMS group, suggesting that DMS is an independent prognostic factor for the TMB (logrank test: P < 0.001; Figure 7(c)). We also generated a gene mutation landscape using a waterfall plot of the top 20 genes and found that the mutation rate of SPOP was higher and that of TTN was lower in the high DMS group (P < 0.05; Figure 7(d) and Table 1). Based on these results, DMS was clearly associated with DNA mutation in SPOP and TTN.
Copy number variations (CNVs) occurring upstream of genes regulate gene expression and influence tumor occurrence and development [31]. To explore whether this DNA element influences the DMS, we analyzed the number of amplifications and deletions as CNVs in DNA methylation regulator pattern-related prognostic genes. e number of CNV amplifications and deletions was higher in the high DMS group than in the low DMS group (Wilcoxon test: P < 0.001; Figures 7(e) and 7(f )), indicating that CNV is an important factor leading to DNA methylation disorders. Microsatellite instability (MSI) is also a crucial indicator of genome instability associated with tumor heterogeneity [32]. We obtained the MSI data of each PRAD sample from a previous study and performed correlation analysis [33]. e results suggest that the high DMS group had a higher level of MSI compared to the low DMS group (Wilcoxon test: P < 0.001; Figure 7(g)), suggesting that MSI also interferes with DNA methylation. e tumor microenvironment, including both stromal and immune cells, can impact the occurrence and progression of PCa. To explore the characteristics of PCa in the high/low DMS groups, we first calculated ssGSEA scores for EMT, ECM, and TGF-β using the corresponding gene sets (Table S2). Differential analysis of these ssGSEA scores indicated that high DMS tumors had higher ECM and lower TGF-β scores than low DMS tumors ( Figure S11A). We then performed correlation analysis between immunologic ssGSEA scores and DMS, which revealed that the DMS was closely related to several immune pathways ( Figure S11B). erefore, DNA methylation may also be a crucial factor in immune disorders related to PCa.

Predictive Value of DMS in Antitumor erapy.
For metastatic and recurrent prostate cancer, androgen deprivation therapy is the main treatment strategy and can significantly improve prognosis [34]. Based on this information, we first analyzed the relationship between the DMS and drug sensitivity of bicalutamide via meta-analysis. e results suggested a positive correlation between the DMS and half maximal inhibitory concentration (IC50) of bicalutamide, and tumor samples in the high DMS group had a higher IC50 on for bicalutamide compared to the low DMS group (Wilcoxon test, P < 0.001; Figures 8(a) and  8(b)). We then predicted the drug sensitivity to cisplatin, docetaxel, doxorubicin, and paclitaxel, which are common antitumor drugs used for PCa. e results shown in Figures 8(a) and 8(b) indicate that the DMS had predictive value for these drugs.
Immune therapy is the most widely used treatment for many cancers, and its effect is associated with the expression  of checkpoint genes [35]. Although PCa is an immune desert tumor, some studies have suggested the usefulness of immunotherapy for PCa [36,37]. erefore, we compared the expression levels of checkpoint genes between the two groups. e high DMS group showed higher expression levels of CD276, CD4, IL1A, LAG3A, PDCD1, TGFB1, CTLA4, CD8A, PRF1, and TNF and lower expression levels of CD274, CXCR4, CXCL10, and CXCL9 compared to the low DMS group (Wilcoxon test: P < 0.05; Figure S11C). e high expression level of checkpoint genes may explain the worse prognosis in the high DMS group, and patients in the high DMS group may benefit more from immune therapy than those in the low DMS group.

Discussion
With the development of molecular biology and oncology, increasing evidence has shown that DNA methylation can significantly influence tumor metabolism and cancer progression [38]. However, the overall spectrum of the regulatory patterns of DNA methylation and its influence on tumor metabolism in PCa remain unclear. It has been shown that the link between different DMM patterns and tumor metabolic status in PCa can promote the recognition of tumor pathology and be used to improve the efficiency of antitumor treatment.
In this study, we identified three DMM patterns with unique metabolic characteristics via clustering analysis of DMM regulator expression. PCa tissues showed significantly different metabolic statuses between DNA methylation regulator patterns. In addition, significantly different prognoses were observed between the three patterns.
e DEGs among these three DMM patterns were identified as DMM-related genes and may be directly or indirectly regulated by DNA methylation events. Similar to the DMM patterns, we found that three DMM-related gene clusters correlated with alterations in tumor metabolism were identified by these DMM-related genes. is is indicative of three different metabolic clusters in PCa. As pattern A and gene cluster A had the shortest DFS, the metabolic pathways activated in pattern A and gene cluster A, which are fruc-tose_and_mannose, glycine_serine_and_threonine, tyrosine, glycerophospholipid, arachidonic_acid, linoleic_acid, alpha_linolenic_acid, porphyrin_and_chlorophyll, sulfur, and other_enzymes_(drug), may promote the progression of PCa, by altering metabolic pathways. Additionally, pattern C and gene cluster C were associated with the longest DFS and best prognosis, and their activated metabolic pathways fat-ty_acid, ascorbate_and_aldarate, cysteine_and_methionine, histidine, tryptophan, beta_alanine, selenoamino_acid, glutathione, starch_alanine, pyruvate, propanoate, butanoate, and cytochrome_p450(drug) may suppress PCa progression.
Considering the heterogeneity of DNA methylation modifications individually, it is necessary to quantify the DNA methylation modification spectrum of each sample. erefore, we constructed a DMS system and validated it in multiple PCa cohorts. To confirm that the DMS is associated with DNA methylation characteristics, we conducted unsupervised clustering using DNA methylation levels of the top 500 genes with the greatest variation in PCa in the TCGA PRAD cohort. Unexpectedly, there was a close correlation between the DNA methylation status and DMS level (high/ low DMS group). Taken together, these findings reveal an association between the transcriptome and DNA methylome and support that the DMS can represent the characteristics of DNA methylation.
Moreover, to validate the prognostic value of DMS, we performed hazard stratification and multivariate analysis between the DMS, TMB, Gleason score, and clinical factors in multiple PCa datasets. e results suggested that DMS is an independent prognostic factor and can help improve prognosis prediction in patients with PCa. Finally, our results have clinical implications, as we established a DMS system that can describe DNA methylation characteristics. e prognostic value of the DMS system and nomogram based on clinical variables and DMS was validated in multiple PCa cohorts. e DMS can also be used to evaluate clinicopathological characteristics, such as the TMB, CNV, MSI status, and SPOP and TTN mutation in patients with PCa. In addition, DMS can predict the efficacy of castration, androgen deprivation therapy, and chemotherapy.
is study had some limitations. First, the reaction rate of the PCa antitumor therapy cohort was not accessible. e predictive value of the DMS for antitumor therapy in PCa must be further verified in further studies. Second, because of batch effects, the DMS cannot be directly compared between different cohorts without batch correction.

Conclusions
In summary, we explored three different metabolic subtypes based on the characteristics of DNA methylation and established a DMS system for PCa. DMS, which is associated with the tumor metabolic status, TMB, CNV, and MSI, is an efficient index for predicting DFS and therapeutic responsiveness and may help facilitate personalized antitumor therapy for patients with PCa.  Figure 11: the relationship between DMS and TME. (A) Boxplot of ssGSEA score of EMT, ECM, and TGF-β for low and high DMS groups in the meta cohort. (B) e heat map of the correlation between DMS and ssGSEA scores of immune pathways in meta cohort. (C) Boxplot of ssGSEA score of immune pathways for low and high DMS groups in the meta cohort. * P < 0.05, * * P < 0.01, and * * * P < 0.001. Supplemental Table 1: clinical information of PCa patients from six cohorts. Supplemental Table 2: gene sets analyzed in the article. Supplemental Table 3: correlation analysis between DNA methylation and mRNA expression of 20 DMM regulators. Supplemental Table 4: the univariate cox analysis of 20 DMM regulators in meta cohort and TCGA cohort. Supplemental Table 5: differential analysis of ssGSEA of KEGG pathways for three DMM patterns. Supplemental Table 6: DEGs between three DMM patterns. Supplemental Table 7: the univariate cox analysis of 337 prognostic DMMrelated gene expression in meta cohort. Supplemental Table 8: PCA and PCA coefficient to calculated DMS. Supplemental Table 9: GO/KEGG analysis of DEGs between low and high DMS groups in meta cohort. Supplemental