Development of a KRAS-Associated Metabolic Risk Model for Prognostic Prediction in Pancreatic Cancer

Background KRAS was reported to affect some metabolic genes and promote metabolic reprogramming in solid tumors. However, there was no comprehensive analysis to explore KRAS-associated metabolic signature or risk model for pancreatic cancer (PC). Methods In the current study, multiple bioinformatics analyses were used to identify differentially expressed metabolic genes based on KRAS mutation status in PC. Then, we developed and validated a prognostic risk model based on the selected KRAS-associated metabolic genes. Besides, we explored the association between the risk model and the metabolic characteristics as well as gemcitabine-associated chemoresistance in PC. Results 6 KRAS-associated metabolic genes (i.e., CYP2S1, GPX3, FTCD, ENPP2, UGT1A10, and XDH) were selected and enrolled to establish a prognostic risk model. The prognostic model had a high C-index of 0.733 for overall survival (OS) in TCGA pancreatic cancer database. The area under the curve (AUC) values of 1- and 3-year survival were both greater than 0.70. Then, the risk model was validated in two GEO datasets and also presented a satisfactory discrimination and calibration performance. Further, we found that the expression of some KRAS-driven glycolysis-associated genes (PKM, GLUT1, HK2, and LDHA) and gemcitabine-associated chemoresistance genes (i.e., CDA and RMM2) was significantly upregulated in high-risk PC patients evaluated by the risk model. Conclusions We constructed a risk model based on 6 KRAS-associated metabolic genes, which predicted patients' survival with high accuracy and reflected tumor metabolic characteristics and gemcitabine-associated chemoresistance in PC.


Introduction
Pancreatic cancer (PC) is one of the most malignant cancers, which leads to 4.5% of all cancer-related deaths globally [1].
The vital causes for the poor prognosis are the highly aggressive phenotype and early cancer recurrence and metastasis following surgical treatment [2]. Despite some advances in the management of PC in recent years, very few major breakthroughs for effective biomarkers nor treatment strategies have emerged.
Almost all PC patients carry at least one of the four frequently mutated driver genes, which include oncogene KRAS and the tumor suppressors TP53, SMAD4, and CDKN2A [3]. KRAS, the most frequently mutated oncogene in cancer especially in PC, was reported to rewire metabolism to support tumor growth [4]. Several studies showed that KRAS promoted metabolic reprogramming through the enhancement of glucose metabolism, differential channeling of glucose intermediates, reprogramming glutamine metabolism, and increasing autophagy and macropinocytosis [5,6]. The metabolic reprogramming and fibrotic stroma of PC had been considered as a barrier for cytotoxic drug delivery to cancer cells, therefore contributing to chemoresistance and treatment failure [7,8]. Although some KRAS-associated genes were found to regulate cancer cell metabolism, the mechanism still remained to be clarified and there was no comprehensive analysis to explore KRAS-associated metabolic signature or risk model for PC.
In the current study, multiple bioinformatics analyses were used to identify differentially expressed metabolic genes based on KRAS mutation status in PC. Then, a prognostic model was constructed based on the selected KRAS-associated metabolic genes. Moreover, we explored the association between the risk model and the metabolic characteristics and gemcitabine-associated chemoresistance in PC.

Datasets and Data
Acquisition. The gene expression data were recorded based on fragments per kilobase per million (FPKM), and clinical information for 178 PC samples was obtained from The Cancer Genome Atlas (TCGA, https:// portal.gdc.cancer.gov/repository) database up to September 2020. The KRAS mutation data of TCGA PC cohort were downloaded from cBioPortal (http://www.cbioportal.org/) [9]. Among 178 PC samples, 167 PC samples with RNA-sequencing data and KRAS mutation information were subjected to subsequent analysis. The GSE57495 dataset based on GPL9115 (including 63 PC samples) and the GSE79668 dataset based on GPL11154 (including 51 PC samples) were downloaded from the Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/) database for validation. Besides, two other GEO datasets, the GSE140077 dataset based on GPL20795 (including 1 PC cell line and 6 samples) and the GSE106336 dataset based on GPL18573 (including 1 PC cell line and 6 samples), were used for further study in gemcitabine resistance of PC. A total of 1724 metabolic genes were obtained from the metabolic pathway-related gene lists of "c2.cp.kegg.v7.1.symbols.gmt" in gene set enrichment analysis (GSEA, https://www.gsea-msigdb.org/gsea/index .jsp) [10]. Datasets mentioned above for PC were publicly obtainable, and ethics approval was not needed. One flow chart is displayed in Figure 1 to summarize the process of this study.    KRAS  TP53  MAMLD1  MAGEC1  CDKN2A  SMAD4  IRS1  FAMA7C  MAML3  PRG4  ATRX  FOXP2  TMC4  ZNF347  CD99L2  IPP  OTUD4  SORBS2  EDC4  RUNX2  OTOF  DIDO1   5 BioMed Research International system, the GSE57495 and GSE79668 datasets were analyzed. Patients in separate datasets were divided into a high-risk group and a low-risk group using the median cutoff of the risk score. To evaluate the performance of model, the R package "pROC" was used to evaluate the discrimination of the risk model in TCGA and GEO datasets through area under the curve (AUC) of the receiver operating characteristic (ROC) curve of 1-and 3-year survival. With the help of R package "rms," Harrell's concordance index (C-index) of the risk score system was calculated by a bootstrap approach with 1000 resamples. The calibration curves were used to assess the consistency between modelpredicted and observed survival. Then, the log-rank tests and Kaplan-Meier (KM) analyses were performed using the survival R package between the high-risk and low-risk groups to assess the predictive ability of the prognostic model. We calculated the risk scores in different groups based on clinicopathological parameters (such as American Joint Committee on Cancer (AJCC) stage, histologic grade, and diabetes status) in TCGA PC cohort by using Wilcoxon's test to evaluate whether the risk model reflects PC progression.

Functional Enrichment Analysis.
Gene set enrichment analysis (GSEA, Version3.0, http://software.broadinstitude .org/gsea/) was performed to determine how the metabolic pathways and relevant metabolic pathway-related genes differed between PC samples of KRAS WT and KRAS MUT in TCGA cohort. An annotated gene set file (c2.cp.kegg.v7.1.symbols.gmt) was used as the reference gene set, and a gene set was considered as an enriched group when the nominal p value <0.05. DEGs and coexpressed genes of PAGs screened from the cBioPortal database were integrated to DAVID 6.7 (https://david-d.ncifcrf.gov/) separately to   7 BioMed Research International perform Gene Ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis [11]. Results were visualized using R package "ggplot2."

Association between the Risk Model and Metabolic
Characteristics of PC. Several KRAS-driven metabolic targets which took part in metabolic reprogramming of PC were selected to explore relationships between the risk model and metabolic characteristics of PC [12]. The mRNA expressions levels of these promising KRAS-driven metabolic genes were compared in high-risk and low-risk groups by using Student's t-test.

Association between the Risk Model and Gemcitabine
Chemoresistance of PC. In order to evaluate whether the risk model reflects gemcitabine metabolism-associated chemoresistance of PC, metabolic risk scores were calculated in 2 parental and GEM-resistant cell lines (CFPAC-1 and HPAFII) in GSE140077 and GSE106336 datasets. Then, several gemcitabine metabolism-associated chemoresistance genes of PC were selected to explore relationships between the risk model and gemcitabine chemoresistance of PC [13]. The mRNA expressions levels of these promising KRAS-driven metabolic genes were compared in high-risk and low-risk groups using Wilcoxon's test.

Statistical
Analyses. All statistical analyses were performed using R software Version 4.0.1 (https://www.rproject.org/) and SPSS software Version 24.0 (SPSS, Inc., Chicago, IL, USA). X-tile (New Haven, CT, United States) was used to define the best cutoff values for outcome-based optimization. A p value <0.05 was considered statistically significant unless otherwise specified.

Association between Metabolic Phenotype and KRAS Mutations in PC.
It is shown in the FireBrowse Database (http://www.firebrowse.org/) that KRAS mutation was the most common type of mutation in PC, followed by mutation frequency of TP53, MAMLD1, MAGEC1, CDKN2A, and SMAD4 (Figure 2(a)). Next, we utilized mRNA expression data and corresponding clinical information of PC samples in TCGA and cBioPortal to investigate metabolic processes linked to KRAS mutation status. GSEA was conducted to determine the difference of metabolic pathways between 58 PC samples of KRAS WT and 109 PC samples of KRAS MUT. The results showed that 4 metabolic biological processes were significantly enriched in the KRAS MUT group, which were glycine serine and threonine metabolism (NES = 1:63, size = 31, and p value <0.05), tryptophan metabolism (NES = 1:48, size = 40, and p value <0.05), type   b)). Then, these 54 DEGs were integrated to DAVID 6.7 to perform GO analysis and KEGG pathway analysis. GO analysis showed that the top 10 highly enriched functions in the metabolic process were "oxidation reduction, carboxylic acid catabolic process, hormone metabolic process, alcohol catabolic process, glucose metabolic process, lipid catabolic process, cellular amino acid catabolic process, vitamin metabolic process, steroid metabolic process, and glutamine family amino acid metabolic process" (Figure 3(c)). In KEGG analysis (Figure 3(d)), DEGs were found to be mainly enriched in "adipocytokine signaling pathway, retinol metabolism, metabolism of xenobiotics by cytochrome P450, type II diabetes mellitus, drug metabolism, endocytosis, and fatty acid metabolism."

Construction and Validation of a KRAS-Associated
Metabolic Risk Model. We performed univariate Cox proportional regression and found that 21 out of 54 DEGs were significantly related to OS and considered as prognosis-associated DEGs (PAGs) (Figure 4(a)). Further, with the help of the R package "survival," a multivariate Cox regression analysis for these 21 PAGs was performed and it revealed that 6 of them were independently associated with OS (Figure 4(b)). Therefore, a 6-PAG-based KRAS-associated metabolic risk model was developed by weighting the normalized expression of these 6 PAGs multiplied by corresponding coefficients derived from the multivariate Cox regression analysis: risk score = ð−0:29328 * normalized expression of CYP2S1Þ + ð−0:45614 * normalized expression of GPX3Þ + ð−0:57665 * normalized expression of FTCDÞ + ð0:54899 * normalized expression of ENPP2Þ + ð 0:19472 * normalized expression of UGT1A10Þ + ð0:34727 * normalized expression of XDHÞ (Table 1). Patients were divided into high-risk and low-risk groups by using the median cutoff value of the risk score (Figure 3(c)). The clinicopathological characteristics of patients in different risk groups were shown in Table 2.
The predictive accuracy of the risk model was further assessed through the ROC and C-index analysis. The results showed the AUC of the risk model for OS in TCGA cohort was 0.773 at 1 year and 0.704 at 3 years ( Figure 5(a)). Besides, C-index of the risk model in TCGA cohort was 0.733 (95% CI, 0.675-0.761). The calibration curves of the risk model matched well, which indicated that it could accurately predict the 1-and 3-year OS in TCGA cohort ( Figure 5(b)). GSE57495 and GSE79668 datasets were used for the external validation for the risk model. The AUC for 1-and 3-year OS in GSE57495 datasets was 0.627 and 0.698 ( Figure 5(a)), while the AUC for 1-and 3-year OS is 0.727 and 0.617 in GSE79668 datasets ( Figure 5(e)). The C-index of the risk model in GSE57495 and GSE79668 datasets was 0.683 (95% CI, 0.655-0.723) and 0.625 (95% CI,

Association between the Risk Model and Patients' Survival and Clinicopathological Characteristics in PC.
To further evaluate the prognostic power of the risk model, Kaplan-Meier analyses were performed and we found that all patients in the high-risk group had a shorter OS (p value = 4.234e-07) and disease-free survival (DFS) (p value = 1.83e -06) than those in the low-risk group in TCGA cohort (Figures 6(a) and 6(b)). Besides, similar results for Kaplan-Meier analyses of OS were observed in the GSE57495 dataset (p value = 8.637e-04) and GSE79668 dataset (p value = 1.833e-02) (Figures 6(c) and 6(d)). Further, we calculated the risk scores in different groups based on clinicopathological characteristics in TCGA PC cohort, and the results showed that patients who had advanced stage, higher histologic grade, or diabetes history had higher risk scores (p value <0.05) (Figure 6(e)). The data in Figures 4 and 5 suggested the KRAS-associated metabolic risk model had effective value in predicting patients' survival and was associated with advanced tumor characteristics.

Association between the Risk Model and Metabolic
Characteristics of PC. We next explored the activities of 6 PAGs (i.e., CYP2S1, GPX3, FTCD, ENPP2, UGT1A10, and XDH) by analyzing its potential metabolic pathways in PC. The coexpression analyses for 6 PAGs were performed by using the cBioPortal dataset (Spearman ' s correlated coefficient > 0:6 or <−0.6, p value <0.05). We found 131 coexpression genes for CYP2S1, 163 coexpression genes for ENPP2, 387 coexpression genes for FTCD, 521 coexpression genes for GPX3, 189 coexpression genes for UGT1A10, and 213 coexpression genes for XDH, all of which were enrolled into DAVID 6.7 and subjected to functional and pathway enrichment analyses. The potential metabolic pathways involved are shown in Figure 7(a). CYP2S1 and its neighboring genes were mainly enriched in "oxidation-reduction process, lipid metabolic process, protein glycosylation, lysophospholipase activity, steroid metabolic process, regulation of glucuronidation, and response to insulin stimulus." GPX3 may act a vital role in "lipid metabolic process, glutamate receptor activity, and regulation of insulin secretion." FTCD may play an important role in "glucose metabolic process and regulation of insulin secretion." ENPP2 may be involved in "L-amino acid transport, glutamate receptor activity, regulation of insulin secretion, and response to insulin stimulus." UGT1A10 was mainly associated with "lipid metabolic process, steroid metabolic process, regulation of glucuronidation, CoA succinyl transferase activity, and endocytosis," and XDH was involved in "protein glycosylation, phagocytosis, and endocytosis." To further explore the potential metabolic targets influenced by our risk model, we compared the expression levels of several KRAS-driven metabolic genes between high-risk and low-risk groups in TCGA cohort. We found higher mRNA expressions of PKM (p = 1:86e − 05), GLUT1 (p = 3:168e − 05), HK2 (p = 3:056e − 04), LDHA (p = 2:989e − 06), and VDR (p = 0:012) in the high-risk group (Figure 7(b)).

Association between the Risk Model and Gemcitabine
Chemoresistance in PC. We found that 5 of 6 PAGs were enriched in "drug metabolism or response to drug" in functional and pathway enrichment analyses (Figure 8(a)). In order to explore their relationship with gemcitabineassociated chemoresistance in PC, we calculated metabolic risk scores in 2 parental gemcitabine-resistant cell lines (CFPAC-1 and HPAFII) in GSE140077 and GSE106336

16
BioMed Research International datasets (Figure 8(b)). The results revealed that the gemcitabine-resistant group had significantly higher metabolic risk scores than the parental group (p value<0.001).
To investigate how the risk model leads to gemcitabine resistance, we compared the expression of previous reported genes regulating gemcitabine drug metabolism and efficacy (i.e., CDA, DCK, hENT1, hCNT1, RMM1, and RMM2) between high-risk and low-risk groups in TCGA cohort. The results showed that the mRNA levels of CDA (p value = 0.001) and RMM2 (p value = 2.517e-12) were significantly upregulated in the high-risk group (Figure 8(c)).

Discussion
As the most frequently mutated oncogene in PC, KRAS and its downstream pathways affect several cellular processes including cell proliferation, migration, metabolism, and autophagy in PC [14]. Small molecule drug (Sotorasib) that specifically and irreversibly inhibits KRAS had passed phase 1 clinical trial, showing a promising therapeutic prospect [15]. Studies have revealed oncogenic KRAS rewired metabolism to favor a more anabolic state and therefore promoted tumorigenesis and progression of PC. However, the molecular mechanisms were still not clear, and it is of great significance to conduct a profound study on KRAS-associated metabolic genes in PC.
Recent studies have offered ample insight into cancer metabolic landscapes to define PC. Gao et al. identified prognostic signatures and novel PC subtypes based on cancer metabolism using weighted gene coexpression network analysis (WGCNA) [16]. They also found higher simple nucleotide variant (SNV) frequencies of KRAS in the high-risk group. In the current study, we found out 6 differentially expressed metabolic genes based on KRAS mutation status, including CYP2S1, GPX3, FTCD, ENPP2, UGT1A10, and XDH. Then, we developed a significant KRAS-associated metabolic risk model for prognostic prediction in PC. Further, the risk model was validated in two external cohorts (GSE57495 and GSE79668 datasets), which showed a stably high prognostic value for PC. These results indicated our model was accurate in predicting patients' survival, which may be helpful in designing personalized therapy targeting metabolism reprogramming. In the risk model, high expression of CYP2S1, GPX3, and FTCD was correlated with better prognoses, while high expressions of ENPP2, UGT1A10, and XDH were correlated with poor prognoses in PC. As a member of cytochrome P450 (CYPs), CYP2S1 was known to biotransform some exogenous and endogenous compounds including drugs, fatty acids, and cholesterols [17]. Study showed that expressions of 8 CYPs were changed in CYP2S1-depleted cells, and all of them were involved in lipid biotransformation. Indeed, CYP2S1 affected the metabolism

17
BioMed Research International of arachidonic acid (AA) and linoleic acid (LA) to modulate BEAS-2B cell growth [18]. In consistent with their results, our data also found CYP2S1 was enriched in lipid metabolic process. It will be interesting to investigate the detailed mechanism by which CYP2S1 regulates lipid metabolism in PC in future study. As a member of glutathione peroxidase family (GPX), GPX3 reduces glutathione to catalyze the reduction of hydrogen peroxide, hydroperoxides, and lipid hydroperoxides [19]. Low GPX activity was found asso-ciated with enhanced lipid peroxidation in metastatic cancers, indicating that loss of GPX3 may promote systemic oxidative stress [20]. Studies had reported that low expression of GPX3 was associated with poor prognosis and chemoresistance in several tumor types [19]. Our results also proved that it is associated with a good outcome and may involve in "lipid metabolic process and glutamate receptor activity" in PC. ENPP2, which encodes an ectolysophospholipase D called autotaxin (ATX), was found  (c) The expression levels of several gemcitabine metabolism-associated chemoresistance genes between high-risk and low-risk groups in TCGA cohort. Higher expressions of CDA (p value = 0.001) and RMM2 (p value = 2.517e-12) were found in the high-risk group. 18 BioMed Research International significantly increased in several types of cancer including PC [21][22][23]. Overwhelming evidences revealed that the ATX/lysophosphatidate (LPA) signaling axis served key roles in energy metabolism regulation and obesity control, dysregulation of which could cause inflammation and tumorigenesis [24,25]. Studies have indicated that ATX/LPA axis promoted DNA synthesis, proliferation, and invasion in pancreatic cancer cells via ERK1/2 and Rho pathways [26,27]. Overexpression of ATX was also proven to lead elevated tumorigenesis and invasiveness compared with control groups in RASmutated NIH3T3 murine fibroblasts [28]. Further studies are needed to explore the relationship and underlying regulation mechanism of KRAS mutation and ENPP2/ATX in PC tumorigenesis and metabolism reprogramming. UGT1A10 was a extrahepatic phase II metabolizing enzyme that expressed highly in numerous target areas for tobacco-induced cancers, including the upper aerodigestive tract [29]. UGT1A10 was elevated in a CPT-11/SN-38-resistant cell line and responsible for SN-38 glucuronidation, which was one of the mechanisms associated with irinotecan hydrochloride/7-ethyl-10-hydroaxycamptothecin (CPT-11/SN-38) resistance in lung cancer [30]. FTCD was found significantly downregulated in hepatocellular carcinoma (HCC) and served as a diagnostic biomarker for HCC [31]. Recently, FTCD was found to be associated with the sensitivity of chemotherapeutic drug methotrexate by using a CRISPR-Cas9-based screen [32]. XDH, a rate-limiting enzyme to catalyze the final steps of purine metabolism, was found significantly decreased and served as a useful predictor of poor prognoses in several cancer types [33]. Uric acid, which was transformed by XDH, was found to modulate tumor cell sensitivity to the antimetabolite 5-FU, one of the most commonly used anticancer drugs in the clinic [34]. Our study consistently suggested that UGT1A10, FTCD, and XDH were involved in "drug metabolism or response to drug" in PC, but their specific roles in metabolic reprogramming and chemoresistance in PC still remain to be studied.
Gemcitabine-based chemotherapy is a major treatment for PC patients with or without surgical resection. It has been reported gemcitabine drug metabolism was affected by some genes or metabolites, which included drug transporters (i.e., hEN1 and hCNT1), activating and inactivating enzymes (i.e., CDA and DCK), and competitive substrates to active metabolites (i.e., RRM1 and RRM2) [13]. Cytidine deaminase (CDA) induced the deamination of dFdC to dFdU, leading to inactivation of gemcitabine [35]. It had been confirmed that CDA expression was correlated with OS in PC, and several in vitro studies revealed that overexpression of CDA led to gemcitabine resistance, while loss of CDA recovered gemcitabine sensitivity [36,37]. RRM2 was involved in the activity of Ribonucleotide Reductase (RR), which was a rate-limiting enzyme of DNA synthesis. It has been demonstrated that upregulation of RRM2 led to gemcitabine chemoresistance in PC cells and human PC xenografts in mice [38]. Besides, expression level of RRM2 was correlated inversely with OS in gemcitabine-treated PC patients in clinical study [39]. In the current study, CDA and RRM2 were upregulated in the high-risk group evaluated by the risk model. Taken these together, the risk model may reflect the possibility of gemcitabine resistance, which would help oncologist choose appropriate chemotherapy regimen for PC.
There are a few limitations of our study. First, our study is mainly based on bioinformatics analysis, and more experimental studies are needed to investigate how KRAS may regulate cancer cell metabolism in PC. Second, the established model needs to be further validated in prospective clinical studies.
In conclusion, we constructed and validated a prognostic model based on the KRAS-associated metabolic genes in PC. The prognostic model reflects tumor metabolic characteristics and gemcitabine-associated chemoresistance, which may have important value in aiding personalized therapy.