A Novel Cuproptosis-Associated Gene Signature to Predict Prognosis in Patients with Pancreatic Cancer

Background Pancreatic cancer (PAAD) is a malignant tumor with a poor prognosis and lacks sensitive biomarkers for diagnosis and targeted therapy. Cuproptosis, a recently proposed form of cell death based on cellular copper ion concentration, plays a key role in cancer biology. This study is aimed at constructing a risk model for predicting the prognosis of PAAD patients based on cuproptosis-related genes. Methods Pancreatic-related data from UCSC-TCGA and UCSC-GTEx databases were extracted for analysis, and TCGA-PAAD samples were randomly divided into the training and validation groups. Pearson correlation analysis was used to obtain cuproptosis-related genes coexpressed with 19 copper death genes. Univariate Cox and Lasso regression analyses were used to obtain cuproptosis-related prognostic genes. Multivariate Cox regression analysis was used to construct the final prognostic risk model. The risk score curve, Kaplan-Meier survival curves, and ROC curve were used to evaluate the predictive ability of the Cox risk model. Finally, the functional annotation of the risk model was obtained through enrichment analysis. Results The Cox risk model has an eight prognostic cuproptosis-related gene signature. Kaplan-Meier survival curves demonstrated that the high-risk group had a shorter survival time. The ROC curve of the risk score was well created to predict one-, three-, and five-year survival rates, and AUC of the risk score was higher than other clinical characteristics. Cox regression analysis revealed that the risk score has an independent prognostic value for PAAD. GSEA reveals specific tumor pathways associated with the risk model (Myc targets v1, mTORC1 signaling, and E2F targets). Conclusions We constructed a prognostic model containing eight cuproptosis-related genes (AKR1B10, KLHL29, PROM2, PIP5K1C, KIF18B, AMIGO2, MRPL3, and PI4KB) that can accurately predict the prognosis of PAAD patients. The results will provide new perspectives for individualized outcome prediction and new therapy development for PAAD patients.


Introduction
Pancreatic adenocarcinoma (PAAD) exhibits a poor prognosis in digestive system tumors because of its insidious onset, rapid development, and tendency to metastasize [1]. Due to the lack of effective diagnostic methods and nontypical symptoms, patients are usually diagnosed in a locally advanced or metastatic stage. Over 80% of patients miss the chance of surgical treatment when diagnosed [2,3]. At present, early diagnosis, prognosis prediction, and potential immunotherapy targets for PAAD patients are key issues in clinical practice.
Copper is a basic element that is critical to the health of organisms [4]. Normally, the concentration of intracellular copper ions is kept at a low level, and when copper ions gradually accumulate beyond the threshold, reactive oxygen species will be generated, and cell death will be induced. Copper is closely related to cancer occurrence. In 1975, Schwartz proposed that various trace elements, including copper, could be used to diagnose or predict cancer [5]. Copper is involved in the growth of cancer cells and can promote neovascularization and metastasis, leading to cancer occurrence and development [6]. In 2022, researchers proposed a different mechanism for regulating cell death from known mechanisms, including apoptosis, ferroptosis, and necrosis [7]. This independent copper-related cell death mechanism is called cuproptosis. Cuproptosis occurs through the direct binding of copper to the ribosylated portions of the tricarboxylic acid cycle (TAC), leading to aggregation of ribosylated proteins and instability of Fe-S cluster proteins resulting in proteotoxic stress and cell death. The association between PAAD and copper was strong. The serum copper level is significantly higher in PAAD patients than in normal patients, and higher levels of copper are associated with a greater risk of being diagnosed with PAAD [8]. However, the relationship between copper metabolismrelated genes and the prognosis of patients with PAAD remains unclear. This study used bioinformatics methods to identify eight cuproptosis-related genes, and a prognostic model was established in PAAD patients. Based on the risk score, we aimed to better determine the prognosis of PAAD patients and better understand the intrinsic relationship between PAAD and cuproptosis.

Data Download.
The Cancer Genome Atlas (TCGA) database contains RNA-seq data from 178 pancreatic and four normal pancreatic tissue samples [9]. The GTEx database contains RNA-seq data from 167 normal pancreas tissue samples [9]. The University of California Santa Cruz (UCSC) database collates and standardizes data from TCGA and GTEx databases [10]. All transcripts per kilobase million (TPM) data were obtained in the UCSC database on 1 April 2022. The Wilcox test, suitable for large sample size and TPM data, was used to analyze differences in gene expression between the pancreatic cancer tumor group and the normal group. Differentially expressed genes (DEGs) were defined as the absolute value of Log Fold Change ð LogFCÞ > 1 and p value < 0.05. Meanwhile, clinical information for all samples was downloaded, including age, sex, TNM stage, AJCC stage, overall survival (OS) time, progression-free survival (PFS) time, and disease-specific survival (DSS) time. Excluding samples with incomplete data, 178 patients were included.  [7,11,12]. The cuproptosis-related genes coexpressed with cuproptosis genes were obtained based on TCGA-PAAD samples. Pearson's correlation analysis was performed between 19 genes and DEGs in samples to identify cuproptosis-related genes according to the correlation coefficient > 0:5 and p value < 0.05.

Construction of the Cox Risk Model.
With R package "caret," 178 cases were randomly assigned to the training (n = 88) and validation (n = 90) groups. The chi-squared test compared the clinical characteristics of patients in the train-ing group and the validation group, aiming to evaluate the heterogeneity of the two sets of data. The data of the training group were used to construct the Cox risk model of cuproptosis-related genes. The cuproptosis-related genes were subjected to univariate Cox regression analysis based on OS, and the genes with prognostic significance were screened out (p < 0:01). To avoid overfitting the model, Least Absolute Shrinkage and Selection Operator (Lasso) regression analysis was performed on potential prognostic genes. When the cross-validation error of Lasso regression is the smallest, the list of prognostic genes corresponding to the best penalty parameter is obtained. Lasso prognostic genes were subjected to multivariate Cox regression analysis to determine the final risk model. The risk score of each patient was calculated by the following formula: risk score = β mRNA1 × ExpressionmRNA1 + βmRNA2 × ExpressionmRNA2 + ⋯ + βmRNAn × ExpressionmRNAn.

Prediction
Capacity of the Cox Risk Model. Using the median risk score value of the training group, patients in the training and validation groups were divided into the high-and low-risk groups. A risk score curve was drawn to visually display each patient's risk score and disease outcome. Kaplan-Meier survival curves were utilized to assess differences in OS, PFS, and DSS between patients in the high-and low-risk groups. The receiver operating characteristic (ROC) curve and the area under ROC curve (AUC) were employed to assess the predictive power of the risk score and each clinical characteristic. Finally, principal component analysis (PCA) was utilized to reduce the dimensions of four sets of genes (DEGs, cuproptosis genes, cuproptosisrelated genes, and risk model genes) in the high-and lowrisk groups, aiming to visualize the distinguishing ability of risk scores.

Functional Enrichment Analysis of Risk Scores.
Differential analysis was performed on the high-risk group (n = 89) and the low-risk group (n = 89) of the entire cohort, and genes with absolute values of LogFC > 1 and p < 0:05 were defined as risk differential genes. Gene ontology (GO) can analyze biological processes (BP), cellular components (CC), and molecular functions (MF) of gene sets. The Kyoto encyclopedia of genes and genomes (KEGG) analysis is a systematic analysis of cellular pathways involved in genomes. GO and KEGG enrichment analyses were performed using "ClusterProfiler" and "ggplot2" packages in R (Version 3.6.3). Significant enrichment was considered eligible if p < 0:05. Gene set enrichment analysis (GSEA) can evaluate the regulation of pathways in the high-and lowrisk groups. For GSEA in the entire cohort, GSEA software (GSEA 4.1.0) and h.all.v7.5.symbols were employed. A GMT dataset in MsigDB was used as the control group. The number of permutations was set to 1000. The screening criteria for significant enrichment pathways were false discovery rates ðFDRÞ < 0:25 and absolute values of normalized enrichment score ðNESÞ > 1. The nominal p value (NOM p value) was <0.05.
2.6. External Dataset Validation of the Cox Risk Model. GEO microarray expression data with survival information were screened for external validation of the Cox risk model [13]. The GSE62452 dataset (69 PAAD samples) and the GSE28735 dataset (45 PAAD samples) were included for analysis. The probe file GPL6244 was downloaded for the conversion of gene IDs [14,15]. The SVA package was used to remove batch effects after merging the GSE62452 and GSE28735 datasets [16]. In total, the survival time and survival status of 107 PAAD cases were used to assess the prognostic value of the Cox risk model. The Survminer package was used to identify the optimal cutoff value, and the samples were divided into the high-and low-risk groups based on the cutoff value.     BioMed Research International 2.7. Differential Expression Analysis of Risk Genes. We further confirmed the differential expression of 8 risk genes at the cellular level by qRT-PCR. PAAD cell lines (ASPC-1, SW1990, BXPC3, and PANC-1) and human pancreatic cell lines were obtained from the Shanghai Institute of Nutrition and Health (Shanghai, China). RNA was isolated from samples using TRIzol, after which the PrimeScript RT Reagent Kit was employed to prepare cDNA. A StepOne Real-Time PCR Instrument (Applied Biosystems, NY, USA) was used for all qRT-PCR analyses. Relative gene expression was analyzed using the 2-ΔΔCt method, with GAPDH being employed for normalization purposes. The primers used in this study are shown in Table 1, and the analyses were repeated three times.

Statistical
Analysis. Statistical analysis was performed using R (Version 3.6.3). Cox regression was used to evaluate each gene for the prognosis of PAAD patients by calculating the hazard ratio (HR) and its 95% confidence interval (CI). Log-rank tests were used to evaluate OS, DSS, and PFS of the three groups (entire, training, and validation). The Pearson correlation test was used to analyze correlations. The ggplot2 R package was used to visualize all data. Statistical significance was set at p < 0:05. According to R package "caret," 178 samples were randomly assigned to the training (n = 88) and validation (n = 90) groups. The clinical characteristics of 178 PAAD patients are detailed in Table 2, without statistical differences in the clinical characteristics between the training and validation groups.

Establishment of the Cox Risk Model.
A correlation analysis between 19 cuproptosis genes and 7978 DEGs was conducted. A total of 5252 genes with a correlation coefficient > 0:5 and p < 0:05 were identified as cuproptosis-related genes ( Figure 2(c), Supplementary Table 2). All cuproptosis-related genes were selected for univariate Cox regression analysis, and 202 genes with prognostic values were obtained (Supplementary Table 3).
The following Lasso regression analysis revealed that the best penalty parameter corresponds to 22 prognostic genes (Figures 2(d) and 2(e), Table 3). Finally, a multivariate Cox regression analysis revealed an eight prognostic cuproptosis-related gene signature for PAAD patients (Figure 2(f)). The correlation between the eight gene signatures and 19 cuproptosis genes is presented in Figure 2(g). Based on the eight-gene signature, we calculated the following risk scores: 0:205 * ExpressionAKR1B10 − 0:994 * ExpressionKLHL29 + 0:272 * ExpressionPROM2 − 1:192 * ExpressionPIP5K1C + 1:169 * ExpressionKIF18B + 0:498 * ExpressionAMIGO2 + 1:439 * ExpressionMRPL3 − 2:012 * ExpressionPI4KB: 3.3. Validation of the Cox Risk Model. All patients were classified into either the high-score group or the low-score group according to the training group median risk score (riskScore = 1:121), and risk scores and risk groups for all patients are listed in Supplementary Table 4. The distribution patterns of risk scores and outcome status    Figure 5(b)), and the validation group (AUC = 0:656, Figure 5(c)) to obtain consistent results. In addition, a time-dependent survival ROC curve of the risk score was well created to predict one-, three-, and five-year OS rates (Figures 5(d)-5(f)).
3.6. GSEA. GSEA enrichment analysis was performed based on the high-and low-risk groups of the entire cohort samples, and eight signaling pathways that met the screening criteria were obtained (Figures 7(d)-7(k)). Among them, seven signaling pathways were activated in the high-risk group, including Myc targets v1, mTORC1 signaling, glycolysis, E2F targets, G2M checkpoint, Myc targets v2, and estrogen response late, while one signaling pathway was activated in the low-risk group: Kras signaling. We predicted that the difference between the high-and low-risk groups is related to pancreatic cancer pathogenesis.

External Validation of the Cox Risk Model and Risk
Genes. Survival information and risk gene expression data of 107 PAAD samples were obtained by combining the   Table 6. The results of the univariate Cox regression analysis showed that AKR1B10, KIF18B, and risk score were influential factors for survival in PAAD patients in the GEO dataset (Figure 8(a)). Further results of survival analysis were consistent with previous results, with patients in the low-risk group having better survival times than the high-risk group (Figure 8(b)). Finally, we assessed the expression of 8 risk genes at the cellular level, and the results showed that AKR1B10 and PROM2 were highly expressed in PAAD cell lines, but KLHL29 was highly expressed in human pancreatic cell lines. These results suggest that risk genes have the potential as biomarkers for PAAD (Figure 8(c)).

Discussion
Copper-dependent controlled cell death in human cells is a novel cell death mechanism different from the known cell death mechanism. It occurs through the direct binding of copper ions with lipacylated components of TAC in mitochondrial respiration. Cuproptosis leads to the aggregation of ribosylated proteins and subsequent downregulation of Fe-S cluster proteins, resulting in toxic protein stress and cell death [6]. The five-year survival rate of PAAD is only 5%. Due to difficult diagnosis and poor prognosis, research in PAAD in recent years has focused on finding new biomark-ers to improve prognosis [17]. Cell death-related genes are potential biomarkers for the diagnosis and prognosis of PAAD. ADP ribosylation factor 6 (a ferroptosis-related gene) functions downstream of the Kras/ERK signaling pathway and can promote proliferation and the Warburg effect in PAAD cells [18,19]. The expression of circular RNA ATG7 (an autophagy-related circular RNA) was positively correlated with tumor diameter and lymph node infiltration. Overexpression of ATG7 promoted the proliferation, migration, and autophagy of PAAD cells, indicating that ATG7 may be a potential therapeutic target for PAAD [20]. In addition, with the development of multiomics data, a large number of genetic features and risk models provide new insights for tumor diagnosis and prognosis prediction. Among them, a number of cell death-based disease models have shown considerable clinical value, such as autophagy, ferroptosis, and pyroptosis [21,22]. Considering the complexity of tumor biology, predictive models constructed from multiple genetic signatures are more accurate and reliable than single pathological features or single biomarkers.
The current study constructed an eight-cuproptosisrelated gene signature to predict OS, DSS, and PFS in PAAD patients. First, PAAD patients were randomly assigned to the training and validation groups. Based on the training group, prognostic cuproptosis-related genes were confirmed using Lasso regression and Cox regression models. According to the risk score, patients were categorized into the highand low-risk groups. The prognoses between the high-and low-risk groups were statistically significant, and AUC suggests the prediction ability of our risk score signature. More

11
BioMed Research International importantly, by comparing ROC curves of patient survival, we found that the risk score based on cuproptosis-related genes had more accurate predictive power than clinicopathological indicators as follow-up years increased. The model was validated in the validation and entire groups. In addition, the model was externally validated using the GEO database, and the results showed similar predictive power. We refer to other cell death-based risk models for comparison with the cuproptosis risk model. The AUC values of the risk model in our study led to the predictive performance of overall survival at 1, 3, and 5 years, reaching 0.75, 0.84, and 0.86, respectively. In other cell death-related studies, the AUC values for overall survival at 1, 3, and 5 years were 0.537, 0.731, and 0.852 for autophagy [21]; 0.665, 0.738, and 0.871 for ferroptosis [23]; and 0.643, 0.705, and 0.807 for pyroptosis [22], respectively. Our risk model exhibits more robust and accurate predictive performance than models of other forms of cell death.
Our risk score included eight cuproptosis-related genes: AKR1B10, KLHL29, PROM2, PIP5K1C, KIF18B, AMIGO2, MRPL, and PI4KB. As a carcinogenic protein, AKR1B10 promotes tumor occurrence and development by enhancing fat production and is involved in pancreatic carcinogenesis via modulating the Kras-E-cadherin pathway [24,25]. PROM2, a transmembrane glycoprotein, is upregulated in PAAD cells, and higher PROM2 expression is associated with a poor prognosis in PAAD patients [26]. PIP5K1C is a lipid kinase that regulates adhesion dynamics and cell attachment through the site-specific formation of phosphatidylinositol-4,5-diphosphate. A study disclosed that PIP5K1C is a negative regulator of cell migration and invasion and that the phosphorylation status of PIP5K1C may serve as an indicator of adenocarcinoma invasion [27]. KIF18B is highly expressed in human PAAD tissues and is associated with poor prognosis and clinical characteristics of PAAD patients, such as tumor size and TNM stage. In PAAD cells, KIF18B could bind to the cell division cyclerelated promoter region 8, thereby activating its transcription [28]. The interaction of extracellular matrix receptors causes cancer metastasis, and AMIGO2 promotes the adhesion of tumor cells to endothelial cells, accelerating this process [29].
We performed an enrichment analysis based on the risk scores of pancreatic cancer patients. GSEA revealed that various signaling pathways involved in cancer occurrence and development are enhanced in the high-risk group. The Myc oncoprotein family is involved in regulating metabolic reprogramming and providing sufficient energy for cancer cell proliferation [30]. Aerobic glycolysis is a metabolic modality exhibited in various tumors, and there is evidence that mTORC1 signaling regulates aerobic glycolysis by upregulating hypoxia-inducible factor-(HIF-) 1α [31]. mTORC1 signaling is also involved in promoting lipid metabolism and nucleotide synthesis in tumor cells [32]. The E2F signaling pathway is involved in the transcriptional machinery driving cell cycle progression, and E2F activity is abnormally increased in various tumors, leading to uncontrolled proliferation [33]. Similarly, the enrichment analysis of DEGs between the high-and low-risk groups involves tumor-related mechanisms. Studies have demonstrated that  PI3K/AKT pathway inactivation in PAAD cells can inhibit mutant p53, thus inducing S-phase arrest and apoptosis of PAAD cells [34]. In addition, there is a close relationship between extracellular matrix (ECM) accumulation and cancer progression, and pancreatic stellate cells play a role in this process. Interactions between extracellular matrix receptors alter the tumor microenvironment in PAAD, accelerating its progression, including angiogenesis, invasion, and metastasis [35].
This study has limitations. First, some incomplete information was deleted, which led to a certain degree of information deviation. Next, the concept of cuproptosis was only recently proposed, and sufficient literature cannot be obtained to clarify the possible mechanism of an eight-gene signature involved in cuproptosis. Multicenter, large-scale clinical trials of prediction models require further validation.

Conclusion
In conclusion, we constructed a prognostic model containing eight cuproptosis-related genes and analyzed their MF in PAAD. Our study highlights that the risk score we constructed can be used as a marker of outcome in PAAD patients. Our findings provide new insights into the role of cuproptosis-related genes in PAAD pathogenesis and as potential biomarkers for PAAD diagnosis and treatment.

Data Availability
The raw data presented in this study can be found in online repositories: UCSC, https://xena.ucsc.edu. Data processing results are available in the article and supplementary file.

Ethical Approval
Data are sourced from publicly available databases, so no additional ethical approval is required.

Conflicts of Interest
All authors declare that there is no conflict of interest.