Identification and Validation of a Prognostic Model Based on Three Autophagy-Related Genes in Hepatocellular Carcinoma

Background Accumulating studies have demonstrated that autophagy plays an important role in hepatocellular carcinoma (HCC). We aimed to construct a prognostic model based on autophagy-related genes (ARGs) to predict the survival of HCC patients. Methods Differentially expressed ARGs were identified based on the expression data from The Cancer Genome Atlas and ARGs of the Human Autophagy Database. Univariate Cox regression analysis was used to identify the prognosis-related ARGs. Multivariate Cox regression analysis was performed to construct the prognostic model. Receiver operating characteristic (ROC), Kaplan-Meier curve, and multivariate Cox regression analyses were performed to test the prognostic value of the model. The prognostic value of the model was further confirmed by an independent data cohort obtained from the International Cancer Genome Consortium (ICGC) database. Results A total of 34 prognosis-related ARGs were selected from 62 differentially expressed ARGs identified in HCC compared with noncancer tissues. After analysis, a novel prognostic model based on ARGs (PRKCD, BIRC5, and ATIC) was constructed. The risk score divided patients into high- or low-risk groups, which had significantly different survival rates. Multivariate Cox analysis indicated that the risk score was an independent risk factor for survival of HCC after adjusting for other conventional clinical parameters. ROC analysis showed that the predictive value of this model was better than that of other conventional clinical parameters. Moreover, the prognostic value of the model was further confirmed in an independent cohort from ICGC patients. Conclusion The prognosis-related ARGs could provide new perspectives on HCC, and the model should be helpful for predicting the prognosis of HCC patients.


Introduction
Hepatocellular carcinoma (HCC) is the fifth most common cancer worldwide and the third most common cause of cancer mortality [1]. The main cause of HCC is nonalcoholic fatty liver disease (NAFLD) and hepatitis virus infection including chronic hepatitis B virus (HBV) infection and chronic hepatitis C virus (HCV) infection [2]. In America, although the 5-year survival rate of HCC patients increased by 90% between 2006 and 2012 compared to 1992-1999, the overall prognosis of HCC remains poor [3]. Therefore, establishing an effective prognostic model can provide new guidance for clinical management. However, conventional clinical parameters used to predict clinical outcome often have some limitations given the heterogeneity of HCC [4]. Therefore, the establishment of a novel effective prognostic model must be based on the molecular heterogeneity of HCC.
Autophagy is a lysosomal degradation pathway for unwanted, damaged, and defective intracellular components and is essential for survival, differentiation, development, and homeostasis [5]. In recent decades, accumulating evidence has demonstrated that abnormal expression of autophagy-related genes (ARGs) is involved in the development of various cancers [6,7]. However, the role of ARGs in cancer remains inconclusive. In general, at the early stage of liver cancer, autophagy protects cells from carcinogenesis. At an advanced stage, however, autophagy can promote cancer progression [8,9]. For liver cancer, dysregulated autophagy is dependent on changes in ARG expression [10]. Thus, it is of special significance to identify biomarkers from ARGs that are associated with the prognosis of patients with cancer and to calculate their influence on the prognosis model. Recent reports have shown that prognostic models based on ARGs provide better prediction of clinical outcomes for patients with cancer, such as thyroid cancer and bladder cancer [11,12]. However, studies on prognostic models based on ARGs in HCC are limited. Thus, our aims in this study were to construct a prognostic model based on ARGs to predict HCC patient survivals.
In this study, we identified the 62 differentially expressed ARGs in HCC based on The Cancer Genome Atlas (TCGA) database. Enrichment analysis of differentially expressed ARGs may help to identify the potential molecular mechanisms in autophagy. In addition, we performed univariate Cox regression analysis to identify 34 prognosis-related ARGs and conducted multivariate Cox regression analysis to select 3 key genes (PRKCD, BIRC5, and ATIC) to construct a prognostic model. Univariate and multivariate Cox regression analyses confirmed that the risk score calculated by the model formula was an independent risk factor for patient survival. Survival analysis and ROC curve analysis demonstrated that the model had good performance in predicting prognosis. Moreover, we confirmed the reliability of this model by analyzing another independent data cohort obtained from the International Cancer Genome Consortium (ICGC) database. In

Construction and Validation of the Prognostic Model.
Univariate Cox regression analysis evaluated the association between the expression of differentially expressed ARGs and patient overall survival (OS). Genes with a p value less than 0.05 were considered prognosis-related ARGs and then entered into a stepwise multivariate Cox regression analysis tested by AIC (Akaike Information Criterion, assessing the goodness of fit of a statistical model) to identify the predictive model. Then, a prognostic model was constructed, and the formula of the risk score was as follows: risk score = ðthe expression of gene 1 × regression coefficient of gene 1 Þ + ðthe expression of gene 2 × regression coefficient of gene 2 Þ + ⋯+ðthe expression of gene n × regression coefficient of gene n Þ.
Based on the risk score, HCC patients were classified into low-or high-risk groups. Survival curves were generated using the Kaplan-Meier method, and two-sided logrank tests were employed to compare the differences in OS between the low-and high-risk groups. Univariate Cox regression analysis and multivariate Cox regression analysis were conducted to explore whether the risk score could be an independent indicator of OS in the TCGA data cohort of HCC patients. The sensitivity and specificity of the prognostic model to predict the clinical outcome of HCC patients were analyzed by calculating the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. Moreover, an independent data cohort of the ICGC database was used to further confirm the predictive value of the model.

Statistical
Analysis. R 3.6.1 (https://www.r-project.org) was utilized for plot and statistical analysis. The Wilcoxon rank-sum test was used to identify differentially expressed ARGs. Univariate Cox regression analysis was performed to estimate the prognosis-related ARGs. Multivariate Cox regression analysis was used to construct the prognostic model. An independent t-test was used to analyze the associations between risk score and conventional clinical parameters. ROC analysis was performed to test the sensitivity and specificity of the model. The survival curve was plotted using the survival and survminer packages of R. The forest maps were plotted by the forestplot package of R. The survivalROC package of R was used to generate ROC curves and AUC values to calculate according to ROC curves.

Differentially Expressed Autophagy-Related
Genes between HCC and Normal Tissues. To date, HADb datasets include a total of 232 ARGs that have been described as   (Table 1). Compared with the noncancer tissue samples, 62 differentially expressed ARGs were identified in the HCC samples ( Table 2). The results of hierarchical cluster analysis showed that the HCC samples could be clearly distinguished from the normal tissues based on the differentially expressed ARGs (Figure 1(a)), and the volcano plot showed that there were 4 downregulated genes and 58 upregulated genes among these ARGs (Figure 1(b)). Furthermore, a scatter plot was generated to visualize the expression levels of differentially expressed ARGs between HCC and normal tissue (Figure 1(c)).

Identification of Prognostic-Related ARGs.
To identify ARGs associated with the clinical outcomes of patients, univariate Cox regression analysis was applied with the criterion of a p value < 0.05. As shown in Figure 4, 34 genes were selected and significantly associated with the overall survival (OS) of HCC patients. The hazard ratio (HR) of each gene was calculated, and all 34 genes had an HR > 1, indicating that all of these genes are risk factors for OS in HCC patients.

Construction and Validation of the ARG-Based
Prognostic Model. Based on the prognosis-related ARGs, multivariate Cox regression analysis was performed to construct the ARG-based prognostic model. Three genes, PRKCD, BIRC5, and ATIC, were finally selected to construct the model (Table 5), and the risk score formula of To assess the performance of the prognostic model in predicting the clinical outcome of HCC patients, we calculated the risk score of each HCC patient and divided patients into high-or low-risk groups using the median risk score as the cutoff value. The survival curve indicated that the high-risk group (n = 117) suffered significantly lower 3-and 5-year survival rates than the low-risk group (n = 118) ( Figure 5). The risk score distribution, survival status of each patient, and heat map of the three gene expression profiles in the TCGA database are shown in Figure 6. The results indicated that survival time decreased and the mortality rate increased as the risk scores increased. We further analyzed the associations between the expression of the three ARGs and clinical parameters in HCC patients. The results of the independent t-test are shown in Figure 7. We observed significant correlations between overexpression of PRKCD and advanced histological grade (p = 0:001), advanced T stage (p = 0:011 ), advanced pathologic stage (p = 0:011), and female sex (p = 0:032). High BIRC5 expression was closely related to advanced T stage (p = 0:007), advanced pathologic stage (p = 0:007), and advanced histological grade (p = 3:352e − 04). Overexpression of ATIC occurred in the advanced T stage (p = 0:047), advanced pathologic stage (p = 0:047), and advanced histological grade (p = 0:002). In particular, we used the same method to analyze the correlations between the risk score calculated by the model and clinical parameters. As shown in Figures 7(k)-7(n), we found that a high-risk score was significantly related to younger patients (p = 0:047), advanced T stage (p = 0:002), advanced histological grade (p = 0:001 ), and advanced pathologic stage (p = 0:002). Then, we wanted to determine whether the risk scores or other conventional clinical parameters of HCC patients were independent risk factors for the OS of patients. In Figure 8, univariate Cox analysis showed that advanced pathologic stage, advanced T stage, advanced M stage, and risk score were risk factors for OS. However, after adjusting for clinical parameters, only the risk score remained an independent prognostic indicator for HCC patients in multivariate analyses (HR = 1:745, 95%CI = 1:420-2:145, p < 0:001). Moreover, we plotted the ROC curve to compare the predictive value of the risk score with other clinical parameters, as is shown in Figure 9.
The results indicate that the T stage (AUC = 0:708) and pathological stage (AUC = 0:703) have the highest predictive value among the conventional clinical parameters.

BioMed Research International
However, the predictive value of the risk score (AUC = 0:758) was higher than that of the T stage and pathological stage.

Confirmation of the Prognostic Model Using an
Additional Data Cohort. The International Cancer Genome Consortium (ICGC) database contains large-scale cancer genome studies in tumors from 50 different cancer types [14]. To further validate the predictive value of this model, we downloaded the data cohort (Liver Cancer-RIKEN, JP, https://dcc.icgc.org/releases/current/Projects/LIRI-JP), which contains expression profiles and clinical follow-up information from 232 HCC patients. We analyzed the correlation between the expression levels of three genes and the survival of HCC patients in both the TCGA and ICGC databases. Kaplan-Meier analysis results indicated that high expression of the three genes was significantly associated with inferior OS in HCC patients (Figure 10). Similarly, these patients were divided into low-or high-risk groups based on the risk score calculated by the prognostic model constructed based on the TCGA data. The Kaplan-Meier analysis of the two groups was significantly different (p = 4:505e − 06 < 0:001) and a similar trend was observed in TCGA (Figure 11(a)). In addition, the AUC calculated by the ROC curve was 0.739, which indicated that the model has good performance for predicting HCC patient survival (Figure 11(b)).

Discussion
The incidence of HCC is characterized by insidiousness, rapid progression, and a low early diagnosis rate. Most patients are already in advanced tumor stages when they receive treatment [15,16]. Due to the heterogeneity of liver cancer, conventional clinical parameters such as age, sex, grade, and TNM stage often do not accurately predict clinical outcomes [17]. Therefore, there is an urgent need to develop new prognostic features to facilitate the prediction of clinical outcomes in HCC patients. In recent decades, many studies have focused on identifying novel biomarkers to promote the prediction of HCC patient survival [18][19][20]. Based on the advantages of this type of research, the combination of multiple prognosis-related genes with conventional clinical parameters to construct a prognostic model may have better predictive value for HCC patients.
Autophagy is an intracellular self-digestion process that plays a fundamental role in cell homeostasis, and numerous studies have demonstrated that ARGs play important roles in tumorigenesis [21][22][23]. Therefore, identifying biomarkers from ARGs may provide new perspectives of the diagnosis or treatment for various cancers. Recently, accumulating reports have shown that using ARGs to build a prognostic model can provide better prediction of clinical outcomes for cancer patients. Lin et al. established a prognostic signature based on three ARGs in thyroid cancer [11]. Wang et al. analyzed TCGA and GEO datasets to construct and validated an autophagy-clinical prognostic model in bladder cancer [12]. Although numerous studies have confirmed the close relationship between the development of HCC and autophagy, a prognostic model based on ARGs in HCC had not been reported previously.
In this study, we used high-throughput expression profile data from TCGA to construct an ARG prognostic model and validated this model using data from TCGA and ICGC. First, we identified 62 ARGs were differentially expressed in tumor tissues. GO analysis and KEGG pathway enrichment analysis were used to explore the potential biological functions of these genes. GO enrichment revealed that these genes were mostly enriched in  TSC1  SQSTM1  PRKCD  TMEM74  MAPK3  CDKN2A  HSPB8  ATG10  BIRC5  ATIC  PARP1  SPNS1  HDAC1  MLST8  HGS  ATG4B  DRAM1  CANX  RAB24  SPHK1  HSP90AB1  WDR45B  VMP1  CAPN10  IKBKE  GAPDH  BAK1  RGS19  PEA15  CASP8  FKBP1A NPC1 RHEB Figure 4: Thirty-four differentially expressed autophagy-related genes with a prognostic value determined by univariate Cox regression. 8 BioMed Research International autophagy, a process utilizing autophagic mechanisms, macroautophagy, and autophagosomes, indicating that the differentially expressed genes mainly affect the progression of liver cancer by affecting autophagy [24,25]. KEGG pathway enrichment was mainly enriched in autophagyanimal, apoptosis, and platinum drug resistance. The results are consistent with previous studies showing that dysregulated apoptosis and platinum-based resistance are common features of many cancers, including HCC [26][27][28]. Then, univariate Cox regression analysis identified 34 genes closely related to the survival of HCC patient, and multivariate Cox regression analysis finally selected three key genes (PRKCD, BIRC5, and ATIC) to construct the prognostic model. We divided patients into high-or low-risk groups and found that the high-risk group had a lower 3-or 5-year survival rate. Then, we confirmed that the risk score calculated by the model formula was an independent prognostic indicator after adjusting for other clinical parameters. In addition, ROC curve analysis demonstrated that the risk score has a better predictive value than other clinical parameters. Furthermore, we also observed a similar trend of survival analysis and ROC curve analysis in an independent dataset from the ICGC database which further confirmed the reliability of this prognostic model. The three ARGs that we have selected to construct the prognostic model have been reported to be involved in the development of cancer in other studies. PRKCD is one of the PKC family members and is a family of serine-and threonine-specific protein kinases that can be activated  by calcium and the second messenger diacylglycerol [29]. Previous studies suggested that PRKCD can show completely opposite effects on tumors in different cell types [30]. For example, PRKCD overexpression protected keratinocytes from UV-induced apoptosis and enhanced long-term survival which is a protective mechanism against skin carcinogenesis [31,32]. LV et al. found that PRKCD can promote tumor progression in pancreatic cancer [33]. In liver cancer, Zhang et al. reported that dihydromyricetin inhibits the migration and invasion of hepatoma cells by reducing MMP-9 expression via a mechanism that is dependent on the upregulation of PRKCD [34]. Nambotin et al. confirmed that Frizzled-7 displays anticancer properties against HCC involving PRKCD activation [35]. In addition, Cai et al. implied that PRKCD is an independent gene involved in the progression of NAFLD to HCC [36]. Combined with our research, we suggest that high expression of PRKCD in HCC may affect tumorigenesis and serve as a biomarker for predicting patient survivals. BIRC5 (also known as    11 BioMed Research International survivin) is a member of the inhibitor of the apoptosis (IAP) gene family, which is essential for cell division and can inhibit apoptotic cell death [37,38]. Ambrosini et al. first described this gene as an oncogene that is prominently expressed in all common human cancers of the lung, colon, pancreas, prostate, and breast [39]. Consistent with the results of previous studies, Chen et al. recently found that increased BIRC5 expressions are associated with histological grade, tumor size, and TNM stage in HCC patients which is consistent with our findings [40].
The protein encoded by ATIC is the last enzyme in the de novo purine biosynthetic pathway [41]. Previous studies have demonstrated that the purine synthesis pathway correlates with cancer cell proliferation [42,43]. Li et al. recently found that ATIC is an oncogenic gene that promotes survival, proliferation, and migration by targeting AMPK-mTOR-S6K1 signaling in HCC [44]. Although many studies have reported that all three ARGs we selected to construct prognostic models were directly or indirectly involved in various cancers, this is the first study to combine those genes with clinical information to predict the prognosis of HCC.
In summary, we constructed and validated a prognostic model based on three ARGs and this model could be a useful tool to predict the survival of HCC. To our knowledge, the three ARG-related prognostic models have not been reported previously, and the differentially expressed ARGs may provide a new perspective for the study of HCC molecular mechanisms. However, there are some limitations of our research that should be taken into consideration. First, we only focused on the mRNA levels of these genes, and protein levels should be further investigated. Second, the results are exclusively based on the TCGA and ICGC datasets, and additional independent cohorts are necessary to confirm the reliability of this model. Third, the results of our study are descriptive and the potential molecular mechanisms of the three genes warrant additional functional experiments.

Conflicts of Interest
The authors declare that they have no conflicts of interest.