Development and Validation of an Autophagy-Related Gene Signature for Predicting the Prognosis of Hepatocellular Carcinoma

Purpose Autophagy is a lysosomal degradation pathway that is essential for maintaining the homeostasis of the intracellular environment. Mounting evidence indicates that autophagy plays an essential role in the occurrence and development of hepatocellular cancer (HCC). This research is aimed at exploring the prognostic value of autophagy-related genes (ARGs) in HCC patients. Methods The Wilcoxon test was used to identify differentially expressed ARGs in The Cancer Genome Atlas (TCGA) HCC cohort. Then, the TCGA cohort was randomly divided into training and testing groups. Cox and LASSO regression models were used to screen for autophagy-related genes that affect overall survival (OS) in the TCGA training group. Based on the coefficient of risk genes, we constructed an autophagy-related gene signature for predicting the prognosis of HCC patients. Finally, we validated the prognostic significance of autophagy-related gene signature using the TCGA testing group and three external datasets. Results ATG10, BIRC5, GAPDH, and TMEM74 are risk genes for OS. According to the optimal cutoff value of risk score in each HCC dataset, HCC patients can divide into high- and low-risk groups. ARG risk score can significantly distinguish HCC patients with different survival outcomes. Meanwhile, the ARG risk score is independently correlated with OS in multiple HCC cohorts. Conclusions The autophagy-related risk score can effectively screen high-risk HCC patients and provide guidance for clinical prevention and treatment of HCC.


Introduction
Hepatocellular carcinoma (HCC) is the third leading cause of cancer-related death in China and the fourth leading cause of cancer-related death in the world. Despite significant advances in the diagnosis and treatment of HCC in recent years, the prognosis of hepatocellular carcinoma is still poor, owing to its high invasiveness [1,2]. Therefore, it is essential to explore the molecular mechanism of the occurrence and development of liver cancer, as it can lead to a new treatment strategy for the prevention and treatment of hepatocellular carcinoma.
Autophagy refers to the process of using lysosomes to degrade self-damaged organelles and macromolecules under the regulation of autophagy-related genes, thereby maintaining the needs of the cells themselves and the renewal of organelles. Changes in the level of autophagy are associated with a variety of human diseases, such as cancer, autoimmune diseases, and central nervous system diseases [3][4][5][6]. Several studies have shown that autophagy can inhibit the growth and invasion of tumor cells in the early stages of many tumors. However, in the advanced tumor stages, autophagy can promote the rapid growth of tumor cells by degrading aging and damaged organelles and macromole-cules, thereby promoting the malignant transformation of tumor cells [4,6]. The change in autophagy level is closely related to the development of liver cancer. The main types of autophagy in the progression of liver cancer are mainly    3 BioMed Research International molecular chaperone-mediated autophagy (CMA) and macroautophagy [7]. On the one hand, mice with weakened CMA can increase the vulnerability to oxidative stress, worsen liver function, and accelerate metabolic abnormalities, thereby promoting the occurrence of hepatic adenoma [8]. When it is progressing to a malignant tumor, the tumor cells show a significant increase in CMA activity in order to maintain the metabolic shift of cancerous cells [9]. On the other hand, macroautophagy plays an antitumor role in the progression of liver cancer. The reduction in autophagy level is associated with malignant transformation and poor prognosis of liver cancer. In liver cancer, a decrease in the level of autophagy can cause the appearance of the autophagy protein p62, and p62 can promote the release of nuclear factor erythroid 2-related factor 2 (Nrf2), which in turn encourages the progression of liver fibrosis and liver cancer [10,11]. Therefore, autophagy as a target for antitumor may have important clinical significance in the future. Since autophagy plays an essential role in the occurrence and development of cancer, it is of great clinical importance to find autophagy-related tumor biomarkers.
With the popularization of high-throughput sequencing technology in recent years, it is feasible to explore the relationship between autophagy-related genes (ARGs) and the prognosis of patients with hepatocellular carcinoma. Therefore, in our study, we systematically analyzed the differentially expressed ARGs in HCC by using the TCGA database and selected differentially expressed ARGs significantly associated with OS in the TCGA training group. Based on the cutoff value of risk score in the TCGA training group, HCC patients could be divided into high-and lowrisk groups. Finally, we explored the prognostic role of the ARG risk score in the TCGA training group, TCGA testing group, and three external datasets.      BioMed Research International and prognostic information of HCC patients from the cBio-Portal online website (https://www.cbioportal.org/). Differentially expressed ARGs between tumor samples and normal samples were determined by the Wilcoxon test. The criteria for screening differential genes were |log 2 FC | >1 and FDR < 0:05. Normalization was performed by converting the expression matrix of the autophagy-related genes using the formula log 2 ðx + 1Þ . The staging of HCC patients was determined by using the 7th edition of AJCC (American Joint Committee on Cancer) staging.

Material and Methods
2.2. Functional Enrichment Analysis of ARGs. Gene ontology (GO) is a database for the definition and description of genes and protein functions for a variety of species. GO annotations include molecular function (MF), biological process (BP), and cellular components (CC). Through these three functional categories, the function of a gene can be deeply described. Kyoto Encyclopedia of Genes and Genomes (KEGG) is a comprehensive database that integrates information on genomic, chemical, and system functions. Based on this database, we can further speculate on ARG-related signaling pathways. Then, we obtained the enrichment function and pathways of differentially expressed ARGs through the cluster profile package in R software.

Establishment and Validation of Autophagy-Related
Gene Signature for OS. We matched the prognostic information of HCC patients with the liver cancer samples and finally obtained 367 HCC patients with prognostic information. A good predictive model should have internal and external validation, so we randomly divided the TCGA cohort into TCGA training (n = 183) and TCGA testing groups (n = 184) using the "sample" function in R language. Then, univariate regression analysis was used to explore differentially expressed ARGs that were significantly associated with the OS in the TCGA training group. The LASSO regression model was used to reduce the false-positive result caused by model overfitting. Furthermore, we constructed the OS gene signature by incorporating variables from multivariate regression analysis using the stepwise method. Finally, we validated the prognostic role of risk scores in the TCGA testing group, and three external datasets.

ARG Risk Score Computation.
The ARG risk score of each HCC patient was computed by the expression value of the risk gene screened by multivariate regression analysis and the corresponding regression coefficient. The risk score for HCC patients is calculated as follows: risk score = 0:556 * ATG10 + 0:217 * BIRC5 + 0:403 * GAPDH + 0:765 * TMEM74 for OS. Relative operating characteristic (ROC) analysis is used to determine the optimal cutoff value in each HCC dataset. Based on the optimal risk score of each dataset, we divided HCC patients into high-risk and low-risk groups. The Kaplan-Meier curve was used to map the survival time of patients in high-and low-risk groups, and the log-rank test was conducted to compare the significance of high-and low-risk groups. 2.6. Statistical Analysis. Statistical analyses were analyzed using SPSS 24.0 (Chicago, IL, USA) and R 3.5.1 software (https://www.r-project.org/). We used R packages, including limma, pheatmap, ggpubr, clusterProfiler, survival, survminer, and survivalROC. The categorical variable is represented by frequency; if the continuous variable satisfies the normal distribution, it is expressed by mean ± standard deviation; if it does not meet the normal distribution, it is expressed by median (interquartile range). Comparisons between categorical variables were utilized using chi-square analysis, and comparisons between continuous variables were performed using the Kruskal-Wallis test. Autophagyrelated risk genes were identified using the least absolute shrinkage and selection operator (LSAAO) and Cox regression model. Time-dependent ROC curves were used to analyze the performance of autophagy-related gene signature to predict overall survival. A P value of below 0.05 was considered statistically significant.

Identification of Differentially Expressed Autophagy-Correlated Genes.
We used the Wilcoxon test to analyze the expression matrix of tumor samples and normal samples. Based on the screening criteria (|log 2 FC | >1 and FDR < 0:05), we obtained a total of 62 differentially expressed ARGs, of which 58 were upregulated, and 4 were downregulated. Volcano map of differentially expressed autophagyrelated genes is shown in Figure 1 At the same time, most of the differentially expressed ARGs are significantly increased in HCC tissues, thus revealing that they play an indispensable role in promoting the occurrence and development of liver cancer.

Functional Annotation of the Differentially Expressed
Autophagy-Correlated Genes. Gene enrichment analysis can analyze the corresponding biological functions and pathways of selected genes. The biological process of GO is mainly enriched in autophagy, process utilizing autophagic mechanism, macroautophagy, regulation of autophagy, regulation of macroautophagy, regulation of apoptotic signaling pathway, autophagosome assembly, autophagosome organization, neuron death, and vacuumed organization (top 10).

Identification of Autophagy-Related Risk Genes for
Overall Survival. We performed univariate Cox regression analysis on 62 autophagy-related genes in the TCGA training group. The results of the univariate regression analysis showed that 26 autophagy-related genes were significantly associated with the prognosis of HCC patients (P < 0:05) (Figure 3(a)). In order to eliminate the false-positive results, we used the LASSO regression model to screen the independent variables further. The results of the LASSO regression analysis showed that 11 genes were significantly correlated with OS (Figures 3(b) and 3(c)). Finally, we put the selected 11 genes into multivariate regression analysis using the backward and forward method. The results of multivariate regression analysis identified ATG10, BIRC5, GAPDH, and TMEM74 are risk factors for OS in the TCGA training group (Table 2).  BioMed Research International + + + + + + + + + + + ++ + + + + + + + + + + + +++ + + + + + + + + + ++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + ++ + + + + + + ++ + + + + ++ + + + + + + + + + + ++ + ++  Based on the previous results of the multivariate regression analysis, we constructed a risk score formula for ARGs, which was calculated as follows: risk score = 0:556 * ATG10 + 0:217 * BIRC5 + 0:403 * GAPDH + 0:765 * TMEM74 for OS. According to the regression coefficient, it is evident that ATG10, BIRC5, GAPDH, and TMEM74 are all risk factors for HCC patients. According to the optimal cutoff value (5.652) of a risk score for the autophagy-related gene signature, we divided the TCGA HCC training cohort into high-risk and low-risk groups. The 1-year OS, 3-year OS, and 5-year OS rates of the highrisk group were remarkably lower than the 1-year OS, 3year OS, and 5-year OS of the low-risk group (63.4% vs. 90.1%; 29.7% vs. 74.5%; 19.8% vs. 62.8%) (P = 2:837e − 09) (Figure 4(a)). The 1-year AUC, 3-year AUC, and 5-year AUC of the autophagy-related gene signature for OS are 0.701, 0.704, and 0.691 in the TCGA training group (Figure 4(b)). Furthermore, we ranked the prognostic risk scores for OS in the TCGA training group. (Figure 4(c)).   Figure 5(e) shows the heat map of the ARG expression matrix of the HCC patients in the TCGA testing group. To further validate the role of our established formula, we explored the prognostic significance of autophagy-related gene signature in the external ICGC dataset, GSE116174 dataset, and AHMU dataset. In the ICGC dataset, the overall survival rate of the high-risk group was remarkably lower than that of the low-risk group (P = 9:45e − 8) (Figure 6(a)). The 1-year AUC, 3-year AUC, and 5-year AUC of the autophagy-related gene signature for OS are 0.685, 0.773, and 0.643 ( Figure 6(b)). The risk score, survival status, survival time of HCC patients in the ICGC dataset are plotted in Figures 6(c) and 6(d). Figure 6(e) shows the heat map of the ARG expression matrix of the HCC patients in the ICGC dataset. The same tendency can be found in the GSE116174 dataset. The overall survival rate of the high-risk group was remarkably lower than that of the low-risk group (P = 7:67e − 4) (Figure 7(a)). The 1-year AUC, 3-year AUC, and 5-year AUC of the autophagy-related gene signature for OS is 0.685, 0.646, and 0.687 (Figure 7(b)). The risk score, survival status, and survival time of HCC patients in the GSE116174 dataset are plotted in Figures 7(c) and 7(d). Figure 7(e) shows the heat map of the ARG expression matrix of the HCC patients in the GSE116174 dataset. Finally, we used our own dataset for verification. In the AHMU dataset, when compared with the low-risk group, patients in the high-risk group have a poorer prognosis (P = 3:43e − 3) (Figure 8(a)), and the autophagy-related gene signature shows a better predictive value for HCC (Figure 8(b)). The risk score, survival status, and survival time of HCC patients in the AHMU dataset

Autophagy-Correlated Gene Signature Is Independently
Correlated with Overall Survival. To explore whether the ARG risk score is an independent risk factor for the prognosis of HCC patients in the TCGA training group, we included age, gender, AJCC stage, pathological grade, and the ARG risk score into the univariate regression model. Univariate regression analysis showed that the AJCC stage (P = 0:007) and ARG risk score (P < 0:001) were significantly associated with OS. Then, we included these factors into multivariate regression analysis. The results of the multivariate regression analysis showed that the AJCC stage (P = 0:042) and ARG risk score (P < 0:001) were significantly associated with OS (Table 3). Similarly, the AJCC stage and autophagy risk score were independent risk factors for the prognosis of HCC patients in the TCGA testing dataset (Table 4). Besides, in the external validation dataset, the autophagy risk score was also an independent risk factor for the prognosis of HCC patients in the ICGC dataset, GSE116174 dataset, and AHMU dataset (Tables 5-7). Therefore, the ARG risk score is an independent prognostic factor for OS in HCC patients.

Comparison of the Gene Signature
We Constructed with the Published Gene Signatures. We also used the TCGA, ICGC, and GSE116174 datasets to analyze the predictive value of gene signature in the two published literature [12,13] and the gene signature we constructed (Supplementary Figure 2). In the TCGA dataset, the results of the timedependent ROC curve show that the AUC of 8-gene signature (zhu-signature) [12] for 1-year and 5-year AUC is slightly higher than that of 4-gene signature (kongsignature) and 3-gene signature (lin-signature) [13], but its 3-year AUC is slighter lower than kong-signature and linsignature. The predicted value of 1-year, 3-year, and 5-year AUC is similar between kong-signature and lin-signature. In the ICGC dataset, the 1-year AUC of kong-signature is slightly lower than zhu-signature but was higher than linsignature and zhu-signature in both 3-year and 5-year AUC. In the GSE116174 dataset, kong-signature has shown good predictive value in 1-year, 3-year, and 5-year AUC. In summary, the kong-signature we constructed has a more accurate predictive value than the two other gene signatures in the previously published literature.

Discussion
With the development of science and technology in recent years, autophagy has gradually attracted the attention of researchers as an essential molecular process. Although many reports have explored the role of individual autophagy-related genes in liver cancer, research on the prognostic role of all autophagy-related genes in HCC is rarely investigated. Therefore, we deeply explored autophagy-related risk genes that affect the prognosis of HCC patients by digging into multiple public databases, so as to provide guidance for clinical evaluation of patients with hepatocellular carcinoma.
In our study, we downloaded and analyzed the RNA-seq data of the HCC cohort from TCGA and obtained 62 differentially expressed autophagy-related genes using the Wilcoxon test. Then, we performed gene enrichment analysis, and the function and pathway of autophagy-related genes play an essential role in the progression of hepatocellular carcinoma. Furthermore, we used univariate regression

13
BioMed Research International analysis to explore the relationship between the expression of autophagy-related genes and the overall survival of HCC patients. The results showed that 26 autophagy-related genes were significantly related to OS. The LASSO regression and multivariate regression analysis were utilized for further screening, and we found that ATG10, BIRC5, GAPDH, and TMEM74 are risk ARGs for OS. Therefore, we established a prognostic gene signature for OS in the TCGA training group. The prognostic role of autophagy-related gene signature was also validated in the TCGA testing group, ICGC dataset, GSE116174 dataset, and AHMU dataset. Finally, we compared the gene signature we built with the published gene signatures. In our study, based on different statistical methods, we constructed a new 4-gene signature for the HCC prediction model. In both the internal training dataset and the external validation dataset, the 4-gene signature has better robustness and accuracy compared to the other two predictive signatures, which provides a new strategy for predicting the prognosis of liver cancer patients.
ATG10, also named ATG10L, encodes a protein that is involved in Ub-like modification, which is crucial for the formation of autophagosomes. ATG10 has been explored in a variety of tumors, including colorectal cancer, gastric cancer, non-small-cell lung cancer, and breast cancer [14][15][16][17]. For example, Jo et al.'s study found that increased expression of ATG10 is associated with vascular invasion, lymphatic metastasis, and poor prognosis in colorectal cancer [15]. Xie et al.'s research found that ATG10 can promote the proliferation and migration of lung cancer [17]. This is consistent with our study's finding that elevated expression of ATG10 is consistent with poor prognosis in patients with HCC.
BIRC5, also known as survivin, is a critical member of the apoptosis-inhibiting gene family, and it can encode reg-ulatory molecules that inhibit the death of apoptosis cells. The abnormal expression of BIRC5 is related to the malignant transformation of the cancer cell [18][19][20][21][22][23]. For example, the high expression of BIRC5 can promote the proliferation and angiogenesis of liver cancer cells, reduce the sensitivity to chemotherapy and radiotherapy, and suppress the apoptosis of tumor cells, thereby affecting the survival outcome of HCC patients [18]. Our study found that the high expression of BIRC5 is associated with poor prognosis in patients with HCC, which is also consistent with the published literature.
GAPDH is a well-known housekeeping gene. It includes a C-terminal catalytic domain and an N-terminal NAD + binding domain and plays an important role in the body's energy metabolism, DNA repair, autophagy, and cell proliferation [24][25][26]. Liu et al.'s research finds that GAPDH can convert glycolytic flux to serine metabolism by increasing PHGDH and promoting histone methylation, thereby promoting the progress of liver tumors in mice [26]. Ganapathy-Kanniappan et al.'s research found that injection of GAPDH antagonists into mouse liver cancer models can induce apoptosis of liver cancer cells and block the progression of Hep3B tumors, which may be used as a potential target therapy for HCC [25]. In our study, GAPDH was significantly upregulated in liver cancer, and high expression of GAPDH was an independent risk factor for prognosis in patients with liver cancer, which is consistent with the literature.

15
BioMed Research International ATG16L1 and ATG9A, thereby promoting tumor cell survival [30]. Sun et al.'s study found that TMEM74 is highly expressed in liver cancer and lung cancer tissues and correlated to the poor prognosis of cancer patients. After overexpressing TMEM74 in tumor cells, the proliferation ability of tumor cells was enhanced [31]. In our study, elevated expression of TEME74 was associated with poor prognosis in patients with liver cancer, which is in accordance with the study by Sun et al.
However, there are still deficiencies in our research. Firstly, the data we use is derived from public databases, and these findings require subsequent verification. Secondly, we have discovered some potential biomarkers of liver cancer and have not further explored the underlying    16 BioMed Research International mechanism of its function, and subsequent research is needed to explore it.

Conclusion
In this research, we found autophagy-related risk scores can significantly distinguish high-and low-risk groups of HCC patients and are also significantly related to the prognosis of HCC patients. Therefore, the prediction model based on autophagy-related genes may be used to predict the prognosis of HCC patients to provide a new strategy for the prevention of HCC in the clinic.

Data Availability
We use public databases for subsequent analysis, and the corresponding data can be found in The Human Autophagy Database (HADb), The Cancer Genome Atlas (TCGA), International Cancer Genome Consortium (ICGC), Gene Expression Omnibus (GEO), and the cBioPortal.

Supplementary Materials
Supplementary Figure 1: cluster heat map of differentially expressed autophagy-related genes in HCC patients from the TCGA cohort. Supplementary Figure 2: comparison of the gene signature we constructed with the published gene signatures. Supplementary