A Novel Signature Based on mTORC1 Pathway in Hepatocellular Carcinoma

Background mTORC1 signal pathway plays a role in the initiation and progression of hepatocellular carcinoma (HCC), but no relevant gene signature was developed. This research aimed to explore the potential correlation between the mTORC1 signal pathway and HCC and establish the related gene signature. Methods HCC cases were retrieved from The Cancer Genome Atlas (TCGA), International Cancer Genome Consortium (ICGC), and Gene Expression Omnibus (GEO) databases. The genes included in mTORC1-associated signature were selected by performing univariate and multivariate Cox regression analyses and lasso regression analysis. The protein expression level of included genes was verified by The Human Protein Altas. Then, the signature was verified by survival analysis and multiple receiver operating characteristic (ROC) curve. Moreover, the correlation between signature and immune cells infiltration was investigated. Furthermore, a nomogram was established and evaluated by C-index and calibration plot. Results The signature was established with the six genes (ETF1, GSR, SKAP2, HSPD1, CACYBP, and PNP). Three genes (ETF1, GSR, and HSPD1) have verified their protein expression level in HCC. Under the grouping from signature, patients in the high-risk group showed worse survival than those in the low-risk group in both three datasets. The signature was found to be significantly associated with the infiltration of B cells, CD4+ T-cells, CD8+ T-cells, dendritic cells, macrophages, and neutrophils. The univariate and multivariate Cox regression analysis indicated that mTORC1-related signature could be the potential independent prognostic factor in HCC. Finally, the nomogram involving age, gender, stage, and signature has been established and verified. Conclusion The mTORC1-associated gene signature established and validated in our research could be used as a potential prognostic factor in HCC.


Introduction
Hepatocellular carcinoma (HCC) is one of the most prevalent cancers around the world, becoming the second leading causes of tumor-related death [1]. Owing to the high rate of metastasis, HCC patients with advanced stage are usually with a poor prognosis [2]. Although the treatment and biomarkers of HCC have developed, the clinical outcomes of HCC patients are still unsatisfactory [3]. e occurrent and development of HCC involved interactions between genetics, epigenetics, and transcriptomic alterations [4]. Many studies have verified that different biomarkers have certain prognostic value in HCC. For example, Gu et al. found that CCL14 was a potential prognostic biomarker that correlated with tumor immune cell infiltration in HCC [5]. Another study found the strong correlations between PRPF3 expression and prognosis in HCC [6]. However, as a biomarker, a single gene usually has a lower prognostic value than multigene prognostic signature. erefore, many generelated signatures have been developed for predicting prognosis of HCC. For instance, Zhang et al. established a gene signature associated with HCC microenvironment and successfully verified it [7]. Other predictive signatures based on immune [8] and glycolysis [9] also play an important role in HCC prognosis.
Generally, the gene researches usually focus on comparing the gene expression between two groups or pay attention to the highly upregulated and downregulated genes.
Nevertheless, some genes which showed no significant difference but had important biological function and characteristics were omitted. In view of this, a computational method gene set enrichment analysis (GSEA) determined whether a prior defined set of genes shows statistically significant differences between two biological states [10]. e advantage of GSEA is that it can identify the genes in which expression is based on the trend of overall level. Consequently, in this research, we identified the pathway and gene with GSEA. en, we constructed the signature based on related genes and verified it, providing the more comprehensive and accurate prognostic model for clinic.

Data Collection.
e gene expression data with the type of level 3 RNA-seq FPKM dataset and the clinical messages in TCGA website (https://portal.gdc.cancer.gov/) were retrieved. A total of 377 HCC cases have been downloaded and analyzed. As the validated cohorts, GSE76427 datasets were retrieved from Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/), while ICGC-LIRI dataset was retrieved from International Cancer Genome Consortium (ICGC) database (https://icgc.org/).

Identification of Pathways and Related Genes.
After data collection, we extracted the clinical details and generated the expression matrix of all genes in HCC of TCGA dataset. en, we divided the patients into two groups according to the survival status and employed GSEA to choose the most relevant pathways. e pathways were considered for further analyses if the normalized p value was <0.01. After that, we collected the related genes of involved pathway from Molecular Signatures Database (http://software.broadinstitute. org/gsea/msigdb/index.jsp).

Construction and Verification of Signature.
Firstly, we performed the differentially expressed analysis to select the related genes. e "limma" package under R studio software was employed, and a p value <0.05 was considered statistically significant. Secondly, we employed the univariate and multivariate Cox regression analysis to choose the prognostic genes among the differentially expressed genes. e genes in this section were eligible for further selection if a p value was <0.05. en, the lasso regression analysis was executed for checking selected genes. In this analysis, a lasso penalty was applied, to simultaneously account for shrinkage and variable selection. e optimal value of the lambda penalty parameter was defined by performing 10 crossvalidations. Using the "glmnet" package, the coefficient of each included genes and risk score of each case were calculated. e calculation formula of risk score was as follows: risk score � (coefficient mRNA1 × expression of mRNA1) +(coefficient mRNA2 × expression of mRNA2) + · · · +(coefficient mRNAn × expression mRNAn).
(1) e cases were divided into two groups (high risk or low risk), according to the risk score median. To explore the time-dependent prognostic value of our gene signature, the survival analysis was performed using the "survival" package in the R studio software. e relationship between signature and other clinical messages was also evaluated and visualized with a heatmap. e protein expression level of included genes of signature was verified by e Human Protein Altas (https://www.proteinatlas.org/). Besides, the multiple receiver operating characteristic (ROC) curve was performed to check the predictive accuracy of risk score. In addition, we investigated the correlation between signature and six different immune cells (B cells, CD4 + T-cells, CD8 + T-cells, dendritic cells, macrophages, and neutrophils). e infiltration data of six immune cells were retrieved from tumor immune estimation resource (https://cistrome.shinyapps.io/ timer/). Moreover, the univariate and multivariate Cox regression analyses were performed to verify whether the risk score is an independent prognostic factor.

Predictive Nomogram Design.
A predictive nomogram based on age, gender, stage, and risk score was constructed using the "rms" package and Cox regression model to predict the overall survival (OS) at 1 year, 3 years, and 5 years of HCC patients. en, we used Harrell's concordance index (C-index) and calibration plot to evaluate the nomogram.

Results
e clinical data details of the patients used in this study are shown in Table 1. Figure 1 shows the screening process and validation of our study. e result of GSEA in Figure 2 showed that a total of 6 pathways were eligible, and the details of pathways are summarized in Table 2. Considering the highest normalized enrichment score of mTORC1 pathways, we chose the 200 relevant genes in this pathway for further analysis. A total of 199 genes have been found in gene expression matrix, and the differentially expressed analysis showed that 160 genes were significantly different in HCC (see Besides, the heatmap of risk score and clinical parameters was shown in Figure 3(c). All the genes included in the signature were highly expressed in the high-risk group and lowly expressed in the low-risk group. e correlation between gene expression of final included genes and clinical parameters is shown in Table 3. Meanwhile, significant difference was found between risk score and stage, grade, T, and M, respectively (Table 3). In terms of protein expression, three genes (ETF1, GSR, and HSPD1) have been found to be highly expressed in HCC tissue according to Figure 4. e survival analysis of TCGA ( Figure 5(a)), GSE76427 ( Figure 5(b)), and ICGC ( Figure 5(c)) both indicated that significant difference was found between two groups (p < 0.05). And the multiple ROC curve plot (Figures 5(d)-5(f)) demonstrated that the risk score got the highest predictability among analyzed factors in 1 year (AUC � 0.802), 3 years (AUC � 0.743), and 5 years (AUC � 0.719). In terms of immune cells infiltration, significant difference was found between risk score and B cells, CD4 + T-cell, CD8 + T-cell, dendrites, macrophage, and neutrophil ( Figure 6). Furthermore, we performed the univariate and multivariate Cox regression analyses, and the results in Table 4 revealed that the signature can be the independent prognostic factor in HCC (p < 0.01).
Finally, using the TCGA cohort, we built the nomogram based on age, gender, stage, and established signature to predict 1-year, 3-year, and 5-year OS for HCC patients (Figure 7(a)). e results of decision curve analysis (Supplementary file 6) demonstrated that nomogram with age, gender, stage, and established signature showed a higher benefit than other two solo models (stage only or risk score only). e C-index of nomogram was 0.73, and the calibration plot for the probability of survival at 1 year (Figure 7

Discussion
In the initiation and progression of hepatocellular carcinoma, genetic factors usually play an important role. Meanwhile, mRNA gene signature based on a certain characteristic like glycolysis [11] and immune [12] has been developed for predicting cancer prognosis. In this research, we explored specific function to identify genes by GSEA that could predict the survival of HCC patients. According to our results, six signal pathways were found to be highly related to survival and we established the gene signature with the mTORC1 signal pathway. As we all know, the mTOR pathway is a serine/threonine protein kinase belonging to the PI3K-related kinase family [13], which comprised of two distinct complexes (mTORC1 and mTORC2). With Raptor as its unique and key protein component, mTORC1 plays an important role in cell survival, autophagy, and metabolism [14]. Concerning for the mTOR signal pathway in HCC, it has been found that aberrant mTOR signaling was present in half of the HCC cases [15]. Meanwhile, an intact mTORC1 axis [16] and mTORC2-Akt1 cascade [17] were required for c-Mycdriven hepatocarcinogenesis. Moreover, some research studies [15,18] provided the theoretical basis of mTOR signaling pathway-oriented targeting treatment for HCC in clinic. Overall, these abovementioned evidences demonstrated that mTOR signal pathway plays an important role in the development of HCC.
In this study, we identified six genes in signature by performing the differentially expressed analysis, univariate Cox regression analysis, and lasso regression analysis. Among our included genes, five genes have been found to be related to HCC from previous studies. Singh et al. found that ETF1, CNOT6, and XRN1 gene in HepG2 cells led to significant alteration in stability of specific mRNAs, and this mechanism may hold novel cancer therapeutic targets [19]. In another research, McLoughlin concluded that GSR, TRXR1, NRF2, and oxidative stress determined hepatocellular carcinoma malignancy [20]. Lee's et al. study [21] found that HSPD1 was downregulated during early apoptosis of the hepatoma cell mediated by Paeoniae Radix. In terms of CACYBP, it has been verified that CACYBP can promote hepatocellular carcinoma progression in the absence of RNF41-mediated degradation [22]. Moreover, a study [23] found that PNP/fludarabine suicide gene system induced HCC cell apoptosis and inhibited the growth of HCC cells. Although we found no evidence supporting the correlation between SKAP2 and HCC, it has been verified that SKAP2 promotes podosome formation to facilitate tumor-associated macrophage infiltration and metastatic progression [24].
Recently, further investigations have been performed to explore how the mTOR signal transduction mechanisms modulate sensitivity of targeted therapies, angiogenesis, and tumor immunity [25]. e interest in mTOR targeting may improve immune response against cancer and develop new  therapeutic strategy. It has been verified that an inflammatory-CCRK circuitry drove mTORC1-dependent metabolic and immunosuppressive reprogramming in obesityrelated hepatocellular carcinoma [26]. In another study [27], Tan concluded that Tim-3-mediated PI3K/mTORC1 interference leads to the dysfunction of both tumor-infiltrating conventional natural killer cells and liver-resident natural killer cells. In our research, the results showed that mTORC1 signature significantly associated with B cells, CD4 + T-cell, CD8 + T-cell, dendrites, macrophage, and neutrophil, which indicated that the patients in high-risk group may benefit from immune-targeted therapies and provide a new strategy for immune checkpoint-based targeting.
Being different from the previous prognostic studies in HCC, our predictive model firstly concentrated on the mTORC1 signal pathway. More importantly, the mTORC1 signal pathway was identified by GSEA, which indicated the underlying mechanism between survival of HCC and mTORC1 pathway. Moreover, the validation from three independent datasets and a rigorous screening process enabled the identification of a reliable signature. However, our study has some limitations. First, prognostic signature  showed a relatively low diagnostic performance in predicting 5-year OS. It may be attributed to that only 200 associated genes were defined and evaluated for the initiation of the screening process. Second, using a single characteristic (mTORC1 signal pathway) to establish the predictive model is an intrinsic weakness. Indeed, many other mechanisms, such as metabolism [28] and immune [8], have an influence on the development and progression    Journal of Oncology of HCC. ird, the weak correlation between risk score and immune cells infiltration was observed in our research, which may attribute to the calculated method (TIMER) of immune cell infiltration. Furthermore, our signature explored the underlying effect between mTOR signal pathway and HCC, but it is necessary to perform more independent trials and functional experiments to shed light on the mechanism linking them.  Journal of Oncology