Novel Gene Signatures Promote Epithelial-Mesenchymal Transition (EMT) in Glucose Deprivation-Based Microenvironment to Predict Recurrence‐Free Survival in Hepatocellular Carcinoma

Background Current research studies have suggested that glucose deprivation (GD)-based tumor microenvironment (TME) can promote epithelial-mesenchymal transition (EMT) of tumor cells, leading to tumor invasion and metastasis. However, no one has yet studied detailedly the synthetic studies that include GD features in TME with EMT status. In our research, we comprehensively developed and validated a robust signature regarding GD and EMT status to provide prognostic value for patients with liver cancer. Methods GD and EMT status were estimated with transcriptomic profiles based on WGCNA and t-SNE algorithms. Two cohorts of training (TCGA_LIHC) and validation (GSE76427) datasets were analyzed with the Cox regression and logistic regression analyses. We identified a 2-mRNA signature to establish a GD-EMT-based gene risk model for the prediction of HCC relapse. Results Patients with significant GD-EMT status were divided into two subgroups: GDlow/EMTlow and GDhigh/EMThigh, with the latter having significantly worse recurrence-free survival (P < 0.01). We employed the least absolute shrinkage and selection operator (LASSO) technique as a method for HNF4A and SLC2A4 filtering and constructing a risk score for risk stratification. In the multivariate analysis, this risk score predicted recurrence-free survival (RFS) in both the discovery and validation cohorts and remained valid in patients stratified by TNM stage and age at diagnosis. The nomogram that combines risk score and TNM stage as well as age produces improved performance and net benefits in the analysis of calibration and decision curves in training and validation groups. Conclusions The GD-EMT-based signature predictive model may provide a prognosis classifier for HCC patients with a high risk of postoperative recurrence to decrease the relapse rate.


Introduction
Hepatocellular carcinoma (HCC) accounts for 75-85% of all primary liver malignancies [1]. During the last decades, attempts at therapeutic approaches for HCC have been in progress not only for early stages but also for advanced stages, such as aggressive surgery, liver transplantation, chemotherapy with sorafenib, and multikinase inhibitors, all of which are recognized as efective treatments for suferers [2,3], as well as CAR-T cell dysfunction, which is considered a novel approach to afect the immune microenvironment and the immunotherapeutic response in HCC [4]. However, without targeted therapy, patients with early and advanced HCC have a poor prognosis, with a median survival of 6-9 months and 1-2 months, respectively [5]. Furthermore, a relapsed rate of 50%-70% has been achieved even after surgical resection of the lesion, let alone many patients who are not eligible for resection [6,7]. Hence, the identifcation of novel panels providing more predictive value for suferers' recurrence status is highly demanded clinically for improving the prognostication for liver cancer. Terefore, to improve the prognosis of liver cancer patients, there is a great clinical need to discover novel panels that have more predictive value for patients' recurrence status.
On account of the coevolution of malignant cells and their direct environment, the tumor forms an organ-like structure. Studies clearly show that cancer development and metastasis rely on the mutual cointeraction between tumor cells and their environment, which leads to the formation of the tumor microenvironment [8]. Te rapid proliferation of cancer cells makes it often necessary for tumors to experience rapid angiogenesis, hypoxia, acidosis, glucose deprivation, immune cell infltration, and decreased activity, which all contribute to the development of cancer as well as drug resistance [9,10]. Tus, TME is equipped with low pH values, glucose deprivation (GD), severe hypoxia, high glutathione (GSH) content, and excessive hydrogen peroxide (H 2 O 2 ) [11]. Accumulated evidence has testifed that tumor cells have high migratory potential for constantly situating in glucose deprivation-based TME [12][13][14]. Te epithelial-mesenchymal transition (EMT) refers to the transition from polarized epithelial cells to motile mesenchymal cells through the activation of a series of signals that enhance tumor stem cell-like properties, invasion, and metastasis [15]. Current research has suggested that a GDbased microenvironment can promote EMT of tumor cells, leading to tumor invasion and metastasis [16,17]. Although there is an explicit link between the GD status and the EMT phenomenon of TME, an integrated analysis of the relationship between the GD state and the EMT response is rare.
In this study, we synthetically developed and validated robust signatures of GD and EMT status to provide prognostic value for HCC patients. Firstly, we fltered GD-related diferentially expressed genes (DEGs) and EMT-related DEGs associated with prognosis and applied them to model construction in silico. Furthermore, multiple independent HCC datasets were integrated to develop a risk score based on GD-EMT status, and the functional studies of relevant genes were validated in vitro. Ultimately, an original model incorporating GD-EMT status and clinicopathological features was indicated through a range of systematic analyses aimed at predicting RFS in liver cancer with universal applicability in clinical practice.

Data Retrieval and Preprocessing.
Te mRNA expression data and corresponding clinical characteristics of HCC patients were collected from Te Cancer Genome Atlas (TCGA) cohort (https://portal.gdc.cancer.gov/) and the Gene Expression Omnibus (GEO) (https://www.ncbi. nlm.nih.gov/geo/). Te study contained 418 HCC tumor samples with integrated clinical characteristics and valid survival data in TCGA database, 108 liver cancer patients from GSE76427 dataset, and 8 HepG2 cells treated with variously concentrated glucose (4 high glucose-relevant and 4 GD-related cells) from GSE140867 dataset. Te mRNA expression matrix (FPKM) value from TCGA and GEO database was converted into TPM value. Besides, samples from TCGA cohort were randomly assigned to two phases, namely, training and internal validation cohort. Te discovery set retrieved from the GSE76427 dataset was used for external validation. Tus, the tissue samples from TCGA-LIHC and GSE76427 datasets were assigned to diverse phases incorporating training, internal validation, entire, and external validation cohorts. Patients' clinicopathologic characteristics are listed in Table 1. All of the patients from the above two sets who met the following selection criteria could be enrolled: (a) histologically diagnosed malignant hepatocellular carcinoma; (b) eligible RNA expression; and (c) available RFS data. Te workfow shown in Figure 1 was composed of feature selection, silico analysis, validation, and model construction.

Verifcation of GD Status and GD-Associative DEGs.
We performed a weighted gene coexpression network analysis (WGCNA) using the WGCNA R package (version 3.613) to screen for genes associated with GD status and divided the associated mRNAs into the same coexpression modules [18]. On the basis of the results of the module-trait relationship, the module with the higher correlation was selected as the research object for the next study, and the genes in the pivotal module were considered as GD-related genes. Furthermore, t-distributed stochastic neighbor embedding (t-SNE) is a nonparametric and unsupervised algorithm that classifes or condenses patients into diverse clusters based on hub features or hallmarks by using the R package Seurat [19]. According to the RFS data, two clusters were singled out for comparison to determine the "GD high " and "GD low " groups. Te limma algorithm was employed to fltrate DEGs between the two groups [20], and genes generated with a false discovery rate (FDR) corrected P value <0.05 and an absolute log 2 -fold change value >1 were regarded as GD-related DEGs.

Identifcation of EMT States and EMT-Related DEGs.
Tere were 1184 EMT-related hallmark genes extracted from the dbEMT2.0 database (https://dbemt.bioinfo-minzhao.org/index. html), consisting of 1011 protein-coding genes and 173 noncoding RNAs. Similarly, patients were sectionalized into diverse clusters to compare the RFS data to ascertain the "EMT high " and "EMT low " groups. Tus, EMT-related DEGs could be further confrmed by the limma arithmetic, with the screening criteria of FDR-adjustedP < 0.05 and |log 2 FC| > 1.

Generation of GD-EMT Related
DEGs. GD and EMT status identifed above were divided into three groups, such as GD low /EMT low , GD high /EMT high , and mixed groups. Te GD-EMT-related DEGs could be acquired by detecting expression diferences between the GD low /EMT low and GD high /EMT high groups (FDR-adjustedP < 0.05, |log 2 FC| > 1). Finally, the set of genes most relevant to GD-EMT status can be obtained by comprehensively analyzing GD-EMT-associated DEGs and GD/EMT-associated DEGs distinguished above.

Establishment and Validation of Profling Based on GD-EMT-Relevant DEGs.
We performed univariate Cox regression analysis among GD-EMT-related DEGs using the R package "survival" and obtained preliminary GD-EMTrelated DEGs that were signifcantly correlated with RFS in the training cohort, of which P < 0.05 was treated as signifcant. Afterwards, the least absolute shrinkage and selection operator (LASSO) method was used to calculate risk scores for HCC patients, which is characterized by preserving valuable variables and avoiding overftting [21]. Based on these prognostic candidates, LASSO-Cox regression analysis was used to select genes to minimize the risk of overftting. A risk prediction model was constructed and the penalty regularization parameter lambda (λ) was chosen through the cross-validation routine with an n-fold equal to 10 by using the R package glmnet. Meanwhile, lambda.min was identifed to pick out the variables. Subsequently, we combined the regression coefcients from multivariate Cox regression models and optimized gene expression for each patient's risk score for RFS, that is, risk score � coefcient (i) × expression of signature gene (i). Te coefcients of gene (i) originated from the LASSO-Cox regression model, while its expression was derived from each patient. On the basis of risk scores, patients were divided into high-risk and low-risk groups. Besides, the Kaplan-Meier survival and principal component analysis (PCA) were drawn using the software "GraphPad prism 8.0" and the R package "rgl" for cluster analysis, respectively, to assess the predictive value of prognostic characteristics for RFS.

Enrichment Analysis.
We conducted functional enrichment analysis using the package "cluster profler" to explore potential molecules associated with GD-EMTassociated DEGs. Meanwhile, the correlations between risk scores and the enrichment scores of EMT-predicted pathways or GD-predicted pathways were conducted by the R package "ggcor."

Construction and Assessment of the Nomogram.
To estimate the feasibility of the risk score in depth, we selected patients with clinicopathological information from the TCGA dataset, which included age at diagnosis, alpha-fetoprotein (AFP) level, pathological tumor stage, gender, microvascular invasion, hepatitis B virus, and Barcelona Clinic Liver Cancer (BCLC) stage, and these characteristics were evaluated as categorical variables. Terefore, univariate and multivariate Cox regression analyses were performed to analyze the relationship between each variable and patient RFS. Nomograms are widely used for cancer prognosis, primarily because of their ability to reduce statistical predictive models into a single numerical estimate of the probability of an event, such as death or recurrence, which is tailored to the profle of an individual patient. Te "rms" R package was utilized to construct nomograms. For 1-, 3-, and 5-year survival rates, calibration curves were used to quantify the agreement between the predicted and actual results. Te ROC curve was developed with the R package "pROC" to evaluate the nomogram's efciency.  Complete medium containing 0.5% FBS was added to each well to continue the culture, and the healing of the scratches in the three fxed areas was photographed at 0 and 48 hours, respectively.

Cell Invasion and Migration
Assays. Invasion assays were performed in 24-well Transwells (8 μm pore size; BD), selfcoated with Matrigel (356234; BD). Cells were added to a coated flter (5 × 10 4 cells/flter) in 200 μl of serum-free medium in triplicate wells. Next, 500 μl of medium with 10% FBS was appended in the lower chamber. After 36 h, the upper surface of the flter was wiped of with a cotton swab. Cells on the lower surface of the membrane were fxed with 4% paraformaldehyde, stained with 0.5% crystal violet, photographed, and counted under a microscope in three random felds. Similarly, the migration assays were implemented with the same procedures, except that the plates were not coated with Matrigel and the plates were incubated for 12 h.
2.11. Statistical Analysis. Te R version 3.6.1 (https://www.rproject.org) and the corresponding package were utilized for full data analysis. Cell experiments were repeated at least three times, and data was expressed using the mean-± standard error of the mean (SEM). Statistical analysis was achieved with a one-way ANOVA test using GraphPad Prism 8. Recurrence-free survival analysis was estimated using the Kaplan-Meier method. Te value of P < 0.05 was considered statistically signifcant.

EMT Occurrence in Glucose Deprivation-Based
Microenvironment. EMT has been revealed to play an extremely signifcant role in the development and metastasis of tumors [22]. Previous studies have found that glucose deprivation treatment of tumor cells can lead to EMT induction and malignant transformation. An alteration of SMMC7721 cell lines from EMT-like phenotypes was evident after exposure to glucose deprivation for 48 hours (Figure 2(a)). Migration and invasion ability of SMMC7721 cells were signifcantly promoted by glucose deprivation (Figure 2(b)). Furthermore, cell migration and invasion ability were detected via the wound-scratch assay, the results of which showed that low glucose promoted the scratch healing ability of SMMC7721 cells (Figure 2(c)). In brief, our Cluster type Number of rows 418  418  418  418  418  418  418  censored  subjects  52  35  49  31  31  29  22  recurrences/  events  22  38  22  38  18  16  15   Median survival 1694  837  3125  1149  1852  1852  Cluster type Cluster type I (n=120)  Journal of Oncology 7 results showed that exposure to low glucose could induce EMT in HCC cells, which may participate in the malignant conversion.

Determination of GD Status and GD-Related DEGs in
Liver Cancer. We analyzed microarray datasets (GSE140867) generated from 4 high glucose-relevant and 4 GD-related HepG2 cells using WGCNA as a way to study hub modules in GD-positive samples, and fnally screened 2269 genes with P < 0.05 and log 2 (fold change)| > 1. After constructing the coexpression matrix, four gene modules, including blue model, yellow model, brown model, and turquoise model, were obtained by the dynamic hybrid shearing method (Figure 3(a)). Pearson correlation coefcients between expression profles of all gene pairs were transformed into network connection strengths (indicated by intensity in red) (Figure 3(b)). After that, the heatmap exhibited the relevance between four gene modules and glucose deprivation, resulting in the correlation between the blue module and glucose deprivation achieving 0.93 (P � 8e − 04) (Figures 3(c) and 3(d)). In addition, the correlation between module membership and gene signifcance was also analyzed, indicating that the blue module (n � 577 genes) may be especially critical for GD status (r � 0.84, P � 8.3e − 155) (Figure 3(e)). Meanwhile, the WGCNA heatmap revealed that the gene expression profle in the blue module was signifcantly overexpressed in HCC cells with GD treatment, compared with HCC cells disposed by high glucose (Figure 3(f )). 418 liver cancer patients, considered as the discovery cohort, were derived from the TCGA database. Te expression matrix of 577 GD hallmark genes in the blue module was adopted to compute the euclidean distance between any two individuals in the discovery cohort, and the nonlinear dimensionality reduction algorithm t-SNE was further applied to condense the euclidean distance into two-dimensional points. Subsequently, seven clusters with HCC patients were produced and every patient was allocated to the closest cluster ( Figure 4(a)), namely, 74, 73, 71, 69, 49, 45, and 37 patients in seven distinct clusters (from Cluster I to Cluster VII), respectively. Te RFS comparison showed that the most signifcant diferences were uncovered between Cluster II and Cluster III (Figure 4(b)). Tus, patients in Cluster III produced the best RFS, but patients in Cluster II had the worst prognosis (P � 0.0159; Figure 4(c)), suggesting that Cluster III and Cluster II perhaps represent the lowest and highest status of GD. Accordingly, suferers in Cluster II and Cluster III were sectionalized into "GD high " and "GD low " groups, separately. To engender GD-related DEGs, the expression profles of GD high and GD low groups were compared, resulting in 133 GD-related DEGs being confrmed (Figure 4(d)).   Color Key Journal of Oncology 9

Determination of EMT Status and EMT-Related DEGs in
in Cluster V) were grouped to analyze the relapse status among them (Figure 4(e)), which proved that the suferers in Cluster IV generated the best RFS, regarded as the EMT low group, compared with the patients in Cluster I, treated as the EMT high group (P � 0.0083; Figures 4(f ), and 4(g)). A total of 225 DEGs correlated with EMT status were distinguished by comparing the expression matrix of the EMT low and EMT high groups (Figure 4(h)).

GD-EMT-Related Prognostic DEGs in Liver Cancer.
On the basis of the above results, a two-dimension index, combined with GD and EMT status, was further explored, that is, patients were divided into three sets: GD low /EMT low , GD high / EMT high , and mixed groups. RFS analysis revealed positive diferences among the three groups (P < 0.001). Patients in the GD low /EMT low group had the best RFS, while those in the GD high /EMT high group had the worst RFS ( Figure 5(a)), suggesting that GD and EMT have consistent clues to their efects on recurrence status in liver cancer patients. To obtain the DEGs associated with GD-EMT, expression profles were compared between the GD low /EMT low and GD high /EMT high groups, drawing a conclusion on the identifcation of 17 GD-EMTrelated DEGs (Figure 5(b)), including 2 overexpressed in GD low / EMT low groups where patients had a higher survival rate and 15 overexpressed in GD high /EMT high groups where patients had a poorer outcome. Consequently, 12 critical GD-EMT-related DEGs, generated from the overlap among 133 GD-related DEGs, 225 EMT-correlated DEGs, and 17 GD-EMT-related DEGs were fltered out ( Figure 5(c)). Trough Pearson correlation analysis, relevance among the 12 GD-EMT-related DEGs, such as SLC2A4, ZNF746, EZR, GNA13, HNF4A, ITGA6, JUN, LASP1, MCL1, NDRG1, PCBP1, and SHC1 was displayed ( Figure 5(d)), among which the expression level of ZNF746, EZR, GNA13, HNF4A, ITGA6, LASP1, NDRG1, PCBP1, and SHC1 was decreased while the expression status of SLC2A4, JUN, and MCL1 was increased. Furthermore, enrichment analyses of functions revealed that the DEGs were associated with cadherin binding, cytoplasm, cytosal, liver development, and actin cytoskeleton reorganization in terms of gene oncology analysis. According to KEGG pathway analysis, the DEGs could take part in pathogenic Escherichia coli infection, the AMPK pathway, regulation of actin cytoskeleton, microRNAs in cancer, and the ErbB signaling pathway ( Figure 5(e)).    regression overlapping 4 genes (SLC2A4, HNF4A, JUN, and MCL1) among the 12 DEGs in the TCGA cohort and GSE76427 set that have signifcant efects on patients' prognosis (P < 0.05; Figure 6(a)). To accomplish the establishment of a predictive model, the LASSO regression analysis was then performed to flter signatures from 4 GD-EMT-related prognostic DEGs. Cross-validation was also performed to obtain the best λ value from the smallest partial likelihood bias to further identify DEGs signifcantly associated with prognosis in liver patients. Te corresponding coefcients were generated at the optimal log λ of 0.02036323. Te results are shown in Figures 6(b) and 6(c). Tus, the risk score was calculated by the following formula: the combination mRNA panel � (−0.010913578 × expression value of HNF4A + 0.002874759 × expression value of SLC2A4). Meanwhile, the risk score was closely related to the expression of most enrichment scores of GD-predicted pathways ( Figure 6(d)) and EMT-predicted pathways ( Figure 6(e)). Subsequently, the patients were classifed as high-risk or low-risk groups on account of the median value in training set (Figure 7(a)), validation set (Figure 7(b)), entire set (Figure 7(c)), and external testing set (Figure 7(d)). Te number of relapsed patients increased with an increasing risk score. Consistently, SLC2A4 expression was upregulated along with the downregulated expression levels of HNF4A. Te Kaplan-Meier analysis manifested that patients' RFS was longer in the low-risk group than that in the high-risk group (P < 0.01). Furthermore, PCA displayed that patients in diverse groups could be signifcantly clustered based on these two traits in all datasets. In conclusion, our fndings indicated the applicability of the two-gene trait in recidivism prediction.

GD-EMT-Based Risk Score and Prognosis Classifer in
Liver Cancer. Te univariate analysis displayed that the risk score, age at diagnosis, and TNM stage were signifcantly associated with patients' RFS in the training cohort, validation cohort, the entire TCGA cohort, and external validation cohort with hazard ratios (HRs) of 0.828797988, 0.899926153, 0.867811632, and 0.774179934, respectively. Also, multivariate Cox regression analysis further demonstrated that the risk score remained as an independent prognostic factor after integrating with various clinicopathologic characteristics, including age at diagnosis, AFP levels, TNM stage, BCLC stage, HBV, gender, and microvascular invasion (Figures 8(a)-8(d), Table 2). Due to the signifcant relationship between age, TNM stage, and patients' prognosis, the prognostic values of a diversity of age and TNM stage were also explored. Te Kaplan-Meier survival curves revealed that age and tumor stage could predict the outcome ( Figure 9). As expected, a higher risk score was positively associated with older age (Figure 10(a)) and higher tumor stage in four cohorts (Figure 10(b)). Te prognosis classifer was further validated within low-risk and high-risk patient subgroups with stage T1/T2 and stage T3/ T4 in four cohorts, respectively. As a result, patients in the low-risk group were able to generate better RFS compared to the high-risk group in both the T1/T2 and T3/T4 stage subgroups (Figures 11(a) and 11(b)). Similarly, stratifed analysis showed that risk scores could identify a diferent prognosis for suferers with age ≤ 60 or older (>60) (Figures 11(c) and 11(d)).

Nomogram Based on Risk Score and Clinicopathological
Features. We incorporated risk scores with clinicopathological characteristics (age at diagnosis and pathological tumor stage) to construct a nomogram to predict RFS. Te points for each factor and total points were calculated separately to assess RFS rates at 3 and 5 years (Figure 12(a)), and then the validity of the nomogram was assessed using ROC curves and calibration plots, and the fndings are shown in Figure 12(b). Te 3-and 5-year AUCs for both the internal and external validation cohorts were smaller than those for the training cohort (0.797 and 0.654 and 0.684 and 0.798, respectively) (Figures 12(c) and 12(e)). Te AUCs for the two time points were 0.801 and 0.794 in the TCGA cohort, respectively ( Figure 12(d)). In addition, the calibration plots show excellent agreement between predicted and observed results in the internal validation cohort (Figure 12(g)), the training cohort (Figure 12(f )), the GSE76427 cohort (Figure 12(i)), and the TCGA cohort ( Figure 12(h)).

Discussion
It is well known that the tumor microenvironment plays an important role in tumorigenesis by stimulating surrounding cells through the circulatory and lymphatic systems, which can further infuence tumor development [23][24][25]. At the same time, it can reprogram the surrounding cells so that they play a decisive role in tumor survival. Malignant tumors with rapidly proliferating cells regularly experience nutrient (e.g., glucose) deprivation, which promotes tumor progression and aggressiveness through EMT induction [26]. Also, cells exposed to low glucose could sufer malignant transformation with elevated formation of colonies when compared to high glucose medium [27]. Trough this study, we aimed to construct a model to solve the signifcant clinical issues by means of a comprehensive analysis of microenvironment characteristics and transcriptional profles. Currently, the coincident efect of EMT status and GD is apparently related to recidivation after stratifying patients by clinicopathological risk factors. Finally, the GD-EMT-basedtwo-gene characteristics were used as prognostic classifers for risk stratifcation and performed well in both the training and validation cohorts. Hence, this study synthetically analyzed the available HCC expression datasets to clarify GD-EMT-related DEGs to predict RFS for HCC      Figure 11: Kaplan-Meier curves of RFS according to the risk score in patients stratifed by pathological tumor stage and patients' age at diagnosis. Kaplan-Meier curves were applied to patients with lower tumor stage (T1/T2) (a), higher tumor stage (T3/T4) (b), lower age at diagnosis (≤60) (c), and higher age at diagnosis (>60) (d) in the training cohort, validation cohort, entire cohort, and GSE76427 cohort, respectively. Te tick marks on the Kaplan-Meier curves represent the censored subjects. Te two-sided log-rank test was used to determine diferences between the two curves. RFS, recurrence-free survival. 16 Journal of Oncology  suferers. Besides, we systematically evaluated the prognostic value of risk scores in HCC patients to establish a model with better accuracy.
Previous studies have shown that the GD-based microenvironment drives the emergence of the EMT state in cancer, resulting in the invasion and metastasis of tumor cells [28][29][30]. However, few indicators regarding GD status have been developed, much less to focus on comprehensive efects between GD and EMT status, as well as their potential roles in clinically relevant classifcation. Moreover, determination of GD status by a single biomarker is not sufcient because it may be liable to omit important information about biological processes [31][32][33][34]. Tus, the implementation of combined GD-EMT features across cohorts can be used to develop continuous metrics for the comprehensive assessment of TME. Te populations were divided into the GD low /EMT low and GD high /EMT high groups by subgroup classifcation, associated with diferent clinical prognosis, transcriptional GD-EMT patterns, and activation pathways that could be targeted for treatment. t-SNE has been used to discover potential subtypes of liver cancer, which provides an elegant dimensionality reduction technique [35][36][37]. In our study, t-SNE discerned disparate patterns of EMT status in TME based on a set of 1067 EMT hallmark genes from the dbEMT2.0 database, an updated database for EMT-related genes containing experimentally validated information and precomputed information on the regulation of cancer metastasis [38]. Also, EMT-predicted pathways during the EMT process were analyzed to explore their relationship with comprehensive features. When entering the GD state, GD-treated cancer cells without a specifc genetic signature could be classifed into diverse GD groups. Terefore, WGCNA, an efective method in many diseases that identifes modules of coexpressed genes [39], was employed to determine GD-related hub genes in one microarray dataset (GSE140867). As we all know, the invasive tumor cells in TME constantly exhibit dysregulated metabolism and enhanced aerobic glycolysis, leading to glucose depletion, hypoxia, immunosuppression, epigenetic modifcation, and lactic acid production [40,41]. Nevertheless, the correlational studies were mostly concentrated upon the following aspects, such as hypoxia [42,43], immune status [22,44], RNA m6A methylation [45,46], and lactic acid [47]. As a result, a synthetic study embracing GD characteristics in TME with EMT germination has not yet been studied in detail.
Studies have reported that the two signature genes in this study have a major role in multiple types of cancer. Hepatocyte nuclear factor 4 A (HNF4A), an orphan nuclear receptor, was one of the most important regulators of hepatocyte homeostasis, whose expression was frequently decreased in hepatocellular carcinoma. Cell invasion was closely associated with the downregulation of HNF4A expression, which promotes cancer metastasis [48,49]. As for solute carrier family-2-member-4-gene (SLC2A4), encoding glucose transporter-4-protein (GLUT4), it has been reported to serve as a novel therapeutic candidate for cancer treatment [50]. Te inhibition of SLC2A4 could compromise cell proliferation and metastasis in breast cancer [51], prostate cancer [52], and gastric cancer [53]. Tus, the two signature genes sifted from this study could ofer latent candidates to elucidate molecular mechanisms in liver cancer.
Tere were a wide variety of potential targets with corresponding detailed mechanisms having been certifed to be absolutely vital for tumor development. However, translating these eforts and discoveries from laboratory results to clinical applications is difcult. Hence, the incorporation of clinicopathological features and molecular markers could render a bran-new view for individualized treatment and prognostic observation. Our research provides a hint that patients in GDhigh/EMTigh status are considered high-risk patients, which can help clinicians make better decisions. In conclusion, this study linked microenvironmental characteristics and genetic profles to patient prognosis, which could better serve the clinical therapeutics of patients with liver cancer.
Tere were a few limitations to this study. First, this study was performed by using bioinformatics analyses. Tough we validated the results in several cohorts from public databases, we did not explore the relevant mechanisms in vivo experiments. Second, the clinical signifcance of the risk score needed further validation in prospective clinical trials. Tird, we defned median cut-of values for risk scores in all cohorts rather than optimal cut-of values. Tus, fndings in this study were waiting for further validation by well-designed, prospective, multicenter studies. Terefore, this study awaits further refnement with a welldesigned multicenter prospective study.

Conclusion
In brief, the EMT changes due to the tumor GD microenvironment that were closely related to the prognosis of liver cancer patients. Te GD-EMT-based genetic signature performed well in risk stratifcation and adds value beyond TNM staging. It can be used in clinical diagnosis for individualized treatment and prognosis, and follow-up can be scheduled on this basis.