Identification of Prognostic Markers for Head and NeckSquamous Cell Carcinoma Based on Glycolysis-Related Genes

Head and neck squamous cell carcinomas (HNSCCs) comprise a heterogeneous group of tumors. Many patients respond differently to treatment and prognosis due to molecular heterogeneity. There is an urgent need to identify novel biomarkers to predict the prognosis of patients with HNSCC. Glycolysis has an important influence on the progress of HNSCC. Therefore, we investigated the prognostic significance of glycolysis-related genes in HNSCC. Our results showed that ELF3, AURKA, and ADH7 of 20 glycolysis-related DEGs were significantly related to survival and were used to construct the risk signature. The risk score showed high accuracy in distinguishing the overall survival (OS) of HNSCC. The Kaplan–Meier curves demonstrated that the risk score was associated with an unfavorable prognosis in patients with female sex, male sex, grade 3, T1/2 stage, N+ stage, N2 stage, M0 stage, and clinical stage III/IV. Independent prognostic analysis showed that clinical stage and risk score were strongly associated with OS. Moreover, the risk score had higher accuracy in predicting 1-, 3-, and 5-year survival. AURKA and ADH7 were only significantly related to M1 macrophages and neutrophils, respectively, while ELF3 was significantly correlated with M2 macrophages and monocytes (all p < 0.05).The ceRNA network demonstrated that miR-335-5p and miR-9-5p may play core roles in the regulation of these three genes in HNSCC. The risk score constructed based on three glycolysis-related genes showed high accuracy in predicting the prognosis and clinicopathological characteristics of HNSCC.


Introduction
Head and neck cancers (HNCs) are a group of heterogeneous diseases with multifocal and multicellular origins, which occur in the nasal sinuses, nasal cavity, oral cavity, pharynx, throat, salivary glands, thyroid gland, etc. Squamous cell carcinomas comprise 90% of cases and are called head and neck squamous cell carcinoma (HNSCC) [1]. HNSCC, as the most common cancer type of the head and neck, is the sixth most common cancer worldwide, accounting for almost 5% of all malignant tumors [2]. Smoking, drinking, and human papillomavirus are risk factors for HNC and are related to the pathogenesis of HNSCC [3]. Although surgeries combined with radiotherapy, chemotherapy, and targeted therapy have been considered the main treatment methods in recent decades, less than 50% of patients have been cured. Moreover, the incidence of local recurrence in patients with HNSCC is 15-50% [4]. Although TNM staging is an important clinical prognostic factor to guide treatment choice, many HNSCC patients with the same TNM staging respond differently to radiotherapy, chemotherapy, and immunotherapy due to molecular heterogeneity, which greatly affects clinical efficacy and side effects [5]. erefore, there is an urgent need to identify novel biomarkers to predict the prognosis of patients with HNSCC.
Aerobic glycolysis is one of the most important metabolic characteristics of tumor cells. It can not only provide energy for tumor cells quickly (various intermediate metabolites produced in the process are the necessary precursors for other metabolic pathways, which can provide energy for tumor cells) but also provide raw materials for the synthesis of various biological macromolecules. More importantly, high-throughput glycolysis of tumor cells produces a high-lactate, low-glucose metabolic environment that can help maintain an immunosuppressive tumor microenvironment. Drug therapy targeting glycolytic genes can inhibit tumor progression and increase sensitivity to other therapeutic agents [6]. e expression levels of the three key glycolytic enzymes, namely, hexokinase, 6-phosphofructokinase, and pyruvate kinase, are significantly increased in oral precancerous and oral squamous cell carcinoma tissues [7]. Moreover, the expression of hexokinase 2/pyruvate kinase muscle isozyme M2 (HK2/PKM2) in tongue squamous cell carcinoma tissue is significantly increased and is significantly related to tumor staging, clinical staging, and lymph node metastasis and can be used as an independent factor to predict disease prognosis. HK2/PKM2 knock-down significantly inhibited the growth of tongue squamous cell carcinoma, transplanted tumors, and lung metastasis in mice [8]. ese results demonstrated the important role of glycolytic enzymes in the occurrence and development of oral cancer; thus, identifying reliable prognostic factors in patients with HNSCC based on glycolysis for accurate prognosis prediction and screening for high-risk cases are important for doctors to develop individualized treatment plans.
In recent years, the unique metabolic mode of tumor cells has attracted great attention and is expected to be an important drug target for targeted therapy [7]. Metabolic changes in the tumor microenvironment affect not only the biological characteristics of tumor cell proliferation and migration but also cause infiltration, distribution, and function changes in tumor immune cells [8]. Moreover, the Treg transcription factor Foxp3 reprograms Treg metabolism by inhibiting Myc signal transduction and glycolysis, and enhancing oxidative phosphorylation (OXPHOS) and nicotinamide adenine dinucleotide (NADH) oxidation, thereby providing Tregs the metabolic advantage in the lowglucose and high-lactic acid tumor microenvironment, thereby resisting lactate-mediated Tcell function and growth inhibition [8]. However, activated T cells require large amounts of glucose, amino acids, and fatty acids and adjust their metabolic pathways to increase glycolysis and glutamate decomposition activities [9]. Activated neutrophils and M1 macrophages also rely mainly on the glycolytic pathway for energy supply [10,11]. However, there is a lack of systematic exploration of the relationship between glucose metabolism and tumor immunity in HNSCC.
In the present study, three glycolysis genes (ELF3, AURKA, and ADH7) were significantly related to survival and were used to construct a risk signature that showed high accuracy in predicting the overall survival (OS) of HNSCC .   e risk score was associated with an unfavorable prognosis  in the patients with female sex, male sex, grade 3, T1/2 stage,  N + stage, N2 stage, M0 stage, and clinical stage III/IV. e risk score was also an independent prognostic factor for predicting the OS of patients with HNSCC. In addition, AURKA and ADH7 were significantly related to M1 macrophages and neutrophils, respectively, while ELF3 was significantly correlated with M2 macrophages and monocytes. e competing endogenous RNA (ceRNA) network demonstrated that miR-335-5p and miR-9-5p may play a core role in the regulation of these three genes in the development of HNSCC.

Data Source.
e transcriptome and corresponding clinical data from HNSCC (500 cases) and normal (44 cases) samples were downloaded from e Cancer Genome Atlas (TCGA) database. Moreover, the transcriptome and clinical information for HNSCC and normal samples were obtained from the GSE41613 dataset in the GEO database and were considered an independent external validation set for the prognostic signature. e 288 glycolysis-related genes were obtained from the MSigDB database (https://www.gseamsigdb.org/gsea/msigdb).

Identification of DEGs and Glycolysis-Related DEGs.
Based on a threshold value of |log 2 fold change (FC)| > 1 and a p value of <0.05, the DEGs were screened between the tumor and normal samples using the "limma" R package (v3.42.2). e DEGs were visualized by the volcano plot, and the top 50 DEGs were visualized as a heatmap. e glycolysis-related genes were intersected with the DEGs using Venn analysis (https://bioinformatics.psb.ugent.be/ webtools/Venn/).

Univariate and Multivariate Cox Regression Analyses.
e 500 HNSCC samples were divided into training (350 samples) and testing (150 samples) sets in a 7 : 3 ratio. e clinical characteristics of the HNSCC samples in the training and testing sets are reviewed in Table S1. e expression data of 20 glycolysis-related DEGs in the training set were subjected to univariate Cox regression analysis to determine those that were significantly associated with patient survival based on a p value of <0.2.
e genes identified in the univariate Cox regression analysis were included in the multivariate Cox regression and stepwise analyses to further filter the genes, ultimately obtaining the three genes related to prognosis, which were used to construct the risk signature. e risk score was calculated using the R package "survival" (v3.2-7) based on the linear model and predict.coxph function [12]. e risk score formula was as follows: (1) According to the median risk score, the 350 patients in the training set were classified into high-risk (>median) and low-risk (<median) groups. e OS and receiver operating characteristic (ROC) curve analyses were plotted respectively using the R packages "survminer" (v0.4.6) and "survivalROC" (v1.0.3) for the high-and lowrisk groups in the training set. e expression levels of the three genes were visualized on a heatmap using the R package "pheatmap" (v1.0.12). e risk score system of the three genes was constructed in the training set and evaluated using the testing set and the GSE41613 dataset. e correlations of the risk score and clinicopathological characteristics (sex, grade, clinical stage, and TNM stage) were evaluated by Kaplan-Meir survival curves using the R package "survival" (v3.2-7). e clinicopathological characteristics were included in the univariate and multivariate Cox regression analyses to confirm the independent prognostic factors based on p < 0.05. A nomogram was constructed using the R package "rms" (v5. [1][2][3][4] to predict the survival rates at 1, 3, and 5 years and to assess its effectiveness by calibration curves [13].

Distributions of the Immune Cells and eir Correlation with the Genes in the Risk Signature.
e distributions of immune cells between the high-and low-risk groups were analyzed using the CIBERSORT algorithm (v1.0.3). e differential immune cells (p < 0.05) were selected and their correlations with the three genes in the risk signature are shown in a correlation heat map.
2.6. Statistical Analyses. We used the R package "edgeR" to identify differentially expressed lncRNAs in the HNSCC and normal samples. Wilcoxon tests were used to compare the fractions of immune cells between HNSCC and normal samples in the CIBERSORT analysis. Pearson's correlation analysis was used to analyze the correlation between the three genes and differential immune cells. p < 0.05 was considered statistically significant.

Identification of Glycolysis-Related DEGs.
A total of 505 DEGs (Table S2) between the HNSCC and normal samples were identified, including 293 upregulated genes and 212 downregulated genes. e distributions of DEGs were visualized in a volcano plot (Figure 1(a)), and the top 50 DEGs showed obvious differences between the HNSCC and normal samples (Figure 1(b)). Among 288 glycolysis-related genes (Table S3), 20 overlapped with the DEGs (Figure 1(c)) and were included in the follow-up analyses.

Construction and Verification of the Prognostic Risk
Signature. To construct the glycolysis-related DEG-based prognostic signature, we performed univariate and multivariate Cox regression analyses in the training set. Univariate Cox regression analysis selected ELF3, AURKA, and ADH7 from 20 genes based on p < 0.2 (Table 1), and they were further included in the multivariate Cox regression analysis to determine the best characterized genes. e results are shown in Table 2, and ELF3, AURKA, and ADH7 were considered to be the optimal variables for the construction of the prognostic signature. e risk score was calculated using the aforementioned formula. e 350 samples in the training set were then divided into high-risk and low-risk groups according to the median of the risk score. e number of patients who died increased significantly as the risk score increased in the training set ( Figure 2(a)). In addition, the OS differed significantly (p � 1.491e − 02) between the high-and low-risk groups. e high-risk cohort was also correlated with a poor prognosis in the training set ( Figure 2(b)). Subsequently, ROC curve analysis of the prognostic risk score was performed at 1, 3, and 5 years to assess the predictive efficiency of the risk score. e results are shown in Figure 2(c), and the area under the curve (AUC) of the risk score in the training set was 0.592, 0.598, and 0.557 for patients with 1, 3, and 5year OS, respectively. e expressions of AURKA and ADH7 were upregulated, while ELF3 was downregulated in the high-risk group compared to those in the low-risk group in the training set ( Figure 2(d)).
To demonstrate the general applicability of the risk score, we performed the same analysis in both the testing set ( Figure 3) and the external validation set ( Figure 4). e results showed that the risk score was able to significantly differentiate the clinical outcomes of patients (all p < 0.05). Meanwhile, relatively higher predictive accuracy was observed in both the testing set and the external validation set. In the testing set, the AUCs for the risk score were 0.630, 0.600, and 0.616 for OS at 1, 3, and 5 years, respectively, and 0.687, 0.611, and 0.639 in the external validation set, respectively.
Evidence-Based Complementary and Alternative Medicine 3

Correlation of the Risk Score and Clinicopathological
Characteristics with OS. To investigate the correlation between the risk score and clinicopathological characteristics (sex, grade, clinical stage, and TNM stage), the risk score was visualized as box plots for these clinicopathological characteristics. e results showed that the risk score did not differ significantly for these characteristics (p > 0.05, Figure S1). To further investigate the correlation between the risk score and patient OS, Kaplan-Meier curves were plotted in patients stratified according to clinicopathological characteristics. e results showed that the risk score was associated with an unfavorable prognosis in patients with female sex, male sex, grade 3, T1/2 stage, N + stage, N2 stage, M0 stage, and clinical stage III/IV (p < 0.05, Figure 5), but not with grade 1/2, T3/4 stage, N0 stage, N1 stage, and clinical stage I/II (p > 0.05, Figure 5), suggesting that the risk scores might be associated with OS and clinicopathological characteristics.

Identification of Independent Prognostic Factors.
Univariate and multivariate Cox regression analyses were performed using the TCGA dataset to explore the independent prognostic factors based on the abovementioned clinicopathological characteristics.
e univariate analysis showed that the clinical stage and the risk score were significantly correlated with prognosis (p < 0.05, Table 3 and Figure 6(a)). In the multivariate analysis, based on the abovementioned significant factors, the clinical stage and risk score remained strongly associated with OS (clinical stage, p � 0.009; risk score, p < 0.001; Table 4 and Figure 6(b)), which were considered independent prognostic factors for HNSCC. After that, we constructed a nomogram model based on these independent prognostic factors that could predict the OS of patients ( Figure 6(c)). Further calibration curves showed that the nomogram model had the ability to predict the 3-year OS of patients in an approximate way to the actual observed values ( Figure 6(d)), suggesting that the nomogram model may be more suitable for predicting the midterm survival probability of patients.

Relationships between the Genes in the Risk Signature and
Immune Cells. Glycolysis-related genes were upregulated in samples of melanoma and lung cancer was poorly infiltrated by T cells. Moreover, the overexpression of glycolysis-related molecules impaired T cell killing of tumor cells, whereas the inhibition of glycolysis enhanced T cell-mediated antitumor immunity in vitro and in vivo [14]. us, collectively, immune cells may participate in glycolysis regulation during cancer occurrence and development. e present study investigated the relationship between the genes in the risk signature and 21 immune cell lines using the CIBERSORT algorithm. e results showed significantly   Evidence-Based Complementary and Alternative Medicine reduced fractions of B cell memory cells, T follicular helper cells, monocytes, and neutrophils in the high-risk group compared to those in the low-risk group (allp < 0.05). e fractions of M1 and M2 macrophages in the high-risk group were significantly higher than those in the low-risk group (p < 0.05, Figure 7(a)). Furthermore, the correlations of the three genes and the immune cells are shown in Figure 7(b); ELF3 showed a significant negative correlation with M2 macrophages and a significant positive correlation with monocytes; AURKA presented the highest positive relationship with M1 macrophages; ADH7 displayed a significant negative correlation with neutrophils. Taken together, the three genes related to glycolysis regulated the malignant development of HNSCC, probably by affecting the activity of immune cells,      particularly macrophages, monocytes, and neutrophils. However, whether these three genes could be used as targets for immunotherapy in HNSCC requires further research.

Construction of the ceRNA Network Based on the Genes in the Risk Signature.
To investigate the coregulatory network of the genes in the risk signature, a ceRNA network was constructed using the three genes, three 3 miRNAs, and 175 lncRNAs. As shown in Figure 8, the ceRNA network was roughly divided into two clusters connected by several lncRNAs; ADH7 was directly related to miR-335-5p, which was in a central position in a cluster and was associated with multiple lncRNAs; miR-124 was connected to both ELF3 and AURKA, but not to any lncRNAs; miR-9-5p, directly connected to ELF3 and indirectly connected to AURKA, was located in the center of another cluster and was related to multiple lncRNAs. ese results revealed that miR-335-5p and miR-9-5p may play core roles in the regulation of these three genes in HNSCC.

Discussion
e recurrence and metastasis of head and neck tumors are the main reasons for the low OS rate of patients. erefore, the identification of an exact tumor marker to assist in  clinical diagnosis, judge patient response to treatment, and detect local recurrence and metastasis of tumors early has become a research hotspot. Cytokeratin 19 serum fragment 21-1 (Cyfra21-1), squamous cell carcinoma-associated antigen (SCCAg), tissue polypeptide-specific antigen (TPS), carcinoembryonic antigen (CEA), Fe protein, and circulating tumor cells (CTCs) are tumor markers for the diagnosis and prognosis of HNSCC [15]. However, their lack of sensitivity and specificity means that they are not considered specific tumor markers for HNSCC.
In the present study, ELF3, AURKA, and ADH7 were significantly related to the OS of HNSCC and were used to construct a risk signature that showed high accuracy in predicting the prognosis of HNSCC. e risk scores of ns *** ns ns ns ns *** ns ns ns ns *** ns * ** ns ns ns ns ns *  14 Evidence-Based Complementary and Alternative Medicine HNSCC patients with female sex, male sex, grade 3, T1/2, N + , N2, M0, and clinical stage III/IV were associated with a poor prognosis. Moreover, the risk score was an independent prognostic factor for predicting the OS of patients with HNSCC. In addition, AURKA and ADH7 were significantly correlated with M1 macrophages and neutrophils, respectively, while ELF3 was significantly correlated with M2 macrophages and monocytes. Furthermore, the ceRNA network revealed the potential roles of miR-335-5p and miR-9-5p in the regulation of these three genes in HNSCC progression. e E26 transformation-specific (ETS) transcription factor family is located on chromosome 1q32.1 and has a conserved winged helix-turn-helix DNA-binding domain, the  Evidence-Based Complementary and Alternative Medicine ETS domain, that can bind to the typical DNA sequence 50-GGA(A/T)-30 [16][17][18]. Members of this family can act as upstream and downstream effectors of most signaling pathways, such as MAP kinase, Erk1/2, p38, and JNK, which play important roles in cell differentiation, development, proliferation, apoptosis, tissue remodeling, and epithelialmesenchymal transition [19][20][21][22][23][24]. ELF3 (E74 is similar to the ETS transcription factor 3), also known as ESE-1, EPR-1, ESX, and ERT, is expressed in the nucleus and cytoplasm [25]. ELF3 is closely associated with bladder, ovarian, biliary tract, gastric, cervical, breast, prostate, lung, liver, and colon cancer and increases cell proliferation, invasion, and migration [26,27]. ELF3 is highly expressed in head and neck tumors and upregulates the epidermal growth factor receptor (EGFR) and human epidermal growth factor receptor 2 (HER2) to attenuate the antiproliferative effects of EGFR/HER2 tyrosine kinase inhibitors (lapatinib and afatinib) [28].
e Aurora protein kinase family includes serine/threonine kinases that participate in many processes of cell mitosis, such as cell cycle G2/M conversion, mitotic spindle assembly, and chromosome separation [29]. e Aurora protein kinase family members include Aurora A, Aurora B, and Aurora C. e AURKA kinase is frequently amplified or overexpressed in malignant tumors such as pancreatic, prostate, gastric, breast, and colon cancers [30,31]. Abnormal AURKA expression promotes malignant tumor occurrence and development through a variety of possible mechanisms, including promoting cell cycle processes, activating cell survival and/or antiapoptosis signal transduction, enhancing oncogene carcinogenicity, promoting the epithelial-mesenchymal transition, and dry cell transformation of cancer cells [32]. In recent years, increasing evidence has demonstrated a potential relationship between AURKA expression and tumor immunity. e initial understanding of the influence of AURKA on the immune response comes from its application in immunotherapy, in which the AURKA epitope initiated an anti-AURKA immune response, thereby killing tumor cells with high AURKA expression. Moreover, AURKA inhibition directly interfered with the immune response by inducing cell transformation, T cell activation, and immune cell infiltration [33]. Reiter et al. reported that AURKA upregulation promoted the disease stage, the occurrence of positive regional lymph nodes, and the level of distant metastasis, reducing the disease-free survival time and OS rate of patients [34]. Huang et al. [35] indicated that the risk of oral cancer increased significantly among smokers with high AURKA expression, which had a certain reference value for screening groups at high risk for oral cancer. Yang et al. [36] showed that MLN8237, an AURKA inhibitor, inhibited AURKA autophosphorylation at r288, leading to abnormal cell division and cell cycle to stop cell aging in the G2/M phase.
Alcoholic dehydrogenase (ADH) is a key enzyme in alcohol metabolism [37] and can be divided into five categories, including seven genotypes: ADH1a, ADH1b, ADH1c, ADH4, ADH5, ADH6, and ADH7 [38]. ADH plays an important role in the metabolism of alcohol, retinol, and other substances to oxidize ethanol to acetaldehyde. us, abnormal or unbalanced expression of this enzyme leads to abnormal alcohol metabolism [39]. Recently, the ADH family was reported to also play a role in the prognosis of gastric cancer, breast cancer, and nasopharyngeal carcinoma [40]. ADH7 showed promise as a prognostic biomarker and chemotherapy target for gastric adenocarcinoma, with low ADH7 expression and is related to the prognosis of patients with better histological types of adenocarcinoma and squamous cell carcinoma [41]. In addition, ADH1B + 3170A>G and ADH1C + 13044A>G single-nucleotide polymorphisms were associated with increased risks of HNSCC [42].
In conclusion, ELF3, AURKA, and ADH7 were highly accurate in predicting the prognosis and clinicopathological features of patients with HNSCC, providing a new perspective for prognosis analysis and immune target therapy for HNSCC. However, this study selected few datasets and there were few tools to predict miRNA target genes. e databases used were not updated consistently and the annotation of the gene information was not perfect, which may have led to false-positive results. However, our research provides a preliminary basis and new ideas for research. Follow-up research will further focus on the functions and molecular mechanisms of these three genes miR-335-5p and miR-9-5p in the development of HNSCC. A large number of independent sample sets will also be used to verify these markers.

Conclusions
e risk score constructed based on three glycolysis-related genes showed high accuracy in predicting the prognosis and clinicopathological characteristics of HNSCC.

Data Availability
e figure data used to support the findings of this study are included within the article.

Conflicts of Interest
e authors declare that they have no conflicts of interest.
between the HNSCC and normal samples. Table S3: 288 glycolysis-related genes. Figure S1: the correlation between the risk score and clinicopathological characteristics. (Supplementary Materials)