Seven Immune-Related Genes' Prognostic Value and Correlation with Treatment Outcome in Head and Neck Squamous Cell Carcinoma

Background Head and neck squamous cell carcinoma (HNSCC) is a growing concern worldwide, due to its poor prognosis, low responsiveness to treatment, and drug resistance. Since immunotherapy effectively improves HNSCC patients' survival status, it is important to continuously explore new immune-related predictive factors to accurately predict the immune landscape and clinical outcomes of individuals suffering from HNSCC. Methods The HNSCC transcriptome profiling of RNA-sequencing data was retrieved from TCGA database, and the microarray of GSE27020 was obtained from the GEO database for validation. The differentially expressed genes (DEGs) between HNSCC and normal samples were identified by multiple test corrections in TCGA database. The univariate and multivariate Cox analyses were performed to identify proper immune-related genes (IRGs) to construct a risk model. The Cox regression coefficient was employed for calculation of the risk score (RS) of IRG signature. The median value of RS was utilized as a basis to classify individuals with HNSCC into high- and low-risk groups. The Kaplan-Meier (K-M) survival analysis and receiver operating characteristic (ROC) curves were employed for the identification of the prognostic significance and precision of the IRG signature. The signature was also evaluated based on clinical variables, predictive nomogram, mutation analysis, infiltrating immune cells, immune-related pathways, and chemotherapeutic efficacy. The protein-protein interaction (PPI) network and functional enrichment pathway investigations were utilized to explore possible potential molecular mechanisms. Finally, the hub gene's differential mRNA expression levels were evaluated by means of the Gene Expression Profiling Interactive Analysis (GEPIA), and the Human Protein Atlas (HPA) was utilized for the validation of their translational levels. Results Collectively, 1593 DEGs between HNSCC and normal samples were identified, of which 136 IRGs were differentially expressed. Then, the 136 immune-related DEGs were mostly enriched in the cytokine-related signaling pathways by GO and KEGG analyses. After that, a valuable signature based on seven genes (DKK1, GAST, IGHM, IL12RB2, SLURP1, STC2, and TNFRSF4) was designed. The HNSCC patients into the low-risk group and the high-risk group were divided by using the median RS; the HNSCC patients in the high-risk group had a worse survival than those in the low-risk group. The risk signature was verified to be an independent predictive marker for HNSCC patients. Meanwhile, the RS had the largest contribution to survival of these patients based on the predictive nomogram. In addition, the low-risk HNSCC patients exhibited significantly enriched immune cells, along with an association with high chemosensitivity. Conclusion The constructed gene signature can independently function as a predictive indicator for the clinical features of HNSCC patients. The low-risk HNSCC subjects might benefit from immunotherapy and chemotherapy.


Introduction
As the sixth most prevalent type of malignancy, head and neck squamous cell carcinoma (HNSCC) is the seventh main cause of cancer-related mortalities globally [1].A study conducted in the United States predicted that by 2022, approximately 66,470 new HNSCC cases would arise and 15,050 HNSCC-related deaths would occur [2].Despite steady advancements in relevant medical treatments, like surgery, radiotherapy, and chemotherapy, the five-year survival of individuals with HNSCC has not significantly improved [3].Therefore, finding new and innovative novel prognostic factors for HNSCC patients is an urgent need.
HNSCC is considered an immunodeficiency disease.The main mechanisms underlying the disease include the induction of immune tolerance, local immune escape, and the destruction of T-cell signals [4].The immune microenvironment of HNSCC has been widely studied [5,6].For instance, the human leukocyte antigen (HLA) is responsible for the vital function of transmitting signals between tumor antigen peptides and killer T cells [7].A previous study demonstrated that more than 50% of HNSCC patients had low HLA expression, with extensive lymph node metastasis and poor prognosis [8].HNSCC tumor cells could also release chemical factors, to induce many immunosuppressive hematological cells to enter the immune microenvironment, thus suppressing the immune response [9].
Meanwhile, studying how HNSCC survival is linked to infiltrating immune cell proliferation and function could improve the survival of HNSCC patients [10][11][12].Immunotherapy's effect on the clinical outcomes of individuals with HNSCC has also been intensively studied [13][14][15].This is why the exploration of immune-related biomarkers to anticipate the clinical features of individuals with HNSCC is imperative.Recent studies demonstrated that immunerelated biomarkers could affect the biological behavior of HNSCC as well as the status of patients.For instance, Yao et al.'s model consisted of four immune-related genes (IRGs), including PVR, TNFRSF12A, IL21R, and SOCS1 [16]; Chen et al. constructed predictive model based on three IRGs (SFRP4, CPXM1, and COL5A1) [17]; and Zhang et al. established a model based on six IRGs (PLAU, STC2, TNFRSF4, PDGFA, DKK1, and CHGB) for the prognostic prediction of HNSCC [18].Although these studies have constructed a proper model to predict the prognosis of patients with HNSCC, the progression of HNSCC is complexity and uncertainty.Therefore, to date, reliable and predictive biomarkers for identifying HNSCC are still limited; it is essential to continuously look for newly representative biomarkers.
During this research, a risk signature of seven immunerelated genes was developed for accurately predicting the clinical outcomes of HNSCC subjects, which may provide an effective treatment strategy for these patients.
2.2.Development of the IRG-Based Signature.The differentially expressed genes (DEGs) between cancerous and healthy samples were identified with the help of TCGA database after multiple test corrections by false discovery rate ðFDRÞ < 0:05 and jlog fold change ðFCÞj > 2 [21].Then, screening the intersections of these DEGs with IRGs was carried out.Univariate Cox proportional hazards regression analysis was employed for the determination of the predictive ability of IRGs for the overall survival (OS) of individuals with HNSCC with the aid of the "survival" package in R [22].The genes with a threshold of P < 0:05 were subjected to further evaluation using multivariate Cox regression analysis [23].Then, the expression levels of the hub genes were compared for further exploring their expression features in normal and HNSCC tumor tissues.Subsequently, the calculation of the IRG-based signature-related risk score (RS) was done using the Cox regression coefficient and gene expression formula given below: N, Expi, and Co-eff indicate signature gene number, gene expression levels, and regression coefficient values, respectively.Using the median value of RS as a criterion, the individuals with HNSCC were classified into the low-and highrisk groups.

Prognosis
Prediction by the IRG-Based Signature.For the verification of the IRG-based signature's prognostic performance, the signature's impact on OS in both risk groups was subjected to comparison by Kaplan-Meier (K-M) survival analysis utilizing the "survival" package in R software, followed by the area under the curve (AUC) of the receiver operating characteristic (ROC) curves to assess IRG-based signature's accuracy through the "survival ROC" package in R software [24].Subsequently, the GEO database (GSE27020) was used for external validation.It was also subjected to K-M survival analysis and ROC curves to identify the signature's prognostic value and precision.The RS distribution and survival status of individuals with HNSCC in TCGA were constructed to further understand the prognostic capability of the signature.

Correlation Analysis.
The association of IRG-based signature with clinical variables (age, pathological grade, gender, and tumor and TNM stages) was analyzed.Moreover, the clinical factors and robustness of the signature in predicting the OS were demonstrated by employing univariate and multivariate Cox regression analyses.

2
Mediators of Inflammation 2.5.Construction of a Predictive Nomogram.Based on clinical variables, a nomogram was used to establishment a prognostic scoring system for predicting survival in HNSCC patients both in TCGA and GSE27020 databases.
2.6.Somatic Mutation Analysis.We obtained the somatic mutation profiles of all tumor samples from TCGA database and explored the mutation analysis for 528 patients.The R software "maftools" package was utilized to analyzed and visualized for mutation data of the low-risk group and the high-risk group.

Immune Microenvironment Analysis.
Considering the involvement of infiltrating immune cells in the tumor microenvironment, the single-sample gene set enrichment analysis (ssGSEA) algorithm was utilized to evaluate the immune score of each HNSCC sample from TCGA and GSE27020 databases [25].The different proportions of the infiltrating immune cells between the low-and high-risk groups were assessed by the Wilcoxon test.Moreover, the association between the RS and immune-related biological functions was performed for further exploring the underlining mechanisms.The gene expression profiles corresponding to samples of TCGA and GSE27020 databases were selected to perform the gene set variation analysis (GSVA).
2.8.Prediction of Clinical Application.The calculation of the half inhibitory concentration (IC50) of common chemotherapeutic agents was done, and the differences in the IC50 across the two risk groups were also evaluated for predicting the clinical application of the IRG-based signature both in TCGA and GSE27020 databases.
2.9.Molecular Mechanism Analysis.The STRING biological database (https://string-db.org/)was applied for extraction of the protein-protein interaction (PPI) network [26] as a mathematical representation of the physical contacts among differentially expressed IRGs linked to HNSCC patient survival.Thereafter, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were employed in determining the potential function of the immune-related DEGs [27].3 Mediators of Inflammation 2.10.Investigating the Expression of Hub Genes.The Gene Expression Profiling Interactive Analysis (GEPIA) (http:// gepia.cancer-pku.cn/)was utilized for the investigation of the differential mRNA expression profiles of the hub genes in the IRG-based signature.Moreover, the Human Protein Atlas (HPA) (https://www.proteinatlas.org/) was employed for the purpose of validating the translational levels of these hub genes.
2.11.Statistical Analysis.R version 3.6.2was utilized to conduct statistical analysis procedures.DEGs were com-pared with multiple test corrections with FDR < 0:05 and jlogFCj > 2 were viewed as being dramatically dysregulated.The survival curves were estimated by using the K-M survival analysis and log-rank test between different groups.Clinicopathological features were compared by univariate and multivariate Cox regression analyses.The ssGSEA algorithm and Wilcoxon test were used to compare different proportions of the infiltrating immune cells between different groups.The t-test or Wilcoxon test for comparisons of two variables, and a P < 0:05 (two-side) was taken as a statistically significant value.5 Mediators of Inflammation identified to predict survival in the univariate Cox regression analysis.To study their interactions, the STRING biological database was utilized to construct the PPI network, containing 11 nodes and 22 edges.Based on the degree of genes, IL1A, CTLA4, CCR8, IL12RB2, TNFRSF4, CXCL13, and PLAU appeared to be the core genes among these IRGs (Figure 3(a)).As for functional analysis in Figure 3(b), the 136 immune-related DEGs were mostly enriched in immune response/cytokine mediation (BP), immunoglobulin complex/external side of plasma membrane (CC), and cytokine activity/signaling receptor activator activity/receptor ligand activity (MF) by GO analysis.By KEGG pathway analysis, the genes were mostly enriched in cytokine-cytokine receptor interaction, viral protein interaction with cytokine and cytokine receptor, and chemokine signaling pathway.

Development and Verification of the IRG-Based
Prognostic Signature.The detailed characteristics along with population demographics are given in Table 1.To develop a predictive IRG-based signature, seven IRGs were chosen after univariate and multivariate Cox analyses (Table 2).Meanwhile, the expression levels of the seven IRGs were further investigated.Compared with normal tissues, only SLURP1   Moreover, individuals with HNSCC were categorized into the low-risk group (n = 249) and the high-risk group (n = 249) as per the median RS.HNSCC patients at high risk showed a worse survival in the K-M analysis, in comparison to the patients at low risk (P < 0:001) (Figure 4(a)), with the AUC of 0.685 for the 5-year ROC curve (Figure 4(b)), indicating a certain predictive value of the signature in predicting the survival of individuals with HNSCC.Meanwhile, this IRG-based signature was validated in the GEO database (GSE27020) of 109 HNSCC patients who were also grouped into the low-risk group (n = 54) and the high-risk group (n = 55).Consistent with TCGA database, the K-M analysis of the GEO data exhibited that high-risk HNSCC individuals presented a worse survival in comparison with the low-risk group (P < 0:05) (Figure 4(c)), with the AUC of 0.637 for 5-year ROC curve (Figure 4(d)).
Additionally, there were more deaths in HNSCC patients with the elevation in the value of RS (Figures 4(e) and 4(f)), and the seven genes showed differences in mRNA expression across the two groups in the heat map (Figure 4(g)).Furthermore, the associations between clinicopathological parameters and the RS and associations between the seven genes and clinical variables were also evaluated (Table 3, Figure 5(c)).The results revealed that the male gender, high tumor stage, and T stage were linked to a greater value of RS.In addition, mRNA expression levels of IGHM and SLURP1 appeared to be elevated in females in comparison to males.The mRNA expression level of STC2 appeared to be elevated in males when compared with females, and a high pathological grade was correlated with lower mRNA expression of GAST and SLURP1.The results also suggested that higher mRNA expression of IGHM was remarkably linked to a high grade.Greater mRNA expression of GAST was significantly linked to a more advanced tumor stage.The elevated mRNA expression level of DKK1 and GAST was correlated with the advanced T stage.Moreover, lower mRNA expression of IGHM, as well as higher mRNA expression of SLURP1 and STC2, was correlated with the advanced M stage.3.5.Somatic Mutation Analysis.We obtained somatic mutation profiles of 528 patients in TCGA database.Around 241 (97.57%) and 229 (93.47%) samples possessed somatic mutations in the high-risk and low-risk groups, respectively.The top 30 mutated genes for high-risk and low-risk groups 3.6.Immune Microenvironment Analysis.The association between 23 immune cells infiltration differences and different risk groups was analyzed in TCGA and GSE27020 databases.Patients in the low-risk group showed higher     In addition, the relationship between immune pathway scores and RS were analyzed in order to better explore the immune-related biological functions.Functions with a correlation greater than 0.2 and P < 0:05 are shown in Supplementary Figure 1.The results indicated that 14 immunerelated pathways were correlated negatively with the RS in TCGA database (Supplementary Figure 1A).In the GSE27020 database, 8 immune-related pathways were correlated negatively with the RS, while 1 was correlated positively (Supplementary Figure 1B).These immunerelated pathway scores vary with increasing levels of RS, implying that an imbalance in these pathways is closely related to tumor development.

Prediction of Clinical Application.
The association of risk with the therapeutic efficacy of common chemotherapeutic agents in HNSCC was also studied.The findings exhibited that the low-risk HNSCC patients presented increased sensitivity to Elesclomol, GW843682X, Midostaurin, Pazopanib, QS11, and Salubrinal in TCGA database (Figure 9(a)), and the low-risk group was more likely with higher sensitivity of Bexarotene, BI.2536, MG.132, QS11, Salubrinal, and Thapsigargin in the GSE27020 database (Figure 9(b)).The results indicated that HNSCC patients with low risk represented higher sensitivity to chemotherapy.

Investigation of the Expression of the Seven IRGs.
The expression of the seven IRGs in HNSCC was explored with the help of the GEPIA database.The expression levels of the seven IRGs varied remarkably across cancerous and healthy tissues (Figure 10(a)).However, to validate these findings, more experimental analyses were required.Moreover, the HPA database was employed to investigate the expression of the seven IRGs at the translation level.Among the seven IRGs, expressions of IGHM and SLURP1 were lower in the HNSCC tissues.Moreover, STC2 showed higher expression in HNSCC by immunohistochemistry.No remarkable variations were observed in the expressions of GAST, IL12RB2, and TNFRSF4 across normal and HNSCC tissues, while DKK1 was not detected by immunohistochemistry in the HPA database (Figure 10(b)).However, to further validate the translational relevance of the seven IRGs on HNSCC, more clinical analyses on HNSCC samples are needed.

Discussion
During this research, an IRG-based signature was established, which was capable of anticipating the clinical landscapes of HNSCC patients and correlated with clinicopathological characteristics of affected individuals, the numbers of tumor-infiltrating immune cells, and the efficacy of common chemotherapeutics.These findings suggested that this signature may be valuable for predicting HNSCC-related prognosis and provide good clinical application in immunotherapy and chemotherapy.
The IRG-based signature consisted of seven genes (i.e., DKK1, GAST, IGHM, IL12RB2, SLURP1, STC2, and TNFRSF4).Among them, DKK1 is a member of the DKK family and regulates cell proliferation, migration, and apoptosis in various tumor tissues through β-catenin-dependent and β-catenin-independent mechanisms [28].Moreover, as a tumor suppressor gene, DKK1 causes apoptosis and suppresses cell proliferation [29].Gao et al. suggested that elevated DKK1 expression levels can predict poor prognosis in HNSCC patients [30].STC2 regulates tumor cell     14 Mediators of Inflammation proliferation, apoptosis, and angiogenesis and is also vital for the invasiveness and metastasis of HNSCC [31].IL12RB2 is a subunit of the IL-12 receptor, and an increased ratio of IL12RB2-positive tumor-infiltrating lymphocytes is indicative of a good prognosis in laryngeal cancer [32].SLURP1 belongs to the Ly6/uPAR family that lacks a GPI-anchoring signal sequence and is associated with a poor prognosis of HNSCC [33].Furthermore, one of the tumor necrosis factor receptors, TNFRSF4, could be a useful target for immunotherapy of HNSCC [34].Although there are no published reports on GAST and IGHM for HNSCC, these genes may be related to tumorigenesis and development [35,36].In general, these previous findings emphasize the importance of these seven genes in HNSCC prognosis prediction.Furthermore, the expression levels of GAST, IL12RB2, and TNFRSF4 in HNSCC samples appeared to be elevated in healthy tissues from the GEPIA database, while no apparent variations were observed between cancerous and healthy tissues from the HPA data.Except for SLURP1 and STC2, IGHM expression in HNSCC tissues was remarkably increased compared to that in healthy samples from the GEPIA database, which was inconsistent with the HPA database.This could be due to abnormal methylation.However, further experimentation is required to confirm this finding.
In the multivariate analysis for the associations between clinicopathological factors and the risk IRG-based signature, a high-immune RS was linked to a high tumor stage and T stage.Also, the signature predicted the possible clinical features of HNSCC subjects, likely by regulating the tumor immune microenvironment.The tumor-infiltrating immune cells are known to be correlated with the progression and OS of HNSCC subjects [37], and a high level of infiltrating immune cells is often a good predictor for the OS of patients [38,39].Therefore, the risk IRG-based signature is expected to correlate with infiltrating immune cells.As expected, lowrisk HNSCC patients had increased infiltration rates of 23 immune cells, indicating the effectiveness of immunotherapy in the low-risk category compared with that in the high-risk category.Owing to the importance of chemotherapy in HNSCC, the IC 50 values of various chemotherapeutic agents were compared in the two groups.A lower RS was linked to a higher IC 50 of QS11 and Salubrinal in both TCGA and GSE27020 databases.The QS11 is an inhibitor of ADPribosylation factor GTPase-activating protein 1, which modulates Wnt/β-catenin signaling through an effect on protein trafficking [40]; and the Salubrinal is a selective cell complex inhibitor that inhibits endoplasmic reticulum stressmediated apoptosis.Despite these two drugs are not commonly used as chemotherapy drugs for HNSCC, the findings of this study may be valuable for future research.
This study had some limitations.(i) Owing to limited HNSCC samples in TCGA and GSE27020 databases, an issue of the time period was evident in this study.(ii) The analyses were performed using publicly available data from retrospective studies, and the outcomes must be validated in further research with larger samples and functional experiments.(iii) There is a need for further exploration of other possible predictive factors linked to clinical outcomes in HNSCC.(iv) There is a need for further investigation of the mechanisms underlying the functions of the IRG-based signature in HNSCC.Bioinformatics analysis with a specific reference value was used as a basis to conclude this study.Further corresponding molecular experiments are required to validate these findings.
To conclude, a risk signature based on seven IRGs (DKK1, GAST, IGHM, IL12RB2, SLURP1, STC2, and TNFRSF4) was developed.This signature serves as a potential biological marker and treatment target for immunotherapy and chemotherapy of HNSCC.These findings may facilitate future studies on the molecular mechanisms of HNSCC.

Figure 2 :
Figure 2: Analysis of differentially expressed genes.(a) Volcano plot of differentially expressed genes in HNSCC.Red dots represent upregulated genes, and green dots represent downregulated genes with statistical significance (FDR < 0:05, jlogFCj > 2), while black dots represent the genes without differential significance.(b) Heatmap of differentially expressed genes in HNSCC tumor tissues.The colors from green to red represent differentially expressed genes with low to high expression levels.(c) Volcano plot of differentially expressed immune-related genes in HNSCC tumor tissues.(d) Heatmap of differentially expressed immune-related genes in HNSCC tumor tissues.

Figure 3 :
Figure 3: Immune-related differentially expressed genes analyzes.(a) PPI network of immune-related differentially expressed genes as predictors of prognosis of HNSCC patients.The Arabic numerals represent the degree of genes.(b) The GO and KEGG pathway analyses based on immune-related differentially expressed genes.(c) Comparison of the expression levels of the seven IRGs between normal tissues and tumor tissues.* P < 0:05, * * P < 0:01, and * * * P < 0:001.

3. 4 .
Construction of a Predictive Nomogram.The clinical variables and RS were included in the nomogram.As indicated in the nomogram, the RS had the largest contribution to survival of patients with HNSCC both in TCGA and GSE27020 databases (Figures 6(a) and 6(b)).

Figure 4 :
Figure 4: Development and validation of prognostic signature derived from immune-related genes.(a) K-M analysis of the effect of the prognostic signature on OS of HNSCC patients in TCGA database.(b) ROC curves of the prognostic signature in TCGA database.(c) K-M analysis of the effect of the prognostic signature on OS of HNSCC patients in the GEO database.(d) ROC curves of the prognostic signature in the GEO database.(e) The RS distribution in HNSCC patients.(f) The survival status of HNSCC patients.(g) Heatmap of the expression of nine immune-related genes in the low-risk and high-risk groups.

Figure 5 :
Figure 5: Correlations between the prognostic signature and clinical characteristics of HNSCC.(a) Forest plot of univariate Cox analysis.(b) Forest plot of multivariate Cox analysis.(c) Correlations between the risk score of expression of the seven genes and clinical characteristics.

Figure 6 :
Figure 6: Nomogram for the prediction of survival for patients with HNSCC.(a) Nomogram for the prediction of survival at 3 and 5 years in TCGA database.(b) Nomogram for the prediction of survival at 3 and 5 years in the GSE27020 database.

Figure 7 :
Figure 7: Evaluation of somatic mutation.(a) The mutation profile of the top 30 mutation genes in high-risk patients.(b) The mutation profile of the top 30 mutation genes in low-risk patients.

Figure 10 :
Figure 10: The expression of the seven IRGs.(a) The mRNA expressions of the hub genes from the GEPIA database.(b) Validation of the hub genes on a translational level using the HPA database.

Table 1 :
Comparison of characteristics between TCGA database and the GEO database (GSE27020).
TCGA: The Cancer Genome Atlas; GEO: Gene Expression Omnibus.* Chi-square test for the comparison of characteristics between TCGA database and the GSE27020 database for each clinical variable.

Table 2 :
The detailed information of the immune-related gene signature for the survival of HNSCC patients.

Table 3 :
Correlations between the seven immune-related genes and clinical characteristics.