HPV-Related Prognostic Signature Predicts Survival in Head and Neck Squamous Cell Carcinoma

Background Head and neck squamous cell carcinoma (HNSCC) is one of the most common cancers, worldwide. Considering the role of human papilloma virus (HPV) in tumor development and sensitivity to treatment of HNSCC, we aimed to explore the prognostic classification ability of HPV-related signatures in head and neck cancer. Methods HPV-related signatures were screened out based on Gene Expression Omnibus (GEO) and the Cancer Genome Atlas (TCGA) databases. HPV-related signatures with prognostic value were identified through univariate Cox regression analysis and a risk signature was established by least absolute shrinkage and selection operator (LASSO). Further, we developed a nomogram by integrating independent prognostic factors. Results A total of 55 HPV-associated signatures were differentially expressed and ten of them were associated with prognosis of HNSCC patients. The prognostic signature based on CDKN2A, CELSR3, DMRTA2, SERPINE1, TJP3, FADD, and IGF2BP2 expression was constructed. Univariate and multivariate regression analyses demonstrated that the novel prognostic signature was an independent prognostic factor of HNSCC. The nomogram integrating the prognostic signature and other independent prognostic factors was developed. Conclusion In summary, the prognostic signature of the HPV-related signatures might serve as an important prognostic biomarker for patients with HNSCC.


Introduction
Nearly 95% of head and neck cancer cases are caused by head and neck squamous cell carcinoma (HNSCC), considered the sixth most common form of cancer globally [1]. It has been reported that, despite advances in treatment modalities, outcomes of HNSCC patients have not been improved signifcantly [2]. Local invasion and distant metastasis at frst diagnosis are the main reasons for their worse prognosis [1]. Terefore, it is urgent to construct new prognostic models to aid early clinical diagnosis.
As far as etiology is concerned, HNSCC is primarily caused by tobacco and alcohol use [3]. Troughout the world, research studies on HPV have proliferated in recent years [4]. Globally, approximately 25% of the number of head and neck cancer cases are caused by HPV infection [4]. Also, HPV causes almost all cases of cervical cancer and a subset of other anogenital cancers [5].
In this study, we tried to identify HPV-related signatures and construct a novel prognostic model and nomogram by using sequencing data and clinically relevant information of patients with HNSCC. We expect to improve the diagnosis and prognosis of patients with head and neck cancer from the perspective of HPV. geo/) [6], which included 73 HPV-positive and 196 HPVnegative patients with HNSCC, and clinical information is shown in Supplementary Table 2. Te package "limma" [7] was used for diferential expression analysis with the screening criteria (adj.p < 0.05). Transcript expression profling of 501 HNSCC tumor samples and 44 adjacent normal tissues was obtained from TCGA [8], and clinical information is shown in Supplementary Table 3.

Enrichment Analysis.
To reveal the biological features of hub HPV-related signatures, Gene Ontology (GO) [9] and Kyoto Encyclopedia of Genes and Genomes (KEGG) [10] pathway analyses were conducted in this part. Tightly linked interaction network may facilitate the exploration of specifc mechanisms in HNSCC, and closely linked genes may lead to an accurate prognostic model. In parallel, we constructed a protein interaction network by using the STRING database (https://string-db.org/) [11] in order to visualize the hub signatures.

Construction and Verifcation of the Prognostic Signature.
Prognosis-associated signatures were picked out through the univariate Cox regression analysis with p < 0.01 in TCGA cohort. To avoid overftting, least absolute shrinkage and selection operator (LASSO) analysis was applied to further screen HPV-related signatures. We calculated the risk score for each HNSCC patient by using the following formula: Risk score � gene (A) expression × coef (A) + gene (B) expression × coef (B) + gene (i) expression × coef (i) [12]. Patients with HNSCC were divided into the high-and lowrisk groups according to the median risk score. Kaplan-Meier survival analysis was used to compare the overall survival rate between the high-and low-risk groups. Principal component analysis (PCA) was used to evaluate the clustering ability of the prognostic signature.

Immune Infltration Analysis.
In this study, we attempted to perform immune infltration analysis by using the CIBERSORT [13] and single-sample Gene Set Enrichment Analysis (ssGSEA) algorithm [14]. Moreover, we attempted to mine drugs targeting the high-risk group with poor prognosis by using the Genomics of Drug Sensitivity in Cancer (GDSC) database [15].

Cox Regression Analyses and Construction of a Nomogram.
Univariate and multivariate Cox regression analyses were used to evaluate independent prognostic factors in HNSCC. A nomogram integrating the independent prognostic factors was established by "rms" package [16]. Calibration curves and area under the curve (AUC) were used to verify the validity of the nomogram [17,18] we constructed.

Expression and Pathway Enrichment of HPV-Related
Signatures in HNSCC Patients. We derived 1136 HPVassociated signatures with signifcant diferences from the GSE65858 cohort (Figure 1(a), Supplemental Table 1). A total of 55 HPV-associated signatures were diferentially expressed in TCGA cohort according to screening criteria (Figures 1(b)-1(c)). In the light of GO and KEGG pathway analyses, HPV-associated signatures were signifcantly accumulated in focal adhesion, ECM-receptor interaction, AGE-RAGE signaling pathway in diabetic complications, protein digestion and absorption, and relaxin signaling pathway (Figures 1(d)-1(e)). Also, a protein interaction network was constructed with tightly linked HPV-associated signatures ( Figure 1(f )).

Construction and Validation of a Prognostic Model Based on HPV-Related Signatures in HNSCC Patients.
Ten prognosis-related signatures were identifed by univariate Cox regression analysis (Figure 2 Figure 1A. Te relationship between prognostic signature and clinicopathological factors is shown in Figures 4(a)-4(f ). Interestingly, we found that the risk score was highly expressed in smokers compared to nonsmokers (p � 0.00046, Figure 4(e)). In addition, HPVnegative patients had a signifcantly higher risk value than HPV-positive patients with HNSCC (Supplemental Figure 1B).

Te Prognostic Signature Was an Independent Prognostic
Factor for HNSCC Patients. Using both univariate and multivariate Cox regression analyses, age, stage, and the prognostic signature were found to be reliable independent predictors of HNSCC (Figures 6(a)-6(b)). Subsequently, a nomogram integrating independent prognostic factors with signifcant diferences was developed to predict the one-, three-, and fve-year OS of HNSCC patients (Figure 6(c)). In calibration curves, the predictive curves originated from the nomogram showed a high agreement with the ideal curve ( Figure 6(d)). AUC values of the nomogram were 0.665, 0.679, and 0.645 for one-, three-, and fve-year OS,

Discussion
Head and neck squamous cell carcinoma is a collection of epithelial tumors originating mainly from the mucosa of the oral cavity, oropharynx, larynx, or hypopharynx [19]. Despite continuous advances in treatment modalities, the prognosis of patients with head and neck cancer has not improved signifcantly over the past decades [20]. Te high recurrence rate of patients and the low response to intervention treatments such as chemoradiotherapy are the main reasons for the poor prognosis of patients [21,22]. In recent years, the popularization and rapid development of next-generation sequencing (NGS) technology have enabled us to have a new understanding and insight into the molecular landscape of diferent tumors [23]. For head and neck cancer, these advances provided new insights into the diferent molecular mechanisms of HNSCC, the tumor microenvironment (TME) and heterogeneity of HNSCC, and the diversity of clinical responses among HNSCC subtypes [23][24][25].
In this study, we attempted to construct a new prognostic model from the perspective of HPV by microarray and high-throughput sequencing data. Te fowchart of this study is shown in Figure 7. On the basis of the important impact of HPV on HNSCC, we identifed HPV-associated signatures by comparing the transcriptome data of HPVpositive and HPV-negative HNSCC patients from the GEO dataset. To achieve a larger sample size to justify the scientifc quality of this research, TCGA database was used for subsequent analyses. Ultimately, 55 HPV-associated signatures were diferentially expressed and were signifcantly accumulated in focal adhesion, ECM-receptor interaction, AGE-RAGE signaling pathway in diabetic complications, protein digestion and absorption, and relaxin signaling pathway. Subsequently, seven HPV-related signatures (CDKN2A, CELSR3, DMRTA2, SERPINE1, TJP3, FADD, and IGF2BP2) that enabled the classifcation of high-and low-risk HNSCC patients were screened out.
Among these prognostic HPV-associated signatures, we noted one of the interesting. Cyclin-dependent kinase inhibitor 2A, also known as p16, plays a critical role in cellcycle regulation [26]. In addition, infection with HPV leads to overexpression of p16, and thus p16 is often used as a common marker of HPV positivity [26,27]. Recently, multiple studies have surfaced that p16 expression in head and neck cancers was independent of HPV infection and should not serve as a reliable marker for HPV infection [28,29]. Consistent with previous studies, p16 functioned as a tumor suppressor in HNSCC patients of this study and as a risk score reducer in the prognostic model with a negative coef value.
Interestingly, in the present study, we found that smoking HNSCC patients had a higher risk value. Tese results indicated that smoking would afect HPV status as well as the expression of HPV-related signatures and may give clues to the mechanism of head and neck cancers. Trough prognostic analysis, we validated the prognostic model's validity and accuracy. In addition, immune infltration analysis showed a higher level of immune cell infltration in the low-risk group than that of the high-risk group, which may explain the worse prognosis in the highrisk group. To our knowledge, this study is the frst attempt *** ****** * * ******** *** * * *     Meanwhile, in our study, we encountered a few limitations. Te sample size and cohort size need to be expanded to guarantee their accuracy for this study. More patients with HNSCC and more prospective clinical trials need to be

Conclusion
We screened out diferentially expressed signatures between HPV + and HPV − HNSCC patients and developed a novel prognostic signature based on large sample datasets. Meanwhile, a novel nomogram was constructed by integrating independent prognostic factors.

Conflicts of Interest
Te authors declare that they have no conficts of interest.