miRNA-Based Feature Classifier Is Associated with Tumor Mutational Burden in Head and Neck Squamous Cell Carcinoma

Tumor mutation burden (TMB) is considered to be an independent genetic biomarker that can predict the tumor patient's response to immune checkpoint inhibitors (ICIs). Meanwhile, microRNA (miRNA) plays a key role in regulating the anticancer immune response. However, the correlation between miRNA expression patterns and TMB is not elucidated in HNSCC. In the HNSCC cohort of the TCGA dataset, miRNAs that were differentially expressed in high TMB and low TMB samples were screened. The least absolute contraction and selection operator (LASSO) method is used to construct a miRNA-based feature classifier to predict the TMB level in the training set. The test set is used to verify the classifier. The correlation between the miRNA-based classifier index and the expression of three immune checkpoints (PD1, PDL1, and CTLA4) was explored. We further perform functional enrichment analysis on the miRNA contained in the miRNA-based feature classifier. Twenty-five differentially expressed miRNAs are used to build miRNA-based feature classifiers to predict TMB levels. The accuracy of the 25-miRNA-based signature classifier is 0.822 in the training set, 0.702 in the test set, and 0.774 in the total set. The miRNA-based feature classifier index showed a low correlation with PD1 and PDL1, but no correlation with CTLA4. The enrichment analysis of these 25 miRNAs shows that they are involved in many immune-related biological processes and cancer-related pathways. The miRNA expression patterns are related to tumor mutation burden, and miRNA-based feature classifiers can be used as biomarkers to predict TMB levels in HNSCC.


Introduction
Head and neck squamous cell carcinoma (HNSCC) remains the primary cause of cancer-related mortality in the world, which is characterized by advanced diagnosis [1], poor prognosis [2], low overall survival [3], and high recurrence [4]. HNSCC is regarded as the tenth most common cancer in the world and the seventh most common cause of cancer deaths. There are approximately 400,000 oral and pharyngeal diseases, 160,000 laryngeal cancers, and 300,000 deaths worldwide each year [5][6][7]. HNSCC is a common heterogeneous tumor that exists in the oral, pharyngeal, and larynx lesions [8]. Risk factors for oral cancer may include mutant oncogenes, the presence of extensive P53, and lower levels of tumor hypoxia, smoke, alcohol, age factor (>65 years), and HPV or EBV infection, but the mechanism remains to be explored in detail. Existing treatments are deficient for patients with locally advanced or distantly metastatic HNSCC. Therefore, looking for a new treatment method is currently urgent.
At present, tumor immunotherapy has been the main driving force for personalized precision medicine, and efforts are being made to use the immune system to treat advanced or metastatic cancers [9][10][11][12]. The main categories of immunotherapy include monoclonal antibody immune checkpoint inhibitors (ICIs), therapeutic antibodies, cancer vaccines, immune system regulators, cell therapy, and synthetic small molecule inhibitors. In recent years, the most worth mentioning is that HNSCC has made a breakthrough in immunotherapy, especially in the discovery of inhibitors of the immune pathway-related checkpoint, such as antiprogrammed death 1 (anti-PD1) and anti-CTLA4, which have important effects on the treatment of locally advanced metastatic HNSCC [13][14][15].
Tumor mutation load (TMB) is a new type of biomarker that predicts the effect of immunotherapy. Multiple studies have investigated the prognostic role of TMB in immunotherapy for multiple cancer types [16][17][18]. The translation of mutated genes into modified proteins requires posttranscriptional regulation, and microRNA (miRNA) is an important molecule involved in posttranscriptional regulation. Meanwhile, the abnormal expression of miRNAs was involved in the occurrence of cancer, and it has attracted people's attention. Previous studies have shown that miRNA may act as a prognostic indicator in many types of cancers [19,20]. Recently, people are paying more and more attention to the role of miRNA in regulating the anticancer immune response [21,22]. It has been clarified that miRNA is involved in mediating and controlling various immune and cancer cell interactions. Therefore, we hypothesized that miRNA expression patterns can be used as biomarkers to predict TMB levels. To verify our hypothesis, we downloaded the data of the head and neck squamous cell carcinoma (HNSCC) dataset from TCGA, including mutation annotation files and miRNA expression profiles, and constructed miRNA-based feature classifiers to predict TMB levels.

Materials and Methods
2.1. Data Acquisition and Processing. First, the mRNA and miRNA profiling was downloaded from the TCGA database by the GDC tool (https://portal.gdc.cancer.gov/ ), including 502 HNSCC patients and 44 adjacent-tumor samples. Then, we excluded the adjacent-tumor samples and retained the tumor samples. We also obtained the somatic mutation data of all patients in the "Masked Somatic Mutation" category of TCGA processed by VarScan software. Besides, clinical information of each patient, including age, gender, TNM stage, tumor grade, and survival time, is obtained through the GDC portal.
In the end, a total of 501 samples with both miRNA expression and mutation profiling were considered eligible in this analysis. These samples were randomly assigned to the training (60%) and test (40%) sets. The flowchart of this study is exhibited in Figure 1.

Assessment of TMB for Each Patient and Prognostic
Analysis. Tumor mutational burden was defined as the number of somatic variants per megabase (MB) of the genome [23]. We employed the Kaplan-Meier analysis with the logrank test to screen optimal TMB value. Then, the patients were divided into the high and low TMB groups according to the optimal cut-off point of the TMB value. Meanwhile, the TMB levels from the TCGA cohort were merged with corresponding survival data of each sample via merge function in R. Also, we further assess the relationship between the TMB groups with several clinical variables. We used the Wilcoxon rank-sum test to compare the clinical characteristics of the two groups, while the Kruskal-Wallis test is used for comparison between three or more groups.

Mutation Analysis.
The mutation data in the MAF of HNSCC patients were used in the TCGA dataset. The R package "maftools" was used to display the mutation profile of each group [24]. The maftools was also used to impute the mutation rate of each gene and also identified significant mutant genes in different subtypes (p < 0:05).

Identification of Differentially Expressed miRNAs.
The differentially expressed miRNAs between high TMB samples and low TMB samples were conducted using the "limma" package in R [25]. The differentially expressed miRNA of datasets with |log2 fold change | ≥1:0 and a p value less than  Figure 1: Flowchart delineates the study design and analysis process.
2 BioMed Research International 0.01 was considered the selection criteria for subsequent analysis. The heatmap plot was also exhibited.

Principal Component Analysis (PCA) and Least Absolute
Shrinkage and Selection Operator (LASSO) Analysis. The expression values of differentially expressed miRNAs for each HNSCC sample were extracted in the training set. The LASSO logistic regression model was conducted using the "glmnet" package in R, which has a powerful predictive value and a low correlation between each other to prevent overfitting and applied to select optimal features for the highdimensional data [26]. Before using the expression profiles of all the differently expressed miRNAs for feature selection, we first performed PCA to examine the distribution of samples. PCA was then performed using the expression profile of the best differentially expressed miRNA. The PCA plot was drawn across the first two principal components.
2.6. miRNA-Based Feature Classifier for Predicting TMB Level. We used the LASSO method to build a prediction model, selected nonzero regression coefficients to identify the optimal miRNA set, and used the selected miRNA to build a miRNA-based feature classifier to predict TMB levels. The regression coefficients from the LASSO analysis were used to create a classification index for each sample, and the following formula was used to weigh the expression value of the selected miRNA: The Prognostic Index = ðExp miRNA1 * Coef 1Þ + ðExp miRNA2 * Coef 2Þ + ðExp miRNA3 * Coef 3Þ + ⋯+ðExp miRNAn * CoefnÞ. The test set is used to verify the robustness and portability of the classifier. The prognostic performance of the classifier was evaluated using receiver operating characteristic (ROC) curves by comparing the sensitivity and specificity of the survival prediction. The ROC curves were plotted using the "pROC" package in R [27].

Association between the miRNA-Based Signature
Classifier Model and the Expression of Three Immune Checkpoints. The correlation between the miRNA-based signature classifier index and the expression of three immune checkpoints (PD1, PDL1, and CTLA4) was analyzed using the Spearman rank correlation analysis. A p value of less than 0.05 was considered statistically significant.

Identification of Target Genes and Functional Enrichment
Analysis. starBase [28], TargetScan [29], and miRDB [30] were used to check whether these three immune checkpoints are target genes of any of these miRNAs. Besides, DIANA mirPath web server was used to perform the KEGG pathway and Gene Ontology (GO) enrichment analyses for these miRNAs [31]. The "ggplot2" package was used to visualize the enrichment results. The enrichment analysis was based on the threshold of p value < 0.05. 3 BioMed Research International 2.9. Correlation between miRNA-Based Signature Classifier and Pathways. Firstly, the gene set of the interested pathway was downloaded from the KEGG (Kyoto Encyclopedia of Genes and Genomes) dataset (http://www.genome.jp/kegg/ ). Secondly, single-sample gene set enrichment analysis (ssGSEA) was used to calculate the enrichment scores of the pathway. Thirdly, the relationship between miRNAbased signature classifier and pathways using the Spearman rank correlation analysis was studied.

Comparisons of Clinical Feature between High and Low
TMB Groups. We conducted the Kaplan-Meier analysis to screen the optimal cut-off point (TMB: 3.236842105); the survival plot is shown in Figures 2(a) and 2(b). Based on the optimal cut-off point, the HNSCC patients were divided into two groups: the high TMB and low TMB groups. We compared the clinical difference between the high and low

Comparisons of Gene Mutation between High and Low
TMB Groups. This study examined the association between the high and low TMB groups and the count of somatic mutation. Gene mutation profiles of these highly mutated genes are shown in Figures 4(a) and 4(b). In the high TMB group, TP53, TTN, and MUC16 were the most mutated genes, while TP53, TTN, and FAT1 were the most mutated genes in the low TMB group. MUC16 showed a higher   BioMed Research International mutation rate in the high TMB group, and FAT1 exhibited a higher mutation rate in the low TMB group with the cut-off point less than 0.05 (Figure 4(c)).

The Screen of Differentially Expressed miRNAs.
In both the train and test groups, there are no statistically significant disparities in basic clinical features in Table 1. The 206 samples with a high TMB level and 95 samples with a low TMB level were included in the training group. We conducted differentially expressed analysis in the training group that a total of 65 differentially expressed miRNAs, including 55 upregulated miRNAs and 10 downregulated miRNAs, were identified with the threshold of cut-off point (p < 0:01 and |log 2FC | >1:0). The heatmap plot of differentially expressed miRNAs is exhibited in Figure 5.
According to starBase, TargetScan, and miRDB, only PDL1 is targeted by hsa-miR-7-5p, while both PD1 and CTLA4 are not targeted gene of these 25 miRNAs.
3.6. GO and KEGG Analyses. The enrichment analysis was conducted to describe the biological function of the target of 25 miRNAs. It revealed enrichment of 301 Gene Ontology categories. The enrichment biological process is shown in Figure 9(a). Several immune-related biological processes were observed, including epidermal growth factor receptor signaling pathway, Toll-like receptor signaling pathway, leukocyte migration, and immune system process. A total of 58 KEGG pathways were enriched by the target genes (Figure 9(b)). Among these KEGG pathways, several cancer-related pathways were identified, including the Wnt signaling pathway, PI3K-AKT signaling pathway, P53 signaling pathway, and TGF-beta signaling pathway.

Association between miRNA-Based Classifier Index and
Pathways. Single-sample gene set enrichment analysis (ssGSEA) was employed to impute the enrichment scores of interested pathways. The heatmap is exhibited in Figure 10(a). Meanwhile, we found that the miRNA-based classifier index was negatively correlated with the Wnt signaling pathway, PI3K-AKT signaling pathway, and TGFbeta signaling pathway Figure 10(b).

Discussion
Satisfactory results had been achieved in the immunotherapy for the treatment of aggressive or advanced cancers [32][33][34]. The progress of the tumor depends on the evasion of immune surveillance of cancer cells and the acquisition of effective immune response traits. HNSCC is considered to be an immunosuppressive disease. Its absolute lymphocyte count is lower than those found in a healthy population; it defects human leukocyte antigen (HLA), impairs the natural killer (NK) cell activity, and has poor antigen presentation function [35][36][37]. Impairment to tumor-infiltrating T lymphocytes has also been reported in HNSCC and other cancers, which has a great impact on the clinical outcome [38,39]. Besides, inhibitory regulatory T cells (Tregs) are also associated with the tumor progression [40,41]; for example, Treg secretes inhibitory cytokines, such as TGF-β and IL-10, and expresses cytotoxic T lymphocyte-associated protein 4 (CTLA4). Therefore, immunomodulation therapy to  Although immunotherapy has shown satisfactory results for tumor therapy, only a small percentage of patients can benefit from it. Therefore, many studies have been designed to find predictive biomarkers of the immune response. At present, tumor mutational burden (TMB) is promising as another effective predictive biomarker for treatment with ICIs and independent of PDL1 expression [42,43]. Multiple studies have shown that TMB can be effectively measured from liquid biopsy/blood, which may be an alternative method for biopsy [44]. However, TMB assessment by liquid biopsy must confront the problem that circulating DNA derived from tumor cells is usually only a small part of the DNA of noncirculating cells, and it does not reflect the true situation of tumor TMB [45].
Because of the key role of miRNAs in tumor-related immune responses, several studies reported that miRNA was involved in the development of HNSCC [46]. However, the association of miRNA expression patterns and TMB was not previously described in HNSCC. In the present work, the differentially expressed miRNAs between high TMB level and low TMB level samples are identified, and their expression patterns can distinguish high TMB level and low-level TMB samples. Using the LASSO model, a miRNA-based feature classifier was established in the training set and then verified in an independent test set. The accuracy of the 25-miRNA-based classifier is 0.850 in the training set, 0.810 in the test set, and 0.840 in the overall set. The miRNA-based feature classifier can predict TMB levels well.
In HNSCC, the interaction between programmed death receptor 1 (PD1) and its ligands PDL1 and PDL2 has been shown to downregulate T cell activation in human models [47]. Besides, PDL1 is often found overexpressed in >50-60% of HNSCC patients [48]. Targeted immune checkpoint receptors, including anti-CTLA4 and antiprogrammed death 1 (anti-PD1), provide further hope for patients to benefit from immune modulation. In the study, we found that the miRNA-based feature classifier is highly correlated with SNCA (PD1), CD274 (PDL1), and CTLA4. It has been widely speculated that high TMB levels leading to an increase in tumor neoantigens may trigger the immune system to   attack the tumor. This study shows that multiple cancerrelated miRNAs are differentially expressed among tumor samples with different TMB levels. The enrichment analysis for the miRNA-signature classifier suggested that the 25 miRNAs are involved in immune-related biological processes and cancer-related KEGG pathways, such as the Toll-like receptor signaling pathway, leukocyte migration, immune system process, Wnt signaling pathway, P53 signaling pathway, and TGF-beta signaling pathway.
In conclusion, HNSCC patients with different TMB levels have different miRNA expression patterns. A miRNA-based signature classifier was constructed and may be served as a biomarker to predict TMB levels in HNSCC. As our study results were derived from bioinformatic analysis, further clinical studies are needed to confirm these results.

Abbreviations
HNSCC: Head and neck squamous cell carcinoma TMB: Tumor mutational burden ICI: Immune checkpoint inhibitors LASSO: The least absolute contraction and selection operator.

Data Availability
Data availability could be obtained from the TCGA website.

Conflicts of Interest
The authors declare that they have no competing interests.