Identification of Prognostic Aging-Related Genes Associated with Immune Cell Infiltration in Glioblastoma

,


Introduction
Glioblastoma multiforme (GBM) represents the most typical primary malignant intracranial tumor, particularly in the elderly [1]. Te standard frst-line treatment for GBM at this time involves the most extensive surgical resection along with radiotherapy and adjuvant chemotherapy [1][2][3]. Although considerable eforts have been made in the dealing of GBM in current years, the prognosis is still poor [4]. A previous study reported that the median survival time is only about one year, and about 5% of people survive for fve years overall [5]. Te patients' age has been measured as a major prognostic factor for clinical outcomes [6]. Recent statistics indicate that the percentage of elderly patients with GBM is up to 25%, which can be attributed to the gradual expansion of the digits in advanced aging patients [6]. However, the exact molecular pathogenesis of GBM in elderly patients has not yet been fully elucidated. As a result, more research on this disease is needed to predict therapeutic efcacy and guide clinical treatment decisions.
Notably, ageing is recognized as a major tumor risk factor, and thus ageing has become a focus in tumor research [7]. Many studies indicate that aging and aging-related diseases are mainly regulated by AGs which can suppress tumors through modulation of tumor cell senescence but may also enhance tumor enlargement, invasion, and metastasis [7]. However, the association between AGs and GBM prognosis has received little attention. Furthermore, there is no clear relationship between infammation and tumor immunity in GBM. To address the essential issue of genomic erosion, a sophisticated network of DNA damage response (DDR) systems has been developed.
Cell-cycle checkpoint pathways, damage tolerance mechanisms, and DNA repair mechanisms are a few of them [8].
Te HAGR is an important database for human agingrelated gene expression studies. Te value of AGs as prognostic factors for GBM patients was assessed in this research. Te aging-related gene expression profles were obtained from HAGR, while the GBM expression profles were derived from the CGGA database. Te main aim of this study was to elucidate the association between AGs and the prognosis of GBM by constructing a prognostic risk model. Meanwhile, the study also investigated the efects of AGs on GBM-related infammation and immunity.

Acquisition of Data.
All gene expression profles were obtained from three public databases: the HAGR (HAGR, https://genomics.senescence.info/genes/), the GEO S (GEO, https://www.ncbi.nlm.nih.gov/gds), and the (CGGA) database (CGGA, https://www.cgga.org.cn/). A total of 307 human AGs was downloaded from the HAGR, and the GEOquery package was used to access the expression profles of GSE4290 and GSE4412 from the GEO database. Te GSE4290 dataset contained 81 GBM samples, while the GSE4412 dataset contained 85 GBM samples and was used as an independent verifcation group. Moreover, the GBM appearance profle statistics were downloaded from the CGGA database. An entire set of 406 GBM samples with continuation information were acquired from the CGGA database and randomly allocated into two groups using a 7 : 3 ratio: the GBM training set (n � 284) and the GBM test set (n � 122). R software (version 3.6.3, https://www.r-project. org/) was utilized to analyze the data.

Analysis of Diferentially Expressed AGs (DEAGs).
Te R package limma was used to identify the DEAGs between the AGs and the GBM gene expression profles derived from GSE4290. |LogFC| > 1.5 and false fnding rate (FDR < 0.05) were set as the cut-of value. Finally, the DEAGs were visualized by a volcano plot using the ggpolt2 R package.

GO and KEGG Pathway
Analyses. Pathway enrichment analyses using the gene ontology (GO) and KEGG databases were conducted using the cluster Profler R package with a cut-of criterion of p value and FDR value <0.05 to investigate the gene function of the DEAGs. Biological processes (BP), cellular components (CC), and molecular functions make up the three categories that make up GO (MF).

Construction of a Prognostic Gene
Signature. To further screen DEAGs related to survival, univariate cox regression evaluation was used. Notably, the candidate prognostic genes were chosen using the 0.05 p value threshold. Next, in the CGGA training set, LASSO regression analysis was used. Te risk score was designed using the regression coefcient of each gene according to the following formula: (1) In the above formula, "n" indicates the number of the selected prognostic genes, "gene k " is the kth selected genes, "coefcient" represents the estimated regression coefcient of genes from the multivariate Cox regression analysis, and "Exp k " indicates the expression value of the kth selected genes. Te GBM training set retrieved from the CGGA database was then dichotomized into a high-risk and lowrisk groups according to the median risk score. A heatmap was used to show the relationship between candidate genes and risk scores, and Kaplan-Meier (KM) survival analysis and receiver operating characteristic (ROC) curve analysis were used to evaluate the risk score model's predictive power.

Gene Set Variation Analysis.
Te nonparametric, unsupervised technique for enriching gene sets is called gene set variation analysis (GSVA). Te CGGA dataset was subjected to GSVA using the GSVA R package to score the high-risk and low-risk groups in order to compare the signaling pathway activity between the two groups. In addition, gene-set enrichment analysis was used to pinpoint changes in gene expression at the pathway level in order to evaluate the biological utility of the risk model. Te Molecular Signatures Database v7.0 was used for running GSVA within the hallmark gene sets.

Evaluation of Immune Scores and Immune Cell
Infltration. Te ESTIMATE algorithm and the estimate R package were used to determine the immune and stromal scores for GBM samples. In addition, we imputed the composition of immune cell infltration in GBM through the CIBERSORT algorithm. It is worth noting that CIBERSORT provides a tool that is able to quantify the abundance of cell types in complex tissues using gene expression data [9].

Statistical
Analysis. R version 3.6.3 was utilized to conduct the statistical investigation. While survival statistics were conducted using the Kaplan-Meier curve and log-rank test, diferences in the distribution of the Chi-square test or Fisher's exact tests were used to compare categorical data. Te association between prognostic AGs and survival in GBM patients was also examined using univariate and multivariate Cox regression analysis. ROC curves were applied to validate the diagnostic value of the risk model, and the correlation between variables was determined using Spearman's rank correlation test. p < 0.05 was recognized to be statistically signifcant.

Analysis of DEAGs in GBM Samples.
In entire, 307 human AGs were downloaded from the HAGR, and the AGs were recognized using the gene expression profle of GSE4290. Te GSE4290 contained 29 DEAGs, of which 22 were upregulated and 7 were downregulated, according to the results. In order to see the DEAGs, a volcano plot was used (Figure 1(a)).

Functional Analysis of DEAGs.
Te biological functions and association of DEAGs in GSE4290 were explored using GO and KEGG pathway analysis. Figure 2(b) shows the top 30 KEGG enrichment terms. Functional analysis indicated that the DEAGs were enriched in cellular senescence, microRNAs in cancer, cell cycle, and other diverse tumorassociated pathways. Figure 2(a) shows the top 10 improved GO terms, including BP, CC, and MF. Notably, aging and cell aging were signifcantly developed in the GO BP terms. Tese results suggest that the DEAGs are intimately related with aging and tumor. Results represented in the forest plot showed that four DEAGs were signifcantly associated with the survival time, including C1QA, CDK1, EFEMP1, and IGFBP2 ( Figure 1(b)). Regarding that, LASSO regression was utilized to develop a prognostic risk technique for the four survivalassociated DEAGs. Te resulting formula was used to analyze the prognostic risk score: Based on their median risk scores, the patients in the CGGA training set were then categorised as high-risk or low-risk. Patients in the low-risk group had better overall survival (OS) than those in the high-risk group, according to the results of the survival analysis (p < 0.001, Figure 3(a)). Te AUC (area under the ROC curve) of the prognostic model was 0.747, 0.843, and 0.837 for the 1-, 3-, and 5-year OS, respectively, indicating a robust performance for survival prediction (Figure 3(b)). Figure 4(a) shows the risk plot for both high-and low-risk score groups, patient survival data, and gene expression information for the risk genes.

Verifcation of the Prognostic Risk Model in the Validation
Datasets. Te prognostic risk method was tested using CGGA test data to further validate its stability and reliability. Similarly, the GBM test set of the CGGA database was specifed into either high-risk (n � 61) or low-risk (n � 61) groups. Te K-M survival curve suggested that the overall survival (OS) of patients in the low-risk set was superior compared to those in the high-risk group (p < 0.001, Figure 3(c)). Te AUC for the GBM test set was 0.681, 0.785, and 0.738 for the 1-, 3-, and 5-year OS, respectively, indicating great performance for survival prediction (Figure 3(d)). Figure 4(b) shows the risk distribution, patient survival status, and gene expression data of the risk genes in the CGGA test. Furthermore, the stability and reliability of the prognostic risk model were validated using an independent dataset, the GSE4412 dataset retrieved from the GEO database. Te same risk model was applied, and the GBM test set obtained from the GEO was split into two groups: high-risk (n � 43) and low-risk (n � 42). Results showed signifcant diferences in the overall survival (OS) of patients between the low-risk group and the high-risk group (p < 0.001, Figure 3(e)). Moreover, the AUC was 0.725, 0.808, and 0.793 for the 1-, 3-, and 5-year OS, respectively ( Figure 3(f )). Te corresponding risk distribution, patient survival status, and gene expression data of the risk genes in the GSE4412 test set are displayed in Figure 4(c).

GSVA of Risk Score-Related Signaling Pathways.
GSVA was conducted to assess potential functional enrichment in the high-risk and low-risk groups in the CGGA dataset. Figure 5 shows the top 10 signaling pathways developed in the high-risk group, including angiogenesis, coagulation, epithelial-mesenchymal transition, hypoxia, IL-6/JAK/STAT3 signaling, provocative response, interferongamma response, and beta signaling. Most of them are common signaling pathways in the tumor immune microenvironment, metabolism, and progression.

Association between the Risk Score and Tumor Immunity.
Te immune and stromal scores in the CGGA and GSE4412 datasets, respectively, were computed using the ESTIMATE algorithm to clarify the association between the risk score and the immune/stromal score. Te low-risk group had better immune scores than the high-risk group, according to the fndings (Figures 6(a) and 7(a)). Moreover, Spearman's rank test results showed signifcant positive correlations among the risk score and immune score in CGGA and GSE4412 samples (Figures 6(b) and 7(b)). Meanwhile, the risk score also had a signifcantly positive association with the stromal score and ESTIMATE score in CGGA and GSE4412 samples (Figures 6(c), 6(d), 7(c), and 7(d)). Considering that immune cells include many diferent subtypes, CIBERSORT was used to deconvolute the composition fraction of immune cells in the CGGA dataset. In order to evaluate the relevance, the proportions of immune cells in the low-risk and high-risk groups were compared. According to the fndings, the low-risk group had a higher percentage of naive CD4 + T cells, regulatory T cells  (Tregs), gamma delta T cell monocytes, and activated mast cells than the high-risk group did (Figure 7(e)). Instead, fewer neutrophils, follicular helper T cells, M0 and M1 macrophages, and stimulated NK cells were present in the low-risk group compared to the high-risk group.

Discussion
Aging is a complex natural procedure, which involves agingrelated immune remodeling and dysfunction [10]. It is worth noting that the incidence of tumors increases signifcantly with age, which can be attributed to a decline in immune function [11]. To date, the technique of aging in GBM has not yet been fully illuminated, and there are no corresponding studies in such patients [7]. Terefore, studies should be conducted to determine the role of AGs in GBM and explore the association of AGs with the prognosis of GBM.
Tis study identifed the relationship between 29 DEAGs (Figures 1(a) and 1(b)) and GBM prognosis, and assembled a risk model with four DEAGs, including C1QA, CDK1, EFEMP1, and IGFBP2 to predict GBM prognosis. Following that, the model's prognostic value was determined using training and independent validation cohorts, with the results demonstrating a valid and robust performance for survival prediction. C1QA, CDK1, EFEMP1, and IGFBP2 were all signifcantly upregulated in high-risk score groups, which means that these patients have a worse prognosis.
Te C1QA gene, which encodes the C1q protein in macrophages, dendritic cells, and THP1 cells, has been implicated in the aging response and is involved in some neurodegenerative diseases [12]. Interestingly, increased gene expression of C1QA has been proven to cause a high infammatory state in the brain of people with psychosis [13]. In addition, a previous study concluded that increased C1QA expression may facilitate tumor progression and contribute towards an adverse outcome [8]. CDK1 participates in the regulation of the G2/M phase of the cell cycle [14]. Furthermore, CDK1 is frequently overexpressed in many human malignant tumor tissues, and it has been investigated as a PB for a variety of tumors. Over-expression of CDK1 in glioma and GBM cells contributes to glioma cell senescence escape and growth [15]. As a result, CDK1 has been proposed as a promising therapeutic target. Previous research has found that EFEMP1, also known as fbulin-3, is involved in ageing, age-related diseases, and tumor formation [16,17]. EFEMP1 knockout mice aged faster and lived shorter lives. However, previous research on the role of EFEMP1 in GBM has been inconsistent. On the one hand, some studies have shown that it has an antitumor efect by inhibiting glioma growth [18]. On the other hand, some studies found that over-expression of EFEMP1 may enhance glioma growth and contribute to resistance through the infuence of multiple oncogenic waving pathways, such as Notch, AKT, and EGFR waving pathways [19,20]. Results obtained in this study are consistent with the latter conclusion. However, this shows that more studies are essential to clarify the function of EFEMP1 in the pathogenesis of GBM. It has been informed that IGFBP-2 appearance is expressively increased after 50 years of age [21]. Moreover, several studies have indicated that there is an expressively positive correlation between IGFBP-2 concentrations and mortality in healthy elderly populations [22,23]. Overexpression of IGFBP-2 has also been found in GBM and many other types of human tumors [24][25][26]. Unfortunately, the high expression of IGFBP-2 was strongly associated with a signifcant shortening of survival, which is consistent with the results of this study [27,28]. Terefore, most of the existing studies propose using IGFBP-2 as a biomarker or potential novel target for GBM treatment [29,30].
For the frst time, a GBM prognostic risk model based on four AGs was developed in this study. Subsequent GSVA analysis disclosed that the risk genes' signaling pathways are elaborated in immunomodulatory and infammatory    Journal of Oncology   responses. Te results strongly suggested a relationship between the risk genes and the GBM immune microenvironment. Based on the above fndings, we further explored the relationship among the risk score and immune score and deconvoluted the conformation fraction of immune cells in the CGGA data set and GSE4412 data set. Te immune and  previous studies indicate that M1 macrophages can perform antitumorigenic functions, whereas M0 and M2 macrophages can perform protumorigenic functions [12,31,32]. Despite the fact that M1 macrophages have proinfammatory and antitumor efects, a previous study found that they were inversely related to survival in GBM patients [33]. Tis study has shown that the proportion of M0 and M1 macrophages was signifcantly higher in the high-risk group than in the low-risk group. Terefore, more comprehensive and in-depth studies should be conducted to elucidate the mechanism of action of macrophages in GBM. Te role of monocytes and mast cells in tumor development and progression has previously been established. Nonetheless, the interactions of monocytes and mast cells in the tumor microenvironment are complex and contradictory [34,35]. Tis study has shown that monocytes and activated mast cells were signifcantly lower in the high-risk group. NK cells are capable of directly killing tumor cells [36]. Although it has great cytotoxicity, the proportion of NK cells was low in the GBM immune microenvironment [37]. Interestingly, a previous study found that NK cell defciency in GBM improves prognosis, which is in line with our results [38]. With regard to T cells, CD4 + T cells seem to play a dual role in tumor immunity, while follicular helper T cells and gamma-delta T cells are relatively good prognostic signatures. Te results obtained in this study are in accordance with the above-mentioned conclusions, with the exception of follicular helper T cells [39][40][41][42]. Accumulating evidence suggests that regulatory T cells are involved in immunosuppression and are associated with tumor escape and tumor progression, which is unfavorable for the outcome [43,44]. Terefore, this discrepancy with our results should be explored more intensively. Overall, the occurrence and development of GBM involve a complex immune microenvironment [45], and thus more research is needed to explore the complex tumor immune relationships.
Tis study had some limitations as well. Although there was a correlation among the four AGs and the tumor immune microenvironment, further experimental verifcation is needed to assess the robustness of the prognostic risk model. Future studies should investigate how the four genes are elaborated in the regulation of tumor immunity.  In conclusion, a robust prognostic risk model was constructed and validated using four AGs, including C1QA, CDK1, EFEMP1, and IGFBP2, which had a close relationship with the immune microenvironment of GBM. Tis study ofers a new reference to further explore the pathogenesis and identify new and more efective GBM treatments.

Data Availability
Te data used to support the fndings of this study are available from the corresponding author upon reasonable request.