Identification of Prognostic Biomarkers of Glioblastoma Based on Multidatabase Integration and Its Correlation with Immune-Infiltration Cells

Background Glioblastoma (GBM) is the most malignant of all known intracranial tumors; meanwhile, most patients have a poor prognosis. In order to improve the poor prognosis of GBM patients as much as possible, it is specifically significant to identify biomarkers related to the gene diagnosis and gene therapy. Methods In this study, a total of 343 GBM specimens and 259 nontumor specimens were collected from four Gene Expression Omnibus (GEO) datasets and The Cancer Genome Atlas (TCGA) database; then, we analyzed the differentially expressed genes (DEGs) from the above data. Through Venn diagram analysis, 54 common upregulated DEGs and 22 common downregulated DEGs were triumphantly recognized. Results On the basis of the degree of formation communication in protein-protein interaction network (PPIN), the 10 upregulated central genes were ranked, incorporating LOX, IGFBP3, CD44, TIMP1, FN1, VEGFA, POSTN, COL1A1, COL1A2, and COL3A1. By combining the expression levels and the clinical features of GBM, we found that four hub genes (TIMP1, FN1, POSTN, and LOX) were significantly upregulated and related to poor prognosis of GBM. Meanwhile, univariate and multivariate Cox regression analysis suggested that TIMP1 could be one of the independent prognostic factors for GBM patients. Furthermore, TIMP1 was particularly correlated with the immune marker of macrophage M1, macrophage M2, neutrophils, tumor-associated macrophage, and Tregs. We then analyzed the role of TIMP1 in GBM cancer cell lines by relevant experiments, which indicated that TIMP1 knockdown resulted in the decreased cell proliferation, migration, and invasion. Conclusions Taken together, these findings demonstrated that TIMP1 might be a new biomarker to determine prognosis and immune infiltration of GBM patients.


Introduction
Glioma, a malignant intracranial tumor, is caused by abnormal proliferation of glial cells [1]. Glioblastoma or glioblastoma multiforme is classified as WHOIV [2]. GBM is the most common primary intracranial tumor, accounting for 54% of all gliomas. Malignant glioblastoma is characterized by rapid proliferation, rapid infiltration, and late diagnosis [3]. Although many studies have clarified the potential molecular mechanism of glioma, there is no comprehensive and effective treatment regimen and the long-term survival rate of glioma patients cannot be effectively improved [4]. In order to improve the prognosis of patients with GBM, the objective of this research is to find novel diagnostic and prognostic biomarkers and possible therapeutic targets of GBM, which is expected to guide clinical treatment and provide new ideas for improving the outcome of GBM patients.
Bioinformatics is a subject developed rapidly in recent years which is relied on big data processing, and it can extract valuable information for clinical research from massive data generated by multisource experiments [5]. Chen et al. performed an integrated bioinformatics analysis of the pancreatic cancer dataset in TCGA and established a risk score model consisting of DEGs, which provided a more effective way for forecasting survival situation than AJCC stage [6]. By using a variety of bioinformatics tools, Wang et al. found that the increased PITX1 expression is correlated with worse relapse-free survival (RFS), disease-specific survival (DSS), and overall survival (OS) in breast cancer patients. erefore, it may provide a basis for oriented drug treatment related to breast cancer [7]. Zhou et al. used software R to normalize the data of native GBM samples and nontumorous samples in the databases. e discriminated DEGs were used in grasping of the molecular mechanisms of GBM, and those DEGs hold promise as prognostic biomarkers for future transactions of GBM sufferers [8]. Although bioinformatics analysis has been widely used in solid tumors, there is little research on bioinformatics analysis of GBM. erefore, in-depth studies on the composition, functional changes, and immune effects of GBM mutated genes are of positively consequence for forecasting the prognosis of GBM patients and for the exploitation of potential targeted treatment. e tumor microenvironment (TME) is the cell environment that tumor cells depend on for survival and growth [9]. TME consists of peripheral blood vessels, extracellular matrix (ECM), nontumor cells, and a variety of biological factors that are influenced by malignant and untransformed cells within the tumor [10]. A number of cell types have been identified in TME using different cell-specific markers, including many different types of cells, such as stromal cells, fibroblasts, and various immune cells. e growth and survival of malignant tumor cells are promoted by growth factors secreted by nontumor cells in TME. At the same time, these growth factors also act as attractors to stimulate a variety of cells to migrate to TME [11]. ECM is another key component of TME in addition to special types of cells [12]. ECM not only participates in the supporting framework of TME cells but also is an abundant pivotal growth factor [13]. ECM plays a significant role in the genesis and development of tumors. In the late stage of tumor progression, ECM commonly gets out of control and becomes disorganized. Unnatural ECM can also lead to disordered behavior of TME cells and promote neovascularization and inflammatory response [14]. TME is distinguished from other normal tissues by these unique properties, and there is new evidence that microenvironmentmediated external stimulation plays a pivotal role in the survival of tumor cells [15]. Disrupting the tumor microenvironment that surrounds and infiltrates the tumor may provide a proven treatment modality for malignancy [16]. erefore, we conducted bioinformatics studies using the Gene Expression Profile Interaction Analysis 2 (GEPIA2) database to analyze the prognostic significance of hub genes. e correlation between core genes expression and GBM immune cell infiltration was detected using Tumor Immune Estimation Resource (TIMER) database. In order to identify potential biomarkers of glioblastoma, knocking down TIMP1 in U-87 MG and U-118 MG cell lines can inhibit cell proliferation, migration, and invasion, suggesting that TIMP1 plays a carcinogenic role in glioblastoma.

GEO Gene Expression
Data. Keywords such as "Glioblastoma," "Homo sapiens," and "Array expression profile" were searched in GEO database and TCGA database, and GBM mRNA expression datasets retrieved by the above keywords were screened. After a systematic review, four GSE profiles (GSE4290, GSE7696, GSE13276, and GSE29796) and one TCGA profile (glioblastoma multiforme) were selected and downloaded. e details are shown in Table 1. All data are freely available, and no human or animal experiments were involved in this study.

DEGs
Filtering. GEO data obtained DEGs between GBM and nontumorous brain tissue by using the GEO2R tool to standardize and convert original data from previously selected datasets into log2 format, setting screening criteria for log-fold change (FC)≥2 and P value < 0.05 in each file. en, the volcano diagram of DEGs was revealed by visual hierarchical clustering analysis, and the overlapping genes were inspected in Venn diagram.

Functional Annotation and Signaling Pathway Analysis of the Hub Genes.
To reveal the functions of DEGs, an online platform for data analysis and visualization (http://www. bioinformatics.com.cn) was used to dispose the GO function and KEGG paths of common upregulated and downregulated DEGs. GO term and KEGG pathways with P < 0.05 were considered statistically significant; meanwhile, Adj. P < 0.05 was regarded as statistically significant.

PPI Network Analysis.
We used the Search Tool for the Retrieval of Interacting Genes (STRING) [17] to process and analyze the obtained DEGs. e filtrated 54 common upregulated DEGs had beforehand been handed over to the STRING. High-degree nodes are considered to be the most significant in this network. In this experiment, we selected the 10 genes with the highest connectivity as the central genes after the statistics of Cytoscape (v3.7.1) plugin cytoHubba.

Detection of Hub Genes in GBM.
e GEPIA2 database [18] was used to visualize the mRNA expression of screened out genes in GBM and nonmalignant brain specimens. Human Protein Atlas (HPA) was used to determine the protein expression levels of 10 HUB genes in nonmalignant brain specimens and GBM organizations [19].

Extract RNAseq Data and Establish Survival and Risk
Curves. Using the UCSC XENA (https://xenabrowser.net/ datapages/) [20] after the unified processing of RNAseq data of TCGA and GTEX, the GBM data of TCGA and corresponding nonmalignant specimen data in GTEX were extracted. After log2 transformation of RNAseq data in TPM format, expression comparison among samples was conducted and the overall capability was assessed using receiveroperating characteristic (ROC) curve and area under ROC curve (AUC). For each hub gene, we used the pROC package to institute the ROC curve, calculated the AUC value, and visualized the ROC curve with the ggplot2 package. e closer the AUC value is to 1, the better the overall performance.

Survival Analyses and Univariate and Multivariate Cox Analysis for Hub Genes.
e relationship between overall survival (OS), disease-free survival (DFS), and core genes expressed in GBM sufferers was assessed by GEPIA2. Logrank test results with P < 0.01 were statistically significant. e hub genes with independent prognostic factors were further screened by univariate and multivariate Cox regression analysis.

Western
Blot. Cells were lysed by using RIPA buffer (Epizyme, China) containing protease inhibitor (AbMole, USA). e total proteins were separated by 10% SDS-PAGE and subsequently transferred to PVDF membranes (Millipore, USA). After blocking with 5% fat-free milk, incubate with primary antibodies against TIMP1 (A00561-1; Boster, China) and β-actin (Beyotime Biotechnology, China), respectively, using a dilution of 1 : 1000 for each antibody overnight at 4°C. After washing with TBST, secondary antibodies were introduced into the membrane for 1 h at room temperature. e interested proteins were visualized using the enhanced chemiluminescence (ECL, Epizyme), and the protein band intensity was analysed using ImageJ software.

CCK-8 Assay and Colony Formation Assay.
For CCK-8 assay, U-87 MG and U-118 MG cells transfected with sh-TIMP1 or NC were resuspended into single-cell suspension with medium containing 10% FBS, and 200 µL of cell suspension was added to a 96-well plate. CCK-8 solution was added to the plate after 1, 3, 5, or 7 days of incubation and incubated for another 4 hours. Absorbance at 450 nm was measured and recorded to construct cell growth curve. For colony formation assay, U-87 MG and U-118 MG cells transfected with sh-TIMP1 or NC were plated on six-well plate with 500 cells per plate. After two weeks of incubation, cells were fixed with methanol followed by staining with 0.1% crystal violet solution.

Cell Migration and Invasion Assays.
e ability of cell migration and cell invasion was tested by transwell experiments. U-87 MG and U-118 MG cells transfected with sh-TIMP1 or NC were plated in transwell chambers (Corning, 8 μm) without or with Matrigel-coated and cultured overnight. e cells that had moved across the membrane were fixed with methanol followed by staining with 0.1% crystal violet solution. Cells that migrated into the lower chamber were counted.

Identified DEGs.
e DEGs in the GSE4290, GSE7696, GSE13276, GSE29796, and TCGA datasets were identified by GEO2R. Add up to 1756 DEGs were recognized in the GSE4290, add up to 938 DEGs were identified in the GSE7696, add up to 546 DEGs were recognized in the GSE13276, add up to 3873 DEGs were recognized in the GSE29796, and add up to 2283 DEGs were recognized in TCGA (GBM).
rough visual hierarchical clustering analysis, the volcano map of these DEGs is clearly shown (Figure 1(a)), where the scarlet and blue dots, respectively, represent upregulated and downregulated genes. en, the Venn plot of the coexpression genes containing overlapping DEGs in the GSE4290, GSE7696, GSE13276, GSE29796, and TCGA datasets was constructed (Figures 1(b) and 1(c)).
ese 54 upregulated DEGs and 22 downregulated DEGs are plotted in Table 2. A total of 54 overlapping upregulated genes were acquired as core genes for further analysis.

GO and KEGG Enrichment Analyses.
We conducted GO and KEGG enrichment analyses of 54 upregulated DEGs and 22 downregulated DEGs to systematically understand the biological roles of these DEGs. Figure 2, respectively, lists the top enriched GO terms, and Figure 3(a) lists the top 30 KEGG pathways.
GO BP displayed that 54 upregulated DEGs were obviously enriched in the lectin pathway of extracellular matrix organization, extracellular structure organization, cell-substrate adhesion, and urogenital system development. GO CC analysis showed that the most enriched terms were collagen-containing extracellular matrix, basement membrane, endoplasmic reticulum lumen, and complex of collagen trimers. e top four significantly enriched MF terms included extracellular matrix structural constituent, extracellular matrix structural constituent conferring tensile strength, platelet-derived growth factor binding, and collagen binding (Figure 2(a)).
GO BP displayed that 22 downregulated DEGs were obviously enriched in the protein localization to axon, regulation of SNARE complex assembly, and response to magnesium ion and SNARE complex assembly. GO CC analysis showed that the top obviously enriched terms were main axon, juxtaparanode region of axon, voltage-gated potassium channel complex, and potassium channel complex. e top four significantly enriched MF terms included structural constituent of myelin sheath, magnesium ion binding, nucleoside monophosphate kinase activity, and phosphotransferase activity (Figure 2(b)).
In addition, the four obvious enrichment pathways of these 54 upregulated DEGs which attracted our attention were ECM-receptor interaction, focal adhesion, pathways in cancer, and the p53 signaling pathway (Figure 3(a)). e pathway diagram shows ECM-receptor interaction is one of the top enrichment pathways (Figure 3(b)).

Module Screening from the PPI Network.
e PPI pairs in the 54 upregulated DEGs were determined by using the STRING (Figure 4(a)). Cytoscape identified the top 10 central genes for connectivity of the degree score (Table 3). Cytoscape also identified the most tightly connected modules ( Figure 4(b)). KEGG analysis revealed that the 10 hub genes were significantly enriched in the pathways of ECMreceptor interaction, focal adhesion, amoebiasis, protein digestion and absorption, bladder cancer, pathways in cancer, mTOR signaling pathway, shigellosis, and the p53 signaling pathway (Figure 4(c)).

Validation of mRNA Expression in GBM.
e results of GEPIA database displayed that the mRNA expression levels of 10 genes in GBM tissues were expressively higher than those in nonmalignant cerebral cortex specimens (P < 0.01) ( Figure 5(a)). Notably, the protein levels of CD44, COL1A1, COL1A2, COL3A1, FN1, POSTN, TIMP1, and VEGFA were not voiced in normal cerebral cortex tissues, nevertheless, these genes moderately and highly expressed in GBM tissues ( Figure 5(b)). Taken together, the consequences of this study manifested that the hub genes were overexpressed at both transcriptional and translational expression levels in GBM patients.

Changes in Hub Gene Frequency and Prognostic Value in GBM.
e CBioPortal database was used to assess the frequency of genetic changes in these selected central genes from GBM. Approximately 43.45% of GBM clinical patients displayed vital changes in these central genes ( Figure 6 (Figure 7). From the above results, it can be concluded that these variables have high accuracy in predicting the outcomes of nontumor patients and GBM patients.

Survival Analysis and Cox Regression Analysis in GBM.
OS and DFS analyses of the central genes were further conducted by using GEPIA database. As shown in Figure 8(a), high expressions of TIMP1, FN1, LOX, and POSTN in GBM patients were significantly correlated with worse OS. Adverse DFS were also significantly scanned in GBM patients with elevated TIMP1, FN1, LOX, and POSTN expression levels (Figure 8(b)). e results of log-rank P test showed that only the OS and DFS of TIMP1, FN1, LOX, and POSTN had statistical significance simultaneously.    Figure 8(c).
To further explore the connection with GBM and the level of immune cell infiltration, we used TIMER2.0 to evaluate the connection with TIMP1 expression of various immune cells and immune marker genes in GBM. e results revealed that the expression level of TIMP1 was remarkably positively associated with most immune cell markers in GBM. Moreover, the result displayed that TAM marker (CCL2), macrophage M1 marker (PTGS2), macrophage M2 markers (CD163, VSIG4, and MS4A4A), neutrophil marker (ITGAM), and Treg marker (TGFB1) were observably positively associated with the expression level of TIMP1 in GBM (Table 4), which suggested that TIMP1 plays a vital role in the glioma immune microenvironment. Finally, higher Treg infiltration was related with worse prognosis for TIMP1 in GBM (Figure 9(b); HR � 2.05, P � 0.0395). Equally, higher cancer-associated fibroblast and mast cell infiltration was also related to poor outcome in GBM (Figures 9(c)

Discussion
GBM is the most invasive and worst prognosis of high-grade gliomas (WHOIII and WHOIV gliomas). Although surgical resection, radiotherapy, and chemotherapy have achieved tremendous advance in modern medicine, the median survival time of patients after initial diagnosis is only 12-15 months. Difficult radical resection of gliomas and resistance to radiotherapy and chemotherapy are the main causes of recurrence and treatment failure [2,3]. Early diagnoses will obviously enhance the clinical prognosis of GBM. Bioinformatics analysis can quickly and accurately identify biomarkers related to GBM development, which has important research value in the study of diagnostic markers and prognostic genes. Bioinformatics is a subject that has developed rapidly in recent years by relying on big data processing, and it can extract valuable information for clinical research from massive data generated by multisource experiments [5,22]. In this study, 5 gene expression profiles of GBM were retrieved from GEO and TCGA databases, including 343 glioblastoma specimens and 259 normal specimens. 54 common upregulated DEGs expression profiles and 22 common downregulated DEGs expression profiles were favorably identified, severally. e 10 central genes were selected according to the connectivity of PPIN, incorporating LOX, IGFBP3, CD44, TIMP1, FN1, VEGFA, POSTN, COL1A1, COL1A2, and COL3A1. Subsequently, we performed mRNA expression verification, gene change frequency, survival analysis, COX analysis, and tumor immune infiltration analysis of these 10 hub genes. Finally, four hub genes had been screened out including TIMP1, FN1, POSTN, and LOX. ese identified central genes may play a crucial role in GBM.
FN1 encodes fibronectin, a glycoprotein that exists in the form of dimers or polymers on the cell surface and in the extracellular matrix [23]. POSTN encodes a protein that plays a significant role in tissue development and regeneration. ese proteins are secreted to the extracellular matrix for tissue repair, such as granulation hyperplasia, mucosal repair, and ventricular remodeling [24]. LOX is a proteincoding gene that mainly encodes the lysyl-oxidase family and leads to a variety of transcriptional variants; one of the precoding proteins was hydrolyzed to produce a regulatory propeptide and a ripening enzyme [25]. TIMP1, also known as fibroblast collagenase inhibitor, is a protein-coding gene [26].
is gene was primarily known as an endogenous inhibitor of matrix metalloproteinases (MMPs) [27]. Among its related pathways are HIF-1 signaling pathway, IL6-mediated signaling events, and ECM organization. is gene family encodes proteins that are inhibitors of MMPs involved in extracellular matrix degradation [28]. With the exception of its suppressive role against most of the known MMPs, as a cytokine and a key regulator of ECM degradation, TIMP1 has a variety of functions related to tumor microenvironment and progression [29], tumor cell proliferation [30], and antiapoptotic activity in cancer [31][32][33]. Intriguingly, MMPs have been found to be synthesized mainly by adjacent and intervening stromal cells, similar to TIMP1 which is secreted in the tumor microenvironment [34,35]. Similarly, we discovered that TIMP1 was negatively in connection with tumor purity through the TIMER2.0 database, which suggested that TIMP1 was mainly derived from stromal cells rather than GBM cells. A large number of research studies suggested that TIMP1 is often highly expressed in several types of human cancer cells, containing prostate cancer [36], lung cancer [37], melanoma [38], breast cancer [35], and glioblastoma [39]. Based on the above conclusions, we suggested that FN1, POSTN, LOX, and TIMP1 can be used for cancer diagnosis and as prognostic indicators. Insulin-like growth factor binding protein 3 14 10 Journal of Oncology   Infiltration of immune cells into tumors is an essential factor affecting tumor occurrence and progression [40,41]. For instance, Chen et al. proposed that the expression of CXCL10 is related to the infiltration of miscellaneous immune cells. Tumor-associated macrophages express CXCL10 in pancreatic cancer, and its acceptor CXCR3 is extremely expressed in T cells. eir study showed that the expression of CXCL10 was actively in connection with multiple immune core proteins and suggested that CXCL10 can be used as one of the available targets of immunotherapy [6]. erefore, we conducted immune correlation analysis of the four core genes (FN1, POSTN, LOX, and TIMP1). Analysis of the results about correlation between immune infiltrates and TIMP1 expression showed that TIMP1 involved in tumor immunity. e high expression of TIMP1 combined with the high proportion of Tregs, cancer-associated fibroblast, and MAST cell suggested a poor prognosis in GBM patients. On the other hand, we also detected that TIMP1 was particularly connected with the immune marker of tumor-associated macrophage, macrophage M1, macrophage M2, neutrophils, and Tregs. Hence, the above results confirmed that the value of TIMP1 as a gene diagnostic for GBM and the prognostic value of GBM patients with high TIMP1 expression are cheek by jowl connected with the immune microenvironment of GBM.
By constructing TIMP1 knockdown glioblastoma cell line, we further analyzed the role of TIMP1 in glioblastoma and found that TIMP1 promoted cell proliferation. Knocking down TIMP1 by lentivirus transfection inhibited cell migration and invasion. is may be one of the potential mechanisms of TIMP1 promoting cell proliferation. e preprint of this paper was submitted in the early stage [42], and we added experimental verification later. In addition, the main limitations of this study are as follows: first of all, the expression levels of hub genes, such as TIMP1, POSTN, LOX, and FN1, have not been confirmed by immunohistochemistry in clinical specimens. Second, the role of these hub genes in the appearance and evolution of GBM is still vague and should be verified by in vitro and in vivo functional research studies.

Conclusion
In this study, we reported the results of a comprehensive bioinformatics analysis of biomarkers associated with glioblastoma prognosis and their association with immuneinfiltrating cells. e wholesale data analysis in this research afforded a comprehensive bioinformatics analysis of DEGs that may be concerned in the progress of GBM. e shortcoming of insufficient samples was overcome by using five open access databases for new joint analysis. However, the molecular mechanisms of TIMP1 in GBM still need to be further studied. In general, our experiments recommended the requirement to more discreetly score TIMP1 as a biomarker and therapeutic target, in order to develop the new antibodies for GBM exploration and prevention, with the hope of providing new ideas for clinical diagnosis and treatment of GBM patients and effectively improving sufferer prognosis.

Data Availability
All the experimental data involved in this study are obtained from open source. Please refer to the materials and methods in this paper for specific access.
Ethical Approval e author promises that all the results of this paper are original research, and all nonoriginal research results are canonically cited as far as possible.

Disclosure
ere is no academic misconduct in this study, such as plagiarizing other people's articles, submitting more than one manuscript, and modifying data results.

Conflicts of Interest
ere is no objection to the authorship and order and no intellectual property dispute. In addition, the research was conducted without any commercial or financial relationship and there are no potential conflicts of interest. (e, f ) CCK-8 and colony formation assays revealed that the proliferation abilities were suppressed in U-87 MG and U-118 MG cells transfected with sh-TIMP1 compared with the NC group. (g, h) Transwell assay revealed that cell migration and invasion were suppressed in U-87 MG and U-118 MG cells transfected with sh-TIMP1 compared with the NC group. * P < 0.05; * * P < 0.01; * * * P < 0.001.