Identification of Core Biomarkers Associated with Outcome in Glioma: Evidence from Bioinformatics Analysis

Glioma is the most common neoplasm of the central nervous system (CNS); the progression and outcomes of which are affected by a complicated network of genes and pathways. We chose a gene expression profile of GSE66354 from GEO database to search core biomarkers during the occurrence and development of glioma. A total of 149 samples, involving 136 glioma and 13 normal brain tissues, were enrolled in this article. 1980 differentially expressed genes (DEGs) including 697 upregulated genes and 1283 downregulated genes between glioma patients and healthy individuals were selected using GeoDiver and GEO2R tool. Then, gene ontology (GO) analysis as well as Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were carried out using the Database for Annotation, Visualization and Integrated Discovery (DAVID). Moreover, Cytoscape with Search Tool for the Retrieval of Interacting Genes (STRING) and Molecular Complex Detection (MCODE) plug-in was employed to imagine protein-protein interaction (PPI) of these DEGs. The upregulated genes were enriched in cell cycle, ECM-receptor interaction, and p53 signaling pathway, while the downregulated genes were enriched in retrograde endocannabinoid signaling, glutamatergic synapse, morphine addiction, GABAergic synapse, and calcium signaling pathway. Subsequently, 4 typical modules were discovered by the PPI network utilizing MCODE software. Besides, 15 hub genes were chosen according to the degree of connectivity, including TP53, CDK1, CCNB1, and CCNB2, the Kaplan-Meier analysis of which was further identified. In conclusion, this bioinformatics analysis indicated that DEGs and core genes, such as TP53, might influence the development of glioma, especially in tumor proliferation, which were expected to be promising biomarkers for diagnosis and treatment of glioma.


Introduction
Gliomas comprise about 30% of all brain tumors and central nervous system (CNS) tumors and 80% of all malignant brain tumors [1]. Malignant glioma is the most common brain glioma (accounting for~70%) with an annual incidence of 5 per 100,000 and an extremely malignant clinical outcome. Taking the most common type, glioblastoma (GBM), for example, possesses a median overall survival of only 10-15 months, which gives rise to serve health burden [2]. The pathogenesis of gliomas is very complicated; although a great many genes and proteins take part in the occurrence and development of gliomas, the mechanism still remains not clear [3]. At present, genetic factors and residual embryonic primitive cells in the brain are vital endogenous influences contributing to glioma while harmful physical and chemical factors along with virus infection serve as exogenous factors [4,5]. Regardless of these different factors of pathogenesis, gliomas share similar mechanisms of oncogenesis, such as retinoblastoma (RB), p53, and receptor tyrosine kinase (RTK) signaling pathways, which play critical parts in the development of glioma, especially in GBM. Hence, genes and proteins that can modulate these pathways are not only used for therapeutic targets and diagnostic markers but can also be utilized to assess the pathological characteristics of gliomas [6,7]. For example, isocitrate dehydrogenase (IDH) [8], epidermal growth factor receptor (EGFR) [9], and neurofibromatosis type 1 (NF1) [10] have been the center of attention in glioma genesis.
Currently, the diagnosis of gliomas mainly depends on pathological feature and medical imaging such as CT, MRI, DSA, PET, and SPECT, which have been closely related to experience of surgeons [11]. Because of lacking specificity of auxiliary examination biomarker, it is difficult for surgeons to obtain an accurate diagnosis and treatment of glioma as early as possible, so that many patients miss the optimal chance for surgery, thus increasing the risk of death. On the other hand, the common treatment strategies of gliomas consist of surgical resection as well as chemo and radiotherapy while targeting drugs are still limited [12]. Hence, searching for specific and sensitive biomarkers as well as some core genes or proteins as therapeutic target will benefit the diagnosis and treatment of gliomas.
At present, high-throughput sequencing has been applied as a very critical tool for medical research [13]. In this analysis, we chose GSE66354 from Gene Expression Omnibus (GEO) and used GeoDiver and GEO2R online tool to find the differentially expressed genes (DEGs). Subsequently, we made PPI network of the DEGs and selected core genes with a high degree of connectivity. In addition, analysis of biological process (BP), molecular function (MF), cellular component (CC), and KEGG pathways of the DEGs and four modules was performed. Moreover, overall survival (OS) analysis of these core genes was carried out. Then, the correlation analysis using TCGA database was employed to observe the potential relationship between genes. Briefly, this study would provide novel targets for diagnosis and treatment of glioma.

Identification of DEGs and Hub
Genes. There were 136 glioma (including 21 high-grade gliomas, 15 pilocytic astrocytomas, 19 medulloblastomas, 17 atypical teratoid/rhabdoid tumors, and 64 ependymomas) samples and 13 normal samples in our study. The results of DEGs analysis were freely available in GeoDiver (https://www.geodiver.co.uk/). The heat map and volcano plot showed these DEGs in glioma ( Figure 1). The GEO2R online analysis tool was applied to detect the DEGs, using adjust p value < 0.01 and |logFC| ≥ 2 as cutoff criteria. A total of 1980 DEGs were detected after the analysis of GSE66354; of which, 697 were upregulated genes and 1283 were downregulated. Besides, 15 hub genes with a high degree of connectivity were selected (Table 1).

GO Function and KEGG Pathway Analysis of DEGs.
To obtain a more in-depth understanding of the selected DEGs, GO function and KEGG pathway analysis were applied using DAVID ( Figure 2). GO analysis results showed that upregulated DEGs and downregulated DEGs were particularly enriched in biological process (BP), including mitotic cell cycle, mitotic cell cycle process, cell cycle process, mitotic nuclear division, and cell division for upregulated DEGs, and for downregulated DEGs including synaptic signaling, anterograde trans-synaptic signaling, trans-synaptic signaling, chemical synaptic transmission, and nervous system development (Table 2). For molecular function (MF), the upregulated DEGs were mainly enriched in extracellular matrix structural constituent, coreceptor activity, Wnt protein binding, microtubule motor activity, and growth factor binding, and the downregulated DEGs were enriched in gated channel activity, ion channel activity, substratespecific channel activity, channel activity, and passive transmembrane transporter activity (Table 2). In addition, GO cell component (CC) analysis also displayed that the upregulated DEGs were significantly enriched in the extracellular matrix component, proteinaceous extracellular matrix, extracellular matrix, basement membrane, and microtubule cytoskeleton, and downregulated DEGs were enriched in synapse, neuron part, synapse part, neuron projection, and axon ( Table 2). Figure 2(c) and Table 3 show the most significantly enriched KEGG pathway of the upregulated and downregulated DEGs. The upregulated DEGs were enriched in cell cycle, ECM-receptor interaction, p53 signaling pathway, pathways in cancer, and hepatitis B, while the downregulated DEGs were enriched in retrograde endocannabinoid signaling, glutamatergic synapse, morphine addiction, GABAergic synapse, and calcium signaling pathway.

Hub Genes and Module Screening from PPI Network.
Based on the information in the STRING protein query from public databases, we made the PPI network of the top 15 hub genes with higher degree of connectivity (Figure 2(d)). We selected TP53, TOP2A, CDK1, CCNB1, CDC20, CCNA2, NDC80, AURKA, BIRC5, CCNB2, KIF11, and MAD2L1, which with worse overall survival situation according to the Kaplan-Meier plotter. Based on the GO function, the KEGG pathway analysis, and the survival analysis, we found that TP53, CDK1, CCNB1, and CCNB2 were enriched in cell cycle, especially in p53 signaling pathway.  In order to detect significant modules in this PPI network, we used MCODE plug-in. The top 4 modules were selected ( Figure 5). KEGG pathway enrichment analysis showed that these 4 modules were mainly correlated with cell cycle, neuroactive ligand-receptor interaction, calcium signaling pathway, and arrhythmogenic right ventricular cardiomyopathy (ARVC) ( Table 4).

Discussion
The rising trend of the glioma morbidity has raised our attention in recent years, which is mainly caused by the failure to early diagnosis and treatment. Therefore, sensitive and specific biomarkers as well as core therapeutic targets of glioma are urgently needed to be screened. In the present analysis, 136 glioma samples (including 21 high-grade gliomas, 15 pilocytic astrocytomas, 19 medulloblastomas, 17 atypical teratoid/rhabdoid tumors, and 64 ependymomas) and 13 normal samples were enrolled from the GEO database of GSE66354. A total of 1980 DEGs were identified, including 697 upregulated genes and 1283 downregulated genes. To obtain an in-depth understanding of these DEGs, we performed the GO function and KEGG pathway analysis of these DEGs. And we found that the upregulated DEGs are mainly implicated with cell cycle, ECM-receptor interaction, and p53 signaling pathway, while the downregulated genes were enriched in retrograde endocannabinoid signaling, glutamatergic synapse, morphine addiction, GABAergic synapse, and calcium signaling pathway. In the meantime, we verified the effects of these hub genes on survival in patients with glioma. Of these 15 hub genes, PHLPP2, DLG4, and MYC displayed significant positive correlation with overall survival while TP53, TOP2A, CDK1, CCNB1, CDC20, CCNA2, NDC80, AURKA, BIRC5, CCNB2, KIF11, and MAD2L1 negatively correlated with overall survival in patients with glioma. Subsequently, we further confirmed the expression of TP53 and DLG4 in GBM and LGG.  3.1. DEGs Are Promising Candidates for the Diagnosis of Glioma. Our analysis picked up 1980 DEGs between patients with glioma and normal individuals. In our heat map, a total of 100 DEGs with most differential expression were captured. We hypothesized that these DEGs would be promised to be candidates for the diagnosis of glioma in future. In fact, some of these DEGs have been already uncovered to be good predictors of glioma. For instance, WEE1 is a regulator of the G2 checkpoint in cell. The WEE1-positive nuclear area is correlated with malignancy grade and WEE1 is associated with prognosis in GBM inversely [14].
LGI3, a secreted protein member of leucine-rich glioma inactivated (LGI) family, is predominantly expressed in the brain, skin, and adipose tissues, exerting roles as a multifunctional cytokine [15]. Studies have revealed that low expression levels of LGI3 are obviously associated with poor prognosis of glioma [16].
Collectively, the clinical value of these DEGs needs to be further explored.

Hub Genes May Serve as Core Therapeutic Targets in
Glioma. In this study, we selected 15 hub genes in glioma, which were in the core nodes in PPI network; thus, they might be the key therapeutic targets to combat glioma. TOP2A, one of the important molecular markers, would predict response to chemotherapy. In glioma, high levels of TOP2A mRNA have been noted in GBM in comparison with grade II and III astrocytomas and also correlate with tumor TOP2A protein levels. Interestingly, temozolomide inhibited TOP2A activity and siRNA knocked down of TOP2A rendered a glioma cell line resist. Hence, the transcript of TOP2A should be a good prognostic indicator in GBM patients receiving temozolomide chemotherapy. AURKA is located on chromosome 20q13, frequently amplified and overexpressed in human malignancies, involving breast cancer [17], pancreatic cancer [18], and gastric cancer [19]. A recent study has demonstrated that AURKA could confer self-renewal capacity of competing away the binding of AXIN from β-catenin, inducing β-catenin stabilization and activating Wnt signaling in glioma-initiating cells [20]. Hence, endogenous or exogenous drugs that could target these genes will combat glioma. For example, miR-124, an important downstream target gene of hedgehog (Hh) signaling, potentially interacts with the 3 ′ -UTR region of AURKA; thus, upregulating miR-124 significantly reduces the expression of AURKA and inhibits the proliferation and growth of human glioma cells [21].
3.3. TP53 Is Essential for the Progression of Glioma. Tumor protein p53 (TP53), also known as BCC7, LFS1, P53, and TRP53, encodes a tumor suppressor protein containing transcriptional activation, DNA binding, and oligomerization domains [22]. The encoded protein responds to diverse cellular stresses to regulate expression of target genes, thereby inducing cell cycle arrest, apoptosis, senescence, DNA repair, or changes in metabolism [23]. Mutations in this gene are associated with a variety of human cancers, such as breast cancer [24], prostate cancer [25], liver cancer [26], and colorectal carcinoma [27]. The PPI network analysis found that TP53 had a higher combined score with CDK2. Figure 4(c) shows the results of the correlation analysis between TP53 and CDK2. TP53 and CDK2 were obviously positively correlated. The CDK2 gene encodes a member of the serine/threonine protein kinase family that is involved in cell cycle regulation [28]. The CDK2 protein is implicated in the control of cell cycle progression [29]. Thus, targeting of cell cycle checkpoints in cancer cells may inhibit tumor growth and induce cell death. CDK2 is a vital regulator of S-phase progression and has been regarded as an anticancer drug target in several studies [30,31]. Recent studies show that for both DNA damage-and oncogene-induced cellular senescence, CDK2 transcript and protein are inhibited in a p53-and RB-dependent manner, and this repression is necessary for cell cycle exit during senescence, suggesting that there may exist a potential relationship between TP53 and CDK2 [32].  Limited by few studies about the relationship between TP53 and CDK2 in glioma, further research is necessary to clarify the underlying correlation and mechanism between TP53 and CDK2. Cyclin-dependent kinase 1 (CDK1), also called CDC2, CDC28A, and P34CDC2, encodes the protein of a member of the serine/threonine protein kinase family. This protein is a catalytic subunit of the highly conserved protein kinase complex known as M-phase promoting factor (MPF), which is essential for G1/S-and G2/M-phase transitions of eukaryotic cell cycle [33]. Mitotic cyclins stably associate with this protein and function as regulatory subunits. The kinase activity of this protein is controlled by cyclin accumulation and destruction through the cell cycle. The phosphorylation and dephosphorylation of this protein also play important regulatory roles in cell cycle control [34]. A previous study indicates that apoptosis induced by cdk1 inhibition is dependent on caspase activation and is concomitant with upregulation of transcriptional targets of TP53 [35]. Since CDK1 is associated with many types of human cancers, it is believed that CDK1 might play an important role in diagnosis and therapy of glioma.
Taken together, our bioinformatics analysis identified DEGs and they might play central roles in the occurrence, development, and prognosis of glioma. In this study, a total of 1980 DEGs and 15 hub genes were selected, and TP53, CDK1, CCNB1, and CCNB2 might be the core genes of gastric cancer. In order to get more accurate correlation results, we need to start a series of verification experiments later to confirm the results of this prediction. Anyway, this study could provide some powerful evidence for the future genomic individualized treatment of glioma.

Differential Gene Expression Analysis (DGEA)
. GeoDiver (https://www.geodiver.co.uk/) was an online web application for performing differential gene expression analysis (DGEA) and generally applicable gene-set enrichment analysis (GAGE) on gene expression datasets from the publicly available Gene Expression Omnibus (GEO) [37]. GeoDiver used   (e) TP53 protein was strongly upregulated in glioma tissues compared with normal brain tissues based on the Human Protein Atlas database. The normal brain tissue of TP53 was from a male, age 62, (patient ID: 1609; staining: not detected; intensity: negative; quantity: negative; location: none), and the glioma tissue was from a male, age 61, (patient ID: 2522; staining: high; intensity: strong; quantity: >75%; location: nuclear). (f) DLG4 protein was strongly downregulated in glioma tissues compared with normal brain tissues based on the Human Protein Atlas database. The normal brain tissue of DLG4 was from a male, age 45, (patient ID: 2521; staining: medium; intensity: moderate; quantity: >75%; location: membranous nuclear), and the glioma tissue was from a male, age 48 (patient ID: 3092; staining: not detected; intensity: negative; quantity: negative; location: none).    the limma R package to identify differentially expressed genes by fitting a linear model to each gene, which estimated the fold change in expression while accounting for standard errors by applying empirical Bayes smoothing. The adjust p values were used to decrease the false positive rate utilizing Benjamini and Hochberg false discovery rate method by default. Genes were ordered according to the fold change in the expression values. This information was presented as a heat map and a volcano plot.

Data
Processing of DEGs. GEO2R (https://www.ncbi.nlm .nih.gov/geo/geo2r/) was used to search DEGs between glioma samples and normal samples. GEO2R was an interactive online tool allowing users to compare two or more groups of samples in a GEO series and it analyzed most GEO series with gene symbol [38]. The adjusted p values were used to decrease the false positive rate using Benjamini and Hochberg false discovery rate method by default. The adjust p value < 0.01 and |logFC| ≥ 2 were set as the cutoff criterion. Then, 1980 DEGs were found, including 697 upregulated genes and 1283 downregulated genes, and the top 15 genes with a high degree of connectivity as hub genes were selected.

Gene Ontology and KEGG Pathway Analysis of DEGs.
Gene ontology (GO) analysis served as a useful approach to annotate genes and gene products and also identify characteristic biological attributing to high-throughput genome or transcriptome data [39]. Kyoto Encyclopedia of Genes and Genomes (KEGG) was a collection of databases which helps to handle genomes, biological pathways, diseases, chemical substances, and drugs [40]. The Database for Annotation, Visualization and Integrated Discovery (DAVID, https:// david.ncifcrf.gov/) was a web-based online bioinformatics resource that aims to provide tools for the functional interpretation of large lists of genes or proteins [41]. p < 0 05 was regarded as the cutoff criterion. We could visualize the core biological process (BP), molecular function (MF), cellular component (CC), and pathways among those DEGs using DAVID.
4.5. PPI Network and Module Analysis. Search Tool for the Retrieval of Interacting Genes (STRING) was a useful online tool designed to assess the protein-protein interaction (PPI) information [42]. To explore the potential relationship among those DEGs, we applied STRING app in Cytoscape and mapped the DEGs into STRING. And confidence score ≥ 0.4 and maximum number of interactors = 0 were set as the cutoff criterion. Moreover, the Molecular Complex Detection (MCODE) app was utilized to screen modules of the PPI network in Cytoscape with degree cutoff = 2, node score cutoff = 0.2, k-core = 2, and max. depth = 100. The pathway analysis of genes in these modules was performed by DAVID. In the meantime, 15 hub genes were also inserted into STRING with confidence score ≥ 0.4 and maximum number of interactors = 0. GO and KEGG pathway analyses were also carried out to identify the potential information.
4.6. Comparison of the Hub Gene Expression Level. The GEPIA (http://gepia.cancer-pku.cn/index.html) was a newly developed interactive web server that could analyze the RNA sequencing expression data of 9736 tumors and 8587 normal samples from the TCGA and the GTEx projects, using a standard processing pipeline [43]. It offered customizable functions involving tumor and normal differential expression analysis, and we could unveil the expression level of hub genes in glioma tissues and normal tissues. Then, the boxplots were performed to visualize the correlations. The Human Protein Atlas (HPA, https://www.proteinatlas.org/) was a Swedish-based program initiated in 2003 with the aim to map all the human proteins in cells, tissues, and organs using integration of various omics technologies, including antibody-based imaging, mass spectrometrybased proteomics, transcriptomics, and systems biology [44]. By acquiring immunohistochemical data of patients with or without glioma based on HPA, we further verified the expression of these hub genes.

Survival
Analysis of Hub Genes. The relapse-free and overall survival information were based on TCGA database and the GTEx projects based on GEPIA [43]. The hazard ratio (HR) with 95% confidence intervals and log rank p value were calculated and displayed on the plot.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
All authors declare that they have no conflict of interests to state.