Gastric cancer (GC) is one of the most common malignancies of the digestive system, yet few genetic markers are available for its early detection and prevention. In this study, differentially expressed genes (DEGs) were identified with GEO2R from the GSE54129 and GSE13911 datasets of the Gene Expression Omnibus (GEO). Gene enrichment analysis, protein-protein interaction (PPI) network construction, and topological analysis were then performed on the DEGs using Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, STRING, and Cytoscape. Finally, we performed survival analysis of key genes with the Kaplan-Meier plotter. A total of 1034 DEGs were identified in GC. GO and KEGG results showed that the DEGs were mainly enriched in plasma membrane, cell adhesion, and the PI3K-Akt signaling pathway. Subsequently, a PPI network was constructed, whose most prominent module contained 44 nodes and 333 edges, and 18 candidate genes in the network were selected by centrality analysis and module analysis. Furthermore, high expression of fibronectin 1 (FN1), tissue inhibitor of metalloproteinases 1 (TIMP1), secreted phosphoprotein 1 (SPP1), apolipoprotein E (APOE), and versican (VCAN) was related to poor overall survival in GC patients. In summary, this study suggests that FN1, TIMP1, SPP1, APOE, and VCAN may act as key genes in GC.
Gastric cancer (GC) is a malignant tumor that threatens human health; it is the fifth most common cancer and the third leading cause of cancer death worldwide [
With the advance of the human genome project, cancer has been studied at the genetic level. Gene chips (microarrays) can be used to identify genes associated with early-stage cancer. The technology offers high throughput, high sensitivity, and low cost, and it is widely used in disease diagnosis and drug screening [
In this study, we aimed to identify the key genes that distinguish GC patients from normal controls. We downloaded the gene expression profiles of GSE54129 and GSE13911 and identified 1034 differentially expressed genes (DEGs) in GC. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were then carried out for the DEGs, and a GC-related protein-protein interaction (PPI) network was constructed. Furthermore, centrality analysis and module analysis of the GC-related PPI network identified 18 candidate genes that displayed high centrality values and were located in the first module. The data also showed that high expression of fibronectin 1 (FN1), tissue inhibitor of metalloproteinases 1 (TIMP1), secreted phosphoprotein 1 (SPP1), apolipoprotein E (APOE), and versican (VCAN) was related to poor overall survival in gastric cancer patients. These key genes could serve as potential therapeutic targets and biomarkers for gastric cancer at an early stage.
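The DEG screening step described above keeps only the genes that are differentially expressed in both datasets. The following is a minimal sketch of that intersection, assuming each dataset's DEGs have already been exported as a list of gene symbols. The function name and the short gene lists are hypothetical and are not the study's data.

```python
def intersect_degs(degs_a, degs_b):
    """Return gene symbols present in both DEG lists, sorted for stable output."""
    return sorted(set(degs_a) & set(degs_b))


# Hypothetical DEG lists standing in for the GEO2R exports of the two datasets.
degs_gse54129 = ["FN1", "TIMP1", "SPP1", "COL1A1", "MYC"]
degs_gse13911 = ["SPP1", "FN1", "APOE", "TIMP1", "CDH17"]

shared = intersect_degs(degs_gse54129, degs_gse13911)
print(shared)  # ['FN1', 'SPP1', 'TIMP1']
```

Using sets makes the overlap independent of the order in which genes appear in each export, and it ignores duplicated symbols within a list.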
The Gene Expression Omnibus database (
GEO2R (
The DAVID (
The STRING (
Cytoscape is software for network visualization [
Centrality analysis evaluates the degree, betweenness, and eigenvector centrality of network nodes. The Cytoscape plug-in CytoNCA was used to calculate the values of degree, betweenness, and eigenvector centrality to predict the key genes [
The Kaplan-Meier plotter (KM plotter,
To explore the pathogenesis of GC from a systems biology perspective, we analyzed two microarray datasets, GSE54129 and GSE13911, with GEO2R. There were 3878 differentially expressed genes in GSE54129 and 3061 differentially expressed genes in GSE13911. Venny 2.1.0 was then used to obtain the intersection of the DEGs of the two datasets. The results showed that 1034 differentially expressed genes appeared in both datasets (Figure
Screening and identification of differentially expressed genes. (a) Venn diagram showed the differentially expressed genes of adj.
To better understand the biological function of the DEGs, we conducted GO function and KEGG enrichment analysis with DAVID. GO results showed that the DEGs were significantly enriched in extracellular matrix organization, collagen catabolic process, and cell adhesion (biological process, BP); extracellular space, extracellular region, and extracellular exosome (cellular component, CC); and extracellular matrix structural constituent, heparin binding, and integrin binding (molecular function, MF) (Figure
GO and KEGG enrichment analysis of the PPI network. (a) Top 25 significantly enriched gene ontology terms, including three groups (biological process, cellular component, and molecular function),
To study the molecular mechanism of gastric cancer from a systematic perspective, a PPI network was constructed to explore the relationships between proteins. The PPI network was constructed with STRING for the DEGs with a confidence level of >0.4. The result of network analysis showed that PPI enrichment
To identify closely connected genes in the complex PPI network, we conducted module analysis of the network with MCODE. The analysis identified 27 modules in the PPI network. The first module was the most densely connected region of the PPI network, with a score of 15.488. The module was located at the center of the entire network and included 44 nodes and 333 edges (Figure
Module analysis of PPI networks obtained through Cytoscape’s plug-in MCODE. The most prominent module in the PPI network included 44 nodes and 333 edges.
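The reported module score can be related to the module's size with a short calculation. The sketch below assumes the commonly described MCODE definition of a cluster score as subgraph density multiplied by the number of nodes, with density computed for an undirected graph without self-loops. The function names are illustrative and are not part of MCODE itself.

```python
def subgraph_density(n_nodes, n_edges):
    """Density of an undirected simple graph: edges present / edges possible."""
    possible_edges = n_nodes * (n_nodes - 1) / 2
    return n_edges / possible_edges


def mcode_style_score(n_nodes, n_edges):
    """Cluster score under the assumed definition: density times node count."""
    return subgraph_density(n_nodes, n_edges) * n_nodes


# Node and edge counts of the first module as stated in the text.
density = subgraph_density(44, 333)
score = mcode_style_score(44, 333)
print(round(density, 3), round(score, 3))  # 0.352 15.488
```

Under this assumption, a module with 44 nodes and 333 edges gives a score of about 15.488, which agrees with the value reported for the first module.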
To identify the key genes in the complex PPI network, we performed centrality analysis. First, we examined the distribution of each of the three parameters using density plots. The results showed that degree, betweenness, and eigenvector centrality followed power-law distributions (Figure
Centrality analysis of PPI networks obtained through Cytoscape’s plug-in CytoNCA. (a) A density diagram of degree centrality. (b) A density diagram of betweenness centrality. (c) A density diagram of eigenvector centrality.
Correlation analysis of the top 5% of molecules of each centrality (degree, betweenness, and eigenvector). (a) The correlation coefficient between degree and betweenness was 0.793. (b) The correlation coefficient between degree and eigenvector was 0.920. (c) The correlation coefficient between betweenness and eigenvector was 0.620. (d) Venny 2.1.0 was used to obtain the intersection of the top 5% of genes of each centrality (degree, betweenness, and eigenvector). The 18 genes in the intersection were studied further because of their high degree, betweenness, and eigenvector values.
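The selection of genes that rank in the top 5% for all three centralities can be sketched as follows. This is an illustration only: the node names and scores are hypothetical, the helper names are not CytoNCA functions, and a fraction of 0.5 is used in the example because the toy network has only six nodes.

```python
import math


def top_fraction(scores, fraction=0.05):
    """Return the nodes with the highest scores (always at least one node).

    Ties at the cutoff are broken by the original ordering of the mapping.
    """
    k = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def shared_top_nodes(metrics, fraction=0.05):
    """Intersect the top-ranked node sets of every centrality measure."""
    tops = [top_fraction(scores, fraction) for scores in metrics.values()]
    return set.intersection(*tops)


# Hypothetical centrality values for a six-node toy network.
metrics = {
    "degree": {"g1": 9, "g2": 7, "g3": 6, "g4": 2, "g5": 1, "g6": 1},
    "betweenness": {"g1": 40.0, "g2": 3.0, "g3": 25.0, "g4": 30.0, "g5": 0.0, "g6": 0.0},
    "eigenvector": {"g1": 0.6, "g2": 0.5, "g3": 0.4, "g4": 0.1, "g5": 0.05, "g6": 0.05},
}

shared = shared_top_nodes(metrics, fraction=0.5)
print(sorted(shared))  # ['g1', 'g3']
```

Requiring a node to rank highly on every measure is stricter than using any single centrality, which is why the intersection is smaller than each individual top list.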
Top 5% of candidate genes in the centrality analysis.
Gene | Degree | Betweenness | Eigenvector |
---|---|---|---|
FN1 | 127 | 77922.220 | 0.253395320 |
MMP9 | 93 | 45189.664 | 0.198073490 |
CXCL8 | 85 | 33816.297 | 0.146214620 |
CD44 | 84 | 41183.110 | 0.172110660 |
MYC | 82 | 48818.060 | 0.120246940 |
CXCL12 | 70 | 19844.107 | 0.157128860 |
TIMP1 | 65 | 8912.090 | 0.174682420 |
PTGS2 | 64 | 26884.875 | 0.114611500 |
SPP1 | 62 | 9094.989 | 0.165916760 |
APOB | 54 | 28650.291 | 0.087770930 |
VCAN | 52 | 9312.416 | 0.134104030 |
ICAM1 | 52 | 10434.576 | 0.122722970 |
CXCL1 | 52 | 10233.371 | 0.106985554 |
APOE | 51 | 17188.280 | 0.110692450 |
STAT1 | 51 | 19835.434 | 0.081156254 |
KRAS | 51 | 27196.115 | 0.077731330 |
BGN | 48 | 11646.911 | 0.141241250 |
C3 | 42 | 9121.711 | 0.088115714 |
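As an illustration of how a correlation between two centrality measures can be computed, the sketch below applies a plain Pearson correlation to the degree and betweenness columns of the table above. The choice of Pearson correlation is an assumption, and the coefficient it prints need not match the values reported in the figure, which may have been computed over a different set of nodes.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Degree and betweenness values copied from the table, in row order (FN1 ... C3).
degree = [127, 93, 85, 84, 82, 70, 65, 64, 62, 54, 52, 52, 52, 51, 51, 51, 48, 42]
betweenness = [
    77922.220, 45189.664, 33816.297, 41183.110, 48818.060, 19844.107,
    8912.090, 26884.875, 9094.989, 28650.291, 9312.416, 10434.576,
    10233.371, 17188.280, 19835.434, 27196.115, 11646.911, 9121.711,
]

r = pearson(degree, betweenness)
print(f"degree vs betweenness (18 table genes): r = {r:.3f}")
```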
Survival analysis of the seven candidate genes was then performed using the KM plotter. The results showed that FN1, TIMP1, SPP1, APOE, and VCAN were related to overall survival (OS) in gastric cancer patients (
Survival analysis of key genes by the KM plotter in gastric cancer. (a) Gastric cancer patients with high expression of FN1 had a poor prognosis. (b) Gastric cancer patients with high expression of TIMP1 had a poor prognosis. (c) Gastric cancer patients with high expression of SPP1 had a poor prognosis. (d) Gastric cancer patients with high expression of APOE had a poor prognosis. (e) Gastric cancer patients with high expression of VCAN had a poor prognosis (
Microarray technology is a product of the gradual implementation of the human genome project and the rapid development and application of molecular biology. As gene microarray technology has matured, researchers can quickly measure the expression levels of thousands of genes simultaneously [
In this study, we screened a total of 1034 differentially expressed genes from the GSE54129 and GSE13911 gene expression profiles, among which 403 genes were upregulated and 631 genes were downregulated. The KEGG pathway enrichment analysis revealed that the DEGs were mainly enriched in pathways in cancer and the PI3K-Akt signaling pathway. Previous studies have shown that gastric cancer cell proliferation can be promoted by activating the PI3K/Akt signaling pathway [
To explore the pathogenesis of gastric cancer, we constructed a PPI network for systematic analysis. In the STRING database, we set the minimum interaction score to >0.400 to obtain the PPI network. This threshold was chosen to limit the influence of noise and incomplete data on the PPI network. MCODE discovers dense regions in PPI networks based on connection data. This function is not affected by the false-positive effect of high-throughput technology. We selected the parameters
To further analyze the whole PPI network, centrality analysis was used to assess the importance of the nodes in the network and their influence on the network. We obtained 18 genes with high centrality values. Moreover, seven of the 18 genes were located in the first module. Previous studies have shown that central nodes participate in more protein-protein interactions and carry more information for pathway enrichment analysis, which has a notable effect on the whole network [
The study of cancer survival analysis plays an important role in the evaluation of cancer prevention measures [
In this study, 1034 differentially expressed genes were identified. GO and KEGG results showed that these genes were mainly enriched in plasma membrane, cell adhesion, and the PI3K-Akt signaling pathway. Moreover, 18 topological key genes of the first-ranked module were selected for further study. Five of them (FN1, TIMP1, SPP1, APOE, and VCAN) were found to be related to poor overall survival in gastric cancer. These findings provide new research directions for the detection and treatment of gastric cancer. However, the involvement of these genes in the molecular mechanisms of the disease requires further clinical study.
The GSE54129 and GSE13911 data used to support the findings of this study are included within the article.
Xinyu Chong and Rui Peng are co-first authors.
The authors declare no conflicts of interest.
This study was supported by the National Natural Science Foundation of China (81570747).