As one of the most common malignant tumors, gastric cancer is the third leading cause of cancer-related mortality worldwide, largely because of late presentation. Its incidence is affected by various genetic and environmental factors and shows a characteristic geographical distribution. Eastern Asia, Central and Eastern Europe, and South America are higher-risk areas, whereas Northern America and most parts of Africa are low-risk areas [
Many researchers have devoted themselves to studying the pathogenesis of gastric cancer and searching for potential targets for diagnosis and treatment. At present, several factors, such as HER2, VEGF, FEGFR, and mammalian target of rapamycin (mTOR), have been considered as therapeutic targets for gastric cancer [
To date, there are still no definitive tools for the diagnosis of gastric carcinoma because the regulatory mechanism of gastric cancer has not been clarified. Integrating multiple microarray studies may provide additional evidence for understanding this mechanism. Herein, we conducted an integrated analysis of gastric cancer microarray data and identified additional candidate differentially expressed genes (DEGs) between gastric cancer and normal control tissues. Moreover, the significantly enriched functions of these genes were screened and analyzed to discover the biological processes and signaling pathways associated with gastric cancer. A transcriptional regulatory network was further constructed.
Gene Expression Omnibus (GEO) database is a public functional genomics data repository (
Firstly, the six datasets were preprocessed by background correction and normalization. Limma package [
In order to assess the changes in DEGs occurring at the cellular level and the functional clustering of DEGs, the enrichment analysis tool GeneCodis3 (
DEGs between gastric cancer and normal tissues could be activated or repressed by TFs. All the TFs in human genome and the motifs of genomic binding sites were downloaded from the TRANSFAC database [
The online tool Cancer Browser (
According to the inclusion criteria, we downloaded six gene expression profiles of gastric cancer from microarray experiments. The GEO IDs were GSE13911, GSE19826, GSE34942, GSE35809, GSE51105, and GSE57303. In total, there were 340 tumor samples and 43 normal gastric tissue samples. The tumor sample types were as follows: GSE13911 (26 intestinal + 6 diffuse + 4 mixed + 2 unclassified), GSE19826 (unknown Lauren subtype), GSE34942 (39 intestinal + 11 diffuse + 6 unclassified), GSE35809 (34 intestinal + 30 diffuse + 6 unclassified), GSE51105 (49 intestinal + 35 diffuse + 10 mixed), and GSE57303 (Lauren subtype not further provided). The characteristics of the eligible datasets are summarized in Table
Characteristics of the six microarray datasets for integrated analysis.
GEO ID | Platform | Sample (case : control) | Country | Year | Author |
---|---|---|---|---|---|
GSE13911 | GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 38 : 31 | Italy | 2008 | D’Errico et al. [ |
GSE19826 | GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 12 : 12 | China | 2010 | Wang et al. [ |
GSE34942 | GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 56 : 0 | Singapore | 2014 | Lei et al. [ |
GSE35809 | GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 70 : 0 | Singapore | 2012 | Lei et al. [ |
GSE51105 | GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 94 : 0 | Australia | 2014 | Busuttil et al. [ |
GSE57303 | GPL570 [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array | 70 : 0 | China | 2014 | Qian et al. [ |
Integrated analysis of the six microarray datasets covered 17481 genes. Using FDR < 0.01 as the statistical significance threshold, a total of 2327 DEGs were identified, including 2100 upregulated and 227 downregulated DEGs. The top ten upregulated and downregulated DEGs between gastric cancer and normal tissues are listed in Table
Top ten upregulated and downregulated DEGs between gastric cancer and normal tissues.
Symbol | Log FC | |
---|---|---|
CST1 | 5.02 | 3.12 |
MMP11 | 3.27 | 5.35 |
COL1A1 | 3.27 | 1.21 |
GDF15 | 3.10 | 5.16 |
UBD | 3.02 | 8.33 |
APOC1 | 3.02 | 1.03 |
SPP1 | 2.92 | 1.67 |
CTHRC1 | 2.85 | 3.95 |
COL10A1 | 2.84 | 1.95 |
INHBA | 2.80 | 8.79 |
GKN1 | −5.70 | 6.73 |
GKN2 | −4.57 | 7.42 |
PGA3 | −4.45 | 2.89 |
MAL | −4.19 | 1.50 |
PGA5 | −3.73 | 1.24 |
ATP4B | −3.49 | 7.36 |
GIF | −3.33 | 4.22 |
ATP4A | −3.24 | 2.52 |
DPT | −3.22 | 1.67 |
C2orf40 | −3.17 | 5.14 |
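The FDR < 0.01 threshold used to call DEGs can be illustrated with a minimal sketch. It assumes the FDR values are Benjamini-Hochberg adjusted p-values, which the text does not state explicitly. The function names, the placeholder gene `GENE_X`, and all p-values below are hypothetical; only the two log FC values for CST1 and GKN1 are taken from the table above.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log_fc, p_values, fdr_cutoff=0.01):
    """Split genes passing the FDR cutoff into up- and downregulated lists."""
    fdr = benjamini_hochberg(p_values)
    up, down = [], []
    for gene, lfc, q in zip(genes, log_fc, fdr):
        if q < fdr_cutoff:
            (up if lfc > 0 else down).append(gene)
    return up, down


# Toy input: the p-values are invented for illustration only.
genes = ["CST1", "GKN1", "GENE_X"]
log_fc = [5.02, -5.70, 0.10]
p_values = [1e-6, 2e-6, 0.5]
up, down = select_degs(genes, log_fc, p_values)
```

In this toy input, CST1 lands in the upregulated list and GKN1 in the downregulated list, while the placeholder gene fails the cutoff.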
GO enrichment analysis of DEGs was performed to understand their biological functions. In the present study, the three GO categories (biological process, cellular component, and molecular function) were each analyzed using the web-based software GeneCodis3. The results of enrichment analysis showed that the significantly enriched GO terms for biological process were multicellular organismal process (GO: 32501, FDR =
Enriched GO terms of DEGs between gastric cancer and normal tissues.
GO ID | GO term | Number of genes | FDR |
---|---|---|---|
Biological process | |||
32501 | Multicellular organismal process | 58 | 1.85 |
3008 | System process | 32 | 1.96 |
7586 | Digestion | 9 | 6.11 |
44707 | Single-multicellular organismal process | 56 | 6.97 |
42391 | Regulation of membrane potential | 14 | 1.14 |
44057 | Regulation of system process | 13 | 1.23 |
7610 | Behavior | 20 | 1.27 |
43269 | Regulation of ion transport | 17 | 1.90 |
9719 | Response to endogenous stimulus | 30 | 2.12 |
1903522 | Regulation of blood circulation | 10 | 4.21 |
Cellular component | |||
5615 | Extracellular space | 22 | 4.55 |
44459 | Plasma membrane part | 48 | 1.61 |
44421 | Extracellular region part | 58 | 6.19 |
5576 | Extracellular region | 34 | 6.60 |
31226 | Intrinsic component of plasma membrane | 33 | 7.45 |
5887 | Integral component of plasma membrane | 31 | 4.49 |
31224 | Intrinsic component of membrane | 72 | 4.86 |
44425 | Membrane part | 85 | 7.79 |
1990351 | Transporter complex | 13 | 3.66 |
1902495 | Transmembrane transporter complex | 13 | 4.00 |
16021 | Integral component of membrane | 69 | 4.13 |
97458 | Neuron part | 31 | 4.20 |
34702 | Ion channel complex | 11 | 2.16 |
5886 | Plasma membrane | 53 | 2.64 |
98590 | Plasma membrane region | 22 | 3.34 |
Molecular function | |||
0005201 | Extracellular matrix structural constituent | 8 | 2.82 |
0043168 | Anion binding | 164 | 4.39 |
Moreover, the KEGG pathway enrichment analysis indicated that cell cycle (FDR =
Top 15 enriched KEGG pathways of DEGs between gastric cancer and normal tissues.
KEGG ID | KEGG term | Count | FDR |
---|---|---|---|
3040 | Spliceosome | 57 | 5.45 |
4110 | Cell cycle | 57 | 4.81 |
3013 | RNA transport | 51 | 8.58 |
3008 | Ribosome biogenesis in eukaryotes | 36 | 5.29 |
3030 | DNA replication | 23 | 8.72 |
240 | Pyrimidine metabolism | 32 | 5.19 |
230 | Purine metabolism | 40 | 1.11 |
3430 | Mismatch repair | 13 | 5.98 |
4114 | Oocyte meiosis | 27 | 2.78 |
3440 | Homologous recombination | 13 | 3.89 |
4914 | Progesterone-mediated oocyte maturation | 23 | 6.62 |
3015 | mRNA surveillance pathway | 22 | 6.82 |
4115 | p53 signaling pathway | 20 | 7.97 |
4141 | Protein processing in endoplasmic reticulum | 32 | 1.10 |
3018 | RNA degradation | 19 | 3.27 |
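Enrichment of a GO term or KEGG pathway among DEGs is commonly scored with a one-sided hypergeometric test, sketched below. Whether this matches the exact GeneCodis3 settings used in this study is an assumption. The function name is hypothetical, and the pathway size of 124 genes is an invented placeholder; the background of 17481 genes, the 2327 DEGs, and the count of 57 cell cycle genes come from the text and table above.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_term, n_degs, n_overlap):
    """P(X >= n_overlap) for X ~ Hypergeometric.

    n_background: genes measured in total
    n_term:       background genes annotated to the term or pathway
    n_degs:       differentially expressed genes drawn from the background
    n_overlap:    DEGs annotated to the term or pathway
    """
    upper = min(n_term, n_degs)
    total = comb(n_background, n_degs)
    # Sum the probability mass of the upper tail with exact integers.
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# The pathway size (124) is a placeholder, not a value from this study.
p_cell_cycle = hypergeom_enrichment_p(17481, 124, 2327, 57)
```

The resulting p-values across all tested terms would then be corrected for multiple testing before being reported as FDR.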
To construct the TF-target gene regulatory network for gastric cancer, we used TRANSFAC to retrieve TFs and their potential target genes and then selected the TFs and potential target genes that were differentially expressed in gastric cancer tissues. We found a total of 70 differentially expressed TFs (54 upregulated and 16 downregulated) and 470 differentially expressed potential target genes in gastric cancer (Table
Top ten TFs interacting with the most DEGs.
TFs | Log FC | Count | Genes |
---|---|---|---|
BRCA1 | 1.067827 | 49 | SLC6A6, NHLRC3, TAF2, POGK, GMFB, NUSAP1, TCF20, CXCL1, FANCB, GCNT4, MPHOSPH9, TAF15, SMG1, WRAP53, HNRNPA2B1, AP1S3, ATP13A3, COPA, TMEM132A, PGM2L1, CTR9, DHX37, SAPCD2, INTS1, SLC30A7, THUMPD2, ZNF707, CCDC34, HMGN1, SLC35A2, ENPP6, CHUK, PRKCSH, ARMC10, RANBP1, LOC389906, CEP72, TIPIN, ILF3, GEMIN5, DCLRE1C, SPAG5, TRMT6, TTYH3, ZC3H11A, MIS18A, SUPV3L1, MND1, PTGES3 |
ARID3A | 0.844259 | 47 | MCM4, LSG1, NUP35, SPINT2, C18orf54, TASP1, REXO4, VCPIP1, AGTRAP, RFWD2, QTRTD1, PPP1R9B, CACNA2D3, ZNF207, AASDHPPT, CDC123, SLC6A4, STAMBPL1, HLF, GINS1, PIGU, TRIM37, CORO7-PAM16, ADRB2, CCNF, DDX31, TTLL5, CDH24, CAD, RPAP3, IWS1, ELK1, FBXO45, NEFL, PPP3R1, TARDBP, G2E3, AMPD1, SUPT7L, NMT1, TSLP, ORC1, FANCF, FAM213B, NUP93, TACC3, CHERP |
EHF | 1.300121 | 42 | C9orf114, LRFN4, FTSJ3, LARP4, NFYA, PDRG1, ATP2A2, DPP6, ATP2C1, SNORD116-2, DCLRE1B, NME1-NME2, CENPL, ZNF146, STIL, NLK, MFAP2, DPAGT1, SNRNP200, GDF15, ATF6, UHMK1, IFI30, TRMT1, MLH3, PLBD2, PARG, ITGA2, DARS2, LY6E, KIF4A, ADPGK, USP2, TRUB1, FGFR4, BRMS1, NEIL3, ZNF598, SAFB, NCAPG2, C2orf15, MTHFD1 |
SOX10 | −1.06478 | 42 | YWHAB, GCA, DTYMK, TAF4, STMN1, TOP1MT, RBM12B, RAD51D, DDX10, KIFC1, CCNE2, LOC100129034, SPTAN1, DNAJC14, NUP155, SUV39H2, SNX5, SST, AJUBA, ZBTB33, CCNB1, QSOX2, NVL, NOM1, OSBPL3, ILF3, UBE2T, UBE2C, SNRPF, CBX8, PKP4, EIF3J, GCN1L1, BAZ1A, EXO1, ESRRG, ANKRD52, AGFG1, SNRNP40, TBL1XR1, SPICE1, SGOL2 |
ZNF263 | 0.435575 | 41 | PTGES3, R3HDM1, TTYH3, RPGRIP1L, POM121, KIF2C, GABPB1, SLC7A6, ZNF526, SYMPK, KLHL12, SETDB1, PAK2, HNRNPC, POLD3, TPR, NOM1, THBS2, SULF1, SYNJ2, ATP13A2, KIF20B, CHEK1, STIP1, LRPPRC, ZMYND15, LRRC3B, MAMDC2, TNFRSF10B, SOX4, AURKAPS1, NT5C1A, TMEM199, CDK5RAP1, RAI14, SHQ1, DSCC1, ATP2A2, PTGR1, ZSCAN29, PMM2 |
FOXL1 | 0.691973 | 38 | BCCIP, CNOT6, CCT3, CKAP2L, ZNF335, XPO5, SMARCC1, BTG2, OLFM3, PSMD12, EFCAB11, WDYHV1, PALB2, NCAPD2, TMEM5, PDRG1, FHL1, SRP72, SORCS1, TEX261, TXNDC12, ATG7, DPAGT1, HIATL1, LAMB1, UBE2O, TCOF1, NIT2, PLEKHG4, TNRC18, DUS4L, NLRC5, STAU1, TP53BP1, POLG, SSB, MMS22L, RAI14 |
FEV | −0.75328 | 37 | WBP11, SH3KBP1, USP1, TIMM8A, KRT18, LTV1, ZNF485, PAK2, PODXL, ADHFE1, DIP2B, POLG2, PUS7, RCC2, DPM2, RPGRIP1L, BLOC1S2, WDR12, NCEH1, IWS1, COG2, DEPDC1, NCAM1, EPHB4, POLQ, CCT6A, MAPRE1, CENPW, SLC28A1, PIK3CB, RNF2, NSUN2, TYK2, DAZAP1, C2orf15, HN1L, SMYD5 |
GATA3 | 0.606954 | 35 | FCHO1, ZDHHC9, CCNF, PIK3CB, TOP3A, ZNF678, EML4, WDR43, FANCM, GPN1, COL4A1, MB21D1, GORASP2, DUSP12, LGALS8, WDR3, CDC6, ZBTB41, EAF1, UFM1, HSPBAP1, PATL1, COL1A1, ARFGAP1, IKBIP, NOMO1, KAT2B, TTI1, SPG21, FAM107A, RAD51, HMMR, UHMK1, BMP1, ZC3H3 |
FOXC1 | 1.210691 | 32 | VIT, PBK, AKAP8, ANAPC5, ILF2, NLN, RBM27, STX6, ZNF473, CHD4, MSH6, CREBZF, ZNF341, DBF4, ZNF107, PKP4, HNRNPD, CNOT6, U2SURP, CENPP, SFRP1, SUPV3L1, SFMBT1, CDKN3, NUP188, GCN1L1, NUPL1, MAMDC2, PMS1, RCCD1, UBQLN4, SMARCA5 |
FOXD1 | 0.795645 | 28 | MTPAP, BAZ1A, CHEK1, SLC30A5, NCL, MAPRE1, AQP4, USP14, EARS2, SYNCRIP, PALB2, SLC37A3, PHF6, POLR3E, TPM3, HOXB9, CD46, CLCN5, GOLT1B, C2orf44, AAGAB, NEK2, FAM208B, MYH9, UGGT1, NOL10, PRIMA1, ZNF92 |
The established transcriptional regulatory network in gastric cancer. Rectangle indicates TFs, and ellipse indicates target genes. Red-color and green-color nodes represent products of upregulated and downregulated TFs, respectively. Blue nodes indicate differentially expressed target genes.
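The network construction described above (keep only TF-target pairs in which both genes are differentially expressed, then rank TFs by the number of targets) can be sketched as follows. This is an illustrative outline, not the authors' pipeline: the function names and the `NOT_A_DEG` placeholder are hypothetical, and the toy pairs are a small subset of the table above. The log FC values for BRCA1, EHF, and SOX10 are taken from that table.

```python
from collections import defaultdict


def build_tf_network(tf_target_pairs, deg_symbols):
    """Keep only TF-target pairs in which both genes are DEGs."""
    network = defaultdict(set)
    for tf, target in tf_target_pairs:
        if tf in deg_symbols and target in deg_symbols:
            network[tf].add(target)
    return dict(network)


def summarize_top_tfs(network, tf_log_fc, n=10):
    """Rank TFs by number of DEG targets (ties broken alphabetically)."""
    ranked = sorted(network, key=lambda tf: (-len(network[tf]), tf))[:n]
    return [
        (tf, "up" if tf_log_fc[tf] > 0 else "down", len(network[tf]))
        for tf in ranked
    ]


# Toy subset of predicted pairs; "NOT_A_DEG" is a made-up placeholder.
pairs = [
    ("BRCA1", "SLC6A6"),
    ("BRCA1", "TAF2"),
    ("BRCA1", "NOT_A_DEG"),
    ("EHF", "GDF15"),
    ("SOX10", "STMN1"),
]
deg_symbols = {"BRCA1", "EHF", "SOX10", "SLC6A6", "TAF2", "GDF15", "STMN1"}
tf_log_fc = {"BRCA1": 1.067827, "EHF": 1.300121, "SOX10": -1.06478}

network = build_tf_network(pairs, deg_symbols)
summary = summarize_top_tfs(network, tf_log_fc)
```

The direction label corresponds to the red (upregulated) and green (downregulated) TF nodes in the figure, and the count corresponds to the number of target genes connected to each TF.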
The top ten differentially expressed TFs were selected for validation. The online validation revealed that the expression patterns of the top ten TFs were consistent with those from the integrated analysis. SOX10 and FEV were downregulated, while BRCA1, ARID3A, EHF, ZNF263, FOXL1, GATA3, FOXC1, and FOXD1 were upregulated in primary gastric adenocarcinoma compared with normal gastric tissue (Figure
Heat map of top ten differentially expressed TFs in the dataset of TCGA stomach adenocarcinoma (STAD) gene expression by RNAseq. Sample type: green indicates the primary tumor of stomach adenocarcinoma (
Gastric cancer has few symptoms during the early stages, and most patients are usually diagnosed after the cancer has progressed to an advanced stage, which results in short survival times. Therefore, the high mortality rate underlines the need for early diagnosis and effective medical treatments for the patients [
In this study, according to integrated analysis of six microarray datasets for gastric cancer, 2327 DEGs were identified (2100 upregulated and 227 downregulated). We also observed that digestion (GO: 7586, FDR =
Moreover, 70 differentially expressed TFs were identified and a transcriptional regulatory network was constructed. In the network, the top ten TFs regulating the most downstream target genes were BRCA1, ARID3A, EHF, SOX10, ZNF263, FOXL1, FEV, GATA3, FOXC1, and FOXD1. Most of them have been reported to be involved in the progression of gastric cancer.
BRCA1 is an important tumor suppressor, which plays an essential role in maintaining genomic stability and integrity. BRCA1 was previously suggested as a good prognostic factor for gastric cancer [
ARID3A is a member of the ARID family of DNA-binding proteins. The expression of ARID3A was markedly increased in colon cancer tissue compared with matched normal colonic mucosa. A previous study suggested that strong expression of ARID3A may predict a good prognosis in patients with colorectal carcinoma, and Song et al. mentioned that whether ARID3A acts as an oncogene or tumor suppressor remains controversial [
Abnormalities of SOX factors have been shown to play critical roles in cancer formation and development. SOX10 was identified as a methylated gene in digestive cancers [
The expression of FOXC1 has significance in the development, progression, and metastasis of gastric cancer, and overexpression of FOXC1 may serve as a useful marker for predicting the outcome of patients with gastric cancer [
It was reported that FOXL1 was also upregulated in pancreatic intraepithelial neoplasia [
The expression level of GATA3 was significantly increased in patients with gastric cancer [
Taken together, our integrated analysis identified 2327 DEGs in gastric cancer. Moreover, the results of functional enrichment analysis revealed that some biological functions or pathways may be closely related to the development of gastric cancer, including digestion, cell cycle, and homologous recombination. The constructed transcriptional regulatory network may help to further clarify the underlying regulatory mechanism of gastric cancer. The ten TFs regulating the most downstream target genes were BRCA1, ARID3A, EHF, SOX10, ZNF263, FOXL1, FEV, GATA3, FOXC1, and FOXD1.
DEGs: Differentially expressed genes
GEO: Gene Expression Omnibus
GO: Gene Ontology
KEGG: Kyoto Encyclopedia of Genes and Genomes
TFs: Transcription factors.
The authors declare that they have no competing interests.