The Regulatory Network of Gastric Cancer Pathogenesis and Its Potential Therapeutic Active Ingredients of Traditional Chinese Medicine Based on Bioinformatics, Molecular Docking, and Molecular Dynamics Simulation

Objective: This study aimed to investigate the functional gene network in gastric carcinogenesis using bioinformatics and to explore the diagnostic utility of key genes and potential therapeutic active ingredients of traditional Chinese medicine (TCM) in gastric cancer. Methods: The Cancer Genome Atlas and Gene Expression Omnibus databases were used to identify the differentially expressed genes (DEGs) between gastric cancer and normal gastric tissues. The DEGs then underwent Gene Ontology and Kyoto Encyclopedia of Genes and Genomes enrichment analyses using the Metascape database. The STRING database and the Cytoscape software were used to construct the protein-protein interaction network of DEGs and to screen hub genes. Survival and expression analyses of hub genes were conducted using the Gene Expression Profiling Interactive Analysis and Human Protein Atlas databases. Using the Comparative Toxicogenomics Database, the active ingredients of TCM that interact with the hub genes were analyzed to provide potential information for the treatment of gastric cancer. After molecular docking of the active ingredients of TCM to specific hub gene receptor proteins, molecular dynamics simulation with GROMACS was applied to validate the conformations with the strongest binding ability in the molecular docking. Results: A total of 291 significant DEGs were found, from which 12 hub genes were screened out. Among these hub genes, the expression of five (COL1A1, COL5A2, MMP12, SERPINE1, and VCAN) was significantly correlated with overall survival. Four potential therapeutic active ingredients of TCM were identified: quercetin, resveratrol, emodin, and schisandrin B. The molecular docking results showed that the active ingredients of TCM formed stable binding with the hub gene targets. SERPINE1 (3UT3)-Emodin and COL1A1 (7DV6)-Quercetin were subjected to molecular dynamics simulations as the conformations most worth further research, and both were found to be stably bound through van der Waals, electrostatic, and hydrogen bonding interactions. Conclusion: Our findings may provide novel insights and references for the screening of biomarkers, the prognostic evaluation, and the identification of potential active ingredients of TCM for gastric cancer treatment.


Introduction
Gastric cancer, a malignancy arising in the gastric mucosal epithelium, is the fifth most frequent cancer and the third most common cause of cancer deaths worldwide [1], with 1.48 million new cases annually. The occurrence and development of gastric cancer is a complicated process involving multiple factors, steps, and genes, and its etiology and pathogenesis have not been fully elucidated. It is well accepted that the risk factors of gastric cancer include diet, lifestyle (smoking and alcohol consumption), and Helicobacter pylori infection [2]. The main therapeutic approaches for gastric cancer in recent years have been endoscopic resection [3], surgery [4], radiotherapy [5], neoadjuvant chemotherapy [6], immunotherapy [7], and traditional Chinese medicine (TCM) [8]. Early gastric cancer is difficult to diagnose because of its insidious onset, so a large number of patients miss the optimal treatment window, and approximately 70% of patients are already at an advanced stage at diagnosis [9], which severely limits the efficacy of surgery and radiotherapy [9]. Although conventional chemotherapeutic agents, such as cisplatin and 5-fluorouracil, have provided enormous clinical benefits for patients with advanced gastric cancer, drug resistance and cytotoxicity remain a major challenge to treatment [10,11]. Therefore, the search for genes tightly related to the development of gastric cancer is highly valuable for clarifying the pathogenesis of gastric cancer and identifying potential drugs to prevent and treat it.
TCM has historically been known for its multitargeting and low adverse effects, which gives it great advantages in improving the quality of life of patients with digestive system diseases [12,13]. Of note, a large number of studies have reported that TCM combined with chemotherapeutic drugs improves the survival and prognosis of patients with gastric cancer [14-16]. Mechanistic research [16] has shown that TCM and its active ingredients can exert antitumor effects through various mechanisms, such as inhibition of cell proliferation, interference with angiogenesis, repression of cell motility, and regulation of inflammation-related factors. The anti-gastric cancer effects of TCM and its active ingredients therefore should not be underestimated. This study set out to identify the hub genes affecting gastric carcinogenesis by mining diverse bioinformatics databases and to search for potential therapeutic targets, providing a bioinformatics basis for the discovery of active ingredients of TCM, which were finally assessed by molecular docking and molecular dynamics simulation techniques.

Data Collection.
The transcriptome data of the tumor tissues of 375 patients with gastric cancer and 32 matched normal tissues were downloaded from The Cancer Genome Atlas (TCGA) database (https://cancergenome.nih.gov; accessed on November 15, 2021). The Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/gds/; accessed on November 20, 2021) was searched with gastric cancer as the keyword for datasets of differentially expressed gene (DEG) profiles between gastric cancer and adjacent normal tissues, and two datasets that matched the requirements of this study (GSE103236 and GSE54129) were selected. The GSE103236 microarray data contained 10 gastric cancer samples and 9 normal tissue samples, and the GSE54129 microarray data consisted of 111 gastric cancer samples and 21 normal tissue samples.

Screening of DEGs between Gastric Cancer and Adjacent Normal Tissues.
DEGs in gastric cancer were screened in the TCGA transcriptome data (tumor tissues of patients with gastric cancer and 32 matched normal tissues) using the R language software, with |log2 fold change (log2FC)| ≥ 2 and false discovery rate (FDR) < 0.01 as screening criteria. The expression matrices of gastric cancer and normal tissues in the GSE103236 and GSE54129 microarray samples were each analyzed for differential expression with GEO2R, using |log2FC| ≥ 1 and FDR < 0.05 as screening criteria. The DEGs obtained from the two microarray datasets were then combined. Finally, the DEGs from TCGA were intersected with the combined DEGs from the two GEO datasets, and a Venn diagram was plotted to obtain the intersecting genes, that is, the gastric cancer-related DEGs.
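The screening logic described above (threshold each dataset, take the union of the two GEO DEG sets, then intersect with the TCGA DEG set) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study used R and GEO2R, and the gene names and statistics below are invented placeholders, not data from the study.

```python
from typing import Dict, Set, Tuple

# Hypothetical input layout: gene symbol -> (log2 fold change, FDR).
GeneStats = Dict[str, Tuple[float, float]]


def select_degs(table: GeneStats, min_abs_log2fc: float, max_fdr: float) -> Set[str]:
    """Keep genes with |log2FC| >= min_abs_log2fc and FDR < max_fdr."""
    return {
        gene
        for gene, (log2fc, fdr) in table.items()
        if abs(log2fc) >= min_abs_log2fc and fdr < max_fdr
    }


def screen_gastric_degs(tcga: GeneStats, geo_a: GeneStats, geo_b: GeneStats) -> Set[str]:
    """Union of the two GEO DEG sets, intersected with the TCGA DEG set."""
    tcga_degs = select_degs(tcga, min_abs_log2fc=2.0, max_fdr=0.01)
    geo_degs = select_degs(geo_a, 1.0, 0.05) | select_degs(geo_b, 1.0, 0.05)
    return tcga_degs & geo_degs


# Placeholder values for illustration only.
toy_tcga = {"GENE_A": (3.1, 1e-6), "GENE_B": (2.4, 0.002),
            "GENE_C": (0.2, 0.8), "GENE_D": (2.0, 0.02)}
toy_geo_a = {"GENE_A": (1.8, 0.001), "GENE_C": (0.1, 0.9)}
toy_geo_b = {"GENE_B": (-1.2, 0.01), "GENE_D": (1.5, 0.03)}

shared_degs = screen_gastric_degs(toy_tcga, toy_geo_a, toy_geo_b)
```

In this toy input, GENE_D passes the GEO criteria but fails the stricter TCGA FDR cutoff, so it drops out of the intersection.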

Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway Enrichment Analyses.
Metascape (https://metascape.org/; visited on November 25, 2021) is a repository that annotates the biological functions of genes and proteomes. In this study, the Metascape database was employed to perform the GO and KEGG enrichment analyses on the obtained DEGs. P < 0.05 was adopted as the threshold for significant enrichment of the DEGs in pathways and in the biological process (BP), molecular function (MF), and cellular component (CC) categories.
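For readers unfamiliar with how an enrichment P value arises, the sketch below computes a one-sided hypergeometric tail probability, the standard statistic in over-representation analysis. It is an assumption that this matches the statistic behind the P < 0.05 threshold used here; the function name and the numbers are hypothetical.

```python
from math import comb


def enrichment_p_value(background: int, term_size: int, list_size: int, overlap: int) -> float:
    """P(X >= overlap) for a hypergeometric draw.

    background: number of genes in the reference set
    term_size:  number of background genes annotated with the term
    list_size:  number of genes in the query list (e.g. the DEGs)
    overlap:    number of query genes annotated with the term
    """
    upper = min(term_size, list_size)
    total = comb(background, list_size)
    tail = sum(
        comb(term_size, k) * comb(background - term_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 10 background genes, 3 carry the term,
# 4 genes in the query list, 2 of which carry the term.
p_small_example = enrichment_p_value(10, 3, 4, 2)
```

In practice the P values across all tested terms would also be corrected for multiple testing, which this sketch omits.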

Construction of the Protein-Protein Interaction (PPI) Network and Acquisition of Hub Genes.
The intersecting genes were imported into the STRING database (https://string-db.org/; visited on December 5, 2021). The PPI network of significant DEGs was constructed by setting the interaction score threshold to >0.400, limiting the species to human, and hiding the free genes disconnected from the network, followed by the export of the string-interactions file. The string-interactions file was imported into the Cytoscape software. Afterward, the PPI network was scored twice using the CytoNCA plug-in for betweenness (BC), closeness (CC), degree (DC), eigenvector (EC), local average connectivity-based method (LAC), network (NC), subgraph (SC), and information (IC) centralities. Genes with scores greater than the median value were retained as the hub genes of the PPI network.
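The median-based hub gene screening can be illustrated as follows. This is a simplified sketch, not the CytoNCA workflow itself: it assumes that a gene must exceed the median on every centrality metric, and it reuses the same scores in the second round, whereas a real second pass would recompute centralities on the reduced network. The node names and scores are made up.

```python
from statistics import median
from typing import Dict, List, Set

# Hypothetical layout: gene -> {metric name: centrality score}.
Scores = Dict[str, Dict[str, float]]


def filter_above_median(scores: Scores, metrics: List[str]) -> Scores:
    """Keep genes whose score exceeds the median on every metric."""
    cutoffs = {m: median(s[m] for s in scores.values()) for m in metrics}
    return {
        gene: s
        for gene, s in scores.items()
        if all(s[m] > cutoffs[m] for m in metrics)
    }


def screen_hub_genes(scores: Scores, metrics: List[str], rounds: int = 2) -> Set[str]:
    """Apply the median filter repeatedly (twice by default)."""
    current = scores
    for _ in range(rounds):
        if not current:
            break
        current = filter_above_median(current, metrics)
    return set(current)


# Invented scores for six nodes and two of the eight metrics (DC and BC).
toy_scores = {
    "n1": {"DC": 10, "BC": 0.90},
    "n2": {"DC": 8, "BC": 0.80},
    "n3": {"DC": 6, "BC": 0.70},
    "n4": {"DC": 4, "BC": 0.20},
    "n5": {"DC": 2, "BC": 0.10},
    "n6": {"DC": 1, "BC": 0.05},
}

hubs = screen_hub_genes(toy_scores, ["DC", "BC"])
```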

Survival Analysis and Validation of Hub Genes.
The Gene Expression Profiling Interactive Analysis (GEPIA) database (https://gepia.cancer-pku.cn; accessed on December 15, 2021) [17] contains gene expression profiles of various tumors and can be used to relate the mRNA expression of genes to the prognosis of gastric cancer and to explore the impact of high and low gene expression on the overall survival of patients with gastric cancer. The hub genes obtained from the PPI network analysis were sequentially imported into the GEPIA database. The patients in the corresponding dataset were divided into high- and low-expression groups by the median gene expression, followed by statistical analysis using the log-rank test. P < 0.05 was considered statistically significant and served as the basis for identifying whether hub genes were correlated with the prognosis of patients, and the hazard ratio (HR) indicated the risk of cancer progression or death in patients with high gene expression relative to those with low gene expression. The GEPIA database was also used to verify the expression of the hub genes in gastric cancer and normal gastric tissues detected by RNA sequencing, with the results shown in box plots.
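To make the median split and the log-rank comparison concrete, the following is a minimal pure-Python sketch of a two-group log-rank test. It is not GEPIA's implementation; the function names are hypothetical, the survival times are invented, and the P value uses the chi-square distribution with one degree of freedom.

```python
import math
from statistics import median
from typing import List, Sequence, Tuple


def split_by_median(expression: Sequence[float]) -> List[bool]:
    """True marks a patient in the high-expression group (above the median)."""
    cutoff = median(expression)
    return [value > cutoff for value in expression]


def logrank_test(times_a: Sequence[float], events_a: Sequence[int],
                 times_b: Sequence[float], events_b: Sequence[int]) -> Tuple[float, float]:
    """Return (chi-square statistic, P value); events are 1 = death, 0 = censored."""
    samples = [(t, e, 0) for t, e in zip(times_a, events_a)]
    samples += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in samples if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        at_risk_a = sum(1 for ti, _, g in samples if ti >= t and g == 0)
        at_risk_b = sum(1 for ti, _, g in samples if ti >= t and g == 1)
        deaths_a = sum(1 for ti, e, g in samples if ti == t and e and g == 0)
        deaths_b = sum(1 for ti, e, g in samples if ti == t and e and g == 1)
        n = at_risk_a + at_risk_b
        d = deaths_a + deaths_b
        observed_a += deaths_a
        expected_a += d * at_risk_a / n
        if n > 1:
            variance += d * (at_risk_a / n) * (at_risk_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Invented toy data: group A dies early, group B dies late.
chi2_stat, p_val = logrank_test([1, 2, 3], [1, 1, 1], [4, 5, 6], [1, 1, 1])
```

The hazard ratio itself is usually estimated with a Cox proportional hazards model, which is beyond this sketch.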

The Pathology Section Assay of the Hub Genes.
The expression of the proteins encoded by the hub genes in gastric cancer and normal gastric tissues was examined using the Human Protein Atlas (HPA) database (https://www.proteinatlas.org; accessed on December 25, 2021), and representative immunohistochemistry staining images were collected.

Screening of Active Ingredients of TCM Targeting Core Pathogenic Genes of Gastric Cancer.
The Comparative Toxicogenomics Database (CTD, https://ctdbase.org; visited on December 30, 2021), an innovative digital ecosystem that relates toxicological information for chemicals, genes, phenotypes, diseases, and exposures, can be applied to study the interaction between gene targets and active ingredients of TCM [18]. The hub genes were entered into the search box to retrieve the compounds interacting with the hub genes, with human as the species. Thereafter, the compounds were exported as an Excel sheet, and the active ingredients of TCM among them were analyzed using an interaction value >1 as the screening criterion.

Molecular Docking.
The structural formulas of the active components were downloaded from the PubChem database (https://pubchem.ncbi.nlm.nih.gov/; accessed on January 5, 2022). The corresponding three-dimensional (3D) structures were created with the Chem3D software and exported in mol2 format. Then, the structures of the hub protein domains were downloaded in PDB format from the Protein Data Bank database (https://www.rcsb.org/). Water molecules and phosphate groups were removed from the protein using the PyMOL software, and AutoDockTools 1.5 was employed to convert the files of the active components and the hub proteins to pdbqt format and to search for the active pocket. Finally, the Vina script was run to calculate the molecular binding energy, followed by the display of the molecular docking results. Meanwhile, Discovery Studio 2019 was run to find the docking sites and calculate the flexible-docking LibDockScore. The output molecular docking results were imported into the PyMOL software for the display of molecular docking conformations. A binding energy of less than 0 indicated that the ligand and the receptor could bind spontaneously. When the Vina binding energy was less than −5.0 kcal·mol⁻¹ and the LibDockScore was greater than 100, the ligand-receptor complex was considered to form a stable docking. The molecular docking results of the ligand-receptor complexes were displayed in 3D and 2D to evaluate the reliability of the bioinformatics analyses and predictions.
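The two docking criteria stated above (spontaneous binding below 0 and stable docking below −5.0 kcal·mol⁻¹ with a LibDockScore above 100) can be written as a small filter. The class, field names, and example scores below are hypothetical and do not reproduce the values in the paper's tables.

```python
from dataclasses import dataclass


@dataclass
class DockingResult:
    ligand: str
    receptor: str
    vina_energy_kcal_mol: float  # AutoDock Vina binding energy
    libdock_score: float         # Discovery Studio LibDockScore


def binds_spontaneously(result: DockingResult) -> bool:
    """A negative binding energy is read as spontaneous binding."""
    return result.vina_energy_kcal_mol < 0.0


def is_stable_docking(result: DockingResult,
                      energy_cutoff: float = -5.0,
                      score_cutoff: float = 100.0) -> bool:
    """Both criteria must hold: energy below cutoff and score above cutoff."""
    return (result.vina_energy_kcal_mol < energy_cutoff
            and result.libdock_score > score_cutoff)


# Invented example values, for illustration only.
candidates = [
    DockingResult("ligand_X", "receptor_1", -7.2, 112.5),
    DockingResult("ligand_Y", "receptor_1", -6.1, 88.0),
    DockingResult("ligand_Z", "receptor_2", -3.4, 120.0),
]
stable = [c.ligand for c in candidates if is_stable_docking(c)]
```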

Molecular Dynamics Simulation.
The optimal conformation in the molecular docking was used as the initial structure for further molecular dynamics simulations. Based on the docked complex, all-atom molecular dynamics simulation was carried out using the classical molecular dynamics simulation software GROMACS (2020.06) to analyze the interaction mechanism and verify the reliability of the binding model. The Amber99SB-ILDN force field parameters were used for receptor proteins and ligand molecules, and the ligand molecular topology file was generated using the Antechamber and ACPYPE programs. A dodecahedral solvation box was selected, with the nearest distance between the system boundary and the complex set to 1.5 nm. Then, the TIP3P water model was selected, and Na+ or Cl− ions were randomly added to neutralize the charge carried by the system, with the Verlet cutoff scheme applied. The energy of the system was then minimized. The system was equilibrated in the NVT ensemble, with the temperature kept at 300 K, and in the NPT ensemble, with the pressure held constant at 101.325 kPa. Starting from the equilibrated system, an unrestrained production simulation was run for 100 ns. The root mean square deviation (RMSD) was used to represent the degree of molecular structure change and thus to measure the stability of the complex system. The root mean square fluctuation (RMSF) and the radius of gyration (Rg) were used to analyze the fluctuation and folding compactness of the protein structures. The change in the protein binding cavity was reflected by the solvent-accessible surface area (SASA). After analyzing the change in the number of hydrogen bonds between receptor proteins and ligand molecules over the simulation time, the receptor-ligand binding free energy was calculated using the molecular mechanics Poisson-Boltzmann surface area (MM-PBSA) method, and the trajectory data were analyzed using the Molecular Dynamics module.
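The three structural metrics named above have simple definitions, sketched here in plain Python. This is an illustrative simplification rather than the GROMACS analysis tools: it assumes that every frame has already been superimposed on the reference structure, and the coordinates (in nm) are invented.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]
Frame = Sequence[Coord]


def rmsd(frame: Frame, reference: Frame) -> float:
    """Root mean square deviation of one frame from the reference (no fitting)."""
    squared = sum((a - b) ** 2 for p, q in zip(frame, reference) for a, b in zip(p, q))
    return math.sqrt(squared / len(reference))


def rmsf(trajectory: Sequence[Frame]) -> List[float]:
    """Per-atom fluctuation around the time-averaged position."""
    n_frames = len(trajectory)
    result = []
    for i in range(len(trajectory[0])):
        mean = [sum(f[i][k] for f in trajectory) / n_frames for k in range(3)]
        msd = sum(
            sum((f[i][k] - mean[k]) ** 2 for k in range(3)) for f in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


def radius_of_gyration(frame: Frame, masses: Sequence[float]) -> float:
    """Mass-weighted root mean square distance from the center of mass."""
    total = sum(masses)
    com = [sum(m * p[k] for m, p in zip(masses, frame)) / total for k in range(3)]
    weighted = sum(
        m * sum((p[k] - com[k]) ** 2 for k in range(3)) for m, p in zip(masses, frame)
    )
    return math.sqrt(weighted / total)


# Invented two-atom example.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1)]
example_rmsd = rmsd(shifted, reference)
```

A flat RMSD or Rg curve over time, rather than any single value, is what indicates a stable complex.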

Selection of DEGs between Gastric Cancer and Adjacent Normal Tissues.
The 501 and 3893 statistically significant DEGs obtained from the GSE103236 and GSE54129 datasets, respectively, were merged and de-duplicated to obtain 4249 genes. In addition, 2891 DEGs that met the screening criteria were acquired from the TCGA database. The two gene sets were entered into the Draw Venn Diagram tool, which yielded a total of 291 genes shared by the GEO and TCGA data, that is, significant gastric cancer-related DEGs (Figure 1).

GO and KEGG Enrichment Analyses of Significant DEGs in Gastric Cancer Tissues versus Adjacent Normal Tissues.
The results of the GO analysis showed that the DEGs were mainly involved in the extracellular matrix (ECM), collagen-containing extracellular matrix, structural molecule activity, and external encapsulating structure (Figure 2).
The KEGG analysis showed that the DEGs were mainly enriched in the IL-17 signaling pathway, TNF signaling pathway, protein digestion and absorption, gastric acid secretion, transcriptional misregulation in cancer, ECM-receptor interaction, focal adhesion, PI3K-Akt signaling pathway, cell cycle, and p53 signaling pathway (Figure 3).

The Survival Analysis and Verification of Hub Genes.
The prognostic value of the 12 hub genes was evaluated using the GEPIA database, which showed that five hub genes, SERPINE1, COL1A1, MMP12, COL5A2, and VCAN, had a significant effect on the overall survival of patients (Figure 5). In Figure 5, a log-rank P < 0.05 indicated that the survival of patients with high gene expression was statistically different from that of patients with low gene expression. HR represented the risk of cancer progression or death in patients with high gene expression relative to those with low gene expression. For instance, HR = 1.5 suggested that the risk of cancer progression or death in patients with high gene expression was 1.5 times that in patients with low gene expression. The results showed that the high expression of SERPINE1, COL1A1, COL5A2, and VCAN was associated with poor overall survival (P < 0.05; HR > 1), whereas the high expression of MMP12 predicted a favorable prognosis (P < 0.05; HR < 1). Furthermore, the differences in the expression of these five genes between gastric cancer and normal gastric tissues were confirmed in the GEPIA database, which showed that the mRNA levels of the five genes were substantially higher in gastric cancer samples than in normal gastric samples (P < 0.05; Figure 6).

Verification of Hub Genes by the Pathology Section Assay.
In addition to the mRNA levels of the hub genes, their protein levels were examined by immunohistochemistry analysis through the HPA database. Because immunohistochemistry staining information for COL5A2, MMP12, and SERPINE1 in gastric cancer was lacking, the representative staining results of COL1A1 and VCAN were selected and are shown in Figure 7. Positive staining of COL1A1 and VCAN was more frequent in gastric cancer tissues than in normal gastric tissues, indicating elevated protein levels of COL1A1 and VCAN in gastric cancer tissues. These results were concordant with the mRNA results and validated our findings in another way.

Retrieval of Active Ingredients of TCM Targeting Genes Closely Related to Gastric Carcinogenesis.
Based on the CTD database, the active compounds of TCM that acted on SERPINE1 comprised quercetin, resveratrol, and emodin. The active compounds of TCM that targeted COL1A1 comprised quercetin, resveratrol, and schisandrin B (Table 1).

Results of Molecular Docking.
The results of the Vina docking showed that the hub proteins (SERPINE1 and COL1A1) could form stable docking with the corresponding active compounds of TCM, with binding energies lower than −5.0 kcal·mol⁻¹ (Table 2). In addition, the active ingredients of TCM were docked with the corresponding target proteins using the Discovery Studio 2019 software, followed by the calculation of the LibDockScore. Docking sites were observed for all hub proteins (SERPINE1 and COL1A1) and active ingredients of TCM. Among them, the docking models formed by SERPINE1 with quercetin, resveratrol, and emodin and by COL1A1 with quercetin all had a LibDockScore greater than 100, whereas the other docking models had a LibDockScore of less than 100. Finally, the results output by Vina were imported into the PyMOL software, and the 3D and 2D molecular docking of proteins and ligands was displayed using the Discovery Studio 2019 software. Figure 8 depicts the best combinations of docking between target proteins and active compounds: SERPINE1 (3UT3)-Emodin and COL1A1 (7DV6)-Quercetin.
Molecular dynamics simulations were performed on the abovementioned conformations, SERPINE1 (3UT3)-Emodin and COL1A1 (7DV6)-Quercetin. RMSD measures the distance between the same atoms at different simulation times and thus reveals how far the protein conformation has moved from the initial conformation during the simulation. The trend of the RMSD of proteins and ligands is also an important index of whether the simulation is stable. The analysis showed that the SERPINE1-Emodin complex exhibited a certain degree of stability during the simulation, with a mean RMSD of 0.215 nm (max = 0.284 nm, min = 0.102 nm). The RMSD of the system increased slowly within 5-40 ns, and the curve tended to be stable after 40 ns. Meanwhile, a peak fluctuation appeared in the RMSD curve of the whole complex with the SERPINE1 protein after 80 ns, whilst the curve of emodin remained stable. It was speculated that the protein might undergo a certain conformational transformation at this time (Figure 9(a)), rather than a disturbance caused by unstable binding of emodin. The RMSD of the COL1A1-Quercetin complex system was in the range of 0.162-0.286 nm (mean = 0.229 nm). The RMSD rose continuously within 0-20 ns; the curve then converged and remained stable after 20 ns and fluctuated (max = 0.284 nm) after 90 ns, which was likewise attributed to the conformational change of the protein itself. Notably, the emodin and quercetin molecules showed a high degree of stability throughout the simulation, further supporting the reliability and stability of the binding (Figure 9(b)).

Results of RMSF.
RMSF is the time-averaged fluctuation of atomic positions, which characterizes the flexibility of the protein structure and the intensity of motion throughout the simulation. In this study, RMSF values were used to assess the structural flexibility of the ligand-bound protein and the fluctuation of the active amino acids involved in binding. As shown in Figure 10(a), the SERPINE1 protein contained a variety of flexible regions (max = 0.297 nm), which were mainly loop structures. The RMSF values of amino acid residues in other regions were less than 0.2 nm, demonstrating a certain structural stability (mean = 0.095 nm).
As shown in Figure 10(b), the COL1A1 protein contained approximately five loop segments with obvious fluctuation, with the highest RMSF value at the free end (max = 0.482 nm). On the other hand, the fluctuation of the residues in the quercetin binding cavity (THR325, ALA326, HIS328, and ASN332) was considerably lower than that of the residues in other regions, indicating that the persistent interaction between quercetin and the COL1A1 protein could stabilize the related structures and residues.

Evidence-Based Complementary and Alternative Medicine

Hydrogen Bonds.
Hydrogen bonds play a key role in the formation and maintenance of the complex and also affect the stability of ligand-protein binding. The hydrogen bond formation between ligand molecules and receptor proteins was monitored over the 100 ns dynamics simulation. The results showed that the emodin molecule steadily formed one hydrogen bond with the SERPINE1 protein (Figure 11(a)). In addition, the COL1A1-Quercetin complex formed two hydrogen bonds on average (Figure 11(b)) and continued to form hydrogen bonds with residues HIS328 and ASN332. These hydrogen bonds also stabilized the binding conformation of the complex.

Rg.
Rg characterizes the compactness of the protein structure and reflects changes in the looseness of the protein peptide chain during the simulation. This study analyzed the structural compactness of the SERPINE1 and COL1A1 proteins after ligand binding and assessed whether the ligands loosened the protein structure or affected its normal folding. In general, the smaller the Rg value, the more compact and stable the structure. However, this value is also influenced by the structure of the protein itself, so attention also needs to be paid to the stability of the curve. As shown in Figure 12(a), there was no marked change in the conformational folding of the SERPINE1 protein bound to the emodin molecule, with the value in the range of 2.129-2.153 nm (mean = 2.113 nm) during the whole simulation. As shown in Figure 12(b), [...] it could be considered that the protein structure was stable and that the binding of quercetin did not affect the conformation of the COL1A1 protein.

Results of SASA.
The SASA of the SERPINE1 protein is shown in Figure 13(a), with an average value of 163.399 nm² (max = 172.691 nm², min = 151.246 nm²) and periodic changes. It was therefore speculated that the protein underwent a conformational transition at around 25 ns, but the curve remained highly consistent and stable overall. Examination of the SERPINE1 protein structure showed several loop structures around the emodin molecule, located in the binding cavity and fluctuating to some extent, which influenced the contact between local solvent molecules and the protein. However, the stable conformation of emodin did not significantly change the overall SASA value.
The SASA value of the COL1A1 protein fluctuated in the range of 147.332-168.371 nm² (mean = 155.392 nm²) [...] (Figure 13(b)). The result reflected that the quercetin molecule occupied the protein binding cavity and displaced the water molecules inside, thereby reducing its SASA value. The high stability of the curve demonstrated that quercetin bound stably in the cavity without obvious conformational changes. In addition, when the SASA value was decomposed by amino acid residue, the residues binding to quercetin (THR325, ALA326, HIS328, and ASN332) had lower SASA values, which also supported the high stability of the abovementioned binding conformation.

Binding Free Energy.
The binding free energy of the complexes was calculated using the widely applied g_mmpbsa script [19]. The values of each energy term are shown in Table 3. The binding strength of ligand molecules to target proteins was quantified by ΔGbind. The binding free energy was −96.588 kJ/mol for the SERPINE1-Emodin complex. The residue TYR79 (−15.773 kJ/mol) in the SERPINE1 protein made the most prominent energy contribution, followed by PHE117 (−6.127 kJ/mol), ARG118 (−4.348 kJ/mol), MET45 (−3.066 kJ/mol), and LEU75 (−1.928 kJ/mol). The hot spot residues that interacted with emodin were distributed around it, ensuring the stability of the binding during the simulation (Figure 14(a)). The binding free energy between the COL1A1 protein and quercetin (−114.307 kJ/mol) was lower than that of the SERPINE1-Emodin complex. Among the hot spot residues, LEU386 (−8.307 kJ/mol), LEU305 (−5.992 kJ/mol), and VAL258 (−5.99 kJ/mol) made the largest contributions to the binding free energy and played key parts in maintaining the binding mode of the COL1A1-Quercetin complex (Figure 14(b)).
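For orientation, the conventional MM-PBSA decomposition of the binding free energy is written out below. It is given as the usual textbook form under the assumption that the entropic term is omitted; the individual terms reported in Table 3 are not reproduced here. The second part converts the two reported totals to kcal/mol using 1 kcal = 4.184 kJ, which allows a rough comparison with the docking energies.

```latex
\begin{align}
\Delta G_{\mathrm{bind}} &= G_{\mathrm{complex}} - \left( G_{\mathrm{protein}} + G_{\mathrm{ligand}} \right) \\
&\approx \Delta E_{\mathrm{vdW}} + \Delta E_{\mathrm{elec}} + \Delta G_{\mathrm{polar}} + \Delta G_{\mathrm{nonpolar}}
\end{align}

\begin{align}
\Delta G_{\mathrm{bind}}(\text{SERPINE1-Emodin}) &= -96.588\ \mathrm{kJ/mol} \approx \frac{-96.588}{4.184}\ \mathrm{kcal/mol} \approx -23.09\ \mathrm{kcal/mol} \\
\Delta G_{\mathrm{bind}}(\text{COL1A1-Quercetin}) &= -114.307\ \mathrm{kJ/mol} \approx \frac{-114.307}{4.184}\ \mathrm{kcal/mol} \approx -27.32\ \mathrm{kcal/mol}
\end{align}
```

A more negative value indicates stronger predicted binding, which is why the COL1A1-Quercetin complex is described as having the lower binding free energy.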

Discussion
Despite the advances in therapies for gastric cancer, overall survival and prognosis remain unsatisfactory. In recent years, the rapid development of various bioinformatics technologies has provided a viable avenue for the discovery of novel tumor-related diagnostic and therapeutic biomarkers.
In this study, DEGs in gastric cancer were first identified by analyzing gene expression data from the TCGA and GEO databases, which yielded a total of 291 intersecting DEGs. The GO enrichment analysis revealed that the DEGs were primarily enriched in ECM, collagen-containing extracellular matrix, structural molecule activity, and external encapsulating structure. The KEGG results showed that the DEGs were predominantly enriched in the ECM-receptor interaction signaling pathway, the PI3K/Akt signaling pathway, the p53 signaling pathway, and so on. The ECM is physiologically essential for intercellular signal transmission, intercellular interaction, and orchestration of cell proliferation, differentiation, and migration [20,21]. The ECM can impede tumor cell migration and invasion, and when its integrity is compromised, tumor cells are more prone to migrate and invade the microenvironment [22]. The PI3K/Akt signaling pathway regulates a wide range of biological behaviors of cells, and abnormalities in this pathway may trigger the development of gastric cancer [23]. In addition, the p53 signaling pathway is one of the most classic tumor-suppressive pathways, and p53 transcription factors are implicated in numerous transcriptional and cellular processes, such as maintenance of genomic stability, cell metabolism, cell apoptosis, and cell migration/invasion [24,25]. A prior study [26] found that p53 overexpression dramatically represses the growth and metastasis of tumor cells, which provides a molecular basis for the anticancer effect of p53 and is consistent with the mutation of the p53 locus in nearly half of cancer patients [24]. In addition, the cell cycle signaling pathway is fundamental to cell proliferation, and its enhanced activity can lead to tumor progression [27]. Of note, the deregulation of the cell cycle signaling pathway is a critical cause of uncontrolled cell proliferation [28]. In summary, extensive research has shown that the ECM, PI3K/Akt, p53, and cell cycle signaling pathways are tightly associated with the development of gastric cancer, supporting the reliability of the bioinformatics analysis results in this study for the regulatory network of gastric cancer pathogenesis. Our study predicted 12 hub genes with a strong correlation with gastric carcinogenesis. Moreover, combined with the prognostic value, it was observed that the alterations of five genes, COL1A1, COL5A2, MMP12, SERPINE1, and VCAN, were strongly related to the overall survival of patients. COL1A1 and COL5A2 both belong to the collagen family, which is a major component of the ECM [29]. More importantly, the upregulation of collagens plays a critical role in the promotion of tumor growth. COL1A1, the most abundant protein in the collagen family, is a primary component of the ECM that can affect cell behaviors and tissue structures [30]. Li et al. [31] concluded that COL1A1 suppressed proliferation, migration, and invasion of gastric cancer cells. In addition, Zhang et al. [32] further demonstrated that ectopic COL1A1 facilitated gastric cancer cell proliferation in vitro. A mechanistic study elucidated that COL5A2 might accelerate tumor progression through hypoxia, coagulation, apical junction, angiogenesis, and apoptosis [33]. Tan et al. [34] discovered that COL5A2 upregulation contributed to the facilitation of gastric cancer cell migration. Together, these studies indicate that COL1A1 and COL5A2 may be not only reliable biomarkers of gastric cancer cell proliferation and metastasis but also key predictors of poor prognosis in patients with gastric cancer.
Te dysregulation of MMP12 (also known as human macrophage metalloelastase) has been hypothesized to be linked to all sorts of cancers, such as gastric cancer [35], but it predicts diferent prognoses in diferent tissues. Cheng et al. [36] found that the high expression of MMP12 predicted a good prognosis in gastric cancer due to a tight correlation with reduced angiogenesis and vascular infltration, which could function as a valid predictor for patients with gastric cancer. Consistently, the present study also elaborated that gastric cancer patients with MMP12 high expression had longer overall survival.
A prior study [37] reported that the SERPINE1 gene, also termed PAI-1, was associated with oncogene activation. Another mechanistic study unraveled that SERPINE1 enhanced metastasis in gastric cancer and accelerated peritoneal tumor growth in a mouse model of gastric cancer metastasis [38]. Also, it was clarifed in previous research [39] that SERPINE1 was a potent biomarker correlated with epithelial-mesenchymal transition in gastric cancer. Te research of Yang et al. [40] identifed that SERPINE1 could promote tumor cell proliferation, migration, and invasion by manipulating EMT and that SERPINE1 overexpression culminated in a poorer prognosis and could be an independent prognostic factor for patients with gastric adenocarcinoma.
VCAN is a multifunctional member of the proteoglycan family, which is a main component of the ECM. It has been documented that VCAN is aberrantly expressed in a wide range of tumors, such as breast [41], ovarian [42], and colorectal [43] tumors, and plays a pivotal role in tumor cell invasion, metastasis, and immune infiltration. Accumulating research [44,45] has shown that VCAN is highly expressed in gastric cancer and closely related to patient survival, and it might therefore act as an essential prognostic marker for patients with gastric cancer. Huang et al. [46] suggested that VCAN might affect the development of gastric cancer by modulating the tumor microenvironment and might be a potential therapeutic target for gastric cancer. The prediction in our research is consistent with the abovementioned findings and indicates that VCAN may represent a novel prognostic biomarker for gastric cancer.
The search for target genes is a very important part of the drug discovery process. Intriguingly, a growing number of molecules, compounds, and drugs have been noted to share complex relationships with a large number of genes and proteins [47–49]. TCM and natural compounds contain a large number of active ingredients, which provides more possibilities and opportunities for drug development and use. In this study, a total of four potential active ingredients in TCM were identified based on the CTD, including quercetin, resveratrol, emodin, and schisandrin B, all of which could be utilized in the prevention and treatment of gastric cancer.
It is widely accepted that a Vina binding energy to the receptor protein < −5.0 kcal·mol⁻¹ or a LibDockScore > 100 indicates strong binding of a compound. Our data showed that all four screened active ingredients of TCM had binding energies below −5.0 kcal·mol⁻¹ with the corresponding protein receptor molecules in the docking results. At the molecular level, these results support the potential therapeutic effects of the four active ingredients on gastric cancer; in particular, the SERPINE1-emodin and COL1A1-quercetin complexes not only had Vina binding energies < −5.0 kcal·mol⁻¹ but also had LibDockScores > 100. Molecular dynamics simulations help reveal the dynamic interactions between a ligand and a receptor, their interaction mechanism, and their stability [50]. Here, we found that SERPINE1-emodin and COL1A1-quercetin showed relatively stable binding, which was consistent with the molecular docking results. Of note, van der Waals interactions, electrostatic interactions, and hydrogen bonding were most critical for their stable binding. SERPINE1 and COL1A1 have high overall flexibility and contain multiple flexible regions, which may be related to the specific structures of the proteins. Based on this, sufficient attention should be paid to research on the mechanisms by which these two complexes act in the pathogenesis and treatment of gastric cancer.
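The two docking criteria described above can be expressed as a simple filter. The following is a minimal sketch in Python and is not the authors' workflow. The `DockingResult` class, the function names, the receptor and ligand labels, and all numeric scores are hypothetical placeholders. Only the two cutoffs (Vina binding energy < −5.0 kcal·mol⁻¹ and LibDockScore > 100) are taken from the text.

```python
from dataclasses import dataclass

# Cutoffs quoted in the text; everything else in this sketch is illustrative.
VINA_CUTOFF = -5.0      # kcal/mol; energies strictly below this count as strong binding
LIBDOCK_CUTOFF = 100.0  # LibDockScore values strictly above this count as strong binding


@dataclass(frozen=True)
class DockingResult:
    """One receptor-ligand docking result (hypothetical container)."""
    receptor: str
    ligand: str
    vina_energy: float    # Vina binding energy in kcal/mol
    libdock_score: float  # LibDockScore (unitless)


def passes_vina(result: DockingResult) -> bool:
    """True if the Vina binding energy is below the -5.0 kcal/mol cutoff."""
    return result.vina_energy < VINA_CUTOFF


def passes_libdock(result: DockingResult) -> bool:
    """True if the LibDockScore is above the cutoff of 100."""
    return result.libdock_score > LIBDOCK_CUTOFF


def classify(result: DockingResult) -> str:
    """Report how many of the two criteria a complex satisfies."""
    hits = int(passes_vina(result)) + int(passes_libdock(result))
    return ("neither", "one criterion", "both criteria")[hits]


# Placeholder values for illustration only; these are NOT results from the study.
example_results = [
    DockingResult("receptor_A", "ligand_X", -7.5, 112.0),
    DockingResult("receptor_B", "ligand_Y", -5.6, 91.0),
    DockingResult("receptor_C", "ligand_Z", -4.2, 80.0),
]

for r in example_results:
    print(f"{r.receptor}-{r.ligand}: {classify(r)}")
```

Because the text joins the two criteria with "or", a complex labeled "one criterion" would already count as strongly binding, whereas "both criteria" corresponds to the stricter condition reported for the two highlighted complexes.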
Quercetin is a common flavonoid and an active ingredient in numerous Chinese herbal medicines. Quercetin has been reported to exhibit antioxidant [51], anti-inflammatory [52,53], and antimicrobial activities [54,55] and is also considered an anticancer agent [56]. A large body of epidemiological evidence suggests that the consumption of quercetin-rich vegetables and fruits may prevent the development of several cancers [57,58]. Quercetin exerts antitumor effects on gastric cancer cells by inducing apoptosis [59]. Another study [60] showed that quercetin could modulate the Akt-mTOR and hypoxia-inducible factor 1α (HIF-1α) signaling pathways to activate the autophagic process in gastric cancer cells. Quercetin can also restrict gastric cancer cell metastasis by blocking uPA/uPAR function [61].
Emodin, the principal active ingredient of the Chinese herbal medicines Rheum officinale, Polygonum multiflorum, and Aloe leaves, is an anthraquinone derivative with various pharmacological activities, including antioxidant, anticancer, and anti-inflammatory effects [62]. Notably, prior research [63] has revealed that emodin can suppress cell proliferation, facilitate cell apoptosis, and alter cell redox status, invasion, metastasis, and tumor angiogenesis. Emodin has also been shown to impede the growth of lung, colon, and gastric cancer cells [64,65].

Conclusion
Altogether, we analyzed the prognostic value of the hub genes and elucidated the interactions between the genes in gastric cancer pathogenesis. The bioinformatics-based functional analysis revealed the enrichment of the ECM-receptor interaction, PI3K/Akt, and p53 signaling pathways in gastric cancer. In addition, we mined the active ingredients of TCM targeting the hub genes. Our results show that bioinformatics combined with molecular docking and molecular dynamics simulations can not only screen hub pathogenic genes and potential active ingredients of medicines but also reveal the binding pattern of small-molecule ligands to disease-related protein receptors. We hope this study provides a novel perspective for biomarker screening and the selection of TCM active ingredients in gastric cancer.

Data Availability
The datasets generated and/or analysed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.