Progressive and Prognostic Performance of an Extracellular Matrix-Receptor Interaction Signature in Gastric Cancer

The role of an extracellular matrix (ECM)-receptor interaction signature has not been fully clarified in gastric cancer. This study performed comprehensive analyses of differentially expressed ECM-related genes, their relation to clinicopathologic features, and their prognostic application in gastric cancer. Differentially expressed genes between tumorous and matched normal tissues in The Cancer Genome Atlas (TCGA) and validation cohorts were identified by a paired t-test. Consensus clusters were built to find the correlation between clinicopathologic features and subclusters. Then, the least absolute shrinkage and selection operator (lasso) method was used to construct a risk score model. Correlation analyses were performed to reveal the relation between risk score-stratified subgroups and clinicopathologic features or significant signatures. In the TCGA (26 pairs) and validation (134 pairs) cohorts, 25 ECM-related genes were significantly upregulated and 11 genes were downregulated in gastric cancer. ECM-based subclusters were slightly related to clinicopathologic features. We constructed a risk score model: risk score = 0.081 × log2(CD36) + 0.043 × log2(COL5A2) + 0.001 × log2(ITGB5) + 0.039 × log2(SDC2) + 0.135 × log2(SV2B) + 0.012 × log2(THBS1) + 0.068 × log2(VTN) + 0.023 × log2(VWF). The risk score model predicted the outcome of patients with gastric cancer well in both the training (n = 351, HR: 1.807, 95% CI: 1.292-2.528, P = 0.00046) and validation (n = 300, HR: 1.866, 95% CI: 1.347-2.584, P = 0.00014) cohorts. In addition, risk score-based subgroups were associated with angiogenesis, cell adhesion molecules, complement and coagulation cascades, TGF-beta signaling, and mismatch repair-relevant signatures (P < 0.0001). In univariate (HR: 1.845, 95% CI: 1.382-2.462, P < 0.001) and multivariate (HR: 1.756, 95% CI: 1.284-2.402, P < 0.001) analyses, the risk score was an independent risk factor in gastric cancer. Our findings revealed that ECM components became accomplices in the tumorigenesis, progression, and poor survival of gastric cancer.


Introduction
As a common tumor of the digestive system, gastric cancer is the fifth most common malignant tumor and the third leading cause of cancer death in the world [1,2]. Because of the occult course of gastric cancer, it is of great significance to clarify its pathogenesis and to find effective markers.
In recent years, studies have shown that extracellular matrix (ECM) remodeling, namely, the synthesis, distribution, and degradation of ECM, is closely connected to the differentiation, proliferation, invasion, and metastasis of malignant tumors [3]. ECM constitutes the main part of the extracellular microenvironment [4]. It is a complex organic whole constructed from a variety of insoluble extracellular macromolecules in a certain proportion and structure. It is the site of cell survival and activity, with physical functions such as connection, support, water retention, pressure resistance, and protection. In addition, through integrins or other cell surface receptors, it can interact directly with cells to regulate their growth, metabolism, function, migration, proliferation, and differentiation, and thus adjust the functions of whole tissues and organs [4]. Recent studies on solid tumors such as breast cancer and ovarian cancer have suggested that, during tumor progression, ECM undergoes a remodeling process similar to that of embryonic development. The reconstructed ECM then forms a loose microenvironment for cancer cells, giving rise to high proliferation, low differentiation, invasion, and metastasis of tumor cells [5]. Therefore, the identification of prominent ECM-relevant tumor markers that provide biological insight into the development and progression of gastric cancer would be of clinical value. In this study, differentially expressed ECM-relevant markers were identified between gastric cancer and normal tissues. A least absolute shrinkage and selection operator (lasso) regression model then showed that the ECM-relevant markers had considerable value in predicting the prognosis of gastric cancer.

Building a lasso Regression Model.
We conducted a univariate analysis of each ECM-receptor interaction-related gene. Genes with P < 0.05 were then selected to establish a lasso regression model, which was built with the R package "glmnet" [10]. According to the lasso model, each patient was assigned a risk score. Patients with a risk score ≥ the median value were assigned to the high-risk group (N = 175); the remaining patients were assigned to the low-risk group (N = 176).
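To illustrate how such a risk score could be computed and used for median-based stratification, a minimal Python sketch is given here. It is not the code used in this study (the analyses were run in R); the coefficients are those reported in the abstract, whereas the function names `risk_score` and `stratify` and all patient expression values are hypothetical. The sketch assumes strictly positive expression values, because it is not stated whether a pseudocount was added before the log2 transformation.

```python
import math
from statistics import median

# Coefficients of the eight-gene model as reported in the abstract.
COEFFICIENTS = {
    "CD36": 0.081,
    "COL5A2": 0.043,
    "ITGB5": 0.001,
    "SDC2": 0.039,
    "SV2B": 0.135,
    "THBS1": 0.012,
    "VTN": 0.068,
    "VWF": 0.023,
}


def risk_score(expression):
    """Weighted sum of log2 expression over the eight signature genes.

    Assumes every expression value is > 0 (no pseudocount is applied).
    """
    return sum(coef * math.log2(expression[gene])
               for gene, coef in COEFFICIENTS.items())


def stratify(scores):
    """Label a patient 'high' if the score is >= the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s >= cutoff else "low") for pid, s in scores.items()}


# Hypothetical expression values, for illustration only (not study data).
patients = {
    "P1": {gene: 2.0 for gene in COEFFICIENTS},  # log2 = 1 for every gene
    "P2": {gene: 4.0 for gene in COEFFICIENTS},  # log2 = 2 for every gene
    "P3": {gene: 8.0 for gene in COEFFICIENTS},  # log2 = 3 for every gene
}
scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = stratify(scores)
```

Because the cut-off in this sketch is the median of the cohort at hand, the same patient could fall into a different group in another cohort or with another detection method.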

Statistical Analyses.
We identified differentially expressed genes between tumorous and matched normal tissues in TCGA and validation cohorts by a paired t-test. Consensus clusters were built with the R package "ConsensusClusterPlus" [11]. We identified a consensus matrix of TCGA for k from 2 to 9. Gene set enrichment analysis (GSEA) was used to analyze the most enriched gene sets of the high- and low-risk groups [12,13]. The R packages "clusterProfiler" [14], "org.Hs.eg.db," "enrichplot," and "GOplot" [15] were applied to perform GO analyses and visualize the results. The package "GSVA" was applied to obtain single-sample gene set enrichment analysis (ssGSEA) scores of relevant signatures [16]. The package "survminer" was used to visualize the survival time of the high- and low-risk groups. A P value < 0.05 was considered to indicate a statistically significant difference. All analyses were conducted with R (https://www.r-project.org/). Hazard ratios are shown with 95% confidence intervals (95% CI).
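As an illustration of the paired comparison between tumorous and matched normal tissues, the following Python sketch computes the paired t statistic for a single gene. It is a simplified example under stated assumptions, not the R code used in this study: the function name `paired_t` and the expression values are hypothetical, and only the t statistic and degrees of freedom are returned (in practice, a library routine such as `scipy.stats.ttest_rel` would also return the P value).

```python
import math
from statistics import mean, stdev


def paired_t(tumor, normal):
    """Return (t statistic, degrees of freedom) for a paired t-test.

    Each tumor value must be matched to the normal value at the same index.
    """
    if len(tumor) != len(normal) or len(tumor) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [t - n for t, n in zip(tumor, normal)]
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    se = sd / math.sqrt(len(diffs))  # standard error of the mean difference
    return mean(diffs) / se, len(diffs) - 1


# Hypothetical log2 expression of one gene in four tumor/normal pairs.
tumor = [5.0, 6.0, 7.0, 9.0]
normal = [4.0, 4.0, 6.0, 5.0]
t_stat, dof = paired_t(tumor, normal)
```

A positive t statistic here means higher expression in tumorous than in matched normal tissue; the test would be repeated for each ECM-related gene.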

Building Consensus Clusters and Correlation between Clinicopathologic Features and Clusters.
We identified consensus matrices of TCGA for k from 2 to 9 (Figure 2(a) and Supplementary Figure 3). In consideration of discrimination and simplicity, we chose k = 2 to build consensus clusters. Principal component analysis (PCA) showed that the two consensus clusters had a certain degree of differentiation (Figure 2(b)). Patients in cluster 2 (N = 190) had worse outcomes than patients in cluster 1 (N = 185) (P = 0.0032) (Figure 2(c)). Besides, stratified clusters were slightly related to the histologic grade, cancer type, tumor stage, and TNM stage, while presenting no correlation with PIK3CA, KMT2D, PCLO, FAT4, ARID1A, LRP1B, and TP53 mutations (Figures 2(d)-2(f) and Supplementary Table 1).

Discussion
Gastric cancer is characterized by insidious onset, easy metastasis, early misdiagnosis, and a high recurrence rate [17]. Due to the lack of a simple domestic screening system, most patients with gastric cancer are at a late stage when first diagnosed, which greatly affects their therapeutic outcome and quality of survival [18]. In this context, biochemical tumor markers have received increasing attention because they are noninvasive, safe, simple, inexpensive, and easy to monitor dynamically [19]. For gastric cancer, many tumor markers have been detected from the perspective of genetic traits or genetic modification. In this study, we revealed that many ECM-relevant molecules were also effective tumor markers in gastric cancer, with important value in clinical application.
[Figure: survival probability versus time in months.]
In previous research, ECM-relevant molecules have been identified as progression and prognostic biomarkers in several other solid tumors and have been used to inform clinical decisions and overall outcomes. For example, in colon adenocarcinoma (CAC), COL1A2, THBS2, and COL1A1 were related to prognosis [20]. In addition, the level of ITGA5 in CAC was significantly linked to overall survival (OS) and might serve as an independent prognostic indicator [21]. In neuroblastoma, SDC3 expression was associated with improved prognosis [22], whereas a high expression level of SDC3 was associated with poor prognosis in patients with renal cell carcinoma [23]. In lung cancer, COL5A1 was highly expressed in patients with recurrence and short survival [24]. SSP1 was upregulated in tumor tissues, and low expression of SSP1 was significantly related to a better outcome [25]. Moreover, FN1 has been reported as a likely signature biomarker for predicting responses to treatment in lung cancer [26]. In contrast to these cancer types, ECM-receptor interaction-relevant genes have been poorly studied as progressive and prognostic biomarkers in gastric cancer. Using the KEGG database, we systematically examined 84 ECM-receptor interaction-relevant genes in this study and found that most of them were differentially expressed in gastric cancer tissues. On the basis of these genes, we divided patients into two subclusters. As expected, the subclusters exhibited good prognostic performance (P = 0.032). To better predict survival with ECM-receptor interaction-relevant genes, lasso regression analysis was then conducted. We found that eight significant genes (VTN, SV2B, CD36, VWF, ITGB5, SDC2, COL5A2, and THBS1) were related to ECM-receptor interaction, and an eight-gene risk score model was constructed based on them. The risk score model performed favorably in predicting the prognosis of gastric cancer. The eight genes may be potential prognostic markers for gastric cancer.
In a variety of tumors, such as cervical neoplasia [27], ovarian cancer [28], and prostate cancer [29], VTN, which encodes vitronectin, an adhesive glycoprotein that connects cells with the ECM, has been considered a promising biomarker. Recently, a report also revealed that VTN was a poor prognostic factor in gastric cancer [30]. Likewise, VWF, which encodes von Willebrand factor, a platelet adhesion glycoprotein, has been widely used as a biomarker in cancer and has been identified as a new therapeutic target in gastric cancer [31]. THBS1, which encodes thrombospondin 1, takes part in angiogenesis and tumor progression, and its increased expression was significantly correlated with tumor differentiation [32]. COL5A1, which encodes an alpha chain of type V collagen, was also considered a promising prognostic marker with good potential for the treatment of patients with gastric cancer [33]. CD36 expression was reported to be related to gastric cancer metastasis via O-GlcNAcylation [34]. However, the current literature mostly explores the role of a single gene in gastric cancer and rarely links genes together to explore their combined effect on gastric cancer treatment. ITGB5, which encodes integrin-β5, is thought to be involved in the regulation of tumor initiation and progression by mediating links between cells and the ECM. Reports in glioblastoma [35], hepatocellular carcinoma [36], and cervical cancer [37] indicated that ITGB5 could serve as a predictive biomarker. Gene expression analysis identified that ITGB5 expression was elevated in gastric tumor tissue [38]. Nevertheless, the function of ITGB5 in gastric cancer is not yet fully elucidated. SV2B and SDC2, which encode a member of the synaptic vesicle protein 2 family and syndecan 2, respectively, have not been fully studied in gastric cancer. SV2B was identified as a key prognosis-associated marker in glioblastoma multiforme and prostate cancer [39,40]. Despite this, the study of SV2B in tumors is still limited. By comparison, SDC2 has been well studied in various tumors, especially colorectal cancer, lung cancer, prostate cancer, and esophageal squamous cell carcinoma [41-45]. Based on the discussion above, we consider that SV2B and SDC2 deserve further study in gastric cancer. Disrupted ECM organization loses its regularity, which affects gastric cancer foci. ECM components became accomplices in the tumorigenesis, progression, and poor survival of gastric cancer. The aberrant ECM signature should be simultaneously inhibited in the treatment of gastric cancer [46].
We further investigated the possible mechanisms underlying the differences between the low- and high-risk groups. There was a significant difference in angiogenesis between the two groups. Angiogenesis has been suggested to provide nutrients for tumor growth and pathways for cell metastasis [47]. Consistent with this, the angiogenesis signature was upregulated in the high-risk group in our study. Angiogenesis also depends on the migration and proliferation of vascular endothelial cells [48]. In this process, endothelial cells must attach to each other and to the extracellular matrix to form and expand new microvessels. ECM is one of the critical influences on the survival of vascular endothelial cells [49]. Thus, we speculated that these differentially expressed genes could promote the formation of tumor blood vessels and further affect the development and prognosis of tumors. Moreover, cell adhesion molecules are one of the main mediators between cells and the ECM. Changes in cell adhesion molecules could affect multiple signaling pathways, thereby affecting the pathophysiology of cancer tissues [50]. In addition to possible changes in angiogenesis and cell adhesion molecules, complement and coagulation cascades were also affected in gastric cancer and might participate in tumor progression and prognosis. Increasing evidence has indicated that complement and coagulation cascades are significantly involved in signaling in gallbladder cancer [51], clear cell renal cell carcinoma [52], small-cell lung cancer [53], epithelial ovarian cancer [54], bladder cancer [55], and head and neck cancer [56]. In gastric cancer, Gu et al. reported that complement and coagulation cascades were significantly enriched pathways [57]. However, research on this pathway in gastric cancer is insufficient, and there is no direct evidence that its upregulation is connected with the prognosis of gastric cancer.
From the results of this study, we also found that the TGF-β signaling pathway was upregulated in gastric cancer, which was in line with existing research. The dysregulated pathway could promote the generation of ECM [58], leading to tissue fibrosis. An overactivated TGF-β signaling pathway could induce tumor growth and metastasis by promoting epithelial-mesenchymal transformation and angiogenesis [59]. These results suggest that the relationship between the eight significant genes and TGF-β merits further research.
Furthermore, the downregulation of base excision repair and nucleotide excision repair signatures in the high-risk group was consistent with current research in gastric cancer. In particular, DNA mismatch repair is one of the most prevalent pathways involved in a damaged base excision repair system. The absence of base excision repair could result in the accumulation of DNA damage, leading to malignant transformation and poor prognosis. This imbalance was also associated with DNA polymorphism regulation, and such uncorrected DNA variants were likely related to cancer risk [60]. Defects in nucleotide excision repair would lead to increased instability of the genome. In addition, unrepaired DNA damage possibly increased genetic susceptibility to cancers and the risk of carcinogenesis [61]. Thus, according to the results mentioned above, the association between excision repair and the eight significant genes deserves further exploration.
Recent research suggested that the impact of age as an independent risk factor in gastric cancer may differ depending on the cancer stage [62]. Although the finding of age as an independent risk factor in this study has some value, large-scale clinical data are urgently needed to verify it and thus to guide the establishment of a clinical treatment scheme. We identified a risk score model to predict the prognosis of patients with gastric cancer and validated it in an independent cohort. For simple and convenient assessment, it could be used to provide some reference. However, we need to acknowledge that the risk score is a relative value, which varies between institutes and detection methods. After unifying the testing methods, we need to collect as many samples as possible to identify a cut-off value to guide oncologists.

Conclusions
In conclusion, we performed comprehensive analyses to investigate the vital role of an ECM-receptor interaction signature in gastric cancer. ECM components became accomplices in the tumorigenesis, progression, and poor survival of gastric cancer.