Transcription Analysis of the THBS2 Gene through Regulation by Potential Noncoding Diagnostic Biomarkers and Oncogenes of Gastric Cancer in the ECM-Receptor Interaction Signaling Pathway: Integrated System Biology and Experimental Investigation

Background Gastric cancer (GC) is the second most frequent cause of cancer-related death worldwide and the fourth most common malignancy. Despite significant improvements in patient survival over the past few decades, the prognosis for patients with GC remains dismal because of the high recurrence rate. In this comprehensive system biology and experimental investigation, we aimed to find new novel diagnostic biomarkers of GC through a regulatory RNA interaction network. Methods Gene expression, coexpression, and survival analyses were performed using microarray and RNAseq datasets (analyzed by RStudio, GEPIA2, and ENCORI). RNA interaction analysis was performed using miRWalk and ENCORI online databases. Gene set enrichment analysis (GSEA) was performed to find related signaling pathways of up- and downregulated genes in the microarray dataset. Gene ontology and pathway enrichment analysis were performed by the enrichr database. Protein interaction analysis was performed by STRING online database. Validation of expression and coexpression analyses was performed using a qRT-PCR experiment. Results Based on bioinformatics analyses, THBS2 (FC: 7.14, FDR < 0.0001) has a significantly high expression in GC samples. lncRNAs BAIAP2-AS1, TSIX, and LINC01215 have RNA interaction with THBS2. BAIAP2-AS1 (FC: 1.44, FDR: 0.018), TSIX (FC: 1.34, FDR: 0.038), and LINC01215 (FC: 1.19, FDR: 0.046) have significant upregulation in GC samples. THBS2 has a significant role in the regulation of the ECM-receptor signaling pathway. miR-4677-5p has a significant RNA interaction with THBS2. The expression level of THBS2, BAIAP2-AS1, TSIX, and LINC01215 has a nonsignificant negative correlation with the survival rate of GC patients (HR: 0.28, logrank p: 0.28). qRT-PCR experiment validates mentioned bioinformatics expression analyses. BAIAP2-AS1 (AUC: 0.7136, p value: 0.0096), TSIX (AUC: 0.7456, p value: 0.0029), and LINC01215 (AUC: 0.7872, p value: 0.0005) could be acceptable diagnostic biomarkers of GC. Conclusion BAIAP2-AS1, lncRNA LINC01215, lncRNA TSIX, and miR-4677-5p might modulate the ECM-receptor signaling pathway via regulation of THBS2 expression level, as the high-expressed noncoding RNAs in GC. Furthermore, mentioned lncRNAs could be considered potential diagnostic biomarkers of GC.


Introduction
Globally, gastric cancer (GC) is the second most common cause of cancer-related mortality and the fourth most prevalent malignant disease [1].The prognosis for patients with GC is still poor because of the high recurrence rate, despite major advancements in patient survival over the past several decades [2].GC is frequently identified at an advanced stage.Since most cases of GC are asymptomatic until they reach late stages, it is crucial to use efficient screening techniques to identify cases early in order to reduce GC fatalities [2].In order to serve as a marker for healthy biologic processes, destructive processes, or pharmacological responses to therapeutic interventions, biomarkers are traits that can be objectively studied and measured.Recent developments in genome analysis have led to the discovery of several biomarkers relating to DNA, RNA, exosomes, etc.The creation of these biomarkers in the field of cancer therapy is anticipated to have a significant impact on the progression of the disease, the choice of effective therapeutic approaches, and effective follow-up programs [2].
Better technology and bioinformatics analyses to comprehend dynamic changes in biology and tumor plasticity will be linked to further advancements in cancer therapy.Consideration must be given to tumor heterogeneity, the interaction between the cancer genome and the epigenome, the surrounding microenvironment, and vertical access (changes over time) of cancer biological components to address molecular evolution and horizontal access (changes over sites of disease involvement) to address tumor heterogeneity.The potential of computational medicine and data sharing inspires researchers to create exciting initiatives that integrate big data and bioinformatics.The possibility of treating cancer ultimately rests with the development of efficient treatment approaches, well-planned clinical trials, and coordinated efforts among crucial players in cancer therapy [3].
Long noncoding RNAs (lncRNAs) have gotten a lot of attention as possible diagnostic, prognostic, or predictive biomarkers because of their high specificity and ease of accessibility in a noninvasive way, as well as their aberrant expression under diverse pathological and physiological situations.They could possibly be used as stomach cancer treatment targets [4].
Based on previous studies, lncRNAs have significant roles in the different biological processes correlated to GC.For example, lncRNA PCAT-1, which is significantly expressed in tissues and cells of gastric cancer resistant to DDP, increases DDP resistance in gastric cancer cells by engaging EZH2 to epigenetically repress PTEN expression and controlling the miR-128/ZEB1 axis [5,6].EZH2 is also considered as a potential prognostic biomarker of hepatocellular carcinoma [7].Similarly, it has been discovered that the DDP-resistant gastric cancer cells SGC7901/DDP and BGC823/DDP express the lncRNA DANCR at a high level.DANCR knockdown in these cells encourages apoptosis and prevents cell division.On the other hand, DDP-induced SGC901 and BGC823 cells with overexpressed DANCR might increase the expression of MDR genes MDR1 and MRP1 [8].Through the upregulation of MDR1, MRP1, and Bax expression as well as the downregulation of Bcl-2 expression, lncRNA SNHG5 decreased the DDP sensitivity of the gastric cancer cells BGC823 and SGC7901 [9].
Based on GeneCards (http://genecards.org),THBS2 produces a member of the thrombospondin family of proteins.It is a homotrimeric glycoprotein with disulfide links that mediate interactions between cells and between cells and a matrix.It has been demonstrated that this protein acts as a powerful inhibitor of tumor angiogenesis and proliferation.Studies of the mouse equivalent imply that this protein may modify the mesenchymal cells' cell surface characteristics and be involved in cell adhesion and migration.Through regulation by miR-221-3p, THBS2 might promote angiogenesis in cervical cancer [10].Zhang et al. in 2022 revealed that THBS2 has a significant upregulation in gastric cancer patients.Also, this study revealed that high expression of THBS2 has significant correlations with pathological grade, T stage, and poor overall survival of patients [11].
In this study, we performed a comprehensive bioinformatics investigation and experimental validation to find potential novel biomarkers of GC.Also, we demonstrate novel RNA and protein interaction networks to find novel regulatory noncoding RNAs in GC patients.The central core of this study is the THBS2 gene as a potential misregulated mRNA in GC patients.

Materials and Methods
2.1.Microarray Data Analysis.Microarray analysis was performed on the gastric cancer-related datasets.GSE54129 was investigated in order to find the differentially expressed genes (DEGs) in the gastric cancer microarray datasets.111 GC samples and 21 control samples from this dataset were evaluated.GPL570 (HG-U133 Plus 2, Affymetrix Human Genome U133 Plus 2.0 Array) is the source of this dataset.The raw data from the GEO online database (https://www .ncbi.nlm.nih.gov/geo/) was transmitted to the RStudio environment and then normalized using the affy [12] package.The microarray dataset underwent statistical analysis using the limma [13] package.The affy and limma packages were obtained from the Bioconductor online site (https://www .bioconductor.org/).For the analysis of microarray data, a significance threshold of 0.0001 was chosen (adjusted p value).The microarray data analysis visualizations were created using the ggplot2 [14] and pheatmap packages, which are available from CRAN (https://cran.r-project.org).In this microarray study, the expression of 47568 RNA transcript (21257 genes) was investigated.Following normalization (RMA method), logarithmic scaling, and elimination of the transcripts with no expression in the dataset, the difference in the expression level of all RNAs was calculated.The RNAs with logFC > 3 were chosen as the upregulated RNAs, while logFC < −3 was selected as the threshold of low expression.
The STRING online database analyzed and visualized protein-protein interactions [20].GO and pathway enrichment analyses were performed by enrichr online database [21][22][23].The investigation of the interactions between  International Journal of Genomics microRNAs and mRNAs was carried out by miRWalk (http:// mirwalk.umm.uni-heidelberg.de/)[24][25][26].The top miRNAs with the following criteria were selected: binding probability, 1; position, 3 ′ UTR (seed region); and lowest binding energy.Sampe size was calculated using the following formula: is the variance of control and tumor samples, and Z 1−α/2 and Z 1−β are the statistical power of samples (numeric values: 1.96 and 0.84, respectively).

Clinical Characteristics of Tissue Samples.
All patients signed written consent forms, and all methods for the research in this study involving human samples were approved by the Al-Zahra Hospital Ethics Committee, Isfahan University of Medical Science.Samples of normal gastric tissue and gastric cancer from 25 individuals with gastric cancer were compared in a case-control study.Normal gastric tissues are adjacent to tumor samples.None of the patients had ever received radiation or chemotherapy.Tissue samples were rinsed in distilled water and promptly frozen in liquid nitrogen for RNA Later solution (Invitrogen, USA) immersion for pathologist assessment.The clinicopathological characteristics of patients with breast and stomach cancer are listed in Table 1.
2.5.RNA Extraction, cDNA Synthesis, and qRT-PCR Experiment.TRIzol was employed to extract the RNA from both tumorous and normal tissues (Invitrogen, Carlsbad, CA, USA).Following the RNA extraction steps, cDNA synthesis was performed using the TaKaRa cDNA synthesis kit according to the manufacturer's instructions (TaKaRa, Tokyo, Japan).SYBR green, an Amplicon Company product from Denmark, was utilized to do real-time PCR, and a MIC realtime PCR instrument was used to perform reverse      1.As an internal control, the related expression was normalized using the quantity of GAPDH.In Table 2, the primer sequence is displayed.
The GraphPad Prism application was used to statistically evaluate the real-time PCR data and related visualizations (version 8).The qRT-PCR data were compared using the CT method to determine the expression levels between the tumor and control samples [27].The Shapiro-Wilk test was used to the expression data in order to ascertain whether the data were normal.Using paired t-test and Wilcoxon test on the CT data, the expression levels in tumor and control samples were compared.The DEG analysis of the microarray data was performed in RStudio (4.1.2).Based on 7 International Journal of Genomics sensitivity and specificity, the recipient operating characteristic (ROC) analysis was carried out by the GraphPad Prism for the real-time PCR datasets.A p value of less than 0.05 was selected as the significance threshold for this study.AUC values between 0.7 and 0.8 in the ROC analysis are regarded as acceptable, 0.8 and 0.9 as good (signifying a good biomarker), and 0.9 and 1 as excellent (indicating an outstanding biomarker).

Results
3.1.Microarray Data Analysis.Microarray data analysis revealed 37 upregulated genes and 60 low-expressed genes in the GSE54129 dataset.List of top 20 up-and downregulated genes is provided in Table 3. Figure 1 shows the heatmap of top 50 DEGs.Correlation clustering method was performed on samples and genes in this heatmap.Control and tumor samples are completely separated in different clusters.Also, upregulated and downregulated genes are completely clustered.Volcano plot of all genes in GSE54129 revealed up-and downregulated genes in the dataset (Figure 2).

GSEA.
Based on GSEA, upregulated genes of GSE54129 regulate the ECM-receptor interaction signaling pathway (Figures 3 and 4).Based on mentioned analysis, THBS2 is the most significant upregulated gene in the ECM-receptor Figure 5: miRNA and lncRNA interaction analysis of THBS2.Based on the miRWalk database, miRNA interaction analysis was performed.miR-4677-5p has the strongest interaction with the 3′UTR area of THBS2.In addition, lncRNAs LINC01215, TSIX, and BAIAP2-AS1 have direct interaction with THBS2 mRNA, based on the lncRNA interaction analysis using lncRRIsearch database.However, there is no interaction between miRNAs and lncRNAs in this network.International Journal of Genomics  9 International Journal of Genomics interaction pathway (FWER p value < 0.0001, rank metric score: 1.38).THBS2 was selected for further investigation.List of significant upregulated genes in mentioned signaling pathway is provided in Table 4.

GO and Pathway Enrichment
Analyses.Pathway enrichment and GO analyses were performed on mentioned pro-teins to find the biological processes, molecular functions, and cellular component, related to THBS2 and its interactome.Based on mentioned analyses, THBS2 and its interactome are located in collagen-containing extracellular matrix (GO:0062023).Also, mentioned proteins (Figure 6) mostly regulate metalloendopeptidase activity (GO:0004222).Furthermore, mentioned genes are significantly involved in extracellular structure organization (GO:0043062) process (Table 5).Pathway enrichment analysis revealed that THBS2 is significantly regulated following signaling pathways: ECM-receptor interaction, malaria, leukocyte transendothelial migration, phagosome, focal adhesion, and proteoglycans in cancer (Table 6).

Discussion
Our study demonstrates comprehensive novel results about the possible roles of coding and noncoding RNAs in GC development.Based on our investigation, lncRNAs BAIAP2-AS1, LINC01215, and TSIX could be considered novel potential diagnostic biomarkers of GC.Based on our bioinformatics and experimental analyses, mentioned lncRNAs might regulate the expression level of THBS2, a high-expressed mRNA, in GC patients.THBS2 and its interactome regulate the ECM-receptor interaction signaling pathway.Previous studies There was no previous study about the possible regulatory role  International Journal of Genomics of mentioned lncRNAs in the ECM-receptor signaling pathway.High expression of mentioned lncRNAs might disturb normal processes of the ECM-receptor signaling pathway.This disturbance may lead the normal gastric cells to malignancy.Furthermore, based on our analyses, the expression level of THBS2, BAIAP2-AS1, TSIX, and LINC01215 has a nonsignificant negative correlation with the survival rate of GC patients.ROC analysis revealed that BAIAP2-AS1, LINC01215, and TSIX could have a significant role as potential diagnostic biomarkers of GC.Furthermore, we demonstrate that the expression level of THBS2 has a significant positive correlation with the expression of BAIAP2-AS1, TSIX, and LINC01215, based on the qRT-PCR experiment.This result could validate our bioinformatics coexpression analyses.Also, based on our bioinformatics analyses, miR-4677-5p has a potential interaction with THBS2 mRNA.Low expression of THBS2 via miR-4677-5p could be considered a potential therapeutic method for GC patients.
Previous studies demonstrated possible roles of noncoding RNAs in different patients, including multiple sclerosis [29], breast cancer [30,31], and gastric cancer [32].Previous studies revealed some novel information about the possible roles of BAIAP2-AS1 in different cancer types.For example, Yang et al. in 2021 revealed that BAIAP2-AS1 might have a regulatory role in the miR-361-3p/SOX4 competitive endogenous RNA (ceRNA) axis.Based on this study, mentioned ceRNA axis could regulate the malignant progression of hepatocellular carcinoma (HCC).The mentioned study suggests that BAIAP2-AS1 has a significantly high expression in HCC samples in the TCGA RNAseq datasets, qRT-PCR experiment, and HCC cell lines [33].Mao et al. in 2018 revealed that BAIAP2-AS1 could have a significant role in the prediction of cervical cancer survival.In the mentioned study, a high-throughput TCGA data analysis was performed to evaluate the expression level of lncRNAs and the relation of selected DEGs with the survival rate of cervical cancer patients.Based on ROC analysis, BAIAP2-AS1 could act as a diagnostic biomarker of cervical cancer [34].Gong et al. in 2016 revealed that BAIAP2-AS1 has a significant upregulation in the hepatitis B virus-related HCC.Also, by silencing BAIAP2-AS1 (using small interfering RNAs (siRNAs)), it is demonstrated that BAIAP2-AS1 has a significant role in the regulation of MAPKAP1 and RAF1, and this lncRNA could act as a ceRNA in the HCC patients [35].There was no study about the possible role of BAIAP2-AS1 in GC, and we performed this study of BAIAP2-AS1 for the first time.
Previous studies revealed the possible roles of LINC01215 in the progression of different cancers.For example, according to Liu et al. in 2020, LINC01215 has a significant role in the survival rate of breast cancer patients (HR: 0.84, p value: 0.0001).Furthermore, based on the mentioned study, LINC01215 has a significant association with immune-related functions (the result of the Pearson correlation method) [36].Liu et al. in 2021 revealed that LINC01215 could promote lymph node metastasis and epithelial-mesenchymal transition in ovarian cancer.Based on this study, the downregulation of LINC01215 increases the expression level of RUNX3 through methylation of the RUNX3 promoter [37].
Furthermore, the downregulation of LINC01215 suppresses tumor growth, migration, and cell proliferation of ovarian cancer [37].Xu et al. in 2020 revealed that LINC01215 suppresses the growth of clear cell renal cell carcinoma tumors through reducing SLC2A3 expression level via miR-184 [38].There was no previous study about the possible role of LINC01215 in the development of gastric cancer.About the possible role of lncRNA TSIX in the development of GC, Sun et al. in 2021 revealed that TSIX might regulate the GC development via miR-320a/RAD51 ceRNA axis.Furthermore, this study revealed that the low expression of TSIX is one of the possible causes of RAD51 downregulation in the mentioned ceRNA axis.This ceRNA network simultaneously triggered the ATF6 signaling pathway following endoplasmic reticulum stress to encourage the death of GC cells and stop the illness.The TSIX/miR-320a/Rad51 network offers a novel method for treating GAC disease and may be a possible biological target of the GC [39].
Our study demonstrates comprehensive novel results about the possible roles of different coding and noncoding RNAs in GC development.Based on our investigation, lncRNAs BAIAP2-AS1, LINC01215, and TSIX could be considered novel potential diagnostic biomarkers of GC.Based on our bioinformatics and experimental analyses, mentioned lncRNAs might regulate the expression level of THBS2, a high-expressed mRNA, in GC patients.THBS2 and its interactome regulate the ECM-receptor interaction signaling This analysis was performed using GEPIA2 online database, and the statistical parameter was based on the default setting of this database.
13 International Journal of Genomics pathway.Previous studies approved the possible role of the ECM-receptor signaling pathway in the regulation of progression, survival rate, and tumorigenesis of GC [28].
In our study, we introduced a potential signature model for gastric cancer survival prediction, including THBS2, BAIAP2-AS1, LINC01215, and TSIX.However, our result was not statistically significant.Based on the obtained gene score, we suggest that same investigation through different methods be evaluated on our 4 evaluated genes.Previous studies revealed some potential signature models in GC.For example, Cheong et al. at 2020 demonstrated a predictive 32-gene signature model for GC using a predictive model based on support vector machine (SVM) [40].Another study at 2023 found a 7-gene signature model including the following genes: CCDC91, DYNC1I1, FAM83D, LBH, SLITRK5, WTIP, and NAP1L3.Through a similar approach to our investigation, they demonstrated that their suggested signature model modulated the TGF-beta signaling pathway [41].Zhang et al. at 2021 introduced a 4-gene prognostic model for GC through a multivariable Cox regression analysis.Based on this article, the following four genes could have potential implications for the prediction of GC: UTRN, MUC16, CCDC178, and HYDIN [42].However, there is no certain prediction model for GC, and more studies and validations are needed.It is highly recommended that the expression level of miR-4677-5p be evaluated by different experimental methods, like qRT-PCR.lncRNA-mRNA and miRNA-mRNA interactions in this study should be validated using different methods, like luciferase assay.A possible correlation of SNPs in the THBS2 sequence with the binding affinity of miR-4677-5p might be a perfect method to find more accurate information about the reasons for the high or low expression of THBS2.Since the result of our survival analysis was not significant, we highly recommend that the possible correlation of mentioned RNAs with 15 International Journal of Genomics the survival rate of GC patients be evaluated through a bigger sample size.

Conclusion
In this study, we demonstrate that upregulation of THBS2, lncRNAs BAIAP2-AS1, LINC01215, and TSIX has a significant correlation with GC.Furthermore, for the first time, we show that lncRNAs BAIAP2-AS1, LINC01215, TSIX, and miR-4677-5p might regulate the expression level of THBS2, and any disturbance in this regulatory network might disturb the ECM-receptor interaction signaling pathway and lead the normal gastric cells to malignancy.Mentioned lncRNAs could be considered as the potential diagnostic biomarkers of GC.Also, the expression level of THBS2, lncRNAs BAIAP2-AS1, LINC01215, and TSIX might have a meaningful negative correlation with the survival rate of GC patients.International Journal of Genomics

Figure 2 :
Figure 2: Volcano plot showing the DEGs in the GSE54129.Red color indicates the upregulated genes and green color indicates the downregulated genes.THBS2 is indicated by a black point in the plot as a significantly upregulated gene.Further analyses were performed on THBS2 as a significant high-expressed mRNA.

Figure 1 :Figure 3 :
Figure 1: Heatmap of top 50 differentially expressed genes in GSE54129.Genes are categorized into different clusters, based on the expression level.Samples also are categorized into two main clusters based on group (control or tumor).The first 21 columns in the left side of heatmap represent normal samples and other 111 columns in the right side of heatmap are showing the tumor samples.

Figure 4 :
Figure 4: Heatmap of upregulated genes, involved in ECM-receptor signaling pathway.Yellow color indicates normal samples, and the gray color indicates tumor samples.Red color indicates higher expression level of genes in related sample, and the blue color indicates lower expression level.THBS2 has the most change in the expression level, as a significant high-expressed gene in the GC samples.

Figure 6 :
Figure 6: Protein-protein interaction analysis of THBS2 by STRING online database.

Figure 7 :
Figure 7: Coexpression analysis of THBS2 and lncRNAs, based on ENCORI and GEPIA2 online databases.(a) Coexpression analysis based on ENCORI revealed that THBS2 has no significant coexpression with BAIAP-AS1, TSIX, and LINC01215.(b) Coexpression analysis based on GEPIA2 revealed that THBS2 has a significant coexpression with BAIAP2-AS1.

1 Figure 8 :
Figure 8: Relative expression analysis of THBS2 and selected lncRNAs, based on ENCORI online database.Based on expression analysis by ENCORI, THBS2, LINC01215, BAIAP2-AS1, and TSIX have significant high expression in GC samples, compared to control.

Figure 10 :
Figure 10: Survival analysis of THBS2, BAIAP2-AS1, LINC01215, and TSIX.Based on this analysis, mentioned RNAs have no significant correlation with the low survival rate of GC patients.This analysis was performed using GEPIA2 online database, and the statistical parameter was based on the default setting of this database.

Figure 11 :Figure 13 :
Figure11: qRT-PCR data analysis was performed using GraphPad Prism software.All statistical tests and graphs of qRT-PCR experiment were performed and visualized using that software.Relative expression analysis of qRT-PCR data revealed that THBS2, BAIAP2-AS1, TSIX, and LINC01215 have significant upregulation in GC samples, compared to control.

Table 1 :
Clinicopathological analysis of gastric cancer samples.

Table 2 :
Table of primer sequence.

Table 3 :
List of top 20 upregulated and downregulated genes in GSE54129.

Table 4 :
List of significant upregulated genes in the ECM-receptor interaction pathway.

Table 5 :
Gene ontology analysis of THBS2 and related proteins, based on enrichr database.

Table 6 :
Pathway enrichment analysis of THBS2 and correlated proteins.