Correlation between SMADs and Colorectal Cancer Expression, Prognosis, and Immune Infiltrates

Background. In recent years, the incidence and mortality of colorectal cancer (CRC) have been increasing, and the 5-year survival rate of advanced metastatic CRC is poor. The small mothers against decapentaplegic (SMAD) family comprises intracellular signal transduction proteins associated with the development and prognosis of a variety of tumors. At present, no study has systematically analysed the relationship between SMADs and CRC. Methods. R (version 3.6.3) was used to analyse the expression of SMADs in pan-cancer and CRC. Protein expression of SMADs was analysed using the Human Protein Atlas (HPA). Gene Expression Profiling Interactive Analysis (GEPIA) was used to evaluate the correlation between SMADs and tumor stage in CRC. The prognostic value of SMADs was analysed using R and GEPIA. Mutation rates of SMADs in CRC were determined with cBioPortal, and potentially related genes were predicted using GeneMANIA. R was used to analyse the correlation between SMADs and immune cell infiltration in CRC. Results. Both SMAD1 and SMAD2 were expressed at low levels in CRC and correlated with the level of immune infiltration. SMAD1 was correlated with patient prognosis, and SMAD2 was correlated with tumor stage. SMAD3, SMAD4, and SMAD7 were all expressed at low levels in CRC and associated with a variety of immune cells. SMAD3 and SMAD4 proteins were also expressed at low levels, and SMAD4 had the highest mutation rate. SMAD5 and SMAD6 were overexpressed in CRC, and SMAD6 was also associated with patient overall survival (OS) and with CD8+ T cells, macrophages, and neutrophils. Conclusions. Our results provide new evidence that SMADs can be used as biomarkers for the treatment and prognosis of CRC.


Introduction
Colorectal cancer (CRC) is one of the most common malignancies, ranking third in incidence (10.0%) and second in mortality (9.4%) among all cancers worldwide, and both its incidence and mortality are rising year by year [1]. The 5-year survival rate for advanced metastatic colorectal cancer is less than 20% [2]. The main treatments for CRC are surgery, radiotherapy, and chemotherapy, which are effective for early colorectal cancer but perform poorly for advanced and metastatic CRC [3]. There is no good treatment for advanced metastatic colorectal cancer. To reduce the high incidence and mortality of CRC, further exploration of meaningful biomarkers is urgently needed to improve therapeutic efficacy.
The human genome encodes eight small mothers against decapentaplegic (SMAD) proteins [4]. SMAD proteins are a family of signal transduction molecules involved in the transforming growth factor β (TGF-β) ligand pathway. SMADs are intracellular proteins of approximately 500 amino acids, among which SMAD1, SMAD2, SMAD3, SMAD5, and SMAD8 act as TGF-β receptors in

The Human Protein Atlas (HPA). The Human Protein Atlas (HPA) (http://www.proteinatlas.org/pathology) maps human proteins by integrating various omics data with clinical outcomes, primarily based on the relationship between the genome-wide transcriptome of protein-coding genes in 17 cancer types and clinical outcomes [13]. In this study, we used this database to investigate the relationship between SMAD proteins and CRC.

The Gene Expression Profiling Interactive Analysis (GEPIA). GEPIA (http://gepia.cancer-pku.cn/) is an online web server based on The Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression database (GTEx). It comprises data from thousands of tumor and healthy tissue samples processed with standard pipelines and provides key interactive and customizable functionality [14]. In this study, GEPIA was used to analyse the correlation between SMADs and the pathological stage of CRC and to analyse the prognostic value of SMADs.

2.3. cBioPortal. cBioPortal (http://cbioportal.org) is a free, open platform for multidimensional cancer genome analysis, detection, and visualization at the deoxyribonucleic acid (DNA) level [15]. In this study, cBioPortal was used to determine the mutation rates of the SMAD gene family in CRC.

2.4. GeneMANIA. GeneMANIA (http://www.genemania.org) is a user-friendly website for generating hypotheses about gene function, analysing gene lists, identifying functionally similar genes, and supporting functional genomics more broadly [16]. In this study, we explored the SMAD interaction network and associated genes through the GeneMANIA database.

Statistical Analysis.
All statistical analyses were performed using R (V3.6.3). Differences were visualized using the ggplot2 package. Paired t tests and Mann-Whitney U tests were used to detect differences between colorectal cancer tissues and adjacent normal tissues. The R package survminer was used to visualize prognostic value, and the survival package was used for statistical analysis of survival data. The single-sample gene set enrichment analysis (ssGSEA) method of the gene set variation analysis (GSVA) package [17] was used for immune infiltration analysis, and the Shapiro-Wilk normality test and the Spearman correlation coefficient were used to assess the correlation between SMAD expression and immune infiltration.
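The unpaired comparison named above can be illustrated with a small sketch. The study itself ran its tests in R; the Python below is only an illustrative reimplementation of the two-sided Mann-Whitney U test, using the normal approximation without tie or continuity correction. The function names and the expression values are hypothetical and are not taken from the study's data.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U_x, U_y, two-sided p) using the normal approximation."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    u_y = n1 * n2 - u_x
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (min(u_x, u_y) - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_x, u_y, p_value


# Hypothetical log-scale expression values (illustration only).
normal_tissue = [5.1, 4.8, 5.6, 5.3, 4.9]
tumor_tissue = [3.9, 4.2, 4.0, 4.5, 3.7]
u_normal, u_tumor, p = mann_whitney_u(normal_tissue, tumor_tissue)
print(u_normal, u_tumor, round(p, 4))
```

For small samples or heavily tied data, an exact test or a tie-corrected variance would be more appropriate than this approximation.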

Differential Expression of SMADs in Pan-Cancer and CRC.
The expression differences of SMADs across cancers were visualized with the ggplot2 package, as shown in Figure 1. Then, the same package was used to show the differential expression of SMADs in 51 normal samples and 647 colorectal cancer samples (Figure 2). The results showed that the expression levels of SMAD1-4, SMAD7, and SMAD9 were significantly downregulated, while the expression levels of SMAD5 and SMAD6 were significantly upregulated. The details are analysed below.

Correlation between SMADs and CRC Tumor Stage.
The correlation between SMAD expression and tumor stage in CRC patients was evaluated, and the results are shown in Figure 3. The analysis showed that SMAD2 and SMAD7 expression differed noticeably across tumor stages (Figures 3(b) and 3(g), all p < 0.05), while SMAD1, SMAD3, SMAD4, SMAD5, SMAD6, and SMAD9 showed no significant differences (Figure 3

Protein Expression of SMADs in CRC.
Protein expression of SMADs in normal intestine and CRC tissues was analysed using HPA, as shown in Figure 4. The results showed that the protein expression levels of SMAD1 and SMAD2 were significantly increased in CRC tissues (Figures 4(a) and 4(b)), the protein expression levels of SMAD3, SMAD4, and SMAD5 were significantly decreased in CRC tissues (Figures 4(c)-4(e)), and the protein expression level of SMAD7 was not significantly different (Figure 4(f)).

Analysis of SMAD Gene Mutation and Interaction Expression in CRC.
The frequency of SMAD alterations in CRC was determined with cBioPortal. The results showed that, among 881 CRC patients, the mutation rate was 1.9% for SMAD1 and SMAD6, 7% for SMAD2, 5% for SMAD3 and SMAD5, 18% for SMAD4, 4% for SMAD7, and 2.8% for SMAD9. The OncoPrint included in-frame mutations, missense mutations, splice mutations, truncating mutations, structural variants, amplifications, deep deletions, and no alterations (Figure 7(a)). Through the GeneMANIA database, twenty genes in the interaction network with SMADs were analysed (Figure 7(b)).
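A per-gene rate of this kind is the fraction of patients carrying at least one alteration in that gene; for example, 18% of 881 patients corresponds to roughly 159 patients. The sketch below shows that calculation under stated assumptions: the patient identifiers, the input layout (a mapping from patient to the set of altered genes), and the function name are hypothetical, and the code does not reproduce how cBioPortal computes its figures.

```python
from collections import Counter


def alteration_frequency(altered_genes_by_patient, genes):
    """Fraction of patients with at least one alteration in each gene."""
    n_patients = len(altered_genes_by_patient)
    if n_patients == 0:
        raise ValueError("at least one patient is required")
    gene_set = set(genes)
    counts = Counter()
    for altered in altered_genes_by_patient.values():
        # Each patient counts once per gene, however many alterations it has.
        for gene in set(altered) & gene_set:
            counts[gene] += 1
    return {gene: counts[gene] / n_patients for gene in genes}


# Hypothetical toy cohort of five patients (illustration only).
toy_cohort = {
    "P1": {"SMAD4"},
    "P2": set(),
    "P3": {"SMAD4", "SMAD2"},
    "P4": set(),
    "P5": {"SMAD3"},
}
freq = alteration_frequency(toy_cohort, ["SMAD1", "SMAD2", "SMAD3", "SMAD4"])
for gene, value in freq.items():
    print(f"{gene}: {value:.1%}")

# Approximate patient count implied by a rounded 18% rate in 881 patients.
print(round(0.18 * 881))
```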

Correlation with Immune Infiltration.
The ssGSEA method of the GSVA package was used to comprehensively analyse the relationship between SMADs and immune cell infiltration, as shown in Figure 8 and Table 1. The results showed that the expression of SMAD1, SMAD4, and SMAD7 was positively correlated with the infiltration of B cells, CD8+ T cells, dendritic cells (DCs), eosinophils, macrophages, and neutrophils (Figures 8(a), 8(d), and 8(g)). SMAD2 expression was positively correlated with CD8+ T cells, macrophages, and neutrophils (Figure 8(b)). SMAD3 expression was positively correlated with B cells, CD8+ T cells, eosinophils, and macrophages (Figure 8(c)). SMAD5 expression was positively correlated with macrophage infiltration and negatively correlated with DC infiltration (Figure 8(e)). SMAD6 expression was positively correlated with DC infiltration and negatively correlated with CD8+ T cell, macrophage, and neutrophil infiltration (Figure 8(f)). SMAD9 expression was positively correlated with eosinophil infiltration and negatively correlated with neutrophil infiltration (Figure 8(h)).
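The positive and negative correlations reported here are Spearman rank correlations between gene expression and per-sample infiltration scores. As a minimal sketch, the Spearman coefficient can be computed as the Pearson correlation of the ranks. The Python below illustrates this; it is not the R code used in the study, and the function names, expression values, and enrichment scores are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-sample values (illustration only).
smad_expression = [2.1, 3.4, 1.8, 4.0, 2.9, 3.7]
infiltration_score = [0.10, 0.22, 0.05, 0.31, 0.25, 0.20]
rho = spearman_rho(smad_expression, infiltration_score)
print(round(rho, 4))
```

A positive coefficient means that samples with higher expression tend to have higher infiltration scores; testing significance would additionally require a p-value, which this sketch omits.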

Discussion
Studies have shown that SMADs are involved in the development, metastasis, prognosis, and immune microenvironment of many tumors. Immune infiltrating cells are part of the tumor microenvironment and influence tumor growth and metastasis. High expression of SMAD1, SMAD2, and SMAD4 in gastric cancer tissues is significantly correlated with patient prognosis [18]. Studies of lung cancer have found that the expression of SMAD6, SMAD7, and SMAD9 is downregulated in lung cancer and significantly correlated with patient prognosis [19]. However, the relationship between SMADs and the occurrence, development, prognosis, and immunity of CRC has not been fully clarified. SMAD1 is a receptor-activated SMAD that is involved in regulating cell growth, differentiation, apoptosis, and other processes and plays an important role in the body's immune system. Current studies on SMAD1 in CRC have shown that high expression of SMAD1 can induce apoptosis of CRC cells [20]. SMAD1 can also promote the occurrence of CRC tumors and induce migration and autophagy [21]. This study found that low expression of SMAD1 in colorectal cancer was related to prognosis and immune cell infiltration, but that SMAD1 protein was significantly increased in colorectal cancer tissues. These results suggest that high SMAD1 expression can be used as a diagnostic marker for CRC and that, once SMAD1 expression begins to decrease in CRC, it can serve as a marker associated with poor prognosis and immune infiltration.
SMAD2 plays different roles in different stages of cancer by regulating various biological processes [22]. In colorectal cancer, the tumor suppressor gene NIT1 exerts its effect by activating the SMAD2/3 signaling pathway [23]. SMAD2 can promote the development of CRC by regulating the polarization of tumor macrophages [24]. In this study, SMAD2 expression in CRC was low, which was
SMAD3 plays the dual role of oncogene and tumor suppressor gene in tumor formation and can be used as a prognostic marker for tumors [22]. SMAD4 is a tumor suppressor gene that plays a central role in TGF-β signal transduction [25]. In CRC, SMAD3 expression is reduced through miR-4429, which ultimately inhibits the occurrence, development, and metastasis of cancer cells [26]. A meta-analysis showed that a high mutation rate of SMAD4 in CRC patients was associated with poor prognosis but not with clinical stage [27]. This study showed that SMAD3 and SMAD4 and their proteins were significantly underexpressed in colorectal cancer. However, neither was significantly correlated with tumor stage or prognosis. SMAD4 had the highest mutation rate in CRC, at 18%. The immune infiltration analysis showed that SMAD3 and SMAD4 are associated with a variety of immune cells. Our results are generally consistent with previous reports, suggesting that SMAD3 and SMAD4 can act as tumor suppressor genes in CRC and influence patient immune status. However, whether SMAD4 can be used as a prognostic indicator needs further validation.
SMAD5 mediates TGF-β superfamily ligand signaling pathways and acts as an oncogene [28]. SMAD6 can also regulate the TGF-β signaling pathway, which is conducive to tumor growth, spread, and metastasis [29]. Overexpression of miR-186-5p in CRC can significantly reduce SMAD6, ultimately inhibiting the proliferation and migration of CRC cells and increasing their apoptosis [30]. This study found that SMAD5 and SMAD6 were significantly overexpressed in colorectal cancer and that SMAD6 was significantly correlated with OS. The previous reports are consistent with our findings for SMAD5 and SMAD6. Together, these results suggest that SMAD5 and SMAD6 may act as oncogenes in CRC and that SMAD6 could also be used as a prognostic biomarker.
SMAD7 is an inhibitor of the TGF-β signaling pathway and antagonizes TGF-β-mediated diseases. SMAD7 plays a dual role in different tumor stages: it acts as a tumor suppressor gene in the early stage and as a tumor promoter gene in the late stage, where it is positively correlated with the degree of malignancy [31]. In CRC, SMAD7 can upregulate miR-424 by silencing circTBL1XR1, thus promoting the proliferation, invasion, and metastasis of CRC [32]. miR-4775 overexpression in CRC promotes invasion, metastasis, and epithelial-mesenchymal transition (EMT) of cancer cells by activating SMAD7 [33]. In this study, SMAD7 expression was significantly reduced in CRC and was associated with a variety of immune cells. Our findings are partly consistent with existing experimental evidence, although the current literature indicates some differences from this study regarding SMAD7 expression in CRC. Considering the dual role of SMAD7, the CRC tissues analysed may have been at different stages, which is consistent with the actual situation. SMAD7 may act as both an oncogene and a tumor suppressor gene in CRC and can be used as a marker to evaluate the state of the immune microenvironment.
The SMAD family has only eight members, numbered 1 to 8. However, in some databases SMAD8 is named SMAD9, while other databases list both SMAD8 and SMAD9. A specific analysis is therefore not possible, and no further analysis is conducted here.
Our study has some shortcomings. First, the results of this study were obtained mainly through database analysis without relevant experimental verification. To better study the relationship between CRC and SMADs, experimental verification is needed to confirm the results and make them more convincing. Second, due to the ambiguity between SMAD8 and SMAD9 in different databases, specific analysis is not possible. Therefore, our team needs to continue to carry out relevant experimental verification at the cellular, animal, and clinical levels.

Conclusions
In conclusion, this study used R and several databases to analyse the differential expression, mutation rate, prognostic value, and immune infiltration of SMAD family members in CRC. The results showed that SMAD1, SMAD2, SMAD3, SMAD4, and SMAD7 were significantly downregulated in CRC, while SMAD5 and SMAD6 were significantly upregulated. SMAD1 and SMAD2 proteins were significantly increased in CRC, SMAD3, SMAD4, and SMAD5 proteins were significantly decreased, and SMAD7 and SMAD9 protein expression was not significantly different. Only SMAD2 was associated with the tumor stage of CRC. In the prognostic analysis, only SMAD1 was significantly correlated with DSS and PFI, while SMAD6 was significantly correlated with OS. SMAD4 had the highest mutation rate. In the immune infiltration analysis, SMAD1, SMAD2, SMAD3, SMAD4, and SMAD7 were positively correlated with a variety of immune cells. In clinical practice, these findings suggest that high expression of SMAD1 and SMAD2 together with low expression of SMAD3, SMAD4, and SMAD5 in tissue specimens can help identify CRC and can be used as diagnostic markers. To assess tumor stage, an increase in SMAD2 can be measured. Based on the correlation between expression level and stage in a large number of patients, reference ranges can be formulated to further determine the degree of malignancy of CRC in the clinic. High expression of SMAD1 and low expression of SMAD6 can be measured to determine patient prognosis. For understanding the immune microenvironment of CRC and developing immunotherapies, SMAD1, SMAD2, SMAD3, SMAD4, and SMAD7 are of guiding significance. In summary, the diagnosis, treatment, and survival prognosis of CRC patients can be evaluated clinically by measuring the expression levels of SMAD family members, which is convenient and of guiding value.
International Journal of Analytical Chemistry