IL1RN and PRRX1 as a Prognostic Biomarker Correlated with Immune Infiltrates in Colorectal Cancer: Evidence from Bioinformatic Analysis

The extensive morbidity of colorectal cancer (CRC) and the inferior prognosis of terminal CRC urgently call for reliable prognostic biomarkers. For this, we identified 704 differentially expressed genes (DEGs) by intersecting three datasets, GSE41328, GSE37364, and GSE15960 from Gene Expression Omnibus database, to maximize the accuracy of the results. Preliminary analysis of the DEGs was then performed using online gene analysis datasets, such as DAVID, UCSC Cancer Genome Browser, CBioPortal, STRING, and UCSC Cancer Genome Browser. Cytoscape was utilized to visualize the protein perception interaction network of DEGs, and the bubble map of GO and KEGG enrichment function was demonstrated using the R package. The Molecular Complex Detection (MCODE), Biological Network Gene Oncology (BiNGO) plug-in in Cytoscape, was applied to further screen the DEGs to obtain 15 seed genes, which were IL1RN, GALNT12, ADH6, SCN7A, CXCL1, FGF18, SOX9, ACACB, PRRX1, MZB1, SLC22A3, CNNM4, LY6E, IFITM2, and GDPD3. Among them, IL1RN, ADH6, SCN7A, ACACB, MZB1, and GDPD3 exhibited statistically significant survival differences, whereas limited studies were conducted in CRC. Based on the enrichment results of the “Gene Ontology“(GO) and “Kyoto Encyclopedia of Genes and genomes “(KEGG) as well as documented findings of key genes, we further emphasized the potential of IL1RN and PRRX1 as markers of immune infiltrates in CRC and confirmed our hypothesis by compiling data from the UALCAN, Tumor Immune Estimation Resource, and TISIDB databases for these two genes. The above-mentioned genes might offer a valuable insight into the diagnosis, immunotherapeutic targets, and prognosis of CRC.


Introduction
Colorectal cancer (CRC) remains the third most prevalent cancer globally, with the most recent data estimating its incidence to be the second highest mortality rate [1]. Smoking, processed meat, alcohol intake, red meat, low intake of vegetables and fruits, body fat, and obesity were all identified as risk factors for the pathogenesis of CRC [2,3]. Despite advances in combination treatment regimens and individualized therapeutic planning over the past decade, the average survival time for advanced CRC has improved significantly [4]. However, compared to a 5-year survival rate of approximately 90% for patients with early-stage CRC, the 5-year survival rate of those with advanced distant metastases has fallen to less than 10%, suggesting that earlier diagnosis and treatment are key to effectively optimizing the prognosis of patients with CRC as well as reducing the burden of disease in the population [5]. Invasive and semi-invasive screening modalities are effective in detecting early CRC, yet studies point out that the overall screening rate for CRC does not reach the expected results, and few interventions are proven to increase the acceptance of such screening [6]. Both precise treatment and early screening for CRC call for more non-invasive early screening biomarkers as well as staging prognostic markers based on a deeper understanding of the pathogenesis of CRC [7]. Therefore, great interest still exists to further innovate the methodology in early diagnosis of CRC. Biomarkers are signposts for early cancer detection and individualized CRC treatment [8]. Most notably, not only do KRAS mutations accompanied with high risk of recurrence and metastasis after radical resection of CRC, but also suggest a poorer overall prognosis after resection of liver metastases from metastatic CRC [9,10]. Carcinoembryonic antigen (CEA) may be the most widely used clinical biomarker to predict early recurrence in postoperative patients [11]. Nevertheless, for CRC, its sensitivity and specificity of the test are low. Exploration of biomarkers that enable reliable estimation of CRC prognosis might provide far-reaching implications in supporting therapy of CRC [12]. Current bioinformatic techniques to identify molecular targets that serve as biomarkers for CRC constitute the mainstream research approach [13]. Owing to genomic profiling methods coupled with updating bioinformatics algorithms, comprehensive data association in combination with bioinformatics analysis has enabled the identification of plenty of clinical biomarkers available for non-invasive cancer screening and prognostic assessment of oncology patients [14]. Meanwhile, sophisticated mecha-nisms within the tumor microenvironment (TME) appear to be an emerging factor influencing the prognosis of patients with malignancies, with the consequent recognition that tumor-infiltrating immune cells and tumorassociated stromal cells can greatly impact on tumor progression and clinical outcome [15]. Identifying markers indicating the intricacies of the CRC TME has been a hot trend in bioinformatics, with immune-related prognostic genes being of considerable significance [16].
Hence, here, we obtained the differentially expressed genes (DEGs) of CRC by interacting multiple Gene Expression Omnibus (GEO) gene microarray datasets, furthermore, using diverse bioinformatics tools (DAVID and Tumor Immune Estimation Resource [TIMER]) to explore the possible mechanisms linking DEGs to CRC. These include GO and KEGG pathway enrichment analyses, and protein perception interactions (PPI) network as well as immune infiltration were also included. It also reveals the potential critical role of IL1RN and PRRX1 in CRC. Furthermore, mapping of DEGS function enabled us to identify more precise key genes accompanying with a deeper perception of the role of IL1RN and PRRX1 in CRC.

Methods
GEO (http://www.ncbi.nlm.nih.gov/geo) database is known as a freely accessible database containing gene expression information, such as gene chips, high-throughput gene expression data, and gene microarrays [17]. GSE41328 Table 1: Functional roles of 15 seed genes.

IL1RN
Regulating various immune as well as inflammatory reactions related to interleukin 1, specifically at the early stage of infectious or inflammatory conditions.

GALNT12
Facilitating the catalyzation of the transition of N-acetylgalactosamine from UDP-GalNAc to the surface acceptor at the first stage of O-linked protein glycosylation.

ADH6
Codifying for class V alcohol dehydrogenase belonging to the alcohol dehydrogenase family.

SCN7A
Encoding one of the many voltage-gated sodium channel proteins.

CXCL1
Acting as a chelating agent for neutrophils in the inflammatory process.

FGF18
Riching in functions regulating cell mitosis and maintaining cell survival activity, engaged in extensive biological processes including growth and invasion of diverse neoplasms.

SOX9
Intervening in the differentiation procedure of chondrocytes, and in concert with SF1, involved in the regulation of the transcribing of the AMH gene.

ACACB
Involving a pivotal step in the uptake and oxidation of fatty acids in mitochondria due to a mechanism involving inhibition of the ability of carnitine-palmitoyl-CoA transferase I to control the oxidation of fatty acids.

PRRX1
Enhancing the binding potential of serum response factors to induce cell growth and differentiation.

MZB1
Involved in positive regulation of cell population proliferation.

SLC22A3
Involved in the removal of numerous internal sources of small organic cations as well as various pharmaceutical substances or environmental toxins.

LY6E
Encoding a protein broadly engaged in regulating tumorigenesis and modulating immune function, which is located on the cell surface and supplies the anchor point for GPI.

IFITM2
The protein encoded by this gene restricts cellular entry by diverse viral pathogens.

CNNM4
Encoding a protein that assists in the transport of metal ions.

Method 3.
The DEG interactions were analyzed using STRING (STRING; http://string-db.org), an online gene interaction database, and the PPI interactions network was constructed using a "combined score > 0.4" as a screening condition [29]. The results of PPI interactions were imported into the open access bioinformatics software platform Cytoscape (http://www.cytoscape.org; version 3.6.1) for visualization [30]. The core modules and pivotal prognostic genes among PPI network were screened by MCODE in Cytoscape software [31]. The screening criteria were set with MCODE score >5, degree-cuff = 2, node score = 0.2, max depth = 100, and kScore = 2. The major functions of the seed genes were searched via NCBI (https://www.ncbi .nlm.nih.gov/gene) and were presented in a table format.   [26]. The analysis and visualization of the BP of seed genes were conducted by the Cytoscape Biological Network Gene Oncology plugin (BiNGO) [32]. Construction of hierarchical aggregations of seed genes was undertaken at the UCSC Cancer Genome Browser (http:// genome-cancer.ucsc.edu) [33]. Overall survival analysis of hub genes were performed by means of Kaplan-Meier curves in cBioPortal (https://www.cbioportal.org/) [34].
2.6. Method 6. The TIMER2.0 (https://cistrome.shinyapps .io/timer/) was used to systematically analyze the level of immune infiltration in various malignancies [36]. The relationship between IL1RN and PRRX1 expression and tumor-infiltrating lymphocytes (TILs) expression in COAD and READ was investigated through the TIMER gene module. Furthermore, the interaction between CRC and the immune system, immune cells, was studied through the online platform TISIDB (http://cis.hku.hk/TISIDB/index .php) [37]. This includes the association of IL1RN and PRRX1 with 28 TILs, 45 immunostimulants, and 24 immunosuppressive agents in CRC.

Method 7.
The HPA database (https://www.proteinatlas .org/) allows online access to human proteins mapped in cells, tissues, and organs through the integration of a variety of histological techniques (including antibody-based imaging and transcriptomics) [38]. IL1RN and PRRX1 expressions in normal colorectal tissues and CRC tissues were retrieved from the HPA database.

Result 2.
STRING and Cytoscape were used to construct a PPI network and a gene perception network of 704 DEGs (Figure 1(b)), both of which clearly showed the presence of dense regions, that is, modules of genes closely related to CRC (key genes). This network consists of 549 nodes and 1740 edges. Applying MCODE to construct the hub genes module, for which the most intensively interacting block seed gene was IL1RN (Figure 1(c)) and separating the 15 clusters (edges >5) of seeds genes further yielded 15 key genes IL1RN, GALNT12, ADH6, SCN7A, CXCL1, FGF18, SOX9, ACACB, PRRX1, MZB1, SLC22A3, CNNM4, LY6E, IFITM2, and GDPD3. Detailed information on these seed genes is contained in Table 1.

Result 4.
The analysis of the biological interacting process of the seed genes is presented in (Figure 3(b)). By applying hierarchical clustering analysis, it allows to judge that the            13 International Journal of Genomics seed gene could clearly distinguish between CRC and normal samples (Figure 3(c)). The function of seed genes was analyzed using DAVID. The results demonstrated that the gene functions in this module were mainly enriched in extracellular region, signal transduction. KEGG results indicated that seed genes are involved in alcoholic liver disease pathological process and pyruvate metabolism (Figure 3(a)). By sorting the results of the CBioPortal database survival data and conducting prognostic survival anal-ysis, it was found that four of the key genes, including ACACB, GDPD3, MZB1, and SCN7A, were significantly associated with overall CRC survival (Figures 4(a), 4(b), 4(c), and 4(d)). The correlation between ACACB, GDPD3D, and CRC disease-free survival is statistically significant (Figures 4(e) and 4(f)). IL1RN and PRRX1 showed higher 33 edges and 20 edges in the PPI network, but diseasefree survival of IL1RN along with overall survival and disease-free survival of PRRX1 did not show a statistical

Result 5.
The results of UALCAN analysis showed that the expression levels of both IL1RN and PRRX1 were significantly higher in colon and rectal cancer samples than in healthy samples (Figures 5(a), 5(b), 5(c), 5(d), 5(e), 5(f), 5(g), and 5(h)). In addition, this disparity became more obvious with the progression of tumor stage, suggesting a potential function of IL1RN and PRRX1 in tumor development and migration.
3.6. Result 6. IL1RN belongs to the interleukin 1 cytokine family, with its aberrant expression telling the incidence of carcinogenesis and immunomodulation. PRRX1 is well established as closely linked with the EMT in malignancies. However, the relationship between IL1RN and PRRX1 in CRC with TILs is unclear. The correlation between the level of immune infiltration of these two in CRC was assessed by the TIMER2.0 database, demonstrating that either IL1RN or PRRX1 correlates markedly with an elevation in TILs ( Figure 6). This was reflected    (Figure 9). All of the above results were statistically significant.

Result 7.
Elevated levels of IL1RN expression were correlated with antibody HPA001482, along with increased expression of PRRX1 correlating with antibody HPA051084. Upon further analysis of the differences between IL1RN and PRRX1 in normal colorectal tissues and CRC tissues, we found that IL1RN could not be detected in normal tissues, whereas in CRC tissues, IL1RN displayed weak staining (Figures 10(a) and 10(b)). PRRX1 was also undetectable in normal colorectal tissues but presented high or medium staining in CRC tissues. However, the data in the HPA database failed to point out location of PRRX1 concentration (Figures 10(c) and 10(d)).

Discussion
Mutations in multiple genes or somatic cells make a large part of causes that are associated with CRC heterogeneity. Prompt application of diagnostic markers for risk stratification and early detection would dramatically extend overall survival time [39]. There exists a phenomenon that contemporary diagnostic marker assays for CRC are undoubtedly shocking in number while disappointing in outcome [40]. Primarily, as the ideal screening or diagnostic biomarker is expected to be highly sensitive and specific, few similar markers fit this have been identified. Increasing attention is being paid to the role of immunotherapy on the curative side of CRC [41]. Meanwhile, studies noted that TILs were proved to be implicated in tumor immune responses, which

19
International Journal of Genomics might be a predictor of outcome in response to immunotherapy and prognosis [42,43]. Identification of specific immune markers relevant to CRC and acquisition of new immunotherapeutic targets turns to be an imperative task.
Currently, by further analysis of 15 seed genes filtered out (IL1RN, GALNT12, ADH6, SCN7A, CXCL1, FGF18, SOX9, ACACB, PRRX1, MZB1, SLC22A3, CNNM4, LY6E, IFITM2, GDPD3), we discovered that IL1RN, ADH6, SCN7A, ACACB, MZB1, and GDPD3 were very limitedly studied in the context of CRC. Among them, SCN7A, ACACB, and MZB1 showed statistically significant overall patient survival or disease-free survival. However, there exists an absence of literature related to these hub genes, and more studies on the mechanisms of CRC disease progression deserve to be given to these genes. Using GO and KEGG functional enrichment analyses, we found that hub genes are involved in tumor immunity in their functions, so we performed an immunobioinformatics database search of 15 key genes to identify potential immune infiltration marker roles of IL1RN and PRRX1.
IL1RN was the hub gene with the most edges found in this study, despite the fact that its overall survival or disease-free survival showed differences between the cancer-bearing and normal populations, but online databases suggested that the results were not statistically significant. Ma et al. had explored the ability of antagonizing IL-1 to inhibit CRC liver metastasis, but did not dive into the prospects of IL1RN in the context of CRC [44]. Wang et al. demonstrated that targeting the metabolism of amino acids like depriving methionine or targeting IL1RN might provide novel orientations in curing glioma [45]. IL1RN polymorphisms have similarly been proven to reduce the population risk of thyroid cancer risk [46]. Existing literature suggests that IL1RN is closely related to tumor immunity and tumor metabolism, of which further studies are needed to investigate its value in the prognosis of immune infiltration. According to our study, a strong correlation was shown between IL1RN and CD274, which is wellknown as PD-L1 [47]. A brunch of evidences could confirm the role of PD1/PDL1, an essential component of immune checkpoints, in regulating TIL function [47]. CD274 participates widely in the resistance of various cancers to treatment, such as chemotherapy and targeted therapies as an important immunosuppressive factor [48]. Targeting immune checkpoint blockade of PD-1/PD-L1 is well established in diverse tumors, with targeted PD-L1 emerging as a routine treatment for common malignancies, including CRC [49]. The correlation between IL1RN and CD274 implies that IL1RN is likely to be involved in PD-L1 targeted therapy in the future. In addition, we noted PRRX1 possessed the astonishing potential to be a prognostic biomarker correlated with immune infiltrates in CRC. Currently, a few publications have described the capacity of PRRX1 to induce the EMT process in CRC cells, which in turn facilitates distant CRC metastasis [50]. Moreover, studies of PRRX1 in alternative tumor contexts also focus mostly on its induction of the EMT process in tumor cells, which leads to implications of proliferation, migration, and    [51,52]. However, there exists little report on the potential of PRRX1 to interact with tumor immune infiltration and thus affect patient prognosis. By compiling data from online database, we found a remarkable correlation between PRRX1 and TILs. PRRX1 showed significant correlations with both immunostimulators and immunoinhibitors. For instance, the correlation coefficient between PRRX1 and CD86, a co-stimulatory molecule on antigen-presenting cells that had been demonstrated to act as a pivotal role in tumor immunity in pancreatic and bladder cancers, reached 0.683 [53,54]. In addition, as the tumor pathology progressed, the expression of PRRX1 in CRC tissues gradually increased. All these evidences pointed to the promising potential of PRRX1 as a marker of tumor immune infiltration. Although current study initially reveals the utility of IL1RN and PRRX1 as markers of immune infiltration, what is lacking is that this conclusion relies only on data from online databases and lacks specific experiments to validate this unique potential.
Furthermore, although our study identified hub genes as affecting CRC and revealed promising potential for being biomarkers, subgroup information, such as tumor location as well as staging, was not precisely pinpointed. Along with the disclosure of increased sequencing data, precisely targeting biomarker function to location-specific, stage-specific features would be much more likely.
Similarly, whereas the aforementioned SCN7A, ACACB, and MZB1 showed statistically significant prognostic results, further biological experiments are required to investigate their roles in the context of CRC.

Conclusion
In conclusion, 704 DEGs with specific expression in CRC were identified by bioinformatics means, basing on which 15 selected genes with enhanced differential properties were further identified, all of which may operate as diagnostic markers for CRC. Furthermore, 15 seed genes were further characterized to identify initially the potential of IL1RN and PRRX1 as markers of tumor immune infiltration in CRC tissues.

Data Availability
The data used to support and demonstrate the results of this investigation are available on the above-mentioned resources.

22
International Journal of Genomics