Construction of ceRNA Coexpression Network and Screening of Molecular Targets in Colorectal Cancer

Objective To screen some RNAs that correlated with colorectal cancer (CRC). Methods Differentially expressed miRNAs, lncRNAs, and mRNAs between cancer tissues and normal tissues in CRC were identified using data from the Gene Expression Omnibus (GEO) database. The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and protein-protein interactions (PPIs) were performed to do the functional enrichment analysis. And a lncRNA-miRNA-mRNA network was constructed which correlated with CRC. RNAs in this network were subjected to analyze the relationship with the patient prognosis. Results A total of 688, 241, and 103 differentially expressed genes (diff-mRNA), diff-lncRNA, and diff-miRNA were obtained between cancer tissues and normal tissues. A total of 315 edges were obtained in the ceRNA network. lncRNA RP11-108K3.2 and mRNA ONECUT2 correlated with prognosis. Conclusion The identified RNAs and constructed ceRNA network could provide great sources for the researches of therapy of the CRC. And the lncRNA RP11-108K3.2 and mRNA ONECUT2 may serve as a novel prognostic predictor of CRC.


Introduction
Colorectal cancer (CRC) is a common malignant tumor in the gastrointestinal tract [1]. The initial syndrome of CRC is not conspicuous; but with the development of the tumor, the patient will show changes in bowel habits, blood in the stool, diarrhea, alternating diarrhea and constipation, local abdominal pain, and other symptoms. Patients with advanced symptoms often show syndromes such as anemia and weakness [2]. CRC is the third most common malignant tumor around the world, and its mortality rate is extremely high [3]. Considering the great threat to human health of CRC, a series of new diagnostic and therapeutic methods have emerged. CRC is a complex disease involving the expression and structure of genes [4]. More and more studies have shown that miRNA can play a crucial aspect in cancer progression by regulating its related targets, including mRNA and lncRNA [5].
lncRNA is a noncoding RNA with a length of more than 200 nucleotides, which acts a pivotal part in abounding actions such as dose compensation effect, epigenetic regulation, cell differentiation regulation, and cell cycle regulation [6,7]. It has been reported that upregulation of lncRNA FOXD3-AS1 suggests a lower survival rate in CRC patients. Experimental results showed that FOXD3-AS1 was overexpressed in CRC tissues and cells. Downregulation of FOXD3-AS1 expression in vitro can promote cell proliferation, invasion, and migration and promote apoptosis [8].
lncRNA-ATB and lncRNA-CCAT have strong accuracy in distinguishing CRC patients from healthy individuals [9]. miRNAs are endogenous noncoding RNAs with managerial actions, with a length of approximately 22 nucleotides that are involving posttranscriptional gene expression regulation in animals and plants [10]. miRNA-149 can be used as a target miRNA for identifying single bases in the serum of healthy and cancer patients. This method is direct and sensitive and can be used as an early diagnostic tool for colorectal cancer [11]. The expression of miR-139-5p was downregulated in the CRC cell line compared to the ordinary human colonic mucosal epithelial cell line NCM460. The study subsequently also demonstrated that overexpression of miR-139-5p in colon tumor cell lines significantly inhibited proliferation of cells in vivo and in vitro. The final study found that miR-139-5p regulates chronic inflammation by inhibiting NF-κB activity, thereby inhibiting cell proliferation and invasion in CRC [12].
A series of studies have shown that miRNAs can silence gene by binding to mRNA, and lncRNA can regulate gene expression level relying on competitively binding miRNAs [13,14]. At this study, a lncRNA-miRNA-mRNA coexpression network was constructed using two GEO datasets to screen RNAs that may be associated with CRC and supply a novel method for the diagnosis and therapy of CRC ( Figure 1).

Materials and Methods
2.1. Data Collection. The lncRNA/mRNA profile data GSE126092 was downloaded from Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/gds/?term=). And the extracted data were produced by the platform of Agilent-074348 Human lncRNA v6 4 × 180 K (GPL21047, Probe Name Version). This dataset contained the data of lncRNA/mRNA expression profiles in 10 colorectal cancer (CRC) tissues and their corresponding normal-appearing tissues (NATs). The miRNA profile data GSE126093 was also extracted from GEO database. The data were produced by the platform of Exiqon miRCURY LNA microRNA array, 7th generation (GPL18058, miRBase v18, condensed Probe_ ID version). This dataset contained ten cases of CRC tissues and their corresponding NATs.

Data Preprocessing and Screening of Differentially
Expressed mRNA, lncRNA, and miRNA (diff-mRNA, diff-lncRNA, and diff-miRNA). Limma package was used to preprocess the downloaded raw data and screen the differentially expressed mRNA, lncRNA, and miRNA. The preprocess process included background correction, normalization, and concentration prediction. The matrix data was combined with the chip platform annotation file to map the probe to the symbol. For multiple probes corresponding to an equal symbol, the final expression was decided by the mean of probes. Differential expression analysis of tumor vs. control was performed on the samples using the classical Bayesian test and corrected with Benjamini/Hochberg. p value < 0.05 and jlog 2 FCj > 2 were taken as the cut-off for the screening of diff-mRNA, diff-lncRNA, and diff-miRNA.

Functional Enrichment
Analysis. The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways for diff-mRNA, mRNAs in modules that were significantly clustered in PPIs, and mRNAs in ceRNA network were carried out, via the R package clusterProfiler (version 2.4.3, http://bioconductor .org/packages/3.2/bioc/html/clusterProfiler.html). In KEGG pathway enrichment analysis, p value < 0.05 and gene count > 2 were chosen as the cut-off criteria.
2.6. Coexpression Analysis of diff-lncRNA and diff-mRNA. The Pearson correlation coefficient (PCC) was calculated between lncRNA and mRNA. And the PPC > 0:95 and p value < 0.05 were considered to be meaningful correlation. Cytoscape was used to illustrate the coexpression network.   Disease Markers 2.7. Survival Analysis. The colon cancer tumor samples from which mRNA, lncRNA, miRNA, and survival prognosis messages were extracted from the TCGA database, and the expression values and survival prognosis information of all node elements in the lncRNA-miRNA-mRNA network were extracted from the TCGA colon cancer samples. K-M survival curve was performed using Survival (version: 2.42-6, https:// cran.r-project.org/web/packages/survival/index.html). p value < 0.05 was considered as the significant threshold.

Screening the Cell Pathways with regard to Differentially
Expressed mRNA in CRC. KEGG pathway analysis was per-formed to make a deep understanding between the diff-mRNA and cell pathways. The result indicated that all 266 upregulated diff-mRNAs were enriched in 10 pathways, including cell cycle, oocyte meiosis, progesterone-mediated oocyte, and cytokine-cytokine receptor interaction. And the 462 downregulated diff-mRNAs were enriched in 21 pathways, including hypertrophic cardiomyopathy, dilated cardiomyopathy, cGMP-PKG signaling pathway, and cAMP signaling pathway (Table 1).

PPI Network with regard to Differentially
Expressed mRNAs in CRC. PPI network was constructed, which integrates large amount of known and predicted interactions of proteins. As shown in Figure 3, 354 nodes and 1,500 edges were included in the network. Module analysis was performed using MCODE, and two modules were obtained (module A and module B). Module A (score = 32:778) was constructed with 37 nodes and 590 edges, and module B (score = 15) was constructed with 15 nodes and 105 edges (Figures 4(a) and 4(b)). KEGG analysis was also performed. The result indicated that 37 diff-mRNAs in module A were enriched in 4 pathways, and 15 diff-mRNAs in module B were enriched in 8 pathways (Table 2).
3.4. miRNA-lncRNA/mRNA Prediction and lncRNA-mRNA Coexpression. The top 10 upregulated and all 8 downregulated diff-miRNAs were predicted using miRWalk 2.0 database. The edges were screened which the target genes were diff-mRNA. And a total of 308 edges of diff-miRNA-diff-mRNAs were obtained finally, including 15 diff-miRNAs and 177 diff-mRNAs. Thirty-nine edges of lncRNA-miRNA were obtained including 10 diff-miRNAs and 33 diff-lncRNAs. Furthermore, 143 target genes regulated by 10 Color key Color key (c) lncRNA-miRNA-mRNA network is shown in Figure 5, and the nodes in the network are shown in Supplementary  Table 4. The diff-mRNAs in lncRNA-miRNA-mRNA network were analyzed using KEGG pathway enrichment analysis, of which 15 pathways (Table 3).
3.6. Differentially Expressed RNAs with regard to the Prognosis of CRC. All the nodes in Figure 6 were used to do the survival analysis, of which a lncRNA RP11-108K3.2 and one mRNA ONECUT2 correlated with prognosis.

Discussion
In this study, to screen for RNAs and pathways associated with CRC, a lncRNA-miRNA-mRNA expression network was constructed using two datasets downloaded from the GEO database. A total of 688 diff-mRNAs, 241 diff-lncRNAs, and 103 diff-miRNAs were identified at the present study. A functional enrichment analysis of these 688 diff-mRNAs revealed that they were mainly enriched in 21 pathways. A PPI network analysis was also performed on these diff-mRNAs. Finally, a ceRNA network was constructed and a survival analysis was performed on the nodes to obtain two prognostic-related RNAs. The expression levels of RP11-108K3.2 and ONECUT2 were both upregulated in CRC, and low levels were associated with better prognosis, suggesting that they may play a positive role in CRC. lncRNA can participate in the initiation and development of many diseases by directly regulating proteins or indirectly regulating the target genes of related miRNA. The result of ceRNA showed that RP11-108K3.2 was regulated by hsa-miR-224-5p. Study has shown that the great expression level of RP11-108K3.2 correlated with low overall survival of CRC, and this result was consistent with our study [15]. In a study on lung adenocarcinoma, the researchers study 346 differentially expressed lncRNAs, of which RP11-108K3.2 is highly expressed in lung cancer tissues [16].    Figure 3: The protein-protein interaction network of differentially expressed mRNAs. The purple nodes represented the upregulated mRNAs, and the green nodes represented the downregulated mRNAs. The higher the degree value is, the larger the nodes.
ONECUT2 is a member of the ONECUT2 transcription factor family that interacted with hsa-miR-139-5p, hsa-miR-188-5p, hsa-miR-19b-1-5p, hsa-miR-31-5p, and hsa-miR-497-5p in a ceRNA network in this study [19]. It has been reported that ONECUT2 acts a critical position on CRC gene network and is significantly associated with the development of cancer. Interference with endogenous ONE-CUT2 expression inhibited the CRC cell line SW620 cell migration [20]. A series of experiments have shown that hsa-miR-139-5p acts as a crucial part in colon tumor. Compared with normal tissues, the hsa-miR-139-5p expression level is significantly declined in colon tumor, and its expression level is significantly correlated with tumor stage. Subsequently, experiments have shown that the direct object of hsa-miR-139-5p is BCL2, and the expression of BCL2 is negatively correlated with the expression of hsa-miR-139-5p.  The tumor metastasis and drug sensitivity of CRC could be diminished by hsa-miR-139-5p targeting the BCL2 pathway [21]. In the present study, hsa-miR-139-5p was interacted with ONECUT2, and its expression level was negatively connected with the expression level of the later one. These two RNAs may have a similar mechanism with hsa-miR-139-5p and BCL2. hsa-miR-31-5p as an infrequently expressed miRNA was associated with a stage when assessing CRC cases. The upregulated hsa-miR-31-5p makes a more advanced disease stage more likely than a lower disease stage [22]. At this research, the expression level of hsa-miR-31-5p was higher in CRC tissues than in control (logFC = 9:153176).
In addition to being significantly upregulated in CRC tissues, it has been reported to be significantly upregulated in uterine cervical cancer tissues [23].
In conclusion, at the present study, several CRC-related diff-miRNAs, diff-lncRNAs, and diff-mRNAs were obtained. A ceRNA network was also constructed to analyze the crosstalk among the identified diff-miRNAs, diff-lncRNAs, and diff-mRNAs. The lncRNA RP11-108 K3.2 and mRNA ONE-CUT2 in the network may play a crucial role in CRC, with their low expression levels being correlated with better prognosis. Although the mechanism of RP11-108K3.2 and ONE-CUT2 remains to be revealed, both of the two may be used as

Data Availability
Answer: Yes. Comment: The datasets generated during the current study are not publicly available but identified and anonymized information is potentially available on reasonable request.

Additional Points
Highlights.