Investigation of the miRNA and mRNA Coexpression Network and Their Prognostic Value in Hepatocellular Carcinoma

Purpose To identify pivotal differentially expressed miRNAs and genes and construct their regulatory network in hepatocellular carcinoma. Methods mRNA (GSE101728) and microRNA (GSE108724) microarray datasets were obtained from the NCBI Gene Expression Omnibus (GEO) database. Then, we identified the differentially expressed miRNAs and mRNAs. Sequentially, transcription factor enrichment and gene ontology (GO) enrichment analysis for miRNA were performed. Target genes of these differential miRNAs were obtained using packages in R language (R package multiMiR). After that, downregulated miRNAs were matched with target mRNAs which were upregulated, while upregulated miRNAs were paired with downregulated target mRNA using scripts written in Perl. An miRNA-mRNA network was constructed and visualized in Cytoscape software. For miRNAs in the network, survival analysis was performed. And for genes in the network, we did gene ontology (GO) and KEGG pathway enrichment analysis. Results A total of 35 miRNAs and 295 mRNAs were involved in the network. These differential genes were enriched in positive regulation of cell-cell adhesion, positive regulation of leukocyte cell-cell adhesion, and so on. Eight differentially expressed miRNAs were found to be associated with the OS of patients with HCC. Among which, miR-425 and miR-324 were upregulated while the other six, including miR-99a, miR-100, miR-125b, miR-145, miR-150, and miR-338, were downregulated. Conclusion In conclusion, these results can provide a potential research direction for further studies about the mechanisms of how miRNA affects malignant behavior in hepatocellular carcinoma.


Introduction
Hepatocellular carcinoma (HCC) is one of the most common cancers in the world. It is estimated that 840,000 new cases of HCC are acquired and at least 780,000 people die of HCC every year, and over half of the global incidence and mortality of HCC occur especially in Eastern Asian [1]. HCC has a high incidence (4.7% of new cancer cases) and the second highest cancer mortality rate (8.2% of cancer-related deaths) worldwide [2], and China accounts for 47% of the total number of HCC cases as well as HCC-related mortality [3]. The development of HCC is closely related to the infection of the hepatitis B virus (HBV) infection, followed by the hepatitis C virus infection (HCV), and related to aflatoxins, alcohol drinking, and so on [4,5]. In China, HCC has the third highest cancer incidence and has become the second leading cause of cancer-related death, second only to lung cancer [6]. Unfortunately, there are many difficulties in the diagnosis and treatment of HCC, but the most frequent is the lack of methods for early diagnosis as well as the paucity of studies about the molecular mechanisms of tumor initiation and progression. Therefore, we designed the study to explore the molecular mechanisms of HCC carcinogenesis and progression, as well as new relevant molecular markers for early diagnosis.
Recently, noncoding RNA (ncRNA) drawn extensive concern to further illustrate the molecular mechanisms of HCC. MicroRNAs (miRNAs), families of small noncoding RNAs, had been reported that can serve as molecular markers for early diagnosis of a number of tumors [7,8]. Many of miRNAs (e.g., miR-1247-3p [9] and miR-935 [10]) have been demonstrated to play a role in the progression of HCC by influencing proliferation, invasion, and metastasis of tumor cells as well as other malignant phenotypes.
MiRNAs can promote the degradation of mRNAs and inhibit their translation into proteins by binding to the 3 ′ -untranslated region (3 ′ -UTR) of target mRNAs [11]. Thus, studies on miRNAs were important for inquiring the molecular mechanisms of carcinogenesis and to seek novel biomarkers.
Microarray profiling is a kind of high-throughput technique that developed rapidly in recent years, which can be applied to detect differentially expressed miRNAs and genes in cancer and control samples [12]. To find new directions for research in miRNAs and genes, we analyzed miRNA and mRNA microarray datasets to identified differentially expressed miRNAs and genes and explored their potential relationships. Next, we identified the key miRNAs with survival analysis, network analyses, and functional enrichment.

Microarray Data Collection.
The raw miRNA and mRNA sequencing data were obtained from the GEO database (https://www.ncbi.nlm.nih.gov/geo/), which represents the largest public repository of microarray data. In this study, one gene expression profile (GSE101728) and one miRNA expression profile (GSE108724) were downloaded from the GEO.
The GSE101728 dataset was composed of seven pairs of HCC and matched adjacent tumor-free tissue sample mRNA expression profiles which were collected during the surgery from HCC patients admitted to the Zhongshan Hospital of Fudan University [13]. The GSE108724 dataset includes the profiling of the miRNA expression in seven pairs of HCC and matched adjacent tumor-free tissues from the same hospital and research team [13]. All the data were obtained in a raw status and normalized with Perl (Perl version 5.32.3) and R (version: 4.0.2).

Data
Processing. The raw data was downloaded, and the probe ID was transferred to gene symbol or miRNA name. The data from different groups was classified into the normal group and tumor group with packages in R language (R package limma), which was also used to screen the differentially expressed miRNAs and mRNA between the tumor and normal tissues. In the same time, we also calculated the log fold change (logFC), P values, and adjusted P values (adj. P val). In addition, adj. P value<0.05 and |logFC | >2 were set as the standards of differentially expressed miRNA and mRNA selection. According to the above standards, 37 differentially expressed miRNAs (15 upregulated and 22 downregulated) and 745 differentially expressed mRNAs (30 upregulated and 441 downregulated) were screened.

Prediction of miRNA Target
Genes. After extracted the differential miRNAs, packages in R language are used to predict the target genes (R package multiMiR). The packages were published in 2014 [14], but were updated in April 2020 recently. The packages were integration of fourteen databases for prediction of the miRNA target gene, including miRecords BioMed Research International tissues and liver cancer tissues in TCGA. The aforementioned methods are used to process the data, to screen out the miRNAs that are abnormally expressed in liver cancer tissues, and to perform relevant multifactor COX regression analysis and survival analysis for differential miRNAs. Based on miRNAs that have an impact on the prognosis, target genes are predicted, and an interactive network of miRNA and mRNA is constructed. For oncogenes and tumor suppressor genes in the network, we also conducted survival analysis and calculated the difference in five-year survival rates.

Results
A total of 745 differentially expressed genes were identified from GSE101728 and 37 miRNAs from GSE108724. The conditions of judgment for significant differences are the log fold change > 2 and the adjusted P value <0.05. There are 304 upregulated genes and 441 downregulated genes among these differential genes. For the selected differential miRNAs, we performed GO and transcription factors (TF) enrichment analysis. TF and GO analysis are hints for the research direction. Through the GO analysis, we can find the GO classification items of the differential miRNAs and find out which function changes may be related to the differential miRNAs. The results of TF and GO enrichment analysis of the differential miRNAs are shown in Figure 2. The results of TF analysis for differential upregulated and downregulated miRNA are shown below (Figure 2(a)-(c)). In the result of transcription factor enrichment, the blue column indicates the proportion of miRNA enriched on the transcription factor to all differential miR-NAs. The red bar represents the value of -log 10 (P value), and the yellow bar represents the threshold of statistical difference (-log 10 0.05, P = 0:05). The red bar is longer than the yellow band, indicating a statistical difference (P < 0:05 ). GO analyses cover three domains and show the top 10 significant GO enrichments according to enrichment scores [-log 10 (P value)] (Figures 2(d)-(f)).
In order to show the relationship between differential miRNAs and target genes more intuitively, we used Cytoscape software to achieve visualization. The miRNA-mRNA regulatory network was constructed that consists of 35 miR-NAs and 295 target mRNAs. Since miRNAs generally nega-tively regulate target genes, 17 upregulated miRNAs were matched with 144 downregulated target mRNAs while 18 downregulated miRNAs with 151 target mRNAs ( Figure 3). In the network, we use different colors to represent different expression levels and different functions of miRNA and mRNA.
We not only performed GO enrichment analysis for miR-NAs but also GO and KEGG enrichment analysis for differential mRNAs. GO enrichment analysis and KEGG pathway analysis of differentially expressed genes in the miRNA-mRNA network are shown in Figure 4. Representation of the genes in the GO enrichment bubble plot and circle plot displayed the count distribution in BP, CC, and MF. Bubble color intensity indicates fold enrichment of GO terms overrepresented in that cluster of genes, and the size corresponds to the number of genes enriched (count). The GO analysis results revealed that the differentially expressed genes were significantly enriched in the terms "reproductive structure development," "reproductive system development," "positive regulation of cell adhesion," etc (Figures 4(a) and 4(b)). The result of the KEGG pathway analysis was shown in the same form. The genes were significantly enriched in "MicroRNAs in cancer," "PI3K-Akt signaling pathway," and "Focal adhesion." The results of the enrichment analysis suggest that the effect of differential genes enriched in the corresponding items on the corresponding phenotype of tumor cells can be studied. For example, the KEGG enrichment analysis results of SRC show that it is enriched in the item of cellular adhesion, so basic experiments can be used to verify whether SRC has an effect on the invasion and metastasis of liver cancer cells..
If the difference in the expression of miRNAs is significantly related to the survival of patients, it means that this miRNA is likely to have greater research value. Therefore, for differential miRNAs in the network, OS analysis was performed. The data including the expression of miRNAs, follow-up time, survival state, and survival time were from the TCGA miRNA-seq dataset. Survival curves were plotted, and differences between survival curves were estimated ( Figure 5). As a result, nine survival curves are statistically different (P <0.05); among them, hsa-mir-125b was divided into hsa-mir-125b-1 and hsa-mir-125b-2 (Figures 5(c) and 5(d)).
In addition to the GEO database, we also conducted data mining on the TCGA database. Perform data processing on the data downloaded from the TCGA database screen out a total of 53 abnormally expressed miRNAs and calculate the P value. All the different miRNAs are displayed in the chart and sorted by P value (Table 1). We drew a heatmap and a volcano map to visually show the difference in the expression and distribution of 53 differential miRNAs in normal tissues and cancer tissues (Figures 6(a) and 6(b)). In order to study the impact of differential miRNAs on the prognosis of patients, we used multivariate COX regression analysis and survival analysis methods. Multivariate COX regression analysis showed that 8 miRNAs are related to the prognosis of patients (P < 0:05). At the same time, we also calculated the hazard ratio of each miRNA. A hazard ratio greater than 1 indicates a negative impact on the prognosis, and the hazard 3 BioMed Research International ratio that is less than 1 indicates a benign effect on the prognosis ( Figure 6(c)). Then, we drew survival curves for all the differential miRNAs based on the expression, the patient's survival time, and survival status and calculated the 5-year survival rate. The results showed that the 5-year survival rates of mir-9-1, mir-9-2, mir-9-3, mir-452, mir-514a-2, and mir-4800 were significantly different (Figures 6(d)-(i)).
For oncogenes and tumor suppressor genes in the network, we conducted survival analysis and calculated the difference in five-year survival rates ( Figure 8). We conducted survival analysis on the six tumor suppressor genes CDKN2B, DACT1, DUSP6, E2F3, IGFBP3, PLCE1, RASSF3, and THY1 and found that E2F3, IGFBP3, and RASSF3 genes had a significant impact on the 5-year survival rate of patients (P < 0:05). The 5-year survival rate of the oncogene GMPS high expression group was significantly lower than that of the low expression group.

Discussion
With a high mortality rate, substantial morbidity, and the increasing trend of the incidence rate of HCC worldwide, the pathogenesis, disease progression, and treatment of HCC are worthy of further study and exploration. Noncoding RNA has been an intensive research topic in molecular biology for several years and the focus of numerous studies [17,18]. MiRNAs, families of small noncoding RNAs, played important roles in nearly all biological processes. Plenty of studies indicated that the abnormal expression of miRNAs may contribute to oncogenesis and the progression of HCC by inhibiting target genes through the degradation of their target mRNAs or by inhibiting translation [19,20]. Thanks to the well-developed microarray technology, now it is easier to determine the expression levels of the miRNAs and mRNAs. We can identify differentially expressed miRNAs and genes between normal and tumor tissues and screen miRNAs that seem to play roles in tumor onset or progres-sion. In addition, the result of microarray analysis can help give direction for future research.
In the study, a total of 35 miRNAs (17 upregulated and 18 downregulated) and 295 mRNAs (151 upregulated and 144 downregulated) were screened. The transcription factors for differentially expressed miRNA were enriched in EGR1, POU2F1, SP1, MEF2A, HOXD8, etc. GO enrichment analysis of these miRNAs showed that they are significantly enriched in "regulation of nucleobase, nucleoside, nucleotide and nucleic acid metabolism", signal transduction, cell communication, and transport. The differentially expressed mRNA is enriched in the reproductive structure development, reproductive system development, and positive regulation of cell adhesion. For KEGG analysis of mRNAs, they are enriched in miRNAs in cancer, PI3K-Akt signaling pathway, focal adhesion, and Rap1 signaling pathway. Moreover, by constructing the miRNA-mRNA network and performing OS analysis, we identified miRNAs including miR-99a, miR-100, miR-125b, miR-145, miR-150, miR-324, miR-338, and miR-425, which were found to have an impact on the HCC survival rate. Thus, the network has been simplified. miR-125b-5p is one of the downregulated miRNAs in tumor tissues of HCC patients compared to normal tissues. In the miRNA-mRNA network, it has the highest connectivity with target genes, which regulates 31 upregulated genes. Among these, there are 2 genes in the top 100 of total 979 upregulated genes including KIF18B and RBM24. Little research has been done on miR-125b in HCC, but some studies showed that miR-125b can affect the metastasis of gastric cancer cells and inhibit colorectal cancer proliferation [21,22]. Recent research reported that KIF18B promotes hepatocellular carcinoma progression through activating the Wnt/β-catenin-signaling pathway [23], but the upstream      regulators of KIF18B are unknown. According to the network, the exact relationship between miR-125b and KIF18B needs to be verified through experiments. Among target genes of miR-99a, peptidase inhibitor 15 (PI15) has been reported to act as a novel blood diagnostic marker for cholangiocarcinoma, and the plasma PI15 level in HCC patients was clearly higher than normal [24]. How-ever, the physiological and pathological role of plasma PI15 is still unknown. Downregulation of GMP synthetases (GMPS), another target gene of miR-99a, can result in reduced cell viability as a p53 repression target in HCC [25]. Thus, the downregulation of miR-99a may inhibit tumor proliferation by upregulating GMPS, but it still requires experimental validation. Recent research showed

12
BioMed Research International that nuclear receptor subfamily 6, group A, member 1 (NR6A1) regulates lipid metabolism of HepG2 cells, and the positive expression of NR6A1 is a novel marker of disease progression and aggressiveness in prostate cancer patients [26,27]. Thus, the interaction between miR-99a and NR6A1 in tumor migration and invasion could be a direction for future research. In addition, a ceRNA network including miR-125b and miR-99a could also be constructed.   14 BioMed Research International For upregulated miRNA, miR-324-5P is correlated to patients' prognosis and has regulatory relationships with 4 genes. One research showed that miR-324-5p suppresses HCC cell invasion, but another study reported that lncRNA-85 promotes HCC cellular proliferation and migration by targeted binding and regulating miR-324-5p [28,29]. Thus, the effect of miR-324-5P on tumor progression might be different via different mechanisms. Among target genes, the high expression of alpha-2,6-sialyltransferase 2 (ST6GAL2), one of the top 100 of all the 1236 downregulated genes, was demonstrated to promote tumorigenesis of follicular thyroid cancer via activating the Hippo signaling pathway [30], and the downregulation of ST6GAL2 is associated with improved patient survival in breast cancer [31], but the effects of ST6GAL2 have not been reported yet on the oncogenesis and the progression of HCC. Regulator of calcineurin 1(RCAN1) is broadly expressed in the liver, placenta, and other tissues. Overexpressed RCAN1, as a

16
BioMed Research International potential target of miR-572, induced apoptosis of HCC cells and inhibited cell proliferation and invasion [32]. The regulatory relationship between miR-324-5P and RCAN1 has not been reported. Anti-Mullerian hormone receptor type 2 (AMHR2) encodes the receptor for the anti-Mullerian hormone (AMH) which results in male sex differentiation. PBX1 encodes a nuclear protein that belongs to the PBX homeobox family of transcriptional factors. The former two genes have a greater value of study than the latter. MiR-425-5p promotes tumor progression in HCC [33], gastric cancer [34], breast cancer [35] and so on. Amphiphysin (AMPH) is a critical tumor suppressor that inhibits tumor progression in breast cancer [36], osteosarcoma [37], etc. Thus, the upregulation of miR-425 may negatively regulate AMPH to promote tumor progression. The relationship between miR-425 and AMPH has been reported that miR-425 regulates cell proliferation, migration, and apoptosis by targeting AMPH in non-small-cell lung cancer [38]. However, the relationship needs to be validated in HCC in further studies. There are few reports about the roles and mechanisms of NEDD 4 binding protein 2-like 1 (N4BP2L1) and protein tyrosine phosphatase receptor type N2 (PTPRN2) for their limited value.
Some miRNAs have tumor suppressor and carcinogenic effects, and the main mechanism is the binding of miRNA and target mRNA. The combination of miRNA and mRNA will cause a decrease in the expression level of target mRNA. Some studies have also found that miRNAs can bind to target mRNA to increase the translation of target mRNA [39]. Through these mechanisms, miRNAs can regulate the expression of many genes and play a similar role to oncogenes or tumor suppressor genes. In the interaction network, we have also marked out the oncogenes and tumor suppressor genes that have been identified. miRNAs and mRNAs transcribed from these genes have potential interaction and coexpression relationships. This can provide certain research directions for future research.
We identified 37 differentially expressed miRNAs and 745 mRNAs in tumor tissues of HCC patients compared to normal controls. 481 negatively regulatory pairs were used to construct a miRNA-mRNA interaction network including 35 miRNAs and 295 mRNAs. Then, we identified 8 miRNAs that are associated with the long-term survival rate and prognosis by using survival analysis. GO and KEGG pathway analyses revealed that the abnormal expression of miRNAs and genes may participate in the regulation of cell adhesion and then induce invasion and metastasis of tumor cells. There are limitations to the study. The sample size is relatively small, which may have an impact on the trustworthiness and credibility of the result of microarray analysis. In a