m6A RNA Methylation Regulators Elicit Malignant Progression and Predict Clinical Outcome in Hepatocellular Carcinoma

Hepatocellular carcinoma (HCC) is a leading cause of cancer-related death worldwide, and N6-methyladenosine (m6A) is a predominant internal modification of RNA in various cancers. We obtained the expression profiles of m6A-related genes for HCC patients from the International Cancer Genome Consortium and The Cancer Genome Atlas datasets. Most of the m6A RNA methylation regulators were confirmed to be differentially expressed among groups stratified by clinical characteristics and tissues. The clinical factors (including stage, grade, and gender) were correlated with the two subgroups (cluster 1/2). We identified an m6A RNA methylation regulator-based signature (including METTL3, YTHDC2, and YTHDF2) that could effectively stratify a high-risk subset of these patients by univariate and LASSO Cox regression, and receiver operating characteristic (ROC) analysis indicated that the signature had a powerful predictive ability. Immune cell analysis revealed that the genes in the signature were correlated with B cell, CD4 T cell, CD8 T cell, dendritic cell, macrophage, and neutrophil. Functional enrichment analysis suggested that these three genes may be involved in genetic and epigenetic events with known links to HCC. Moreover, the nomogram was established based on the signature integrated with clinicopathological features. The calibration curve and the area under ROC also demonstrated the good performance of the nomogram in predicting 3- and 5-year OS in the ICGC and TCGA cohorts. In summary, we demonstrated the vital role of m6A RNA methylation regulators in the initial presentation and progression of HCC and constructed a nomogram which would predict the clinical outcome and provide a basis for individualized therapy.


Introduction
Hepatocellular carcinoma (HCC), one of the most common malignancies, ranks second among the leading causes of cancer-related death globally [1]. The prognosis of HCC patients is relatively poor, mainly accompanied by liver cirrhosis or diagnosed at a late stage. Although the therapies of HCC have undergone rapid progress during the past decades, ranging from surgical and local treatment to molecular-targeted therapy and immunotherapy, the prognosis is undesirable [2,3]. Therefore, it is imperative to clarify the molecular mechanisms of HCC to discover novel therapeutic targets and improve the treatment options.
Up to now, the implication of m 6 A has been studied in various cancers, such as glioblastoma [7], acute myeloid leukemia [9], breast cancer [10], and hepatocellular carcinoma [11]. However, little is known about its roles in initial presentation, development, and pathogenesis for HCC. Recently, bioinformatics research revealed that m 6 A-related genes including METTL3 and YTHDF1 were biological markers and independent prognosis factors in HCC [12]. Methyltransferase-like 14 (METTL14) was shown to be a prognosis factor for HCC and inhibited by microRNA 126 in HCC metastasis [13]. Methyltransferase-like 3 (METTL3) correlates with the poor prognosis of HCC and promotes the progression of HCC [11]. Wilms tumor 1-associated protein (WTAP) was investigated to be a poor prognosis factor and contributed to the progression of HCC via the HuR-ETS1-p21/p27 axis [14]. However, the biological functions' clinical value of other m 6 A-related genes in HCC remains unclear.
In this section, we comprehensively analyzed the expression levels of fourteen m 6 A RNA methylation regulators and clinical factors in patients with HCC from the ICGC (International Cancer Genome Consortium, https://icgc.org/), Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih .gov/geo/), and TCGA (The Cancer Genome Atlas, http:// cancergenome.nih.gov/) databases. We uncovered the invaluable role of m 6 A RNA methylation regulators in the development of HCC and constructed a signature and a nomogram for predicting the survival of HCC.

Materials and Methods
2.1. Data Collection. The profiles were downloaded for 232 patients with HCC from ICGC-LIRI-JP, 209 patients with HCC from GEO-GSE14520, and 370 patients with HCC from TCGA-LIHC (Table 1) in August 2019. And the accession ID from TCGA and ICGC database is shown in Supplement Table 1. Patients who have insufficient clinicopathological data or "0" gene expression values were not included. Since the data come from TCGA and ICGC, it is not necessary to get the study approval by the ethics committee. The patients from the ICGC dataset were defined as a training cohort, and the patients from TCGA dataset were defined as a validation cohort. All statistical analyses were performed using R statistical software (version 3.6).
2.2. m 6 A RNA Methylation Regulator Selection. The m 6 A RNA methylation regulators were collected from published articles. Then, we selected the m 6 A RNA methylation regulators which were conformity to the genes from ICGC and TCGA. Next, these m 6 A RNA methylation regulators were further analyzed with clinical factors with HCC.
2.3. The Functional Enrichment Analysis. Gene Ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway categories were used for functional enrichment analysis. GO term and KEGG pathways with a P value under 0.1 were considered indicative of a statistically significant difference.

Consensus Clustering.
Consensus clustering was performed using the ConsensusClusterPlus package in R to determine subgroups of HCC based on the m 6 A RNA meth-ylation regulators [15,16]. The cumulative distribution function (CDF) plots show consensus clustering for each k to find the appropriate k which reaches maximum stability.     Survival curve (p=0.019) Fustat Alive Dead 2.7. Nomogram Establishment. The nomogram was built for prediction of 3-and 5-year survival based on the prognostic signature and clinical factors by the R package "rms." The predictive value of the nomogram was evaluated using the calibration plot and ROC curve by the R packages "rms" and "timeROC."

The Relationship between m 6 A RNA Methylation
Regulators and Clinical Factors. The fourteen m 6 A RNA methylation regulators collected from published literature evaluated the relationship between normal and tumor tissues in TCGA and ICGC cohorts (Figures 1(a) and 1(b)). The results showed that the expression levels of most m 6 A RNA methylation regulators were significantly associated with normal and tumor tissues. The results of quantitative analyses confirmed that the expression levels of the fourteen m 6 A RNA methylation regulators in tumor tissues were significantly higher than the expression levels in normal tissues except ZC3H13 and METTL14 in TCGA cohort (Figure 1(c)), and the results in the ICGC cohort were consistent with TCGA cohort except METTL14 (Figure 1(d)). The relationship between stages and the expression levels of the fourteen m 6 A RNA methylation regulators were also analyzed, and the results showed that the expression levels in patients with stages 3 and 4 were higher than those in patients with stages 1 and 2 in TCGA and ICGC cohorts (Figures 1(e) and 1(f)).

Consensus Clustering Identified Two Subgroups in HCC.
To analyze the relationship among fourteen m 6 A RNA methylation regulators, the Spearman correlation analyses were used among fourteen m 6 A RNA methylation regulators in TCGA (Figure 2(a)) and ICGC cohorts (Figure 2(b)). The interaction of these proteins was retrieved from the STRING database (https://string-db.org/) (Figure 2(c)). To divide the patients with HCC based on consensus clustering of m 6 A RNA methylation regulators, we used a novel consensus clustering method to determine the prognostic capabilities in TCGA cohort. For each cluster number k, consensus clustering cumulative distribution function (CDF) of each final consensus matrix (FCM) was calculated (Figure 2(d)). As shown in Figures 2(e) and 2(f), we choose k = 2 to distinguish the patients with HCC more clearly. The survival analysis showed that cluster 1 patients had significantly poorer overall survival than cluster 2 patients in TCGA (P < 0:001) and ICGC (P < 0:05) cohorts. The clinical factors which included T stage, stage, grade, gender, age, and status were correlated with TCGA cohort (Figure 2(i)).

The Functional Enrichment
Analysis. GO enrichment analysis of these regulators revealed that many of them were related to the GO terms "RNA modification," "mRNA

Disease Markers
methylation," "regulation of mRNA metabolic process," and so on. KEGG analysis showed enrichment in several RNArelated pathways, including processing of capped introncontaining pre-mRNA and reversal of alkylation damage by DNA dioxygenases (Table 2).

The Prognostic
Signature Based on the m 6 A RNA Methylation Regulators. To develop a prognostic signature, univariate Cox regression analysis and LASSO penalized Cox regression analysis were used to identify independent prognostic genes for OS in HCC (Figure 3(a)). The univariate and LASSO Cox regression analyses showed that ALKBH5, HNRNPC, KIAA1429, METTL3, YTHDC2, YTHDF1, and YTHDF2 were independent prognostic genes for OS in the ICGC cohort. Then, the multivariate Cox regression analysis identified three independent prognostic genes: METTL3, YTHDC2, and YTHDF2, and the risk score = ð1:04 × the expression level of METTL3Þ + ð−0:84 × the expression level of YTHDC2Þ + ð1:04 × the expression level of YTHDF2Þ. The C-index of the signature was up to 0.71, and the AIC was 409.65. The results represented that the signature had a reasonable ability to discriminate patients of poor prognosis  7 7 7 7 7 7 7 7 7 7 7 7 7 7 7 6 5 4   Disease Markers from patients of favor prognosis. Each patient in the signature was calculated as a risk score. Using the median risk score value as the cut-off point, patients in each data portal were classified into low-risk and high-risk groups. We also figured the correlation between the prognostic signature and the overall survival of patients in the ICGC cohort (Figure 3(b)), TCGA (Figure 3(c)), and GSE14520   Figure S1C). The relationship between the risk score groups and clinical factors was further analyzed in the ICGC and TCGA cohorts (Figures 4(e) and 4(f)). It was confirmed that the differences between the high-and low-risk groups with regard to stage (P < 0:01) and status (P < 0:01) were significant in the ICGC cohort. The differences between the high-and low-risk groups with regard to stage (P < 0:05) and grade (P < 0:01) were significant in TCGA cohort. Moreover, we examined the relationship between the risk score groups, and immune cells were further analyzed in the ICGC and TCGA cohorts (Figures 4(g) and 4(h)). The results suggested that the differences between the high-and low-risk groups with regard to regulatory T cells (P < 0:001 ), naive B cells (P < 0:01), follicular helper T cells (P < 0:05 ), memory B cells (P < 0:01), and M0 macrophages (P < 0:01) were significant in the ICGC cohort. The results also suggested that the differences between the high-and low-risk groups with regard to CD8 T cells (P < 0:01), M0 macrophages (P < 0:001), and CD4 memory resting T cells (P < 0:01) were significant in TCGA cohort.

Nomogram Construction.
Based on the prognostic signature and clinical factors, such as gender, age, and stage, a nomogram was constructed (Figure 6(a)). The calibration curve was used to describe the prediction value of the

Discussion
Hepatocellular carcinoma (HCC) is a leading malignancy worldwide due to its high recurrence rate, high metastatic potential, and resistance to systematic therapy. However, the molecular mechanisms of HCC are unclear. The m 6 A RNA methylation is one of the most prevalent forms of RNA modifications. In recent decades, the highthroughput sequencing has revealed a significant role of m 6 A RNA methylation in HCC [17]. In the present study, we compared the expression levels of m 6 A RNA methylation regulators in tumor and normal tissues. The patients were divided into cluster 1 and cluster 2 according to consensus clustering. Based on the univariate Cox regression analysis and LASSO penalized Cox regression analysis, a prognostic signature was constructed with six m 6 A RNA methylation regulators and the signature was confirmed to be a significant independent prognostic. Next, the nomogram was developed with the prognostic signature and  Until now, m 6 A RNA methylation regulators have attracted the attention of the medical research community. The fourteen m 6 A RNA methylation regulators from literature were analyzed using univariate and LASSO penalized Cox regression analysis in HCC. The results showed that METTL3, YTHDC2, and YTHDF2 were independent highrisk regulators in the ICGC and TCGA cohorts. The METTL3 and YTHDC2, also called "writer," are core components of the m 6 A RNA methylation complex and involved in various biological processes. METTL3 was suggested to act as an oncogene in bladder cancer [18], breast cancer [19], ovarian carcinoma [20], and pancreatic cancer [21]. METTL3 has been also reported to be upregulated in HCC and associated with poor prognosis of HCC, and our findings confirmed the previous study that METTL3 plays an oncogenic role in HCC [11]. YTHDC2, the fifth member of the YTH protein family, was confirmed to be an oncogene in many cancers, and the expression level of YTHDC2 is high in several human cancer cells [22]. YTHDF2, termed "reader," is a first studied functional m 6 A-binding protein that mainly regulates the stability of mRNA [23]. It acts as a tumor suppressor in HCC [24],  Disease Markers acute myeloid leukemia [25], and pancreatic cancer [26]. These conclusions were consistent with the findings in our study. Here, we investigated that METTL3 and YTHDC2 were negatively correlated with the prognosis and YTHDF2 was positively associated with the prognosis. Our study also established the prognostic signature and nomogram based on m 6 A RNA methylation regulators for predicting the outcome of patients with HCC. The genomic signature and nomogram, integrated multiple biomarkers, are promising methods that would improve clinical management. Recently, a risk signature, using seven m 6 A RNA methylation regulators, was built to predict the clinical outcomes of gliomas [27]. METTL3 and YTHDF1 were identified as independent poor prognostic factors in HCC [12]. Nevertheless, the clinical factors should be also considered to ameliorate clinical therapy. In our study, the nomogram was constructed and validated based on the prognostic signature and clinical factors. Compared with other factors, the nomogram showed a more robust ability to predict the 3-and 5year OS.
There are still several limitations. At the beginning, the prognostic value of the signature was not yet confirmed by the validation experimental studies in HCC in vitro and in vivo, which was just validated by another online dataset. Second, the sample size of the training and validation cohorts is quite small. Further validations are awaited.
In all, we have performed the first signature based on the m 6 A RNA methylation regulator, as well as the construction of the nomogram based on the signature and clinical factors. Hence, this prognostic signature would be a useful marker for guiding the selection of individualizing therapy for HCC. Future studies should focus on the underlying molecular mechanisms.