Identification of a Novel Eight-lncRNA Prognostic Signature for HBV-HCC and Analysis of Their Functions Based on Coexpression and ceRNA Networks

Studies have demonstrated the prognosis potential of long noncoding RNAs (lncRNAs) for hepatocellular carcinoma (HCC), but specific lncRNAs for hepatitis B virus- (HBV-) related HCC have rarely been reported. This study was aimed at identifying a lncRNA prognostic signature for HBV-HCC and exploring their underlying functions. The sequencing dataset was collected from The Cancer Genome Atlas database as the training set, while the microarray dataset was obtained from the European Bioinformatics Institute database (E-TABM-36) as the validation set. Univariate and multivariate Cox regression analyses identified that eight lncRNAs (TSPEAR-AS1, LINC00511, LINC01136, MKLN1-AS, LINC00506, KRTAP5-AS1, ZNF252P-AS1, and THUMPD3-AS1) were significantly associated with overall survival (OS). These eight lncRNAs were used to construct a risk score model. The Kaplan-Meier survival curve results showed that this risk score can significantly differentiate the OS between the high-risk group and the low-risk group. Receiver operating characteristic curve analysis demonstrated that this risk score exhibited good prediction effectiveness (area under the curve (AUC) = 0.990 for the training set; AUC = 0.903 for the validation set). Furthermore, this lncRNA risk score was identified as an independent prognostic factor in the multivariate analysis after adjusting other clinical characteristics. The crucial coexpression (LINC00511-CABYR, THUMPD3-AS1-TRIP13, LINC01136-SFN, LINC00506-ANLN, and KRTAP5-AS1/TSPEAR-AS1/MKLN1-AS/ZNF252P-AS1-MC1R) or competing endogenous RNA (THUMPD3-AS1-hsa-miR-450a-TRIP13) interaction axes were identified to reveal the possible functions of lncRNAs. These genes were enriched into cell cycle-related biological processes or pathways. In conclusion, our study identified a novel eight-lncRNA prognosis signature for HBV-HCC patients and these lncRNAs may be potential therapeutic targets.


Introduction
Liver cancer, 90% of which is hepatocellular carcinoma (HCC), is one of the most common malignancies and the leading cause of cancer-related deaths worldwide [1,2]. Despite great advances made in therapeutic approaches, relapse, progression, and metastasis rates remain high, which leads to poor prognostic outcomes [3]. Hepatitis B virus (HBV) infection is recognized as the most important risk factor associated with the biological aggressiveness of HCC and patients' survival [4]. Therefore, it may be an important issue to screen prognostic biomarkers and therapeutic targets for patients with HBV-HCC in order to optimize treatment, prevent progression, and improve the prognosis.
Recently, more and more evidence has highlighted the important roles of long noncoding RNAs (lncRNAs) in HBV-HCC via directly influencing the expression of their neighboring genes (coexpression hypothesis) or acting as microRNA (miRNAs) sponges to negatively regulate the expression of miRNA target genes (competing endogenous RNA (ceRNA) hypothesis) [5]. For example, Chen et al. found that LINC01152 was significantly upregulated in HBV-positive HCC tissues and cells treated by HBV X protein (HBx). Overexpression of LINC01152 could increase HCC cell proliferation and promote tumor formation in nude mice by directly binding to the promoter region of interleukin-(IL-) 23 to increase its transcription [6]. Hu et al. identified that lncRNA WEE2-AS1 was overexpressed in HCC tissues and was positively correlated to HBV infection. Compared with the low expression and HBV-negative groups, HBV-HCC patients with high expression of WEE2-AS1 exhibited the worst prognosis. WEE2-AS1 accelerated the proliferation, migration, invasion, and cell cycle progression of HCC cells by upregulating the downstream Fermitin family member 3 (FERMT3) [7]. Lin et al. observed that the expression of miR143HG was markedly decreased in HCC tissues and its expression was associated with the presence of HBV infection. Highly expressed miR143HG can serve as an independent prognostic factor to predict a good prognosis. In vitro analysis revealed that miR143HG may exert its tumor suppressor roles by sponging miR-155 and then promoting the expression of miR-155 target gene adenomatous polyposis coli (APC) which can inhibit the Wnt/β-catenin signaling pathway [8]. Fan et al. demonstrated that the expression of lncRNA n335586 was significantly increased in HBV-positive HCC tissues and cells. lncRNA n335586 may contribute to the migration, invasion, and metastasis of HCC cells by facilitating the expression of its host gene creatine kinase, mitochondrial 1A (CKMT1A), through competitively binding to miR-924 [9]. These findings suggest that lncRNAs may be potential therapeutic targets and prognostic biomarkers for HBV-HCC patients. However, the lncRNAs specifically associated with HBV-HCC remain rarely reported and most of these studies focused on the prognostic roles of single lncRNA, and the attempt to identify a lncRNA prognostic signature for HBV-HCC was limited [10].
The present study is aimed at further identifying a lncRNA prognostic signature for HBV-HCC patients through using the sequencing data in The Cancer Genome Atlas (TCGA) database and a microarray dataset deposited in the European Bioinformatics Institute (EMBL-EBI) which had the larger sample size compared with the dataset used by Liu et al. [10] (44 vs. 19). Moreover, the potential functions of lncRNAs were predicted by constructing a ceRNA regulatory network, in addition to a coexpression network which was used in the study of Liu et al. [10]. Hereby, our study may provide a novel lncRNA prognostic signature and identify potential therapeutic targets for HBV-HCC patients.  [11] to serve as the validation set. This dataset included 65 samples, 44 of which were HBV-positive and had survival information.
2.3. Identification of a Prognostic lncRNA Signature Using the Training Set. The survival package (version 2.41-1; http:// bioconductor.org/packages/survivalr/) was used for all statistical analyses. First, univariate Cox regression analysis was performed to mine DEGs, DELs, and DEMs that were associated with overall survival (OS) in the training set (log-rank p value < 0.05). Second, multivariate Cox regression analysis was conducted to further select independent prognostic lncRNAs, which were used to construct a lncRNA prognostic signature model according to the following formula: where Exp lncRNA was the expression level of DELs and β lncRNA was the regression coefficient for the DELs in multivariate Cox hazard model analysis. Third, using the median risk score as the threshold, the HBV-HCC patients were divided into the low-risk and high-risk groups. The Kaplan-Meier (K-M) survival curves were used to calculate the differences in OS between the highand low-risk groups. The receiver operating characteristic (ROC) curve and area under the curve (AUC) were also applied to assess the predictive performance of the prognostic risk score.

Validation of the Prognostic
Power of the Risk Score Using the Validation Dataset. The risk score identified in the training dataset was further calculated for the validation dataset. Patients were also divided into the high-risk and low-risk groups according to the risk score. The K-M survival and ROC curves were analyzed.
2.5. The Prognostic Independence of the Risk Score. Univariate and multivariate Cox regression analyses were conducted to assess whether the prognostic performance of signature risk score was independent of other clinical characteristics (including age, gender, neoplasm histologic grade, pathologic TNM stage, vascular invasion, and recurrence) for patients with HBV-HCC.  [14] database was used to predict the interactions between signature DELs and prognostic DEMs. The starBase database (version 2.0; http://starbase.sysu.edu .cn/) [15] was used to predict the interactions between prognostic DEMs and prognostic DEGs. The DEL-DEM and DEM-DEG interactors which had opposite expression trend were integrated to construct the ceRNA network, which was visualized using the Cytoscape software (version 3.6.1; http://www.cytoscape.org/) [16]. The coexpression network was constructed based on the correlation between signature DELs and DEGs which was calculated using the tcor.test function (https://stat.ethz.ch/ R-manual/R-devel/library/stats/html/cor.test.html) in R to generate the Pearson correlation coefficients (PCC). Only the coexpression pairs with PCC > 0:4 were selected to draw the network using Cytoscape (version 3.4; http://www.cy toscape.org/) [16].

Function and Pathway Enrichment.
Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed using clusterProfiler (version 3.6.0; http://bioconductor.org/ packages/release/bioc/html/clusterProfiler.html) to reveal the functions of DEGs in the lncRNA-related coexpression and ceRNA networks [17]. FDR < 0:05 was set as the cut-off value.

Identification of a lncRNA Signature with Prognostic
Value. A total of 260 differentially expressed RNAs were identified to be significantly associated with OS by univariate Cox regression analysis, including 189 DEGs, 39 DELs, and 32 DEMs. To determine the optimal prognostic lncRNAs, multivariate Cox proportional hazards regression was further adopted to evaluate the prognostic independence of the above-identified 39 prognostic lncRNAs. As a result, 8 lncRNAs (TSPEAR-AS1, LINC00511, LINC01136, MKLN1-AS, LINC00506, KRTAP5-AS1, ZNF252P-AS1, and THUMPD3-AS1) were found to be still significant in the multivariate analysis (Table 1), indicating that these lncRNAs were independent prognostic biomarkers. As shown in Table 1, LINC00511, LINC01136, LINC00506, and THUMPD3-AS1 may be prognostic risk factors for OS because of negative coefficient and HR > 1 (that is, patients with high expression of lncRNAs had shorter survival) while TSPEAR-AS1, MKLN1-AS, KRTAP5-AS1, and ZNF252P-AS1 may be prognostic protective factors due to negative coefficient and HR < 1 (that is, patients with high expression of lncRNAs had longer survival).

Independence of the Eight-lncRNA Signature for Survival
Prediction. Univariate and multivariate Cox regression analyses were also performed to further assess whether the prognostic value of the eight-lncRNA signature was independent of other clinical variables. As expected, the risk score status was identified as a significant prognostic factor in both of the univariate and multivariate Cox regression analyses (Table 2), indicating the independence of the eight-lncRNA signature for prognostic prediction.

Discussion
In the present study, an eight-lncRNA signature (TSPEAR-AS1, LINC00511, LINC01136, MKLN1-AS, LINC00506, KRTAP5-AS1, ZNF252P-AS1, and THUMPD3-AS1) was identified as an independent prognostic biomarker for HBV-positive HCC patients. ROC curve analysis showed that the risk score established by this eight-lncRNA signature had high prediction accuracy for OS, with AUC of 0.990 for the training set and 0.903 for the validation set, respectively. Further mechanism analyses uncovered upregulated THUMPD3-AS1 may function as a ceRNA to sponge hsa-miR-450a and then regulate TRIP13. In addition, THUMPD3-AS1 may also directly regulate TRIP13 by coexpression with it. The other lncRNAs may exert protective or risk roles for the development of HBV-positive HCC mainly by coexpression with their downstream genes, such as LINC00511-CABYR, LINC01136-SFN, LINC00506-ANLN, and KRTAP5-AS1/TSPEAR-AS1/MKLN1-AS/ZNF252P-AS1-MC1R. These genes were involved in cell cycle-related GO terms or KEGG pathways.
Although TCGA data with essentially the same samples were also used as the training set to screen the lncRNA signature in the study of Liu et al. [10], there were no common lncRNAs between our identified lncRNA signature and that of Liu et al. [10] (DGCR9, GBA3, HCG4, NAT8B, NBR2, PART1, RFPL1S, SLC22A18AS, and TCL6). This may be attributed to the analysis method difference (i.e., only lncRNAs in partial module were used for univariate Cox regression in Liu et al. [10], but all DELs in our study). Furthermore, only TSPEAR-AS1 (upregulated, protective) [18] and LINC00511 (upregulated, risk) [19], with the prognosis trend in line with ours, were reported as prognostic factors for the whole HCC, while KRTAP5-AS1 (upregulated, protective; similar to ours) [20] and THUMPD3-AS1 (upregulated, protective; contrast to ours) [21] were identified to be associated with OS in papillary thyroid carcinoma and lung adenocarcinoma, respectively. No studies investigated the prognostic values of LINC01136, MKLN1-AS, LINC00506, and ZNF252P-AS1. These findings indicated that our lncRNA signature may be a first-time identified biomarker of the prognosis for HBV-HCC.
Besides, the predictive accuracy of our eight-lncRNA signature was comparable with that of 9-lncRNA signature for HBV-HCC identified by Liu       to suppress its expression and then promoted the upregulation of CDK6, a direct target of miR-29c, leading to an increase in the cell viability of breast cancer cells [28]. Lu et al. concluded that LINC00511 promoted the proliferation, sphere-formation ability, and tumor growth in breast cancer cells by functioning as a ceRNA for miR-185-3p to positively recover E2F1 protein and promote the transcription of Nanog gene [29]. Furthermore, LINC00511-hsa-miR-29b-3p-VEGFA (vascular endothelial growth factor A) [30], LINC00511-miR-765-LAMC2 (laminin subunit gamma 2) [31], and LINC00511-miR-124-3p-CCND2 (cyclin D2) [32] ceRNA axes were also reported for pancreatic ductal adenocarcinoma, tongue squamous cell carcinoma, and glioma, respectively. But the coexpression mechanism of LINC00511 was rarely reported [33]. In this study, we speculated LINC00511 may coexpress with CABYR. Although no experiments were performed to confirm their relationships, previous studies on the roles of CABYR may indirectly explain our conclusion. For example, Li et al. found that the mRNA and protein levels of CABYR-c were significantly higher in HCC tissues than those in the adjacent noncancerous tissues. Silencing of CABYR-c by antisense oligonucleotides significantly inhibited the cell growth of HepG2 cells and arresting the cell cycle [34]. Similarly, knockdown of CABYR-a/b was also reported to increase the cell apoptosis and enhance the chemosensitivity [35]. In line with these studies, our study also identified that LINC00511 and CABYR were upregulated and predicted a poor prognosis.
THUMPD3-AS1 was not included in the ceRNA network of Li et al. [21], and thus, its potential function remained unknown. In this study, we predicted that upregulated THUMPD3-AS1 may directly coexpress with TRIP13 to promote its upregulated expression or regulate it by sponging hsa-miR-450a. Although no research was conducted to verify their interaction relationships, previous studies on the expression and roles of these miRNA and mRNA may indirectly prove our speculation. Ju et al. detected that the expression of TRIP13 was upregulated in HCC tissues and cell lines. Patients with higher expression levels of TRIP13 had significantly shorter survival periods [36]. Downregulation of TRIP13 inhibited HCC cell proliferation, migration, and invasion, promoted apoptosis and cell cycle arrest at Sphase in vitro [36], and suppressed the formation of tumor in vivo [37]. The study of Weng et al. revealed that miR-450a was significantly downregulated in HCC tissues and cells. Ectopic expression of miR-450a in HepG2 cells caused an inhibition of cell proliferation, the mechanism of which was related to downregulation of DNA methyltransferase 3a [38].   :0000279~M phase  1.06E-14   EXO1, CDK1, XRCC2, KIF18A, CHEK1, ANLN, BIRC5, AURKB,  PTTG1, CEP55, UBE2C, RAD51, CCNB1, SPC25, CDCA8, MC1R,  BUB1B, RAD54B, SKA1, DSCC1, TRIP13   GO:0022403~cell cycle phase 5.37E-14   EXO1, CDK1, XRCC2, KIF18A, CHEK1, ANLN, BIRC5, AURKB,  PTTG1, CEP55, UBE2C, RAD51, CCNB1, SPC25, CDCA8, MC1R,  MTBP, BUB1B, RAD54B, SKA1, DSCC1, TRIP13   GO:0007049~cell cycle  1.17E-12   XRCC2, CHEK1, ANLN, CEP55, AURKB, PTTG1, CCNE2, SPC25,  CDCA8, MC1R, HJURP, MTBP, SKA1, TRIP13, EXO1, CDK1, KIF18A,  BIRC5, MCM2, UBE2C, RACGAP1, RAD51, CCNB1,   LINC01136 was a first-time identified oncogene in our study, and its role was unknown. We predicted it may exert tumor-promoting roles by coexpressing with SFN which had been demonstrated to be oncogenic previously as follows: SFN protein was previously identified to be upregulated in HCC via proteomics analyses [39]. SFN guided cell migration and invasion of breast cancer cells by stabilizing a complex of soluble actin [40]. Suppression of SFN expression by siRNA significantly reduced proliferation activity and the Sphase subpopulation of human lung adenocarcinoma cells and blocked tumor development in mice [41]. By analysis of ten datasets, Hu et al. found that high expression of SFN was significantly associated with worse OS in patients with ovarian cancer [42]. In agreement with these studies, we also found that SFN was upregulated and predicted a poor prognosis in HBV-HCC.
The function of LINC00506 was also not reported previously. We predicted it coexpressed with ANLN to promote carcinogenesis. ANLN was detected to be highly expressed in HCC tissues and cells infected with HBV. Higher ANLN expression was associated with a worse prognosis. Moreover, inhibition of ANLN resulted in growth restraint, colony formation reduction, and apoptosis activation [43] as well as survival time increase [44]. In accordance with these studies, we also found ANLN was highly expressed in HBV-HCC tissues and related to a poor prognosis.

BioMed Research International
The functions of TSPEAR-AS1 and KRTAP5-AS1 had been predicted based on the ceRNA network previously. For example, TSPEAR-AS1 may sponge miR-424 to regulate the expression of G protein subunit alpha L (GNAL) [18] and lncRNA KRTAP5-AS1 could act as a ceRNA to affect the functions of Claudin-4 [45]. However, their roles remain In this study, our study suggested that TSPEAR-AS1 and KRTAP5-AS1 as well as MKLN1-AS/ZNF252P-AS1 may exert protective roles for HBV-HCC by coexpressing with MC1R. The studies on MC1R mainly focused on melanoma, and the results indicated that patients had significantly longer OS in melanoma tumors with lower expression of MC1R [46], which was opposite with our expectation. Thus, further studies on MC1R in other cancers especially HBV-HCC are necessary.
There were some limitations in this study. First, the sample size was small. Therefore, large HBV-HCC cohorts should be collected to confirm the prognosis values of our lncRNA signature. Second, this study was only preliminary to predict the functions of lncRNAs in HBV-HCC and further wet experiments should be performed to verify their ceRNA (luciferase assay, overexpression or silencing) or coexpression (coimmunoprecipitation) mechanisms. Third, previous studies have demonstrated that the gene profiles induced by HBV, HCV, and other etiologies were different [47][48][49]. Thus, theoretically, a distinct signature or results using the same signature may be obtained when a similar analysis was performed [50], which should be confirmed in subsequent studies. Fourth, due to the limited description of treatment regimens in TCGA data, we could not investigate the influence of therapies (hepatectomy, radiofrequency ablation, transarterial embolization, and others) on the prognosis of HBV-HCC patients.