Construction and Analysis of lncRNA-Mediated ceRNA Network in Nasopharyngeal Carcinoma Based on Weighted Correlation Network Analysis

Increasing evidence indicated that aberrant expression of long noncoding RNAs (lncRNAs) are involved in tumorigenesis of nasopharyngeal carcinoma (NPC). The purpose of this study was to construct a lncRNA-mediated ceRNA network based on weighted correlation network analysis (WGCNA). First, modules with highly correlated genes were identified from GSE102349 via WGCNA, and the preservation of the modules was evaluated by GSE68799. Then, the differentially expressed lncRNAs and mRNAs identified from GSE12452 which belonged to the same WGCNA modules and the differentially expressed miRNAs identified from GSE32960 were used to construct a ceRNA network. The prognostic value of the network was evaluated by survival analysis. Furthermore, a risk score model for predicting progression-free survival (PFS) of NPC patients was established via LASSO-penalized Cox regression, and the differences in the expression of the lncRNAs between high- and low-risk groups were investigated. Finally, 14 stable modules were identified, and a ceRNA network composed of 11 lncRNAs, 15 miRNAs, and 40 mRNAs was established. The lncRNAs and mRNAs in the network belonged to the turquoise and salmon modules. Survival analysis indicated that ZNF667-AS1, LDHA, LMNB2, TPI1, UNG, and hsa-miR-142-3p were significantly correlated with the prognosis of NPC. Gene set enrichment analysis indicated that the upregulation of ZNF667-AS1 was associated with some immune-related pathways. Besides, a risk score model consisting of 12 genes was constructed and showed a good performance in predicting PFS for NPC patients. Among the 11 lncRNAs in the ceRNA network, SNHG16, SNHG17, and THAP9-AS1 were upregulated in the high-risk group of NPC, while ZNF667-AS1 was downregulated in the high-risk group of NPC. These results will promote our understanding of the crosstalk among lncRNAs, miRNAs, and mRNAs in the tumorigenesis and progression of NPC.


Introduction
Nasopharyngeal carcinoma (NPC) is a fatal malignancy arising from the nasopharynx epithelium. on the RNA-Seq platform of GPL11154 (Illumina HiSeq 2000), including 113 NPC tissues, was used to conduct WGCNA. GSE68799, performed on the same platform as GSE102349, consisting of 42 NPC and 4 non-NPC tissues, was used for module preservation analysis [16]. GSE32960, performed on the miRNA pro ling platform of GPL14722 (CapitalBio, Inc., Beijing, China), containing 312 para n-embedded NPC and 18 para n-embedded normal nasopharyngeal tissues, was used to identify differentially expressed miRNAs (DEmiRNAs) between NPC and normal tissues. GSE12452 performed on the mRNA pro ling platform of GPL570 [HG-U133_Plus_2] (Thermo Fisher Scienti c, Inc.), comprising 31 NPC and 10 normal nasopharyngeal tissues, was used to identify differentially expressed genes (DEGs) between NPC and normal tissues. Since these data are downloaded from the public database, the consent of the ethics committee is not required.

Weighted correlation network analysis
WGCNA is a widely used method for nding modules of highly correlated genes across microarray samples. In this study, we perform WGCNA based on GSE102349 using "WGCNA" package in R software (version 3.5.0; http://www.r-project.org). The gene expression levels were measured by log2-transformed fragments per kilobase million (FPKM). Only genes with the highest 25% variance of expression values among samples were included in WGCNA. The outliers were identi ed and removed by hierarchical clustering analysis. Proper soft thresholding power was obtained based on the criterion of approximate scale-free topology. Next, "one-step network construction and module detection" function was used to construct a signed co-expression network and identify modules associated with NPC. The preservation of the modules was examined by GSE68799. The preservation value "medianRank" and "Zsummary.pres" (Z-score) were calculated using the nPermutations of 200. In general, if Z-score is lower than 2, the module will be regarded as no evidence of preservation, while Z-score>10 will be regarded as high preservation. Furthermore, the associations between each module and the clinical traits were analyzed by Pearson's correlation test. P < 0.05 was considered signi cant.

Differential expression analysis between NPC and normal tissues
To identify DEmiRNAs and DEGs between NPC and normal tissues, we performed differential expression analysis based on GSE32960 and GSE12452 using "limma" package [17]. Before analysis, miRNA expression values were log2-transformed and quantiled normalized across multiple arrays. Gene expression values of GSE12452 were calculated by the method described in our previous study [18]. The DEmiRNAs and DEGs with false discovery rate (FDR) <0.05 and |log fold change (FC)|>0.5 were considered signi cant and used for subsequent analysis.

Identi cation of lncRNAs and Construction of the ceRNA network
The lncRNAs and mRNAs in the WGCNA modules were de ned and annotated based on the Gencode lncRNAs annotation le (gencode.v31.long_noncoding_RNAs.gtf.gz) from the GENCODE database (https://www.gencode-genes.org). Following, a lncRNA-mediated ceRNA network was constructed based on the hypothesis that lncRNAs can interact with target miRNAs and invoked miRNA sponges to regulate the activity of mRNAs [11]. According to the hypothesis, both lncRNAs and mRNAs had a negative correlation with the corresponding miRNAs. We speculated that the mRNAs may be the target genes of lncRNAs that belong to the same WGCNA module, so we rst obtained up-and down-regulated DElncRNAs and DEmRNAs in NPC by taking the intersection of the DEGs and the genes from the WGCNA modules. Then, we used the validated lncRNA-miRNA interaction data from LncBase v.2 experimental module [14] to predict target miRNAs for DElncRNAs. The target miRNAs were matched with DEmiRNAs to acquire up-and down-regulated miRNAs for the ceRNA network. Subsequently, we used the experimentally veri ed miRNA-mRNA interaction data from miRTarBase database [15] to predict target mRNAs for those up-and down-regulated miRNAs. After that, the target mRNAs were matched with the DEmRNAs in the WGCNA modules to obtain mRNAs for the ceRNA network. Thus, a ceRNA network based on the WGCNA modules was constructed and visualized by Cytoscape v3.6 software. The work ow of building the ceRNA network was provided in Fig.S1.

Survival analysis for the lncRNAs, miRNAs and mRNAs in the ceRNA network
There were 88 NPC samples with progression-free survival (PFS) data in GSE102349. PFS was de ned as the time from the date of diagnosis to the date of objective tumor progression or death from any cause [19]. There were 312 NPC samples with prognostic information in the dataset GSE32960. The prognostic information included overall survival (OS), disease-free survival (DFS) and distant metastasis-free survival (DMFS). To explore the relationship between the expression level of lncRNAs, miRNAs and mRNAs in the ceRNA network and survival time, the samples were divided into two groups of high expression and low expression according to the median value. Then, the Kaplan-Meier survival curve combined with a Log-rank test was performed to compare the survival difference between the high-and low-expression groups. Survival analysis was implemented using "survival" and "survminer" packages [20,21]. P<0.05 was considered to be statistically signi cant.
2.6 Functional and pathway enrichment analysis for the genes in the WGCNA modules and ceRNA network To explore the potential biological functions for the genes in the WGCNA modules and lncRNA-mediated ceRNA network, the genes were submitted to GO and KEGG pathway enrichment analysis using ''clusterPro ler'' package [22]. For GO analysis, the cut-off criteria were P<0.05 and q<0.05. For KEGG analysis, the cut-off criterion was P<0.05.

GSEA for the prognostic lncRNAs in the ceRNA network
To explore the potential pathways of the prognostic lncRNAs in the ceRNA network, the samples in the GSE102349 and GSE12452 were categorized into two groups of high-expression and low-expression according to the median value of the prognostic lncRNAs. Then, GSEA was performed by "clusterpro ler" package using the annotated gene set, "c2.cp.kegg.v7.0.symbols.gmt" from the Molecular Signature Database (MSigDB, http://software.broadinstitute.org/gsea/msigdb/index.jsp) as the reference gene set. False discovery rate (FDR) < 0.05 was chosen as the threshold.

Construction of risk score model
To construct a risk score model for PFS prediction, the DEGs with false discovery rate (FDR) <0.05 and |log fold change (FC)|>1 in the high preserved WGCNA modules were subjected to lasso penalized Cox regression analysis by "glmnet" package [23,24] in R. The function "cv.glmnet" was used to compute 10fold cross-validation for the Cox model. "Lambda.min" was select as the optimal λ value. The prognostic genes and their coe cients were generated to calculate the risk score (RS) for each sample. The RS formula was constructed as follows: Risk score=∑(β×expr). Here, β represents the coe cient of a gene, and expr represents the expression level of the gene. According to the median value of the RS, 88 samples in GSE102349 were classi ed into high-and low-risk groups. Then, the Kaplan-Meier survival curve combined with a Log-rank test was performed to evaluate the survival differences between highand low-risk groups using "survival" and "survminer" packages. The receiver operating characteristic (ROC) curve, the area under the curve (AUC) and the C-index were applied to assess the prognostic accuracy of the risk score model using "survivalROC" package [25]. Besides, the distribution of the risk score, survival status and expression patterns of the prognostic gene signatures were explored in GSE102349. After that, the prognostic genes were subjected to univariate Cox analysis and multivariate Cox analysis. P<0.05 was considered statistically signi cant. Using "forestplot" package [26], a forest plot was used to illustrate the results of hazard ratio (HR), 95% CI (con dence interval) and corresponding Pvalue. Finally, we compared the expression level of the lncRNAs in the ceRNA network between high-and low-risk groups in GSE102349 using two-sided student' t test. P < 0.05 was considered to be statistically signi cant.

Weighted correlation network analysis
A weighted correlation network of NPC based on GSE102349 was constructed using "WGCNA" package. After removing three samples (GSM2735294, GSM2735309 and GSM2735313) which were considered abnormal after sample clustering analysis, 110 samples were retained for further analysis (Fig.S2). We chose the soft-thresholding power of β as three, which was the lowest power for which the scale-free topology t index (scale-free R 2 ) achieved 0.9 (Fig.S3). Then, a weighted correlation network with 14 modules was constructed. The sizes of the modules ranged from 52 (cyan module) to 2045 (turquoise module) genes. A hierarchical clustering dendrogram (tree) was generated with the color assignment (Fig.1a). The genes labeled with "gray" represented background genes that were not assigned to any coexpressive gene modules. Following, the preservation of the modules were examined using GSE68799. Results showed that the 14 modules were stable because their Z-scores were all higher than two (Fig.S4).
Among them, ten modules (blue, turquoise, yellow, brown, red, cyan, magenta, black, tan and purple) were high preserved (Z-score>10). Besides, we calculated the correlations between the 14 modules and the clinical traits (Fig.1b). The results showed that three modules (cyan, turquoise and black) were signi cantly correlated with clinical stage, four modules (tan, red, turquoise and magenta) were signi cantly correlated with the event, and four modules (brown, red, turquoise and black) were signi cantly correlated with time to event (P<0.05).

Survival analysis for the lncRNAs, miRNAs and mRNAs in the ceRNA network
Kaplan-Meier survival analysis was performed to explore the prognostic value of the lncRNAs, miRNAs and mRNAs in the ceRNA network. Results showed that four mRNAs, including LDHA, LMNB2, TPI1 and UNG, were negatively correlated with PFS, and lncRNA ZNF667-AS1 was positively correlated with PFS. Patients in the high-expression group of ZNF667-AS1 had longer PFS than those in the low-expression group (Fig.4a). Additionally, hsa-miR-142-3p was found to be positively correlated with OS, DFS and DMFS (Fig.4b).

Functional and pathway enrichment analysis for the genes in the WGCNA modules
The potential biological function of the genes in the WGCNA modules was investigated by GO and KEGG pathway enrichment analysis. Results indicated that the genes in the modules were associated with different functional categories and pathways ( Figure S5a-b). For example, "Wnt signaling pathway" was enriched in the black module; "Small cell lung cancer" was enriched in the brown module; "p53 signaling pathway" was enriched in the brown and tan modules; "PI3K-Akt signaling pathway" was enriched in the brown, turquoise and yellow modules; "Epstein−Barr virus infection" was enriched in the cyan and turquoise modules; "IL−17 signaling pathway" was enriched in the green modules; "Amoebiasis" was enriched in the purple, turquoise and yellow modules; "Cell cycle", "Oocyte meiosis", "DNA replication" were enriched in the tan module; "Protein digestion and absorption" was enriched in the yellow module.
3.6 Functional and pathway enrichment analysis for the genes in the ceRNA network Through GO and KEGG pathway enrichment analysis, we also studied the potential biological functions of the genes in the ceRNA network. Results showed that the genes were signi cantly enriched in 76 GO terms for biological processes (BP), two GO terms for molecular function (MF) and one GO terms for cellular components (CC) (Table S3). For BP, enriched terms included "nucleocytoplasmic transport", "nuclear transport", "regulation of organ morphogenesis", "regulation of morphogenesis of an epithelium" and "sulfur amino acid metabolic process". For CC, the enriched term was "nuclear periphery". For MF, the enriched terms included "ribosomal small subunit binding" and "histone kinase activity". Additionally, KEGG pathway enrichment analysis indicated that "Wnt signaling pathway" and "HIF-1 signaling pathway" were signi cantly enriched. We displayed the results as 4 networks that depict the representative GO terms and KEGG pathways (Fig.5).

GSEA for ZNF667-AS1
As shown in the above analysis, of the 11 lncRNAs, only ZNF667-AS1 was correlated with PFS signi cantly. To better understand the role of ZNF667-AS1 in NPC, GSEA was performed on GSE102349 and GSE12452. The results showed that 58 pathways were enriched in GSE102349 (Table S4), and 54 pathways were enriched in GSE12452 (Table S5). Further analysis revealed that these two data sets shared 44 pathways. Fig.6 showed the top ve activated pathways and ve suppressed pathways when ZNF667-AS1 was up-regulated and down-regulated. The up-regulation of ZNF667-AS1 was associated with "intestinal immune network for IgA production", "hematopoietic_cell_lineage", "B cell receptor signaling pathway", "T cell receptor signaling pathway", "leukocyte transendothelial migration" and other pathways. The down-regulation of ZNF667-AS1 was related to "DNA replication", "cell cycle", "P53 signaling pathway", "spliceosome" and other pathways.
According to the median value of RSs, 88 samples in the dataset GSE102349 were separated into highand low-risk groups. Kaplan-Meier analysis showed that patients in the high-risk group had poorer PFS than those in the low-risk group (Fig.7c). Time-dependent ROC analysis at varying follow-up times was used to explore the prognostic capacity of the risk score model for PFS. Results showed that the AUC received 0.932, 0.943 and 0.967 at 12-, 24-and 36-months (Fig.7d). The C-index for the model was 0.891. Therefore, this model has good PFS prediction ability in GSE102349. The distribution of the risk score indicated patients in the high-risk group had a worse outcome of PFS than those in the low-risk group (Fig.7e-f), and the former tended to have overexpression of RRM2, MANSC1, MLF1, WDR54, MNS1 and NFE2L3, whereas the latter tended to have overexpression of VILL, CYP4B1, LXN, CRIP1, CNN3 and CTHRC1 (Fig.7g). Besides, multivariate Cox regression analysis indicated that MANSC1, CYP4B1, MLF1, CRIP1, MNS1, CNN3 and CTHRC1 were independent prognostic factors associated with the DFS in NPC (Fig.7h). Among the 11 lncRNAs in the ceRNA network, SNHG16, SNHG17 and THAP9-AS1 were upregulated in the high-risk group of NPC, while ZNF667-AS1 was down-regulated in the high-risk group of NPC (Fig.8).

Discussion
LncRNAs, function as ceRNAs by binding miRNAs, have been reported to be involved in the physiological and pathological processes of various diseases. One of the most effective methods to predict the function of ceRNAs was to construct a ceRNA network based on the high-throughput data together with bioinformatic tools and computational approaches [27]. However, the comprehensive analysis of lncRNAmediated ceRNA regulatory network in NPC remains scarce.
In the present study, we conducted WGCNA using RNA-Seq data of GSE102349 to enrich modules associated with NPC. Finally, we built a weighted correlation network composed of 14 modules. Through enrichment analysis, we found that the modules were related to some terms and pathways that had been previously reported by us or other researchers, such as "Wnt signaling pathway", "Small cell lung cancer", "PI3K−Akt signaling pathway", "Epstein−Barr virus infection", "Cell cycle", and "DNA replication" [18, 28,

29].
To better understand the potential roles of lncRNAs in NPC, we identi ed lncRNAs from the WGCNA modules. Combining with mRNA expression pro le data, we identi ed the up-and down-regulated lncRNAs and mRNAs in the same module. The mRNAs in the same module may be the regulatory targets for lncRNAs, so we used the lncRNAs and mRNAs in the same module to construct a ceRNA network. To improve the reliability of the ceRNA network, we identi ed the DEmiRNAs from GSE32960. Following, LncBase v.2 database was utilized to predict target miRNAs for lncRNAs, and miRTarBase was utilized to predict target mRNAs for miRNAs. Finally, we developed a ceRNA network composed of 11 lncRNAs, 15 miRNAs and 40 mRNAs from turquoise and salmon modules. It is noteworthy that the genes in the ceRNA network were signi cantly enriched in the "Wnt signaling pathway" and "HIF-1 signaling pathway" which had been showed to be relate to the development of NPC. For instance, Wang et al. [30] revealed that Wnt/β-catenin signaling (including β-catenin, cyclin D1, c-Myc, and MMP-7) and p-eIF4E expression were elevated in NPC compared with non-cancerous nasopharyngeal epithelial tissues and associated with clinical characteristics of NPC patients. Sung et al. [31] showed that HIF-1 alpha, HIF-2 alpha, CA IX, and VEGF were frequently coexpressed in NPC biopsies, and associated with poor outcomes after radiotherapy.
Among the 40 mRNAs in the ceRNAs, four mRNAs were found to have a signi cant impact on the prognosis of NPC, including LDHA, LMNB2, TPI1 and UNG. High expression of LDHA, LMNB2, TPI1 and UNG indicated unfavorable outcomes in patients with NPC. LDHA is up-regulated in NPC tissues and cells, and it was reported to be an independent adverse prognostic factor of NPC [32,33]. Through inhibition of LDHA, miR-34b-3 and miR-449a suppressed NPC progression and metastasis [34]. Therefore, the ceRNA network was reliable because the regulatory relationship between LDHA and miR-449a was indicated in our ceRNA network. LMNB2, a B type nuclear lamin, binds to the C-terminus of MCM7 and competes with the binding of the tumor suppressor RB protein, thus regulates human non-small cell lung cancer progression [35]. TPI1, an enzyme that catalyzes the interconversion of DHAP and G3P in glycolysis and gluconeogenesis, might be a novel prognostic factor to evaluate gastric cancer patients' survival [36,37]. UNG is a critical mediator of pemetrexed sensitivity in lung cancer [38]. The role and mechanism of LMNB2, TPI1 and UNG in NPC remain elusive.
Among the 15 miRNAs in the ceRNAs, hsa-miR-142-3p was demonstrated to be down-regulated in NPC, and a high level of hsa-miR-142-3p indicated favorable prognosis, which was consistent with the previous studies [39,40]. Li Y et al. revealed that miR-142-3p was epigenetically silenced by EZH2-recruited DNMT1 and suppress NPC cell metastasis and EMT through targeting ZEB2 [41]. However, Qi et al. reported that miR-142-3p inhibits the expression of SOCS6 and promotes cell proliferation in NPC [42]. Therefore, the role of miR-142-3p in the progression of NPC deserves further studied.
Among the 11 lncRNAs in the ceRNAs, ZNF667-AS1 was found to be related to the PFS of patients with NPC. ZNF667-AS1, also known as MORT, is located in 19q13. 43. Dysregulated expression of ZNF667-AS1 has been reported in many tumors, including breast cancer, cervical cancer, laryngeal squamous cell carcinoma and esophageal squamous cell carcinoma [43][44][45][46]. Li et al. found that ZNF667-AS1 reduces tumor invasion and metastasis in cervical cancer by counteracting microRNA-93-3p-dependent PEG3 downregulation [44]. Dong et al. demonstrated that aberrant hypermethylation-mediated downregulation of ZNF667-AS1 and ZNF667 correlate with progression and prognosis of esophageal squamous cell carcinoma [45]. Further, ZNF667-AS1 was revealed to be silenced by aberrant DNA methylation in 22 of 33 of TCGA cancer types [47]. In this study, ZNF667-AS1 was shown to be down-regulated in NPC compared with normal tissues, and it was down-regulated in the high-risk patients with poor prognosis.
The expression level of ZNF667-AS1 was positively correlated with PFS of NPC. Combined with the above literature, we speculated that the down-regulation of ZNF667-AS1 might also be caused by methylation in NPC, but it needs further molecular experiments to prove this hypothesis. Besides, we found that ZNF667-AS1 may regulate the expression of PRKCB and PAX5 through competitively binding to hsa-miR-574-5p. What's more, GSEA analysis showed that ZNF667-AS1 was associated with some important pathways related to tumorigenesis. Therefore, ZNF667-AS1 might act as a tumor suppressor and participate in the occurrence and development of NPC. To the best of our knowledge, the present study was the rst to demonstrate the relation between ZNF667-AS1 and NPC using bioinformatics analysis.
In the present study, we constructed a risk score model for predicting PFS of NPC patients. This model consists of 12 genes RRM2, VILL, MANSC1, CYP4B1, LXN, MLF1, CRIP1, WDR54, MNS1, CNN3, CTHRC1 and NFE2L3. Among them, RRM2, MANSC1, MLF1, WDR54, MNS1 and NFE2L3 were associated with high-risk of poor prognosis, while VILL, CYP4B1, LXN, CRIP1, CNN3 and CTHRC1 were associated with low-risk of poor prognosis. Multivariate Cox regression analysis indicated that MANSC1, CYP4B1, MLF1, CRIP1, MNS1, CNN3 and CTHRC1 were independent prognostic factors associated with the DFS in NPC. Kaplan-Meier survival analysis, ROC curve and the C-index demonstrated that the predictive capability of the risk model was successfully in GSE102349. Among the prognostic genes in the model, RRM2 encodes a subunit of ribonucleotide reductase, which catalyzes the conversion of ribonucleotides into deoxyribonucleotides. Overexpression of RRM2 predicts unfavorable prognosis for patients with NPC [48]. CYP4B1 encodes a member of the cytochrome P450 superfamily of enzymes, which catalyze many reactions involved in drug metabolism and synthesis of cholesterol, steroids and other lipids. Although CYP4B1 was detected in NPC tissues [49], the role of CYP4B1 is little known. In addition to RRM2 and CYP4B1, the relationship between NPC and other prognostic genes in the risk model has not been reported in the literature.
What's more, we analyzed the expression differences of lncRNAs in the ceRNA network between two risk groups. Results showed that SNHG16, SNHG17 and THAP9-AS1 were up-regulated in the high-risk group of NPC, while ZNF667-AS1 was down-regulated in the high-risk group of NPC. SNHG16 was regarded as an oncogene and associated with neuroblastoma, bladder cancer, colorectal cancer, esophageal squamous cell carcinoma, hepatocellular carcinoma [50][51][52]. Zhang et al. suggested that SNHG16 promotes tumor progression through acting as an endogenous "sponge" by competing with miR-140-5p, thereby regulating target ZEB1 [51]. SNHG17 was reported to promote gastric cancer progression by epigenetically silencing of p15 and p57, and it was an unfavorable prognostic factor in colorectal cancer and gastric cancer [53,54]. Extensive overexpression of THAP9-AS1 was observed in pancreatic ductal adenocarcinoma which is associated with poor clinical outcomes [55]. Function as ceRNA, THAP9-AS1 facilitated YAP expression by sequestrating miR-484, and it can bind YAP to inhibit phosphorylationmediated inactivation by LATS1 [55]. The function and mechanism of those identi ed lncRNAs in NPC warrant further study.
It should be acknowledged that there were certain limitations to the present study. Firstly, only 11 differential expressed lncRNAs were obtained from the integrated analysis of RNA-seq data and mRNA pro ling data. Some lncRNAs may be ignored, for only a portion of the possible lncRNAs was included in HG-U133 Plus 2.0 platform. Secondly, the sample size of GSE102349 was relatively limited. There were only 113 NPC samples included in GSE102349, and only 88 samples had survival information, thus research with large sample sizes, high-quality high-throughput data are required to verify our ndings in the future. Thirdly, due to the lack of important clinical information such as gender and age, we can't con rm the independence of the risk model. Moreover, due to the absence of similar NPC public datasets with survival information, external validation was not conducted. Finally, although we have constructed a ceRNA network using multiple datasets and experimental-based databases, the regulatory relationship among lncRNAs, miRNAs and mRNAs still require to verify using molecular experiments in vivo and in vitro.

Conclusions
Collectively, we constructed a ceRNA network based on WGCNA. The network can re ect the mutual regulated relationships among lncRNAs, miRNAs and mRNAs, and it can re ect the changes in their expression level between NPC and normal tissues. This ceRNA network provided novel insights into the understanding of the crosstalk among lncRNAs, miRNAs and mRNAs in the tumorigenesis and progression of NPC. Moreover, we developed a LASSO penalized Cox regression model for NPC that may contribute to predict PFS of NPC patients. Furthermore, we identi ed several potential prognostic biomarkers and high-risk biomarkers from the ceRNA network. Further evaluation of those biomarkers is needed by well-designed scienti c experiments in the future.

Declarations
Ethics approval and consent to participate Not applicable.

Consent for publication
Not applicable.

Availability of data and materials
The data that support the ndings of this study are openly available in the Gene Expression Omnibus database (GEO) at http://www.ncbi.nlm.nih.gov/geo/.

Competing interests
The authors declare that they have no competing interests.      penalized Cox regression of PFS, the RS for each sample was calculated. Then, according to the median value of RS, the samples were separated into high-and low-risk groups. Kaplan-Meier survival analysis for the risk score model was shown in (c). Time-dependent ROC curve analysis for the risk score model was shown in (d). Risk score distribution of samples was shown in (e). Survival pro le of samples was

Funding
shown in (f). A heat map was generated to show the expression of 12 prognostic genes in NPC (g). A forrest plot of the univariate and multivariate Cox regression analysis for the prognostic genes associated with PFS was shown in (h). NPC: nasopharyngeal carcinoma; CI: con dence interval; PFS: progressionfree survival.

Figure 8
Comparison of the expression level of the lncRNAs in the ceRNA network between high-and low-risk groups in NPC.