An Immune-Related lncRNA Pairing Model for Predicting the Prognosis and Immune-Infiltrating Cell Condition in Human Ovarian Cancer

Ovarian cancer is the second common cancer among the gynecological tumors. It is difficult to be found and diagnosed in the early stage and easy to relapse due to chemoresistance and deficiency in choices of treatment. Therefore, future exploring the biomarkers for diagnosis, treatment, and prognosis prediction of ovarian cancer is significant to women in the world. We downloaded data from TCGA and GTEx and used R “limma” package for analyzing the differentially expressed immune-related lncRNA in ovarian cancer and finally got 7 downregulated and 171 upregulated lncRNA. Then, we paired the differentially expressed immune-related lncRNA and constructed a novel lncRNA pairing model containing 7 lncRNA pairs. Based on the cut-off point with the highest AUC value, 102 patients were selected in high-risk group and 272 in low-risk group. The KM analysis suggested that the patients in the low-risk group had a longer overall survival. Future analysis showed the correlations between risk scores and clinicopathological parameters and infiltrating immune cells. In conclusion, we identified an immune-related lncRNA pairing model for predicting the prognosis and immune-infiltrating cell condition in human ovarian cancer, which thus further can instruct immunotherapy.


Introduction
As reported, there were 313,959 cases of newly diagnosed ovarian cancer (OC) and 207,252 new deaths for OC in 2020 [1]. OC is the second common cancer to cause large death among the gynecological tumors, difficult to be found and diagnosed in the early stage of disease, deficient in choices of treatment, and easy to relapse [2][3][4][5]. Therefore, future exploring the biomarkers to diagnose, treat, and predict the prognosis of OC is of significance to women.
Tumor immune-infiltrating cells are strongly correlated with cancer prognosis and response to therapy. Ye et al. reported that tumor immune-infiltrating cells, especially neutrophils, Tregs, and macrophages, affected the clinical outcome in patients with colorectal cancers and could be markers to predict the prognosis and response to therapy [6]. Qi et al. also reported that the accumulation of CD39 + CD8+ T cells in tumor microenvironment indicated poor prognosis in clear cell renal cell carcinoma and benefit of tyrosine kinase inhibitors therapy [7]. And different methods to target different immune-infiltrating cells, such as lymphocytes [8,9], dendritic cells [10], and nature killer cells [11], in the TME have been a popular and effective therapy.
As we know, long noncoding RNA (lncRNA) is a noncoding RNA with a length of more than 200 nucleotides. Studies have shown that lncRNA plays an important role in many life activities, such as dose compensation effect, epigenetic regulation, cell cycle regulation, and cell differentiation regulation, and has become a hot spot in genetic research [12]. For example, lncRNA MALAT1 plays an antiapoptotic and anti-inflammatory role in the brain microvascular system to reduce ischemic cerebrovascular and parenchymal damage, which can be a therapeutic target to minimize brain damage after stroke [13]. A recent study reported that lncRNA H19X could regulated the expression of TGF-β, regulating differentiation and survival of myofibrillar cells [14].
Here, we used a new algorithm to explore a new immune-related lncRNA pairing model for predicting the prognosis and immune-infiltrating cell condition in human OC based on The Cancer Genome Atlas (TCGA) and The Genotype-Tissue Expression (GTEx) database, which can help a lot in the improvement of prognosis prediction and immune therapy.

Materials and Methods
The data analysis steps are in Figure 1.

Data Preparation and Differentially Expressed Analysis.
The RNA-seq data and clinical information of OC patients were gotten from https://portal.gdc.cancer.gov/repository. The RNA-seq data of patients having normal ovarian tissue were gotten from https://xenabrowser.net/datapages/. Data type was HTseq-FPKM, and gene expression level in both two databases was further processed by log2 (FPKM+1). The cases without clinical information and the repeated cases were removed. The data from Ensembl (http://asia .ensembl.org) were taken for RNA-type annotation. Genes in ImmPort database (http://www.immport.org) were taken for coexpression analysis of the differentially expressed immune-related lncRNA (gene correlation coefficients >0.4 and p value <0.001). Wilcoxon signed-rank test based on "limma" R package were used for differentially expressed immune-related lncRNA (DEirlncRNA) analysis in tumor and normal tissues (false discovery rate (FDR) <0.01).

DEirlncRNA Pairing.
Considering the general applicability of the model and avoiding batch correction, lncRNA pairs matrix was constructed. One lncRNA pair had lncRNA 1 and lncRNA 2. If the expression value of lncRNA 2 is lower than lncRNA 1, consider the sample as 1; otherwise, the sample is defined as 0. Next, the matrix was further analyzed. If all the value of the lncRNA pairs in the samples were 0 or 1, the lncRNA pair was thought not be related to prognosis because there is no specific rank of pairing that cannot correctly predict the survival outcome of patients. When the proportion of expression proportion was 0 or 1 in one lncRNA pair exceeding 20% of total pairs, it was considered a significant pair.

Construction of the Prognostic Model by lncRNA Pairs.
A prediction model of lncRNA pairs was constructed by univariate, lasso, and multivariate Cox regression analysis. The model was determined by ten-fold cross-validation and p value less than 0.01. The model with the highest point of area under curve (AUC) value was selected for further analysis. The risk score was calculated according to the standardized expression value of each pair and its corresponding coefficient. The formula was score = e sum (expression of each gene pair × corresponding coefficient) . The receiver operating characteristic (ROC) curve was evaluated to determine the point of which the sum of sensitivity and specificity reached the highest, which was the cut-off point to divide patients into two groups.

Validation of the Prognostic Model of the lncRNA Pairs.
The Kaplan-Meier (KM) analysis based on the "survive" and "survminer" packages showed the differences in survival time of the two groups. The KM analysis estimates the survival curve in this way: first, calculate the probability that patients who have lived for a certain period will live for the next period (i.e. survival probability), and then multiply the survival probability one by one, that is, the survival rate of the corresponding period. R software was also used to show the risk score values and survival status of each sample in the model. The "survivalROC" package was taken for predicting survival status in ROC curves.

Clinical Evaluation of the Prognostic Model by lncRNA
Pairs. The chi-square test based on the "ComplexHeatmap" R package was taken for analyzing the relationship between the model and clinicopathological parameters ( * * * means p < 0:001, * * means p < 0:01, and * means p < 0:05). The risk scores of these clinicopathological features were compared between different groups by the Wilcoxon signedrank test. The relationship between the risk score and clinicopathological parameters was performed by univariate and multivariate Cox regression analyses to demonstrate whether the model can be used to predict the prognosis independently. The "limma" and "ggupbr" packages were used to show the results in a format of forest maps.
2.6. Evaluation of Immune-Infiltrating Cells and Immune Checkpoint Genes Expression by the Prognostic Model of the lncRNA Pairs. Using several immune-related databases, the correlation between the risk score and immune cell condition was analyzed by Spearman correlation analysis (p value <0.05). The Wilcoxon signed-rank test based on the "ggplot2" packages was conducted for evaluating the content differences of immune-infiltrating cells between the two groups, and we showed the results in a box diagram. The "ggstatsplot" R package was performed to analyze the expression differences of immune checkpoint-related genes.   Table 1 shows the coefficient, hazard ratio (HR), 95% confidence interval of HR, and p value of each lncRNA pair included in the model. Besides, we analyzed the AUC curve and found the point of which the sum of sensitivity and specificity reached the highest (Figure 3(d)), and we considered this value to be the cut-off point to divide the patients into two groups (high-risk and low-risk, Supplementary material 5). And we conducted univariate and multivariate Cox analyses, which indicated that 7 pairs were related to the overall survival (Figures 3(e) and 3(f)).

Validation of the Prognostic Model of the lncRNA Pairs.
According to the cut-off point, we divided the patients into two groups, one containing 102 high-risk patients and the other containing 272 low-risk patients (Figure 4(a); Supplementary material 5). In addition, we draw the scatter figure to show the survival status and the risk score of each patient ( Figure 4(b)). KM analysis showed that the patients with high-risk score survived shorter (p value <0.001, Figure 4(c)). The ROC curves at 1, 2, and 3 years indicated that all AUC were more than 0.7 ( Figure 4(d)). What is more, the ROC curve at 1 year compared with other common clinicopathological parameters suggested that the model we developed had a perfect prediction ability (Figure 4(e)).

Clinical Evaluation of the Prognostic Model by lncRNA
Pairs. The chi-square test suggested that age was significantly associated with the risk score (p < 0:001), but in the clinical stage, tumor grade had no differences between the two groups ( Figure 5(a)). The risk scores of these clinicopathological features comparing between different groups by the Wilcoxon signed-rank test indicated that age had a positive correlation with the risk score ( Figure 5

Discussion
Since OC causes so many deaths in the world every year, it deserves more in-depth exploration and research. TCGA and GTEx are platforms collecting clinical data, genomic variation, mRNA expression, miRNA expression, methylation, and other data of patients with or without human cancers, which is a very important data source for cancer researchers.
Here, the clinical data and mRNA expression data of OC in TCGA database and the mRNA expression data of patients without OC in GTEx database were downloaded. First, we used coexpression analysis to get the differentially expressed immune-related lncRNA. Then, we constructed a novel model to predict the prognosis of OC based on lncRNA pairing. At the same time, we got the cut-off point to divide the patients into the high-risk and low-risk group. This model was verified by ROC curves and survival analysis, as well as the relationship between risk score and other clinicopathological parameters or immune-infiltrating cells. The results indicated that this model can instruct the prognosis prediction and immune therapy of OC.
There were many studies involving the role of lncRNA in OC, where lncRNA expression data was analyzed and lncRNA model was constructed to predict the prognosis of patients with OC [44][45][46][47][48]. Nevertheless, this model needed to measure the exact expression level of the lncRNA, which requires equipment with higher quality and expert to read the results. Besides, this model was verified by multivariate Cox analysis with a p value of 0.02. In our study, we used lncRNA pairing method in OC prognosis, with which we only needed to check the relative expression of the lncRNA pair instead of the exact expression level of each lncRNA. And we did the multivariate Cox analysis with a p value <0.001. It is reported that USP30-AS1 in our model also participated in the other immune-related lncRNA model based on TCGA and GTEx database, implying the importance of USP30-AS1 in OC [48]. Besides, several studies reported the role of USP30-AS1 in carcinogenesis [48][49][50][51][52]. But there were no studies of USP30-AS1 in OC, which needed to be future explored. lncRNA UNC5B-AS1 in our model had been reported to be an oncogenic gene in OC through regulating the H3K27me on NDRG2 via EZH2 [53]. Other lncRNA in our model had been reported in other cancer types, but no reports in OC. Therefore, we also need to do a lot of work, including not only prospectively collecting clinical samples to verify our model, but also studying the specific role of lncRNA involved in the model in ovarian cancer.
What is more, we first combined lncRNA with tumor immune-infiltrating cell condition in OC. We found that the proportion of neutrophil, endometrial cell, macrophage, cancer-associated fibroblast, T cells, and mast cells was    higher in the high-risk group with p value lower than 0.05. The M2 macrophage significantly accumulates in the tumor niche and plays a role in promoting tumor development and immunosuppression, which can be a therapeutic target for treating OC [54][55][56][57]. Zhang et al. reported a targeted nanocarrier that could deliver M1-polarizing transcription factors to reprogram TME [58]. Besides, Rodriguez-Garcia et al. reported that folate receptor β+ tumor-associated macrophages had the characteristics of macrophages M2, and selective elimination of them by chimeric antigen receptor T cell could retard tumor growth and remodel the TME, which inaugurated a new era in adjuvant therapy of conventional immunotherapy [59]. All these promoted the progress of immune therapy in OC. In recent years,      [60][61][62]. Therefore, we also evaluated the ICIs expression level between two groups, which could future instruct the targeted immune  LAG3  ICOS  CTLA4  CD48  TNFRSF4  CD276  CD80  TMIGD2  IDO1  TNFRSF25  TNFRSF18  CD274 Figure 7: Evaluation of immune checkpoint genes expression by the prognostic model of the lncRNA pairs. CD244, LAG3, ICOS, CTLA4, CD48, TNFRSF4M, CD80, TMIGD2, IDO1, TNFRSF18, CD274, and CD40 were significantly lower in the high-risk group, while CD276 and TNFRSF25 were higher ( * * * means p < 0:001, * * means p < 0:01, and * means p < 0:05). 14 BioMed Research International therapy. In conclusion, the results of these studies inspire us to future explore better immunotherapy strategies in OC.

Conclusion
In summary, by collecting and analyzing the RNA-seq and clinical information of OC samples from TCGA and GTEx database, we identified a new immune-related lncRNA pairing model to predict the prognosis and immune-infiltrating cell condition in human OC, which thus further can instruct immunotherapy. However, a prospective, large-scale, multicenter clinical cohort to validate the prognostic model as well as experimental studies of cell biology to explore the role of related lncRNA in OC is needed.

Data Availability
The results data used to support the findings of this study are included within the article, and the original data are supplied as supplementary materials in the form of the tables.
The software code used to analyze the data are also included within the supplementary materials.

Conflicts of Interest
The authors declare that they have no conflicts of interest.