Bioinformatics Analysis Identifies EPAS1 as a Novel Prognostic Marker Correlated with Immune Infiltration in Acute Myeloid Leukemia

EPAS1 plays an important role in the development and progression of multiple tumor types by interacting with a series of other molecules. However, the prognostic and diagnostic values of EPAS1 in acute myeloid leukemia (AML) remain unknown. Here, we systematically explored and clarified the potential functions of EPAS1 in AML using data from Xena Browser and TCGA database. The expression of EPAS1 was significantly lower in AML patients than that in healthy people. The GO, KEGG, GSEA, and GSVA were performed to explore the potential functions and signaling pathways. The survival analysis was conducted using Cox regression analysis and the Kaplan-Meier method. Immune cell infiltration was evaluated via single-sample GSEA (ssGSEA). The results of enrichment analyses suggested that low-EPAS1 expression was related to the initiation, development, and prognosis of AML. The immune microenvironment landscape in AML was described by ssGSEA. ROC analysis of EPAS1 showed high discrimination ability between AML patients and healthy people. Kaplan-Meier method indicated that low-EPAS1 expression correlated significantly with a poor overall survival. Multivariate Cox regression analysis revealed that both age and EPAS1 expression were independent prognostic factors in AML patients. Furthermore, the nomogram based on these two variables performed well in discrimination and calibration. In summary, our study may provide new insights into the molecular mechanisms underlying AML and demonstrate the diagnostic and prognostic value of EPAS1 in AML for the first time.


Introduction
Acute myeloid leukemia (AML) is an aggressive hematologic neoplasm characterized by abnormal proliferation and differentiation of hematopoietic precursors, resulting in the accumulation of leukemic cells in the bone marrow and peripheral blood. There are estimated 19940 new AML cases and 11180 deaths in 2020, making it the most common and deadliest form of acute leukemia in adults [1]. The 5-year relative survival for AML patients declines with increasing age, from 47.5% in patients younger than 65 years old to just 8.2% in those aged 65 years and older [2]. With the continuous advancement in conditional treatments for AML, including chemotherapy, allogeneic stem cell transplantation (SCT), and supportive care, the outcome improvement in younger patients has been reported in several trials [3]. While in elderly patients who cannot tolerate intensive chemotherapy regimens, the effect is not satisfactory [3][4][5]. Thus, it is an urgent need to explore potential prognostic markers and therapeutic targets that can contribute to patient-tailored treatment plan making and clinical management.
EPAS1, the gene encoding hypoxia-inducible factor-(HIF-) 2α, plays a critical role in a wide range of pathophysiological processes. Genome-wide studies have shown that the mutation of EPAS1 is closely linked to human adaptation to high-altitude hypoxia, with a particular focus on Tibetans [6]. There is growing evidence showing that EPAS1 is related with cancer initiation and progression. Mutations in EPAS1 are considered to be drivers in pheochromocytoma and paraganglioma [7][8][9]. In renal cell carcinoma (RCC), the HIF-2α regulation is linked to the cancer development in vivo and in vitro [10][11][12]. And preclinical studies and clinical data validate that HIF-2α inhibitors have antitumor activity in RCC cell lines and heavily pretreated patients [13][14][15][16]. On the other hand, HIF-2α has been reported to suppress tumor growth in undifferentiated pleomorphic sarcoma, fibrosarcoma, and dedifferentiated liposarcoma, where EPAS1 is largely considered to be epigenetically silenced [17]. In addition, HIF-2α is found to contribute to the tumor cell apoptosis in hepatocellular carcinoma through the TFDP3/ E2F1 pathway, and lower HIF-2α expression level in tissue is always accompanied by poorer survival [18]. These findings suggest the potential role of EPAS1 as a therapeutic target and as a prognostic marker for patients with cancers. However, the underlying functions and molecular mechanisms of EPAS1 in AML are still poorly understood. In this research, we aimed to determine whether EPAS1 could serve as a prognostic marker to guide clinical decision-making in AML and explore the role of EPAS1 in AML disease mechanism, which is the key to identify novel diagnostic and therapeutic avenues.

Study Design.
Gene expression data and clinical information were extracted from the common databases. Then, the computational biology tools were applied to evaluate the diagnostic and prognostic values of EPAS1 in AML patients. The R code used for our analysis is available in Supplementary Materials. The schematic flow chart of the present study was summarized in Figure 1.

2.2.
Preprocessing for RNA-Sequencing Data. Genotype-Tissue Expression (GTEx) TOIL RSEM transcript per million (tpm) data, TCGA Pan-Cancer TOIL RSEM tpm data, and corresponding clinical information were obtained from Xena Browser (https://xenabrowser.net/datapages/) [19]. The data of 70 GTEx normal samples and 173 TCGA-AML samples were processed through TOIL in a uniform manner and then transformed into log 2 ðTPM + 1Þ for differential expression analysis and receiver operating characteristic (ROC) analysis.
The clinical data of 200 AML patients were obtained from TCGA database (https://portal.gdc.cancer.gov/repository). Cases without corresponding RNA-seq data (n = 49) were excluded. Finally, we compiled log 2 ðTPM + 1Þ gene expression data of 151 AML cases for further analysis. The data were collated into Table 1, and unavailable or unknown clinicopathological features were treated as missing values. This study is entirely based upon data generated from TCGA and meets the TCGA publication guidelines. (http://www .cancer.gov/about-nci/organization/ccg/research/structuralgenomics/tcga). This article does not contain any studies with human participants or animals performed by any of the authors.   3 Disease Markers 2.3. Differentially Expressed Gene (DEG) Analysis. According to the median expression level of EPAS1, gene expression data were divided into a high-expression group and a lowexpression group. To identify the DEGs in AML, we analyzed and compared the expression data between high-and low-EPAS1 expression groups within the DESeq2 package (version 1.28.1) [20]. Genes with jlog 2 fold changej > 1:5 and adjusted P value < 0.05 were considered to be statistically significant. The results were visualized using volcano plots and heat map.

Functional Enrichment
Analysis. Metascape (http:// metascape.org) is a network-based tool integrating gene annotation, enrichment analysis, and interactome analysis capabilities [21]. In this study, we used Metascape to conduct Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis for EPAS1related DEGs in AML. Similar terms were grouped into a cluster under the parameters that P value < 0.01, minimum count > 3, and enrichment factor > 1:5.

Gene Set Enrichment Analysis. Gene Set Enrichment
Analysis (GSEA) is a powerful and robust analytical strategy designed to reveal the collective behavior of functionally related genes by identifying biological pathways at the geneset level, which makes it easier for researchers to interpret the results in large-scale analysis [22,23]. GSEA takes all gene expression data as analysis objects to avoid the loss of key information caused by unreasonable filtering parameters in traditional enrichment analysis. In this study, GSEA was carried out using clusterProfiler (version 3.16.0) package in R (3.8.0) [24] to elucidate the significant differences in function or pathway between high-and low-EPAS1 expression groups. Gene set permutations were performed 1000 times for each analysis. The enrichment analysis was based on the conditions that adjusted P value < 0.05, FDR q value < 0.25, and jNESj > 1 .
2.6. Gene Set Variation Analysis. Gene set variation analysis (GSVA) [25] is a nonparametric analysis method to evaluate whether different metabolic pathways are enriched in different samples by converting the gene expression matrix between different samples into gene set expression matrix. The gene sets (c2.cp.v2022.1.Hs.symbols) were obtained from the MSigDB database. The 151 samples from the TCGA-AML dataset were divided into high-expression group and lowexpression group according to the median expression level of EPAS1. Enrichment scores were calculated for each gene set to evaluate the potential biological function changes of different samples using the GSVA algorithm. The criterion of significant enrichment was adjusted P value < 0.05. 2.7. Immune Infiltration Analysis by ssGSEA. Using GSVA package (http://www.bioconductor.org/packages/release/ bioc/html/GSVA.html) in R, the immune infiltration analysis was performed by single-sample GSEA (ssGSEA). We quantified the relative enrichment scores of 24 tumorinfiltrating immunocytes in AML by comparing the published signatures [26] with the gene expression profile data. The correlation between EPAS1 expression and infiltration levels of these immunocytes and the association of the highand low-EPAS1 expression groups with the infiltration of immune cells were analyzed by Spearman correlation and Wilcoxon rank sum test, respectively.

Statistical
Analysis. All statistical analyses in our study were conducted using R (v.3.6.3). We analyzed the different expressions of EPAS1 in AML and healthy control groups with the Wilcoxon rank sum test and assessed the discrimination ability of EPAS1 by receiver operating characteristic (ROC) analysis using pROC package [27]. Wilcoxon rank sum test and logistic regression were applied to analyze the correlation between clinicopathologic features and EPAS1 expression. The survival analysis was conducted using Cox regression analysis and the Kaplan-Meier method with 95% confidence intervals (CI) [28]. The hazard ratios (HRs) of overall survival (OS) in subgroups were summarized in forest plot. All hypothetical tests were two-sided, and P values less than 0.05 were considered to be statistically significant.  (c) Volcano plot of differentially expressed genes. Red dots represent the upregulated genes, and blue dots represent the downregulated genes. The gray area represents the genes whose expression levels are below the threshold (jlog 2 fold changej > 1:5, adjusted P value < 0.05).      Figures 3(a)-3(c). The results of cellular components (CC) suggested that EPAS1-related genes were mainly located in extracellular matrix. The molecular functions (MF) of these genes included extracellular matrix structural composition, molecular binding, and protein kinase activation. Moreover, we found that these genes were involved in several biological processes (BP), including extracellular structure organization, extracellular matrix organization, system morphogenesis, and tissue or organ development. The KEGG pathways for EPAS1 and its correlated genes are shown in Figure 3(d).

EPAS1-Related Signaling Pathways Based on GSEA.
To identify the key signaling pathways stimulated differentially in AML, we performed GSEA between low-and high-EPAS1 expression data sets. Significant differences (adjusted P value < 0.05, FDR q value < 0.25) in the enrichment of MSigDB collection (C2.all.v7.0.symbols.gmt) were revealed. Based on their NES, we selected biological processes and pathways that were significantly enriched in the low-EPAS1 expression phenotype, including ribosome, SRPdependent cotranslational protein targeting to membrane, AML with MLL fusions, AML with FLT3-ITD, tretinoin response and PML-RARA fusion, and HDACS deacetylate histones. The details are shown in Figures 4(a)-4(f) and Table 2. We also selected biological processes and pathways that were significantly enriched in the high-EPAS1 expression phenotype, including regulation of actin dynamics for phagocytic cup formation, complement cascade, signaling by the B cell receptor, stem cell up, cell surface interactions at the vascular wall, and immunoregulatory interactions between a lymphoid and a nonlymphoid cell. The details are shown in Figures 4(g)-4(l) and Table 3.

GSVA Analysis of EPAS1 in AML.
In order to explore the differences of curated gene sets between high-and low-EPAS1 expression groups in TCGA-AML dataset, the analyses were performed by ssGSEA using GSVA package. According to the enrichment scores, the differences in extracellular matrix-related pathways between these two groups were shown by the boxplot in Supplementary Figure 1. In the high EPAS1 expression group, the enrichment scores of extracellular matrix-related pathways, including degradation of the extracellular matrix, extracellular matrix organization, cell extracellular matrix interactions, extracellular vesicle-mediated signaling in recipient cells, and extracellular vesicles in the crosstalk of cardiac cells, were significantly higher than that in the low-EPAS1 expression group (P value < 0.05). We further showed the top 25 genes significantly differentially expressed in high-EPAS1 expression group, low-EPAS1 expression group, and normal control group by heat map in Supplementary  Figure 2 (P value < 0.05). Heat map was generated using R pheatmap package.

The Correlation between EPAS1 Expression and Immune
Infiltration. Spearman correlation test was applied to assess the correlation between EPAS1 expression level (TPM) and  10 Disease Markers immune cell infiltration levels which were quantified by comparing with immunocyte signatures using ssGSEA. The EPAS1 expression was positively correlated with macrophages and immature dendritic cells (iDCs) (Figure 5(a)). The correlation between macrophages and EPAS1 expression was strongest among the 24 tumor-infiltrating immunocytes (Spearman r = 0:373, P value < 0.001) ( Figure 5(b)). Wilcoxon rank sum test showed that the abundance of macrophages in low-EPAS1 expression group was significantly lower than that in high-EPAS1 expression group with P value < 0.001 ( Figure 5(c)).

Role of EPAS1 in AML Patient
Survival. Kaplan-Meier survival analysis revealed that OS was significantly poorer in patients with low-EPAS1 expression than those with high-EPAS1 expression (P = 0:006) (Figure 8(a)). In OS subgroup analysis, we found that low expression of EPAS1 was associated with poor OS in the subgroup of intermediate cytogenetic risk (P = 0:013) (Figures 8(b) and 8(c)). The univariate Cox regression analysis showed that low-EPAS1 expression correlated significantly with a poor OS (HR: 0.542; 95% CI: 0.352-0.836; P = 0:006). Other clinicopathologic features, including cytogenetic risk and age, were associated with poor outcome as well. At multivariate analyses, low-EPAS1 expression was still independently correlated  (Table 5).
3.9. Development and Validation of Nomogram. Low-EPAS1 expression and age were determined to be independent prognostic factors for AML by univariate and multivariate Cox regression analyses. Then, we constructed a nomogram-integrated EPAS1 and age. Higher total points represented worse survival probability at 1, 3, and 5 years (Figure 9(a)). The C-index for the nomogram was 0.714 (95% CI: 0.689-0.739). Lines in calibration plot were all close   14 Disease Markers to the ideal line, indicating that the predictions made by nomogram conformed well to the observations in AML patients (Figure 9(b)).

Discussion
Hypoxia-inducible factors (HIFs) mediate the responses to oxygen (O 2 ) concentrations in cells, which makes the regulatory role of HIFs in tumor development and progression become an intensive area of research. EPAS1 (HIF-2), a heterodimeric member of the basic helix-loop-helix-PAS domain polypeptide family, consists of α and β subunits [29]. EPAS1 shares 48% identity with HIF-1α whose regulatory functions in glucose metabolism, cell proliferation, angiogenesis, tumor invasion, and survival have been elucidated over the past decades [30]. Although it is tempting to speculate that EPAS1 is functionally similar to HIF-1α, accumulating evidence shows that EPAS1 and HIF-1α could lead to nonequivalent and even opposite effects on solid tumor development, progression, and prognosis [31]. For example, in von Hippel-Lindau-defective renal cell carcinoma (RCC), HIF-1α inhibits cell proliferation in vivo or in vitro, and the knockdown of HIF-1α promotes tumor growth accordingly [32], while EPAS1 is shown to be essential for the growth of xenograft in RCC [11]. In hematological malignancies, HIF-1α was required for the maintenance of cancer stem cells in AML, and the downregulation of HIF-1α could effectively eliminate the activity of AML colony-forming unit [33]. However, few studies have addressed the role of EPAS1 in AML. The complex interplay of genetic events has long been considered as the basis for disease classification, risk stratification, and prognostic assessment in AML [34,35]. This is basically consistent with the results of our EPAS1-related GSEA analysis, including biological processes such as leukemia-associated genetic alterations, epigenetic regulation, and immune microenvironment changes. Noteworthy, both GSEA and logistic regression analysis showed that the low-level EPAS1 expression was significantly correlated with FMS-like tyrosine kinase-3 (FLT-3) mutation, which is considered to be one of the most important drivers in AML [36], being associated with poor prognosis in AML [37]. In the absence of ligands, FLT-3 mutations constitutively activate of FLT-3 receptors, promoting cell proliferation, and survival through PI3K/AKT and other signaling pathways. Interestingly, the results of GO and KEGG analyses pointed in a similar direction. A number of studies have shown that EPAS1 is closely related to the PI3K/AKT signaling pathway. Researchers have elucidated the complex regulatory relationship between EPAS1 and PI3K/AKT signaling pathway in different cancers, in which factors such as PTEN, YY1, SCD1, and CD44 also play important roles [38][39][40].
Although little is known about the regulatory mechanism between EPAS1 and PI3K/AKT signaling pathway in hematological malignancies, we can gain information and inspiration from these similar studies in solid tumors. In addition, the results of ROC curve analysis in our study also suggested Age has been previously reported as an independent prognostic factor for AML [41], which is consistent with our results obtained by multivariate Cox regression analysis. From the statistical data in recent years, it is not difficult to  16 Disease Markers see that the majority of AML patients are elderly, whose outcomes are particularly poor [42,43]. In view of this, the development of prognostic prediction methods is particularly urgent and important. In our study, the findings from different analytic approaches, including Kaplan-Meier survival analysis, model building, and model validation, corroborated each other and highlighted the potential power for our prognostic model to enhance patient outcome predictions. This suggests that our model may help to determine prognosis and develop treatment plans accordingly in clinical practice and may aid in clinical risk stratification. AML arises in an immunosuppressive bone marrow microenvironment characterized by promoting tumor growth and immune escape [44]. Interest in the use of immunotherapy to improve AML outcomes is growing due to significant advances in immunotherapy in solid tumors. At present, several immunotherapies are under development and clinical testing [45]. In our study, there is a positive relationship between EPAS1 expression level and infiltration level of many immune cells, particularly macrophages. It is well known that macrophages are derived from monocytes under physiological conditions and regulate the innate and adaptive immune system by secreting cytokines and chemokines. In AML, a large number of immature myeloid cells proliferate, resulting in catastrophic bone marrow failure, which may explain why there was a decrease in macrophage expression. We also observed that the abundance of active dendritic cells (aDCs) in low-EPAS1 expression group was significantly higher than that in high-EPAS1 expression group, although the correlation between aDCs and EPAS1 expression was weak. When DCs are activated, they migrate to lymphoid tissues to interact with T to stimulate and control the appropriate immune response. The increased level of aDCs in low-EPAS1 expression group (AML-patient group) appears to compensate for the tumor immune dysfunction caused by macrophage depletion to some extent. Our results suggest a possible regulatory mechanism of the tumor immune microenvironment associated with low-EPAS1 expression in AML, which may be helpful to the development of immunotherapy for AML patients in the future. However, more well-designed clinical trials are needed to confirm whether and how the infiltration degree of these immune cells changes throughout the course of AML.
There were some limitations in our study. First, all the data were extracted from the public database. The small and unbalanced sample sizes were used in the analyses, so further multicenter and large sample size studies are needed. Second, the current study was performed based on RNA-sequencing data from public database; therefore, we cannot gain deeper insights of the gene function at other omics levels. Finally, it would be better to further collect clinical data on the basis of retrospective analysis for validation. We are collecting bone marrow samples from clinical AML patients for subsequent immunohistochemistry and single-cell sequencing analyses, which will serve as a supplement to our research results and enhance the persuasiveness of our article.

Conclusions
In conclusion, our study revealed that decreased EPAS1 expression could be a potential diagnostic indicator and a prognostic marker of poor survival in AML. Further experiments are needed to validate our findings and identify the underlying molecular mechanism of AML.

Data Availability
The data that support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest. Disease Markers