Identification of Potential Key Genes and Pathways in Enzalutamide-Resistant Prostate Cancer Cell Lines: A Bioinformatics Analysis with Data from the Gene Expression Omnibus (GEO) Database

Enzalutamide (ENZ) has been approved for the treatment of advanced prostate cancer (PCa), but some patients develop ENZ resistance initially or after long-term administration. Although a few key genes have been discovered by previous efforts, the complete mechanisms of ENZ resistance remain unsolved. To further identify more potential key genes and pathways in the development of ENZ resistance, we employed the GSE104935 dataset, including 5 ENZ-resistant (ENZ-R) and 5 ENZ-sensitive (ENZ-S) PCa cell lines, from the Gene Expression Omnibus (GEO) database. Integrated bioinformatics analyses were conducted, such as analysis of differentially expressed genes (DEGs), Gene Ontology (GO) enrichment analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, protein-protein interaction (PPI) analysis, gene set enrichment analysis (GSEA), and survival analysis. From these, we identified 201 DEGs (93 upregulated and 108 downregulated) and 12 hub genes (AR, ACKR3, GPER1, CCR7, NMU, NDRG1, FKBP5, NKX3-1, GAL, LPAR3, F2RL1, and PTGFR) that are potentially associated with ENZ resistance. One upregulated pathway (hedgehog pathway) and seven downregulated pathways (pathways related to androgen response, p53, estrogen response, TNF-α, TGF-β, complement, and pancreas β cells) were identified as potential key pathways involved in the occurrence of ENZ resistance. Our findings may contribute to further understanding the molecular mechanisms of ENZ resistance and provide some clues for the prevention and treatment of ENZ resistance.


Introduction
Given its rapidly increasing morbidity and mortality, prostate cancer (PCa) ranks first in new cases and second in deaths among cancers [1] and has thus attracted increasing research efforts in the fields of basic science and clinical medicine. The proliferation and progression of PCa strongly rely on androgen receptor (AR) signaling [2,3]. Therefore, androgen deprivation therapy (ADT) has emerged as the first-line choice for PCa treatment [2]. Enzalutamide (ENZ), also called MDV3100, is a secondgeneration ADT that can eventually impair AR translocation activity, inhibiting the transcription of drivers located downstream in the AR cascade and inhibiting the proliferation and progression of PCa [4]. In metastatic castrationresistant prostate cancer (mCRPC), ENZ has also exhibited excellent therapeutic efficacy. A phase 3 clinical trial demonstrated that administration of ENZ prolonged the survival time from 13.6 months to 18.4 months in endstage mCRPC patients who had suffered docetaxel failure [4,5]. However, some patients (approximately 20%~40%) observed no remarkable decrease in prostate-specific antigen (PSA) after receiving ENZ; that is, they exhibited inherent resistance to ENZ [6]. In addition, even patients who respond well still ultimately develop resistance to ENZ with long-term use [6].
The mechanisms of ENZ resistance can be roughly classified into two aspects, AR-dependent and AR-independent mechanisms [7]. The AR-dependent mechanism is based on the continuous AR activation seen after castration [7]. One reason is alterations of the AR gene, such as amplifications and mutations, which lead to either abnormally high AR expression or adverse effect on the ligand-binding site [8,9]. That is, AR amplifications increase the sensitivity of PCa cells in a low-androgen environment, while mutations elevate the affinity and specificity of the binding domain, even inducing unwanted activation by nonandrogen modules. Another mechanism is the emergence of AR splice variants, which are a series of truncated proteins derived from the alternative cleavage of AR. AR-V7 is the most famous for its strong transcription-promoting function in the AR signal cascade with a lack of androgen binding sites [10]. Other pathways, such as PI3K/AKT/PTEN, glucocorticoid receptor (GR), NF-κB/p52, and Wnt/β-catenin, are also involved in ENZ resistance [11][12][13][14]. Some of them result in drug tolerance via crosstalk or reciprocal feedback with AR signaling, while others stimulate cell proliferation and cancer progression in their own manner.
With the utilization of next-generation RNA sequencing (RNA-seq), large-scale data on significant genes and pathways related to cancer evolution have been generated. In the field of PCa ENZ resistance, a few studies have been conducted, and diverse signatures related to pathways such as Wnt signaling [15] and CREB5 [16] have been found. However, there is a paucity of efforts using systematic and integrated bioinformatics methods to uncover key genes and pathways of ENZ resistance at the genomic level. It is reasonable to believe that other crucial molecules still exist and need to be elucidated.
In this study, we performed integrated bioinformatics analysis based on data from ENZ-resistant (ENZ-R) and ENZ-sensitive (ENZ-S) cells that were acquired from the Gene Expression Omnibus (GEO) database to identify potential key genes and pathways for ENZ resistance in PCa at the genome-wide level.

Materials and Methods
2.1. Dataset Searching. Relevant genome-level microarray data were retrieved from the GEO database by using a combination of specific index words, such as "prostate cancer," "prostate adenocarcinoma," "enzalutamide," "MDV3100," "resistance," and "drug resistance." Three gene sets (GSE104935, GSE64143, and GSE120005) that contain genomic expression data of ENZ-R and ENZ-S PCa cells were further considered. Since two of the gene sets (GSE64143 and GSE120005) provided expression data for nonduplicate samples that were unfit for further bioinformatics analysis, GSE104935, which contains samples of ENZ-S and ENZ-R LNCaP cells, was ultimately screened for our current analysis.

Genomic Analysis of Differentially Expressed Genes
(DEGs). The original expression matrix of GSE104935 was acquired via the GEO website. Subsequently, raw data were arranged as files of expression matrix and sample type by use of both Microsoft Excel 2016 and Perl. Normalization of expression matrix data was executed by the R command normalizeBetweenArrays, while the DEG analysis was performed by the limma R package. The two cutoff values for the DEGs were an expression fold change ðFCÞ > 2 and a P value < 0.05. Volcano and heat map plots were generated to show the distribution and expression of identified DEGs.
2.3. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway Enrichment Analyses. With the aim of investigating the functional enrichment of DEGs, GO and KEGG term enrichment analyses were conducted on the Database for Annotation, Visualization and Integrated Discovery (DAVID) website (https://david.ncifcrf.gov) [17]. DEG symbols were uploaded to the DAVID website, and the outcomes of GO and KEGG analyses were given automatically by the web tools. The analysis of all DEGs was based on a threshold of an adjusted P value < 0.05. Bar and point plots were generated by R software.

Protein-Protein Interaction (PPI) Network Analysis.
Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) (https://string-db.org/) is a website that predicts interactions among genes and proteins of interest. In the study, DEG symbols were provided as a list. The species "Homo sapiens" was chosen, and the minimum required interaction score was set to high confidence (0.700). A file containing protein interactions was automatically given and subsequently downloaded. Cytoscape, software for visualization of gene and protein networks, was utilized to construct the PPI network. MCODE, a plugin of Cytoscape, was used to identify potential hub genes.
2.5. Gene Set Enrichment Analysis (GSEA). GSEA is a computing method for exploring the statistical significance and concordant differences of defined gene sets or pathways between two biological states. In the present study, GSEA was used to deeply analyze biological information, enlightening our understanding of relevant biological events. Two files of genomic expression data and contrast information were inputted, and the analysis was carried out using GSEA 4.0.3 software.

Validation of Hub Gene Prognostic
Value. Further validation and survival analysis of identified hub genes were performed by the Gene Expression Profiling Interactive Analysis (GEPIA) database, which is a newly developed interactive web server for exploring RNA sequencing data from the TCGA and GTEx projects [18]. According to the overall survival (OS) and disease-free survival (DFS) data offered through GEPIA, figures displaying the influence of hub genes on PC survival were drawn. A P value < 0.05 was chosen as the cutoff.

Identification of DEGs.
Cell samples in GSE104935 were divided into two groups with 5 samples of ENZ-R and ENZ-S cells each. As exhibited in Figure 1, all expression data in each sample were normalized.
A total of 201 DEGs containing 93 highly expressed genes and 108 genes that were expressed at a low level were identified (Table 1) Figure 4(a), three categories were included in the GO analysis: biological process (BP), molecular function (MF), and cellular component (CC). The BP terms in which the DEGs were mainly enriched included positive regulation of gene expression, metabolic processes, negative regulation of cell migration, and flavonoid biosynthetic processes. The CC terms in which the DEGs were significantly enriched were integral components of the membrane, extracellular exosomes, endoplasmic reticulum, etc. The MF terms in which the DEGs were enriched were protein homodimerization activity, protein domain-specific binding, actin binding, glucuronosyltransferase activity, etc.

GO Enrichment Analysis. As shown in
The subsequent KEGG pathway analysis showed that the DEGs were mainly enriched in signaling pathways related to the metabolism of drugs and biological molecules, steroid hormone biosynthesis, transcriptional misregulation in cancer, and chemical carcinogenesis (Figure 4(b)).

PPI Network Construction and Hub Gene Selection.
A total of 201 DEGs were uploaded into the STRING database. The PPI network graph ( Figure 5(a)) and primary data were generated automatically. After MCODE analysis, 12 hub genes, of which 5 were upregulated and 7 were downregulated, were identified ( Figure 5(b)). The upregulated hub genes were AR, ACKR3, GPER1, CCR7, and NMU, and the downregulated genes were NDRG1, FKBP5, NKX3-1, GAL, LPAR3, F2RL1, and PTGFR.

Identification of Potential Significant Pathways by GSEA.
To further identify the potential pathways in the genesis of ENZ resistance in PCa, we performed GSEA analysis at the genomic level. In the ENZ-R cell group, the GSEA results showed that 15 gene sets were upregulated and that the hedgehog pathway was regarded as the most significant pathway (P < 0:05). In ENZ-S cells, 35 gene sets were downregulated, and 7 pathways were considered significantly affected, androgen response, p53, estrogen response, TNF-α, TGF-β, complement, and pancreas β cells ( Figure 6). The seven pathways were relatively downregulated in groups of ENZ-R cells.

The Prognostic Value of Hub Genes in PCa Patients.
Based on survival analysis results, we found that none of these hub genes affected the OS of PC patients. However, four hub genes were indicated to be of predictive value for DFS. The high expression of two genes (AR and ACKR3) predicted low DFS in patients with PC. In addition, patients with upregulation of another two genes (FKBP5 and F2RL1) had better DFS (Figure 7).
Neuromedin U (NMU) is a neuropeptide that belongs to the neuromedin family and has been shown to be related to many important activities in the nervous system [27]. A recent study showed that NMU was a novel prognostic marker of many cancer types [28]. Li et al. found that NMU was highly expressed and related to poor clinical outcomes in hepatocellular carcinoma [29]. Overexpression of NMU enhanced drug resistance in breast and lung cancer cells, whereas NMU silencing sensitized resistant cells [30].
Coagulation factor II (thrombin) receptor-like 1 (F2RL1) is known as a protease-activated receptor that is abundant in neurons, where it functions in pain, inflammation, and release of neurotransmitters [31]. Previous works have indicated that protease-activated receptor 2 (PAR2, encoded by  BioMed Research International F2RL1) functions in the regulation of carcinogenesis in diverse cancers. In hepatocellular carcinoma, PAR2 was reported to promote cell proliferation and distant metastasis by inducing EMT and to predict poor clinical outcome. Prostaglandin F2 alpha receptor (PTGFR), a membrane receptor for prostaglandin F2 alpha, has been reported to be related to tumorigenesis and progression in endometrial adenocarcinoma. Overexpression of PTGFR was found to be a novel marker in endometrial adenocarcinoma and renal cell carcinoma [32,33]. In prostate cancer, the next-generation sequencing analysis performed by Alkhateeb et al. identified PTGFR as a potential biomarker to predict progression [34]. Romanuik et al. performed a long serial analysis of gene expression (LongSAGE) libraries and confirmed PTGFR as a key gene that was associated with cell proliferation and in vivo progression.
Galanin (GAL) is a reported neuropeptide secreted by sensory neurons located in the gastrointestinal system [35]. GAL may play a dual role in cancer regulation because it exhibits different expression profiles in cancers. The GAL mRNA level was observed to be elevated and was significantly associated with tumor stages in colon cancer [36], while it was also indicated as a tumor suppressor correlated with low disease-free survival in head and neck squamous cell carcinoma [37].
Lysophosphatidic acid receptor 3 (LPAR3) is one of the G protein-coupled receptors that is specifically triggered by lysophosphatidic acid and is related to proliferation and aggressiveness in certain cancers [38]. Increased expression of LPAR3 was proven to increase malignancy in breast and ovarian cancers [39,40]. LPAR3 was reported as a part of     9 BioMed Research International the ZEB1-AS1/miR-133a-3p/LPAR3/EGFR axis that promotes thyroid cancer progression by regulating PI3K/Akt/mTOR signaling [41]. Recent bioinformatics analysis has shown that LPAR3 is one of the hub genes in high-grade prostate cancer.
Although the above 5 hub genes are closely related to cancer regulation, their detailed functions in ENZ resistance in PCa remain unclear, and more experiments are needed in future research.
According to the GSEA results, hedgehog (HH) signaling was considered a key pathway functioning in ENZ-R cells. Research conducted by Cai et al. indicated that overexpression of the m6A methyltransferase METTL3 could upregulate GLI1, one of the main transcription factors in HH, and could enhance PCa proliferation [42]. In addition, HH has been validated as a novel therapeutic target for PC. Yang et al. designed a novel microtubule destabilizer (QW-296) that was combined with MDB5, a newly synthesized HH inhibitor, to treat taxane-resistant PC cells (PC3-TXR and DU145-TXR), and the combination achieved better anticancer efficacy than single-drug administration [43]. Sun found that an HH blocker (GANT61) and PLC knockdown synergized to impair the cell growth and colony forming ability of PCa cells and augmented sensitivity to ENZ. The above discoveries provide evidence in support of our GSEA outcome that HH signaling might be a potential key pathway in ENZ-R development. More efforts should be conducted to research the detained interaction between HH and ENZ resistance.
Previous research indicated that there was an interaction between the estrogen response and ENZ. Abazid et al. found that ENZ treatment could decrease the expression level of the estrogen receptor in PCa, and the estrogen receptor increased ENZ sensitivity to AR (+) triple-negative breast cancer [44,45]. A study by Maughan et al. indicated that p53 inactivation was correlated with a poor response to antiandrogen drugs in CRPC [46], which is consistent with our result that the P53 pathway is downregulated in ENZ-R cells.
There are some inherent limitations in our work, resembling those of similar studies. First, we only acquired and analyzed one data matrix, GSE104935, for the analysis, since the other candidate gene sets (GSE64143 and GSE120005) contain expression data for nonduplicate samples. In the future, more genomic sequencing data for ENZ-R PC cells are urgently needed. Second, the results of our present bioinformatics analysis are based on RNA-Seq data from cell lines but not from clinical samples, which may weaken the persuasiveness of our conclusion. We will design further verification experiments on clinical PCa tissues in the future research. Third, molecular biology experiments for verifying the function of the identified hub gens and pathway are not performed, and they will be addressed as the main goal in our subsequent studies.

Conclusion
In conclusion, we identified 201 DEGs, 12 hub genes, and 8 pathways by integrated bioinformatics analysis. We speculate that these candidate genes or pathways are likely to play different roles in the generation and development of ENZ resistance. These key genes and pathways deserve more exploration and validation in future works.

Data Availability
The data of GSE104935 used in our study are available via GEO (https://www.ncbi.nlm.nih.gov/geo/) or by appropriate request to the authors.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.