TPX2 Serves as a Cancer Susceptibility Gene and Is Closely Associated with the Poor Prognosis of Endometrial Cancer

Background. Endometrial cancer (EC) is a common tumor of the female genital tract for which treatment options are limited. We aimed to discover new prognostic biomarkers for EC. Methods. We used mRNA-seq data to detect differentially expressed genes (DEGs) between EC and control tissues. Detailed clinicopathological information was collected, and changes in the mRNA and protein levels of hub DEGs were analyzed in EC. Copy number variation (CNV) was also evaluated for its association with the pathogenesis of EC. Gene set enrichment analysis (GSEA) was conducted to identify significant pathways driven by the hub genes. Cox regression analysis was used to select variables to create a nomogram. The nomogram was calibrated by applying the concordance index (C-index), and the net benefits of the nomogram at different threshold probabilities were quantified using decision curve analysis (DCA). Results. Differential expression analysis identified 24 DEGs as potential risk factors for EC. Survival analysis revealed that TPX2 expression was related to worse overall survival in patients with advanced EC. A high CNV was associated with the overexpression of TPX2; this suggested that modifications in the cell-cycle pathway might be crucial in the advancement of EC. Moreover, an individualized nomogram incorporating TPX2 and clinical factors was developed and evaluated for its ability to predict the prognosis of EC. Calibration and DCA analyses confirmed the robustness and clinical usefulness of the nomogram. Conclusion. We offer novel insights into the pathogenesis and molecular mechanisms of EC. The overexpression of TPX2 was related to a poorer prognosis and could serve as a biomarker for predicting prognostic outcomes in EC patients.


Introduction
Endometrial cancer (EC) is the sixth most frequent malignancy among women worldwide [1]. In the early stages of the disease, the 5-year relative survival rate is more than 95%. However, the survival rates of patients in the advanced stages range from 20% to 40% [2]. Although surgery, chemotherapy, and radiotherapy can be used to treat EC patients, there is still a lack of effective therapeutic targets.
Thus, there is a clear need for new diagnostic and prognostic biomarkers to guide new treatment strategies for patients with EC. The targeting protein for Xenopus kinesin-like protein 2 (TPX2) is a key factor that ensures the correct assembly of the mitotic spindle, and its gene is located on chromosome 20q11.1 in humans [3]. Although TPX2 is closely associated with the spindle pole during mitosis, it disappears after the completion of cytokinesis [4,5]. Similar to other proteins that regulate mitosis, TPX2 is overexpressed in a range of different cancers and is generally associated with a poor prognosis. The increased protein expression of TPX2 reportedly improves the proliferative, invasive, and migratory abilities of colorectal and cervical cancers [6,7]. Conversely, the downregulation of TPX2 in hepatocellular tumors could suppress proliferative, invasive, and migratory abilities via the PI3K/AKT/mTOR pathway [8]. However, there is a lack of studies on the exact role of TPX2 in the development of EC.
Here, we explored the effects of TPX2 expression on overall survival (OS) in EC patients and further investigated the molecular mechanisms underlying its differential expression. Following univariate and multivariate Cox regression analyses, we constructed a TPX2 nomogram with independent prognostic factors to predict 1-, 3-, and 5-year OS in patients with EC. In summary, TPX2 is associated with the pathogenesis of EC and serves as a marker for the prognostic evaluation of EC.

Patients and Specimens.
We retrieved messenger RNA expression datasets (GSE63678 and GSE17025) from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/gds/) to screen for candidate genes. GSE63678 featured 7 EC and 5 normal endometrial tissues. GSE17025 contained 91 EC and 12 normal endometrial tissues. Subsequently, the clinical data and expression profiles of the EC patients (n = 575; EC, 552; normal endometrial tissue, 23) were retrieved from The Cancer Genome Atlas (TCGA) database (https://cancergenome.nih.gov) so that we could evaluate prognostic markers.
Endometrial samples were collected from 68 patients who underwent surgery at the Department of Gynecology of Benxi Central Hospital between March 2020 and March 2021. The 68 samples comprised 25 endometrial carcinomas, 5 serous carcinomas, 1 clear cell carcinoma, and 37 normal endometrial tissues (proliferative endometrial specimens, 15; secretory endometrial specimens, 5; and atrophic endometrial specimens, 17). The median ages of patients with malignant and normal endometrial samples were 57.4 (41-75) and 54.9 (31-80) years, respectively. The patients had not received radiotherapy, chemotherapy, or hormone therapy, and all patients provided written informed consent. Experiments were approved by the Clinical Research Ethics Committee of the Benxi Central Hospital of China Medical University (23/04/2020; reference: 20200309-1).

Screening for Differentially Expressed Genes (DEGs).
The GEO2R tool (https://www.ncbi.nlm.nih.gov/geo/geo2r/) was used to screen for DEGs in both the GSE63678 and GSE17025 datasets. Differential expression analyses of mRNAs in EC and control tissues from the TCGA datasets were conducted using the Bioconductor Linear Model for Microarray Analysis (LIMMA) R package. Genes with a |log2(fold-change)| of >2.0 and p < 0.05 were considered as potential DEGs and were subjected to further analysis. The Venny v2.1 web-based tool (https://bioinfogp.cnb.csic.es/tools/venny/index.html) was utilized to identify candidate DEGs shared among the three datasets.
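The screening rule above (|log2(fold-change)| > 2.0 and p < 0.05, followed by an intersection across the three datasets) can be sketched in a few lines. This is only an illustration in Python with the standard library; the study itself used GEO2R, LIMMA, and Venny. The table layout, the function names, and all numeric values are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical layout: gene symbol -> (log2 fold-change, p value).
DegTable = Dict[str, Tuple[float, float]]


def select_degs(table: DegTable, lfc_cutoff: float = 2.0,
                p_cutoff: float = 0.05) -> Set[str]:
    """Return genes with |log2FC| > lfc_cutoff and p < p_cutoff."""
    return {gene for gene, (lfc, p) in table.items()
            if abs(lfc) > lfc_cutoff and p < p_cutoff}


def shared_degs(tables: Iterable[DegTable]) -> Set[str]:
    """Intersect the per-dataset DEG sets (the core of the Venn diagram)."""
    sets = [select_degs(t) for t in tables]
    return set.intersection(*sets) if sets else set()


# Toy values for illustration only; they are not taken from the study.
gse_a = {"TPX2": (2.6, 0.001), "GENE_X": (1.2, 0.0001), "GENE_Y": (-2.4, 0.01)}
gse_b = {"TPX2": (3.1, 0.0004), "GENE_Y": (-2.1, 0.2)}
tcga = {"TPX2": (2.2, 1e-6), "GENE_Y": (-3.0, 1e-5)}

print(sorted(shared_degs([gse_a, gse_b, tcga])))
```

In this toy input, GENE_X fails the fold-change cutoff and GENE_Y fails the p value cutoff in one dataset, so only one gene survives the intersection.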

Survival Analysis and Expression Validation of Hub DEGs.
The TCGA-EC samples (n = 552) were assigned to low- and high-expression groups based on the cutoff point (the median expression value of the DEGs). Kaplan-Meier survival curves were plotted using the survival R package. To further evaluate whether the candidate DEGs can serve as prognostic factors for EC, we performed univariate and multivariate Cox regression analyses. The mRNA expression levels of each DEG were verified with Oncomine (https://www.oncomine.org/) microarray data using a fold-change threshold of 2 and a p value threshold of 1 × 10^-4.
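The median split and the Kaplan-Meier estimate described above can be illustrated with a short, dependency-free Python sketch. It is not the authors' code (they used the survival R package); the function names and the toy follow-up data are assumptions made for illustration.

```python
from statistics import median
from typing import List, Sequence, Tuple


def median_split(expr: Sequence[float]) -> List[str]:
    """Label a sample 'high' if its expression exceeds the cohort median."""
    cut = median(expr)
    return ["high" if x > cut else "low" for x in expr]


def kaplan_meier(times: Sequence[float],
                 events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 for an observed death and 0 for a censored follow-up.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


# Toy follow-up data (months, event flag); not taken from the study.
toy_curve = kaplan_meier([5, 8, 8, 12, 15, 20], [1, 1, 0, 1, 0, 1])
for t, s in toy_curve:
    print(t, round(s, 4))
```

In practice one curve is computed per expression group, and the groups are then compared with a log-rank test.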

Western Blotting.
Sixty-eight endometrial specimens were used to screen the proteins encoded by hub genes by western blotting with a monoclonal TPX2 antibody (1 : 500, EPR23180-4; Abcam, USA). β-Actin was employed as an internal control.

Histology and Immunostaining.
Immunohistochemical staining was performed using tissue microarrays (TMAs; EMC1351; Superbiotek, Shanghai, China) containing EC and paracancerous tissue samples (n = 17) for differential expression analysis of hub genes. The experimental methodology used here was previously described by Lei et al. [9].

CNV Data Analysis.
The Affymetrix SNP 6.0 platform on the Genomic Data Commons (GDC) Cancer Browser (https://portal.gdc.cancer.gov/) was used to retrieve TCGA-EC copy number variation (CNV) data. Genes that were fully located in the significantly aberrant CNV regions were then identified by alignment with the genome.

Gene Set Enrichment Analysis (GSEA).
GSEA was conducted to identify enriched genes related to high-expression levels of the hub genes in EC patients. In particular, we used the GSEA preranked function; the genes were sorted according to fold-change, the number of permutations was fixed at 10,000, and the size of a gene set was limited to 2000 genes.
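To make the preranked procedure more concrete, the following Python sketch computes a running-sum enrichment score over a fold-change-ranked gene list and a simple gene-set permutation p value. It is a simplified illustration, not the GSEA software used in the study: the weighting exponent, the two-sided permutation test, the function names, and the toy gene list are all assumptions.

```python
import random
from typing import Dict, List, Sequence, Set


def rank_by_fold_change(fold_changes: Dict[str, float]) -> List[str]:
    """Sort genes from the most up-regulated to the most down-regulated."""
    return sorted(fold_changes, key=fold_changes.get, reverse=True)


def enrichment_score(ranked: Sequence[str], scores: Sequence[float],
                     gene_set: Set[str], weight: float = 1.0) -> float:
    """Weighted running-sum statistic: maximum deviation from zero."""
    hits = [g in gene_set for g in ranked]
    n_hit = sum(hits)
    n_miss = len(ranked) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    hit_norm = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    if hit_norm == 0:
        return 0.0
    running, best = 0.0, 0.0
    for s, h in zip(scores, hits):
        # Step up at set members (weighted by score), step down otherwise.
        running += abs(s) ** weight / hit_norm if h else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def permutation_p(ranked: Sequence[str], scores: Sequence[float],
                  gene_set: Set[str], n_perm: int = 1000,
                  seed: int = 0) -> float:
    """Simplified two-sided p value from random gene sets of equal size."""
    rng = random.Random(seed)
    observed = abs(enrichment_score(ranked, scores, gene_set))
    size = sum(g in gene_set for g in ranked)
    extreme = 0
    for _ in range(n_perm):
        fake = set(rng.sample(list(ranked), size))
        if abs(enrichment_score(ranked, scores, fake)) >= observed:
            extreme += 1
    return (extreme + 1) / (n_perm + 1)


# Toy ranked list (hypothetical genes and fold-changes).
toy_fc = {f"g{i}": fc for i, fc in
          enumerate([5, 4, 3, 2, 1, -1, -2, -3, -4, -5], start=1)}
toy_ranked = rank_by_fold_change(toy_fc)
toy_scores = [toy_fc[g] for g in toy_ranked]
toy_es = enrichment_score(toy_ranked, toy_scores, {"g1", "g2", "g3"})
toy_p = permutation_p(toy_ranked, toy_scores, {"g1", "g2", "g3"})
print(round(toy_es, 3), toy_p)
```

A gene set concentrated at the top of the ranking yields a positive score, and one concentrated at the bottom yields a negative score.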

Prognostic Value Analysis.
Prognostic information and clinicopathological data, such as age, tumor grade, Fédération Internationale de Gynécologie et d'Obstétrique (FIGO) stage, and the histological types of 552 patients were obtained from the TCGA-EC cohort. The TCGA-EC samples (n = 552) were randomly divided into a training group (n = 276) and a verification group (n = 276). There were no significant differences between the training and verification groups in terms of age, tumor grade, FIGO stage, or histological type (Supplementary Table 1). Using the combination group (n = 552), we performed survival analysis to investigate the relationship between the mRNA expression of each hub gene and its corresponding CNV and clinical outcomes. Next, univariate and multivariate Cox regression analyses were conducted to identify whether the hub gene was an independent risk factor. Based on the independent prognostic factors identified in the final multivariate Cox regression analysis, we used a nomogram to predict OS among EC patients in the training group. The nomogram was visually assessed using a calibration plot that compared the predicted and actual survival probabilities of EC patients. The prognostic performance of the nomogram was determined by the area under the ROC curve (AUC), which can range from 0.5 (no discrimination) to 1 (perfect discrimination). Furthermore, we used decision curve analysis (DCA) to compare a nomogram that included all independent prognostic factors with models that included only one independent prognostic factor [10]. The DCA was used to calculate the clinical net benefit of each model compared with the "all" and "none" reference strategies. The best model was the one with the highest net benefit.
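The net benefit that DCA reports can be illustrated with a minimal Python sketch. It assumes a binary outcome at a fixed time horizon and ignores censoring, which is a simplification of the survival setting analyzed here; the function names and toy data are hypothetical and do not reproduce the authors' R analysis.

```python
from typing import Sequence


def net_benefit(pred_risk: Sequence[float], outcome: Sequence[int],
                threshold: float) -> float:
    """Net benefit of treating patients whose predicted risk >= threshold.

    NB = TP/n - FP/n * pt / (1 - pt), where pt is the threshold probability.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcome)
    tp = sum(1 for r, y in zip(pred_risk, outcome) if r >= threshold and y == 1)
    fp = sum(1 for r, y in zip(pred_risk, outcome) if r >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(outcome: Sequence[int], threshold: float) -> float:
    """Reference strategy in which every patient is treated as high risk."""
    prevalence = sum(outcome) / len(outcome)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy predicted risks and observed events; not taken from the study.
toy_risk = [0.9, 0.8, 0.3, 0.2, 0.6, 0.1]
toy_event = [1, 1, 0, 0, 0, 0]
for pt in (0.2, 0.5):
    print(pt,
          round(net_benefit(toy_risk, toy_event, pt), 4),
          round(net_benefit_treat_all(toy_event, pt), 4))
```

The "none" strategy always has a net benefit of zero, so a model is useful at a given threshold only when its curve lies above both zero and the treat-all curve.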
Statistical Analysis.
Perl v5.26.3, R software v3.5.3, and RStudio v1.1.463 were employed for statistical analysis. The Student's t-test was applied to compare the mean values between the groups, and all data were checked for normal distribution and homogeneity of variance using the Shapiro-Wilk test and the Levene test, respectively. Fisher's exact test was employed to determine the association between hub gene expression and clinicopathological characteristics in TMA-EC samples. The optimal cutoff age for patients in the TCGA-EC cohort was calculated with X-tile software v3.6.1 according to the survival status [11]. The effect of hub genes on OS in EC patients was assessed using Kaplan-Meier survival curves and the log-rank test. The significance level was set at p < 0.05.

Figure 1 shows a flowchart of the study workflow. The GSE63678 dataset contained 70 upregulated and 50 downregulated DEGs (Figure 2(a)), while the GSE17025 dataset contained 580 upregulated and 350 downregulated DEGs (Figure 2(b)). In the TCGA Uterine Corpus Endometrial Carcinoma (UCEC) dataset, we identified 1087 upregulated genes and 411 downregulated genes (Figure 2(c)). Figure 2(d) shows a Venn diagram depicting the 24 candidate genes that were differentially expressed in all three datasets (see also Supplementary Table 2).

Identification of the Hub DEGs in EC Patients.
Univariate regression analysis indicated that 16 candidate DEGs were risk-associated genes for EC. Multivariate regression analysis demonstrated that TPX2 (hazard ratio (HR): 1.035; 95% confidence interval (CI): 1.025-1.045; p = 6.07E-13) and testis-specific Y-encoded-like protein 5 (TSPYL5) (HR: 1.034; 95% CI: 1.007-1.061; p = 0.013) were independent risk factors of EC pathogenesis (Table 1). Oncomine coexpression analysis revealed that the expression patterns of these two hub genes were in good agreement with our initial analysis (Figure 2(e)). Furthermore, survival analysis showed that only the overexpression of TPX2 could contribute to the prognosis of EC patients (Figure 2(f)). Moreover, the levels of TPX2, as confirmed by western blotting, were significantly higher in EC tissue than in normal endometrial tissue (0.893 ± 0.102 vs. 0.438 ± 0.062, p < 0.001) (Figure 2(g)). These findings are consistent with the mRNA data. In addition, IHC showed that TPX2 (1 : 250, EPR23180-4; Abcam) was primarily localized in the cytoplasm and nucleus (Figure 3(a)). Compared with the adjacent tissues, the levels of TPX2 were considerably higher in the nucleus in EC tissues (Figure 3(b)).
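For readers less familiar with Cox output, the hazard ratios reported above can be related back to the underlying regression coefficient. The Python sketch below recovers the coefficient, its approximate standard error, and a Wald z statistic from a reported HR and 95% CI. It assumes a Wald-type interval on the log scale, and because the published values are rounded to three decimals, the recomputed p value only approximates the reported one. The function name is hypothetical.

```python
import math
from typing import Tuple


def cox_summary_from_hr(hr: float, ci_low: float, ci_high: float,
                        z_crit: float = 1.959964
                        ) -> Tuple[float, float, float, float]:
    """Return (beta, standard error, Wald z, two-sided p) from an HR and CI.

    Assumes the interval was built as exp(beta +/- z_crit * SE).
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return beta, se, z, p


# TPX2 values as reported in the multivariate analysis above.
beta, se, z, p = cox_summary_from_hr(1.035, 1.025, 1.045)
print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={p:.2e}")
```

An HR of 1.035 per unit of expression corresponds to a coefficient of about 0.034 on the log-hazard scale, so the hazard compounds multiplicatively as expression increases.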

Correlation between TPX2 Expression and Clinicopathologic Features.
Next, we used TCGA-EC mRNA expression data relating to TPX2 and clinicopathologic features to perform differential expression analysis and survival analysis.

CNV Analysis of TPX2 in EC Patients.
CNV mapping onto the entire genome revealed that the TPX2 segment was considerably amplified in the endometrial tumor group when compared with that in the normal endometrial group; these findings were consistent with the TPX2 mRNA expression data (Figures 4(a) and 4(b)). Moreover, the copy number amplification of TPX2 in EC tissue was also related to a poor prognosis, especially with FIGO stage (Figures 4(c) and 4(d), Supplementary Figure 2).

GSEA Identification of TPX2-Related Signaling Pathways.
GSEA was performed by comparing high- and low-TPX2 expression groups to investigate the potential function of TPX2 in EC. The enriched gene sets with a false discovery rate of <0.25 and p < 0.05 were considered statistically significant. As shown in Figure 4(e), the top three enriched phenotypes in the high-TPX2 expression group were "cell cycle," "oocyte meiosis," and "spliceosome," while the pathways enriched in the low-TPX2 expression group were "alpha-linolenic acid metabolism," "complement and coagulation cascades," and "linoleic acid metabolism" (Supplementary Figure 3).

OS Prediction and Evaluation.
To further evaluate whether TPX2 can serve as a prognostic factor, we performed univariate and multivariate Cox regression analyses to compare EC patients with high and low levels of TPX2 expression. Apart from TPX2, we also tested the effect of other covariates, such as age, tumor grade, FIGO stage, and histological type. Multivariate Cox regression analysis indicated that TPX2 (HR: 1.033; 95% CI: 1.023-1.043; p < 0.001), age (HR: 2.114; 95% CI: 1.211-3.689; p < 0.01), and FIGO stage (HR: 2.706; 95% CI: 1.726-4.240; p < 0.001) were independent risk factors and had better prognostic value (Table 2). Subsequently, we constructed a nomogram to predict the 1-, 3-, and 5-year OS of EC patients using TPX2, age, and FIGO stage (

Discussion
EC is the most frequent gynecological malignancy and the sixth most frequently diagnosed form of cancer globally, with more than 417,000 new cases and 97,000 deaths in 2020 [12]. EC can be divided into two pathogenetic types according to the occurrence of hyperlipidemia, obesity, and hyperestrogenism [13]. Histopathologically, type I EC is characterized by good endometrial differentiation, progesterone sensitivity, and a better prognosis; endometrioid carcinoma is the most common histological type. In contrast, type II EC is characterized by poor differentiation, progesterone resistance, and a worse prognosis; serous carcinoma is the most common histological type [14]. The early detection of EC is associated with favorable OS and excellent quality of life postsurgery, whereas patients with an advanced disease lack effective treatment. Although adjuvant chemotherapy, radiotherapy, and targeted therapy have significantly prolonged the OS of patients with advanced EC, the prognosis remains poor [15,16]. Furthermore, current diagnostic biomarkers fail to predict the progression of EC. Thus, there is a need to identify reliable biomarkers for the early diagnosis and prognosis prediction of EC.
Owing to the genetic heterogeneity of cancer, we need comprehensive data to identify biomarkers to achieve precision diagnosis and treatment. We integrated two GEO-EC datasets and TCGA-EC mRNA-seq data for DEG screening. Twenty-four genes were differentially expressed in all three databases. Univariate and multivariate Cox regression analyses revealed that TPX2 and TSPYL5 were independent risk factors for EC. We investigated the effects of these risk factors on the OS of EC patients in the TCGA-EC cohort. These findings revealed that only the overexpression of TPX2 could contribute to poor prognosis, thus suggesting that TPX2 may represent a hub gene for the accurate prediction of prognostic outcomes in EC patients.
We exploited CNV data from TCGA datasets to compare differences between EC and normal endometrial tissues with respect to TPX2 CNV fragments. The copy number of TPX2 fragments was considerably higher in EC patients when compared with that in normal controls; we also investigated the corresponding TPX2 mRNA expression data. Recent technological advances in DNA sequencing have enabled a more detailed understanding of the molecular changes that define gynecological tumors [17], and the association between TPX2 overexpression and copy number amplification has been reported in malignant tumors of the ovaries and cervix [18,19]. Here, we revealed a correlation between TPX2 overexpression and copy number amplification in EC patients, especially those with FIGO stages III and IV; these stages are strongly associated with a poor prognosis. These data indicate that TPX2 copy number gains may play a key role in carcinogenesis and disease progression.
Although the precise role of TPX2 in EC tumorigenesis has yet to be fully elucidated, there is evidence to suggest that TPX2 plays a role in the pathogenesis of various cancers via immune infiltration, the AKT pathway, and the regulation of TP53 activity [20-22]. One of the most significant findings to emerge from the present study is that the expression of TPX2 in the nuclei of EC cells was considerably higher than that in the adjacent tissues. Furthermore, GSEA demonstrated significant enrichment of genes related to the cell cycle within the group of patients with high TPX2 expression levels. TPX2 is a mitotic regulator that participates in the microtubular formation of spindles for chromosomal division during the cell cycle [23-25]. Dysregulation of the cell cycle is a common occurrence in human cancer; several reports have shown that TPX2, along with other mitotic regulators, and particularly Aurora-A, synergistically promotes chromosomal instability in tumor cells by impairing appropriate spindle assembly and by inducing mitotic errors [26-29]. Furthermore, the excessive expression of TPX2 also affects the microtubule cytoskeleton in a manner that is independent of Aurora-A binding; this can alter the structure and distribution of organelles in retinal pigment epithelial cells [30]. The aforementioned pathways may also play functional roles in the pathogenesis of EC and genetic predisposition; however, additional investigations are needed to verify this hypothesis.
In addition, we focused on the integration of TPX2 and clinicopathological factors to predict a poorer prognosis in patients with EC. Although the aforementioned risk factors are closely related to a poor prognosis in EC, none of these factors can be used alone to predict the prognosis of EC. In this study, we demonstrated that TPX2, age, and FIGO stage had a better prognostic value for EC. We developed a novel nomogram and found that TPX2 plays an important role in predicting the 1-, 3-, and 5-year OS rates of EC patients.
This study has some limitations that need to be considered. The single nucleotide polymorphisms (SNPs) in TPX2 that confer increased susceptibility to EC need to be analyzed further. Our research group has been collecting serum samples from EC patients admitted to the Obstetrics and Gynecology Department of Benxi Central Hospital so that we can screen for high-risk SNPs. In addition, more cytological studies are warranted to investigate the roles of TPX2 in the pathogenesis of EC and to verify the association between TPX2 and EC progression.

Conclusion
In summary, TPX2 overexpression was considerably associated with a poorer prognosis in patients with EC. CNV alterations in TPX2 might be a potential mechanism for its overexpression during the development and progression of EC. TPX2 can serve as a prognostic biomarker for predicting OS in EC patients and may facilitate the development of novel gene-targeted therapy for this disease.

Data Availability
The datasets used and/or analyzed in this study are available from the corresponding author upon reasonable request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Authors' Contributions
Jun Wang and Hui He conceived and designed the experiments. Hua Zheng and Shuying Meng collected samples. Yanan Zhang and Yatian Han analyzed the data. Hanbing Yan and Zhe Su performed the western blotting and pathology experiments. All authors were involved in critically revising the manuscript.

Figure 6: DCA curves of the nomogram. A comparison of the DCA curves for 1-, 3-, and 5-year overall survival in EC patients in the training, verification, and combination groups. The "none" plot represents the assumption that no patient presented 1-, 3-, or 5-year survival, whereas the "all" plot represents the assumption that all patients presented 1-, 3-, or 5-year survival at a specific threshold probability. The x-axis represents the threshold probabilities, and the y-axis shows the net benefit. DCA: decision curve analysis; EC: endometrial cancer.

Genetics Research