Machine Learning Developed a Programmed Cell Death Signature for Predicting Prognosis, Ecosystem, and Drug Sensitivity in Ovarian Cancer

Background. Ovarian cancer (OC) is the leading cause of gynecological cancer death and the fifth most common cause of cancer-related death in women in America. Programmed cell death plays a vital role in tumor progression and immunotherapy response in cancer. Methods. The prognostic cell death signature (CDS) was constructed with an integrative machine learning procedure, including 10 methods, using the TCGA, GSE14764, GSE26193, GSE26712, GSE63885, and GSE140082 datasets. Several immune infiltration algorithms and single-cell analysis were used to explore the correlation between CDS and the ecosystem and therapy response of OC patients. Results. The prognostic CDS constructed by the combination of StepCox (n = both) + Enet (alpha = 0.2) acted as an independent risk factor for the overall survival (OS) of OC patients and showed stable and powerful performance in predicting the OS rate of OC patients. Compared with tumor grade, clinical stage, and many previously developed signatures, the CDS had a higher C-index. OC patients with a low CDS score had higher levels of CD8+ cytotoxic T cells, B cells, and M1-like macrophages, indicating an immunoactivated ecosystem. A low CDS score indicated a higher PD1 and CTLA4 immunophenoscore, higher tumor mutation burden score, lower tumor immune dysfunction and exclusion score, and lower tumor escape score in OC, suggesting a better immunotherapy response. OC patients with a high CDS score had a higher gene set score for cancer-related hallmarks, including angiogenesis, epithelial–mesenchymal transition, hypoxia, glycolysis, and Notch signaling. Conclusion. The current study constructed a novel CDS for OC, which could serve as an indicator for predicting the prognosis, ecosystem, and immunotherapy benefits of OC patients.


Introduction
Ovarian cancer (OC) is the leading cause of gynecological cancer death and the fifth most common cause of cancer-related death in women in America [1]. In 2022, an estimated 19,880 new cases of OC were diagnosed and 12,810 patients died of this malignancy in America [2]. Although many management approaches have been used for the therapy of OC patients, including surgery, chemotherapy, and endocrine therapy, clinical outcomes are still poor, with a 5-year survival rate of less than 50% [1]. Apart from the tumor-node-metastasis staging system, few clinical markers are used to predict the prognosis of OC patients. High recurrence and drug resistance remain the main reasons for treatment failure and poor clinical outcomes in OC patients [3, 4]. Because typical clinical symptoms are lacking in the early stage, many OC patients already have advanced disease or distant metastases when OC is initially diagnosed. A recent study showed that immunotherapy could be a promising modality for many malignancies, especially advanced malignancies [5]. However, evidence on the response of OC to immunotherapy, and on biomarkers for predicting that response, is limited.
According to the triggering mechanism, cell death can be divided into accidental cell death and programmed cell death (PCD) [6]. As far as we know, PCD can be divided into 15 subtype patterns: pyroptosis, ferroptosis, necroptosis, autophagy, immunogenic cell death, entotic cell death, cuproptosis, parthanatos, lysosome-dependent cell death, intrinsic apoptosis, extrinsic apoptosis, necrosis, anoikis, apoptosis-like morphology, and necrosis-like morphology [6][7][8]. Pyroptosis can regulate tumor cell proliferation and metastasis and affect the immune response [9]. A previous study showed that cuproptosis can regulate the microenvironment and affect prognosis in several types of cancer [10]. Increasing evidence highlights the vital role of ferroptosis in reversing drug resistance [11]. As a key player in cellular and body metabolism, autophagy is associated with the progression and prognosis of cancer [12]. As an emerging hallmark in health and disease, anoikis plays a vital role in tumor progression and drug resistance [13]. Given the vital role of these PCD patterns in cancer, a comprehensive understanding of PCD-related genes in OC and their correlation with patient prognosis, ecosystem, and therapeutic response may yield useful findings.
In our study, we developed a 21-gene cell death signature (CDS) for predicting the prognosis of OC patients in the TCGA cohort. The CDS was verified using five testing cohorts: GSE14764, GSE26193, GSE26712, GSE63885, and GSE140082. We then explored the correlation between CDS and the prognosis, immune infiltration, ecosystem, and signaling pathways in OC, offering insights into prognosis prediction and the immune landscape in OC.

Machine-Learning Algorithms Developed a Prognostic CDS.
To obtain the differentially expressed genes (DEGs) in OC among PCD-related genes, we used the "limma" package with |logFC| ≥ 1 as the cutoff. After obtaining potential prognostic biomarkers with univariate Cox analysis, we submitted these biomarkers to an integrative analysis procedure with 10 machine-learning algorithms, including random survival forest, elastic network (Enet), Lasso, Ridge, stepwise Cox, CoxBoost, partial least squares regression for Cox (plsRcox), supervised principal components (SuperPC), generalized boosted regression modeling, and survival support vector machine, in order to develop an accurate and stable prognostic CDS. The signature generation procedure was as follows: (1) prognostic biomarkers were identified using univariate Cox regression in the TCGA dataset; (2) 101 algorithm combinations were then applied to the prognostic biomarkers to fit prediction models based on the leave-one-out cross-validation framework in the TCGA dataset; (3) all models were tested in five GEO cohorts (GSE14764, GSE26193, GSE26712, GSE63885, and GSE140082); (4) for each model, Harrell's concordance index (C-index) was calculated across all TCGA and GEO datasets, and the model with the highest average C-index was considered optimal. Similar machine learning procedures can be seen in previous studies [16][17][18]. The parameter tuning details and R scripts used in our study are available on GitHub (https://github.com/Zaoqu-Liu/IRLS).
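Step (4) of the procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' R implementation: the function names `harrell_c_index` and `select_best_model` and the example C-index values are hypothetical, and pairs with tied survival times are simply skipped, which is a simplification.

```python
from itertools import combinations
from statistics import mean


def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs in which the patient with the
    shorter observed survival also has the higher risk score."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: skip tied times; a pair is comparable only if
        # the shorter time is an observed event (not censored).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def select_best_model(c_index_by_model):
    """Given {model: {cohort: C-index}}, return the model with the
    highest C-index averaged over cohorts, and that average."""
    averages = {m: mean(v.values()) for m, v in c_index_by_model.items()}
    best = max(averages, key=averages.get)
    return best, averages[best]


# Hypothetical values for illustration only (not results from the study).
example = {
    "model_A": {"TCGA": 0.60, "GEO_1": 0.50},
    "model_B": {"TCGA": 0.65, "GEO_1": 0.55},
}
best_name, best_avg = select_best_model(example)
```

Averaging over all cohorts, rather than ranking on the training cohort alone, favors models that hold up on external data.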

Evaluation of the Performance of CDS.
Based on the expression of the genes in the CDS and their corresponding coefficients, we calculated the CDS score of each OC case. To separate OC cases into low and high CDS score groups, we applied the "surv_cutpoint" function of the R package "survminer" to determine the cutoff. Using the "pROC" package, we then generated time-dependent C-index curves. C-index curves were used to compare the performance of CDS in predicting clinical outcome with that of 54 prognostic signatures (mRNA- and lncRNA-related signatures, Supplementary 2) that have been developed for OC. By searching "prognostic model AND ovarian cancer" or "prognostic signature AND ovarian cancer" in PubMed (https://pubmed.ncbi.nlm.nih.gov/) on February 1, 2023, we obtained a total of 540 signatures that have been developed for OC. We used Excel to generate 54 random numbers from 1 to 540, and the signatures corresponding to these numbers were selected for comparison with our prognostic signature. To identify risk factors for the prognosis of OC, we conducted univariate and multivariate Cox analysis. Using the "nomogramEx" R package, we then developed a predictive nomogram considering CDS, tumor grade, and tumor stage. For a perfectly calibrated model, the calibration curve falls on the 45° diagonal of the figure.
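The score described above is a weighted sum of gene expression values. The sketch below illustrates the idea under stated assumptions: the gene names and coefficients are placeholders rather than the fitted CDS coefficients, and the cutoff is passed in as a plain number, whereas the study derives it with "surv_cutpoint".

```python
def cds_score(expression, coefficients):
    """Weighted sum of expression values over the signature genes.

    expression:   {gene: expression value} for one patient
    coefficients: {gene: model coefficient} for the signature genes
    """
    missing = [g for g in coefficients if g not in expression]
    if missing:
        raise KeyError(f"missing expression for: {missing}")
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def assign_group(score, cutoff):
    """Label a patient as 'high' or 'low' relative to a given cutoff."""
    return "high" if score > cutoff else "low"


# Placeholder genes and coefficients, for illustration only.
coefs = {"GENE_A": 0.5, "GENE_B": -0.25}
patient = {"GENE_A": 4.0, "GENE_B": 2.0, "GENE_C": 7.0}

score = cds_score(patient, coefs)          # 0.5*4.0 + (-0.25)*2.0
group = assign_group(score, cutoff=1.0)
```

Genes outside the signature (here `GENE_C`) are ignored, while a missing signature gene raises an error instead of being treated as zero.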

Immune Infiltration Analysis.
The correlation between CDS score and immune cells was analyzed with immunedeconv, an R package integrating seven state-of-the-art algorithms: CIBERSORT, MCPcounter, QUANTISEQ, XCELL, CIBERSORT-ABS, TIMER, and EPIC [19]. Using the "estimate" R package [20], we then calculated the immune and ESTIMATE scores of each OC case. Single-sample gene set enrichment analysis (ssGSEA) was used to score the immune cells and related functions of each OC case. Pathway enrichment was considered significant at a normalized enrichment score |NES| > 1, a nominal p-value (NOM p-value) < 0.05, and a false discovery rate-adjusted q-value < 0.25.
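The three enrichment thresholds above combine into a single filter. The sketch below assumes all three conditions must hold at once; the `EnrichmentResult` record and the example rows are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EnrichmentResult:
    pathway: str
    nes: float    # normalized enrichment score
    nom_p: float  # nominal p-value
    fdr_q: float  # FDR-adjusted q-value


def is_significant(r, nes_min=1.0, p_max=0.05, q_max=0.25):
    """True when |NES| > 1, nominal p < 0.05, and FDR q < 0.25."""
    return abs(r.nes) > nes_min and r.nom_p < p_max and r.fdr_q < q_max


# Made-up rows for illustration only.
results = [
    EnrichmentResult("PATHWAY_1", 1.8, 0.01, 0.10),
    EnrichmentResult("PATHWAY_2", -1.5, 0.03, 0.20),
    EnrichmentResult("PATHWAY_3", 0.9, 0.01, 0.05),   # |NES| too small
    EnrichmentResult("PATHWAY_4", 1.6, 0.04, 0.30),   # q-value too large
]
significant = [r.pathway for r in results if is_significant(r)]
```

The absolute value keeps negatively enriched pathways, such as `PATHWAY_2`.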
2.5. scRNA-Seq Analysis.
scRNA-seq data were processed with the Seurat R package (version 4.0) [21]. We retained genes detected in more than three cells and cells with more than 200 detected genes and a mitochondrial proportion of less than 20%. The top 2,000 highly variable genes of each sample, selected by variance-stabilizing transformation, were scaled using the ScaleData function. Dimensionality was reduced by principal component analysis using the RunPCA function. We chose dim = 30 and clustered the cells into different cell groups using the "FindNeighbors" and "FindClusters" functions, with a resolution of 0.5. t-SNE (t-distributed stochastic neighbor embedding), a nonlinear dimension reduction method available in Seurat, was applied to map the high-dimensional cellular data into a 2D space, grouping cells with similar expression patterns and separating those with different expression patterns. The CDS score of each cell was calculated using the AddModuleScore function. To unravel the communication between different cell subtypes, we applied CellChat with default parameters to the single-cell gene expression matrix; CellChat uses known ligand-receptor information to model the communication probability and identify significant communications.
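The quality-control thresholds above can be illustrated with a small, library-free Python sketch. It is not the Seurat implementation. It assumes the cell-level criteria are applied jointly, that gene detection is counted across all cells before cell filtering, and that mitochondrial genes carry an "MT-" prefix; the function name `qc_filter` and the toy matrix are hypothetical.

```python
def qc_filter(counts, min_cells=3, min_genes=200,
              max_mito_frac=0.20, mito_prefix="MT-"):
    """Apply gene- and cell-level QC to {cell_id: {gene: count}}.

    Keeps cells with more than `min_genes` detected genes and a
    mitochondrial count fraction below `max_mito_frac`, and genes
    detected in more than `min_cells` cells.
    Returns (kept_cells, kept_genes).
    """
    kept_cells = []
    gene_cells = {}  # gene -> number of cells in which it is detected
    for cell, genes in counts.items():
        detected = [g for g, c in genes.items() if c > 0]
        for g in detected:
            gene_cells[g] = gene_cells.get(g, 0) + 1
        total = sum(genes.values())
        mito = sum(c for g, c in genes.items() if g.startswith(mito_prefix))
        # Cells with no counts are treated as failing the mito filter.
        mito_frac = mito / total if total else 1.0
        if len(detected) > min_genes and mito_frac < max_mito_frac:
            kept_cells.append(cell)
    kept_genes = sorted(g for g, n in gene_cells.items() if n > min_cells)
    return kept_cells, kept_genes


# Toy matrix with relaxed thresholds so the effect is visible.
toy = {
    "cell_1": {"A": 5, "B": 3, "MT-X": 1},  # passes both cell filters
    "cell_2": {"A": 1, "MT-X": 9},          # mitochondrial fraction 0.9
    "cell_3": {"A": 2},                     # only one detected gene
}
cells, genes = qc_filter(toy, min_cells=1, min_genes=1)
```

With the defaults, the call mirrors the thresholds stated in the text (more than three cells, more than 200 genes, less than 20% mitochondrial counts).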

Integrative Machine-Learning Algorithms Developed a Prognostic CDS.
To develop an accurate and stable prognostic CDS, we submitted these 38 potential prognostic biomarkers to the integrative machine-learning procedure, which included the 10 methods mentioned above. In total, we obtained 101 prognostic models and their C-indexes in the training and testing cohorts (Figure 2(a)). The model constructed by the StepCox (n = both) + Enet (alpha = 0.2) method was considered the optimal prognostic model, as it had the highest average C-index of 0.59 (Figure 2(a)). The prognostic CDS comprised 21 PCD-related genes, and the CDS score of each OC patient was calculated with the formula: risk score = (−0.267362533 … In Figures 3(a) and 3(b), univariate and multivariate Cox regression analysis suggested that the CDS-based risk score acted as an independent risk factor for the OS rate of OC in the TCGA, GSE14764, GSE26193, GSE63885, and GSE140082 cohorts (all p < 0.05). Many prognostic signatures have already been developed for OC. To compare the predictive value of CDS with that of other prognostic signatures, we randomly collected 54 prognostic signatures (Supplementary 2) and calculated their C-index. The C-index of our CDS was higher than that of most of these prognostic signatures in the TCGA cohort (Figure 3(c)). Similar results were obtained in the GSE26193 and GSE140082 datasets: in the GSE26193 cohort, the C-index of our CDS was higher than that of 50 of these prognostic signatures, and in the GSE140082 cohort it was higher than that of 48 of them (Supplementary 4). To predict the 1-, 3-, and 5-year OS rate of OC, we then constructed a nomogram considering CDS-based risk score, clinical stage, and tumor grade (Figure 3(d)). Compared with the ideal curve, the nomogram-based calibration curves showed relatively good predictive value for the 1-, 3-, and 5-year OS rate in OC (Figure 3(e)), with an AUC of 0.710 (Figure 3(f)). Moreover, decision curve analysis (DCA) suggested that the predictive benefit of the nomogram was higher than that of risk score, tumor grade, or clinical stage alone (Figure 3(g)).
3.4. The Distinct Ecosystem in OC Patients with Different CDS Scores.
A significant correlation was observed between CDS-based risk score and the abundance of immune cells (Figure 4(a)). The CDS-based risk score showed a negative correlation with the abundance of immunoactivated cells, such as CD8+ T cells, B cells, and M1 macrophages (Figure 4(b)-4(d), all p < 0.05). As shown in Figure 4(e), ssGSEA analysis revealed that OC patients with low-risk scores had a higher abundance of immunoactivated cells, including B cells, CD8+ T cells, neutrophils, NK cells, and TIL (all p < 0.05). Further analysis revealed that OC patients with low-risk scores had a lower stromal score, higher immune score, and higher ESTIMATE score (Figure 4(f), all p < 0.05). A higher gene set score correlated with CC chemokine receptor, cytolytic activity, parainflammation, and T-cell costimulation was obtained in OC patients with higher CDS scores (Figure 4(g)). Moreover, the level of most of the human leukocyte antigen-related genes was higher in OC patients in the low-risk group (Figure 4(h), p < 0.05). Based on these findings, we suggest that the immune environment differs significantly between OC patients with low and high CDS scores. As cells exert their functions by interacting with other cells, we then explored the interacting ecosystem in OC patients with different CDS scores. As shown in Figure 5

CDS Could Predict the Therapy Response in OC.
As the ecosystem in OC patients with different risk scores is significantly distinct, the immunotherapy response of OC patients with different risk scores may also differ. To verify this, we applied several approaches to evaluate the predictive value of CDS score for immunotherapy response. Immune checkpoints play a vital role in the immune escape of cancer. The data showed that the expression of most of the immune checkpoints was higher in OC patients with high-risk scores (Figure 6(a), all p < 0.05). TMB has been suggested as a predictive biomarker of response to immunotherapy, and a high TMB score indicates a better response to immunotherapy [22]. IPS is a superior predictor of response to anti-CTLA-4 and anti-PD-1 antibodies, and a high IPS indicates a better response to immunotherapy [23]. We found that OC patients with low-risk scores had a higher TMB score, higher PD1 immunophenoscore, higher CTLA4 immunophenoscore, and higher PD1 and CTLA4 immunophenoscore (Figures 6(b) and 6(c), all p < 0.05). A high TIDE score indicates a greater likelihood of immune escape and less effectiveness of ICI treatment [24]. The data showed that OC patients with high-risk scores had a higher immune escape score, TIDE score, T-cell exclusion score, and T-cell dysfunction score (Figures 6(d) and 6(e), all p < 0.05). We also used two immunotherapy cohorts to verify our results. In the GSE91061 cohort, patients with high-risk scores had a poor OS rate, and the response rate was significantly higher in patients with low-risk scores than in those with high-risk scores (Figure 6(f), p < 0.05). The risk score in PD/SD patients was significantly higher than that in PR/CR patients (Figure 6(f), p < 0.05). Similarly, high-risk scores indicated a poor OS rate in the IMvigor210 cohort (Figure 6(g), p < 0.05). Compared with patients with high-risk scores, patients with low-risk scores had a higher response rate (Figure 6(g), p < 0.05). The risk score in PD/SD patients was significantly higher than that in PR/CR patients in the IMvigor210 cohort (Figure 6(g), p < 0.01). Given the vital role of chemotherapy, targeted therapy, and endocrine therapy in the treatment of OC, we then explored the IC50 values of common drugs in OC patients. As shown in Figure 7(a)-7(h), OC patients with high-risk scores had lower IC50 values for tamoxifen, cyclophosphamide, epirubicin, paclitaxel, dasatinib, foretinib, osimertinib, and ibrutinib, suggesting that OC patients with high-risk scores may be more sensitive to chemotherapy and targeted therapy (all p < 0.05).
3.6. The Distinct Difference in Cancer-Related Hallmarks in OC Patients with Different CDS Scores.
We finally performed gene set enrichment analysis (GSEA) to explore the potential mechanisms underlying the differences among OC patients in clinical outcome, ecosystem, and therapy response. As shown in Figure 8(a)-8(l), OC patients with high-risk scores had a lower gene set score correlated with apoptosis and a higher gene set score correlated with angiogenesis, epithelial-mesenchymal transition (EMT), glycolysis, hypoxia, IL2-STAT5 signaling, IL6-JAK-STAT3 signaling, mitotic spindle, NOTCH signaling, P53 pathway, TGF-beta signaling, and PI3K-AKT-mTOR signaling (all p < 0.05).

Discussion
In our study, we developed a prognostic CDS using the StepCox (n = both) + Enet (alpha = 0.2) combination in the TCGA dataset. The CDS acted as an independent risk factor for the OS rate in OC and showed stable and powerful performance in predicting patients' OS rate. These findings were also verified in the GSE14764, GSE26193, GSE26712, GSE63885, and GSE140082 cohorts. Moreover, CDS could serve as an indicator for predicting the ecosystem and immunotherapy benefits of OC patients. Among the 21 CDS genes (TPM3, SYNCRIP, CALM1, CASP2, IER3IP1, AGFG1, SSBP1, CDKN1B, BRPF1, RB1, BRD4, GBP1, FLOT2, PPP1R13L, FPR1, TGM2, LIG3, COL5A2, LRP1, SEC22B, and PDK4), many have been reported to play a vital role in the development of OC. REDD1 could regulate CASP2-dependent cell death in OC by inhibiting mTOR [25]. BRPF1 played a vital role in the development and progression of OC [26]. A previous study suggested RB1 as an immune-related prognostic biomarker and promising target in OC [27]. BRD4 amplification promoted an oncogenic gene expression program in high-grade serous OC and conferred sensitivity to bromodomain and extra-terminal motif inhibitors [28]. Serum exosomal LRP1 accelerated the migration of OC cells [29].
Targeting immune checkpoints and activating antitumor immunity play a vital role in eradicating tumor cells [30]. Immunotherapy offers hope to OC patients with unresectable cancers [31]. However, evidence on the response of OC to immunotherapy, and on biomarkers for predicting that response, is limited. A high TIDE score indicates a greater likelihood of immune escape and less effectiveness of ICI treatment [24]. IPS is a superior predictor of response to anti-CTLA-4 and anti-PD-1 antibodies, and a high IPS indicates a better response to immunotherapy [23]. In our study, we also explored the role of CDS in predicting the immunotherapy benefit of OC patients. The data showed that OC patients with low CDS scores had a lower immune escape score, lower TIDE score, higher TMB, and higher IPS, suggesting that OC patients with low CDS scores may benefit more from immunotherapy.
To explore the potential mechanisms underlying the differences among OC patients in clinical outcome, ecosystem, and therapy response, we performed GSEA. The results showed that OC patients with high-risk scores had a lower gene set score correlated with apoptosis and a higher gene set score correlated with angiogenesis, EMT, glycolysis, hypoxia, IL2-STAT5 signaling, IL6-JAK-STAT3 signaling, mitotic spindle, NOTCH signaling, P53 pathway, TGF-beta signaling, and PI3K-AKT-mTOR signaling. Angiogenesis is correlated with tumor metastasis and serves as a therapeutic target in OC [32]. OC cells develop chemoresistance by regulating glycolysis, which affects T-cell function [33,34]. Notch signaling is pivotal for various physiological processes in OC, including immune responses and tumor progression [35]. A previous study showed that hypoxia in the microenvironment could affect the immunotherapy outcome of OC [36].
Many signatures have been developed for predicting the clinical outcome of OC patients. To compare the predictive value of our CDS with that of other signatures, we randomly collected 54 prognostic signatures (Supplementary 2) and calculated their C-index. The C-index of our CDS was higher than that of most of these prognostic signatures in the TCGA, GSE26193, and GSE140082 cohorts, suggesting that our CDS predicts the clinical outcome of OC patients better than many existing prognostic signatures. However, the AUC and the C-index of our CDS were not very high, and some of the genes may not be detected in every patient. Therefore, the application of our CDS in predicting the prognosis of OC still needs to be verified in more clinical OC samples. Moreover, whether the CDS is suitable for cancers other than OC needs to be further explored.
Some limitations and shortcomings remain in our study. All data were obtained from public databases, and it would be better to validate this prognostic model using clinical data. Moreover, the mechanism of CDS in the progression of OC remains to be explored. Because complex models were fitted to high-dimensional data, the prognostic signature may fail to generalize to new and unseen data.

Conclusion
The current study constructed a novel CDS for OC, which could serve as an indicator for predicting the prognosis, ecosystem, and immunotherapy benefits of OC patients.

FIGURE 1: Workflow of our study.

(a), we identified 23 clusters from 7 OC single-cell samples and 6 main types of cells: T/NK cells, myeloid cells, epithelial cells, fibroblasts, B cells, and endothelial cells. The expression of cell markers is shown in Figure 5(b). Based on the expression pattern of cell markers, T/NK cells could be reclustered into CD8+ cytotoxic T, CD8+ exhausted T, NK, CD4+ exhausted T, and CD4+ naïve T cells (Figures 5(c) and 5(d)). Myeloid cells could be clustered into M1-like macrophages, M2-like macrophages, monocytes (mono), plasmacytoid DCs (pDCs), and conventional dendritic cells (cDCs) (Figures 5(e) and 5(f)). Using the AddModuleScore function, we then obtained the CDS score of each OC sample and divided the samples into high and low CDS score groups (Figure 5(g)). To uncover the interacting ecosystem of cells in the high and low CDS score microenvironments, we used CellChat to construct a cell-cell communication network via known ligand-receptor pairs within these cells in OC samples. Figure 5(h) shows the cell interactions in high and low CDS score environments. Notably, the low CDS-derived B cells, CD8+ cytotoxic T cells, and M1-like macrophages possessed a higher number of ligand-receptor pairs, whereas the CD4+ exhausted T and CD8+ exhausted T cells possessed fewer ligand-receptor pairs (Figure 5(h)).

FIGURE 3: Evaluation of the performance of CDS in predicting the clinical outcome of OC patients. ((a) and (b)) Univariate and multivariate Cox regression analysis considering grade, stage, and CDS in the training and testing cohorts. (c) C-index of CDS and 54 other established signatures in evaluating the prognosis of OC patients. (d) Predictive nomogram constructed using CDS, grade, and stage. ((e) and (f)) Calibration and ROC curves evaluating the predictive value of the nomogram for the overall survival rate of OC patients. (g) DCA demonstrating the good potential of the nomogram for clinical application.

FIGURE 4: Correlation between immune microenvironment and CDS in OC. (a) The correlation between CDS and immune cell infiltration based on several state-of-the-art algorithms. ((b)-(d)) The correlation between CDS and the abundance of CD8+ T cells, B cells, and M1-like macrophages. (e) ssGSEA analysis showing the level of immune cells in OC patients with different CDS scores. (f) The stromal score, immune score, and ESTIMATE score in OC patients with different CDS scores. ((g) and (h)) The gene set score correlated with immune-related functions and human leukocyte antigen-related genes in OC patients with different CDS scores. *p < 0.05, **p < 0.01, ***p < 0.001.

FIGURE 5: Single-cell analysis revealing the distinct ecosystem in OC patients. (a) tSNE plot of 34 cell clusters and 6 major cell types from 7 OC single-cell samples. (b) Dotplot showing average expression levels of cell marker genes of major cell types. ((c) and (d)) tSNE plot of subcell types of T cells and expression pattern of cell markers in dotplot. ((e) and (f)) tSNE plot of subcell types of myeloid cells and expression pattern of cell markers in dotplot. (g) CellChat revealing cell-cell communication network via known ligand-receptor pairs in OC samples with different CDS scores. (h) The low CDS-derived B cells, CD8+ cytotoxic T cells, and M1-like macrophages possessed a higher number of ligand-receptor pairs, whereas the CD4+ exhausted T and CD8+ exhausted T cells possessed fewer ligand-receptor pairs.
Compared with tumor grade and clinical stage, the C-index of CDS was higher (Figure 2(h)-2(l)) in the training and testing cohorts, demonstrating that CDS predicted the OS rate of OC patients better than tumor grade and clinical stage. However, we could not make this comparison in the GSE26712 cohort because data on tumor grade and clinical stage were missing.