Construction and Validation of a Ferroptosis-Related Prognostic Model for Gastric Cancer

Background Gastric cancer (GC), an extremely aggressive tumor with a very different prognosis, is the third leading cause of cancer-related mortality. We aimed to construct a ferroptosis-related prognostic model that can be distinguished prognostically. Methods The gene expression and the clinical data of GC patients were downloaded from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus database (GEO). The ferroptosis-related genes were obtained from the FerrDb. Using the “limma” R package and univariate Cox analysis, ferroptosis-related genes with differential expression and prognostic value were identified in the TCGA cohort. Last absolute shrinkage and selection operator (LASSO) Cox regression was applied to shrink ferroptosis-related predictors and construct a prognostic model. Functional enrichment, ESTIMATE algorithm, and single-sample gene set enrichment analysis (ssGSEA) were applied for exploring the potential mechanism. GC patients from the GEO cohort were used for validation. Results An 8-gene prognostic model was constructed and stratified GC patients from TCGA and meta-GEO cohort into high-risk groups or low-risk groups. GC patients in high-risk groups have significantly poorer OS compared with those in low-risk groups. The risk score was identified as an independent predictor for OS. Functional analysis revealed that the risk score was mainly associated with the biological function of extracellular matrix (ECM) organization and tumor immunity. Conclusion In conclusion, the ferroptosis-related model can be utilized for the clinical prognostic prediction in GC.


Introduction
According to the latest global cancer statistics, gastric cancer (GC) ranks fifth in the incidence of cancers and is the third leading cause of cancer-related mortality [1]. e overall survival of patients with GC varies widely in different regions of the world. For example, the 5-year survival rate is 31% in the United States, 19% in the United Kingdom, and 26% in Europe [1]. e conditions mentioned above indicated that GC is a disease with high heterogeneity [2]. e conventional system for prognosis prediction, such as histological grade and tumor stage, is becoming increasingly difficult to cover the clinical diversity of GC [3]. erefore, developing a novel prognostic model is urgent.
Cancer cells often have defects in the execution of cell death, which is one of the main reasons leading to treatment resistance [4]. Ferroptosis is a newly discovered form of regulating cell death, which was driven by the accumulation of lipid peroxidation and lethal reactive oxygen species [5]. In recent years, ferroptosis has gradually become a promising therapeutic method for inducing cancer cell death [6]. Ferroptosis has been observed associated with tumor prognosis [7] and a prognostic ferroptosis-related gene signature has been successfully established in hepatocellular carcinoma [8]. In GC, some genes have been found to play a vital role in regulating ferroptosis. For example, CDO1 [9] and ALOX15 [10] can promote ferroptosis in human GC cell while SCD1 [11], PLIN2 [12] and GDF15 [13] are opposite. However, whether these ferroptosis-related genes are correlated with the prognosis of GC patients still remains to be explored.
Herein, we constructed a ferroptosis-related prognostic model based on mRNA expression profiles and clinical data of GC patients from TCGA and validated it in a meta-GEO cohort. Analyses of functional enrichment and tumor microenvironment were also conducted to explore the potential mechanisms.

Materials and Methods
e flowchart of this research is shown in Figure 1.

Data Collection and Preprocessing.
e gene expression and corresponding clinical information of stomach adenocarcinoma (STAD) samples were downloaded from the UCSC Xena browser (https://xenabrowser.net/) [14]. 375 STAD samples with expression and clinical data were obtained. After removing samples with 0-days follow-up duration and incomplete clinical information, 317 STAD samples and 32 adjacent samples were obtained and used for the primary cohort. Gene expression data and corresponding clinical information of GSE66229 [15], GSE15459 [16], and GSE34942 [17] datasets, totally including 556 GC patients, were retrieved from the Gene Expression Omnibus (GEO) database. e above three GEO datasets were performed on the same microarray platform of [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array. After removing 11 patients with incomplete clinical information or 0-days follow-up duration, the remaining 545 GC patients were consolidated as a GC meta-GEO cohort.
e batch effects were removed by the "combat" function of the "sva" package of R. Finally, 317 GC patients from the TCGA-STAD cohort and 545 GC patients from the meta-GEO cohort were enrolled. e clinical characteristics of the patients above are detailed in Table 1. FerrDb (http:// www.zhounan.org/ferrdb/) collected 259 ferroptosis-related genes including driver, suppressor, and marker [18]. e confidence levels of genes involved in ferroptosis were assigned to 4 degrees including validated, screened, predicted, and deduced. e species involved included humans, mice, rats, and drosophila. To ensure the accuracy and stability of the model, 121 human-related and validated ferroptosis-related genes were obtained and provided in Supplementary Table S1.

Identification of Differentially Expressed and Prognostic
Genes. In the primary cohort, ferroptosis-related genes with differential expression between tumor tissues and adjacent tissues were identified utilizing the "limma" R package according to the following cut-off value: false discovery rate (FDR) < 0.05. rough univariate Cox analysis, the association between expression levels of ferroptosis-related genes and GC patients' overall survival (OS) was explored. Overlapping ferroptosis-related genes with differential expression and prognostic value in the primary cohort were subjected to construct a prognostic model.

Construction and Validation of the Prognostic Ferroptosis-Related Model.
Based on the expression of prognostic DEGs and survival data, the LASSO Cox regression analysis by R package "glmnet" was performed to further select the most useful prognostic markers and the penalty regularization parameter lambda was chosen based on 10 cross-validations.
rough multiplying the expression level of a gene by its corresponding Cox regression coefficient, the risk score for each patient was calculated using the following formula: risk score � e sum (each gene's expression × corresponding coefficient) . e patients were separated into high-and low-risk groups based on the median value of the risk score. e "Rtsne" package and the prcomp function in the "stats" package were used to perform the t-SNE and PCA analysis to explore the distribution of high-and low-risk groups. Kaplan-Meier survival curves and a time-dependent ROC curve analysis were applied to compare the survival between the above two groups and evaluate the model's predictive ability using the "survivalROC" package in R, respectively.

Functional Enrichment Analysis.
e enrichment analysis of Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) was carried out according to the DEGs to explore different molecular mechanisms and between high-and low-risk patients by utilizing the "clus-terProfiler" R package. e P values are adjusted using the BH method to control the FDR.
2.5. Calculation of Immune Score, Stromal Score, and ESTI-MATE Score. ESTIMATE (Estimation of Stromal and Immune cells in Malignant Tumor tissues using expression) algorithm was used to evaluate the ratio of the immunestromal component in the tumor microenvironment (TME) through utilizing "estimate" R package, which generates three scores including Immune Score (reflecting the level of immune cells infiltrations), Stromal Score (reflecting the  presence of stroma), and ESTIMATE Score (reflecting the sum of both) [19]. e higher the respective score is, the larger the ratio of the corresponding component in TME exists.
2.6. Immune Cells Infiltration and Immune-Related Pathways between Two Risk Groups. e infiltrating level of 16 immune cells and the activity of 13 immune-related pathways in each GC patient were quantified using single-sample gene set enrichment analysis (ssGSEA) [20] in the "gsva" R package. e annotated gene set file is presented in Supplementary Table S2.

Quantitative Real-Time PCR.
Total RNA was extracted from cells with TRIzol reagent (Invitrogen, China) in accordance with manufacturer's protocol. Reverse transcription was carried out according to the manufacturer's instructions using the PrimeScript RT Reagent Kit (Takara, China). e SYBR PrimeScript RT-PCR Kit (Takara) was applied for the analysis of quantitative reverse transcriptionpolymerase chain reaction (qRT-PCR). e 2 −ΔΔCt statistic was used to calculate the expression levels of genes. e concrete sequences of different primers used in this study are included in Supplementary Table S3. 2.9. Statistical Analysis. Student's t-test was applied to identify the differentially expressed ferroptosis-related genes between tumor tissues and adjacent tissues and evaluate the difference of Immune Score, Stromal Score, and ESTIMATE Score between risk groups. e Chi-squared test was used to compare the difference of proportion composition. e difference in ssGSEA scores of immune cells or pathways between the risk groups was evaluated by the Mann-Whitney test with P values adjusted by the BH method. e OS between groups was compared by using the Kaplan-Meier analysis with the log-rank test. And the identification of independent predictors of OS was conducted by the analysis of univariate and multivariate Cox regression. All statistical analyses were performed with R software (Version 3.6.3) or GraphPad Prism software (Version 8.0). All P values are two-tailed with a P value less than 0.05 was considered statistically significant.

Identification of Prognostic Ferroptosis-Related DEGs in the TCGA Cohort.
A total of 80 DEGs were identified related to ferroptosis in GC, and 10 of them were correlated with OS ( Figure 2(a)). Among the 10 prognostic ferroptosis-related DEGs, upregulated and downregulated genes accounted for half in tumor tissue, which was visualized using a heatmap ( Figure 2(b)). According to the univariate Cox regression analysis, all of the 10 genes were significantly associated with the OS of GC patients, of which 6 were risk genes (HR > 1) and 4 were protective genes (HR < 1) (Figure 2(c)). e correlation between the above 10 prognostic ferroptosisrelated DEGs is shown in Figure 2(d).

Construction of a Prognostic Model in the TCGA Cohort.
rough LASSO Cox regression analysis, 8 predictors most contributing to the OS of GC patients were screened out (i.e., TCFBR1, MYB, NFE2L2, ZFP36, TF, SLC1A5, NF2, and NOX4) based on the optimal value of λ ( Figure S1) and were subjected to construct a ferroptosis-related prognostic model using the following formula: risk score � e (0.062 * ex- . Utilizing the median risk score as the cut-off value, the patients in the TCGA primary cohort were stratified into a high-risk group (n � 158) or a low-risk group (n � 159) (Figure 3(a)). e chisquared test indicated that the higher risk group had a higher proportion of advanced tumor grade and stage in the TCGA cohort (Table 2). PCA and t-SNE analysis revealed that the patients in different risk groups were divided into two directions (Figures 3(b) and 3(c)). As presented in Figure 3

Validation of the Prognostic Model in the Meta-GEO
Cohort. In order to test the robustness of the model developed by the TCGA cohort, we calculated the risk score of each patient in the meta-GEO cohort by the same formula we obtained from the TCGA cohort. en, according to the median value of the prognostic signature score, the patients from the meta-GEO cohort were divided into high-risk (n � 272) or low-risk groups (n � 273) as well (Figure 4(a)). In the meta-GEO cohort, the high-risk group also had a higher proportion of advanced tumor stage ( Table 2). PCA and t-SNE analysis also confirmed a reliable clustering ability of risk score (Figures 4(b) and 4(c)). Similarly, patients in the high-risk group tended to suffer an earlier death and have a significantly shorter survival time than the lowrisk group (Figure 4(d), P � 0.001). In addition, ROC analysis was performed, with the AUC values of 1, 3, and 5 years being 0.646, 0.623, and 0.629, respectively.

Functional Analyses in the TCGA and the Meta-GEO
Cohort. To clarify the biological functions and pathways correlated with the risk score, the enrichment analysis of GO enrichment and KEGG pathway was implemented based on the DEGs between the high-risk and low-risk groups in TCGA-STAD and meta-GEO cohort. According to GO enrichment analysis, the DEGs between risk groups from the TCGA and meta-GEO cohorts were mainly enriched in extracellular matrix (ECM) organization (P. adjust <0.05, Figures 6(a) and 6(c)). KEGG pathway analysis also showed that the ECM-receptor interaction pathway was significantly enriched in both cohorts (P. adjust <0.05, Figures 6(b) and 6(d)).

Estimation of the Proportion of Immune-Stromal
Component. GO and KEGG enrichment analyses illustrated that the differential functions and pathways between risk groups were mainly concentrated on extracellular matrix organization and ECM-receptor interaction pathway.
erefore, it is necessary to further explore the contents of the immune-stromal components in TME. According to the ESTIMATE algorithm, Immune Score, Stromal Score, and ESTIMATE Score (the sum of them) were significantly higher in high-risk groups (P < 0.05,  Figures 7(a) and 7(b)) indicating that there exist more immune-stromal components in TME of the high-risk groups.

Immune Cells Infiltration and Immune-Related Pathways.
To further discuss the differences in immune status between high-and low-risk groups, we estimated the enrichment    Journal of Oncology scores of immune cells and immune-related functional pathways. In the TCGA cohort, the immune cell subpopulations of B cells, iDCs, macrophages, mast cells, neutrophils, pDCs, T helper cells, and TIL were significantly upregulated in the high-risk groups (all adjusted P < 0.05, Figure 8(a)). As for the immune-related pathways, APC costimulation, CCR, parainflammation, and type II IFN response were significantly upregulated in the high-risk group, while MHC class I was opposite (all adjusted P < 0.05, Figure 8(b)). Combined with the meta-GEO cohort, the alterations of macrophages, mast cells, neutrophils, TIL, APC costimulation, CCR, parainflammation, type II IFN

Discussion
e heterogeneity of GC makes it important to develop stable prognostic indicators. In this study, we developed a ferroptosis-related model containing 8 signatures for predicting the prognosis of GC based on the data from TCGA and validated its predictive efficiency in a meta-GEO cohort. Both in the above two cohorts, patients with GC in the high-risk group harbored a shorter survival than those in the lowrisk group. Functional analyses illustrated that the alteration of the risk score was mainly associated with extracellular matrix organization and ECM-receptor interaction pathway. Besides, it was also related to the characteristics of TME. Compared with the low-risk group, the high-risk group harbored higher Immune Score, Stromal Score, and ESTI-MATE Score, suggesting that there existed higher levels in immune cells infiltration and stroma component but lower tumor purity in the TME of high-risk group. It is known that TME is typically characterized into three classes [21,22]: (1) immune-inflamed: immune cells exist adjacent to tumor cells, (2) immune-excluded: immune cells exist around stroma but are not penetrating the tumor, (3) immunedesert: lacks immune cell infiltration. In the current study, according to the exhibitions of high-risk group with a higher abundance of immune cell infiltration and larger ratio of stroma component but poorer prognosis, it was reasonable to speculate that the TME of the high-risk group was in accordance with the immune-excluded subtype. In this context, although the TME displays an abundant infiltration of immune cells, they cannot effectively penetrate the tumor parenchyma to eliminate tumor cells. erefore, it was not surprising that the high-risk group tended to carry a poorer prognosis.   We noticed that TIL were higher in the high-risk group while CD8+ T cell had no difference, suggesting that other types of cells in TIL may differ between risk groups. TIL are comprised of CD8+ T cells, CD4+ T cells, regulatory T cells, tumor associated macrophages, tumor associated neutrophils, myeloid derived suppressor cells, and natural killer cells, which interact with each other to exert antitumor or protumor effects [23]. Most cancers harbored a longer survival with a high amount of CD8+ T cells [24], while increased expression of protumor cells such as regulatory T cells, tumor associated macrophages, tumor associated neutrophils, and myeloid derived suppressor cells usually portend worse outcome [25]. In the current study, the high-risk group has a higher level of macrophages, mast cells, and neutrophils infiltration both in the TCGA and meta-GEO cohort. Studies have shown that tumor-associated macrophages, associated with poor prognosis of GC patients [26], can promote tumor cell proliferation [27], invasion [28], metastasis [29], and immune escape [30]. Increased levels of mast cells foster immune suppression and independently predict reduced OS of GC patients [31]. Tumor-infiltrating neutrophils can also restrain normal T-cell immunity and were associated with GC worse survival [32]. Apart from increased infiltration of protumor immune cells, high-risk groups also had a higher level of type II IFN response characterized by attenuated antitumor immunity. Overall, the enhancement of protumor immunity and the impairment of antitumor immunity may also be the cause of the poor prognosis. e number of ferroptosis-related DEGs between GC and normal samples accounted for 66.1%, indicating a potential role of ferroptosis in GC. ere were 8 signatures constituting the risk model, including TGFBR1, MYB, NFE2L2, ZFP36, TF, SLC1A5, NF2, and NOX4. According to the annotation from FerrDb (http://www.zhounan.org/ ferrdb/) [18], TGFBR1, MYB, TF, SLC1A5, and NOX4 are drivers that promote ferroptosis, while NFE2L2, ZFP36, and NF2 are suppressors that prevent ferroptosis. ey were reported to be involved in ferroptosis. TGFBR1, a type I receptor for TGF-β, can weaken erastin-induced HK-2 cell ferroptosis [33]. MYB is a well-known protooncogene protein which has been reportedly associated with the alterations of ROS [34]. In GC cells, inhibition of c-Myb can suppress the transcription of CDO1, therefore enhancing GSH generation, preventing ROS generation, and ultimately restraining erastin-induced ferroptosis. TF can transport iron into cells. Knockout of TF led to decreased ROS and ferroptosis [35]. SLC1A5 is a cell surface transporter responsible for the transport of neutral amino acids [36]. Overexpression of SLC1A5 can attenuate ferroptosis suppression mediated by miR-137 in melanoma cells. NOX4 participates in the generation of ROS and inhibition of it can block ferroptosis. NFE2L2, also namely NRF2, is mainly involved in the antioxidant response [37]. It served as the ferroptosis suppressor in multiple cancers including hepatocellular carcinoma [38], glioblastomas [39], head and neck cancer [40], and ovarian cancer [41]. ZFP36, regulating cell response to lipid peroxidation and oxidative stress, can lead to resistance to ferroptosis when overexpressed in liver fibrosis [42]. Score * * * * * * * * * * * * * * * * * * * *

Risk
Low High interaction between epithelial cells medicated by E-cadherin could suppress ferroptosis through Merlin-YAP signaling and inactivation of NF2, the Merin-encoding gene, enabled cancer cells to cause ferroptosis [43]. How these genes regulate ferroptosis and shape TME in GC remains to be further explored. is study had some limitations. Firstly, our model was constructed and validated based on retrospective data. Prospective clinical validation is needed henceforth. Secondly, the correlation between risk score and TME in GC has not been investigated experimentally.

Conclusions
In conclusion, our study constructed a ferroptosis-related prognostic model for GC, which was an independent factor associated with OS. However, the underlying mechanisms between ferroptosis-related genes and TME in GC still remain to be explored.

Data Availability
e data of this study are from TCGA and GEO databases.

Disclosure
Xiaotao Jiang and Qiaofeng Yan are the co-first authors.

Conflicts of Interest
e authors declare that they have no conflicts of interest.

Authors' Contributions
Kunhai Zhuang, Fengbin Liu, Xuan Chen, and Peiwu Li designed the study. Xiaotao Jiang, Jiahua Huang, and Linling Xie carried out the analysis and wrote the manuscript. Qiaofeng Yan and Shijie Xu performed the experiment. Yi Wen, Shuting Tang, and Kechao Nie participated in the coordination of the study and interpretation of results.  Table S1: 121 human-related and validated ferroptosis-related genes. Table S2: the annotated gene set file used in ssGSEA.