System Analysis of ROS-Related Genes in the Prognosis, Immune Infiltration, and Drug Sensitivity in Hepatocellular Carcinoma

Hepatocellular carcinoma (HCC) is an aggressive malignant tumor with a poor prognosis. Reactive oxygen species (ROS) play an important role in tumors; however, the role of ROS-related genes is still unclear in HCC. Therefore, we analyzed the role of ROS-related genes in HCC via bioinformatics methods. Firstly, a prognosis model was constructed using LASSO Cox regression and multivariate analyses. We also investigated the potential function of the ROS-related genes and the correlation with immune infiltration, tumor stemness, and drug sensitivity. ICGC database was used for validation. Secondly, we further analyzed the role of 11 ROS-related genes in HCC. As a member of ROS gene family, the role of STK25 has remained unclear in HCC. We explored the biological function of STK25 using in vitro experiments. The present study was the first to construct a ROS-related prognostic model in HCC. The correlation of ROS-related genes with immune infiltration, tumor stemness, and drug sensitivity was dissected. Furthermore, we demonstrated that STK25 knockdown could increase the proliferation, migration, and invasion capacity of HCC cells.


Introduction
Hepatocellular carcinoma (HCC) is the primary pathological type of liver cancer and is one of the most common malignancies worldwide [1,2]. Liver cancer ranks the second leading cause of cancer-related death due to lack of effective treatment [3]. Currently, the mainstay treatment of HCC is surgical excision, liver transplantation, interventional, chemoradiotherapy, and targeted drug therapy. The early diagnosis of HCC is difficult, and hence, the majority of HCC patients suffer from a poor prognosis with a high recurrence rate. Therefore, it remains clinically essential to identify the novel and effective diagnostic markers for HCC. Chronic liver diseases, such as hepatitis B virus, liver cirrhosis, alcoholic liver disease, and nonalcoholic fatty liver disease, are the major risk factor for HCC [4]. HCC is a highly heterogeneous disease, and its mechanism of HCC is still not completely understood. Abnormal expression and mutation of genes contribute to the progression of HCC [5]. However, the underlying mechanisms of HCC development and the key driving factors of carcinogenesis are still unclear, which impedes the development of targeted treatment [6].
Reactive oxygen species (ROS) are regarded as reactive oxygen metabolites and oxygen-containing materials, including superoxide anion (O 2 -) and hydroxyl radical (OH -) as well as nonradical molecules, such as hydrogen peroxide (H 2 O 2 ) [7]. At normal concentrations, ROS serves as the second message that participates in a diversity of signal transduction and regulates cell growth, differentiation, and proliferation. Nevertheless, oxidative stress is the consequence of the imbalanced redox state accompanied by ROS production exceeds cell capacity for ROS scavenging that has been implicated in HCC occurrence [8]. Previous studies have indicated that ROS play a vital role in the progression of HCC through the induction of autophagy [7]. However, a comprehensive analysis of the role of ROS-related genes in HCC has not been reported.
Serine/threonine-protein kinase 25 (SK25), also known as YSK1 or SOK1, plays important roles in different biological processes, such as the regulation of cell migration and modulation of Golgi morphology [9][10][11][12]. Some studies have shown that SOK1 can be activated by chemical anoxia induction, which is dependent upon the generation of ROS [13,14]. In addition, a previous research has indicated that SOK1 promotes the apoptotic response to ROS with marked ROS production and severe ATP depletion [12]. However, as an ROS-related gene, the role of STK25 in liver cancer has not been reported.
In the present study, we systematically investigated the expression and clinicopathological characteristics of ROSrelated genes and constructed ROS-related gene prognostic model in HCC patients. We further demonstrated the relationship between ROS-related genes and tumor-infiltrating immune cells. An ROS-related gene risk model can be used as a prognostic biomarker to predict immune microenvironment in HCC patients. Moreover, focusing on the clinicopathological and immunological characteristics of ROSrelated genes in HCC may optimise tumor immunotherapy.

Public mRNA Expression Datasets.
The mRNA expression of 371 HCC patients and the clinicopathological information of HCC samples were extracted from TCGA (https://portal.gdc.cancer.gov/) and ICGC (https://dcc.icgc .org) databases. In TCGA database, among the 371 patients, six patients were excluded due to lack of survival time, and 365 patients were eventually included in the study ( Table 1). All the raw count data were analyzed to identify differential expression genes (DEGs) in HCC samples and matched noncancerous samples by the package "limma" of the R software. | log 2FC| = 0 and P < 0:05 were set as the cut-off point. The mRNA data and clinical information of 231 liver cancer samples were downloaded from the ICGC database (http://dcc.icgc.org/projects/LIRI-JP). Then, we summarized 49 ROS-related genes from the Molecular Signatures Database (MSigDB) v7.2 (https://www.gsea-msigdb .org/gsea/msigdb/index.jsp) [15] for further analysis. They were listed in Supplementary Table S1. 38 detailed immune checkpoint genes are listed in the Supplementary Table S2. 2.2. Construction of a ROS-Related Gene-Based Signature. All the raw data were analyzed by the "limma" package in the R software and identified the DEGs between HCC tissues and noncancerous tissues. Univariate regression analysis of OS was conducted to screen ROS-related prognostic genes. Benjamini & Hochberg (BH) correction was used to adjust the P value. Then, the STRING database (version11.0) was used for constructing overlapping differential prognostic gene interaction networks [16]. In order to minimize the risk of overfitting, LASSO-penalized Cox regression analysis was used to develop the prognosis model [17,18]. The LASSO algorithm was used for variable selection, combined with "glmnet" R package for shrinkage. In the regression model, we took the normalized expression matrix of the candidate differential prognostic genes as the independent variable and the overall survival and state of the patients in TCGA cohort as the response variable. The penalty parameter (model parameter) of the model is calculated by ten ties cross-validation according to the minimum standard. The risk score was calculated based on the normalized expression level of each gene and its corresponding regression coefficient. The coefficient of each gene is listed in Supplementary  Table S3. The formula was built as follows: score = e sum ð each gene ' s expression × corresponding coefficientÞ. Patients were classified into high-risk and low-risk groups depending on the median risk score. Principal component (PCA) analysis was performed using "PRCOMP" function of "stats" R package based on the expression of gene signature. In addition, the "Rtsen" R package was adopted for t-distributed stochastic neighbour embedding (t-SNE) analysis to investigate the distribution of diverse groups. Kaplan-Meier curves were utilized to estimate the differences in OS between the two groups. The time-dependent ROC curve was plotted to illustrate the sensitivity and predictive ability of gene signature based on the "survival valroc" R package.
2.3. The Construction of Nomograms. Nomogram models were constructed based on the expression of ROS-related prognostic genes by using the "rms" and "survival" packages in R [19]. Then, calibration curves were plotted to estimate the consistency between actual and predicted survival.

Functional Enrichment Analysis.
To explore the potential functional features of ROS-related genes in HCC, we performed Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) analyses based on ROSrelated DEGs (|log 2FC | ≥1, FDR < 0:05) between different risk groups via the "clusterProfiler" R package.  2 Oxidative Medicine and Cellular Longevity 2.5. Immune Infiltration Analysis. Then, we adopted the "gsva" R package to investigate single-sample gene set enrichment analysis (ssGSEA) and to calculate the relationship between risk score and 16 immune-infiltrating cells and 13 immune-related pathways. We further explored the connection between ROS-related DEGs and tumor purity and immune-infiltrating cells by the TIMER and TISIDB databases based on a previously published statistical deconvolution method from gene expression profiles. Tumor Immune Estimation Resource (TIMER) (http://cistrome.org/TIMER/) [20] is an ideal resource that contains 10897 samples across 32 cancer types from TCGA, and it can be adopted to comprehensively analyze immune infiltration levels of diverse cancer types. TISIDB (http://cis.hku.hk/TISIDB/) is also an online web with different types of data which can be used to explore the relationship between the tumor and immune infiltration [21].
2.6. The GEPIA. GEPIA (Gene Expression Profiling Interactive Analysis) (http://gepia.cancer-pku.cn/) has been a valuable and highly cited resource for gene expression analysis based on tumor and normal samples from TCGA and the GTEx databases [22]. GEPIA2021 is a standalone extension with multiple deconvolution-based analyses for GEPIA. They deconvolute each sample tool by TCGA/GTEx with the bioinformatics tools CIBERSORT, EPIC, and quanTIseq [19].
2.7. The UALCAN Analysis. UALCAN is a comprehensive, user-friendly, and interactive web resource for analyzing cancer OMICS data. It is built on PERL-CGI with highquality graphics using JavaScript and CSS [23]. UALCAN can be used to explore the expression profile and patient survival information for genes and to evaluate epigenetic regulation of gene expression by promoter methylation.
2.8. The Kaplan-Meier Plotter Analysis. The Kaplan-Meier plotter can be used to estimate the effect of 54 k genes on survival in 21 cancer types including liver cancer (n = 365) [19,24]. Sources for databases are from GEO, TCGA, and EGA. Oxidative Medicine and Cellular Longevity working solution (10 μM) was added into cells in 6-well plates and incubated for 2 hours. After that, the culture medium was removed, and 1 ml 4% paraformaldehyde was added to each well at room temperature for 15 min. Then, the cells were washed with a washing solution (PBS solution containing 3% bovine serum albumin (BSA)) three times. The permeability solution (0.3% Triton X-100 solution) was added and incubated for 15 min at room temperature. Then, 0.5 ml Click reaction solution was added to each well and incubated for 30 min in the dark room. Finally, 1 ml 1X Hoechst33342 solution was added to 6-well plates and incubated for 10 min in the dark room. The staining was observed under a microscope.

Statistical Analysis
Student's t-test was employed to compare gene expression between HCC tissues and noncancerous tissues. Chi-square test was used to compare proportions between the high-risk group and the low-risk group. The Mann-Whitney test was applied to compare the ssGSEA score of immune cells or pathways among different risk groups. The OS in different groups was performed by Kaplan-Meier (KM) survival analysis with a log-rank test. Univariate and multivariate analyses were used to evaluate independent risk factors for OS. The association of immune checkpoint genes with the risk score was performed by R. F-test (one-way ANOVA) was utilized to identify the expression of 11 ROS-related genes in immune cells between tumor tissues and normal tissues. All statistical analyses were conducted using R software 3.4.2 and SPSS 23.0 software, and P < 0:05 was considered statistically significant.

Results
The workflow of the present study is shown in Figure 1      Oxidative Medicine and Cellular Longevity model. We further conducted functional enrichment and immune infiltrating analyses. Differential expression analysis and survival analyses were performed on 11 ROS-related genes, and the relationship between the expression of the genes and TNM stages was calculated. In addition, we explored the expression of 11 ROS-related genes among different molecular and immune subtypes. Finally, only G6PD and STK25 were retained with P < 0:05. Although, the role of G6PD has been reported in HCC, the role of STK25 has remained unclear. In vitro experiments were carried out to explore the role of STK25 in liver cancer.  Supplementary  Table S1 were drawn from the MSigDB database. 38 ROSrelated genes were differentially expressed between HCC tissues and paired noncancerous tissues by DEG screening analysis based on the "limma" package in the R software. In addition, based on TCGA database, we identified 19 differentially prognostic genes by performing univariate analysis for OS. The results of univariate regression analysis are shown in Figures 2(a)-2(c) (P < 0:05). The genes included are CAT, MSRA, NQO1, FTL, PRDX1, TXN, GLRX2, PRDX6, PRNP, ABCC1, PFKP, CDKN2D, ERCC2, G6PD, STK25, OXSR1, GSR, SRXN1, and TXNRD1. Based on the result of univariate analysis, a protein-protein interaction (PPI) network was built to analyze the association between 19 ROS-related genes, and the finding revealed that STK25 is one of the hub genes (Figures 2(d) and 2(e)).

Constructing a ROS-Related DEG Prediction
Model in TCGA Datasets. LASSO Cox regression was performed to construct a prognostic model, and based on TCGA database, 11 ROS-related genes were retained by the coefficient values, which include CDKN2D, G6PD, GLRX2, GSR, MSRA, OXSR1, PFKP, PRDX1, PRDX6, SRXN1, and STK25. Patients were dichotomized into high and low groups based on the median cut-off value. Patients in the high-risk group were linked to tumor grade (P < 0:001) ( Table 2). Next, we performed PCA and t-SNE to explore the distribution of patients with different risk scores. The results showed that patients were grouped into two directions, and the highrisk group had a shorter survival time than the low-risk group (P < 0:05) (Figures 3(a)-3(d)). This was supported by the results of Kaplan-Meier analysis, which showed that the OS time was shorter in the high-risk group compared with the low-risk group (P = 2:319e − 06) (Figure 3(e)). Furthermore, ROC further demonstrated the possibility and accuracy of OS prediction based on the risk score. The AUC values of 1-year survival, 3-year survival, and 5-year were 0.793, 0.713, and 0.684, respectively (Figure 3(f)). Next, based on the ICGC database, we performed LASSO Cox regression, PCA, t-SNE, survival analysis, and ROC curve to demonstrate the aforementioned results. Our findings indicated that the high-risk group had a shorter survival time than the low-risk group, and the low-risk group had a longer survival time than the high-risk group (P < 0:05) (Figures 4(a)-4(d)). Survival analysis revealed the high-risk group had a shorter OS time than the low-risk group (P < 0.05) (Figure 4(e)). The results of ROC manifested that the AUC values of 1-year survival, 2-year survival, and 3year survival were 0.793, 0.713, and 0.704, respectively (Figure 4(f)). In the ICGC database, only a few liver cancer patients showed 5-year survival, and no results were obtained with R analysis. The ROC of 1, 2, and 3 years was performed to confirm the diagnostic value of the model.

Prognostic Significance of the 11-Gene Signature in HCC.
Patients were grouped into two groups (high-risk and lowrisk) based on the median risk value. According to the univariate and multivariate Cox regression analyses of TCGA database, the 11-gene signature was an independent prognostic factor for OS. The risk scores of univariate regression and multivariate analyses of OS were HR = 3:448 (2.479-4.796) (P < 0:001) and HR = 3:145 (2.179-4.296) (P < 0:001) (Figures 5(a) and 5(b)). Then, we performed the univariate and multivariate Cox regression analyses based on the ICGC cohort. The results also verified that the 11-gene signature was an independent prognostic factor for OS. The risk scores of univariate and multivariate analyses of OS were HR    6 Oxidative Medicine and Cellular Longevity

Functional Analysis in HCC.
In addition, we determined the potential mechanisms of 11 ROS-related genes in HCC based on TCGA and ICGC databases. The "clusterProfiler" R package was adopted to conduct GO and KEGG enrichment analyses. GO analysis indicated that B cell-medicated immunity, lymphocyte-mediated immunity, immunoglobulin-    Oxidative Medicine and Cellular Longevity immune status, and ssGSEA was used to quantify enrichment score for different subsets of immune cells as well as related functions and pathways. Our results also showed that the scores of aDCs, iDCs, macrophages, Tfh, Th1_cells, Th2_ cells, and Treg in the high-risk group were higher compared with the low-risk group (Figure 8(a)). Immune infiltration of HCC with 11 ROS-related DEGs and the scores of APC_co_ stimulation, CCR, checkpoint, HLA, and parainflammation were higher in the high-risk group than that in the low-risk group. The scores of type_I_IFN_response and type_II_ IFN_response were lower in the high-risk group than that in the low-risk group (Figure 8(b)). Consistent with the aforementioned results, the association between risk score and immune status was investigated using the ICGC database. Our findings indicated that the scores of DCs, macrophages, and Th2_cells were higher in the high-risk group than in the low-risk group (Figure 8(c)). The score of MHC_class_I was higher, and that of type_II_IFN_response was lower in the high-risk group than in the low-risk group (Figure 8(d)).

Correlation of Risk Score with the Expression of Immune
Checkpoint Genes for Liver Cancer. Currently, several genes are involved in the immune response, which are considered immune checkpoint genes. We investigated the expression of 38 immune checkpoint genes (listed in Supplementary   Table S2) in the high and low-risk groups of the model and performed survival analysis of differentially expressed genes (DEGs). Our results showed that 31 immune checkpoint genes were DEGs (P < 0:05) (Figure 9(a)). Furthermore, survival analysis indicated that high expression of CD80, LDHA, TNFRSF4, and YTHDF1 was associated with poorer OS in the high-risk group compared with the low-risk group (P < 0:05) (Figures 9(b)-9(e)). These findings suggested that the genes in the model possibly regulate tumor immune response by modulating immune checkpoint activity, thus providing a potential target for immunotherapy.            Figure S1).

The Expression of 11 ROS-Related Genes in Different
Immune Cells between Tumor Tissues and Matched Adjacent Tissues. Since the expression of ROS-related genes was significantly correlated with immune-infiltrating cells, we speculated whether the levels of ROS-related gene expression were different in diverse immune cells in HCC tissues and corresponding adjacent tissues. By analyzing the GEPIA database, the findings indicated that the expression of 11 ROS-related genes was a statistical difference in HCC tissues (Figure 11(a)) (P < 0:05). At the same time, we also explored the expression of 11 ROS-related genes in different immune cells between HCC tissues and paired noncancerous tissues, and our results revealed that the levels of 11 ROSrelated gene expressions were significantly different in B naive cells, macrophage M0, and neutrophils ( Figure 11(b)) (P < 0:05). The above findings further demonstrated that ROS-related genes are related to tumor immunity.   [26,27]. Some studies confirmed correlations between CNA signatures and cancer characteristics [28,29]. Therefore, we investigated the copy number alterations of ROS-related genes and the relationship of CNAs with immune-infiltrating cells. Our findings manifested that the majority of genes had CNA mutations (Figure 12(a)) (P < 0:05), and the CNAs of G6PD, GLRX2, PFKP, PRDX1, PRDX6, and STK25 were related to immuneinfiltrating cells, especially in B cells, CD8+ T cells, macrophages, neutrophils, and dendritic cells (Figures 12(b)-12(l)) (P < 0:05).

11 ROS-Related Genes Are Correlated with Tumor
Stemness, TME, and Drug Sensitivity. Stemness represents the self-renewal, dedifferentiation with the ability to form other cell types in certain specific tissues [30]. A previous study indicated that tumor progression was not only affected by genetic changes, but also by tumor microenvironment [31]. Stemness index is associated with components of TME, and genetic variations in tumor cells can interfere with antitumor immunotherapy [32].          (Figures 13(c)-13(e)). In addition, some studies indicated there was a relationship between the level of ROS and multidrug resistance in cancer [33][34][35]. Therefore, we investigated the association of 11 ROS-related gene expressions with chemotherapy drugs. As shown in Figure 13

Relationship between 11 ROS-Related Gene Expressions and Tumor Subtypes in HCC.
To further dissect the role of ROS-related genes in liver cancer, we divided liver cancer patients into three kinds of molecular subtypes based on TCGA database, namely, iCluster:1, iCluster:2, and iCluster:3, and five types of immune subtypes, namely, C1, C2, C3, C4, and C6. We explored whether the expression of ROS-related genes was a statistical difference among different subtypes. Our findings showed that the expression of G6PD, GLRX2, MSRA, PFKP, PRDX1, PRDX6, and STK25 was significantly different in three molecular subtypes (P < 0:05) (Figure 14(a)). We also found that the expression of CDKN2D, G6PD, GLRX2, MSRA, PFKP, PRDX1, PRDX6, and STK25 was a significant difference among five immune subtypes (P < 0:05) (Figure 14(b)). The above results further confirmed that ROS-related genes were involved in the occurrence and development of liver cancer.

11 ROS-Related Gene Expressions and Survival Analysis in HCC.
The aforementioned results have shown that the model can predict the survival and prognosis of liver cancer patients, and it is closely related to tumor-infiltrating cells, TME, and immune checkpoints. To further study the role of 11 ROS-related genes in liver cancer, based on the UAL-CAN database, we found that CDKN2D, G6PD, GLRX, GSPPFKP, PRDX1, PRDX6, SRXN, and STK25 expression was higher in liver cancer tissues than that in adjacent normal tissues, and MSRA and OXSR1 expression was lower in liver cancer tissues compared with paired noncancerous tissues (P < 0:05) (Figure 15(a)). Then, we also explored the correlation between the expression of 11 ROS-related genes and TNM stages in liver cancer, and the findings

23
Oxidative Medicine and Cellular Longevity was associated with poor overall survival (OS) via the analysis of GEPIA cohort (Figure 15(c)). Then, we further demonstrated the relationship between the expression of 11 ROSrelated genes and survival of liver cancer patients based on the Kaplan-Meier plotter database, and the findings revealed that overexpression of CDKN2D (P = 0:0099), G6PD (P = 1:1e − 07), GSR (P = 0:00019), MSRA (P = 0:0013), PFKP (P = 0:00016), PRDX1 (P = 0:025), SRXN1 (P = 0:00054), and STK25 (P = 0:004) was correlated with short overall survival time ( Figure S2). P < 0.0001 P < 0.0001 P < 0.0001 P < 0.0001 P < 0.0001 P < 0.0001 P < 0.0001 P < 0.0001 P < 0.0001 P < 0.0001 P<0.0001    To further verify the expression of STK25 in liver cancer tissues and paracancerous tissues by the GEO database and clinical specimens, the expression of STK25 was confirmed based on the GEO database. The results showed that STK25 expression was higher in liver cancer tissues compared with paired noncancerous tissues ( Figure 16(a)). Consistent with the results of TCGA and GEO databases, we detected STK25 expression in 9 pair tissues by performing IHC assays. Our findings revealed that the expression of STK25 was higher in the liver cancer tissues compared with the paracancerous tissues ( Figure 16(b)).
4.14. STK25 Knockdown Promoted Apoptosis and Inhibited the Proliferation, Migration, and Invasion Capacity of Liver Cancer Cells. The analysis of public database data showed that STK25 was correlated with survival and prognosis of patients with liver cancer. We further demonstrated the role of STK25 in liver cancer by performing in vitro experiments. CCK8 assay revealed that STK25 knockdown significantly suppressed the proliferation of liver cancer cells (Figure 16(c)). In line with the result of CCK8 assay, EdU assays indicated that STK25 knockdown markedly decreased the proliferation capacity of liver cancer cells (Figure 16(d)). Wound healing assay and transwell migration assays were utilized to test cell migration. We observed that STK25 knockdown dramatically suppressed the migration capacity of liver cancer cells (Figures 16(e)-16(g)). In addition, transwell invasion assays were performed to estimate the invasion ability of cells, and the findings showed that STK25 knockdown significantly inhibited the invasion capacity of liver cancer cells (Figure 16(h)). A previous study demonstrated that the dysregulation of the balance between apoptosis and proliferation of cells can lead to hepatocarcinogenesis [36]. We found that STK25 knockdown significantly increased the apoptosis of liver cancer cells (Figures 16(i) and 16(j)). These results confirm that STK25 has a significant role in the progression of liver cancer.

Discussion
With advances in high-throughput sequencing technology, increasing number of prognostic biomarkers and therapeutic target are being identified. However, immune-related prognostic biomarkers of HCC are still limited. Some studies have reported that the elevated levels of ROS correlate with tumorigenesis [37,38]. To further clarify the role of ROS-related genes in HCC, we firstly constructed a prognostic model based on ROS-related DEGs. This is the first research to explore the prognostic value of 49 ROS-related genes in HCC. Next, a prognostic model, consisting of 11 ROSrelated DEGs developed through multivariate regression and LASSO Cox regression analyses. Functional enrichment analysis indicated that immune-related pathways were enriched. ROS are by-products of cellular metabolism, including hydroxyl radicals, superoxide anions, singlet oxygen, and hydrogen peroxides [38]. Several studies have demonstrated that ROS are related to tumors [37,39]. An increased level of ROS can impair DNA, protein, and lipids and cause genetic instability and tumorigenesis [40,41]. Furthermore, the elevation of ROS levels can activate prosurvival signaling pathways, decrease the activation of tumor suppressor pathways, enhance glucose metabolism, and lead to tumor mutations [7,42]. A study has reported that ROS impair mitochondrial function and oxidative stress, which led to DNA damage and hepatocarcinogenesis [43].