A New Inflammation-Related Risk Model for Predicting Hepatocellular Carcinoma Prognosis

Background Hepatocellular carcinoma (HCC) is characterized by a poor prognosis. Inflammation has a vital role in the formation and development of HCC. However, the prediction of HCC prognosis using inflammation-related genes (IRGs) remains elusive. In this study, we constructed a new IRG risk model to predict the HCC prognosis. Results HCC-related RNA expression profiles and their corresponding clinical data were downloaded from TCGA and ICGC databases to explore the IRGs' predicting ability. Seven hundred thirty-seven IRGs from GeneCards were used as candidate genes to construct the model. The associations of overall survival (OS) with IRGs were evaluated using the log-rank test and univariate Cox analysis, and 32 out of 737 IRGs showed predicting the potential for HCC prognosis. These IRGs were further analyzed using the least absolute shrinkage and selection operator (LASSO) and multivariate Cox analyses. Finally, 6 IRGs were included in an IRG risk model. Based on the cut-off of the risk score calculated according to the IRG risk model, HCC samples were divided into the high-risk and the low-risk groups. The OS of patients was lower in the high-risk group than in the low-risk group (P < 0.05). The area under the receiver operating characteristic curve (AUC) of the risk score was 0.78 for 3-year survival. Univariate Cox and multivariate Cox analyses revealed that the risk score was an independent risk factor for HCC prognosis. The KEGG and GO enrichment analysis results further showed that the risk scores were closely related to inflammatory and immune pathways. In addition, the ssGSEA demonstrated that several immune cells and some immune-related pathways were negatively correlated with the risk score. Conclusions The new IRG risk score was an independent risk factor for HCC prognosis and could be used to assess the immune status of the HCC microenvironment.


Introduction
Liver cancer is one of the most common malignancies and the fourth leading cause of cancer-related mortality worldwide [1]. Each year, there are approximately 841,000 new cases and 782,000 patient deaths from this disease, and these numbers continue to rise [2]. As the most common form of liver cancer [1,3], hepatocellular carcinoma (HCC) is the third-largest cause of cancer-related deaths worldwide [4]. Due to the high heterogeneity of HCC, diagnosis and prognosis are challenging.
HCC is a prototypical inflammation-associated cancer [5]. It has been estimated that 90% of HCC occurs from the underlying chronic i4nflammation of the liver, the induction of fibrosis, and subsequent cirrhosis [6]. The liver has a strong self-repair ability. Differentiated liver cells can reenter the cell cycle to repair the cells after acute liver injury [7]. However, the persistence of inflammation, sustained cell death, compensatory proliferation, activation of nonparenchymal cells, and an altered immune response can promote liver fibrosis and, in turn, lead to tumorigenesis [6][7][8]. Along with the occurrence of tumors, inflammatory signals, cellular stress, epigenetic modifications [9], mitochondrial stress signaling [10], and the immune system may change [11], thus posing significant challenges to the treatment and prognosis evaluation of HCC.
Some studies have demonstrated that some inflammation markers are associated with the prognosis of HCC.
The most common biomarkers for HCC are serological parameters, including thrombocytosis, leukocytosis, hypoproteinemia, and plasma fibrinogen [12,13]. Lin et al. [14] have constructed an inflammation-related risk for prognostic prediction in HCC. However, the AUC of the risk score for 3-year survival was only 0.605. In this study, we constructed a new inflammation-related risk model based on inflammation-related genes (IRGs) from the GeneCards to more accurately predict the HCC prognosis.

Methods
2.1. Data Sets. IRGs were extracted from GeneCards (https:// www.genecards.org/), and 737 IRGs with a relevance score > 3 were selected for further analysis. Three hundred seventy-one HCC samples' RNA sequencing data and related clinical information were extracted from TCGA database through the Genomic Data Commons (GDC) tool (https://portal.gdc.cancer.gov). Another 231 HCC samples' RNA sequencing data and clinical information were obtained from the ICGC website (https://dcc.icgc.org/ projects/LIRI-JP). To confirm detection reliability, genes with read counts equal to 0 in more than 25% of the samples were removed from further analysis. Patients whose survival time was <0.1 months or with incomplete survival data were removed. Finally, 335 tumor samples from TCGA cohort and 231 tumor samples from the ICGC cohort were enrolled in this study.

Construction of the Inflammation-Related Genes Risk
Model. We firstly conducted univariate Cox regression and log-rank test using the survival R package to identify IRGs associated with the overall survival (OS) in TCGA cohort. The IRGs with P < 0:05 in both two analyses were retained. Next, data were analyzed by the least absolute shrinkage and selection operator (LASSO) regression using the glmnet R package. Subsequently, the multivariate Cox regression analysis stepwise method was performed using the survival package and My.stepwise R package to establish an optimal IRG risk model. The risk score was computed as follows: risk score = ∑ 6 i Xi × Yi (X: coefficients, Y: gene expression level). We used the survminer R package to determine the optional cut-off value of the risk score. The HCC samples were finally divided into high-risk and low-risk groups based on the cut-off value.

Performance Assessment Inflammation-Related Risk
Model. The Kaplan-Meier survival analysis and log-rank test were performed using the survival and survminer R packages to display and compare the OS of the high-risk and the lowrisk groups. Principal component analysis (PCA) was used to examine the difference between the two risk groups. The R package timeROC was used to establish a timedependent receiver operating characteristic curve (ROC) to check the accuracy of the risk score in predicting the outcomes of HCC patients. To explore the prognostic factors of HCC, univariate and multivariate Cox analyses were conducted.

Functional Enrichment
Analysis of the DEGs between the Two Risk Groups. We used DESeq2, edgeR, and limma R packages to select differentially expressional genes (DEGs) between high-risk and low-risk groups in TCGA cohort. The screening criteria were |log 2Fold − Change | >1 and P < 0:05. Based on these DEGs, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed using the ClusterProfiler package.

Comparison of the Immune Status between Two Risk
Groups. The ESTIMATE R package was used to calculate the immune score, stromal score, and ESTIMATE score, reflecting the tumor purity and the characteristics of the tumor microenvironment (TME) [15]. The gsva R package was utilized to conduct the single-sample gene set enrichment analysis (ssGSEA) to calculate the scores of infiltrating immune cells and evaluate immune-related pathways' activity in TCGA and ICGC cohorts.

Acquisition of Immunotherapeutic Cohorts and
Collection of Clinical Information. To understand whether the risk score could be used to predict the efficacy of immunotherapy, we found the IMvigor210 cohort (http:// research-pub.Gene.com/imvigor210corebiologies) after a systematic search of the public databases. The IMvigor210 cohort investigated the effectiveness of anti-PD-L1 antibody (pembrolizumab) in patients with advanced urothelial cancer. The complete transcriptome data and detailed clinical information were enrolled. 2.7. Statistical Analysis. One-way ANOVA and Kruskal-Wallis tests were used for multiple comparisons [16]. The Wilcoxon test was used to test the significant difference between the two groups. The Kaplan-Meier survival analysis was used to generate survival curves, and the significance of differences was compared using the log-rank test. Hazard ratios (HRs) and 95% confidence interval (CI) were calculated using univariate Cox and multivariate Cox analyses. Two-sided P < 0:05 was considered statistically significant. The R 4.1.1 software was used to perform all data processing.

Clinical Cohorts.
The workflow chart of this study is shown in Figure 1. A total of 566 HCC patients and 348 patients with advanced urothelial cancer were included in the analysis. Data of 335 HCC patients from TCGA database, data of 231 HCC patients from the ICGC database, and 348 patients with advanced urothelial cancer from the IMvigor210 cohort were included in the analysis. TCGA cohort was used to construct the IRG risk model, and the ICGC cohort was used for validation. The IMvirgor210 cohort was used to evaluate the predictive value of the risk model for immunotherapy efficacy. The detailed clinical information of HCC patients are shown in Table 1, and the clinical information of patients in the IMvirgor210 cohort are shown in Table 2.

Conduction and Validation of the Inflammation-Related
Genes Risk Model. Using univariate Cox analysis and log-rank test, 32 out of 737 IRGs showed predicting prognosis ability in HCC (Figures 2(a) and 2(b)). Then, the 32 IRGs were screened by the LASSO regression analysis, and 13 IRGs were obtained for further analysis (Figures 2(c) and 2(d)). To increase model stability, multivariate Cox analysis was performed by stepwise method. Finally, six IRGs were selected to form the IRG risk model (Figure 2(e)); three IRGs (SSP1, ADAMTS5, and EPO), with HRs > 1, were associated with increased risk, while the other three IRGs (CXCR3, TNFRSF13C, and CRYAA) were protective genes with HRs < 1 in TCGA cohort. The risk score was estimated as follows: risk score = (0.11026×expression level of SPP1) + (-0.18773×expression level of CXCR3) + (0.30501×expression level of ADAMTS5) + (0.12363×expression level of  The C-index for TCGA and ICGC cohorts was 0.753 and 0.680, respectively. In TCGA cohort, the risk score's optimal cut-off value (0.230) was determined by the survminer R package (Figure 2(f)). Based on the cut-off value, 335 tumor samples were divided into the high-risk and the low-risk groups. In the same way, 231 tumor samples from the ICGC cohort were also divided into two risk groups, and the cutoff value was -5. 10.
In TCGA and ICGC cohorts, the proportion of dead patients was higher, and the survival time was shorter in the high-risk group than in the low-risk group (Figures 3(a) and 4(a)). The OS of patients was significantly lower in the high-risk group than in the low-risk group (P < 0:05) (Figures 3(b) and 4(b)). PCA based on the six IRGs showed that patients in two risk groups were distributed in different regions (Figures 3(c) and 4(c)). Next, time-dependent ROC analysis was applied to evaluate the sensitivity and specificity of the risk model. The area under the ROC (AUC) of the risk score was 0.82 for 1-year, 0.78 for 2-year, and 0.78 for 3-year survival in TCGA cohort ( Figure 3(d)). In the ICGC cohort, the AUC was 0.71 for 1year, 0.71 for 2-year, and 0.65 for 3-year survival ( Figure 4(d)). Besides, in TCGA cohort, the AUC of the risk score for 3-year survival was larger than that of other clinical features (the AUC of other clinical features for 3-year survival was less than 0.60) (Figure 3(e)). In the ICGC cohort, the AUC of the risk score for 3-year survival was similar to the AUC of tumor stage (the risk score: 0.65, tumor stage: 0.66), and higher than that of other clinical features (the AUC of other clinical features for 3-year survival were less than 0.60) (Figure 4(e)).
Then, we extracted patients' clinical information from TCGA cohort (age, gender, family history, inflammation grade, and tumor stage) and the ICGC cohort (age, gender, and tumor stage). In order to learn the independent prognostic value of the risk score, clinical information was analyzed in combination with the risk score by univariate and multivariate Cox regression analyses. In univariate Cox analysis, the risk scores of both cohorts were significantly corre-

Correlation of the Risk Score with Clinicopathologic
Features. The boxplot was used to display the relationship of the risk score with the different clinical features. Patients aged < 62 years (median age of the patients in TCGA cohort) were defined as younger patients. The differences of the risk score in different age groups, genders, and family histories were not statistically significant in TCGA cohort (all P > 0:05, Figures 5(a)-5(c)). However, the risk scores were significantly correlated with inflammation grades and tumor stages and elevated with increasing inflammation grades and tumor stages in TCGA cohort (all P < 0:05) (Figures 5(d) and 5(e)); the same results were obtained in the ICGC cohort (P < 0:05 To explore the relationship of the risk score with the OS in patients with different clinical features, we conducted the Kaplan-Meier analysis and log-rank test. The results showed that the OS of the patients with different ages (Figures 6(a) and 6(b)), genders (Figures 6(c) and 6(d)), family histories (Figures 6(e) and 6(f)), inflammation grades (Figures 6(g) and 6(h)), and tumor stages (Figures 6(i) and 6(j)) was negatively correlated with the risk score. The OS of the patients was lower in the high-risk group than in the low-risk group of TCGA cohort (P < 0:05). In the ICGC cohort, the relationship of the risk score with the OS of patients with different ages, genders, and tumor stages was similar to those in TCGA cohort. Although the P values in the younger group and the stages III-IV were above 0.05, this may be due to the small sample size (Figures 6(k)-6(p)). Overall, these

Functional Enrichment Analysis of the DEGs between
Two Risk Groups in TCGA Cohort. To further explore the differences in the gene functions and pathways between the two risk groups, we used the DESeq2, edgeR, and limma R packages to select the DEGs according to specific criteria (|log 2Fold − Change | >1 and P < 0:05). The results revealed that 167 DEGs were upregulated (Figure 7(a)), and 99 genes were downregulated (Figure 7(b)) in the high-risk group. PCA based on the DEGs showed patients in two risk groups were distributed in different regions (Figure 7(c)). The heatmap suggested that the expression levels of the DEGs were obviously different (P < 0:05, Figure 7(d)).

Comparison of the Immune Status between Two Risk
Groups. We used the ESTIMATE algorithm to evaluate the immune, stromal, and ESTIMATE scores of TME [15]. In TCGA cohort, the results showed that the immune score was significantly lower in the high-risk group than in the low-risk group (P < 0:05). Yet, the differences in the stromal scores and ESTIMATE scores between the two risk groups        were not statistically significant (P > 0:05, Figure 8(a)). In the ICGC cohort, the immune scores were also lower in the high-risk group, although the difference was not statisti-cally significant (P > 0:05). Also, the stromal score and ESTI-MATE score were comparable between the two risk groups (Figure 8(b)).

BioMed Research International
Next, we compared the immune cells and pathways between the two risk groups. In TCGA cohort, the highrisk group generally had significantly lower levels of infiltration of immune cells, including B cells, natural killer (NK) cells, CD8 + T cells, T helper cells, tumor-infiltrating lymphocytes (TIL), and neutrophils cells than the low-risk group (all P < 0:05, Figure 8(c)). Also, cytolytic activity, inflammation, the type I IFN response, T cell costimulation, T cell coinhibition, and HLA were lower in the high-risk group than in the low-risk group (P < 0:05) (Figure 8(d)). When assessing the scores of the immune cells and pathways between the two risk groups in the ICGC cohort, similar results were obtained (Figures 8(e) and 8(f)).
3.6. Assessment of the Predictive Value of the Risk Score for the Efficacy of Immunotherapy. The above results showed that immune cells levels and immune signaling pathways were lower in high-risk groups than in low-risk groups.

22
BioMed Research International These findings suggested that the risk score might predict the effect of immunotherapy. Therefore, we used the IMvir-gor210 cohort to assess the predictive value of the risk score for the efficacy of immunotherapy [17]. The survminer R package also determined the cut-off value (2.292). The low-risk group showed a significant clinical benefit and obviously prolonged survival (Figure 9(a)). (P < 0:05) Patients with partial response (PR) exhibited a higher risk than patients with complete response (CR), although the difference was not significant (P > 0:05). Yet, patients with progressive

27
BioMed Research International disease (PD) exhibited a higher risk than patients with stable disease (SD) (P < 0:05, Figure 9(b)). We also observed that the percentage of the patients with PD was higher in the high-risk group than in the low-risk group (60% vs. 49.6%), although the difference was not statistically significant (P > 0:05, Figure 9(c)). These findings suggested that the risk score could be used to assess the efficacy of immunotherapy.

Discussion
Inflammation is closely related to liver cancer, especially chronic hepatitis, which is one of the crucial factors in liver cancer formation. Chronic inflammation damages liver epithelial cells, leading to substantial cell proliferation [18]. At the same time, inflammation induces reactive oxygen species (ROS) and the damage of deoxyribonucleic acid (DNA), which increases the frequency of DNA mutations. When the rate of cell proliferation increases, along with DNA mutations, the incidence of malignant transformation increases accordingly [18]. However, so far, only a few studies have reported on the characteristics of IRGs in liver cancer tissues. In the present study, we constructed a new IRG risk model to predict HCC prognosis. The IRG risk model was constructed using 6 IRGs, including secreted phosphoprotein 1(SSP1), C-X-C motif chemokine receptor 3 (CXCR3), ADAM metallopeptidase with thrombospondin type 1 motif 5 (ADAMTS5), erythropoietin (EPO), TNF receptor superfamily member 13C (TNFRSF13C), and crystallin alpha A (CRYAA).
SPP1 is expressed in most human tissues, including the brain, vascular tissue, kidney, and liver [19,20], and has been reported to regulate cancer cell proliferation through the activation of the MAPK pathway. It can also mediate HCC metastasis by inducing MMP-2 production/activation and NF-κB translocation [21,22].
EPO is a hypoxic-reactive cytokine [23]. Hypoxia is a common feature of TME, which increases the activity of a

29
BioMed Research International hypoxia-inducible factor (HIF) binding to a cis-acting DNA hypoxia response element (HRE) to activate EPO transcription [23,24]. EPO can involve in the development of tumors by promoting tumor angiogenesis and the growth of tumor cells [25].
The exact role of ADAMTS5 in HCC remains unclear. Li et al. [25] have reported that low expression of ADAMTS5 protein is associated with HCC progression and poor prognosis. However, multiple online sites show high ADAMTS5 expression as an independent risk factor for the development of HCC [26]. Furthermore, Zhu et al. [26] confirmed that ADAMTS5 might promote tumor metastasis through biological processes affecting the extracellular matrix (ECM). These data are consistent with our findings.
CRYAA, also known as HspB4, is a well-known antiapoptotic protein [27,28]. Studies have shown that its role in tumors depends on the type of tumor [29][30][31][32]. Some studies have suggested that HSPB4 can promote retinoblastoma and sebaceous adenocarcinoma progression through antiapoptotic effects [33,34]. In contrast, HspB4 is expressed at a moderate level in the normal pancreas and significantly downregulated in pancreatic cancer [30]. It can delay the progression of the tumor by regulating the activity of ERK MAP kinase [35]. However, further experiments are needed to verify the role of HSP4 in the formation and development of HCC.
CXCR3 is a chemokine receptor that is mainly expressed on CD4 + and CD8 + T cells and partly expressed on other cells, including epithelial cells [36,37]. In the CD4 + subpopulation, CXCR3 is most abundant in proinflammatory Th1 cells [38][39][40][41]. Thus, the expression level of CXCR3 is related to the abundance of immune cell infiltration in TME [42].
TNFRSF13C, also known as B-cell activating factor receptor (BAFF-R), is expressed almost exclusively on B cells [43]. It is a key receptor involved in B cells' successful survival and maturation, which determines that its expression levels are closely correlated with the abundance of B cells in TME [44]. A recent study has found a significantly lower percentage of B cells expressing BAFF-R in HBV-associated HCC patients than non-HCC patients, suggesting an important role for BAFF-R in developing HCC in HBV-infected patients [45].
In summary, CXCR3 and TNFRSF13C were associated with immune cell abundance in TME, while ADAMTS5 was closely related to the change of stromal components in TME. SPP1 and EPO promote tumor progression in a

30
BioMed Research International variety of ways. The mechanism of CRYAA in liver cancer has not been reported, which also provides a new direction for further understanding of the development of HCC. The risk score calculated by the IRG risk model in our study was identified as an independent risk factor for HCC prognosis. The higher the risk score, the worse the prognosis and the lower the OS. Lin et al. [14] also reported an IRG risk model with eight IRGs used for prognostic prediction (the risk score =0.118×expression level of SLC7A1 +0.114×expression level of RIPK2+0.113×expression level of NOD2+0.022×expression level of ADORA2B +0.058×expression level of MEP1A+0.051×expression level of ITGA5+0.016×expression level of P2RX4+0.018×expression level of SERPINE1 In the present study, the AUCs of a risk score for 1-year survival, 2-year survival, and 3-year survival were above 0.750 in TCGA cohort; in the ICGC cohort, the AUCs of the risk score for 1-year survival and 2-year survival were above 0.700, and the AUC of the risk score for 3-year survival was 0.650, which is similar to the risk score in Lin et al.'s study. These findings indicated that the risk model in our study had a higher predicting value for HCC than the risk model presented by Lin et al.
Besides, the results of KEGG and GO enrichment analysis based on the DEGs between two risk groups revealed the DEGs were associated with inflammation, immune, and stromal-related pathways. These findings suggested that TME, including immune and stromal, was different in the high-risk group compared to the low-risk group. TME is defined as the cellular and physical environment surrounding the primary tumor, which has cellular components like inflammatory and immune cells, stromal components like ECM proteins, soluble cellular factors, etc. [46,47] In TME, tumor purity is a concept closely related to the immune cells and stromal cells, which refers to the proportion of the tumor cells in the tumor tissue [48]. Yoshihara et al. [15] used an ESTIMATE method to deduce the tumor purity. ESTIMATE score is the combination of stromal score and immune score, the primary basis for predicting tumor purity [15]; the lower ESTIMATE score, the higher tumor purity [48]. In the present study, the tumor purity was similar between the two risk groups, but the immune score was lower in the high-risk group than in the low-risk group. And most of the immune cells and immune-related pathways were reduced in the high-risk group, suggesting low immune levels. The same results were obtained in the ICGC cohort. It is well known that some immune cells have important roles in antitumor immunity, such as CD8 + T cells [49], NK cells [50], and B cells [51]. The decrease in immune response was closely related to the poor prognosis of HCC, which may also be one reason for the poor prognosis of patients in the high-risk group [52].
Since the risk score was closely associated with the low immune infiltrating, we sought to further explore whether the risk model could be used to assess the immunotherapy efficacy in tumor patients [17]. We used the risk score to analyze the tumor patient in the IMvigor210 cohort. The risk score was closely associated with patient prognosis, and in the high-risk group, 60% of patients suffered disease progression. These results suggested that risk scores can partly predict the efficacy of tumor immunotherapy. The higher the risk score, the worse the efficacy of immunotherapy. Therefore, it is necessary to actively monitor the patients with high-risk scores and combine multiple treatments to achieve a better antitumor effect.
Our study has significant clinical application values. The inflammation-related risk score could be used as an independent risk factor for predicting the outcomes of HCC patients. This model can be applied to identify tumors with low immune levels and indicate the efficacy of immunotherapy. Strengthening the study of these six IRGs may advance the understanding of tumorigenesis.
However, this study also has several limitations. First, this risk model was not confirmed by the prospective experimental data. Second, we failed to validate the predictive value of the risk model for the immunotherapy efficacy in an HCC-related immunotherapy cohort due to the fewer data on the HCC-related immunotherapy cohort.

Conclusions
The IRG risk model consisting of 6 IRGs is closely related to the prognosis of HCC, which could predict the HCC prognosis more accurately than Lin et al.'s risk model. Therefore, the evaluation of patients with the risk model was a more practical approach that furthered our understanding of the status of immune infiltration in tumor tissue. This approach can also help us to better evaluate the efficacy and to guide immunotherapy. In our future study, we plan to further validate the expression of 6 IRGs in tumor tissues and validate the clinical application value of this risk model in a prospective cohort.

Data Availability
The data could be downloaded at https://portal.gdc.cancer .gov/ and https://dcc.icgc.org/projects/LIRI-JP, and the code used in this study is available from the corresponding author on reasonable request.

Ethical Approval
All raw data were publicly obtained from TCGA and ICGC, so there was no need for approval on behalf of the ethics committee and participation.