Key Markers Involved in the Anticolon Cancer Response of CD8+ T Cells through the Regulation of Cholesterol Metabolism

Background T cell-mediated antitumor immune response is the basis of colorectal cancer (CRC) immunotherapy. Cholesterol plays an important role in T cell signal transduction and function. Apolipoprotein E (APOE) plays a major role in cholesterol metabolism. Objective To screen and analyze key markers involved in the anticolon cancer response of CD8+ T cells through the regulation of cholesterol metabolism. Methods Based on the median cutoff of the expression value of APOE according to the data downloaded from The Cancer Genome Atlas and Gene Expression Omnibus database, patients were grouped into low and high expression groups. Differences in clinical factors were assessed, and survival analysis was performed. Differentially expressed genes (DEGs) in the high and low expression groups were screened, followed by the analysis of differences in tumor-infiltrating immune cells and weighted gene coexpression network analysis results. The closely related genes to APOE were identified, followed by enrichment analysis, protein–protein interaction (PPI) network analysis, and differential expression analysis. Immunohistochemical staining (IHC) was used to detect the expression of CD8 in CRC tissues. Results There were significant differences in prognosis and pathologic_N between the APOE low and high expression groups. A total of 2,349 DEGs between the high and low expression groups were selected. A total of 967 genes were obtained from the blue and brown modules. The probability of distribution of CD8+ T cells differed significantly between the two groups, and 320 closely related DEGs of APOE were screened. Genes including the HLA gene family, B2M, IRF4, and STAT5A had a higher degree in the PPI network. GEO datasets verified the prognosis and the related DEGs of APOE. IHC staining verified the relationship between the distribution of CD8+ T cells and APOE expression. Conclusion Genes including the HLA gene family, B2M, IRF4, and STAT5A might be the key genes involved in the anticolon cancer response of CD8+ T cells through the regulation of cholesterol metabolism.


Introduction
Colon cancer is a major health problem worldwide because it is the second leading cause of cancer-related deaths [1]. It is reported that by 2030, especially in developed countries, the number of new cases will increase to 2.2 million, and cases of death will increase to 1.1 million [2]. Although surgical resection combined with radiation therapy and chemotherapy is the main treatment for colon cancer, the 5-year survival rate is only approximately 28% [3]. Excessive dietary fat is associated with the induction/exacerbation of various diseases, including colon cancer [4]. In addition, numerous studies have illustrated that a "Western" style diet with highfat content is an important factor in the increased incidence of colon cancer [5][6][7]. A high-fat diet is also widely involved in various pathological conditions, such as obesity and metabolic diseases, which may promote the development of colon cancer [8]. Hyperlipidemia is often divided into four types: hypercholesterolemia, hypertriglyceridemia, mixed hyperlipidemia, and low high-density lipoprotein cholesterol. A previous study indicated that hyperlipidemia is a risk factor for colon cancer [9].
T cell-mediated antitumor immune response is the basis of tumor immunotherapy, which is associated with a favorable prognosis [10]. However, some tumors evolve to acquire immunosuppressive properties and escape the attack by T cells through various mechanisms in the tumor microenvironment [11]. us, reactivating the cytotoxicity of T cells is of great clinical interest in cancer immunotherapy. Cholesterol plays an important role in T cell signal transduction and function. For instance, Yang et al. found that the antitumor response of mouse CD8+ T cells can be enhanced by regulating cholesterol metabolism [12]. Ma et al. showed that high cholesterol can facilitate T cell immune checkpoint expression, which makes it easier for T cells to lose their antitumor function [13]. However, these studies only focused on T cell gene changes and endoplasmic reticulum stress and did not examine the impact of tumor cells.
It has been reported that high cholesterol levels can promote the proliferation of stem cells, thereby increasing the growth rate of intestinal tumors by 100-fold [14]. RORα/ hdac [15], CD36 [16], ACAT1 [17], and so on are involved in the regulation of the antitumour response of CD8+ T cells by modulating cholesterol metabolism. However, the key markers involved are still unclear. us, this study aimed to screen and analyze the key markers involved in the anticolon cancer response of CD8+ T cells through the regulation of cholesterol metabolism. e workflow of this study is shown in Figure 1.

Data Sources and Data Preprocessing. Gene expression
RNAseq data [log 2 (fpkm+1)] and clinical data of colon adenocarcinoma were obtained from e Cancer Genome Atlas database [18] (https://xenabrowser.net/). After sequencing data and clinical information matching, a total of 389 tumor samples were obtained (data version: 07-19-2019). RNA-Seq was annotated based on the annotation file of the Gencode database [19] (V23, https://www. gencodegenes.org/). In the expression profile analysis of Ensembl_ID, the mapping probe was used to calculate the gene expression value (obtained from the annotation files of the chip platform and microarray dataset) to Symbol_ID. e average value was taken as the level of Ensembl_ID expression when multiple probes matched one Symbol_ID. Genes with "protein_coding" annotations were then extracted as mRNAs.

APOE Expression-Related Clinical and Survival Analysis.
Apolipoprotein E (APOE) plays a major role in cholesterol metabolism [20].
us, in this study, to screen the key markers involved in the anticolon cancer response of CD8+ T cells via cholesterol metabolism regulation, the expression level of APOE was calculated using the Survminer package of R (version: 0.4.8, https://CRAN.Rproject.org/ package�survminer), followed by the median cutoff of the expression value. en, based on the median cutoff of the expression value, patients were grouped into low and high expression groups. In addition, the Kaplan-Meier survival curve analysis and log-rank test were performed to compare the prognosis between the two groups based on the survival information of the samples. Clinical factors (age, sex, TNM stage, and pathologic_stage) were compared between the low and high expression groups using the chi-squared test; differences were considered significant at a threshold of P < 0.05.

Differential Expression Analysis.
e typical Bayesian method in the limma package [21] (version 3.40.6) was used to analyze differentially expressed genes (DEGs) between high and low expression groups. e Benjamini and Hochberg method was used to adjust the P values. Adjusted   Journal of Oncology P value < 0.05 and |logFC| > 0.263 were used as the cutoff criteria for screening DEGs. Finally, the ggscatter function of the ggpubr package [22] in R (version: 0.2.2) was used to draw a volcano plot.

Screening of APOE Expression-Related Modules and
Genes. e weighted gene coexpression network analysis (WGCNA) algorithm was used to screen the modules and genes related to APOE expression based on the expression values of DEGs. e network construction and module screening procedures included dataset consistency analysis and gene coexpression correlation matrix, adjacent function, phase difference between nodes, and correlation analyses between network modules and diseases. A heat map was used to visualize the correlations of each module.

Screening of Immune-Related Genes.
e CIBERSORT algorithm [23], a useful method for obtaining highthroughput characteristics of 22 cell types in complex tissues, was used to analyze the abundance of tumor-infiltrating immune cells with the parameters of perm � 100 and QN � F. Subsequently, the abundance of all types of immune cells was tested in the high and low expression groups. en, the Pearson correlation coefficient between DEGs and CD8+ T cells was calculated, and immune-related genes were obtained with a cutoff of P < 0.05 and |r| > 0.3. e module genes obtained from WGCNA were intersected with these immune-related genes, and the overlapping genes were redefined as closely related genes of APOE.
2.6. Enrichment Analysis. Based on the closely related genes of APOE, the cluster Profiler package [24] in R was used to perform Gene Ontology (GO) [25] and Kyoto Encyclopedia of Genes and Genomes (KEGG) [26] pathway enrichment analyses with a cutoff value of P < 0.05 and count ≥2.

Protein-Protein Interaction (PPI) Network Analysis.
e STRING database [27] was used to evaluate the PPIs encoded by closely related genes of APOE. e PPI score was set at 0.9. Subsequently, the PPI network of T cell-related genes was analyzed using Cytoscape software [28].

Analysis of DEGs Related to APOE and CD8+ T Cells.
R software ggplot2 (version: 3.2.1, https://CRAN.R-project. org/package�ggplot2) and GGpubr ggpubr (version 0.2.2, ttest provided by https://CRAN.R-project.org/ package�ggpubr) were used to analyze the differences of genes between the two groups. A box diagram was drawn for display with a threshold of P < 0.05.

Datasets Validation.
Two GEO datasets, GSE71187 and GSE39582, with prognostic information were selected from the GEO database to validate the result of APOE expressionrelated survival analysis. e series_matrix.txt data containing 99 tumors and 12 para-cancers was obtained after pretreatment of GSE71187. Furthermore, the series_matrix.txt containing 566 tumors and 19 para-cancers was obtained after pretreatment of GSE39582. e expression of APOE in the two datasets was extracted. R Software Survminer (version 0.4.8) package calculated APOE expression to obtain optimal cutoff. e expression value > optimal cutoff was regarded as high expression, and the expression value ≤ optimal cutoff was regarded as low expression. en, the prognosis of patients in the two groups was obtained. Combined with the survival information of the samples, the k-M curve was plotted, and log-rank was used to test its significance.
Patients in GSE71187 and GSE39582 were divided into two groups according to the median APOE expression. e expressions of transcription 5A (STAT5A), histocompatibility leukocyte antigen (HLA) gene family (HLA-E, HLA-C, HLA-B, etc.), beta-2 microglobulin (B2M), and interferon regulatory Factor 4 (IRF4) were tested in high and low APOE expression groups. R software GGplot2 (version 3.2.1, https://CRAN.R-project.org/package�ggplot2) and GGpubr ggpubr (version 0.2.2, the t-test provided by https://CRAN. R-project.org/package�ggpubr) were used to analyze the differences of genes between the two groups. A box diagram was drawn for display with a threshold of P < 0.05.

CRC Specimens and IHC Staining.
e study protocol was approved by the Human Ethics Committee of Huzhou Central Hospital (NO: 20210207). Forty CRC patients were recruited, and the basic information was presented in Supplementary Table 2. e expression of APOE in blood was detected by using immunoturbidimetry. APOE higher than 53.0 mg/L was included in the APOE-high group, and APOE lower than 29.0 mg/L was included in the APOE-low group. Further detection of CD8 expression was conducted by using immunohistochemistry. Immunohistochemistry test kit, antigen repair solution, and CD8 antibody were all purchased from Beijing Zhongshan Jinqiao Biotechnology Co., Ltd. e tissue sections were scanned at low magnification (X100) to select the areas with high CD8 positive cell density. en, the positive rate of CD8 cells in five fields was determined at high magnification (X200) to calculate the average value. e positive rate of CD8 cells was expressed as mean ± standard deviation. Using SPSS 13.0 statistical software, the Pearson χ2 test was performed.

Comparison of Clinical Factors and Analysis of DEGs between the High and Low Expression Groups.
e Kaplan-Meier survival curve analysis and log-rank test were performed to compare the prognosis between the two groups based on the survival information of the samples. e results showed that there were significant differences in prognosis between the low and high expression groups (P � 0.018) (Figure 2(a)). In addition, a significant difference in pathologic N was found between the two groups (P � 0.029) (Figure 2(b)). In total, 2,349 DEGs, of which 1,949 and 400 were upregulated and downregulated, respectively, between the high and low expression groups were selected (Figure 2(c)).

Disease-Related Modules and Genes.
A total of 2,349 DEGs were used for WGCNA analysis. e soft threshold power for matrix transformation was determined to be 8, with the square of the correlation coefficient between log2k and log2p (k) being 0.85 (Figure 3(a)). e minimum number of genes was set to 30 for each module, and the pruning height was cutHeight � 0.2. A total of five different disease-related modules were screened (Figure 3(b)), including blue, brown, green, red, and yellow modules. To ensure the reliability of the screening results of key network modules, the disease-related key network modules were analyzed again by calculating the absolute value of gene significance (GS) within the module, and the results showed that the blue and brown modules had a higher GS ( Figure 3(c)) and were significantly correlated with the disease ( Figure D). Finally, a total of 967 genes were obtained from the blue and brown modules.

CD8+ T Cell-Related DEGs.
Immune infiltration in the high and low expression groups was analyzed based on the CIBERSORT algorithm. As illustrated in Figures 4(a) and 4(b), 22 tumor-immune cell proportions in the high and low expression groups were analyzed. To observe the probability of the distribution of different tumor-immune cells between the two groups, a violin plot was drawn. As shown in Figure 4(c), the probability of the distribution of CD8+ T cells was the most significant difference between the two groups (P < 0.01). e correlation coefficient between DEGs and CD8+ T cell-related genes was then calculated, and a total of 612 CD8+ T cell-related DEGs were identified. In addition, the module genes obtained from WGCNA intersected with these CD8+ T cell-related genes, and a total of 320 closely related DEGs of APOE was obtained (Figure 4(d), Supplementary Table 1).

Enrichment Analysis and PPI Network Analysis of Closely
Related DEGs of APOE. Enrichment analysis showed that the closely related upregulated DEGs were mainly involved in 544 GO-biological processes (BPs; e.g., T cell activation; regulation of lymphocyte activation; and regulation of T cell activation) and 54 KEGG pathways (e.g., hsa05169: Epstein-Barr virus (EBV) infection and hsa04514: cell adhesion molecules) (Figures 5(a) and 5(b)), whereas the closely related downregulated DEGs were not involved in the GO-BP and KEGG pathways. PPI network analysis of 320 closely related DEGs of APOE revealed 191 nodes and 1,153 interaction pairs ( Figure 5(c)). In this PPI network, genes including the histocompatibility leukocyte antigen (HLA) gene family (HLA-E, HLA-C, HLA-B, etc.), beta-2-microglobulin (B2M), interferon regulatory factor 4 (IRF4), and STAT5A had a higher degree, which could lead to their classification as key genes (Table 1).

Analysis of DEGs Related to APOE and CD8+ T Cells.
As shown in Figure 5 Figures 6(a) and 6(b), the survival analysis results of GSE3958 and GSE71187 were verified that the high expression group had a poor prognosis (P < 0.05). Moreover, as shown in the box diagram in Figures 6(c) and 6(d), most of the concerned genes in the GSE39582 and GSE71187 datasets were highly expressed (P < 0.05). e results were consistent with the analysis results of the TCGA database.

e Infiltration of CD8+T Cells Correlated with APOE in Tissue Specimen Sections.
e basic information of recruited CRC patients was present in Supplementary Table 2. IHC staining detection of CD8 proteins showed that CD8 was upregulated expression in the APOE-high expression group (Figure 7(a)), while CD8 was downregulated expression in the APOE-low expression group (Figure 7(b)) (P � 0.001).
e statistical results are presented in Table 2.

Discussion
In this study, there were significant differences in prognosis and pathologic_N between the low and high expression groups. A total of 2,349 DEGs between the high and low expression groups were selected. In addition, the probability of CD8+ T cell distribution was the most significant difference between the two groups. A total of 320 closely related DEGs of APOE were identified. Genes including the HLA gene family, B2M, and IRF4 had a higher degree in the PPI network, and STAT5A was regulated by numerous miRNAs in the ceRNA network. Cholesterol has been reported to be associated with colon cancer. Eggs low in fat contain large amounts of cholesterol, and egg consumption has been associated with an increased risk of colon cancer [29]. In addition, Jacobs et al. showed that high cholesterol levels seem to increase the risk of colon cancer [30]. In line with our data, in this study, the survival analysis results showed that the high cholesterol group was related to poor survival, which suggested that    high cholesterol was a risk factor for colon cancer. In addition, a significant difference in pathologic_N was found between the low and high expression groups, and the proportion of pathologic_N3 in the high expression group was higher than that in the low expression group, which further indicated that high cholesterol was a risk factor for colon cancer. CD8+ T cells play an extremely crucial role in antitumor immunity; however, tumor-infiltrating T cells often lose their effector functions. Ma et al. reported that cholesterol induced CD8+ T cell exhaustion by modulating endoplasmic reticulum stress pathways (IRE1/XBP1) in the tumor microenvironment [13]. What is more, CD8+ T cells were polarized into IL-9-secreting (Tc9) cells to exert antitumor responses, while cholesterol or its derivatives inhibited IL-9 expression by activating liver X receptors (LXRs) and leading to LXR SUMOylation and reduced p65 binding to IL-9 promoter [31]. e characteristics of exhausted CD8+ T cells were attenuating antitumor responses and decreasing infiltrating density. erefore, high cholesterol decreased the infiltration of CD8+ T cells. In this study, we also analyzed the key markers involved in the anticolon cancer response of CD8+ T cells through cholesterol metabolism regulation using bioinformatics analysis. e results showed that genes including the HLA gene family, B2M, and IRF4 had a higher degree in the PPI network, and STAT5A was regulated by numerous miRNAs in the ceRNA network. e HLA gene family plays a major role in regulating immune responses in cancers. For instance, Tsai  surveillance strategy in colon cancer tissues [33]. Moreover, Benevolo et al. found that the HLA-A, HLA-B, and HLA-C expression in colon cancer is associated with prognosis [34]. Blum et al. suggested that, in patients with stage II colon cancer, the Map7/B2M expression ratio is a prognostic factor for survival [35]. IRF4 is considered an oncogene in lymphoid malignancy and multiple myeloma [36], and Zhang et al. showed that overexpression of circPIP5K1A attenuated the expression of IRF4 in the progression of colon cancer development [37]. As a transcription factor, the activation or phosphorylation of STAT is associated with many cancers. Slattery et al. showed that STAT5A was associated with colon cancer progression [38]. In addition, nucleotide analysis showed that the upregulated closely related genes were involved in EBV infection and cell adhesion molecule KEGG pathways. EBV is associated with several malignant tumors. Guan et al. reported a positive association between EBV infection and colon cancer [39]. It has been reported that a variety of cell adhesion molecules are involved in cell-cell and cell-matrix interactions in colon cancer, and some cell-cell and cell-matrix interactions determine colon cancer behavior [40]. Based on these results, we speculate that genes including the HLA gene family, B2M, IRF4, and STAT5A that may be involved in EBV infection and cell adhesion molecule pathways play important roles in the anticolon cancer response of CD8+ T cells by regulating cholesterol metabolism.
However, there are some limitations to this study. We need to conduct related experiments such as cell biology experiments and animal studies to validate the multiple candidate targets and signaling pathways obtained in this bioinformatics analysis.

Conclusion
In summary, genes including the HLA gene family, B2M, IRF4, and STAT5A might be the key genes involved in the anticolon cancer response of CD8+ T cells via regulation of cholesterol metabolism. ese findings can guide clinical decision-making for colon cancer treatment as well as future research on colon cancer.