Integrated Bioinformatics and Experimental Analysis Identified TRIM28 a Potential Prognostic Biomarker and Correlated with Immune Infiltrates in Liver Hepatocellular Carcinoma

Background Since the 1970s, the incidence and mortality of liver hepatocellular carcinoma (LIHC) have risen steadily, making the identification of LIHC biomarkers important. Tripartite Motif-Containing 28 (TRIM28) is a protein-coding gene that encodes a member of the tripartite motif-containing proteins (TRIMs) family and is associated with specific chromatin regions. Here, TRIM28 expression, its prognostic value, and its impact on the immune system in LIHC patients are investigated for the first time. Methods TRIM28 expression data from the TCGA database were used to analyze TRIM28 expression, clinicopathological information, gene enrichment, and immune infiltration and to conduct additional bioinformatics analyses. The R language was used for statistical analysis. TIMER, CIBERSORT, and ssGSEA were used to assess the relationship between TRIM28 and immune responses in LIHC. Next, the results were validated using GEPIA, ROC analysis, and immunohistochemical staining images from the THPA. The GSE14520, GSE63898, and GSE87630 datasets were analyzed using ROC analysis to further evaluate TRIM28's diagnostic value. Finally, we performed quantitative real-time polymerase chain reaction (qRT-PCR) to confirm TRIM28 expression. Results High TRIM28 expression was associated with T classification, pathologic stage, histologic grade, and serum AFP levels. In patients with LIHC, TRIM28 was an independent risk factor for poor prognosis. Ligand-receptor interaction pathways, which are critical in LIHC patients, were closely associated with TRIM28 expression, and DC function could be suppressed by TRIM28 overexpression. Finally, our results were validated with GEO data and qRT-PCR. Conclusions TRIM28 may shed new light on LIHC mechanisms. As a potential diagnostic and intervention target, this gene may help to diagnose and treat LIHC at an early stage.


Introduction
The incidence and mortality of liver hepatocellular carcinoma (LIHC) have increased over the past 40 years, making it important to identify biomarkers for LIHC [1]. LIHC is also being monitored because of concerns about the COVID-19 pandemic and the associated lockdown policies [2]. A recent overview of global cancer statistics released in 2020 reported 906,000 newly diagnosed cases of and 830,000 deaths from LIHC, with more than half occurring in China [3]. Hepatocellular carcinoma (HCC) is the most common histological subtype of primary liver cancer (75-85%) [4]. The development of LIHC is associated with a number of risk factors, including hepatitis B and C, excessive drinking, chemical exposure, tobacco use, and aflatoxin [5][6][7]. Noninvasive detection and diagnosis methods for LIHC exist, but they are not sensitive enough for early detection [8]. To improve LIHC prognosis, it is important to identify more specific biomarkers and possible treatment targets.
Tripartite Motif-Containing 28 (TRIM28) is a protein-coding gene that encodes a member of the tripartite motif-containing proteins (TRIMs) family and is associated with specific chromatin regions. The TRIMs family is linked with autoimmune and autoinflammatory diseases, which are closely related to malignant tumors [9]. Moreover, a previous study showed that TRIM28 plays a critical role in T cell activation and T cell tolerance [10]. We therefore hypothesized that TRIM28 is linked to immune cell infiltration and significantly promotes tumor progression in LIHC. This hypothesis is in accordance with the results of past studies [11][12][13][14].
Although TRIM28-associated immune responses have been reported in various types of cancer, including colorectal cancer, melanoma, kidney renal clear cell carcinoma, and lung adenocarcinoma, it is still unclear how TRIM28 contributes to immune infiltration and prognosis in LIHC [15][16][17][18]. To address this question, The Cancer Genome Atlas (TCGA, https://cancergenome.nih.gov/) was used to analyze and check the expression level of TRIM28 in LIHC. Under RStudio 1.4, we used R software (version 3.6.3) to assess the relationship between TRIM28 expression and clinicopathological parameters, as well as its possible prognostic value in LIHC. Gene ontology (GO) analyses, Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses, and protein-protein interaction (PPI) network analyses were performed to clarify the pathogenic impact of TRIM28 and to understand the regulatory mechanisms that govern LIHC invasion and metastasis. The Tumor Immune Estimation Resource (TIMER) (https://cistrome.shinyapps.io/timer/), the CIBERSORT algorithm, and single sample gene-set enrichment analysis (ssGSEA) were used to further investigate the relationship between TRIM28 and Tumor-Infiltrating Immune Cells (TIICs). Furthermore, Gene Expression Profiling Interactive Analysis (GEPIA), Kaplan-Meier (K-M) survival analysis (http://kmplot.com/analysis/), and the Human Protein Atlas (THPA) were used to compare and assess the relationship between high TRIM28 expression and poor prognosis. Finally, receiver operating characteristic (ROC) curves were constructed and experimental analysis was performed to determine TRIM28's diagnostic value. This is the first analysis of the relationship of TRIM28 with LIHC. The pathogenesis and progression of LIHC involve a variety of causative mechanisms and risk factors.
In our analysis, higher TRIM28 expression was strongly associated with poor prognosis and with several clinicopathological factors and outcome events. Moreover, GO and KEGG analyses revealed that TRIM28 was involved in appendage development, cell cycle control, and amino acid and fatty acid metabolism. We also explored the correlation between TRIM28 and TIICs. This research investigated the function of TRIM28 in LIHC and explored effective molecules for the diagnosis and treatment of LIHC.

Materials and Methods
2.1. Data Acquisition and Mining. The data used, including clinical data, immune system infiltrates, and gene expression data (workflow type: HTSeq-TPM), were obtained from the TCGA database [19]. Samples were excluded from the study if their data were missing, insufficient, or unclear. Both RNA-sequencing data and clinical data were selected for further analysis. Our research included 424 samples, 374 of which were LIHC tissues and 50 of which were normal liver tissues. To investigate the mechanisms of TRIM28 expression, LIHC patients were divided into two groups: those with high and those with low expression levels of TRIM28. We conducted our research in accordance with the publication guidelines provided by TCGA [20]. We used the Gene Expression Omnibus (GEO) database to collect three gene expression profiling datasets (GSE14520, GSE63898, and GSE87630) to determine the expression and diagnostic value of TRIM28 [21][22][23].
2.2. Validation of TRIM28 Expression. The TCGA dataset was analyzed to confirm the potential prognostic role of the TRIM28 gene in LIHC. To compare TRIM28 expression between LIHC samples and normal tissues, an independent-samples t-test was used for nonpaired samples and a paired t-test was used for paired samples. Boxplots of the results were generated using the ggplot2 R package.
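The difference between the two tests named above can be illustrated with a minimal Python sketch. It is not the authors' R code: the expression values are made up, the unpaired test is assumed to be Student's pooled-variance form (the text does not say whether a Welch correction was used), and only the t statistic and degrees of freedom are computed.

```python
import math
from statistics import mean, stdev

def independent_t(x, y):
    """Student's t statistic for two unpaired groups (pooled variance assumed)."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2)
    t = (mean(x) - mean(y)) / math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return t, nx + ny - 2

def paired_t(x, y):
    """Paired t statistic: a one-sample test on the within-pair differences."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical log2(TPM + 1) values; position i in both lists is the same patient.
tumor = [7.1, 6.8, 7.4, 7.9, 6.9]
normal = [6.0, 5.9, 6.3, 6.6, 6.1]

t_unpaired, df_unpaired = independent_t(tumor, normal)
t_paired, df_paired = paired_t(tumor, normal)
print(f"unpaired: t = {t_unpaired:.2f}, df = {df_unpaired}")
print(f"paired:   t = {t_paired:.2f}, df = {df_paired}")
```

With matched samples the paired statistic is larger here because between-patient variability cancels out in the differences, which is why the paired test is the natural choice for tumor and adjacent normal tissue from the same patient.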

2.3. Survival Analysis Based on TRIM28 Expression.
Survival analysis was performed by plotting K-M survival curves with the R packages survival and survminer. The K-M survival curves were used to compare overall survival (OS) and progression-free interval (PFI) between the high and low TRIM28 groups. Based on the OS and PFI times, we evaluated the relationship between TRIM28 expression level and patients' survival outcomes. Additionally, ROC curves were generated using the R package pROC to further assess the outcomes of the K-M survival analysis [24].
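To make the survival comparison described above concrete, the following Python sketch computes a Kaplan-Meier curve for a high and a low expression group. It is an illustration only, not the survival/survminer workflow used in the study: the cut-off is assumed to be the median expression (the text does not state the cut-off), and all values are hypothetical.

```python
from statistics import median

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times:  follow-up time per patient
    events: 1 if death was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve

def split_by_median(expr):
    """Label each sample 'high' or 'low' relative to the median (assumed cut-off)."""
    cut = median(expr)
    return ["high" if e > cut else "low" for e in expr]

# Hypothetical expression values, follow-up times (months), and event indicators.
expr = [2.1, 5.3, 3.3, 6.0, 1.8, 4.9]
times = [30, 8, 25, 5, 40, 12]
events = [0, 1, 1, 1, 0, 1]

groups = split_by_median(expr)
km = {}
for label in ("high", "low"):
    idx = [i for i, g in enumerate(groups) if g == label]
    km[label] = kaplan_meier([times[i] for i in idx], [events[i] for i in idx])
    print(label, km[label])
```

Censored patients leave the risk set without lowering the survival estimate, which is the property that distinguishes a K-M curve from a simple proportion of survivors.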

2.4. GO and KEGG Pathway Enrichment Analyses. We conducted ssGSEA on normalized RNA-sequencing data [25]. Default parameters were used, with the number of gene-set permutations set to 1,000. TRIM28 was analyzed using ssGSEA for GO and KEGG pathway enrichment to determine its function. Enrichment results were considered statistically significant when they met two conditions: nominal (NOM) P value <0.05 and false discovery rate (FDR) q value <0.25.
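The two-condition significance rule above amounts to a simple filter over the enrichment output. The sketch below shows that filter in Python; the record layout and the term names are hypothetical placeholders, not output from the study.

```python
from typing import NamedTuple

class EnrichmentResult(NamedTuple):
    term: str     # gene set name (hypothetical placeholders below)
    nom_p: float  # nominal P value
    fdr_q: float  # false discovery rate

def significant(results, p_cut=0.05, q_cut=0.25):
    """Keep gene sets that satisfy both thresholds stated in the text."""
    return [r for r in results if r.nom_p < p_cut and r.fdr_q < q_cut]

results = [
    EnrichmentResult("hypothetical_term_A", 0.001, 0.02),
    EnrichmentResult("hypothetical_term_B", 0.030, 0.40),  # fails the FDR cut-off
    EnrichmentResult("hypothetical_term_C", 0.200, 0.10),  # fails the nominal P cut-off
]

kept = significant(results)
print([r.term for r in kept])
```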
2.5. Construction of the Predicted PPI Network. In a PPI network, protein complexes are formed either by biochemical events or by electrostatic forces, and each complex performs a unique biological function. PPI networks act as an organism's skeleton, allowing it to respond to genetic and environmental signals. By understanding these circuits, we may be able to better predict gene function and cellular behaviour. PPIs can be predicted using STRING, an online biological tool that includes direct (physical) as well as indirect (functional) associations [26]. The differentially expressed genes (DEGs) were analyzed with the PPI database STRING (version 11.0). DEGs had to meet the following criteria: |log2 fold change (FC)| > 2.0 and adjusted P value (adj. P value) <0.05. A medium confidence score (0.400) was required as the cut-off criterion for significant interactions in this network. The network was visualized using Cytoscape (version 3.8.2) [27].
Computational and Mathematical Methods in Medicine
2.6. Immune Infiltrates Analysis. Infiltrates of immune cells across a variety of tumor types can be systematically analyzed using TIMER, a comprehensive and publicly available resource [28]. TIMER was used to investigate the relationship between TRIM28 expression and tumors. In this LIHC study, the TIMER correlation module was used to analyze the relationship between tumor-infiltrating immune cells and gene expression profiles. TIMER uses a deconvolution statistical method to examine the association between infiltrating immune cells and the TRIM28 gene. Using the gene module, we examined the correlation between TRIM28 and the abundance of immune infiltrates in LIHC. A plot of TRIM28 expression against tumor purity was drawn by TIMER [29].
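The kind of expression-versus-infiltration correlation described above can be sketched in Python as follows. This is a hedged illustration: it assumes a plain Spearman rank correlation, which may differ from the statistic TIMER itself reports (for example, any adjustment for tumor purity is not reproduced here), and the numbers are hypothetical.

```python
import math

def rank(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(rank(x), rank(y))

# Hypothetical TRIM28 expression and estimated B cell abundance per sample.
trim28 = [1.0, 2.0, 3.0, 4.0, 5.0]
b_cell = [0.10, 0.15, 0.12, 0.30, 0.25]

rho = spearman(trim28, b_cell)
print(f"Spearman rho = {rho:.2f}")
```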
CIBERSORT (https://cibersort.stanford.edu/), a deconvolution algorithm that estimates immune cell composition from gene expression data, was used to examine the relationship between TRIM28 expression and the infiltration of immune cells in LIHC [30]. To build gene expression datasets, we used standard annotation files and the default signature matrix with 1,000 permutations. Based on Markov chain Monte Carlo (MCMC) methods, CIBERSORT calculated the P value of the deconvolution. We split 375 tumor samples into two groups to investigate how TRIM28 expression affects the immune microenvironment. Using a threshold of P value <0.05, we identified the types of lymphocytes that were affected by TRIM28. The content of immune cells in LIHC TCGA samples was quantified by ssGSEA using the "GSVA" R package.
2.7. Comprehensive Analysis. GEPIA provides RNA-sequencing expression data for 8587 normal and 9736 tumor samples from public databases (TCGA and GTEx) [31]. Overall survival according to TRIM28 expression in LIHC was analyzed with GEPIA. Additionally, to compare TRIM28 expression between tumor and normal tissue, boxplots were generated by disease state. The relationship between TRIM28 expression and LIHC survival was examined by K-M analysis of survival curves [32]. The risk of death was assessed using the log-rank P value and the hazard ratio (HR). A P value <0.05 was considered statistically significant.
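The log-rank comparison mentioned above can be written out explicitly. The Python sketch below implements the standard two-group log-rank test on hypothetical data; it is not the code behind GEPIA or the K-M plotter, and the hazard ratio is not computed.

```python
import math

def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P value, 1 df)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the number of deaths in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical follow-up times (months) and event indicators (1 = death).
high_times, high_events = [5, 8, 12], [1, 1, 1]
low_times, low_events = [25, 30, 40], [1, 0, 0]

chi2, p_value = logrank(high_times, high_events, low_times, low_events)
print(f"chi-square = {chi2:.3f}, P = {p_value:.4f}")
```

At each event time the test compares the deaths observed in one group with the number expected if both groups shared the same hazard, so censored patients contribute to the risk sets up to their last follow-up.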
2.8. Immunohistochemistry-Based Validation of Hub Genes in THPA. THPA, a database funded by a Swedish grant, was used to obtain immunohistochemical staining information on tissues and cells for 26,000 human proteins. Antibody-based proteomics in THPA allows protein detection in normal and LIHC tissues, which is commonly used for hub gene validation. In this way, THPA was used to verify TRIM28 expression in normal tissues and LIHC tissues.

Results
Survival Outcomes and Variable Analyses.
First, TIMER datasets were analyzed to determine TRIM28 expression levels across cancer types. According to this analysis, TRIM28 expression is upregulated in most tumors (Figure 1(a)). Whether TRIM28 plays a role in human liver cancer, particularly in LIHC, has not been studied. To validate TRIM28's prognostic influence in LIHC, we analyzed the TCGA datasets to determine how TRIM28 expression differed between normal and tumor tissue and found that the level of TRIM28 was increased in LIHC tissues compared to normal liver tissue (Figure 1(b)).

The same outcome was obtained in paired LIHC tissues compared with normal tissues (n = 50) (Figure 1(c)). Additionally, LIHC patients with low TRIM28 expression had better OS and PFI (Figures 1(d) and 1(e)). A Cox analysis was performed to evaluate the relationship between TRIM28 expression and OS, as well as other characteristics of LIHC patients, as shown in Table 1. Univariate regression analysis indicated that pathological stage, T stage, M stage, and OS were highly correlated with TRIM28. Based on the multivariate analysis shown in Figure 2(a), TRIM28 expression (P value <0.001) was an independent predictor of poor prognosis for LIHC patients (Table 1). Figure 2(b) shows the distribution of TRIM28 expression together with survival status and TRIM28 expression profiles for patients with LIHC. In Figure 2(c), the ROC curve indicated that TRIM28 was associated with prognosis, with an AUC of 0.687 for survival prediction.

Construction and Prediction of Nomogram Model.
To establish a clinically applicable method for assessing the prognosis of LIHC patients, a nomogram was constructed to predict the 1-, 2-, and 3-year survival probability of LIHC patients in the TCGA cohort. The nomogram incorporated clinical and pathological factors, namely gender, age, histologic grade, pathologic stage, neoplasm staging (T, M, and N stage), and TRIM28 level (Figure 3(a)). Based on the clinicopathological characteristics, each patient was assigned a nomogram-based score to predict 1-, 2-, and 3-year survival probability. The calibration curves showed that the nomogram performed well in predicting the 1-year, 2-year, and 3-year OS of LIHC patients (Figure 3(b)).

Relationship between TRIM28 Expression and Clinicopathology. A total of 424 tissue samples from the TCGA database, with gene expression data and clinical characteristics collected from patients, were analyzed. Increased TRIM28 expression in LIHC was significantly related to T stage (Figure 4(a)), pathologic stage (Figure 4(b)), histologic grade (Figure 4(c)), AFP (Figure 4(d)), OS event (Figure 4(e)), weight (Figure 4(f)), and BMI (Figure 4(g)). Patients with high TRIM28 expression had worse T stage, pathologic stage, histologic grade, AFP, OS, weight, and nutritional status than those with low TRIM28 levels.

PPI Network Construction.
A total of 606 DEGs were incorporated into the PPI network via the STRING database. The resulting network, comprising 254 nodes and 471 edges, was used to examine the interactions of DEGs related to LIHC risk (Figure 4(j)).

Relationship between TRIM28 Expression and Tumor-Infiltrating Immune Cells. Tumor-infiltrating lymphocytes (TILs) are considered to be an independent predictor of OS and sentinel lymph node status in cancer. Therefore, we analyzed the relationship between TRIM28 and immune infiltration level using TIMER (Figure 5(a)). TRIM28 expression levels were positively correlated with B cells (P value = 1.98 × 10^-17), CD8+ T cells (P value = 2.15 × 10^-6), CD4+ T cells (P value = 4.79 × 10^-10), macrophages (P value = 6.45 × 10^-13), neutrophils (P value = 1.39 × 10^-6), and dendritic cells (P value = 1.83 × 10^-12). These results indicate that TRIM28 plays a pivotal role in immune infiltration. Our study also sought to determine whether the tumor immune microenvironment differed between the high and low TRIM28 expression groups (Figure 5(c)). According to the heat map, there were moderate to strong correlations between subpopulations of TIICs. Finally, the relationship between TRIM28 and other immunocytes was assessed again using the GSVA package (Figure 5(d)).

Data Validation. First, we used the GEPIA database to analyze TRIM28 expression. TRIM28 expression was increased in the LIHC group (Figure 6(a)). The immunohistochemistry (IHC) images also showed that TRIM28 was more highly expressed in tumor tissues than in nontumor tissues (Figure 6(b)). There was a significant correlation between high TRIM28 level and poor OS for LIHC (P value = 0.021, Figure 6(c)). In addition, we generated K-M survival plots to confirm this result. As shown in Figure 6(d), the K-M survival plots revealed that high TRIM28 expression was significantly correlated with poor OS (P value = 0.00055).
3.8. TRIM28 Possesses a Higher Specificity than AFP for LIHC Diagnosis and qRT-PCR for an External Validation. As a final step in evaluating the diagnostic value of TRIM28, we used ROC analysis to analyze the GSE14520, GSE63898, and GSE87630 datasets. AFP is a diagnostic tumor marker commonly used for LIHC. TRIM28 expression in GSE14520 was significantly higher in tumor tissue than in nontumor tissue (Figure 7(a)), and the AUC of TRIM28 in GSE14520 was 0.853, which was higher than that of AFP (0.685) (Figure 7(b)). In GSE63898, the expression of TRIM28 in nontumor tissues was notably lower than that in tumor tissues (Figure 7(c)). The AUC of AFP in this dataset was 0.566, which was lower than that of TRIM28 (0.706) (Figure 7(d)). TRIM28 expression in GSE87630 was markedly higher in tumor tissue than in nontumor tissue (Figure 7(e)), and the AUC of AFP was 0.711, which was lower than that of TRIM28 (0.929) (Figure 7(f)). In Figure 7(g), qRT-PCR was performed on multiple cell lines to confirm TRIM28 expression. These results suggest that TRIM28 could be useful as a diagnostic marker for LIHC patients.
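The AUC values compared above have a direct interpretation: the probability that a randomly chosen tumor sample has a higher marker value than a randomly chosen nontumor sample. The Python sketch below computes the AUC in that way for two markers. The values are hypothetical and do not reproduce the GEO datasets, and the study itself used the R package pROC.

```python
def auc(tumor_scores, nontumor_scores):
    """AUC as the fraction of (tumor, nontumor) pairs ranked correctly.

    Ties count as half a correct pair, which matches the
    Mann-Whitney formulation of the area under the ROC curve.
    """
    wins = 0.0
    for t in tumor_scores:
        for n in nontumor_scores:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor_scores) * len(nontumor_scores))

# Hypothetical expression values for two markers in the same eight samples.
marker_a_tumor, marker_a_nontumor = [8.1, 7.5, 9.0, 6.9], [6.0, 6.5, 7.0, 5.8]
marker_b_tumor, marker_b_nontumor = [5.0, 3.0, 6.0, 2.0], [4.0, 2.5, 3.5, 1.0]

auc_a = auc(marker_a_tumor, marker_a_nontumor)
auc_b = auc(marker_b_tumor, marker_b_nontumor)
print(f"marker A AUC = {auc_a:.4f}, marker B AUC = {auc_b:.4f}")
```

An AUC of 0.5 corresponds to a marker that separates the groups no better than chance, so a comparison such as 0.853 versus 0.685 reflects how much more often one marker orders a tumor and a nontumor sample correctly.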

Discussion
LIHC is the third leading cause of cancer death and one of the five most commonly diagnosed cancer types [33]. The prevalence of LIHC has continued to increase over the past two decades [34]. Prevention and treatment are essential for improving survival in this devastating disease. Bioinformatics analysis has been found to be a suitable approach [35]. A number of previous biomarker studies have demonstrated the effectiveness of this method in LIHC. For example, dry-lab analyses recently identified MAST2 as a biomarker for the diagnosis and prognosis of LIHC. High MAST2 expression was correlated with late clinical stage [36].
As part of our research on LIHC, we evaluated TRIM28 as a prognostic biomarker. By analyzing the TCGA database, we assessed TRIM28's prognostic value for patients with LIHC. Further analysis showed that TRIM28 was an independent prognostic factor and that higher TRIM28 expression was associated with worse survival. A high expression level of TRIM28 was associated with the following factors: T classification, pathologic stage, histologic grade, AFP, OS event, weight, and BMI. Together, these results suggest that the expression level of TRIM28 may affect the occurrence, development, and immune microenvironment of LIHC.
GO and KEGG pathway analyses revealed that TRIM28 participated in cell cycle, amino acid metabolism, and fatty acid metabolism pathways. The function of cell cycle pathways in the regulation of tumors has been previously demonstrated [37]. Immunomodulation by antitumor cell cycle inhibitors could be a promising target for cancer therapy [38][39][40]. Amino acid and fatty acid metabolism have also been considered potential targets for cancer therapy [41,42]. Studies have shown that fatty acid receptors and fatty acid synthase represent attractive targets for tumor treatment [43]. Furthermore, all of these pathways are recognized as playing a major role in tumor immunity [44][45][46][47]. As a next step, we investigated the relationship between TRIM28 and immune cell infiltration.
We examined the relationship between TRIM28 and the level of immune cell infiltration in LIHC using the TIMER database. TRIM28 was positively correlated with B cells, CD8+ T cells, CD4+ T cells, macrophages, neutrophils, and dendritic cells. Using the CIBERSORT algorithm, we confirmed that high TRIM28 expression was related to upregulation of CD56bright NK cells, T helper cells, TFH, Th1 cells, and Th2 cells and downregulation of cytotoxic cells, DC, neutrophils, Tgd, Th17 cells, and Treg. Among the functionally specialized antigen-presenting cells, DC play an important role in innate antitumor immunity by activating specific T cells [48]. Tumor development is also inhibited by DC through regulation of humoral immune responses [49]. In addition, cytotoxic cells and neutrophils play an important role in killing tumor cells [50,51]. We therefore hypothesized that TRIM28 overexpression could reduce the activity of DCs, cytotoxic cells, and neutrophils. These findings suggest that TRIM28 is important for modulating LIHC immune responses. The mechanism by which TRIM28 affects the activation of LIHC immune responses remains unclear; multicenter, randomized, controlled clinical trials and mechanistic studies are therefore needed to better understand the relationship between TRIM28 and LIHC [52][53][54][55][56].
As a final step, GEO datasets and ROC curve analysis were used to validate our results. In addition, qRT-PCR was used to further verify the expression of TRIM28 in multiple cell lines. In all 3 datasets, TRIM28 expression was higher in tumor tissues than in nontumor tissues, and the AUC values of TRIM28 were higher than those of AFP, the mainstream biomarker for LIHC. TRIM28 was also highly expressed in liver cancer cell lines. In summary, these results suggest that TRIM28 is a promising predictive tumor marker for LIHC patients.
Several limitations remain in our study. First, our data were obtained from public databases, and we validated the results only by qRT-PCR in cell lines. Sufficient serum samples from clinical patients will be necessary to validate this biomarker in the future. Second, because the effectiveness of markers depends considerably on mechanism, we need to validate the hypothesis experimentally and elucidate its mechanisms in cancer cells using siRNA or plasmids. Moreover, transgenic animal research is needed to further verify the functions of TRIM28. Nevertheless, we have demonstrated that TRIM28 was highly expressed in liver cancer cell lines, which provides sufficient evidence to initiate further study.

Conclusions
Based on bioinformatic analysis and qRT-PCR, TRIM28 was identified as being associated with LIHC. TRIM28 is a novel and independent prognostic biomarker for LIHC that correlates with immune infiltrates. With further study, the TRIM28 gene may provide a novel perspective on LIHC mechanisms. As a potential diagnostic and intervention target, TRIM28 may help to diagnose and treat LIHC at an early stage.

Data Availability
The gene expression profiling data supporting this study are from previously reported studies and datasets, which have been cited. The expression and survival data are derived from the TCGA and GEO databases, both of which are public databases. Ethical approval was obtained for the patients included in these databases. Users can download the relevant data free of charge for research and publish related articles. The GEO datasets are available at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE14520, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE63898, and https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE87630. The qRT-PCR data used to support the findings of this study have not been made available.