In Silico Identification of Key Genes and Immune Infiltration Characteristics in Epicardial Adipose Tissue from Patients with Coronary Artery Disease

Background The present study is aimed at identifying the differentially expressed genes (DEGs) and relevant biological processes and pathways associated with epicardial adipose tissue (EAT) from patients with coronary artery disease (CAD). We also explored potential biomarkers using two machine-learning algorithms and calculated the immune cell infiltration in EAT. Materials and Methods Three datasets (GSE120774, GSE64554, and GSE24425) were obtained from the Gene Expression Omnibus (GEO) database. The GSE120774 dataset was used to evaluate DEGs between EAT of CAD patients and the control group. Functional enrichment analyses were conducted to study associated biological functions and mechanisms using the Kyoto Encyclopedia of Genes and Genomes (KEGG), Gene Ontology (GO), and Gene Set Enrichment Analysis (GSEA). After this, the least absolute shrinkage and selection operator (LASSO) and support vector machine recursive feature elimination (SVM-RFE) were performed to identify the feature genes related to CAD. The expression level of the feature genes was validated in GSE64554 and GSE24425. Finally, we calculated the immune cell infiltration and evaluated the correlation between the feature genes and immune cells using CIBERSORT. Results We identified a total of 130 upregulated and 107 downregulated genes in GSE120774. Functional enrichment analysis revealed that DEGs are associated with several pathways, including the calcium signaling pathway, complement and coagulation cascades, ferroptosis, fluid shear stress and atherosclerosis, lipid and atherosclerosis, and regulation of lipolysis in adipocytes. TCF21, CDH19, XG, and NNAT were identified as feature genes and validated in the GSE64554 and GSE24425 datasets. Immune cell infiltration analysis showed plasma cells are significantly more numerous in EAT than in the control group (p = 0.001), whereas macrophage M0 (p = 0.024) and resting mast cells (p = 0.036) were significantly less numerous. 
TCF21, CDH19, XG, and NNAT were correlated with immune cells, including plasma cells, M0 macrophages, and resting mast cells. Conclusion TCF21, CDH19, XG, and NNAT might serve as feature genes for CAD, providing new insights for future research on the pathogenesis of cardiovascular diseases.


Introduction
Coronary artery disease (CAD) is one of the leading causes of death worldwide, and atherosclerosis is its fundamental pathophysiological change [1]. Obesity is a significant risk factor for cardiovascular disease, and the expansion of ectopic and visceral fat is strongly involved in the pathogenesis of CAD [2]. Recent evidence points to a role for epicardial adipose tissue (EAT) in the occurrence, development, and prognosis of CAD [3]. EAT is a unique adipose depot, supplied by branches of the coronary arteries and directly adjacent to the myocardium. It is composed mainly of adipocytes, stromal vascular cells, fibroblasts, nerves, and various immune cells. Besides storing energy, the EAT serves as an endocrine and immune organ [4,5]. Under physiological conditions, the EAT contributes to cardiac metabolism, prevents cardiac lipotoxicity, mechanically protects the coronary arteries, and provides immunological support for the heart [6]. The link between EAT inflammation and CAD has attracted increasing research attention. In recent years, the EAT has been proposed as a biomarker for acute coronary syndrome (ACS), major adverse cardiac events (MACE), and atrial fibrillation (AF) [7][8][9]. Moreover, several large-scale cohort studies demonstrated that EAT volume is positively associated with the occurrence, development, and prognosis of CAD [10][11][12]. Specifically, it is currently accepted that some cytokines secreted by the EAT either protect or impair cardiomyocyte function and the coronary arteries through paracrine or vasocrine mechanisms [13,14]. Cytokines secreted by the EAT might diffuse through the interstitial fluid into the layers of the coronary wall, or they could be released directly into the vasa vasorum of the coronary arteries [15,16].
In pathological conditions, the proinflammatory or proatherogenic factors secreted by the EAT, including IL-6, IL-8, monocyte chemoattractant protein 1, leptin, resistin, and tumor necrosis factor α [15], exert their pathophysiological effects through direct diffusion, enhancing the potential to induce atherogenic changes in monocytes and endothelial cells [17]. Leptin, for example, is regarded as an independent risk factor for atherosclerosis that exerts a variety of atherogenic effects, such as increasing endothelial dysfunction, promoting inflammatory responses, oxidative stress induction, platelet aggregation and migration, and the proliferation of vascular smooth muscle cells [3,18]. Although many studies have confirmed the involvement of the EAT in the development and progression of coronary atherosclerosis through adipokines, the exact mechanisms through which the EAT participates in CAD remain unclear [3,5,19,20,21]. A considerable limitation of these studies is that they recruited only patients who underwent cardiac surgery. Furthermore, the EAT is difficult to collect from healthy subjects because of ethical concerns, so subcutaneous adipose tissue (SAT) is usually used as the control across studies [22][23][24][25]. Bioinformatics analysis has been applied extensively to identify differentially expressed genes (DEGs) at the genome-wide level and is a useful strategy for exploring potential biomarkers and molecular mechanisms associated with the EAT and CAD.
Here, we used three microarray datasets from the Gene Expression Omnibus (GEO) database to identify and validate DEGs between the EAT and the SAT. We explored the underlying biological functions using enrichment analysis and identified feature genes using machine-learning algorithms. In addition, we used CIBERSORT to estimate the proportions of immune cells present in the EAT [26,27] and studied the relationship between the feature genes and infiltrating immune cells to provide a basis for further research.

Materials and Methods
2.1. Microarray Data. The GSE120774, GSE64554, and GSE24425 datasets were downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). GSE120774 was used as the discovery cohort, and GSE64554 and GSE24425 were used as validation cohorts. From GSE120774, which is based on the GPL6244 Affymetrix Human Gene 1.0 ST Array, we analyzed 9 EAT and 8 SAT samples from patients with CAD. From GSE64554, which is based on the GPL6947 Illumina HumanHT-12 V3.0 expression beadchip, we analyzed 13 EAT and 13 SAT samples from patients with CAD. From GSE24425, which is based on the GPL6884 Illumina HumanWG-6 V3.0 expression beadchip, we analyzed 6 EAT and 6 SAT samples from patients with CAD. We used the limma package in R to normalize the expression data and ensure a similar distribution among these datasets.
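The text states only that limma was used for normalization and does not name the method. As an illustration of what between-array normalization does, the following minimal sketch assumes quantile normalization and is written in Python rather than R; the function name and the toy matrix are hypothetical and are not taken from the study.

```python
from statistics import mean


def quantile_normalize(matrix):
    """Quantile-normalize a genes x samples matrix given as a list of rows.

    Every sample (column) is mapped onto one shared reference distribution:
    the mean of the k-th smallest values across samples. Ties are broken by
    position instead of being averaged, which is a simplification.
    """
    n_genes = len(matrix)
    n_samples = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_samples)]

    # Reference distribution: mean of the sorted columns, rank by rank.
    sorted_cols = [sorted(col) for col in columns]
    reference = [mean(sc[k] for sc in sorted_cols) for k in range(n_genes)]

    normalized = [[0.0] * n_samples for _ in range(n_genes)]
    for j, col in enumerate(columns):
        # Gene indices ordered from lowest to highest value in this sample.
        order = sorted(range(n_genes), key=lambda i: col[i])
        for rank, gene_idx in enumerate(order):
            normalized[gene_idx][j] = reference[rank]
    return normalized


# Toy data (hypothetical): 4 genes measured in 3 samples.
expr = [
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 4.0, 6.0],
    [4.0, 2.0, 8.0],
]
norm = quantile_normalize(expr)
```

After this step every column of `norm` contains the same set of values, which is the sense in which the samples share a similar distribution.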

Identification of Differentially Expressed Genes. The DEGs were identified by the limma package in R. A volcano plot was used to assess the DEGs, and the cutoff was set as |log2 fold change (FC)| ≥ 1 (adjusted p value < 0.05).
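The cutoff above can be sketched as a simple filter. The example assumes that the adjusted p value is a Benjamini-Hochberg adjustment (the text does not name the correction), and it takes per-gene log2FC and raw p values as given rather than fitting limma's moderated model. Gene names and numbers are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_degs(genes, log2fc, pvalues, fc_cutoff=1.0, alpha=0.05):
    """Split genes into up- and downregulated lists using both cutoffs."""
    adj = benjamini_hochberg(pvalues)
    up, down = [], []
    for gene, lfc, q in zip(genes, log2fc, adj):
        if q < alpha and abs(lfc) >= fc_cutoff:
            (up if lfc > 0 else down).append(gene)
    return up, down


# Hypothetical input: four placeholder genes.
genes = ["geneA", "geneB", "geneC", "geneD"]
log2fc = [1.8, -1.3, 0.4, 2.1]
pvals = [0.001, 0.01, 0.02, 0.5]
up_genes, down_genes = call_degs(genes, log2fc, pvals)
```

In this toy input, geneC passes the adjusted p value cutoff but not the fold-change cutoff, and geneD passes the fold-change cutoff but not the p value cutoff, so neither is called.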

Functional Annotation for Differentially Expressed Genes. Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) enrichment analyses were conducted using the Database for Annotation, Visualization, and Integrated Discovery (DAVID). GO terms comprise the biological process (BP), cellular component (CC), and molecular function (MF) categories. The R package ggplot2 was used to visualize the results. Functional enrichment analysis of all expression data was performed by Gene Set Enrichment Analysis (GSEA) using the R packages clusterProfiler and org.Hs.eg.db. The GSEA cutoff was set as p value < 0.05 and |normalized enrichment score (NES)| > 1.
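To make the enrichment score concrete, the following sketch computes the weighted running-sum statistic that underlies GSEA for a single gene set. It is an illustration only and not the clusterProfiler implementation: the normalization to NES and the permutation-based p value are omitted, and the function name, gene IDs, and ranking scores are hypothetical.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score (ES) for one gene set.

    ranked_genes: gene IDs sorted by decreasing ranking metric.
    scores: ranking metric for each gene, in the same order.
    gene_set: set of gene IDs belonging to the pathway.
    Returns the signed maximum deviation of the running sum from zero.
    """
    in_set = [g in gene_set for g in ranked_genes]
    n_hits = sum(in_set)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partly")

    hit_total = sum(abs(s) ** weight for s, h in zip(scores, in_set) if h)
    if hit_total == 0:
        raise ValueError("all gene-set members have a zero ranking metric")

    running, best = 0.0, 0.0
    for s, hit in zip(scores, in_set):
        if hit:
            running += abs(s) ** weight / hit_total  # step up on a member
        else:
            running -= 1.0 / n_miss                  # step down otherwise
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list: six placeholder genes ordered by a metric such as log2FC.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es_top = enrichment_score(ranked, metric, {"g1", "g2"})
es_bottom = enrichment_score(ranked, metric, {"g5", "g6"})
```

A set concentrated at the top of the ranking yields a positive ES, and a set concentrated at the bottom yields a negative ES; the NES reported in the text additionally rescales ES by the scores obtained from permuted data.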

BioMed Research International

Feature Genes Identification. We used two machine-learning algorithms to screen for the most significant candidate biomarkers between SAT and EAT. The least absolute shrinkage and selection operator (LASSO) is a regression-based algorithm suitable for both linear and nonlinear cases; we performed it with the glmnet package in R. Support vector machine (SVM) is another machine-learning algorithm, used for regression or classification. To avoid overfitting, SVM-recursive feature elimination (SVM-RFE) was used to screen for feature genes among the selected genes. We selected the top 20 genes for the SVM-RFE algorithm according to |log2 fold change (FC)| and then took the intersection of the genes obtained by the two algorithms. SVM-RFE was performed using the e1071 and mlbench R packages. To further evaluate the diagnostic ability of the candidate biomarkers, we calculated the area under the curve (AUC) of the receiver operating characteristic (ROC) curve.
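The last two steps of this procedure, intersecting the gene lists from the two algorithms and scoring each candidate by AUC, can be sketched as follows. The sketch does not fit LASSO or SVM-RFE; it assumes their selected gene lists are already available. The AUC is computed from pairwise comparisons (the Mann-Whitney formulation), and all names and values are hypothetical.

```python
def intersect_features(lasso_genes, svm_rfe_genes):
    """Genes selected by both algorithms, in alphabetical order."""
    return sorted(set(lasso_genes) & set(svm_rfe_genes))


def roc_auc(scores, labels):
    """Area under the ROC curve from pairwise comparisons.

    scores: one value per sample (for example, expression of one gene).
    labels: 1 for the group expected to score higher, 0 for the other group.
    The AUC equals the fraction of (positive, negative) pairs in which the
    positive sample scores higher, counting ties as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical selections from the two algorithms.
shared = intersect_features(["geneA", "geneB", "geneC"],
                            ["geneB", "geneC", "geneD"])

# Hypothetical expression of one candidate gene in 3 EAT (1) and 2 SAT (0) samples.
auc = roc_auc([0.9, 0.8, 0.3, 0.4, 0.2], [1, 1, 1, 0, 0])
```

An AUC of 0.5 corresponds to no discrimination between the groups and 1.0 to perfect separation; with sample sizes as small as those in the three datasets, such estimates are necessarily coarse.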

Results
3.1. Identification of DEGs. The GSE120774, GSE64554, and GSE24425 datasets were normalized before analysis (Figure 1 and Supplemental File-Figure 1 show both the nonnormalized and normalized data). We identified a total of 130 upregulated and 107 downregulated genes. Genes with the most significant logFC in EAT compared with SAT in CAD patients are shown in the volcano plot of Figure 2.

Functional Enrichment Analysis of DEGs.
We subsequently conducted functional enrichment analyses, including GO, KEGG, and GSEA, to explore the biological function and pathways associated with the DEGs. GO enrichment analysis revealed that negative regulation of transcription from RNA polymerase II promoter, negative regulation of cell proliferation, cell adhesion, angiogenesis, and response to lipopolysaccharide are enriched terms in BP (Figure 3(a)); plasma membrane, extracellular space, extracellular region, and extracellular exosome are enriched terms in CC (Figure 3(b)); and RNA polymerase II transcription factor activity, calcium ion binding, integrin binding, and DNA-binding activities are enriched in MF (Figure 3(c)). In addition, KEGG pathway analysis revealed that DEGs are mainly involved in the complement and coagulation cascades, fluid shear stress and atherosclerosis, and TNF signaling pathway (Figure 3(d)). In the GSEA, we identified several enriched pathways, including the calcium signaling pathway, complement and coagulation cascades, ferroptosis, fluid shear stress and atherosclerosis, lipid and atherosclerosis, and regulation of lipolysis in adipocytes (Figures 4(a)-4(f)).

Immune Cell Infiltration.
Functional enrichment analysis revealed that DEGs might be involved in immune response, so we used the CIBERSORT algorithm to compare immune cell infiltration between EAT and SAT in CAD patients. The composition of immune cells in EAT vs. SAT samples in CAD patients is shown in Figure 8(a), which shows that the proportion of plasma cells is notably higher in the EAT than in the SAT (p = 0.001). In contrast, the proportions of M0 macrophages (p = 0.024) and resting mast cells (p = 0.036) are notably lower in the EAT than in the SAT (Figure 8(b)).

Correlation Analysis between the Four Feature Genes and Immune Cells. We found that TCF21 is positively correlated with plasma cells (r = 0.824, p < 0.001), but negatively (Figures 9(a)-9(d)). Overall, we found that the four feature genes are highly correlated with immune cells.
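A correlation coefficient of the kind reported here relates the expression of one gene to the estimated fraction of one immune cell type across samples. The text does not say whether Pearson or Spearman correlation was used; the sketch below assumes Pearson, and the input values are hypothetical placeholders rather than data from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical values: gene expression and immune cell fraction in 4 samples.
gene_expression = [1.0, 2.0, 3.0, 4.0]
cell_fraction = [1.0, 3.0, 2.0, 4.0]
r = pearson_r(gene_expression, cell_fraction)
```

Replacing both inputs with their ranks before calling `pearson_r` would give the Spearman coefficient, which is less sensitive to outliers in small samples.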

Discussion
The EAT participates in the pathological process of atherosclerosis through the endocrine and paracrine pathways, although the specific mechanisms remain unknown [14]. Here, we found 130 upregulated and 107 downregulated genes from a microarray analysis. Functional enrichment analysis indicated that these DEGs are involved in various pathophysiological processes and that four feature genes (TCF21, CDH19, XG, and NNAT) identified via LASSO regression and the SVM-RFE algorithm are correlated with immune cells, including plasma cells, M0 macrophages, and resting mast cells, as shown by infiltration analysis.
Previous studies have revealed that adipokines secreted by the EAT might affect myocardial cells and coronary arteries [3,19]. Hypoxia and dysfunction of the EAT might lead to lipolysis and inflammatory activity through the dysregulated secretion of vasoactive and inflammatory factors, which are involved in the process of atherosclerosis, including vascular remodeling, endothelial dysfunction, the proliferation and migration of smooth muscle cells (SMC), foam cell formation, and plaque destabilization [28]. Intelectin 1 (ITLN1), which in our analysis showed the largest expression difference between the EAT and the SAT (Figure 2, Supplemental File-Figure 2(a)), is abundantly expressed in visceral adipose tissue and known to regulate obesity-related cardiometabolic disorders through its anti-inflammatory activity [29]. Leptin is regarded as an independent risk factor for atherosclerosis that exerts a variety of atherogenic effects. However, the expression level of leptin was not significantly higher in EAT compared with SAT in our analysis (Supplemental File-Figure 2(b)), and we hypothesize that the reasons might be as follows: (1) the sample size is insufficient to show significant differences; (2) leptin in EAT might be mainly derived from the circulation. In contrast, chemerin, which can bind to the G protein-coupled receptor CMKLR1, is associated with immune response and the metabolism of glucose and lipids [30], and its expression levels are reportedly positively associated with coronary atherosclerosis [21].
Figure caption (fragment): immune cell subtypes were compared between the EAT and the SAT group. The blue and red colors represent the SAT and the EAT samples, respectively.

Our study identified four feature genes (TCF21, CDH19, XG, and NNAT) associated with CAD using two machine-learning algorithms. TCF21 is involved in cardiac fibrosis and plays a critical role in the fate of smooth muscle cells [31], promoting SMC dedifferentiation by inhibiting the serum response factor-myocardin axis (SRF-MYOCD) [32]. The specific effects of TCF21 on atherosclerosis are complex. On the one hand, TCF21 suppresses the progression of atherosclerosis by regulating the transition from SMC to fibromyocytes and promoting the formation of antiatherosclerotic fibrous caps on the lesions [33]. On the other hand, when compared with the control, the transfection of TCF21 siRNA (siTCF21) notably decreases the levels of reactive oxygen species (ROS) and the apoptosis-related protein Bax and increases the expression of the antiapoptotic protein Bcl-2 in human umbilical vein endothelial cells (HUVECs) [34]. This suggests that TCF21 might promote atherosclerosis by increasing the apoptosis rate and ROS accumulation. Cadherin 19 (CDH19) encodes a calcium-dependent cell adhesion protein involved in vascular remodeling and plays a critical role in the structural integrity of blood vessels [35]. Recent studies have demonstrated the involvement of classic cadherins in many complex processes, such as angiogenesis, morphogenesis, cellular communication, and cellular proliferation [36][37][38]. Niu et al. [39] revealed that knockdown of CDH12 and CDH19 expression markedly inhibits monocyte chemotactic protein-1-induced protein (MCPIP) and suppresses capillary-like tube formation in HUVECs. Moreover, CDH19 might serve as a new target of tumorigenesis and drug development for glioblastoma stem-like cells (GSC) and can be considered an independent prognostic biomarker of lung adenocarcinoma (LUAD) and breast cancer (BC) [36,40,41]. XG is a blood group system located at the pseudoautosomal boundary on the short arm of chromosome X, composed of two X-borne alleles, Xga and Xg [42]. Studies evaluating the biological functions of the gene have been limited to its association with red blood cells (RBC). Meynet et al. showed that high XG protein expression in Ewing's sarcoma (EWS) is associated with a worse prognosis. Furthermore, the overexpression of XG increased the proliferation and migration of EWS cells in vitro, while knockdown of the gene with short hairpin RNA had the opposite effect [43]. However, the role played by XG in atherosclerosis remains uncharacterized. Finally, NNAT is a paternally expressed imprinted gene, which is expressed in the developing brain, pituitary, pancreas, and adipose tissue and plays an important role in appetite behavior, energy balance, adipogenesis, and inflammatory responses associated with insulin resistance [44][45][46]. Gene set enrichment analysis indicated a significantly negative correlation between NNAT and energy metabolism but a positive correlation with inflammation [46]. It has been reported that NNAT inhibits oxidative stress and inflammation and promotes adipocyte differentiation by mediating the NF-κB signaling pathway [45]. NNAT expression levels are also closely associated with endothelial dysfunction and EAT secretion [45,47]. Furthermore, increased NNAT expression levels are associated with poor prognosis in myxoid liposarcoma, lung cancer, and breast cancer [48][49][50]. However, very few studies have clarified the association of this gene with atherosclerosis.
We calculated immune cell infiltration and estimated the correlation between the four genes and immune cells. We found that the four feature genes are correlated with immune cells, including plasma cells, M0 macrophages, and resting mast cells. To our knowledge, this is the first study to calculate the infiltration of immune cells in EAT vs. SAT. Adipocytes not only serve as an energy storage depot but also play a critical role in endocrine and immune function. Adipokines, such as leptin and adiponectin, are critical for B cell development, activation, and antibody production [51]. Hence, adipocytes play a crucial role in adaptive immunity mediated by B cells.
Despite the associations described above, few studies investigating the molecular mechanisms linking these four genes and immune cells have been published to date, so further experiments are required to explore their role in pathogenesis. Our study has several limitations: (1) the SAT, rather than the EAT of healthy individuals, was used as the control because of ethical restrictions, so the difference between the EAT and the SAT in healthy individuals remains unknown; (2) the three datasets have limited sample sizes; and (3) the association between the feature genes and CAD and their interaction with immune cells need further investigation in larger samples to confirm our observations.

Conclusions
In this study, we identified the DEGs between the EAT and the SAT in patients with CAD and explored the potential biological processes and pathways involved. The identified DEGs are mainly associated with the calcium signaling pathway, complement and coagulation cascades, ferroptosis, fluid shear stress and atherosclerosis, lipid and atherosclerosis, and regulation of lipolysis in adipocytes. In addition, the four genes identified (TCF21, CDH19, XG, and NNAT) might serve as feature genes for CAD, bringing new insights into the pathogenesis of cardiovascular diseases.

Data Availability
All the analyses in this study were based on the publicly available datasets (GSE120774, GSE64554, and GSE24425). Original data are available in the GEO database (https://www.ncbi.nlm.nih.gov/).

Conflicts of Interest
The authors declare no conflict of interest.