Identification of Key Pathways and Genes in Advanced Coronary Atherosclerosis Using Bioinformatics Analysis

Background. Coronary artery atherosclerosis is a chronic inflammatory disease. This study aimed to identify key changes in gene expression between early and advanced carotid atherosclerotic plaques in humans. Methods. The gene expression dataset GSE28829 was downloaded from Gene Expression Omnibus (GEO); it includes 16 advanced and 13 early stage atherosclerotic plaque samples from human carotid arteries. Differentially expressed genes (DEGs) were analyzed. Results. A total of 42,450 genes were obtained from the dataset. The top 100 up- and downregulated DEGs were listed. Functional enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were performed. The functional and pathway enrichment results indicated that the immune system process plays a critical role in the progression of carotid atherosclerotic plaque. Protein-protein interaction (PPI) networks were also constructed. The top 10 hub genes were identified from the PPI network and the top 6 modules were inferred. These genes were mainly involved in the chemokine signaling pathway, cell cycle, B cell receptor signaling pathway, focal adhesion, and regulation of actin cytoskeleton. Conclusion. The present study indicates that analysis of DEGs can deepen understanding of the molecular mechanisms of atherosclerosis development and that these genes might serve as molecular targets and diagnostic biomarkers for the treatment of atherosclerosis.


Introduction
Atherosclerosis-associated cardiovascular diseases (CVD) are the leading cause of mortality worldwide. Immune system responses play a pivotal role in all phases of atherosclerosis [1], and inflammatory responses contribute to focal plaque vulnerability [2]. High plasma LDL levels and other atherosclerosis-prone conditions expedite immune cell recruitment into the lesion area in the early and advanced stages [3][4][5]. A variety of inflammatory processes have been identified during atherosclerosis progression, and these might be amenable to intervention.
High-throughput platforms for gene expression analysis, such as microarrays, are promising tools for inferring biological relevance, especially for the complex networks involved in atherosclerosis. Recent microarray-based gene expression profiling studies of atherosclerosis have suggested that hundreds of differentially expressed genes (DEGs) are involved in a variety of pathways, biological processes, and molecular functions. Microarray technology combined with bioinformatics analysis makes it possible to analyze comprehensively the changes in mRNA expression from the early to the advanced stage of coronary atherosclerosis development. Samples from early lesions ((pathological) intimal thickening and intimal xanthoma) and from advanced lesions (thin or thick fibrous cap atheroma) have been retrieved from the Maastricht Pathology Tissue Collection (MPTC) [6]. However, the protein-protein interaction (PPI) network among the DEGs remains to be elucidated.
In this study, the original data were downloaded from Gene Expression Omnibus (GEO). DEGs between early and advanced lesions were screened. Subsequently, gene ontology and biological function annotation were performed, followed by PPI network analysis. By using bioinformatic methods, this study sheds light on the mechanisms of atherosclerosis and might provide potential biomarker candidates for clinical use and drug target discovery.

Microarray Data.
The gene expression profiles of GSE28829 were downloaded from Gene Expression Omnibus (GEO). GSE28829 is based on the GPL570 platform, [HG-U133 Plus 2] Affymetrix Human Genome U133 Plus 2.0 Array. The GSE28829 data set contains 29 samples: 16 advanced atherosclerotic plaque samples and 13 early atherosclerotic plaque samples.

Identification of Differentially Expressed Genes (DEGs).
The analysis was carried out with Morpheus (https://software.broadinstitute.org/morpheus/). The expression files were uploaded. Samples were assigned to the advanced or early stage of atherosclerotic plaque according to the annotation of GSE28829 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE28829). DEGs were identified using the signal to noise method: a total of 42,450 genes were analyzed, and the top 100 upregulated and top 100 downregulated genes were listed.
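The signal to noise ranking described above can be illustrated with a minimal sketch. It assumes the commonly used definition of the score, (mean of class A − mean of class B) / (SD of class A + SD of class B), with the sample standard deviation; the source does not state the exact formula or any adjustments that Morpheus applies, and the gene names and values below are hypothetical toy data, not values from GSE28829.

```python
from statistics import mean, stdev

def signal_to_noise(group_a, group_b):
    """Signal to noise score for one gene (assumed definition):
    (mean_a - mean_b) / (sd_a + sd_b), using sample standard deviations.
    Raises ZeroDivisionError if both groups have zero variance."""
    return (mean(group_a) - mean(group_b)) / (stdev(group_a) + stdev(group_b))

def rank_genes(expression, n_top=100):
    """expression maps gene -> (advanced_values, early_values).
    Returns all scores plus the n_top most up- and downregulated genes."""
    scores = {gene: signal_to_noise(adv, early)
              for gene, (adv, early) in expression.items()}
    ordered = sorted(scores, key=scores.get, reverse=True)
    top_up = ordered[:n_top]            # highest scores: up in advanced plaques
    top_down = ordered[::-1][:n_top]    # lowest scores: down in advanced plaques
    return scores, top_up, top_down

# Hypothetical toy data: three genes, three samples per group.
toy = {
    "GENE_UP": ([8.0, 9.0, 10.0], [4.0, 5.0, 6.0]),
    "GENE_FLAT": ([5.0, 6.0, 7.0], [5.0, 6.0, 7.0]),
    "GENE_DOWN": ([2.0, 3.0, 4.0], [7.0, 8.0, 9.0]),
}
scores, top_up, top_down = rank_genes(toy, n_top=1)
print(scores, top_up, top_down)
```

A positive score here means higher expression in the advanced group, which matches the sign convention of the scores reported for C2 and H2AFV.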

Gene Ontology and Pathway Enrichment Analysis of DEGs.
Cellular component, molecular function, biological process, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway terms were analyzed using a web-based tool, the Search Tool for the Retrieval of Interacting Genes (STRING) (https://string-db.org/). Because of limits in the tool's settings, only the top 2000 upregulated genes and top 2000 downregulated genes were analyzed.

Integration of Protein-Protein Interaction (PPI) Network and Module Analysis.
STRING (version 10.0) was used to evaluate the interactive (PPI) relationships between DEGs. Only experimentally validated interactions with a combined score >0.4 were selected as significant. PPI networks were constructed using the Cytoscape software. The Molecular Complex Detection (MCODE) plug-in was used to screen the modules of the PPI network in Cytoscape. Modules were inferred using the default settings: degree cutoff of 2, node score cutoff of 0.2, k-core of 2, and max. depth of 100.
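As a rough illustration of the network step, the sketch below filters an edge list by the combined score cutoff of 0.4 and ranks nodes by degree. Ranking hub genes by degree is an assumption made for this example, since the criterion is not stated here, and the node names and scores are hypothetical. The scores are assumed to lie on a 0 to 1 scale. This is not the MCODE algorithm, which additionally scores local neighborhood density.

```python
from collections import Counter

def filter_edges(edges, min_score=0.4):
    """Keep (gene_a, gene_b) pairs whose combined score exceeds min_score.
    Scores are assumed to be on a 0-1 scale."""
    return [(a, b) for a, b, score in edges if score > min_score]

def hub_genes(kept_edges, n_top=10):
    """Rank nodes by degree (number of retained interactions).
    Degree as the hub criterion is an assumption of this sketch."""
    degree = Counter()
    for a, b in kept_edges:
        degree[a] += 1
        degree[b] += 1
    return degree.most_common(n_top)

# Hypothetical toy edge list: (gene_a, gene_b, combined_score).
toy_edges = [
    ("G1", "G2", 0.90),
    ("G1", "G3", 0.70),
    ("G1", "G4", 0.50),
    ("G2", "G3", 0.45),
    ("G3", "G4", 0.20),  # below the cutoff, dropped
]
kept = filter_edges(toy_edges)
top_hub = hub_genes(kept, n_top=1)
print(kept, top_hub)
```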

Pathways Interrelation Analysis.
Pathway interrelation analysis was carried out using the ClueGO v2.3.3 plug-in. The genes in modules A and D (inferred with MCODE) were analyzed. KEGG pathway analysis was conducted, and pathways with P < 0.05 are shown in Figure 3.
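To make the significance cutoff concrete, the following sketch computes a one-sided hypergeometric enrichment P value for a single pathway. The choice of test is an assumption: the statistical settings used in ClueGO are not given in the text, and no multiple-testing correction is applied here. All counts are hypothetical.

```python
from math import comb

def enrichment_p(n_background, n_pathway, n_selected, n_overlap):
    """One-sided hypergeometric P value: probability of seeing at least
    n_overlap pathway genes when n_selected genes are drawn at random
    from n_background genes, of which n_pathway belong to the pathway."""
    upper = min(n_pathway, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 10 background genes, 4 in the pathway,
# 3 genes in the module, all 3 of them in the pathway.
p_value = enrichment_p(10, 4, 3, 3)
print(p_value, p_value < 0.05)
```

With real data the background would be the full annotated gene set, and the P values would normally be corrected for the number of pathways tested.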

Identification of Differentially Expressed Genes (DEGs).
A total of 29 samples from atherosclerotic carotid artery segments, comprising 16 advanced and 13 early lesions, were retrieved from the Maastricht Pathology Tissue Collection (MPTC). The expression series were analyzed with Morpheus using the signal to noise method to identify as many up- or downregulated genes as possible. Among the 42,450 genes in total, the most strongly upregulated gene was C2, with a signal to noise score of 1.792, and the most strongly downregulated gene was H2AFV, with a signal to noise score of −2.249. All the DEGs were listed (data not shown). The top 100 upregulated and downregulated genes are shown in Figure 1.

Pathways Interrelation Analysis.
To investigate previously unidentified interrelations between pathways, the modules inferred from the network were analyzed, and the interrelations between the pathways and the genes involved are shown in Figure 3. The modules with the highest MCODE scores were selected for pathway interrelation analysis: module A, inferred from upregulated DEGs, and module D, inferred from downregulated DEGs (Figure 2). As shown in Figure 3(a), the genes in module A were mainly involved in four pathways: the NF-kappa B signaling pathway, the chemokine signaling pathway, the legionellosis signaling pathway (with Salmonella infection, interleukin 17 (IL-17), tumor necrosis factor (TNF), epithelial cell, and rheumatoid arthritis (RA) signaling pathways as subgroups), and the Staphylococcus aureus infection signaling pathway. C-X-C motif chemokine ligand 2 (CXCL2), C-X-C motif chemokine ligand 3 (CXCL3), C-X-C motif chemokine ligand 8 (CXCL8), and C-X-C motif chemokine ligand 12 (CXCL12) took part in three pathways: the NF-kappa B signaling pathway, the chemokine signaling pathway, and the legionellosis signaling pathway (nodes in three colors). C-C motif chemokine ligands 19 and 21 (CCL19, CCL21) were involved in the NF-kappa B and chemokine signaling pathways, while C-C motif chemokine ligand 5 (CCL5), C-C motif chemokine ligand 20 (CCL20), C-X-C motif chemokine ligand 1 (CXCL1), and C-X-C motif chemokine ligand 1 (CXCL3) played a role in both the legionellosis and chemokine signaling pathways. In addition, complement C3 (C3) participated in the legionellosis and Staphylococcus aureus infection signaling pathways. The pathways and gene sets are listed in Table 6.
Analysis of module D demonstrated that its genes were mainly involved in focal adhesion (with regulation of actin cytoskeleton, platelet activation, and long-term potentiation as subgroups), adherens junction (with the glioma and melanoma signaling pathways as subgroups), pathogenic Escherichia coli infection, and the mRNA surveillance pathway (with adrenergic signaling in cardiomyocytes and the oocyte meiosis signaling pathway as subgroups). Among these genes, RHOA took part in 5 pathways: pathogenic Escherichia coli infection, vascular smooth muscle contraction, focal adhesion, adherens junction, and the mRNA surveillance pathway. Raf-1 proto-oncogene, serine/threonine kinase (RAF1) participated in 4 pathways. Protein phosphatase 2 catalytic subunit beta (PPP2CB), protein phosphatase 2 regulatory subunit B (B56) alpha isoform (PPP2R5A), protein phosphatase 2 regulatory subunit B (B56) gamma isoform (PPP2R5C), protein phosphatase 1 catalytic subunit beta (PPP1CB), insulin-like growth factor 1 receptor (IGF1R), and cytochrome C, somatic (CYCS) participated in 3 pathways. Ras homolog enriched in brain (RHEB), protein phosphatase 3 catalytic subunit beta (PPP3CB), epidermal growth factor receptor (EGFR), smooth muscle gamma-actin (ACTG2), vinculin (VCL), and protein phosphatase 1 regulatory subunit 12A (PPP1R12A) took part in 2 pathways. The pathways and gene sets are listed in Table 7.
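The idea of genes shared between pathways can be sketched by inverting a pathway-to-gene mapping into a gene-to-pathway mapping. The dictionary below is a partial, hand-transcribed subset of the module A memberships stated in this paragraph, used only for illustration; it is not the full content of Table 6, and the function name is hypothetical.

```python
from collections import defaultdict

# Partial module A memberships, transcribed from the paragraph above.
pathway_genes = {
    "NF-kappa B signaling pathway":
        {"CXCL2", "CXCL3", "CXCL8", "CXCL12", "CCL19", "CCL21"},
    "Chemokine signaling pathway":
        {"CXCL2", "CXCL3", "CXCL8", "CXCL12", "CCL19", "CCL21",
         "CCL5", "CCL20", "CXCL1"},
    "Legionellosis":
        {"CXCL2", "CXCL3", "CXCL8", "CXCL12", "CCL5", "CCL20", "CXCL1", "C3"},
    "Staphylococcus aureus infection": {"C3"},
}

def pathways_per_gene(pathway_genes):
    """Invert pathway -> genes into gene -> sorted list of pathways."""
    membership = defaultdict(list)
    for pathway, genes in pathway_genes.items():
        for gene in genes:
            membership[gene].append(pathway)
    return {gene: sorted(paths) for gene, paths in membership.items()}

membership = pathways_per_gene(pathway_genes)
# Genes linking two or more pathways are candidate cross-talk nodes.
shared = {gene: paths for gene, paths in membership.items() if len(paths) >= 2}
for gene in sorted(shared, key=lambda g: (-len(shared[g]), g)):
    print(gene, len(shared[gene]), shared[gene])
```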

Discussion
The underlying cause of cardiovascular events is atherosclerosis, a chronic inflammatory disease [7]. A thorough understanding of the molecular mechanisms of atherosclerosis is critically important for the diagnosis and treatment of cardiovascular disease. Because microarrays and high-throughput sequencing provide expression data for thousands of genes, they have been widely used to predict potential therapeutic targets for atherosclerosis. In the present study, GSE28829 was analyzed and the differentially expressed genes between early and advanced plaques collected from patients were identified. Functional annotation demonstrated that these DEGs were mainly involved in osteoclast differentiation, cytokine-cytokine receptor interaction, chemokine signaling pathway, lysosome and Staphylococcus aureus infection, focal adhesion, regulation of actin cytoskeleton, arrhythmogenic right ventricular cardiomyopathy (ARVC), oxytocin signaling pathway, and cGMP-PKG signaling pathway.
Cross-talk between the vascular and immune systems plays a critical role in atherosclerosis. A key point is that new drug development should not focus on the cardiovascular system only; the immune system is also a potential target for the treatment of atherosclerosis. The osteoclast-associated receptor (OSCAR), originally described in bone as an immunological mediator and regulator of osteoclast differentiation, may be involved in cell activation and inflammation during atherosclerosis [8]. Cytokine interactions mainly involve interleukins (IL), transforming growth factors (TGF), interferons (IFN), and tumor necrosis factors (TNF) [9,10]. CCL2, CCL5, IFN-γ, and TNF-α participate in monocyte recruitment. IFN-γ, IL-1β, TGF-β, and TNF-α take part in plaque stability. IFN-γ, IL-1β, IL-6, IL-12, IL-33, and M-CSF are involved in lesion formation. These signaling pathways, as well as those identified in this study, are well documented, and cytokine-targeted therapies use antibodies to block and inhibit proinflammatory cytokine signaling in order to dampen the inflammatory response observed in atherosclerotic lesions [11]. In this study, the signal to noise method implemented in Morpheus was used to identify the DEGs because it could yield the largest number of DEGs. To better understand the interactions among the DEGs, GO and KEGG analyses were performed.
The GO term analysis revealed that the upregulated genes were mainly involved in immune system process, defense response, and regulation of immune system process (Table 3). These results show that, as atherosclerosis develops, immune cells are activated and accumulate in the plaque [12][13][14]. Downregulated genes were mainly involved in cytoskeleton organization, cellular component organization, and positive regulation of cellular process (Table 3), which is consistent with recent findings [15][16][17]. In addition, as shown in Table 4, the KEGG analysis showed that upregulated genes participate in osteoclast differentiation [18][19][20], cytokine-cytokine receptor interaction [21], and chemokine signaling pathway [22][23][24]. Downregulated genes took part in focal adhesion [25][26][27], regulation of actin cytoskeleton [27][28][29], and arrhythmogenic right ventricular cardiomyopathy (ARVC) [30]. These pathways represent promising targets for new drug interventions. It is important to keep in mind that an upstream or key node gene might not be an appropriate target for drug design, because its core effects and far-range effects, especially side effects, may prevent further application of the drug. These GO term and KEGG analyses indicate possible directions for experimental validation.
Module analysis of the PPI network showed that the development of atherosclerosis was associated with the chemokine signaling pathway, cell cycle, B cell receptor signaling pathway, focal adhesion, and regulation of actin cytoskeleton. Indeed, various chemokines are secreted and attract different types of immune cells to the arterial plaque [49,50]. As atherosclerosis develops, the immune system provides a large variety of immune checkpoint proteins, both costimulatory and inhibitory. Costimulatory proteins can promote cell survival, cell cycle progression, and differentiation to effector and memory cells, whereas inhibitory proteins terminate these processes to halt ongoing inflammation [51]. Studies have shown that B1 cells can prevent lesion formation, whereas B2 cells have been suggested to promote it [52,53]. These activated signaling pathways are key to the development of atherosclerosis and suggest promising candidates for therapeutic intervention.
The interrelation between pathways showed that cross-talk arises through genes that participate in different signaling pathways. This suggests that these genes might be used as targets for intervention. Liver X receptors (LXRs), a promising target for preventing the development of atherosclerosis, have attracted much attention in recent years. Activators of both LXRα and LXRβ showed favorable effects in preclinical studies, but, because their mechanism is not fully clarified, these activators have been found to induce adverse neurological events [54,55]. The analysis of interrelations between pathways suggests that more consideration should be given to whether such cross-talk is beneficial or detrimental to the ultimate clinical goal.

Conclusion
Taken together, the results of this study suggest that the immune system and inflammatory processes are promising targets for the prevention of atherosclerosis, in addition to lipid lowering and regulation of cholesterol metabolism. In fact, immune system disorders are the physiological and pathological basis of many diseases, including angiocardiopathy [56][57][58][59]. Our data provide a comprehensive bioinformatics analysis of DEGs that might be involved in the development of atherosclerosis. The genes and signaling pathways identified in this study may have further clinical applications. However, molecular biological experiments are required to confirm the function of the identified genes in atherosclerosis.

Conflicts of Interest
The authors declare that there are no conflicts of interest.