Identification of Immune-Related Genes in Sepsis due to Community-Acquired Pneumonia

Background Immunosuppression has a key function in sepsis pathogenesis, so it is of great significance to find immune-related markers for the treatment of sepsis. Methods Datasets of community-acquired pneumonia (CAP) with sepsis from the ArrayExpress database were extracted. Differentially expressed genes (DEGs) between the CAP group and normal group by Limma package were performed. After calculation of immune score through the ESTIMATE algorithm, the DEGs were selected between the high immune score group and the low immune score group. Enrichment analysis of the intersected DEGs was conducted. Further, the protein-protein interaction (PPI) of the intersected DEGs was drawn by Metascape tools. Related publications of the key DEGs were searched in NCBI PubMed through Biopython models, and RT-qPCR was used to verify the expression of key genes. Results 360 intersected DEGs (157 upregulated and 203 downregulated) were obtained between the two groups. Meanwhile, the intersected DEGs were enriched in 157 immune-related terms. The PPI of the DEGs was performed, and 8 models were obtained. In sepsis-related research, eight genes were obtained with degree ≥ 10, included in the models. Conclusion CXCR3, CCR7, HLA-DMA, and GPR18 might participate in the mechanism of CAP with sepsis.


Introduction
Recent studies have reported high mortality and morbidity in community-acquired pneumonia (CAP) [1], which is frequently complicated with sepsis. Uncontrolled excessive inflammatory response to infection and injury is a significant cause of CAP with sepsis [2]. This study shows that infection is one of the causes of sepsis [3]. CAP with sepsis is characterized by inflammatory syndrome in the early stages, which is similar to the symptoms of sepsis caused by other types of disease, such as peritonitis [4], cholangitis [5], and acute bacterial meningitis [6]. Subsequently, the body produces a series of inflammatory responses to infection, in which the immune function and the coagulation system of the body undergo complex changes [7]. It has been reported that noninvasive biomarkers for diagnosis of sepsis [8]. However, the identification of biomarkers for the sepsis diagnosis is still lacking.
The immune regulation disorder is one of the key mechanisms of sepsis. The initial immune response of sepsis is characterized by the production and release of excessive proinflammatory mediators, which results in clinical symptoms of fever, rapid heart rate, and shortness of breath [9]. The proinflammatory response not only is beneficial to the clearance of pathogenic agents but also increases the immune damage to the body itself [9]. An important marker of progressive sepsis is the suppression of the immune response, which is mainly manifested by the decrease in the number and function of immune cells [10], including inactivation of macrophages, low ability of antigen presentation, and decreased activity of lymphocyte proliferation. Meanwhile, the release of inhibitory cytokines is also a significant cause of induction of immunosuppression [11]. Therefore, in the pathogenesis of sepsis, immune reaction plays an important influence, which impacts the occurrence and development of sepsis. It is significant to find immune-related markers for sepsis treatment.
As a new technology, clinical bioinformatics is considered to be one of the effective methods for cancer prediction, early diagnosis, and treatment [12]. This method has been widely used in many studies, such as liver cancer and gastric cancer [13,14]. It also plays an increasingly important role in some nontumor diseases [15]. Based on the analysis of integrating bioinformatics and meta-analysis, COL1A2 is a novel biomarker to improve clinical prediction in human gastric cancer [16]. Comprehensive bioinformatics analysis demonstrates the expression profile, clinical application significance, and prognostic value of the SLC16A gene family in pancreatic cancer [17]. In our study, we downloaded sepsis-related data from public databases to screen for differentially expressed genes. The function and pathway enrichment of different genes were delved. Based on the publication, we screened important differential genes as possible biomarkers of sepsis.

Materials and Methods
2.1. Data Source. From the EMBL-EBI ArrayExpress [18] database, the expression profile matrix file of preprocessed data and the standardized probe were downloaded (number: E-MTAB-5273). The chip platform was A-MEXP-2210-Illumina HumanHT-12_V4_0_R1_15002873_B. The human whole blood data of 127 community-acquired pneumonia patients with sepsis (CAP group) including 63 females and 64 males and 10 healthy subjects (normal group) including 2 females and 8 males were included in this study. Blood samples for RNA were obtained after ICU admission at the time of study enrollment (a window up to day 5). The average age of the cap group was 62.1 years, and that of the normal group was 67.9 years, and no significant difference was found (P = 0:25).

2.2.
Prepreprocess of the Data. The annotation information was downloaded from the platform. Through one-to-one matching between the probe number and the gene symbol, the probes not matching the gene symbol got removed. If several probes mapped to one gene, the average value of different probes was selected as the final value of the gene, so as to get the gene expression profile.

Calculation of Immune Score and Screening of
Differentially Expressed Genes (DEGs). The ESTIMATE algorithm [19] was applied to estimate the immune score of 127 patients with sepsis. Based on the ESTIMATE algorithm, the immune score was calculated. Samples above the median immune score were classified as the high immune score group, while those below the median were classified as the low immune score group.
The screening of DEGs was divided into two parts. In the first part, the DEGs between the CAP group and the normal group got delved using Bayesian tests with Limma package [20] (version 3.10.3), and the P value was corrected by the Benjamini/Hochberg tests. The genes with |log fold change ðFCÞ | >0:585 and adj. P value < 0.05 were identified as DEGs. Differentially expressed genes between the low immune score group and high immune score group were defined as immune-related DEGs according to adj. P value < 0.05 and |log FC | >0:585.
Furthermore, the DEGs between upregulated DEGs (CAP vs. normal group) and downregulated DEGs (high immune vs. low immune group) were selected for further analysis. Moreover, the DEGs between downregulated DEGs (CAP vs. normal group) and upregulated DEGs (high immune vs. low immune group) were also chosen for further analysis. After obtaining the terms that meet the above parameters, further cluster analysis got performed based on the similarity of genes enriched in each term (similarity of >0.3). The term with the lowest P value in the cluster was used to represent the cluster. The top 20 clusters were retained for bar graph display. To further obtain the relationship between terms, the interactive network diagram of terms was used to show the selection criteria of terms as follows: the item with the lowest P value from each cluster of 20 clusters was selected, and each cluster was limited to no more than 15 items. The total number could not exceed 250 items. Terms with similarity > 0:3 were connected by edges. Finally, the interactive network of related terms was built by Cytoscape software [21] (version 3.4.0).

Network and Module Analysis of PPI.
The Metascape was used to analyze protein-protein interaction (PPI) of intersecting DEGs based on the database of BioGrid [22], InWeb_IM [23], and OmniPath [24]. Parameters were set to default. Metascape tool based on Molecular Complex Detection (MCODE) was used for module mining in the PPI network [25]. For each module, GO BP and KEGG enrichment analysis was conducted, and the results of the top 3 of P value were retained for display.

Publication Investigation of Key
Genes. The DEGs in the PPI network with a degree of connectivity ≥ 10 and the module genes mined were selected as the key genes. Based on the Biopython [26] (https://doc.yonyoucloud.com/doc/ Biopython-cn/cn/chr09.html#sec-elink-citations), the sepsisrelated publications were searched in the National Center for Biotechnology Information (NCBI) PubMed using the symbol of genes and sepsis as the keywords. The number of publications on sepsis of each gene was counted, and the genes corresponding to 3 ≤ number of publications ≤ 10 were taken as the focus of attention on sepsis-related genes.
To observe the gene expression in the normal group, high immune score group, and low immune score group, GraphPad Prim 5 [27] was used to draw the expression scatter diagram of each key sepsis-related gene.  Figure 1 shows the analysis process. According to the preprocessing analysis, there were 19,189 genes annotated. Based on the median immune score of each sample, 64 samples with high immune scores and 63 samples with low immune scores were obtained. From the three-dimensional analysis of principal component analysis (PCA) about 127 samples, the disease group and normal group could separate, which indicated that the samples could be used for further analysis (Supplementary Figure 1). According to the screening threshold, 2668 DEGs (1052 upregulated DEGs and 1616 downregulated DEGs) were obtained between the CAP group and the normal group. Besides, 409 DEGs (238 upregulated DEGs and 171 downregulated DEGs) were obtained between the high immune score group and low immune score group.

Differentially Expressed Analysis.
The intersected DEGs in two parts were obtained in the opposite direction including 360 DEGs (157 upregulated genes and 203 downregulated) as shown in Figures 2(a) and 2(b). The heatmap of the expression of intersected DEGs was shown in Figure 2(c), which indicated the significant differences among different groups of these DEGs.

Enrichment Analysis Result.
A total of 157 enriched terms were obtained, including 122 GO BP, 15 GO CC, 9 GO MF, and 11 KEGG pathways, which were gathered in 20 clusters (Figure 3(a)). Meanwhile, the interactive network between the terms was in Figure 3(b). The data revealed the intersected DEGs were significantly enriched in the pathways which were correlated with immune, such as leukocyte activation involved in immune response, activation of the immune response, and T cell activation.

Publication Search of Key Genes.
A total of 43 candidate genes with the degree ≥ 10 and included in the eight modules were taken as the key genes, which were searched for the publication on sepsis. Finally, eight genes were selected, including CCR7, C-X-C motif chemokine ligand 3 (CXCR3), FYN, CEA cell adhesion molecule 1 (CEACAM1), CD81 molecule (CD81), amphiphysin (AMPH), major histocompatibility complex, class II, DM alpha (HLA-DMA), and GPR18. The expressions of the eight genes in the normal group, high immune score group, and low immune score group are in Figure 5. AMPH and CEACAM1 had upregulated genes, which had lower immune scores corresponding to a higher expression level. The other genes were downregulated, which had the higher immune score corresponded to the higher expression level.

Validation of the Eight Genes.
We used RT-qPCR to detect the expression level of eight genes screened from 43 key genes in normal blood samples and the blood of patients in the CAP group (AMPH has not been successfully determined). As shown in Figure 6, the relative expression levels of CXCR3, HLA-DMA, CCR7, and GPR18 of the CPA group were markedly lower than those of the control group (P < 0:05). Compared with that in the control group, CEA-CAM1 in the experimental group was increased, while FYN and CD81 were decreased, with no significant difference (P > 0:05).

Discussion
At present, patients infected with sepsis may suffer from multiple system damage. Such as circulatory, digestive, and blood damage. Septic shock, liver abscess, heart failure, respiratory distress syndrome, etc. Immunosuppression is the main cause of death in patients with sepsis. However, the 5 Computational and Mathematical Methods in Medicine identification of biomarkers for the diagnosis of sepsis is still lacking. In this study, the enrolled samples were divided into two groups (64 samples with high immune scores, 63 samples with low immune scores). After the VENN analysis between the two groups with opposite directions, 157 upregulated genes and 203 downregulated genes were obtained. Besides, the coexpressed DEGs were enriched in 157 terms. Furthermore, 295 PPI edges were obtained which including 141 protein nodes. Through searching the sepsis-related publication of the key genes, CCR7, CXCR3, FYN, CEACAM1, CD81, AMPH, HLA-DMA, and G protein-coupled receptors (GPR18) were considered to be key genes.   Computational and Mathematical Methods in Medicine Figure 4: The protein-protein interaction (PPI) network and eight modules identified from the PPI network for the differentially expressed genes. Different colors indicate that the nodes belonged to different modules. The size and degree of a node were in proportion.

Computational and Mathematical Methods in Medicine
Enrichment analysis reveals that genes are mainly enriched in pathways that are correlated with immune, such as leukocyte activation involved in immune response and activation of the immune response. The mechanism of sepsis and its associated systemic inflammatory response syndrome are complex. With the impaired immune function of phagocytes, immune suppression, and complement activation, a hyperinflammatory state is formed, and eventually, septic shock is triggered [28]. In conclusion, sepsis-induced organ dysfunction can be attributed to the interaction between the initial inflammatory response and the subsequent anti-inflammatory response [29].
CXCR3 belongs to the G-protein coupled seven-subunit transmembrane receptor. The expression of the gene in vivo is located in parenchymal cells and inflammatory cells during the lesions of multiple organs, such as blood vessels dermal cells, activated lymphocytes, macrophages, and dendritic cells [30]. CXCR3 is served as a symbol for the activity of Th1 lymphocytes [31]. Studies have shown that CXCR3 and the ligands have significant roles in the oncogene of viral infection, autoimmune disease, tumor, transplant immunity, organ/tissue fibrosis, and other diseases [32]. The related axis might be through the chemotactic activation of T cells, tumor cells, and inhibition of angiogenesis   Computational and Mathematical Methods in Medicine [32]. The studies also show the potential of CXCR3 blockade as a therapeutic approach to decrease the severity of sepsis during its acute phase [33]. CCR7 belongs to G protein-coupled receptor family. CCR7 and its ligands are mainly involved in the homing of T cell subsets and antigen-presenting dendritic cells to lymph nodes [34], and a previous study shows that dendritic cells and macrophages play a critical role in the inflammation of eosinophilic pneumonia [35]. CCR7 is inversely related to sequential organ failure and mortality [36].
HLA-DM, encoded by HLA-DMA and HLA-DMB, is a heterodimeric molecule that is important for normal antigen presentation. The study found that the low level of PDE4D expression was associated with HLA-DMA and HLA-DMB, while PDE4D expression continued to decline over time in sepsis patients [37].
GPR18 belongs to the orphan class A family. Studies report that modulation of GPR18 has been associated with immunomodulation, cancer, or metabolism [38,39]. GPR18 is also involved in bacterial clearance and survival in microbial sepsis. Moreover, a previous study reveals that GPR18 plays a great role in controlling infectious inflammation and promote organ protection [40,41]. A decrease in the positive rate of GPR18 in neutrophils often predicts more severe sepsis and a poorer prognosis [42].
This study has some limitations. First of all, the mechanism of action of key genes in sepsis should be investigated. Secondly, we only analyzed the expression levels of seven key genes in five sepsis patients. In future studies, we will collect more clinical samples to explore the mRNA and protein levels of key genes and further study the specific role of these genes in sepsis.
In conclusion, a series of bioinformatics analyses are conducted with immune-related genes in CAP with sepsis. We obtained 360 intersected DEGs (157 upregulations and 203 downregulations). The intersected DEG was rich in 157 immune-related terms. We constructed DEG's PPI network and obtained 8 models. CXCR3, CCR7, HLA-DMA, and GPR18 might be correlated with the sepsis mechanism. These immune-related genes may play a key role in the development of sepsis and help improve the immunomodulatory treatment of patients with sepsis.

Data Availability
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.