The Potential Genes Mediate the Pathogenicity of Allogeneic CD4+T Cell in aGVHD Mouse Model

Acute graft-versus-host disease (aGVHD) remains a significant and severe complication of allogeneic hematopoietic stem cell transplantation (allo-HSCT). Due to the occurrence of aGVHD, allo-HSCT significantly increases the mortality rate compared with autologous hematopoietic stem cell transplantation (auto-HSCT). In this study, auto-HSCT and allo-HSCT aGVHD mouse models were built to detect the difference in CD4+ lymphocyte in different tissues based on ribonucleic acid sequencing (RNA-Seq) analysis. Clustering analysis, functional annotation, and pathway enrichment analysis were performed on differentially expressed genes (DEGs). The protein-protein interaction (PPI) network was used to find hub genes. CD4+T cells were activated by MLR and cytokine stimulation. Cells were sorted out by a flow cell sorter. The selected genes were verified by qRT-PCR, histology, and immunofluorescence staining. The GSE126518 GEO dataset was used to verify the hub genes. Enrichment analysis revealed four immune-related pathways that play an important role in aGVHD, including immunoregulatory interactions between a lymphoid and a nonlymphoid cell, chemokine receptors binding chemokines, cytokine and cytokine receptor interaction, and the chemokine signaling pathway. At the same time, with the PPI network, 11 novel hub genes that were most likely to participate in immunoregulation in aGVHD were identified, which were further validated by qRT-PCR and the GSE126518 dataset. Besides, the protein expression level of Cxcl7 was consistent with the sequencing results. In summary, this study revealed that immunoregulation-related DEGs and pathways played a vital role in the onset of aGVHD. These findings may provide some new clues for probing the pathogenesis and treatment of aGVHD.


Introduction
Allogeneic hematopoietic stem cell transplantation (allo-HSCT) is an effective treatment for various haematological diseases wherein blood cells are unavailable in unmodified autologous transplants. Acute graft-versus-host disease (aGVHD) is one of the significant complications of allo-HSCT. Despite considerable achievements in the treatment of aGVHD, conventional clinical treatments of aGVHD are still targeted at the entire immune system, which not only increases the risk of serious infection but also leaves some patients at risk of developing aGVHD [1]. The occurrence of aGVHD can be divided into three stages according to its sequence. Firstly, preconditioning regimens such as radiotherapy and chemotherapy as well as infection cause epithelial cell damage and an inflammatory environment, while the expression of the MHC antigen in the recipient initiates the recognition of donor T cells. Secondly, the interactions between T cell receptors, antigen major histocompatibility complexes (anti-MHC), costimulatory molecules, and cytokines activate the donor T cells at an early stage. In this stage, T cells in the transplant are activated under the stimulation of an inflammatory environment and an allogeneic antigen, clonally proliferate, and finally differentiate into helper T cells (CD4 + T cells) or cytotoxic T cells (CD8 + T cells). Lastly, the activated effector T cells migrate to target tissues and recruit more effector cells, and various cytokines are released abnormally to act on effector cells resulting in damages to the lung, gut, and liver [2].
Alloreactive T cells proliferate and differentiate into subsets of T cells, including Th1, Th2, Th17, and others in peripheral lymphoid tissues, which are considered as the major detrimental factors for the pathogenesis of aGVHD [3]. IFN-γ-secreting Th1 cells are considered the key factor in the onset and progression of aGVHD [4]. Among all T cells, donor-derived CD4 + T cells and secreted cytokines are particularly vital in the pathogenesis and regulation of aGVHD. Studies have shown that the regulation of donorderived CD4 + T cells and the production of some critical cytokines (such as IFN-γ) could affect the development of aGVHD [5]. CD4 + T cells have been studied as the potential treatment targets for GVHD in a large number of clinical trials [4,6]. A clinical study, however, found that peripheral blood stem cell (PBSC) graft containing more mature T cells did not increase the incidence and severity of aGVHD [7]. Some studies found that the infiltration of immunomodulatory Treg cells in the intestine, skin, and stomach was different [8][9][10]. Meanwhile, we found a phenomenon that the infiltration of activated effector CD4 + T cells in these tissues was different in the aGVHD mouse model in this study. The exact character of activated effector CD4 + T cells in the development of aGVHD remains unclear.
As a transcriptome sequencing technique, ribonucleic acid sequencing (RNA-Seq) is used to study alternative splicing and other forms of alternative isoform expression [11]. There are, however, few reports on the mechanism of aGVHD-related target organ damage by RNA-Seq. In this study, using RNA-Seq, we first analyzed the gene expressing profiles of CD4 + T cells from the spleen, lung, and liver tissues sampled in auto-HSCT and allo-HSCT recipient mice and identified the differentially expressed genes (DEGs) to characterize CD4 + T cells in aGVHD.

Materials and Methods
2.1. Mice. Female C57BL/6 (H-2b) and male Balb/c (H-2d) mice were purchased from the Animal Experiment Center of Peking University Health Science Center. C57BL/6 mice, 8-10 weeks of age with a body weight of 19-21 g each, were donors of bone marrow hematopoietic stem cells and spleen lymphocytes. Balb/c mice, 10-12 weeks of age with a body weight of 20-23 g each, were recipients. After transplantation, mice were fed with water containing gentamicin for seven days. Mice were bred under specific pathogen-free (SPF) conditions at the Animal Breeding Center of Peking University Health Science Center. All experimental protocols were approved by the Animal Care and Use Committee of Peking University Health Science Center. The study design is illustrated in Figure S1 for the experimental and analysis scheme.

aGVHD Induction in a Murine
Model. The method of aGVHD induction using radiation-exposed mice and allogeneic bone marrow transplants was described previously [12][13][14]. Briefly, bone marrow cells were acquired from C57BL/6 mice. Splenocytes were obtained from C57BL/6 and Balb/c mice by Ficoll gradient centrifugation. Auto-HSCT and allo-HSCT recipient Balb/c mice were irradiated with 7.5 Gy of total body irradiation (TBI). Autogeneic hematopoietic stem cell transplant (auto-HSCT) Balb/c recipient mice were intravenously injected with 1 × 10 7 bone marrow (BM) cells. Allo-HSCT Balb/c recipient mice were intravenously injected with 1 × 10 7 BM cells and 1 × 10 7 splenic mononuclear cells. Balb/c recipient mouse were then evaluated based on a clinical score, including activity, skin integrity, weight loss, posture, and fur texture every other day for one month.

T Lymphocyte Enrichment.
CD4 + T cells were negatively selected from the liver, lung, and spleen of Balb/c recipient mice using a mouse CD4 + T cell isolation kit (BioLegend, San Diego, CA, USA) according to the manufacturer's instructions. CD4 + lymphocytes were prepared at a concentration of 5 × 10 6 /mL. The purity of the enriched populations was >90% as assessed by a flow cytometry.

Antibodies and Flow
Cytometry. The following antimouse antibodies from BioLegend (Cal: 100421, 103043, and 104411; US) were used in flow cytometry: CD4-PE/Cy7, CD44-BV510, and CD62L-APC. Spleens, lungs, and livers were excised and gently pressed through a cell strainer (70 μm) on the seventh day after transplantation. Infiltrating lymphocytes in the liver and lung were isolated using Percoll (Living, Beijing, China). Single-cell suspensions were blocked by incubating with anti-Fc receptor antibody for 10 minutes on ice. Cells were then labelled with fluorescently conjugated antibodies at 4°C for 30 min, followed by washing with cold PBS twice before performing membrane molecule analysis using a flow cytometer. Cells were sorted out using BD FAC-Symphony S6, and gating was done using the BD FACSDiva Software (BD Biosciences).

RNA Preparation.
Total RNA was extracted from CD4 + T cells isolated from the liver, lung, and spleen using the TRIzol Reagent (Thermo Fisher Scientific, Waltham, MA, USA) and the Qubit® RNA Assay Kit in a Qubit® 2.0 Fluorometer (Life Technologies, CA, USA). RNase-free DNase I (Invitrogen, Carlsbad, CA, USA) was used to digest potential genomic deoxyribonucleic acid (DNA). The digested products were then purified using magnetic beads (Axygen, Union City, CA, USA) to eliminate residual DNA and DNase. The concentrate of the extracted RNA was analyzed with a Qubit® RNA Assay Kit in Qubit®2.0 Fluorometer (Life Technologies, CA, USA). RNA purity and integrity were assessed using a NanoPhotometer® spectrophotometer (Implen, CA, USA) and an RNA 6000 Nano Assay Kit of the Agilent 2100 Bioanalyzer system (Agilent Technologies, CA, USA). Total RNA of all samples had a concentration ≥ 200 ng/mL, a mass ≥ 10 mg, and an RNA integrity number ðRINÞ ≥ 8:0.
2.6. Complementary DNA (cDNA) Library Construction. The poly-T oligo-attached magnetic beads were used to purify mRNA from the total RNA. Fragmentations were carried out using divalent cations under elevated temperatures in a First Strand Synthesis Reaction Buffer (5x). The cleaved RNA fragments served as the templates for the synthesis of the first-strand cDNA using a random hexamer primer and an M-MuLV reverse transcriptase (RNaseH-). The second strand cDNA was synthesized using DNA polymerase I and RNase H. The double-stranded cDNA was then degraded to blunt ends using an exonuclease/polymerase. After adding a terminal A at the 3 ′ ends in DNA fragments, library fragments were purified using the AMPure XP system and cDNA fragments of 250~300 bp in length were selected (Beckman Coulter, Beverly, USA). The sequencing adaptors and the cDNA fragments were mixed and ligated at 37°C for 15 min. A polymerase chain reaction (PCR) was performed with the Phusion High-Fidelity DNA Polymerase, universal PCR primers, and an index (X) primer. The integrity and size of the product were verified on an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA).
2.7. RNA-Seq Data Processing and Identification of DEGs. cDNA libraries of CD4 + T cells from the auto-HSCT and allo-HSCT models were sequenced at Novogene Bioinformatics Technology Co. (Beijing, China). Paired-end sequencing was performed on an Illumina Hiseq™ 4000 platform (Illumina, San Diego, CA, USA). FastQC software was used to preprocess raw reads to remove reads containing PCR duplicates, adapters, Ploy-N, and low-quality reads. In this step, Q20, Q30, and GC contents of the clean reads were calculated. All the subsequent analyses were based on clean data with high quality.
Gene model annotation files and the reference genome were downloaded from the genome website (http://apr2018 .archive.ensembl.org/Mus_musculus/Info/Annotation). Hisat2 v2.0.5 was used to build the index of the reference genome and to match paired-end clean reads with the reference genome. The sequencing process possibly has machine errors, and the inspection of the error rate can reflect the quality of sequencing data. The error rate was obtained through the conversion of the Phred score of the sequencing base through the formula Qphred = −10 log 10ðeÞ, where E is the base error rate [15].
Fragments per kilobase per million reads (FPKM) was used to calculate the transcript expression levels [16]. DESeq2 R package (1.16.1) was used to identify DEGs in CD4 + T cells from different organs between auto-and allo-HSCT mice [17].
According to the manufacturer's instructions, the clustering of the index-coded samples was carried out on a cBot Cluster Generation System by TruSeq PE Cluster Kit v3-cBot-HS (Illumina). The aggregations of all groups' DEGs were taken as the DEG sets. In the multigroup experiments, the clustering analysis of the DEG sets was performed to gather genes with similar expression patterns.
2.8. GO and Pathway Enrichment Analysis. All DEGs were mapped to terms in the Gene Ontology (GO) databases for functional enrichment analysis. padj < 0:05 was considered significantly enriched. GO terms were classified into three subgroups: biological process (BP), cellular component (CC), and molecular function (MF). Pathways with padj < 0:05 for DEGs were further annotated by the Kyoto Encyclopedia of Genes and Genomes (KEGG) and the Reactome automatic annotation server [18,19]. Genome enrichment analysis of DEGs in immune-related pathways was performed using Gene Set Enrichment Analysis (GSEA) [20].
2.9. PPI Network Construction Analysis. STRING (Search Tool for the Retrieval of Interacting Genes) was applied to predict the protein-protein interaction (PPI) network for DEGs [21]. The Cytoscape software was used to visualize and analyze the PPI network (confidence score ≥ 0:4; maximum number of interactors = 0) [22]. The cytoHubba plugin from the Cytoscape was applied to look for hub genes according to the Maximal Clique Centrality (MCC) [23]. The intuition behind MCC is that essential proteins tend to be clustered in a yeast protein-protein interaction network. The Molecular Complex Detection (MCODE) plugin was applied to make analysis of the gene network functional module. The method is based on vertex weighting based on local neighborhood density and outward traversal of local dense seed protein, and the dense regions are separated according to the given parameters [24].

Validation of Gene Expression by qRT-PCR.
Real-time quantitative reverse transcription polymerase chain reaction (qRT-PCR) was performed to validate the expression levels of 11 DEGs related to immunoregulation. The murine gene GAPDH served as an endogenous reference. The primers for all genes are presented in Table 1. Total RNA was extracted and used as a template to be reversely transcribed into single-stranded cDNA using a Revert Aid First-Strand cDNA Synthesis Kit (Thermo Fisher, Waltham, MA, USA). The ABI Prism 7500 PCR system was used for real-time monitoring of the SYBR Green dye (Life Technologies, Foster City, CA, USA) integrated into PCR products. Reactions were carried out in the following conditions: initial denaturation at 50°C for 2 min and then 95°C for 10 min, followed by 40 cycles of reaction at 95°C for 15 s and 60°C for 60 s. Each sample had three technical replicates (3 independent wells with the same template) to ensure the precision of the quantification. The 2 −ΔΔCt method was used for gene expression.

Histological
Research. Mice in each group were sacrificed on the 7th day after transplantation. The liver, lung, and spleen tissues were fixed in 10% buffered formalin and embedded in paraffin blocks for subsequent hematoxylin and eosin staining. Histologic changes were graded and scored by a pathologist blinded to the clinical status of the mice. Four items were assessed as described previously [25].
To determine the infiltration of T cells in the tissues of 3 BioMed Research International post-HSCT mice, immunohistochemical analysis was performed on the liver, lung, and spleen tissues of recipient mice at 7 days posttransplantation, stained with the primary antibodies anti-CD4 (1 : 20, Servicebio, Wuhan, China) and anti-Cxcl7 (1 : 20, Abcam, Cambridge, UK) as well as the appropriate horseradish peroxidase-conjugated secondary antibody (1 : 50, Servicebio, Wuhan, China).
2.15. Statistical Analysis. DEGs were calculated and identified by the DESeq2 R package [26]. Significance was calculated by the hypothesis-testing probability (p value), as well as tested and corrected by the Benjamini-Hochburg method (also known as the FDR value). Significant DEGs were defined to meet the following two requirements: (i) more than twofold change and (ii) padj < 0:05. The fold expression of each gene on CD4 + T cells was calculated by comparing to the read counts against the standardized control. For the sake of convenience, log 2 scale of fold change was adopted. Log 2 ðfold changeÞ > 1 means over twofold change. The experimental data of this study were obtained by more than three replicates. For other comparisons in this study, differences between groups were calculated by Student's t-test and ANOVA. p < 0:05 was considered significant. Differences between groups were analyzed by using the GraphPad Prism 7 software.

Result
3.1. Detect RNA Sequence Information of CD4 + T Cells in Target Organs of HSCT Mice. The samples of target organs were excised on the 7th day after transplantation of and infiltration by CD4 + T cells (Figure 1(a)). CD4 + T in different organs of Balb/c recipient mice had >92% purity (Figure 1(b)). CD44 + CD62L low effector cells were about 90% in allo-HSCT mice (Figure 1(c)). Clean reads that met the quality control criteria of preprocessing constituted >96% of all reads in all samples ( Figure S1A). The percentage of reads mapped in the exon region was above 94% ( Figure  S1B). The error rate of all samples was less than 0.1% ( Figure  S1C). For chain-specific libraries, the contents of A/T and G/C were almost the same ( Figure S1D). Boxplots of the expression levels of all genes revealed no sample errors and outliers in Figure S2A. The correlation coefficient R 2 and PCA analysis indicated the high comparability between groups ( Figure S2B-S2C). These results indicated the high quality of our data with great replicability within groups and comparability between groups.
3.2. DEG Identification of CD4 + T Cells from HSCT Mice. In order to detect the impact of a xenoantigen and the intertissue microenvironment on donor CD4 + T, the spleen CD4 + T cells of allogeneic transplanted mice without the influence of the xenoantigen and microenvironment were regarded as the control group, and CD4 + T cells from different organs of allogeneic transplanted mice affected by the xenoantigen and microenvironment were analyzed as an experimental group. The DEGs and samples were analyzed by clustering analysis (Figure 2(b)). The results indicated that genes with similar expression patterns were functionally correlated. Based on padj < 0:05 and log 2 ðfold changeÞ > 1 shown in the volcano map (Figure 2(a)), a total of 2996 (1673 upregulated and 1323 downregulated), 3955 (2131 upregulated and 1824 downregulated), and 5255 (3289 upregulated and 1966 downregulated) DEGs were identified in allospleen vs. autospleen, alloliver vs. autospleen, and allolung vs. autospleen. We further explored the group-specific and overlapped DEGs in these groups. There were in total 627 overlapped DEGs in the autospleen, allospleen, allolung, and alloliver ( Figure 2(c)).

Analysis of Immune-Related Pathways in HSCT Mice.
Gene functions of DEGs were annotated and classified from three aspects: biological process (BP), cellular component (CC), and molecular function (MF). After a comparative study, data discrepancies enriched in DEGs are shown, respectively: 380 BP terms, 14 CC terms, and 17 MF terms in allospleen vs. autospleen; 851 BP terms, 48 CC terms, and 141 MF terms in alloliver vs. autospleen; and 1339 terms, 47 CC terms, and 93 MF terms in allolung vs. autospleen. The top 10 GO terms are presented in Figure 3. It should be noted that 6 BP terms (regulation of leukocyte activation, leukocyte chemotaxis, regulation of lymphocyte activation, regulation of chemotaxis, chemotaxis, and leukocyte migration) and 3 MF terms (cytokine activity, chemokine activity, and cytokine receptor binding) were distinguished to be associated with immunoregulation ( Figure 3).
The top 10 most significantly enriched pathways are shown using KEGG and Reactome (Figures 4(a)-4(c)). Gene set enrichment analysis (GSEA) revealed 2 KEGG pathways (chemokine signaling pathway, cytokine and cytokine receptor interaction) and 2 Reactome pathways (chemokine receptors binding chemokines, immunoregulatory interactions between a lymphoid and a nonlymphoid cell). The four pathways mentioned above belonged to immunoregulation in GVHD (Figures 4(d)-4(f)).

Potential Immunoregulatory Genes in DEGs.
To gain insight into the relationships between DEGs, we constructed a PPI network with high confidence interactions using STRING and Cytoscape. The top 10 hub genes from a total of 60 hub genes from each group identified by the Cyto-Hubba plugin are listed in Table 2. 11 out of the 60 hub genes associated with immunoregulation have not been studied in GVHD, including Rgs1, Fpr2, Ppbp (Cxcl7), S1pr3, Npy, Hebp1, Kng1, Plg, Cxcl16, Ccl9, and Cxcl3. The PPI network among the 11 genes was constructed ( Figure S3A). According to MCC, Cxcl16, Kng1, and Cxcl7 were considered as more critical hub genes ( Figure S3B). Moreover, we applied the MCODE plugin to analyze PPIs to predict significant protein complexes. One significant module was identified in the DEG  7 BioMed Research International regulatory network, which contained 8 genes, including Cxcl3, Npy, Cxcl16, Kng1, Ccl9, S1pr3, Fpr2, and Cxcl7 (Figure S3C).

Validation of the Sequence Results In Vitro.
To confirm the credibility of the DEGs identified from RNA-Seq, we performed qRT-PCR to verify the expression level of the DEGs. The CD4 + T cells activated by MLR and cytokines (Figures 5(a) and 5(b)) were obtained by FACS sorting. Four related hub genes (Rgs1, Fpr2, Cxcl7, and S1pr3) were verified by activated CD4 + T cells through MLR and cytokine stimulation (Figures 5(c) and 5(d)). These genes showed consistent results as those found in RNA-Seq.

Validation of the Sequence Results
In Vivo. The validation data GSE126518 obtained from the GEO database was further used to verify the results (Figure 6(a)). We also selected the 11 novel hub genes that were significantly differentially expressed among the livers, lungs and spleens. As shown in Figure 6(b), most of the genes showed consistent results as those found in RNA-Seq. The protein levels of CXCL16, KNG1, and CXCL7 were verified by histological research and immunofluorescence staining. The results showed that the expression level of CXCL7 in the autospleen was higher than that in the allospleen (Figures 6(c) and 6(d)). However, the KNG1 protein and the CXCL16 protein were almost undetectable in tissues between the autospleen and allosp-leen, and there was no statistical difference about the expression level of KNG1 and CXCL16 between two groups (the data was not shown). Moreover, it was found that the CXCL7 expression level was also elevated in the autoliver and autolung ( Figure S4). In conclusion, the results basically verified the reliability of the result.

Discussion
Acute GVHD is induced by numerous graft-derived immunocytes, including the naive CD4 + T cells from donors, which can differentiate and exert cytotoxicity on multiple organ systems of the allo-HSCT recipients [4]. However, it was uncertain whether effector CD4 + T expressed differently among different organs. RNA-Seq, a method of comprehensive transcriptome profiling, generates a deep sequencing data for the direct quantification of the transcripts by next-generation sequencing (NGS) technologies [27]. With integrative transcriptome analysis, the study is aimed at exploring potential molecular mechanisms involved in CD4 + T cells in different target organs sampled from auto-HSCT and allo-HSCT aGVHD mouse models. We identified significant DEGs and critical enriched pathways, including the chemokine receptor binding chemokine pathway, immunoregulatory interactions between a lymphoid and a nonlymphoid cell, the cytokine and cytokine receptor interaction pathway, and the chemokine signaling pathway. 11 novel genes that likely

14
BioMed Research International participated in the immunoregulation of the pathogenesis of aGVHD were also identified. In the PPI network, significant protein complexes containing DEG regulators were detected. RNA-Seq was not free of biases. Results could be affected by differences in fragment size, transcript length, and differences in GC content [28]. In this study, to ensure a high quality and creditability of the data and results, we collected CD4 + T cells (>92%) and CD44 + CD62L low effector cells (about 90%) with high purity in allo-HSCT mice, and applied quality control filters in the preprocessing of the raw read before DEG identification. Boxplots, correlation heat maps, and PCA plots of the gene expression FPKM values provided further evidence for the high quality and creditability of our data.
Organs have unique structures, physiological functions, and tissue microenvironments, thereby forming local immune environments that may be closely related to the occurrence and development of aGVHD associated with the organs. The spleen is the largest peripheral lymphatic organ in the human body, harboring various types of lymphocytes, mainly B and T cells. These cells in the spleen exert immune responses when the body is invaded by pathogens. However, there are few studies on the local immune environments of target organs in aGVHD. To further explore the immunopathological mechanism of GVHD, we focused on DEGs of the spleen, liver, and lung hub genes with immunoregulatory functions in a murine model of allogeneic HSCT compared with splenic CD4 + T cells from mice with autologous transplantation.
We identified 627 overlapped DEGs and 400, 1872, and 1049 organ-specific DEGs by CD4 + T in the allospleen, allolung, and alloliver compared to the autospleen. Many DEGs were found associated with GVHD in previous studies, such as CD34 and CD33 [29,30]. We also identified many novel DEGs that were firstly reported to be associated with aGVHD in our study. Pathway enrichment analysis and PPI networks are helpful ways to extract the main information from a large list of genes, often used in transcriptome analysis. Due to the effect on the relevant pathway and the regulation of other genes, hub genes in PPI networks provide promising targets and research hotspots in a biophysiological and pathological developmental process study. We identified 60 hub genes in PPI networks of DEGs, including 11 novel genes that had not been reported in previous GVHD studies. 11 novel genes were later confirmed with qRT-PCR.
We observed that a significant portion of the hub genes were chemokine genes, especially the upregulated DEGs in the spleens and lungs from allogenic recipient mice. At the right is the number of CXCL7 per scale in the spleen. CXCL7 were detected by colocalization of CXCL7 (green) and DAPI (blue). CXCL7 deposition was quantified on a 0-3 scale to determine the amount of antibodies in the tissues. (d) Representative spleen histology staining revealing the expression of CXCL7 (brown) 7 d after HSCT in auto-HSCT and allo-HSCT mice. Cells were stained with CXCL7 (brown). Bars: 50 or 200 μm. * p < 0:05 and * * p < 0:01. The y-axis is the log2 scale of the expression level of hub genes. Each experiment was performed at least three times, and the results were presented as mean ± SD.