Shared Molecular Mechanisms between Alzheimer's Disease and Periodontitis Revealed by Transcriptomic Analysis

Objective. To investigate the genetic crosstalk mechanisms that link periodontitis and Alzheimer's disease (AD).
Background. Periodontitis, a common oral infectious disease, is associated with AD and considered a putative contributory factor to its progression. However, a comprehensive investigation of potential shared genetic mechanisms between these diseases has not yet been reported.
Methods. Gene expression datasets related to periodontitis were downloaded from the Gene Expression Omnibus (GEO) database, and differential expression analysis was performed to identify differentially expressed genes (DEGs). Genes associated with AD were downloaded from the DisGeNET database. Overlapping genes among the DEGs in periodontitis and the AD-related genes were defined as crosstalk genes between periodontitis and AD. The Boruta algorithm was applied to perform feature selection from these crosstalk genes, and representative crosstalk genes were thus obtained. In addition, a support vector machine (SVM) model was constructed by using the scikit-learn package in Python. Next, the crosstalk gene-TF network and the crosstalk gene-DEP (differentially expressed pathway) network were constructed. As a final step, shared genes among the crosstalk genes and periodontitis-related genes in DisGeNET were identified and denoted as the core crosstalk genes.
Results. Four datasets (GSE23586, GSE16134, GSE10334, and GSE79705) pertaining to periodontitis were included in the analysis. A total of 48 representative crosstalk genes were identified by using the Boruta algorithm. Three TFs (FOS, MEF2C, and USF2) and several pathways (i.e., JAK-STAT, MAPK, NF-kappa B, and natural killer cell-mediated cytotoxicity) were identified as regulators of these crosstalk genes. Among these 48 crosstalk genes and the chronic periodontitis-related genes in DisGeNET, C4A, C4B, CXCL12, FCGR3A, IL1B, and MMP3 were shared and identified as the most pivotal candidate links between periodontitis and AD.
Conclusions. Exploration of available transcriptomic datasets revealed C4A, C4B, CXCL12, FCGR3A, IL1B, and MMP3 as the top candidate molecular linkage genes between periodontitis and AD.


Introduction
An association between periodontitis and Alzheimer's disease (AD) has been demonstrated, and periodontitis reportedly confers risk for the incidence and progression of AD [1,2]. AD is a neurodegenerative disease characterized by the formation of amyloid-β peptide (AβP) plaques and intraneuronal neurofibrillary tangles (NFTs), which drive neuroinflammation in the brain [3]. Periodontitis, a chronic immunoinflammatory disease affecting the supporting structures of teeth, is multifactorial in nature, driven by polymicrobial dysbiosis and unfavorable shifts in the plaque biofilm composition, which disrupt host-microbial homeostasis [4,5]. Inflammation is considered the key connecting link between these diseases [6]. Two purported mechanistic links have been highlighted. First, proinflammatory mediators, including specific cytokines or chemokines in the periodontal milieu that enter systemic circulation, impose a systemic inflammatory burden, propagating inflammatory responses by microglial cells in the brain [6]. Second, periodontal pathogens may directly enter the brain via blood circulation or peripheral nerves, as evidenced by the discovery of the keystone periodontal pathogen Porphyromonas gingivalis (Pg) in AD patients' brains [7,8].
Here, we designed a bioinformatic study of existing experimental datasets to understand putative molecular links between periodontitis and Alzheimer's disease by identifying crosstalk genes, transcription factors, and signaling pathways involved in both disorders. The molecular mechanisms identified through this approach could suggest potential therapeutic targets particularly relevant to drug development and personalized medicine approaches.

Materials and Methods
2.1. Study Design. Figure 1 depicts a flowchart outlining the study workflow. Periodontitis-related DEGs and AD-related genes were obtained from the GEO database and the DisGeNET database, respectively. Crosstalk genes linking periodontitis and AD were identified as the AD-related genes that overlapped with significantly up- or downregulated DEGs in periodontitis. Thereafter, feature selection from the crosstalk genes was performed using a conventional recursive feature elimination (RFE) algorithm and the Boruta algorithm. The crosstalk genes obtained by feature selection were used to construct two networks to identify the transcription factors and the differentially expressed pathways that target these crosstalk genes. In the next step, "core" crosstalk genes were identified as the feature selection-obtained crosstalk genes that overlapped with chronic periodontitis-related genes in the DisGeNET database.
2.2. Procurement of Periodontitis-Related Datasets. Sample-matched whole-genome gene expression datasets from periodontitis were sourced and downloaded from the NCBI Gene Expression Omnibus (GEO). The eligibility criteria were as follows: datasets had to include established periodontitis samples as the experimental group and healthy gingival samples as the control group, with periodontitis defined in accordance with the case definition presented in the 2017 World Workshop: (1) interdental clinical attachment loss (CAL) detectable at ≥2 nonadjacent teeth or (2) buccal or oral CAL ≥ 3 mm with pocketing >3 mm detectable at ≥2 teeth [11].
2.3. Differential Gene Expression Analysis. Differential gene expression analysis of periodontitis-related datasets was carried out using the Linear Models for Microarray (limma) package [12] in the R project (version 3.0.1, http://www.rproject.org/) [13]. Three such datasets, GSE23586, GSE16134, and GSE10334, were sourced and analyzed. Genes with p value < 0.05 and |logFC (fold change)| ≥ 1 were regarded as significant differentially expressed genes (DEGs). For the fourth dataset, GSE79705, the screening range of DEGs was broadened by relaxing the thresholds to p value < 0.05 and |logFC| > 0.
Next, a Venn diagram (http://bioinformatics.psb.ugent.be/webtools/Venn/) was drawn to identify shared genes among the DEGs identified from the four datasets. The common up/downregulated DEGs in the four datasets were used for the following analyses, and DEGs that were not common to all the datasets were excluded.
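The thresholding and intersection steps described above can be sketched in Python as follows. This is only a minimal illustration, not the authors' pipeline (which used limma in R and a web-based Venn tool): the function names, the gene identifiers, and all numeric values are hypothetical placeholders, and only the cutoffs are taken from the text.

```python
def significant_degs(rows, p_cut=0.05, lfc_cut=1.0, strict=False):
    """Split limma-style rows (gene, logFC, p value) into up/down DEG sets.

    strict=False applies |logFC| >= lfc_cut; strict=True applies |logFC| > lfc_cut.
    """
    up, down = set(), set()
    for gene, log_fc, p_value in rows:
        passes = abs(log_fc) > lfc_cut if strict else abs(log_fc) >= lfc_cut
        if p_value < p_cut and passes:
            (up if log_fc > 0 else down).add(gene)
    return up, down


def common_degs(per_dataset):
    """Keep only genes that are up (or down) in every dataset, as a Venn intersection would."""
    ups = [up for up, _ in per_dataset]
    downs = [down for _, down in per_dataset]
    return set.intersection(*ups), set.intersection(*downs)


# Hypothetical toy rows: (gene, logFC, p value). Not real expression data.
dataset_a = [("GENE_A", 2.1, 0.001), ("GENE_B", -1.4, 0.01), ("GENE_C", 0.2, 0.6)]
dataset_b = [("GENE_A", 1.3, 0.02), ("GENE_B", -1.0, 0.03), ("GENE_D", 1.8, 0.2)]
dataset_c = [("GENE_A", 0.4, 0.04), ("GENE_B", -0.2, 0.01)]

results = [
    significant_degs(dataset_a),                            # |logFC| >= 1
    significant_degs(dataset_b),                            # |logFC| >= 1
    significant_degs(dataset_c, lfc_cut=0.0, strict=True),  # relaxed: |logFC| > 0
]
common_up, common_down = common_degs(results)
print(sorted(common_up), sorted(common_down))
```

Keeping up- and downregulated genes in separate sets means a gene is retained only if its direction of change agrees across all datasets, which matches the "common up/downregulated DEGs" wording.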
2.6. Identification of Crosstalk Genes and Construction of the Crosstalk Gene-Related PPI Network. AD-related genes were downloaded from the DisGeNET database [26]. The AD-related genes that overlapped with the up- and downregulated periodontitis-related DEGs were identified. These overlapping genes were regarded as "crosstalk" genes linking AD and periodontitis and further used for constructing a crosstalk gene-related PPI network. The crosstalk gene-related PPI network consisted of four types of nodes, namely, (1) DEGs dysregulated in periodontitis (not related to AD), (2) crosstalk genes or AD-related genes which were also DEGs dysregulated in periodontitis, (3) AD-related genes (not dysregulated in periodontitis), and (4) other genes (neither related to AD nor dysregulated in periodontitis).
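The overlap definition and the four node types listed above amount to simple set operations. The following Python sketch illustrates them under stated assumptions: the gene identifiers are hypothetical placeholders, and the function names are illustrative rather than part of any published tool.

```python
def crosstalk_genes(up_degs, down_degs, ad_genes):
    """Crosstalk genes are AD-related genes that are also periodontitis DEGs."""
    return up_degs & ad_genes, down_degs & ad_genes


def node_type(gene, degs, ad_genes):
    """Assign one of the four node types used in the crosstalk gene-related PPI network."""
    is_deg, is_ad = gene in degs, gene in ad_genes
    if is_deg and is_ad:
        return "crosstalk gene"
    if is_deg:
        return "periodontitis DEG (not AD-related)"
    if is_ad:
        return "AD-related gene (not dysregulated in periodontitis)"
    return "other gene"


# Hypothetical placeholder sets, not real DEG or DisGeNET content.
up_degs = {"GENE_A", "GENE_B"}
down_degs = {"GENE_C", "GENE_D"}
ad_genes = {"GENE_A", "GENE_C", "GENE_E"}

up_crosstalk, down_crosstalk = crosstalk_genes(up_degs, down_degs, ad_genes)
all_degs = up_degs | down_degs
labels = {g: node_type(g, all_degs, ad_genes)
          for g in ["GENE_A", "GENE_B", "GENE_E", "GENE_F"]}
print(up_crosstalk, down_crosstalk, labels)
```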
2.7. Feature Selection from Crosstalk Genes. Since GSE16134 had the largest sample size among the periodontitis-related datasets, it was used as the training and test set. The other three datasets (GSE23586, GSE10334, and GSE79705) were used as validation sets. First, expression values of the crosstalk genes (identified in the previous step) from GSE16134 were used as input for feature selection with the Boruta algorithm in the R project [27] and the conventional recursive feature elimination (RFE) algorithm [28]. Each gene was regarded as a feature.
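The core idea behind Boruta, comparing each real feature against shuffled "shadow" copies, can be illustrated with a short Python sketch. This is a deliberately simplified stand-in and not the Boruta implementation used in the study: Boruta relies on random forest importance and a statistical test, whereas this sketch assumes a simple class-separation score and a fixed hit-rate cutoff. All names and data are hypothetical.

```python
import random
import statistics


def separation_score(values, labels):
    """Stand-in importance: |difference of class means| / sum of within-class SDs."""
    cases = [v for v, y in zip(values, labels) if y == 1]
    controls = [v for v, y in zip(values, labels) if y == 0]
    spread = statistics.pstdev(cases) + statistics.pstdev(controls) + 1e-9
    return abs(statistics.mean(cases) - statistics.mean(controls)) / spread


def shadow_feature_selection(expression, labels, n_rounds=50, min_hit_rate=0.9, seed=0):
    """Keep genes whose score beats the best shuffled (shadow) score in most rounds."""
    rng = random.Random(seed)
    hits = {gene: 0 for gene in expression}
    for _ in range(n_rounds):
        shadow_scores = []
        for values in expression.values():
            shuffled = values[:]      # shadow copy: same values, sample order destroyed
            rng.shuffle(shuffled)
            shadow_scores.append(separation_score(shuffled, labels))
        threshold = max(shadow_scores)  # analogous to Boruta's shadowMax
        for gene, values in expression.items():
            if separation_score(values, labels) > threshold:
                hits[gene] += 1
    return [gene for gene, h in hits.items() if h / n_rounds >= min_hit_rate]


# Hypothetical toy data: 10 cases (label 1) followed by 10 controls (label 0).
labels = [1] * 10 + [0] * 10
expression = {
    "GENE_A": [5 + 0.1 * i for i in range(10)] + [0.1 * i for i in range(10)],  # informative
    "GENE_B": [(7 * i) % 10 / 10 for i in range(20)],                            # uninformative
}
selected = shadow_feature_selection(expression, labels)
print(selected)
```

The shadow threshold gives each gene a data-driven bar to clear, which is why the method can reject features without a hand-picked importance cutoff.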
Figure 1: Flowchart depicting study workflow.

BioMed Research International

2.8. Support Vector Machine (SVM) Modeling Using Feature-Selected Crosstalk Genes. The expression values in GSE16134 and GSE10334 were scale-standardized. Next, it was examined whether the crosstalk genes obtained by feature selection were present in the four periodontitis-related datasets (i.e., GSE23586, GSE16134, GSE10334, and GSE79705). The gene expression values of these feature selection-obtained crosstalk genes were extracted from these datasets. If the number of genes extracted from a given dataset was lower than the number of feature selection-obtained crosstalk genes, the expression values of the missing genes were treated as missing values and represented by the NA symbol. The missing values were processed by using the DMwR package [29] in R, and the K-Nearest Neighbors (KNN) algorithm was used to impute these missing values (NA) in that dataset. After imputation, all four periodontitis-related datasets contained all the feature selection-obtained crosstalk genes. Thereafter, the scikit-learn package [30] was used to perform a grid search, and the best hyperparameters of a support vector machine (SVM) model were found by using 5-fold cross-validation (CV) [31]. An SVM classifier model was established by using data from GSE16134 as the training set and test set, where the samples of the GSE16134 dataset were split 60% : 40% into the training set and test set, respectively. Data from the other three datasets (GSE23586, GSE10334, and GSE79705) were used as the validation set. The decision function method was used to obtain the score for each sample. Next, receiver operating characteristic (ROC) curves for the four datasets were generated by using the pROC package and displayed using the ggplot2 package in R.
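To make the scoring step concrete, the following Python sketch shows how accuracy and the area under the ROC curve can be derived from per-sample decision scores. It assumes the scores have already been produced by a fitted classifier (positive score meaning "periodontitis"); the scores and labels below are hypothetical, and the study itself used the pROC package in R rather than this hand-written calculation.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


def accuracy(scores, labels, threshold=0.0):
    """Classify a sample as a case when its decision score exceeds the threshold."""
    predictions = [1 if s > threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical decision scores for 4 cases (label 1) and 4 controls (label 0).
scores = [1.2, 0.4, -0.3, 0.8, -1.1, -0.2, 0.1, -0.9]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

auc = roc_auc(scores, labels)
acc = accuracy(scores, labels)
print(auc, acc)
```

AUC depends only on how the scores rank cases against controls, whereas accuracy also depends on the chosen threshold, so the two measures can diverge when a validation dataset is shifted relative to the training data.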
2.10. Pathway Analysis of the Crosstalk Genes. Human data describing relationships between signaling pathways and genes were downloaded from the KEGG database, and all pathways related to the feature selection-obtained crosstalk genes were extracted. The expression levels of these crosstalk gene-related pathways were plotted as a heatmap, and differential expression analysis was applied to the four periodontitis-related datasets using the R package limma to identify the DEPs (differentially expressed pathways). For three datasets (GSE23586, GSE16134, and GSE10334), the pathways with p value < 0.05 and |logFC| ≥ 1 were regarded as DEPs, while for the dataset GSE79705, the pathways with p value < 0.05 and |logFC| > 0 were regarded as DEPs.
2.11. Classification Performance of the Core Crosstalk Genes. The genes related to periodontitis were downloaded from the DisGeNET database [26]. The overlap between the periodontitis-related genes obtained from the DisGeNET database and the feature selection-obtained crosstalk genes was analyzed, and the overlapping genes were termed core crosstalk genes. The corresponding expression values of these overlapping genes in the four periodontitis-related datasets were obtained, and ROC curves were drawn.

Identification of Periodontitis-Related DEGs and Their Functions. Table 2 shows the number of up- and downregulated DEGs that were identified in each of the four datasets. Figure 3 shows the biological processes and signaling pathways in which the up- and downregulated DEGs were significantly enriched.

The Hub Genes Identified by the Periodontitis-Related PPI Network. Figure 4 depicts a PPI network based on the periodontitis-related DEGs, and Table 3 shows the topological characteristics of the top 30 nodes in the PPI network. As seen in Table 3, several genes with the highest degree were identified as hub genes. The upregulated hub genes included SMAD3, TRIM27, VIM, YWHAH, and FOS, and the downregulated hub genes included MYC, HSPB1, DDB1, RPS3, KAT5, SMARCA4, RPL13, and PLCG1.

Crosstalk Genes Bridging Alzheimer's Disease and Periodontitis. In total, 51 upregulated crosstalk genes and 41 downregulated crosstalk genes were identified and are listed in Table 4. The crosstalk gene-related PPI network shown in Figure 5 consisted of 3496 nodes and 5141 edges. The topological characteristics of the top 30 nodes in this network are presented in Table 5. Hub crosstalk genes with the highest degree included MYC, HSPB1, VIM, KAT5, RPL13, FOS, and CDH1.

Crosstalk Genes Obtained by Feature Selection. A total of 48 crosstalk genes were selected by using the Boruta algorithm (Figure 6(a)). In addition, 62 crosstalk genes were selected by using the RFE algorithm (Figure 6(b)). All 48 genes obtained by using the Boruta algorithm were included in the 62 genes obtained by the RFE algorithm, indicating that this 48-gene set was representative of the characteristics of all 92 crosstalk genes.
3.6. Classification Accuracy Using the Feature Selection-Obtained Crosstalk Genes. The 48 genes obtained by feature selection were not all present in each of the four periodontitis-related datasets. The gene expression profiles of GSE16134 and GSE10334 included all of the 48 crosstalk genes, whereas GSE23586 included 44 crosstalk genes and GSE79705 included 45 crosstalk genes. The classification performance of these 48 crosstalk genes for the four datasets is shown in Table 6. For the test set GSE16134 and the validation set GSE10334, the accuracy was high at 91.94% and 88.26%, respectively. By comparison, the accuracy for the other two datasets, GSE23586 and GSE79705, was low at 50% and 66.66%, respectively.

ROC Curves for the Four Periodontitis-Related Datasets. As shown in Figure 7, the AUC (area under the curve) values for the GSE16134 test set and the GSE10334 validation set were high at 95.77% and 90.53%, respectively, congruent with the results in Table 6. It was thus inferred that the classifier performance was adequate only when the sample size of a validation set was similar to that of the training and test datasets; accordingly, poor performance was noted for GSE23586 and GSE79705, which have much lower sample numbers.
3.8. The Identification of Transcription Factors Regulating the Crosstalk Genes. As shown in Figure 8, the TF-crosstalk gene target network consisted of 388 nodes and 1178 edges. Several transcription factors that were also DEGs played critical roles by regulating the largest number of crosstalk genes, for example, FOS, MEF2C, and USF2 (Table 7).

Signaling Pathways Enriched in the Crosstalk Genes. From the 48 feature selection-obtained crosstalk genes, 37 crosstalk genes were found among gene-pathway interaction pair data in the KEGG database. A total of 137 KEGG pathways corresponded to these 37 crosstalk genes. Figure 9 shows the expression values of these 137 pathways in the four periodontitis-related datasets. The numbers of crosstalk gene-related DEPs obtained from each of the four periodontitis-related datasets are listed in Table 8. The interaction relationships between crosstalk genes and DEPs are depicted in Figure 10, showing that several DEPs were dysregulated in at least two datasets, including cytokine-related pathways (cytokine-cytokine receptor interaction, chemokine, and IL-17), immune cell-related pathways (T cell receptor, B cell receptor, Th1 and Th2 cell differentiation, Th17 cell differentiation, natural killer cell-mediated cytotoxicity, and osteoclast differentiation), JAK-STAT signaling, NOD-like receptor signaling, MAPK signaling, Toll-like receptor signaling, NF-kappa B signaling, and C-type lectin receptor signaling.

Discussion
The present study addressed shared genetic mechanisms and molecular links between periodontitis and Alzheimer's disease by identifying the genes, signaling pathways, and TFs that were most robustly associated with both diseases. These findings are largely substantiated by preexisting experimental data.

Six genes, C4A, C4B, CXCL12, FCGR3A, IL1B, and MMP3, were identified as the most significant crosstalk genes linking chronic periodontitis and Alzheimer's disease. C4B and C4A, respectively, encode the basic and acidic forms of complement factor 4, and C4 gene deficiency has been noted to predispose to the development of severe chronic periodontitis [38]. In AD, the expression levels of C4 mRNA were shown to be increased 3.27-fold in temporal cortex samples as compared to controls [39]. The CXCL12 (C-X-C motif chemokine ligand 12) gene encodes the ligand of the C-X-C motif chemokine receptor 4 (CXCR4). CXCL12 expression in the gingival crevicular fluid of periodontitis patients was shown to be significantly higher than that of healthy subjects, suggesting that it might play a role in enhancing neutrophil migration and furthering the progression of periodontitis [40]. A decreased level of CXCL12 in Alzheimer's disease has been documented as affecting cognitive function, impairing learning and memory [41]. FCGR3A (Fc fragment of IgG receptor IIIa) encodes a receptor for the Fc portion of immunoglobulin G, and FCGR3A polymorphisms have been shown to confer susceptibility to periodontitis in Caucasians [42]. In AD, the Fc gamma receptor (FcγR) was recently found to exacerbate neurodegeneration [43]. Cytokines are considered a primary link between chronic periodontitis and Alzheimer's disease as they can enter systemic circulation through periodontal pockets [6]. The classical proinflammatory cytokine, IL1B, is elevated in periodontitis and can induce resorption of alveolar bone [44]. In AD, IL1B gene polymorphisms are linked to disease susceptibility [45]. MMP3 (matrix metalloproteinase 3) is implicated in the progression of chronic periodontitis and can degrade the periodontal tissue matrix [46]. Elevated brain levels of MMP3 have been associated with the duration of Alzheimer's disease, and MMP3 has been found to increase the activity of MMP9, thereby indirectly promoting aggregation and cerebral accumulation of tau deposits [47].
More interestingly, the six genes discussed in the last paragraph were also found to be the molecular crosstalks in [48,49]. The immune-inflammatory mediators (e.g., cytokines and chemokines) abundantly expressed during periodontal inflammation can circulate into the bloodstream and travel into the brain by crossing the blood-brain barrier (BBB) and impact the function of the CNS [50]. Therefore, the crosstalk between the peripheral immune system and the CNS might be an important mechanism by which periodontitis increases the risk of AD. The potential roles of the six crosstalk genes in linking periodontal disease and AD, especially by means of neuroimmune interaction, are described here. For example, the complement components C4A and C4B, highly expressed in periodontal disease, were found to modulate the T cell immune response by stimulating the activation and migration of T cells [51-53]. The migration of T cells enhanced by C4A and C4B might allow T cells to traffic across the BBB and enter the brain. As another example, the chemokine-receptor system CXCL12/CXCR4-7 was found to be a significant player at the neuroimmune interface [54]. On the one hand, the chemokine CXCL12 mediated the immune-inflammatory response in the CNS by recruiting lymphocytes and macrophages [55]. On the other hand, CXCL12 can lead to neurotoxicity and neurodegeneration by activating the neuronal survival-associated G protein-activated inward rectifier K(+) (GIRK) channel [54]. FCGR3A (also named CD16) is essential for the antibody-dependent cellular cytotoxicity (ADCC) mediated by natural killer (NK) cells [56]. The increased cytotoxic activity of NK cells was found to cause the dysregulation of protein kinase C and further led to the cognitive deficits in Alzheimer's disease, indicating the contribution of immunological factors to the dysfunction of the CNS [57]. IL1B, which was upregulated in periodontitis and transported through the vascular circulation into the brain, was found to promote neuroinflammation by enhancing the expression of leukocyte chemotactic chemokines, cell surface adhesion molecules, cyclooxygenases, and MMPs within the brain parenchyma [58]. Likewise, MMP3, abundantly produced in periodontitis, was also found to be associated with neuroinflammation via activating microglial cells, as well as participating in BBB breakdown through the proteolysis of fibronectin and type IV collagen [59]. Taken together, existing evidence supports the involvement of the six crosstalk genes identified in the present research in the CNS dysfunction of Alzheimer's disease caused by the periodontitis-triggered peripheral systemic host immune response.

Figure 6(a): Boruta importance plot of the crosstalk genes (axis label: Importance).

Three transcription factors, FOS, MEF2C, and USF2, were identified as related to the regulation of the crosstalk genes and were also found to be dysregulated in chronic periodontitis. The proto-oncogene FOS (also named C-Fos) was found to be involved in the transcriptional regulation of collagenase and cell proliferation genes in periodontal gingival fibroblasts [60]. In AD, FOS is reported to initiate amyloid-β-mediated apoptosis and found to be increased in the hippocampal regions of AD patients [61]. Myocyte-specific enhancer factor 2C (MEF2C) was identified as a critical transcription factor involved in the coexpression network of chronic periodontitis [62]. Genome-wide association studies (GWAS) have shown the linkage between mutation of MEF2C and aging-associated late-onset Alzheimer's disease [63,64]. Experimentally, a lack of MEF2C expression was shown to exaggerate microglial response and negatively affect brain function [65]. The USF2 (upstream transcription factor 2, C-Fos interacting) transcription factor is reported to enhance osteogenic differentiation of periodontal ligament cells (PDLCs) [66]; however, its involvement in periodontal inflammation has not been reported. In the context of AD, the USF2 gene was shown to regulate the expression of genes Dhcr24, Aplp2, Tia1, Pdrx1, Vdac1, and Syn2, which drive the neuropathological mechanisms [67]. However, a study of Japanese participants found that the single nucleotide polymorphisms of the USF2 gene were not significantly related to the onset of AD [68].

Figure 10: The crosstalk gene-differentially expressed pathway interaction network.
Differentially expressed pathways were identified from the crosstalk gene-pathway network, and several pathways including JAK-STAT, MAPK, NF-kappa B, and natural killer cell-mediated cytotoxicity were found to be the most robust differentially expressed pathways in at least two periodontitis datasets. Overall, experimental evidence supports these as linkage mechanisms between periodontitis and AD. The activation of the JAK-STAT pathway induced by Porphyromonas gingivalis lipopolysaccharide (LPS) and nicotine was shown to increase the expression of cyclooxygenase-2 (COX-2), prostaglandin E2 (PGE2), and proinflammatory cytokines in osteoblasts, thus further accelerating periodontitis progression [69,70]. In AD, the JAK-STAT pathway is reported as a therapeutic target, and blocking this pathway can protect against neuroinflammation and dopaminergic neurodegeneration [71]. The MAPK pathway is the upstream signaling intermediate to many inflammatory mediators such as TNF-α, IL-1β, IL-6, and prostaglandin E2 [72], and the blockage of this pathway could be beneficial for treating inflammatory diseases like chronic periodontitis and AD [73,74]. The activation of MAPK signaling is noted to promote the production of MMPs and RANKL, leading to osteoclastogenesis and the acceleration of alveolar bone loss [73]. MAPK signaling is also implicated in multiple aspects of the neuropathology of AD, such as promoting neuroinflammation, amyloid-beta toxicity and aggregation, autophagy, and apoptosis [75]. The overexpression of NF-κB signaling plays a pivotal role in periodontitis-associated bone destruction by promoting the differentiation and activation of osteoclasts [76]. The blockade of NF-κB signaling is found to trigger detrimental neural alterations including neuroinflammation, activation of microglia, oxidative stress-related complications, and apoptosis [77]. Natural killer (NK) cells are important regulators of innate and adaptive immunity and are closely linked to the regulation of cytotoxicity [78]. Experimental data show that NK cells can directly recognize the Fusobacterium nucleatum pathogen, leading to alveolar bone resorption and periodontitis [79]. The overactivity of NK cells is also purported to play a driving role in the progression of AD by producing a series of proinflammatory cytokines [80].

Figure 11: The ROC curves for 6 chronic periodontitis-related genes, C4A, C4B, CXCL12, FCGR3A, IL1B, and MMP3, in the four periodontitis-related datasets.
The findings of this in silico analysis must be considered in light of the strengths and limitations of this work. By using a machine learning-based feature selection method as the core technique, the most putatively robust crosstalk genes could be identified. Furthermore, functional molecular links were also analyzed in terms of differentially expressed pathways. The integrated analysis of multiple periodontitis-related GEO datasets provided a larger sample size, which should improve the accuracy of the computational predictions in the present study. The major limitation of the current approach is that no experimental validation of the identified pivotal genetic molecular linkage candidates was performed. This work has multiple implications for future research. Experimental and clinical studies focused on these candidates could be valuable for identifying shared susceptibility, exacerbating pathogenic mechanisms, biomarkers, and therapeutic targets relevant to precision medicine and drug development or repurposing.

Conclusion
Bioinformatic analysis integrating experimental transcriptomic data from Alzheimer's disease and periodontitis revealed the most robust potentially shared molecular linkages. Six crosstalk genes, C4A, C4B, CXCL12, FCGR3A, IL1B, and MMP3, three transcription factors, FOS, MEF2C, and USF2, and several pathways, JAK-STAT, MAPK, NF-kappa B, and natural killer cell-mediated cytotoxicity, emerged as top candidate shared molecular linkage entities and merit future research in experimental and clinical studies.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Ethical Approval
As this study only applied bioinformatic techniques based on computational analyses of publicly available datasets, it did not require ethical approval.

Consent
Consent for publication is not applicable in this study because no individual person's data was used.

Conflicts of Interest
The authors declare no potential conflict of interest with respect to the authorship and publication of this paper.