A Novel Schizophrenia Diagnostic Model Based on Statistically Significant Changes in Gene Methylation in Specific Brain Regions

Objective. The present study identified methylation patterns of schizophrenia- (SCZ-) related genes in different brain regions and used them to construct a novel DNA methylation-based SCZ diagnostic model. Methods. Four DNA methylation datasets representing different brain regions were downloaded from the Gene Expression Omnibus. The common differentially methylated genes (CDMGs) in all datasets were identified to perform functional enrichment analysis. The differential methylation sites of 10 CDMGs involved in the largest numbers of neurological or psychiatric-related biological processes were used to construct a DNA methylation-based diagnostic model for SCZ in the respective datasets. Results. A total of 849 CDMGs were identified in the four datasets, but the methylation sites as well as degree of methylation differed across the brain regions. Functional enrichment analysis showed CDMGs were significantly involved in biological processes associated with neuronal axon development, intercellular adhesion, and cell morphology changes and, specifically, in PI3K-Akt, AMPK, and MAPK signaling pathways. Four DNA methylation-based classifiers for diagnosing SCZ were constructed in the four datasets, respectively. The sample recognition efficiency of the classifiers showed an area under the receiver operating characteristic curve of 1.00 in three datasets and >0.9 in one dataset. Conclusion. DNA methylation patterns in SCZ vary across different brain regions, which may be a useful epigenetic characteristic for diagnosing SCZ. Our novel model based on SCZ-gene methylation shows promising diagnostic power.


Introduction
Schizophrenia (SCZ) is a serious mental illness [1]. The World Health Organization estimates the global lifetime prevalence of SCZ at 3.8-8.4% [2]. SCZ is a severe psychosis induced by multiple factors, and it manifests as a clinical syndrome with many symptoms. The course of the disease can include repeated relapses that aggravate the disease and reduce quality of life. Some patients suffer from depression or mental disability [3]. Currently, clinical diagnosis of SCZ is based on the diagnostic scale of the International Mental Disorders Classification [4]. The heterogeneous nature of SCZ pathogenesis has made it impossible so far to identify a single, reliable diagnostic biomarker or model.
Previous studies have shown that epigenetic changes may be related to the pathology of SCZ [5]. DNA methylation is the most stable epigenetic modification, and it can lead to changes in phenotype although the DNA sequence remains unchanged [6,7]. DNA methylation may affect neuronal activity, transcriptional output, and synaptic function. Thus, methylation may be important in the pathology of SCZ [8]. Abnormal methylation of some genes, such as DRD2 [9], DLGAP2 [10], or COMT [11], may be associated with the occurrence and development of SCZ. In addition, a few DNA methylation-based classifiers for SCZ diagnosis have been reported [12,13]. However, these studies were limited because they relied more on statistical associations without in-depth analysis of biological function.
In the present study, we identified common differentially methylated genes (CDMGs) shared across different brain regions in SCZ patients. We used the 10 CDMGs involved in the greatest number of neurological or psychiatric-related biological processes to construct a DNA methylation-based classifier for SCZ diagnosis. This method may be more biologically relevant than previous ones and may provide new insights to guide future research in SCZ.

DNA Methylation in SCZ.
In the present study, four SCZ methylation datasets (GSE89702, GSE89703, GSE89705, and GSE89706) were downloaded from the Gene Expression Omnibus database (https://www.ncbi.nlm.nih.gov/geo/). DNA was isolated from postmortem brain samples. GSE89702 was derived from cerebellum from the Douglas Bell-Canada Brain Bank, and it included 16 SCZ samples and 17 normal controls. GSE89703 was derived from hippocampus, and it included 14 SCZ samples and 13 normal controls. GSE89705 was derived from striatum, and it included 16 SCZ samples and 17 normal controls. GSE89706 was derived from striatum from the London Brain Bank for Neurodegenerative Disorders, and it included 21 SCZ samples and 28 normal controls. The platform was GPL13534, whose annotation included each probe, its position on the chromosome, and the corresponding gene name. The normalized methylation data matrix contained the beta value (ranging from 0 to 1) of each probe, with probe IDs in rows and patient IDs in columns. The workflow of the present study is shown in Figure 1.

Differential Methylation Analysis.
Although GSE89705 and GSE89706 were both taken from the striatum, they were not combined, because we did not know whether sample processing differed between the two brain banks and because combining datasets may result in some residual inflation, according to a published study [14]. Thus, differential methylation analysis was performed separately in the four datasets using the limma package [15] in R software. The sample size of each dataset was relatively small, and thus the statistical significance of differential methylation sites may be relatively low. With more rigorous P value filtering, genes with potential biological function might be filtered out. Therefore, in the present study, differences with a P value <0.05 were considered significant.

Enrichment Analysis.
To explore the biological functions of CDMGs that may be related to SCZ, Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) enrichment analyses were performed using the clusterProfiler package [16] in R. GO terms and KEGG pathways with P values <0.05 were considered significant.
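As an illustration of the over-representation test that typically underlies this kind of enrichment analysis, the following Python sketch computes a one-sided hypergeometric P value for a single term. It is not the implementation of the R package cited above; the function name `enrichment_p_value` and the toy counts are assumptions introduced here for illustration and are not taken from the datasets of this study.

```python
from math import comb


def enrichment_p_value(n_background, n_term, n_selected, n_overlap):
    """One-sided hypergeometric P value: P(overlap >= n_overlap).

    n_background: genes in the background universe
    n_term:       background genes annotated to the term
    n_selected:   genes in the tested list (for example, the CDMGs)
    n_overlap:    tested genes that are annotated to the term
    """
    upper = min(n_term, n_selected)
    total = comb(n_background, n_selected)
    # Sum the probabilities of observing n_overlap or more annotated genes.
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy counts, chosen only to keep the arithmetic easy to check.
p_value = enrichment_p_value(n_background=20, n_term=5, n_selected=4, n_overlap=3)
print(f"P = {p_value:.4f}")
```

A term would then be reported as significant when this P value falls below the 0.05 threshold used in the present study.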

Variable Selection and LASSO Classifier.
Differentially methylated genes involved in larger numbers of neurological or psychiatric-related biological processes are thought to have methylation levels more likely to be associated with SCZ than genes involved in fewer such processes. The differentially methylated genes differ between brain regions, and the differential methylation sites of the same gene may also differ between brain regions. We therefore tried to build a brain region-specific, methylation-based classifier for SCZ. The differential methylation sites of the ten CDMGs involved in the most neurological or psychiatric-related biological processes were used to construct a diagnostic model of SCZ. The samples of each of the four datasets were randomly assigned to a training set (75%) and a test set (25%). The four training sets were used to select variables (differential methylation sites) for establishing a DNA methylation-based classifier in each dataset, and the test sets were used to validate the four classifiers. The glmnet package [17] in R, which implements the least absolute shrinkage and selection operator (LASSO) [18], was used to select variables and construct a DNA methylation-based diagnostic classifier.
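The variable-selection behavior of LASSO can be illustrated with a minimal Python sketch of L1-penalized logistic regression. This is not the glmnet implementation, which fits models by coordinate descent over a path of penalty values; the sketch swaps in a simpler proximal-gradient solver with one fixed penalty. The function names, the penalty `lam`, the step size, and the toy beta values are all assumptions introduced here for illustration.

```python
import math


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def center_columns(X):
    """Subtract each column mean so that sites are on a comparable footing."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(X[0]))]
    return [[x - m for x, m in zip(row, means)] for row in X]


def fit_l1_logistic(X, y, lam, step=0.5, n_iter=500):
    """Proximal-gradient fit; the intercept is left unpenalized."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        # Gradient step followed by soft-thresholding sets weak sites to zero.
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


# Hypothetical beta values: site 0 differs between groups, site 1 does not.
beta_values = [
    [0.80, 0.5], [0.90, 0.6], [0.85, 0.4], [0.75, 0.5],  # SCZ samples
    [0.20, 0.5], [0.10, 0.6], [0.15, 0.4], [0.25, 0.5],  # control samples
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights, intercept = fit_l1_logistic(center_columns(beta_values), labels, lam=0.05)
selected_sites = [j for j, wj in enumerate(weights) if wj != 0.0]
print("weights:", weights, "selected sites:", selected_sites)
```

In this toy example only the informative site keeps a nonzero coefficient, which is the sense in which LASSO both selects variables and builds the classifier. The random 75%/25% split described above is omitted from the sketch for brevity.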

Evaluation of Methylation-Based SCZ Diagnostic Model.
The diagnostic performance of the DNA methylation-based SCZ diagnostic model was evaluated by accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and area under the receiver operating characteristic curve (AUC) as analyzed in the pROC package [19] in R software.
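The evaluation measures listed above follow directly from the confusion matrix and from pairwise comparisons of predicted scores. The Python sketch below shows one way to compute them; it is not the pROC implementation, and the function names, the 0.5 decision threshold, and the toy scores are assumptions introduced here for illustration.

```python
def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(y_true, y_pred):
    """Confusion-matrix metrics, with 1 = SCZ and 0 = control."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


def auc(y_true, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties count as half a correct ranking.
    """
    cases = [s for t, s in zip(y_true, scores) if t == 1]
    controls = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return _ratio(wins, len(cases) * len(controls))


# Hypothetical predicted probabilities for three SCZ and four control samples.
truth = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1]
predicted = [1 if s >= 0.5 else 0 for s in scores]

metrics = diagnostic_metrics(truth, predicted)
area = auc(truth, scores)
print(metrics, area)
```

Unlike the other five measures, the AUC does not depend on the chosen decision threshold, which is why it is reported alongside them.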

SCZ-Related CDMGs in Different Brain Regions.
We assessed DNA methylation characteristics from the four datasets taken from different regions of the brain. Compared with control groups from the respective datasets, we identified 7887 hypermethylated positions and 7484 hypomethylated positions in GSE89702 (Figure 2, Table S1). A total of 849 CDMGs were shared across the four datasets. However, these CDMGs were methylated at different sites and to different degrees in different brain regions.
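The distinction drawn here, genes shared by all datasets but methylated at different sites in each, can be made concrete with a short Python sketch. The probe identifiers and the probe-to-gene mapping below are hypothetical placeholders rather than entries from the platform annotation, and the sketch assumes, as a simplification, that each probe maps to a single gene.

```python
def genes_hit(probes, probe_to_gene):
    """Genes that have at least one differentially methylated probe."""
    return {probe_to_gene[p] for p in probes if p in probe_to_gene}


def common_genes(sig_probes_by_dataset, probe_to_gene):
    """Genes differentially methylated in every dataset (the CDMGs)."""
    gene_sets = [
        genes_hit(probes, probe_to_gene)
        for probes in sig_probes_by_dataset.values()
    ]
    return set.intersection(*gene_sets) if gene_sets else set()


def sites_by_dataset(gene, sig_probes_by_dataset, probe_to_gene):
    """Differentially methylated probes of one gene, listed per dataset."""
    return {
        name: sorted(p for p in probes if probe_to_gene.get(p) == gene)
        for name, probes in sig_probes_by_dataset.items()
    }


# Hypothetical annotation and hypothetical significant probes (P < 0.05).
probe_to_gene = {
    "probe_01": "DISC1", "probe_02": "DISC1",
    "probe_03": "SHANK3",
    "probe_04": "PAX6", "probe_05": "PAX6",
}
significant = {
    "GSE89702": {"probe_01", "probe_03"},
    "GSE89703": {"probe_02", "probe_04"},
    "GSE89705": {"probe_01", "probe_02", "probe_05"},
    "GSE89706": {"probe_02", "probe_03"},
}

cdmgs = common_genes(significant, probe_to_gene)
disc1_sites = sites_by_dataset("DISC1", significant, probe_to_gene)
print(cdmgs, disc1_sites)
```

In this toy example one gene is common to all four datasets even though no single probe is significant in all of them, which mirrors the pattern described above.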

CDMGs Involved in Multiple Neurological or Psychiatric-Related Biological Processes and Pathways.
GO analysis of the SCZ-related CDMGs revealed these genes were involved in 244 biological processes, 43 cellular components, and 31 molecular functions (Figures 3(a)-3(d)). KEGG pathway analysis showed that the CDMGs were involved in 80 signaling pathways, most significantly in PI3K-Akt, AMPK, and mitogen-activated protein kinases/extracellular regulated protein kinases (MAPK/ERK), as well as several pathways related to neurological or psychiatric-related biological processes (Figure 3(e)). Notably, the CDMGs were involved in multiple neuronal axon-related biological processes (Table S2); this suggests that the methylation level of CDMGs in neuronal axons may be associated with SCZ.

DNA Methylation-Based Diagnostic Model for SCZ.
The following CDMGs were involved in the greatest numbers of neurological or psychiatric-related biological processes: SHANK3, WNT5A, NLGN1, GLI3, PTPRS, DISC1, SHH, BAIAP2, GLI2, and PAX6. The beta values of the differential methylation sites corresponding to these ten genes were used to construct a methylation template for diagnosing SCZ. Since our four datasets were taken from different regions of the brain, we found that these genes were differentially methylated depending on location, both in terms of brain region and in terms of methylation position on the DNA (Tables 1-4). Therefore, the beta values of the differential methylation sites corresponding to each gene were collected, and the LASSO method was used for variable selection and construction of the DNA methylation-based classifier. The results suggested that the number of differential methylation sites selected by LASSO varied across brain regions (Figure 4). The diagnostic ability of the same CDMG also varied across brain regions, which indicated that the epigenetic dysregulation of SCZ is complicated.

Diagnostic Efficiency of the Composite Model in Training and Validation.
In order to evaluate the diagnostic efficiency of the DNA methylation-based classifier, receiver operating characteristic curves were analyzed (Figure 5). In GSE89702, GSE89703, and GSE89705, the accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and AUC in the training and test sets were all 1 (Tables 5-7). In GSE89706, the accuracy was 0.950 in the training set and 0.920 in the test set, while the AUC was 0.994 in the training set and 0.943 in the test set (Table 8). The results suggest the DNA methylation-based classifier is a potential biomarker for diagnosing SCZ.

Discussion
In recent years, the morbidity and mortality rates of SCZ have increased, and the many health- and social-related problems for patients with SCZ are cause for much concern. However, the need for effective molecular diagnostic methods remains unmet. In particular, diagnostic models that take into account both statistical and biological significance at the molecular level have not received much attention.
In the present study, a total of 849 CDMGs were identified in different brain regions. Functional enrichment analysis indicated the CDMGs were involved in various neurological or psychiatric-related biological processes and pathways, specifically the PI3K-Akt, AMPK, and MAPK signaling pathways. The methylation levels of CDMGs may affect these biological processes and pathways. Our study identified biological processes with confirmed roles in mental diseases, including SCZ [20,21]. MAPK/ERK and PI3K/Akt signaling pathways can regulate protuberant growth and protein synthesis related to neural plasticity, and they can assist in the normal development of nerve cells, which may protect against SCZ [22]. The initiation of nerve axon regeneration is regulated by the MAPK pathway, and this initiates a neuronal response [23]. Further studies should explore directly whether the CDMGs in the present study are associated with SCZ. The beta values of the differential methylation sites corresponding to the 10 CDMGs most often involved in neurological or psychiatric-related biological processes (SHANK3, WNT5A, NLGN1, GLI3, PTPRS, DISC1, SHH, BAIAP2, GLI2, and PAX6) were used to construct a DNA methylation-based, brain region-specific classifier for SCZ. The methylation sites within genes and the degree of methylation varied in different brain regions, suggesting that the methylation patterns of SCZ-related genes are extremely complex. The DISC1 protein regulates the development, maturation, and migration of brain neurons and synaptic signal transmission [24-27], and its disruption can lead to various mental diseases, including SCZ [28,29]. The process of nerve development and synaptic transmission regulated by DISC1 can be affected by its degree of methylation [30]. SHANK3 knockout may affect neuronal development and induce SCZ [31]. SHANK3 and NLGN1 are also related to the progression of SCZ [32,33].
However, few reports exist on the association of the genes WNT5A, GLI3, PTPRS, SHH, BAIAP2, GLI2, or PAX6 with SCZ. Our results suggest that the methylation level of these genes may be related to the disease. Indeed, our DNA methylation-based classifier showed strong diagnostic potential based on AUC analysis. It is worth noting that, owing to the characteristics of the LASSO method, the more variables are included, the better the classifier performs. However, from an economic perspective, the more variables (differential methylation sites) are included, the higher the cost. Therefore, taking into account both the effectiveness of the model and its cost, we performed feature selection and classifier construction starting from 10 genes rather than fewer or more. In the present results, when we included 10 genes, the LASSO method selected differential methylation sites from 7-8 genes and obtained the best AUC value (close to 1). We therefore expect that including fewer genes may greatly reduce classification efficiency, while including more genes may be unnecessary because it would increase costs without improving performance.
Some critical limitations exist in the present study.
Due to the small sample size, our DNA methylation-based SCZ diagnostic model needs to be further validated and improved in larger, independent datasets. The potential roles of CDMGs in SCZ need to be explored experimentally.
Despite these limitations, our findings suggest that gene methylation patterns are significantly associated with SCZ and may be a promising diagnostic method. Methylation levels and sites in CDMGs varied widely across different brain regions, and future studies should explore the potential relevance of this variation for SCZ onset and progression.

Data Availability
The data used to support the findings of this study are included within Supplementary Table S1.

Conflicts of Interest
The authors report no conflicts of interest in this work.

Authors' Contributions
Donghua Zou and Yufen Qiu contributed equally to this work.

Acknowledgments
This study was supported by the Guangxi Natural Science Foundation (2016GXNSFCA380012), the Project of Nanning Scientific Research and Technology Development Plan (20193093), the High-Level Medical Expert Training Program of Guangxi "139" Plan Funding (G201903049) and sponsored by Nanning Excellent Young Scientist Program (RC20190103).

Supplementary Materials
Table S1: common differentially methylated genes in four datasets. Table S2: common differentially methylated genes involved in neurological or psychiatric-related biological processes.