Use of Deep-Learning Genomics to Discriminate Healthy Individuals from Those with Alzheimer's Disease or Mild Cognitive Impairment

Objectives Alzheimer's disease (AD) is the most prevalent neurodegenerative disorder and the most common form of dementia in the elderly. Certain genes have been identified as important clinical risk factors for AD, and technological advances in genomic research, such as genome-wide association studies (GWAS), allow for analysis of polymorphisms and have been widely applied to studies of AD. However, shortcomings of GWAS include sensitivity to sample size and hereditary deletions, which result in low classification and predictive accuracy. Therefore, this paper proposes a novel deep-learning genomics approach and applies it to multitasking classification of AD progression, with the goal of identifying novel genetic biomarkers overlooked by traditional GWAS analysis.
Methods In this study, we selected genotype data from 1461 subjects enrolled in the Alzheimer's Disease Neuroimaging Initiative, including 622 AD, 473 mild cognitive impairment (MCI), and 366 healthy control (HC) subjects. The proposed deep-learning genomics (DLG) approach consists of three steps: quality control, coding of single-nucleotide polymorphisms, and classification. The ResNet framework was used for the DLG model, and the results were compared with classifications by a simple convolutional neural network (CNN). All data were randomly assigned to one training/validation group and one test group at a ratio of 9:1, and fivefold cross-validation was used.
Results We compared classification results from the DLG model to those from traditional GWAS analysis among the three groups. For the AD and HC groups, the accuracy, sensitivity, and specificity of classification were, respectively, 98.78% ± 1.50%, 98.39% ± 2.50%, and 99.44% ± 1.11% using the DLG model, compared with 71.38% ± 0.63%, 63.13% ± 2.87%, and 85.59% ± 6.66% using traditional GWAS. Similar results were obtained from the other two intergroup classifications.
Conclusion The DLG model can achieve higher accuracy and sensitivity when applied to progression of AD. More importantly, we discovered several novel genetic biomarkers of AD progression, including rs6311 and rs6313 in HTR2A, rs1354269 in NAV2, and rs690705 in RFC3. The roles of these novel loci in AD should be explored in future research.


Introduction
Alzheimer's disease (AD) is the most common type of dementia and is an irreversible, progressive neurological brain disorder typically beginning with mild memory decline; in time, it can seriously impair an individual's ability to carry out daily activities and lead to loss of autonomy [1,2]. Mild cognitive impairment (MCI) is a preclinical stage of AD, in which individuals have no obvious cognitive behavioral symptoms but can show subtle prodromal signs of dementia [3,4]. It is widely recognized that early detection of AD and MCI is essential to slowing progression.
Among factors that influence AD progression, common genetic variants are major risk factors [5]. Currently, the development of cheap comprehensive genetic testing of peripheral blood has brought dramatic changes to studies of the mechanisms of disease development. In recent decades, several genes have been associated with AD risk based on full-genome genotyping arrays using blood samples [6,7]. For instance, genomics analysis showed APOE to be the most strongly associated AD risk gene [8]. In addition, the CLU, PICALM, SORL1, BIN1, and TOMM40 genes have also been identified as AD risk factors in the literature [7,9,10].
Technological advances [11] have allowed analysis of millions of nucleotide polymorphisms from thousands of subjects, including advanced genome-wide association studies (GWAS) and whole genome sequencing [12-16] that have increased our understanding of the genetic complexity of AD susceptibility. For instance, recent GWAS from the Alzheimer's Disease Neuroimaging Initiative (ADNI) have related known AD risk genes to differences in rates of brain atrophy and biomarkers of AD in the cerebrospinal fluid [17]. Moreover, the International Genomics of Alzheimer's Project studied 74,046 participants, confirming nearly all of the previous genetic risk factors and identifying 12 new susceptibility loci for AD [18]. Therefore, genomics analysis, especially GWAS analysis, has yielded important advances in AD research.
However, GWAS has some limitations. Firstly, traditional GWAS intergroup analysis is distorted by differences in sample sizes [19]. Secondly, traditional GWAS analysis is strongly dependent on prior knowledge and hand coding, which requires much time and energy and risks bias or errors in data entry [16] that can result in poor repeatability. Moreover, although traditional GWAS analysis can assure high specificity of disease screening, its accuracy and sensitivity are relatively low. In practice, false positives are preferred over false negatives in order to avoid omissions in disease screening, so low sensitivity is a serious drawback. Therefore, alternative analytical tools would help to drive novel hypotheses and models.
Deep-learning algorithms implemented via deep neural networks can automatically embed computational features to yield end-to-end models that facilitate discovery of relevant highly complex features [20]. Seminal studies in 2015 demonstrated the applicability of deep neural networks to DNA sequence data [21,22]. Deep convolutional neural networks (CNNs) have been used in recent studies to predict various molecular phenotypes on the basis of DNA sequence alone. Applications include classifying transcription factor binding sites and predicting molecular phenotypes such as DNA methylation, microRNA targets, and gene expression [23-27]. In addition, CNNs have been utilized to call genetic variants [28] and classify genetic mutations in tumors [29]. Multitask and multimodal models and transfer learning have also been developed in genomics [30,31]. In this work, we hypothesize that deep-learning genomics (DLG) can be applied to AD and outperform traditional GWAS analysis. We propose a DLG method to replace traditional GWAS analysis for multitasking classification of AD progression and use this approach to seek novel genetic biomarkers of AD susceptibility.

Materials and Methods
The experimental workflow of this study consisted of three steps, as shown in Figure 1. First, we conducted quality control and SNP genotype coding on the SNP genotype data. Second, we used the deep residual network ResNet for DLG. The goal of the deep residual network was to obtain, by supervised learning, a model for prediction and extraction of DLG features. This process is described in detail in the following sections. Finally, we investigated the interpretability of the DLG model by applying Gradient-weighted Class Activation Mapping (Grad-CAM).
2.1. Subjects. Data used in the preparation of this study were obtained from the ADNI database (http://adni.loni.usc.edu/). ADNI was launched in 2004 by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, the Food and Drug Administration, private pharmaceutical companies, and nonprofit organizations, as a $60 million, 5-year public-private partnership. In this study, 1461 individuals (622 AD, 473 MCI, and 366 healthy controls (HCs)) from the ADNI 1, ADNI 2, and ADNI GO cohorts of the ADNI database were included. The following data from the 1461 ADNI participants were downloaded: Illumina SNP genotyping data, demographic information, and diagnosis information. Written informed consent was obtained from all participants, and the study was conducted with prior institutional review board approval. Clinical characteristics, including age, sex, education, and Montreal Cognitive Assessment (MoCA) results, were collected and are listed in Table 1.
The subjects were aged 55-90 years (inclusive). The detailed ADNI eligibility criteria are available from http://adni.loni.usc.edu/methods/documents/. In brief, eligibility criteria for these participants were as follows: (1) normal subjects: a Clinical Dementia Rating (CDR) of 0, nondepressed, non-MCI, and nondemented; (2) MCI subjects: a memory complaint, objective memory loss measured by education-adjusted scores on Wechsler Memory Scale 7/Logical Memory II, a CDR of 0.5, absence of significant levels of impairment in other cognitive domains, essentially preserved activities of daily living, and an absence of dementia; (3) AD subjects: a CDR of 0.5 or 1.0 and meeting the National Institute of Neurological and Communicative Disorders and Stroke and Alzheimer's Disease and Related Disorders Association criteria for probable AD [32]. Subjects taking specific psychoactive medications were excluded. We investigated two groups of subjects using SNP genotype data collected from the ADNI databases. Our training and validation group contained 560 subjects with AD, 426 subjects with MCI, and 330 HC subjects. We used the SNP genotype data from this group to establish and test the validity of our predictive models. Our test group consisted of 62 AD subjects, 47 MCI subjects, and 36 HC subjects, and we used its SNP genotype data to evaluate the diagnostic value of the predictive models.
2.2. DNA Isolation and SNP Genotyping. SNP genotyping for more than 620,000 target SNPs was completed on all ADNI participants using the following protocol. First, a total of 7 mL of blood was taken from each participant and stored in EDTA-containing Vacutainer tubes, and genomic DNA was extracted using the QIAamp DNA Blood Maxi Kit following the manufacturer's protocol. Second, lymphoblastoid cell lines were established by transforming B lymphocytes with Epstein-Barr virus [33]. Fourteen genomic DNA samples were analyzed using the Human 610-Quad BeadChip according to the manufacturer's protocols. Before starting the assay, a 50 ng sample of genomic DNA from each participant was examined qualitatively on a 1% Tris-acetate-EDTA agarose gel to check for degradation. Degraded DNA samples were excluded from further analysis. Third, samples were quantitated in triplicate with PicoGreen® reagent and diluted to 50 ng/µL in Tris-EDTA buffer (10 mM Tris, 1 mM EDTA, pH 8.0). A total of 200 ng of DNA was denatured, neutralized, and amplified for 22 hours at 37°C, then fragmented with FMS reagent (Illumina) at 37°C for 1 hour, precipitated with 2-propanol, and incubated at 4°C for 30 minutes. Fourth, the resulting blue precipitate was resuspended in RA1 reagent (Illumina) at 48°C for 1 hour. Samples were then denatured (95°C for 20 minutes) and immediately hybridized onto BeadChips at 48°C for 20 hours. The BeadChips were washed and subjected to single base extension and staining. Finally, the BeadChips were coated with XC4 reagent (Illumina), desiccated, and imaged on a BeadArray Reader (Illumina). Illumina BeadStudio 3.2 software was used to generate SNP genotypes from bead intensity data.
2.3. Quality Control and APOE Genotype. The following quality control (QC) steps were performed on the 1461 samples using PLINK v1.07 software. QC was conducted separately for the AD and HC groups, the HC and MCI groups, and the AD and MCI groups. SNPs and participants were excluded from the analysis if they failed to meet any of the following criteria [34]: call rate per SNP ≥ 90%; call rate per participant ≥ 90%; gender check; minor allele frequency (MAF) ≥ 5%; Hardy-Weinberg equilibrium test p > 10⁻⁶ (SNPs with p ≤ 10⁻⁶ were excluded); PI_HAT < 0.5. After the QC procedure, the numbers of features retained per subject for subsequent analysis in the paired groups were as follows: 301,388 in the HC and MCI groups, 301,853 in the HC and MCI groups, and 301,138 in the MCI and AD groups. The overall genotyping rate for the remaining dataset was over 99.5%.
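As a rough illustration of two of the per-SNP filters listed above (call rate and MAF), the following sketch applies the 90% and 5% thresholds to toy data. It is not PLINK and omits the gender, Hardy-Weinberg, and PI_HAT checks; the function names, the coding of genotypes as minor-allele counts, and the use of `None` for a missing call are all assumptions made for this example.

```python
def call_rate(genotypes):
    """Fraction of non-missing calls; None marks a missing genotype (assumed convention)."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF for genotypes coded as allele counts 0, 1, or 2 (assumed coding)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq)


def passes_snp_qc(genotypes, min_call_rate=0.90, min_maf=0.05):
    """Keep a SNP only if both thresholds from the QC list are met."""
    return (call_rate(genotypes) >= min_call_rate
            and minor_allele_frequency(genotypes) >= min_maf)


# Hypothetical SNPs: one common variant with a missing call, one rare variant.
snp_common = [0, 1, 0, 2, 1, 0, 0, 1, 0, None]
snp_rare = [0] * 19 + [1]

kept = [name for name, snp in [("common", snp_common), ("rare", snp_rare)]
        if passes_snp_qc(snp)]
```

Here the rare variant has a MAF of 1/40 = 2.5% and would be dropped, while the common variant is kept despite one missing call.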
In addition, although the APOE gene is an important target gene in AD research, not all identified APOE SNPs were available on the Illumina array. Therefore, based on the reported APOE ε2/ε3/ε4 status, the genotypes of the unavailable APOE SNPs were added manually to the ADNI genotype data before assessing sample quality.

SNP Genotype Coding.
A single-nucleotide polymorphism is a DNA sequence variation that occurs when a single nucleotide (A, T, C, or G) in the genome differs among members of a biological species or across paired chromosomes. Using the ADNI GWAS SNP data that passed quality control in this study, we encoded SNPs with the following coding scheme: 1 refers to A, 2 refers to T, 3 refers to C, and 4 refers to G.
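The coding scheme can be sketched in a few lines. The mapping itself (A=1, T=2, C=3, G=4) comes from the text; the assumption that each SNP is stored as a two-letter genotype string and contributes two codes is ours, as are the function names.

```python
# Nucleotide-to-integer mapping stated in the text.
ALLELE_CODE = {"A": 1, "T": 2, "C": 3, "G": 4}


def encode_genotype(genotype):
    """Map a two-letter genotype such as 'AG' to its allele codes."""
    return [ALLELE_CODE[allele] for allele in genotype.upper()]


def encode_subject(genotypes):
    """Flatten one subject's genotypes into a single list of codes."""
    codes = []
    for genotype in genotypes:
        codes.extend(encode_genotype(genotype))
    return codes


# Hypothetical subject with three SNPs.
encoded = encode_subject(["AG", "CC", "TT"])  # [1, 4, 3, 3, 2, 2]
```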

GWAS Analysis.
GWAS analysis, which has emerged as a popular tool for identifying genetic variants associated with disease risk, served as the baseline against which the deep-learning models were compared in the multitasking classification of this study. Standard analysis of a case-control GWAS involves assessing the association between each individual genotyped SNP and disease risk. Manhattan and quantile-quantile (Q-Q) plots were used to visualize the GWAS results. All association results surviving the significance threshold of p < 1.66e-7 were saved and prepared for subsequent pattern analysis.
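The significance threshold can be checked with simple arithmetic. The Results state that the threshold was computed as p = 0.05/N, with N the number of SNPs passing QC; dividing 0.05 by each of the three post-QC SNP counts reported in Section 2.3 gives values that all round to 1.66e-7. The helper name below is illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests


# Post-QC SNP counts reported in Section 2.3.
snp_counts = [301388, 301853, 301138]
thresholds = [bonferroni_threshold(0.05, n) for n in snp_counts]
rounded = [f"{t:.2e}" for t in thresholds]  # each entry is '1.66e-07'
```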
2.6. Deep-Learning Genomics Model Based on ResNet. The DLG model acted as a feature encoder, which had a significant impact on classification. In this study, we applied ResNet, a deep residual network, to the classification between AD and HC groups, AD and MCI groups, and HC and MCI groups. Residual units were added to the deep residual network on the basis of CNNs.
A CNN, among the most effective types of deep-learning model, is generally composed of three types of layers: convolutional, pooling, and fully connected. A CNN operates as follows. The first step is to convolve the input sequences with a set of filter kernels; all the features active at different positions after convolution constitute the feature map [35]. A nonlinear activation function, typically a rectified linear unit (ReLU), is applied in each layer to the sum of the feature maps. The operation of the convolutional layer and ReLU can be expressed as follows:

$$C_n^r = \mathrm{ReLU}\left(\sum_m v_m^{r-1} * w_n^r + b_n^r\right),$$

$$\mathrm{ReLU}(y_n) = \max(0, y_n),$$

where $C_n^r$ is the $n$th output of the $r$th convolutional layer, $n$ indexes the filters in the $r$th layer, $w_n^r$ and $b_n^r$ are, respectively, the weight and bias of the $n$th filter of the $r$th layer, $v_m^{r-1}$ is the $m$th output of the previous layer $r-1$, and $*$ denotes the convolutional operation.
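The convolution-plus-ReLU step can be illustrated with a minimal pure-Python sketch. For brevity it works on one-dimensional inputs, whereas the model in this study operates on two-dimensional images, and it uses a sliding dot product (cross-correlation), as most deep-learning libraries do. All names and the toy kernel are illustrative, not taken from the study's implementation.

```python
def relu(values):
    """Element-wise max(0, y)."""
    return [max(0.0, v) for v in values]


def conv1d(signal, kernel):
    """Valid-mode sliding dot product of one input map with one kernel."""
    width = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(width))
            for i in range(len(signal) - width + 1)]


def conv_layer(inputs, kernels, bias):
    """One output map: sum over input maps of (map * kernel), plus bias, then ReLU."""
    length = len(inputs[0]) - len(kernels[0]) + 1
    total = [bias] * length
    for v_m, w_m in zip(inputs, kernels):
        total = [t + c for t, c in zip(total, conv1d(v_m, w_m))]
    return relu(total)


# One input channel of encoded alleles and a toy difference kernel.
encoded_snps = [[1.0, 4.0, 3.0, 3.0, 2.0, 2.0]]
difference_kernel = [[1.0, -1.0]]
feature_map = conv_layer(encoded_snps, difference_kernel, bias=0.0)
# feature_map == [0.0, 1.0, 0.0, 1.0, 0.0]
```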
Next, the resulting feature map is processed through the pooling layer by taking either the mean or maximum activation over disjoint regions for each channel [20,35]. By sequential combination of convolutional and pooling layers, a multilayer structure is built for feature description. Lastly, the fully connected layers are employed for classification. In total, given a training set $\{X_j\}_j$, the learning process of a CNN with $K$ convolutional layers, with filter parameters $\{W_i\}_{i=1}^{K}$, bias values $\{b_i\}_{i=1}^{K}$, and classification layers $D$, can be represented as an optimization task:

$$\min_{\{W_i\}_{i=1}^{K},\,\{b_i\}_{i=1}^{K},\,D} \; \sum_j \mathcal{L}\big(h(X_j),\, D(X_j; \{W_i\}, \{b_i\})\big),$$

where $\mathcal{L}$ is the loss function that represents the cost difference between the true label $h(X)$ and the label predicted by $D$.

Building on the CNN model, the greatest advantage of the ResNet framework lies in adding identity mapping performed by shortcut connections, the outputs of which are added to the outputs of the stacked layers [36]. ResNet therefore addresses the degradation problem while adding neither extra parameters nor computational complexity. Residual learning was formulated as follows: the desired underlying mapping is denoted $H(x)$, and the stacked nonlinear layers were allowed to fit a separate mapping $\mathcal{F}(x, \Theta) = H(x) - x$. The original mapping was recast as $\mathcal{F}(x, \Theta) + x$. Thus, the overall representation of the residual block was as follows:

$$y = \mathcal{F}(x, \Theta) + x.$$

The formulation $\mathcal{F}(x, \Theta) + x$ can be realized by feedforward neural networks using "shortcut connections." A deep residual network can be established by stacking a series of residual blocks. Specifically, there were two steps in the process: forward computation and backward propagation.
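A single residual block can be sketched as follows. This is a simplified illustration under our own assumptions: the residual branch is two small fully connected layers rather than the convolutions used in ResNet, and the activation that ResNet applies after the addition is omitted. With all-zero weights the branch outputs zero and the block reduces to the identity mapping, which is the property the shortcut connection provides.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def linear(x, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def residual_block(x, w1, b1, w2, b2):
    """y = F(x) + x, where F is linear -> ReLU -> linear (assumed toy branch)."""
    hidden = relu(linear(x, w1, b1))
    f_x = linear(hidden, w2, b2)
    return [f + xi for f, xi in zip(f_x, x)]


x = [1.5, -2.0]
zero_w = [[0.0, 0.0], [0.0, 0.0]]
zero_b = [0.0, 0.0]

# With a zero residual branch the shortcut passes the input through unchanged.
identity_out = residual_block(x, zero_w, zero_b, zero_w, zero_b)

# With an identity first layer and a doubling second layer, F(x) = 2 * relu(x).
eye = [[1.0, 0.0], [0.0, 1.0]]
double = [[2.0, 0.0], [0.0, 2.0]]
shifted_out = residual_block(x, eye, zero_b, double, zero_b)
```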
When $K$ residual blocks are stacked, the forward propagation of such a structure can be expressed by

$$x_K = x_0 + \sum_{r=1}^{K} \mathcal{F}(x_{r-1}, \Theta_r),$$

where $x_0$ and $x_K$ are the input and the output of the residual network, respectively, $\mathcal{F}$ is the residual mapping fitted by the stacked layers, and $\Theta_r = \{\theta_{r,l} \mid 1 \le l \le L\}$ is the set of weights of the $r$th residual block, $L$ being the number of layers within the block. Likewise, the back propagation of the overall loss of the neural network to $x_0$ can be denoted as

$$\frac{\partial \mathcal{L}}{\partial x_0} = \frac{\partial \mathcal{L}}{\partial x_K}\left(1 + \frac{\partial}{\partial x_0}\sum_{r=1}^{K} \mathcal{F}(x_{r-1}, \Theta_r)\right),$$

where $\mathcal{L}$ is the overall loss function of the neural network.

Before modeling with the above procedures, each subject's SNP genotype data was cropped after quality control and mapped to 776 × 776 pixels. The pathology type was one-hot encoded and used as the label. Thereafter, in the training stage, SNP genotype data was fed into the network to update the model parameters via backward propagation with the Adam algorithm, a first-order gradient-based optimization algorithm that has been shown to be computationally efficient and appropriate for training deep neural networks. The outputs of the network were used as the classification results, and the cross-entropy of the outputs was calculated as the loss function. More specifically, the network output for each individual's SNP data can be read as a binary value: 1 represents the highest probability of being an AD subject, while 0 represents the highest probability of being an HC subject.

We adopted the ResNet18 and ResNet34 frameworks in this study. We also used a traditional CNN model for the comparative classification experiments. For the ResNet models, we set the learning rate to 1e-3 and applied the Adam optimizer to update the model parameters with a batch size of 8. The maximum number of iterations was set to 20. Note that we used L2 regularization in this step to prevent overfitting. For the CNN model, we set the learning rate to 1e-2 and applied the Adam optimizer to update the model parameters with a batch size of 8. The maximum number of iterations was set to 30.
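The input preparation step (cropping each subject's SNP data and mapping it to 776 × 776 pixels, with a one-hot label) can be sketched as below. The text does not say how the crop or the pixel layout was done. Noting that 776 × 776 = 602,176 is just below twice the post-QC SNP counts (for example, 2 × 301,138 = 602,276), this sketch assumes two allele codes per SNP, truncation of the trailing values, and a row-by-row layout; these are our assumptions, and the function names are hypothetical.

```python
SIDE = 776  # image side length stated in the text


def to_square_image(codes, side=SIDE):
    """Crop a flat list of allele codes to side*side values and reshape into rows.

    Truncating the tail and filling row by row are assumptions of this sketch.
    """
    needed = side * side
    if len(codes) < needed:
        raise ValueError(f"need at least {needed} values, got {len(codes)}")
    cropped = codes[:needed]
    return [cropped[r * side:(r + 1) * side] for r in range(side)]


def one_hot(label, classes=("HC", "AD")):
    """One-hot label; index 1 is AD, matching the 1 = AD, 0 = HC convention."""
    return [1 if label == c else 0 for c in classes]


# Hypothetical subject: 301,138 SNPs x 2 allele codes = 602,276 values.
codes = [1 + (i % 4) for i in range(301138 * 2)]
image = to_square_image(codes)
label = one_hot("AD")  # [0, 1]
```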
The above deep-learning models were run with GPU acceleration (graphics processing unit, GTX 1080 Ti) in PyCharm 3.5.
To investigate the interpretability of the DLG model, we extracted DLG features from the last convolutional layer of the last res-block by applying Grad-CAM and then compared them with two-sample t-tests. First, the last convolutional layer of the last res-block was chosen to extract normalized DLG features. Subsequently, using a two-sample t-test with false discovery rate correction [37,38], we compared the Z-coefficients of the AD and HC groups, the HC and MCI groups, and the MCI and AD groups.
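For readers unfamiliar with Grad-CAM, its core computation is small: each channel of the chosen convolutional layer is weighted by the spatial mean of the class-score gradient for that channel, the weighted channels are summed, and a ReLU keeps only positive evidence. The sketch below shows that computation on toy arrays. It assumes the activations and gradients have already been obtained from a trained network, and it is not the study's implementation.

```python
def grad_cam(feature_maps, gradients):
    """Grad-CAM heatmap from K feature maps and their class-score gradients.

    Both arguments are lists of K two-dimensional maps (lists of rows) of equal shape.
    """
    rows, cols = len(feature_maps[0]), len(feature_maps[0][0])
    # Channel weight: spatial average of the gradient for that channel.
    weights = [sum(sum(row) for row in g) / (rows * cols) for g in gradients]
    cam = [[0.0] * cols for _ in range(rows)]
    for w, fmap in zip(weights, feature_maps):
        for i in range(rows):
            for j in range(cols):
                cam[i][j] += w * fmap[i][j]
    # ReLU keeps only locations that increase the class score.
    return [[max(0.0, v) for v in row] for row in cam]


# Toy example with two 2 x 2 channels (values are hypothetical).
activations = [[[1.0, 0.0], [0.0, 2.0]],
               [[0.0, 3.0], [1.0, 0.0]]]
grads = [[[1.0, 1.0], [1.0, 1.0]],
         [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(activations, grads)  # [[1.0, 0.0], [0.0, 2.0]]
```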
2.7. Classification. In this study, the subjects of the multitasking classification were randomly divided into one training group and one independent test group at a ratio of 9:1, as shown in Table 1. The training group was then used to optimize the model parameters. We also randomly chose 25% of the training group to form a validation group to guide the choice of hyperparameters.
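The 9:1 split can be reproduced in outline as follows. Splitting each diagnostic group separately and rounding the test share down are assumptions of this sketch; they happen to give the group sizes reported in Section 2.1 (560/62 AD, 426/47 MCI, 330/36 HC). The seed and function name are arbitrary.

```python
import random


def split_group(subject_ids, seed=0):
    """Shuffle one diagnostic group and hold out one tenth (rounded down) as the test set."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    n_test = len(shuffled) // 10
    return shuffled[n_test:], shuffled[:n_test]


group_sizes = {"AD": 622, "MCI": 473, "HC": 366}
splits = {name: split_group(range(n)) for name, n in group_sizes.items()}
sizes = {name: (len(train), len(test)) for name, (train, test) in splits.items()}
# sizes == {"AD": (560, 62), "MCI": (426, 47), "HC": (330, 36)}
```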
On the one hand, we trained several deep-learning models, including ResNet18, ResNet34, and a traditional CNN, and compared their classification performance in order to select the optimum DLG model. On the other hand, in order to verify the diagnostic capabilities of the DLG model compared with traditional GWAS analysis, we also designed comparative trials. Among all the gene indicators, theta proved to be the most directly related to SNP changes. APOE ε4 status and the normalized theta-value of the significant SNP loci found in this study were used as genetic predictors, and a support vector machine (SVM) with a linear kernel was run 500 times to produce the traditional GWAS classification.
To evaluate classification performance, we repeatedly conducted 5-fold cross-validation in the training group. Accuracy, sensitivity, and specificity were used to evaluate the results. The three measures are defined as follows:

$$\text{Accuracy} = \frac{Tp + Tn}{Tp + Tn + Fp + Fn}, \quad \text{Sensitivity} = \frac{Tp}{Tp + Fn}, \quad \text{Specificity} = \frac{Tn}{Tn + Fp},$$

where Tn, Tp, Fn, and Fp denote, respectively, true negatives, true positives, false negatives, and false positives. A receiver-operating characteristic (ROC) curve was produced to intuitively compare the results of the different approaches, and the area under the curve (AUC) of the ROC was computed to quantitatively evaluate classification performance.
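The three measures can be computed directly from the counts of true and false positives and negatives. The following sketch uses a small hypothetical set of labels (1 for the patient class, 0 for controls) to make the definitions concrete; the function name and data are illustrative only.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, and specificity from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical test fold: 4 patients and 6 controls.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
# accuracy 0.8, sensitivity 0.75, specificity 5/6
```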
2.8. Statistical Analysis. Demographic characteristics were compared between groups using a two-sample t-test or the chi-square test. In addition, a two-sample t-test of the extracted features was applied as a criterion to estimate the differences in DLG features between AD patients and HCs, AD patients and MCIs, and HCs and MCIs. All statistical analyses were performed using SPSS Version 22.0 software (SPSS Inc., Chicago, IL) and MATLAB 2016b (MathWorks Inc., Sherborn, MA, United States). All p values < 0.05 were considered significant.
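To make the feature comparison concrete, the sketch below computes a two-sample t statistic for one feature in two groups. The study ran its tests in SPSS and MATLAB; this Python version is only illustrative. It assumes the equal-variance (pooled) form of the test, which the text does not specify, and it stops at the statistic and degrees of freedom without computing a p value. The feature values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(a, b):
    """Student's t statistic with pooled variance; returns (t, degrees of freedom)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical normalized values of one DLG feature in two groups.
ad_feature = [2.1, 2.5, 2.3, 2.7, 2.4]
hc_feature = [1.8, 1.9, 2.0, 1.7, 2.1]
t_stat, dof = two_sample_t(ad_feature, hc_feature)
```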

Outcomes of GWAS Analysis.
We carried out case-control GWAS analysis between the AD and HC groups and observed two genome-wide significant loci on chromosome 19: rs429358 (APOE, the epsilon 4 marker) and rs2075650 (TOMM40). Figures 2 and 3 show the resulting Manhattan and Q-Q plots, and Table 2 summarizes the SNPs that achieved genome-wide significance. The p value threshold used to assess significant differences was calculated as p = 0.05/N, where N is the number of SNPs that passed quality control. Table 3 lists the performance of the different multitasking classification methods, including classification accuracy, sensitivity, specificity, and AUC. Taking the classification between the AD and HC test group subjects as an example, accuracy, sensitivity, specificity, and AUC were, respectively, 71.38% ± 0.63%, 63.13% ± 2.87%, 85.59% ± 6.66%, and 0.744 for the GWAS analysis; 92.45% ± 8.13%, 93.87% ± 12.26%, 90.00% ± 15.97%, and 0.915 for the CNN model; 97.96% ± 1.71%, 97.42% ± 3.16%, 98.89% ± 1.36%, and 0.980 for ResNet18; and 98.78% ± 1.50%, 98.39% ± 2.50%, 99.44% ± 1.11%, and 0.981 for ResNet34. The deep-learning models exhibited high accuracy, sensitivity, and specificity, whereas accuracy and sensitivity were low for the GWAS analysis. We therefore concluded that deep-learning models were superior to traditional GWAS analysis for classification. Compared with the CNN model, the results using ResNet were also more robust and stable, as reflected in their smaller standard deviations. The same pattern held for the other two group-level classifications. Based on these results, ResNet34 was chosen for the DLG model because its classification performance was the best among the deep-learning models tested. A more intuitive comparison is provided by the ROC curves of the multitasking classification shown in Figure 4.

3.3. Interpretability of the DLG Model. At a threshold of p < 0.05, more than ten thousand SNP loci showed differences between the groups, and for most of the identified loci the p value was even below 0.001.
Firstly, we compared the significant SNPs with those previously identified by GWAS as genetic susceptibility factors. Almost one hundred SNP loci between AD patients and HCs were consistent with findings from previous studies. Likewise, more than one hundred associated SNP loci were also found between the AD and MCI groups and between the HC and MCI groups.
Secondly, we sought significant SNP loci shared among the three classification tasks. The gene regions of sixty-six SNP loci were shared across different stages of AD progression. Table 4 summarizes the sixty-six shared significant SNP loci among the three classifications, including, e.g., the well-known CLU, PICALM, and SORL1 gene regions. For rs11136000 (CLU) on chromosome 8, the p values were 6.63e-4 between the AD and HC groups, 8.37e-6 between the MCI and HC groups, and 1.49e-7 between the MCI and AD groups. In addition, three SNP loci, rs543293, rs10501602, and rs3851179, were found in the PICALM gene region of chromosome 11. The p values of rs3851179 for the comparisons of the three groups were 6.00e-3, 1.06e-6, and 1.51e-20, while the p values of rs543293 were 4.65e-3, 3.43e-8, and 2.27e-13, and the p values of rs10501602 were also much less than 0.001. These results are well supported by previous studies. Other significant results are detailed in Table 4, and the heatmaps of significant SNPs in chromosomes 8, 11, and 13 are shown in Figure 5. The horizontal axis represents major and minor alleles, and the vertical axis represents the p value of SNP loci in the chromosomes. We observed some distinct differences, for example, between rs11136000, rs3851179, and surrounding loci. In addition to those in Table 4, several SNP loci showed an association with AD progression in their respective classifications. Several of these have been reported and confirmed in previous large-scale GWAS studies, including APOE, BIN1, CHRM1, and TOMM40, with p values much less than 0.001. Furthermore, it is notable that rs6311 and rs6313 in the HTR2A gene region, rs1354269 in the NAV2 gene, and rs690705 in the RFC3 gene all exhibited significant differences among the three classifications.
For instance, the p values of rs6311 were 1.96e-5, 2.52e-3, and 1.48e-11 between the respective groups, and the p values of rs6313 were 3.21e-5, 4.55e-3, and 2.05e-12. An understanding of the roles of these novel loci in AD requires future study.
Figure 3: Q-Q plot of genome-wide association study (GWAS) between AD and HC groups. Genomic inflation factor is 1.084.
All of the information above was deposited in the DisGeNET database (http://www.disgenet.org/home/), a discovery platform containing one of the largest publicly available collections of genes and variants associated with human disease.

Discussion
This study used a comparison of the performance of several different deep-learning models as a basis for proposing a deep-learning genomics method based on ResNet34. The classification results indicate that the DLG model offers a higher diagnostic value than traditional GWAS analysis.

Outcomes of GWAS Analysis. In the GWAS analyses, two SNPs were identified at the p < 1.66e-7 significance level. APOE SNP rs429358 was the most significant genetic risk factor for AD, and the second most significant factor, TOMM40 SNP rs2075650, lies adjacent to the APOE SNP [10]. These results are consistent with previous studies. Although these SNP loci were identified by GWAS, traditional GWAS analysis is sensitive to small sample sizes. Because other common genetic risk factors may have a much smaller impact on risk than the APOE gene, novel risk factors present in small samples may go undetected by GWAS analysis. Several previous studies have also demonstrated an explicit relationship between sample size and the number of significant differences in traits identified by genome-wide association studies [18,19].
Table footnote: The methods are conducted with cross-validation, and their results are given as mean ± standard deviation.

Classification Performance.
In this study, in order to construct a deep-learning genomics model, we compared the performance of several deep-learning classification methods, including a simple CNN model, ResNet18, and ResNet34. As shown in Table 3, the results of the deep residual networks were superior to those of the simple CNN. During training, the ResNet models were also more robust and stable than the CNN, and ResNet34 was superior to ResNet18. Therefore, we chose ResNet34 as the final DLG model. More importantly, we compared the performance of the DLG model and traditional GWAS analysis under the same conditions and found the classification results of the DLG model to be superior. These results suggest that the deep-learning algorithm is effective in genome applications and that development of deep-learning genomics is worthy of further exploration.

Interpretability of the DLG Model.
When we interpreted the DLG model, we found more than one thousand SNP loci with significant differences between AD patients and HCs, between the MCI and AD groups, and between the HC and MCI subjects. As is well known, rs11136000 (CLU), rs3851179 (PICALM), rs2070045 (SORL1), and rs1699102 (SORL1) have previously been identified as risk factors for AD [7,9,39]. Notably, they were all included among the sixty-six significant SNP loci shared in the three classification tasks in this study (as shown in Table 4 and Figure 5). For example, previous studies have shown that CLU modulates Aβ metabolism and is involved in Aβ clearance or acts as a chaperone for protein degradation [40]. PICALM, an adaptor protein involved in clathrin-mediated endocytosis, regulates amyloid precursor protein (APP) internalization and subsequent Aβ generation, contributing to brain amyloid plaque load via its effect on Aβ metabolism [41,42]. In addition to the SNP loci shared among the three classification tasks, several loci were identified in only one or two classification tasks, which is also consistent with previous research. rs10194375 in BIN1, which encodes a protein that may be associated with tau-mediated pathology, was identified as significant between the AD and HC groups and between the AD and MCI groups. In addition, rs2075650 (TOMM40), rs405509 (APOE), and rs429358 (APOE) were identified as significant between the HC and MCI groups and between the MCI and AD groups. In summary, the DLG model is able to identify differential genomics in multitasking classification.
Most importantly, in addition to loci shown to be associated with AD in the past, we found several new SNP loci, including rs6311 (HTR2A), rs6313 (HTR2A), rs1354269 (NAV2), rs1946518 (IL18), rs1799986 (LRP1), rs690705 (RFC3), and rs7943454 (LUZP2), whose p values were highly significant (as shown in Table 4). rs6311 and rs6313 are in the HTR2A gene region. The HTR2A gene in humans is located on chromosome 13, consists of exons separated by only two introns, and encodes one of the receptors for serotonin. According to previous publications, HTR2A has received much attention in many psychiatric disorders such as mood disorders, attention deficit hyperactivity disorder, anxiety disorders, and schizophrenia. On the one hand, some studies have shown that medications for mood disorders and related conditions work by blocking 5-HT2A and altering the function of certain brain circuits, and blocking 5-HT2A also seems to improve the effects of some antidepressants [43]. On the other hand, the numbers of the postsynaptic receptor HTR2A are reduced in the neocortex, and it seems to be involved in memory via its role in cortical pyramidal cells. For example, in AD research, HTR2A receptor densities in the brains of AD subjects were found to be reduced compared with age-matched controls, and this reduction correlated with the rate of decline of cognitive scores [44]. Hence, since subjects with AD or mild cognitive impairment exhibit depression and anxiety to various degrees, it is worth exploring whether rs6311 and rs6313 of the HTR2A gene contribute to AD susceptibility. Another significant locus identified here was rs1354269, located in the NAV2 gene region. The NAV2 gene, which encodes a member of the neuron navigator gene family, is highly expressed in the brain and is involved in the development of the nervous system. Hence, the role of the NAV2 gene in AD is also worthy of future investigation.
In addition, rs690705 of the RFC3 gene region also exhibited a significant difference in group-level classifications, and its impact on AD should be examined in the future.

Behavioural Neurology
First, in future work, we plan to combine gene sequences with clinical data and brain imaging [45] to facilitate investigation of the mechanisms of AD progression by deep-learning genomics and deep-learning radiomics approaches. Secondly, we only classified information from the ADNI dataset in this study, so the results could be strengthened by including other datasets, such as data from Chinese populations. Thirdly, the number of subjects represented in this study may be limiting. Lastly, although this study has demonstrated the feasibility of the DLG approach, it will be important to further explore the interpretability of deep-learning genomics.

Conclusions
In conclusion, the current study suggests that the deep-learning genomics approach is effective for multitasking classification research on AD progression and outperforms traditional GWAS analysis. Moreover, the several novel SNP loci identified by the DLG approach, including rs6311 and rs6313 in HTR2A, rs1354269 in NAV2, and rs690705 in RFC3, are worthy of further exploration to better understand the mechanisms of AD.