Genetic Interactions Explain Variance in Cingulate Amyloid Burden: An AV-45 PET Genome-Wide Association and Interaction Study in the ADNI Cohort

Alzheimer's disease (AD) is the most common neurodegenerative disorder. Using discrete disease status as the phenotype and computing statistics at the single marker level may fail to capture the underlying biological interactions that contribute to disease mechanism, and may contribute to the issue of “missing heritability.” We performed a genome-wide association study (GWAS) and a genome-wide interaction study (GWIS) of an amyloid imaging phenotype, using data from the Alzheimer's Disease Neuroimaging Initiative. We investigated the genetic main effects and interaction effects on cingulate amyloid-beta (Aβ) load in an effort to better understand the genetic etiology of Aβ deposition, a widely studied AD biomarker. PLINK was used in the single marker GWAS, and INTERSNP was used to perform the two-marker GWIS, focusing only on SNPs with p ≤ 0.01 in the GWAS analysis. Age, sex, and diagnosis were used as covariates in both analyses. Corrected p values using the Bonferroni method were reported. The GWAS analysis revealed significant hits within or proximal to the APOE, APOC1, and TOMM40 genes, which were previously implicated in AD. The GWIS analysis yielded 8 novel SNP-SNP interaction findings that warrant replication and further investigation.


Introduction
Alzheimer's disease (AD) is the most common neurodegenerative disorder and is characterized by a progressive decline in memory and cognition. The pathologic cascade in AD involves two primary hallmarks: amyloid-β (Aβ) plaques and neurofibrillary tangles [1]. Genetics plays an important role in late-onset Alzheimer's disease (LOAD), but much of its heritability remains unexplained according to current estimates [2]. The last several decades of research yielded only one genetic risk factor of large effect for LOAD: Apolipoprotein E (APOE), for which carrying 2 copies of the ε4 allele confers an approximately 6- to 30-fold risk for the disease [3]. Some recent genome-wide association studies (GWAS) have identified several additional AD susceptibility genes, including BIN1, CLU, ABCA7, CR1, PICALM, MS4A6A, MS4A4E, CD33, CD2AP, and EPHA1 [4][5][6][7][8][9]. However, these genetic factors have relatively low effect sizes (odds ratios of 0.87-1.23) and cumulatively account for approximately 35% of population-attributable risk [8]. More recently, a large-scale GWAS meta-analysis identified 11 new susceptibility loci, also with small effect sizes [10].
Traditional GWAS analyses used discrete disease status as the phenotypic trait of interest despite the fact that LOAD is a clinically heterogeneous disorder. Recently, researchers have started to explore intermediate quantitative traits (QTs), such as clinical or cognitive features, biochemical assays, or neuroimaging biomarkers, in genetic association testing. This may have the potential to address the issue of clinical heterogeneity in LOAD. These QTs are often measured as continuous variables and thus exhibit a higher genetic signal-to-noise ratio. Further, most intermediate QTs are more proximal to their genetic bases than disease status. Thus, the incorporation of intermediate QTs can potentially increase statistical power to detect disease-related genetic associations [11,12]. An ancillary benefit of using QTs is that they can serve as effective biomarkers for monitoring disease progress or treatment response in clinical practice or drug trials.
Over the past 10-15 years, studies have identified robust and predictive biomarkers for AD including levels of tau and amyloid-β peptides in cerebrospinal fluid (CSF), selective measures of brain atrophy using magnetic resonance imaging (MRI), and imaging of glucose hypometabolism and amyloid using positron emission tomography (PET) [13]. PET imaging can be used to quantify levels of amyloid in the brain by utilizing a radiotracer such as florbetapir (18F-AV-45 or AV-45) and/or Pittsburgh compound-B (PiB, N-methyl-[11C]2-(4'-methylaminophenyl)-6-hydroxybenzothiazole). These amyloid measures have been studied as biomarkers for classifying AD [14][15][16][17]. All these multimodal biomarkers can potentially serve as AD-relevant QTs and have been examined in many existing quantitative genetics studies of LOAD [18].
In addition, most genetic association studies compute statistics at the single marker level and ignore the possible underlying biological interactions that contribute to the development of disease [19]; these ignored interactions could be a possible source of "missing heritability." Because the search space of two-way interactions grows quadratically with the number of markers, exhaustive interaction testing poses major computational and statistical challenges. To address this issue, one approach is to explore epistatic interactions in genome-wide data by using a priori statistical and/or biological evidence to generate a reduced set of genetic markers for interaction testing. Using this strategy, previous interaction studies in LOAD (e.g., [20][21][22][23][24]) implicated interactions between CR1 and APOE using quantified Aβ PET as the outcome variable [24] and between cholesterol trafficking genes [21,22] and tau phosphorylation genes [20] in case-control analyses. These studies demonstrated that important information can be garnered from investigating genetic interactions in complex diseases like LOAD.
With these observations, in the present work, we conducted a quantitative genetics study of an AD-associated amyloid imaging phenotype and examined both single marker main effects and two-marker interaction effects at the genome-wide level. Specifically, we investigated the main and interaction effects of genome-wide markers on cingulate amyloid-beta (Aβ) load in an effort to better understand the genetic etiology of cingulate cortical Aβ deposition (a LOAD biomarker).

Materials and Methods
Data used in the preparation of this paper were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (http://adni.loni.usc.edu/). The ADNI was launched in 2003 by the National Institute on Aging (NIA), the National Institute of Biomedical Imaging and Bioengineering (NIBIB), the Food and Drug Administration (FDA), private pharmaceutical companies, and nonprofit organizations, as a $60 million, 5-year public-private partnership. The primary goal of ADNI has been to test whether serial magnetic resonance imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment (MCI) and early Alzheimer's disease (AD). Determination of sensitive and specific markers of very early AD progression is intended to aid researchers and clinicians to develop new treatments and monitor their effectiveness, as well as lessen the time and cost of clinical trials.
The Principal Investigator of this initiative is Michael W. Weiner, M.D., VA Medical Center and University of California, San Francisco. ADNI is the result of efforts of many coinvestigators from a broad range of academic institutions and private corporations, and subjects have been recruited from over 50 sites across the US and Canada. The initial goal of ADNI was to recruit 800 subjects but ADNI has been followed by ADNI-GO and ADNI-2. To date these three protocols have recruited over 1500 adults, aged 55 to 90, to participate in the research, consisting of cognitively normal older individuals, people with early or late MCI, and people with early AD. The follow-up duration of each group is specified in the protocols for ADNI-1, ADNI-2, and ADNI-GO. Subjects originally recruited for ADNI-1 and ADNI-GO had the option to be followed in ADNI-2. For up-to-date information, see http://www.adni-info.org/.
We applied for and were granted permission to use data from the ADNI cohort (http://www.adni-info.org/) to conduct genetic association and interaction analyses.

Subjects and Data.
For the present work, analyses were restricted to subjects with both genotyping data and AV-45 PET data available. The study sample (n = 602) included 190 healthy control (HC), 215 early MCI (EMCI), 152 late MCI (LMCI), and 45 AD subjects. Table 1 shows selected demographic and clinical characteristics of these participants at the time of the baseline AV-45 PET scan.

Genotyping Data and Quality Control.
The genotyping data of the participants were collected using either the Illumina 2.5 M array (a byproduct of the ADNI whole genome sequencing sample) or the Illumina OmniQuad array [18,25,26]. For the present analyses, we included single nucleotide polymorphism (SNP) markers that were present on both arrays.
Quality control (QC) was performed using the PLINK software (version 1.07) [27]. SNPs failing any of the following criteria were excluded from further analyses: (1) call rate per SNP ≥ 95%; (2) minor allele frequency ≥ 5% (n = 117,175 SNPs were excluded based on criteria 1 and 2); and (3) Hardy-Weinberg equilibrium test of p ≥ 10⁻⁶ (n = 997 SNPs were excluded) using control subjects only. Participants were excluded from the analysis if any of the following criteria were not satisfied: (1) call rate per participant ≥ 90% (3 participants were excluded); (2) sex check (1 participant was excluded); and (3) identity check for related pairs (3 sibling pairs were identified with PI_HAT > 0.5; one participant of each pair was randomly selected and excluded from the study).
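The per-SNP criteria above can be illustrated with a minimal Python sketch. It is not the PLINK implementation: the function names (snp_qc, hwe_chi2_p) and the genotype coding (0, 1, or 2 copies of the counted allele, None for a missing call) are assumptions made for illustration, and the Hardy-Weinberg check uses a simple 1-df chi-square test, which may differ from the test PLINK applies.

```python
import math


def hwe_chi2_p(counts):
    """1-df chi-square HWE p value from genotype counts (n_0, n_1, n_2).

    Illustrative stand-in only; PLINK's own HWE test may differ.
    """
    n = sum(counts)
    if n == 0:
        return 1.0
    q = (counts[1] + 2 * counts[2]) / (2 * n)  # frequency of the counted allele
    p = 1.0 - q
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(counts, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def snp_qc(genotypes, is_control, min_call_rate=0.95, min_maf=0.05, min_hwe_p=1e-6):
    """Return (passes, stats) for one SNP.

    genotypes: list of 0/1/2 allele counts, or None for a missing call.
    is_control: list of bools; HWE is evaluated in control subjects only.
    """
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes)
    freq = sum(called) / (2 * len(called)) if called else 0.0
    maf = min(freq, 1.0 - freq)
    controls = [g for g, c in zip(genotypes, is_control) if c and g is not None]
    hwe_p = hwe_chi2_p(tuple(controls.count(k) for k in (0, 1, 2)))
    stats = {"call_rate": call_rate, "maf": maf, "hwe_p": hwe_p}
    passes = call_rate >= min_call_rate and maf >= min_maf and hwe_p >= min_hwe_p
    return passes, stats
```

A SNP is retained only if all three thresholds are met; the default thresholds mirror the values stated in the text.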
Population stratification analysis was performed using EIGENSTRAT [28] and confirmed using STRUCTURE [29]. It yielded 47 study participants who did not cluster with the remaining subjects and with the CEU HapMap samples who are primarily of European ancestry (non-Hispanic Caucasians). These 47 participants were excluded from the analysis. After QC, 582,718 SNPs and 602 samples remained available for genetic association and interaction analyses.

Quantitative Traits.
A previous AV-45 PET study [30] showed that both AD and amnestic MCI subjects had higher standardized uptake value ratio (SUVR) in global cortical, precuneus, frontal, occipital, and posterior cingulate areas. We focused this study on one of these regions, the cingulate. The baseline mean SUVR measure of the cingulate cortical region, extracted by UC Berkeley (version 2014.7.30), was downloaded from the ADNI database (http://adni.loni.usc.edu/) for 987 ADNI-GO/2 participants. We also downloaded the cerebellum SUVR measure and used it to normalize the cingulate SUVR measure. The normalized SUVR was used as the quantitative trait (QT) in our analyses. After excluding 383 participants due to the lack of genotyping data, 602 individuals remained in the further analysis.
In addition, amyloid-β 1-42 peptide (Aβ1-42), total tau (t-tau), and tau phosphorylated at threonine 181 (p-tau181p), measured in CSF samples, are potential diagnostic biomarkers for AD [31][32][33]. Among the 602 individuals, 504 have both AV-45 data and CSF data. Following a previous GWAS study on CSF biomarkers [34], QC was performed on the CSF data to reduce the potential influence of extreme outliers on statistical results. The mean and standard deviation (SD) of Aβ1-42 and of 2 ratios (t-tau/Aβ1-42 and p-tau181p/Aβ1-42) were calculated, blind to diagnostic information. Subjects who had at least one value more than 4 SDs above or below the mean of the corresponding CSF variable were regarded as extreme outliers and removed from the analysis. This step removed 5 additional participants, resulting in 499 valid CSF samples.
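The outlier rule described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name flag_extreme_outliers and the dictionary keys used for the three CSF variables are hypothetical, and the sample standard deviation is assumed.

```python
import statistics


def flag_extreme_outliers(records, variables, n_sd=4.0):
    """Return indices of records with any variable more than n_sd SDs from its mean.

    records: list of dicts mapping variable name -> numeric value.
    variables: names of the variables to screen (hypothetical key names).
    Means and SDs are computed over all records, ignoring diagnosis.
    """
    limits = {}
    for name in variables:
        values = [r[name] for r in records]
        mean = statistics.mean(values)
        sd = statistics.stdev(values)  # sample SD (assumption)
        limits[name] = (mean - n_sd * sd, mean + n_sd * sd)
    flagged = []
    for i, r in enumerate(records):
        if any(not (limits[v][0] <= r[v] <= limits[v][1]) for v in variables):
            flagged.append(i)
    return flagged
```

Usage would look like flag_extreme_outliers(csf_records, ["abeta", "ttau_ratio", "ptau_ratio"]), after which the flagged subjects are dropped from the CSF analyses.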

Genetic Association Studies: Main Effects and Interaction Effects.
For GWAS examining the main effects, linear regression was performed using PLINK to determine the association of each SNP with the AV-45 measure. An additive genetic model was tested with covariates including age, sex, and diagnosis (through four binary dummy variables indicating HC, EMCI, LMCI, or AD). Manhattan plots and Q-Q plots were generated using Haploview (http://www.broad.mit.edu/mpg/haploview/) and R (http://www.r-project.org/), respectively.
For GWIS examining the interaction effects, the INTERSNP software [35] was applied to the genotyping data and phenotypic AV-45 measure. First, a single marker p value for the main effect was computed for each SNP. The top 10,000 SNPs with the smallest p values were selected and included in the subsequent interaction analysis. An explicit test for additive interaction (the full model, including both additive and dominance effects plus the interaction term, versus the reduced model that does not contain interaction terms) was performed on all possible SNP pairs among the top 10,000 SNPs, using two-marker analysis. The computation was conducted in a linear regression framework. We examined the association between SNP-SNP interactions and the AV-45 measure while controlling for relevant covariates at the baseline scan, including age, sex, and clinical diagnosis. This resulted in a total of approximately 50 million unique SNP pairs to be tested from the ADNI dataset. Interactions were considered significant if their Bonferroni corrected p value was < 0.05.
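The size of the pairwise search space, and the per-test threshold implied by a Bonferroni correction, follow from simple arithmetic. The sketch below assumes the correction is applied over all unique SNP pairs among the selected SNPs; the function names are illustrative only.

```python
import math


def interaction_test_burden(n_snps, alpha=0.05):
    """Return (number of unique SNP pairs, Bonferroni per-test threshold)."""
    n_pairs = math.comb(n_snps, 2)  # n * (n - 1) / 2 unordered pairs
    return n_pairs, alpha / n_pairs


def bonferroni_adjust(p_value, n_tests):
    """Bonferroni corrected p value, capped at 1."""
    return min(1.0, p_value * n_tests)


n_pairs, threshold = interaction_test_burden(10_000)
print(n_pairs)    # 49995000, i.e. approximately 50 million pairs
print(threshold)  # roughly 1e-9 per test
```

Under this assumption, an uncorrected interaction p value must fall below roughly 1 × 10⁻⁹ for the corrected value to stay under 0.05.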

Post Hoc Analysis.
For identified significant interactions, we applied hierarchical linear regression using IBM SPSS 20 to estimate the amount of variance (R²) in the AV-45 measure accounted for by these interaction terms. We first included the same set of covariates (age, sex, and diagnosis) in the linear model. After that, we included APOE status, the closest SNP to the BCHE SNP identified in a prior amyloid GWAS study [36], and the two SNP main effects from the identified SNP pair. Finally, we included the SNP-SNP interaction term to calculate the additional variance explained by the interaction term. The difference in R² for the significant models was calculated in SPSS as ΔR² = R² (full model with interaction term) − R² (reduced model without interaction term). Significant effects were plotted in SPSS as well.
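The ΔR² computation can be illustrated with a small, dependency-free Python sketch. It is not the SPSS procedure: the helper names (solve, ols_r2, delta_r2) are hypothetical, genotypes are assumed to be coded additively (0, 1, 2), the interaction term is assumed to be the product of the two coded genotypes, and base_rows stands in for the covariate block (age, sex, diagnosis, APOE status, and the BCHE-proximal SNP).

```python
def solve(A, c):
    """Solve A x = c by Gaussian elimination with partial pivoting."""
    n = len(c)
    M = [row[:] + [ci] for row, ci in zip(A, c)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= f * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def ols_r2(X, y):
    """R^2 of the least-squares fit of y on the columns of X plus an intercept."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Normal equations: (X'X) b = X'y.
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    c = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    b = solve(A, c)
    fitted = [sum(bi * ri for bi, ri in zip(b, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def delta_r2(base_rows, snp1, snp2, y):
    """R^2(full model with interaction) minus R^2(reduced model without it)."""
    reduced = [list(b) + [a, c] for b, a, c in zip(base_rows, snp1, snp2)]
    full = [row + [a * c] for row, a, c in zip(reduced, snp1, snp2)]
    return ols_r2(full, y) - ols_r2(reduced, y)
```

Because the reduced model is nested in the full model, the returned difference is non-negative up to rounding and can be read as the share of phenotypic variance attributable to the interaction term beyond the covariates and main effects.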
In addition, based on the identified interactions associated with AV-45, we further evaluated their main and interaction effects on the CSF levels related to amyloid, including Aβ1-42, t-tau/Aβ1-42, and p-tau181p/Aβ1-42. These three CSF measures were used as the QTs in 3 separate genetic analyses, following the same method and steps for analyzing the AV-45 phenotype as described above.

Results
Table 1 shows selected demographic and clinical characteristics of the 602 ADNI participants analyzed in this study, where the EMCI group is slightly younger than the other groups. Figure 1 shows the Q-Q plot, indicating no evidence of spurious inflation. Figure 2 shows the Manhattan plot. As expected, significant associations were identified between loci on chromosome 19 and the AV-45 measure. The top association is from rs4420638 (p = 5.11 × 10⁻²¹), which maps to the APOC1 locus [37]. A few other SNPs within the APOE region, including adjacent APOC1 and TOMM40, were significantly associated with the AV-45 level in the cingulate.

SNP-SNP Interaction Results.
The INTERSNP model we tested included age, sex, and diagnosis as covariates.

Post Hoc Analysis.
Using a slightly reduced sample (n = 499) with CSF biomarker data available, all 8 identified interactions remained statistically significant when performing hierarchical linear regression using the CSF phenotypes (one baseline measure: Aβ; two ratios: t-tau/Aβ and p-tau/Aβ) instead of the AV-45 measure as outlined earlier (Table 3). We also repeated the same AV-45 analysis on the reduced sample and achieved a very similar result (Table 4).

Discussion
In this study, we performed both GWAS and GWIS analyses of the cingulate AV-45 florbetapir PET measure, using a sample of 602 subjects from the ADNI database. To our knowledge, this is the first genome-wide study examining SNP-SNP interaction effects on cingulate amyloid deposition in a substantially large sample. In the single marker analysis, as expected, SNPs in the APOE, APOC1, and TOMM40 genes (Figure 2) showed significant associations with the cingulate cortical Aβ level. Two-marker interaction analyses revealed 8 SNP pairs that had significant genetic interactions (corrected p ≤ 0.05) with cingulate amyloid burden. The risk variants at these pairs had low main effects but explained a relatively high proportion of the variance of the amyloid deposition in the cingulate (Table 2).
In addition, missing heritability can partially be explained by the interaction effects that are not examined in traditional GWAS analyses. Genetic risk underlying diagnosis of LOAD is considered to arise from multiple genes that interact with each other. We performed a post hoc analysis investigating the effects of the identified SNP-SNP interactions on LOAD-related quantitative phenotypes including amyloid deposition and CSF biomarkers (Aβ, t-tau/Aβ, p-tau/Aβ). Given that amyloid and tau phosphorylation are major AD hallmarks, it is not surprising to observe the genetic interaction effects on both the amyloid load and relevant CSF biomarkers (Tables 2-4). Our results suggest that significant SNP-SNP interactions could exist between SNPs with low and insignificant main effects, and these interactions could be associated with altered amyloid burden and explain high-level risk in AD.
In line with our hypothesis, we identified multiple significant genetic interactions associated with cingulate amyloid deposition. Several genes found in this study have already been implicated in AD, thus lending confidence to the analytic procedure and results. These genes include PRNP [38,39], IGFBP3 [40,41], and MAGI2 [42,43]. For example, Guerreiro et al. reported a nonsense mutation in PRNP associated with clinical Alzheimer's disease [38]. Ikonen et al. showed that interaction between the Alzheimer's survival peptide humanin and insulin-like growth factor-binding protein 3 (IGFBP3) regulates cell survival and apoptosis [40]. Potkin et al. identified an MAGI2 SNP associated with hippocampal atrophy using the ADNI data [42]. Perhaps more importantly, this study also identified a number of SNPs that had not yet been associated with AD in conventional GWAS studies. Thus, this study exposes several potential candidate genes that could be explored in future replication samples.
This study had several methodological and technical advantages over other imaging genetics studies in addition to the above findings. (1) To our knowledge this is the first genome-wide study to explore how SNP-SNP interactions influence cingulate amyloid burden, measured using florbetapir PET scan information. (2) Using continuous quantitative traits as phenotypes confers higher statistical power than using conventional clinical status. (3) The sample in this study included HC, EMCI, LMCI, and AD, thus providing a continuous and wide spectrum of the disease progression in the dataset. (4) Our approach modeled, rather than ignored, potential confounding factors including age, sex, diagnosis, and the previously identified risk genes APOE and BCHE, and thereby provided a more accurate estimate of the interaction effects on amyloid burden. (5) CSF data were used in this study to cross-check the identified interactions, which had the potential to serve as an indirect validation strategy or provide complementary information.
Our study has several limitations. (1) We used the single marker main effect p value to select SNPs for interaction analysis, which could miss significant interactions between SNPs with insignificant main effects. (2) The small cell size in the interaction analyses might introduce false positives. (3) Our approach is mostly data-driven, without utilizing any existing biological knowledge (e.g., pathways, networks, and other functional annotation data), which may reduce the statistical power and result interpretability.

Conclusions
We performed GWAS and GWIS using amyloid imaging as the quantitative phenotype and investigated the genetic interaction effects on cingulate amyloid-beta (Aβ) load. The single marker analyses revealed significant hits within or proximal to the APOE, APOC1, and TOMM40 genes, which were previously implicated in AD. The interaction analyses yielded a few novel interaction findings associated with cingulate amyloid burden, such as those between CLSTN2 and FHIT, between TACC2 and PRNP, between TACC2 and IGFBP3, and between BCR and MAGI2. Each of these SNP pairs demonstrated significant interaction effects while their individual main effects were not prominent. This suggests that searching for interaction effects may help solve the problem of missing heritability to some extent. Future studies should attempt to replicate these results in independent datasets with neuroimaging and genetic data, as they become available. Additional pathway analysis and gene set enrichment analysis could be performed to help understand the genetic interactions between SNPs on amyloid imaging phenotypes and potentially provide critical functional evidence in support of the statistical association findings.