Copy number variants (CNVs) are DNA sequence alterations, resulting in gains (duplications) and losses (deletions) of genomic segments. They often overlap genes and may play important roles in disease. Only one published study has examined CNVs in late-onset Alzheimer's disease (AD), and none have examined mild cognitive impairment (MCI). CNV calls were generated in 288 AD, 183 MCI, and 184 healthy control (HC) non-Hispanic Caucasian Alzheimer's Disease Neuroimaging Initiative participants. After quality control, 222 AD, 136 MCI, and 143 HC participants were entered into case/control association analyses, including candidate gene and whole genome approaches. Although no excess CNV burden was observed in cases (AD and/or MCI) relative to controls (HC), gene-based analyses revealed CNVs overlapping the candidate gene
Alzheimer’s disease (AD) is the most common cause of dementia and accounts for 50–80% of dementia cases. Currently, an estimated 5.3 million Americans have AD, the seventh leading cause of death in the United States. The hallmark abnormalities of AD are deposits of the protein fragment amyloid
Genetic factors play a key role in the development and progression of AD. AD has a high heritability, with 58–79% of phenotypic variation estimated to be caused by genetic factors [
Copy number variants (CNVs) are segments of DNA, ranging from 1 kilobase (kb) to several megabases (Mb), for which differences in the number of copies have been revealed by comparison of two or more genomes. These differences can be copy number gains (duplications or insertional transpositions), losses (deletions), gains or losses of the same locus, or multiallelic or complex rearrangements. CNVs have been implicated in various neuropsychiatric disorders such as autism and schizophrenia [
In the present report, we conducted a preliminary CNV analysis using genotype data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohort to examine the role of CNVs in susceptibility to MCI and late-onset AD (LOAD). ADNI is an ongoing multiyear public-private partnership to test whether serial magnetic resonance imaging (MRI), positron emission tomography (PET), genetic factors such as single nucleotide polymorphisms (SNPs) and CNVs, other biological markers, and clinical and neuropsychological assessments can be combined to improve early diagnosis and predict progression of MCI and early AD. Here, we used the genome-wide array data acquired on the ADNI cohort to determine whether AD and MCI participants (cases) showed an excess burden of CNVs relative to controls and to characterize any genomic regions where CNVs were detected in cases but not controls.
The ADNI was launched in 2004 by the National Institute on Aging (NIA), the National Institute of Biomedical Imaging and Bioengineering (NIBIB), the Food and Drug Administration (FDA), private pharmaceutical companies, and nonprofit organizations, as a $60 million, multiyear public-private partnership. The Principal Investigator of this initiative is Michael W. Weiner, M.D., VA Medical Center and University of California—San Francisco. ADNI is the result of efforts of many coinvestigators from a broad range of academic institutions and private corporations. Presently, more than 800 participants, aged 55 to 90, have been recruited from over 50 sites across the US and Canada, including approximately 200 cognitively normal older individuals (i.e., healthy controls or HCs) to be followed for 3 years, 400 patients diagnosed with MCI to be followed for 3 years, and 200 patients diagnosed with early AD to be followed for 2 years [
Participants in the present report included 655 non-Hispanic Caucasian individuals from the ADNI cohort who had DNA samples extracted from peripheral blood. Those with DNA samples derived from cell lines were excluded from the present analysis because cell line transformation might influence CNV results [
Blood samples from each participant were obtained and sent to Pfizer for DNA extraction and were also banked at The National Cell Repository for Alzheimer's Disease (NCRAD;
Normalized bead intensity data for each sample were loaded into GenomeStudioV2009.1 software (Illumina, Inc., CA) along with the manufacturer’s cluster file to generate SNP genotypes. The Log R Ratio (LRR) and B Allele Frequency (BAF) values computed from the signal intensity files by GenomeStudio for each sample were exported and used for the generation of CNV calls. Initial genotyping was performed by TGen using BeadStudio software (Illumina, Inc., CA). In January 2010, we reprocessed the array data using GenomeStudioV2009.1, and this data set will be made available on the ADNI website in a follow-up data release.
The two alleles of an SNP are designated as allele A and allele B. GenomeStudio software uses a five-step six-degree of freedom affine transformation to normalize signal intensity values of the A and B alleles (referred to as X and Y). The normalized values are then transformed to a polar coordinate plot of normalized intensity
Linear interpolation of the
The BAF for a sample shows the
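The polar transformation, LRR, and BAF described above can be illustrated with a minimal Python sketch. It follows the formulas commonly given for Illumina arrays (R = X + Y, theta scaled to the range 0 to 1, LRR as a log2 ratio of observed to expected R, and BAF as a piecewise linear interpolation between the AA, AB, and BB cluster positions). It is not the GenomeStudio implementation; the function names and the cluster positions used in the example are assumptions made for illustration only.

```python
import math


def polar_transform(x, y):
    """Convert normalized A (x) and B (y) allele intensities to (R, theta)."""
    r = x + y                                   # total normalized intensity
    theta = (2.0 / math.pi) * math.atan2(y, x)  # 0 = A signal only, 1 = B signal only
    return r, theta


def log_r_ratio(r_observed, r_expected):
    """LRR: log2 ratio of the observed to the expected total intensity."""
    return math.log2(r_observed / r_expected)


def b_allele_frequency(theta, theta_aa, theta_ab, theta_bb):
    """BAF: theta interpolated linearly between the three genotype clusters."""
    if theta < theta_aa:
        return 0.0
    if theta < theta_ab:
        return 0.5 * (theta - theta_aa) / (theta_ab - theta_aa)
    if theta < theta_bb:
        return 0.5 + 0.5 * (theta - theta_ab) / (theta_bb - theta_ab)
    return 1.0


if __name__ == "__main__":
    # Hypothetical cluster positions for one SNP (illustrative values only).
    clusters = dict(theta_aa=0.05, theta_ab=0.50, theta_bb=0.95)
    r, theta = polar_transform(0.5, 0.5)
    print(r, theta, log_r_ratio(r, 1.0), b_allele_frequency(theta, **clusters))
```

In this representation, a deletion tends to lower the LRR and remove the heterozygous BAF band near 0.5, whereas a duplication tends to raise the LRR and split that band.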
CNV calls were generated for the 655 non-Hispanic Caucasian participants whose DNA was derived from peripheral blood. PennCNV software (2009Aug27 version) (
Analyses were also restricted to autosomes due to the complications of hemizygosity in males and X-chromosome inactivation in females. Finally, to ensure only high-confidence CNVs were included in the analysis, CNVs for which the difference of the log likelihood of the most likely copy number state and less likely copy number state was less than 10 (generated using the confidence function in PennCNV), CNVs that were called based on data from fewer than 10 SNPs, and CNVs that had >50% overlap with centromeric, telomeric, and immunoglobulin regions as defined in Need et al. [
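The call-level filters described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the PennCNV workflow itself: the `CnvCall` record, the function names, and the example coordinates of the excluded region are hypothetical, and excluded regions are assumed not to overlap one another.

```python
from dataclasses import dataclass

AUTOSOMES = {str(c) for c in range(1, 23)}


@dataclass(frozen=True)
class CnvCall:
    chrom: str         # chromosome name without a "chr" prefix
    start: int         # first base of the call (bp, inclusive)
    end: int           # last base of the call (bp, inclusive)
    num_snps: int      # number of SNPs supporting the call
    confidence: float  # log-likelihood difference between the top two states


def overlap_fraction(call, regions):
    """Fraction of the call's length covered by excluded regions."""
    length = call.end - call.start + 1
    covered = 0
    for chrom, r_start, r_end in regions:
        if chrom == call.chrom:
            covered += max(0, min(call.end, r_end) - max(call.start, r_start) + 1)
    return covered / length


def passes_qc(call, excluded_regions, min_conf=10.0, min_snps=10, max_overlap=0.5):
    """Keep autosomal, well-supported calls outside the excluded regions."""
    return (
        call.chrom in AUTOSOMES
        and call.confidence >= min_conf
        and call.num_snps >= min_snps
        and overlap_fraction(call, excluded_regions) <= max_overlap
    )


# Hypothetical excluded region and calls, for illustration only.
excluded = [("1", 121_000_000, 125_000_000)]
calls = [
    CnvCall("1", 1_000_000, 1_200_000, 25, 32.5),      # kept
    CnvCall("1", 5_000_000, 5_050_000, 6, 40.0),       # too few SNPs
    CnvCall("X", 2_000_000, 2_300_000, 40, 55.0),      # not autosomal
    CnvCall("1", 120_500_000, 122_500_000, 30, 50.0),  # >50% in excluded region
]
kept = [c for c in calls if passes_qc(c, excluded)]
```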
Case/control association analyses using CNV calls generated for the AD, MCI, and HC participants were performed using PLINK v1.07 [
Representative plots of CNV calls (Figure
Examples of candidate genes (a)
Representative image of B Allele Frequency and Log R Ratio of the participant who had a duplication at 16p11.2. The purple-shaded portion indicates the duplicated region (Human Genome Build 36.1).
The sample demographics and CNV call characteristics of the 501 participants who passed all QC checks are shown in Tables
Sample demographics.
Current diagnosis | Alzheimer's disease | Mild cognitive impairment | Healthy controls | P value |
---|---|---|---|---|
Number of participants | 222 | 136 | 143 | — |
Gender (Males/Females) | 133/89 | 87/49 | 82/61 | not significant |
Baseline age (Mean ± SD) | | | | not significant |
Years of education (Mean ± SD) | | | | 0.009 |
 | 73/149 | 70/66 | 108/35 | <0.001 |
Age of onset (Mean ± SD) | | — | — | — |
Characteristics of CNV calls in the three diagnostic groups.
Measure | Alzheimer's disease (n = 222) | Mild cognitive impairment (n = 136) | Healthy controls (n = 143) |
---|---|---|---|
Deletions: | |||
Number of CNVs | 2128 | 1340 | 1278 |
Rate per participant | 9.59 | 9.85 | 8.94 |
Average size (kb) | 73.24 | 76.32 | 79.38 |
Duplications: | |||
Number of CNVs | 886 | 498 | 607 |
Rate per participant | 3.99 | 3.66 | 4.24 |
Average size (kb) | 157.24 | 154.06 | 170.30 |
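The per-participant rates in the table above are the number of CNV calls divided by the number of participants in each group. The short Python check below recomputes them from the tabulated counts and group sizes; the variable and function names are ours and purely illustrative.

```python
# Group sizes and CNV counts as reported in the tables above.
group_sizes = {"AD": 222, "MCI": 136, "HC": 143}
cnv_counts = {
    "deletions": {"AD": 2128, "MCI": 1340, "HC": 1278},
    "duplications": {"AD": 886, "MCI": 498, "HC": 607},
}


def rate_per_participant(n_calls, n_participants):
    """Average number of CNV calls per participant in a group."""
    return n_calls / n_participants


rates = {
    kind: {
        group: round(rate_per_participant(n, group_sizes[group]), 2)
        for group, n in counts.items()
    }
    for kind, counts in cnv_counts.items()
}

for kind, per_group in rates.items():
    print(kind, per_group)
```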
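Participants grouped by CNV call size. Values are numbers of participants (% of diagnostic group).
Call size | Alzheimer's disease (n = 222): deletions | Alzheimer's disease (n = 222): duplications | Mild cognitive impairment (n = 136): deletions | Mild cognitive impairment (n = 136): duplications | Healthy controls (n = 143): deletions | Healthy controls (n = 143): duplications |
---|---|---|---|---|---|---|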
0.1–0.5 Mb | 174 (78.38) | 183 (82.43) | 104 (76.47) | 100 (73.53) | 114 (79.72) | 120 (83.92) |
0.5–1.0 Mb | 6 (2.70) | 27 (12.16) | 8 (5.88) | 18 (13.24) | 8 (5.94) | 27 (18.88) |
1.0–1.5 Mb | 0 (0.00) | 8 (3.60) | 0 (0.00) | 4 (2.94) | 2 (1.40) | 8 (5.59) |
1.5–2.0 Mb | 0 (0.00) | 2 (0.90) | 0 (0.00) | 1 (0.74) | 1 (0.70) | 0 (0.00) |
>2.0 Mb | 1 (0.45) | 1 (0.45) | 0 (0.00) | 0 (0.00) | 0 (0.00) | 0 (0.00) |
We identified regions overlapping 294 AD candidate genes with CNV calls from at least one case (AD and/or MCI) but no controls (HC). As expected, cell sizes were very small in each group leading to low power. Resulting CNV calls along with
Genes that have CNV calls from at least one Alzheimer's disease (AD) and/or one mild cognitive impairment (MCI) participant and no healthy controls using the candidate gene approach.
Chromosome | Region | Start (bp) | End (bp) | Number of AD participants | APOE genotype (AD) | Age at onset (a) | Number of MCI participants | APOE genotype (MCI) |
---|---|---|---|---|---|---|---|---|
5 | | 145949260 | 146441226 | 1 | e3/e3 | N/A | 0 | — |
6 | | 16407321 | 16869700 | 1 | e3/e4 | 83 years | 0 | — |
7 | | 77484309 | 78920826 | 1 | e2/e4 | N/A | 0 | — |
7 | | 102899472 | 103417198 | 0 | — | — | 1 | e4/e4 |
9 | | 103371455 | 103540683 | 1 | e3/e3 | 74 years | 0 | — |
10 | | 68355797 | 68530873 | 1 | e3/e3 | 55 years | 0 | — |
10 | | 90963305 | 91001640 | 0 | — | — | 1 (b) | e3/e3 |
12 | | 61324030 | 61614932 | 1 | e2/e3 | N/A | 1 | e2/e3 |
15 | | 28440734 | 28473156 | 2 | e3/e3, e3/e4 | N/A, N/A | 2 | e3/e3, e3/e4 |
15 | | 56675801 | 56829469 | 1 | e3/e3 | N/A | 1 (b) | e3/e3 |
21 | | 33782107 | 33785893 | 1 | e3/e4 | 74 years | 0 | — |
21 | | 36458708 | 36588442 | 0 | — | — | 1 | e3/e4 |
22 | | 22706138 | 22714284 | 1 | e3/e3 | 59 years | 0 | — |
(a) Age at onset of AD symptoms, available only for participants with a baseline diagnosis of AD; N/A: not available.
(b) The same participant had CNV calls overlapping the two genes.
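The selection rule behind the table above (genes overlapped by a CNV call in at least one case and in no controls) can be sketched in Python. This is a minimal illustration of interval overlap plus a group check, not the PLINK procedure used in the analysis; the gene names, coordinates, and calls below are hypothetical.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two closed intervals on the same chromosome share a base."""
    return a_start <= b_end and b_start <= a_end


def case_only_genes(genes, calls, control_label="HC"):
    """Genes overlapped by at least one case call and by no control call.

    genes: {name: (chrom, start, end)}
    calls: iterable of (group, chrom, start, end), group in {"AD", "MCI", "HC"}
    """
    selected = []
    for name, (g_chrom, g_start, g_end) in genes.items():
        groups = {
            group
            for group, chrom, start, end in calls
            if chrom == g_chrom and overlaps(start, end, g_start, g_end)
        }
        if groups and control_label not in groups:
            selected.append(name)
    return selected


# Hypothetical genes and CNV calls, for illustration only.
genes = {
    "GENE_A": ("5", 1_000, 2_000),
    "GENE_B": ("7", 5_000, 9_000),
    "GENE_C": ("9", 100, 400),
}
calls = [
    ("AD", "5", 1_500, 3_000),   # overlaps GENE_A
    ("MCI", "7", 4_000, 5_500),  # overlaps GENE_B
    ("HC", "7", 8_000, 9_500),   # control call also overlaps GENE_B
]
print(case_only_genes(genes, calls))
```

Here only the first hypothetical gene is retained: the second is also overlapped by a control call, and the third has no overlapping calls at all.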
We also identified CNV calls present in cases (AD and/or MCI) but not controls (HC) within regions overlapping 17,938 genes. There was no significant (
Significant (uncorrected
Chromosome | Region | Start (bp) | End (bp) | Number of AD calls | P (AD) | Number of MCI calls | P (MCI) |
---|---|---|---|---|---|---|---|
8 | | 2780281 | 4839736 | 9 | 0.0114 | 4 | 0.0556 |
1 | | 12829847 | 12831165 | 6 | 0.0493 | 4 | 0.0549 |
11 | | 107166926 | 107234864 | 5 | 0.0820 | 6 | 0.0120 |
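To give a sense of how a small number of case-only calls translates into a nominal P value, the sketch below computes a one-sided Fisher's exact test from a 2 × 2 table of carriers and non-carriers. This is a swapped-in illustration, not the PLINK analysis reported here, so it need not reproduce the tabulated values. It also assumes that each call comes from a different participant, which the table (reporting calls, not participants) does not guarantee.

```python
from math import comb


def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """P(at least `case_carriers` carriers among cases) under the null
    hypothesis that carriers are distributed at random across groups."""
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    upper = min(carriers, n_cases)
    # Hypergeometric tail: sum over tables at least as extreme as observed.
    numerator = sum(
        comb(n_cases, k) * comb(n_controls, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return numerator / comb(total, carriers)


# Example: 9 carriers among 222 AD participants and none among 143 controls
# (assumes the 9 calls on chromosome 8 belong to 9 distinct participants).
p_example = fisher_one_sided(9, 222, 0, 143)
print(f"one-sided Fisher exact P = {p_example:.4f}")
```

An exact test of this kind is deterministic, whereas a permutation-based P value varies slightly with the number of permutations, so small differences between the two approaches are expected.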
The present report represents an initial analysis of CNVs in the ADNI dataset and is the first CNV analysis of patients with MCI. After extensive QC, we analyzed CNV calls generated in cases (AD and MCI) compared to controls (HC), using whole genome and candidate gene association approaches.
Comparison of the CNV calls between the three diagnostic groups showed no excess CNV burden (rate of calls) in AD and MCI participants compared to controls. This is consistent with previously published results [
A case/control association analysis was then performed using a candidate gene approach and a whole genome approach to determine if there was an excess of CNV calls partially overlapping genes in AD or MCI participants relative to controls, suggesting potential involvement of these genes in AD or MCI susceptibility.
The candidate gene approach revealed several interesting genes (Table
The whole genome approach revealed three genes at uncorrected
We also identified CNVs overlapping two candidate genes associated with neuropsychiatric disorders:
The ADNI cohort provides a unique opportunity for discovery analyses such as this initial CNV analysis. With multiple types of potential biomarkers, including structural and molecular imaging, blood and CSF markers, genetic information, and behavioral data, analysis of the ADNI data has the potential to enhance knowledge of the underlying mechanisms leading to MCI and to AD.
The present study has several limitations related to participant inclusion and exclusion and to the software and algorithms used in the analyses. CNV calls in the present report were generated from DNA samples derived only from peripheral blood; 78 participants whose DNA was derived from lymphoblastoid cell lines (LCLs) were excluded. LCLs are generated by transforming peripheral B lymphocytes with the Epstein-Barr virus (EBV). EBV-transformed cells have been shown to have significant telomerase activity and to develop aneuploidy, along with other cellular changes such as gene mutations and reprogramming in the postimmortal cellular stage of transformation [
Another limitation is that the CNV calls analyzed in the current study were generated using only one software program (PennCNV). Several detection algorithms including HMMs, segmentation algorithms,
The heterogeneity of the MCI group of participants also represents a possible limitation of the present study. Although biomarkers such as CSF and PiB-PET can help differentiate MCI participants who have an AD-like profile from those who have a normal profile, these data were available for only a small number of ADNI-1 participants, which would have limited the power to detect differences in CNVs. In the next phases of the project (ADNI-GO and ADNI-2), all subjects will have CSF and amyloid PET data, enabling further examination of this issue.
In sum, we have conducted an initial CNV analysis in the ADNI cohort dataset. Although no excess CNV burden was found in cases relative to controls, a number of interesting candidate genes and regions were identified. Replication in larger samples will be critical to confirm these findings. Additional region-based analyses may help elucidate the role of these CNVs, and deep resequencing studies may be warranted for some of these regions if they replicate in other cohorts.
Dr. L. Shen receives support from an NIBIB R03 EB008674 Grant and an Indiana CTSI award (IUSM/CTR based on NCRR RR025761). Dr. T. Foroud receives support from an NIH/NIA 5U24AG021886 Grant and U01 AG032984 Alzheimer’s Disease Genetics Consortium (ADGC) (PI: Schellenberg). Dr. S. G. Potkin receives Grant support from the Transdisciplinary Imaging Genetics Center (TIGC) P20 RR020837-01 and NIH/NCRR U24 RR021992 and serves on the Editorial Board of Brain Imaging and Behavior. Dr. M. J. Huentelman receives Grant support from NIH-NINDS R01 N5059873 (awarded to Dr. M. J. Huentelman). Dr. M. W. Weiner serves on the scientific advisory boards for Alzheimer’s Study Group, Bayer Schering Pharma, Eli Lilly and Company, CoMentics Inc., Neurochem Inc., SIRA UCSD, Eisai Inc., Avid Radiopharmaceuticals Inc., Aegis therapies, Genentech Inc., Allergen Inc., Lippincott Williams and Wilkins, Bristol-Myers Squibb, Forest Laboratories, Pfizer Inc., McKinsey and Company, Mitsubishi Tanabe Pharma Corporation, and Novartis; has received funding for travel from Nestle and Kenes International and to attend conferences not funded by industry; serves on the Editorial Board of Alzheimer’s and Dementia, and Brain Imaging and Behavior; has received honoraria from Rotman Research Institute and BOLT International; receives research support from NIH, DOD, VA, Merck and Avid; holds stock in Synarc and Elan Corporation. Dr. A. J. Saykin receives support from the NIH (R01 CA101318, R01 AG19771, RC2 AG036535-01, P30 AG10133-18S1, U01 AG032984, and U01 AG032984 Alzheimer’s Disease Genetics Consortium (ADGC) (PI: Schellenberg)), and investigator-initiated research support from Siemens Medical Solutions and Welch Allyn, Inc. He serves as Editor-in-Chief of
Data used in the preparation of this article were obtained from the ADNI database (
Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI) (NIH Grant U01 AG024904). ADNI is funded by the NIA, NIBIB, and through generous contributions from the following: Abbott, AstraZeneca AB, Bayer Schering Pharma AG, Bristol-Myers Squibb, Eisai Global Clinical Development, Elan Corporation, Genentech, GE Healthcare, GlaxoSmithKline, Innogenetics, Johnson and Johnson, Eli Lilly and Co., Medpace, Inc., Merck and Co., Inc., Novartis AG, Pfizer Inc, F. Hoffman-La Roche, Schering-Plough, Synarc, Inc., and Wyeth, as well as nonprofit partners such as the Alzheimer's Association and Alzheimer's Drug Discovery Foundation, with participation from the US Food and Drug Administration. Private sector contributions to ADNI are facilitated by the Foundation for the National Institutes of Health (