University of Groningen

Identification of Susceptibility Genes of Adult Asthma in French Canadian Women

Susceptibility genes of asthma may be more successfully identified by studying subgroups of phenotypically similar asthma patients. This study aims to identify single nucleotide polymorphisms (SNPs) associated with asthma in French Canadian adult women. A pooling-based genome-wide association study was performed in 240 allergic asthmatic and 120 allergic nonasthmatic women. The top associated SNPs were selected for individual genotyping in an extended cohort of 349 asthmatic and 261 nonasthmatic women. The functional impact of asthma-associated SNPs was investigated in a lung expression quantitative trait loci (eQTL) mapping study (n = 1035). Twenty-one of the 38 SNPs tested by individual genotyping showed P values lower than 0.05 for association with asthma. Cis-eQTL analyses supported the functional contribution of rs17801353 associated with C3AR1 (P = 7.90E−10). The asthma risk allele for rs17801353 is associated with higher mRNA expression levels of C3AR1 in lung tissue. In silico functional characterization of the asthma-associated SNPs also supported the contribution of C3AR1 and additional genes including SYNE1, LINGO2, and IFNG-AS1. This pooling-based GWAS in French Canadian adult women followed by lung eQTL mapping suggested C3AR1 as a functional locus associated with asthma. Additional susceptibility genes were suggested in this homogeneous subgroup of asthma patients.


Introduction
Substantial efforts have been deployed to discover the genetic variants associated with asthma [1,2]. Asthma constitutes a considerable burden for individuals and health services, with more than 300 million persons affected worldwide [3]. Various approaches have been used to identify asthma risk loci, and genome-wide association studies (GWASs) have discovered the most robust genetic associations [4,5]. This genomic approach still requires substantial resources in terms of sample size and genotyping. To reduce the genotyping burden, an alternative methodology has been developed, which consists of a GWAS on pooled DNA samples (pooled GWAS) followed by further validation of the top associations by individual genotyping. This approach has been shown to be effective for several complex traits [6]. It has also been applied to asthma, where it confirmed known asthma loci and led to the identification of new ones [7,8].
GWAS SNPs discovered so far account for a relatively low percentage of the total asthma heritability. Studying more homogeneous subgroups of asthma patients is likely to reveal part of this "missing heritability." The rationale is that individuals within subgroups are more likely to share the same underlying molecular basis. It is known that asthma prevalence differs between men and women throughout life [9]. This sex difference is not completely understood. Studying the genetics of asthma by gender and age groups is thus important [10].
In this study, we used a pooled GWAS to detect SNPs associated with asthma in French Canadian women enrolled in the Quebec City Case-Control Asthma Cohort (QCCCAC). Confirmation by individual genotyping was followed by analyses of expression quantitative trait loci (eQTL) in human lung tissues, in silico analyses for functional prediction, and validation in a second collection of French Canadian women.

Methods
The experimental design is summarized in Figure 1.

Participants.
Cases and controls included in this study are part of the Quebec City Case-Control Asthma Cohort [11]. Briefly, the QCCCAC consists of unrelated adults of self-reported European ancestry recruited at the research center of the Institut Universitaire de Cardiologie et de Pneumologie de Québec (IUCPQ). All research participants were ≥18 years old at enrollment. Individuals with chronic obstructive pulmonary disease (COPD), body mass index >40 kg/m², and/or any systemic inflammatory disease were excluded. Individuals with self-reported genetic relatedness were also excluded. Participants provided written informed consent and the study was approved by the ethics committee of the IUCPQ. For the current study, only women were investigated and the asthma diagnosis was confirmed by a respirologist based on clinical symptoms, lung function, and airway responsiveness. Twenty-five inhalant allergens were evaluated by skin-prick tests to determine allergic status. Participants were considered atopic if at least one allergen caused a wheal diameter of at least 3 mm at 10 min in the presence of a negative saline control and a positive histamine response. Asthma-associated SNPs were tested for replication in women participants of the Saguenay-Lac-Saint-Jean (SLSJ) asthma family collection (n = 353 [...] Figure 1). This SNP array interrogated 730,525 SNPs throughout the genome. The SNP array probes' intensities were formatted and analyzed by the GenePool software [12]. GenePool ranked SNPs by their likelihood of being genetically associated with asthma. To do so, the Manhattan distance method and the silhouette score clustering method were used as implemented in GenePool. Each SNP was assigned a silhouette score that ranged from 0 to 1. A silhouette score of 1 indicates that the allele frequencies for a specific SNP are unequivocally different between cases and controls. Further details are provided in Supplementary Material.
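To make the ranking statistic concrete, the following is a minimal sketch of a mean silhouette score computed with Manhattan distance between two groups of pooled allele-frequency estimates. It is not GenePool's implementation: the function names, the replicate values, and the use of the textbook silhouette definition (which ranges from −1 to 1, whereas the scores reported here range from 0 to 1) are all assumptions made for illustration.

```python
def manhattan(u, v):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def mean_silhouette(cases, controls):
    """Mean textbook silhouette over all pool replicates of one SNP.

    Each group needs at least two replicates. Values near 1 mean the
    case and control replicates form two clearly separated clusters.
    """
    groups = (cases, controls)
    scores = []
    for gi, group in enumerate(groups):
        other = groups[1 - gi]
        for i, point in enumerate(group):
            within = [manhattan(point, q) for j, q in enumerate(group) if j != i]
            a = sum(within) / len(within)  # mean distance to own group
            b = sum(manhattan(point, q) for q in other) / len(other)  # to other group
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical allele-frequency estimates from replicate pools (not study data).
separated = mean_silhouette([(0.60,), (0.62,), (0.61,)],
                            [(0.40,), (0.41,), (0.39,)])
overlapping = mean_silhouette([(0.50,), (0.52,)],
                              [(0.51,), (0.49,)])
print(f"separated: {separated:.3f}, overlapping: {overlapping:.3f}")
```

A SNP whose case and control pools give clearly different allele frequencies scores close to 1, while a SNP with interleaved estimates scores low, which is the property the ranking relies on.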

Individual Genotyping and Genetic Association Tests.
Following the pooled GWAS, a total of 43 SNPs were selected for validation by individual genotyping. First, the 20 SNPs with the best silhouette scores were included. Then, SNPs ranked in the top 2000 of the pooled GWAS and found in or near (within 50 kb of) genes previously associated with asthma, COPD, or related phenotypes in candidate gene studies and GWAS were selected [2,13]. This resulted in 23 additional SNPs for validation by individual genotyping. Selected SNPs (n = 43) were genotyped in an extended cohort of 349 asthmatic women and 261 nonasthmatic women derived from the QCCCAC. This extended cohort contained the women used in the pooled GWAS. Genotyping and quality controls are described in Supplementary Material. After quality control, 38 out of the 43 SNPs remained. Association tests were done using additive logistic regression models as implemented in PLINK v1.07 [14]. Analyses were first performed in allergic case (n = 299) and allergic control (n = 154) women to mirror the selection of individuals used in the pooling-based GWAS. Analyses were then repeated in all available case (n = 349) and control (n = 261) women, with or without allergy, to increase sample size and evaluate the specificity of genetic signals. A P value <0.05 was considered suggestive evidence of association. A P value that passed Bonferroni correction (0.05/38 = 0.0013) was considered statistically significant.
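As an illustration of what an additive logistic regression test does, the sketch below fits logit(P(case)) = b0 + b1 × dosage by Newton-Raphson, where dosage is the number of copies (0, 1, or 2) of the tested allele, and reports a Wald P value. It is a simplified stand-in, not PLINK's code; the function name and the genotype counts are invented for the example.

```python
import math


def fit_additive_logistic(dosages, is_case, n_iter=25):
    """Single-SNP additive logistic regression fitted by Newton-Raphson.

    Returns (beta, odds ratio per allele, two-sided Wald P value).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(dosages, is_case):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p          # gradient w.r.t. intercept
            g1 += (y - p) * x    # gradient w.r.t. allele effect
            h00 += w             # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta += information^-1 * gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (-h01 * g0 + h00 * g1) / det
    # Variance of b1 from the inverse information (last iteration, at convergence).
    se_b1 = math.sqrt(h00 / det)
    z = b1 / se_b1
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return b1, math.exp(b1), p_value


# Toy genotype counts (made up for illustration, not study data).
case_dosages = [0] * 20 + [1] * 50 + [2] * 30
control_dosages = [0] * 40 + [1] * 45 + [2] * 15
dosages = case_dosages + control_dosages
status = [1] * len(case_dosages) + [0] * len(control_dosages)

beta, odds_ratio, p_value = fit_additive_logistic(dosages, status)
bonferroni = 0.05 / 38  # threshold used in this study for 38 SNPs
print(f"OR per allele = {odds_ratio:.2f}, P = {p_value:.2g}, "
      f"significant after Bonferroni: {p_value < bonferroni}")
```

Under this coding, an odds ratio above 1 means each additional copy of the tested allele increases the odds of asthma.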

eQTL and In Silico Analyses.
The functional effects of the asthma-associated SNPs were determined by examining their effect on gene expression in human lung tissue. The lung expression quantitative trait loci (eQTL) mapping study has been previously described [15,16]. Briefly, 1111 lung specimens were obtained from patients who underwent lung resection at three sites: Laval University (Quebec City, Canada), University of Groningen (Groningen, Netherlands), and University of British Columbia (Vancouver, Canada). Whole-genome gene expression and genotyping data were obtained from these specimens. The genotyping platform used in the eQTL dataset is the Illumina Human1M-Duo BeadChip. Imputation was performed with SHAPEIT v2 and IMPUTE v2 using the 1000 Genomes Project phase 1 data as a reference set. In the present study, SNPs associated with asthma, including SNPs in linkage disequilibrium (LD), were tested for association with gene expression in the Laval dataset (n = 407). Replication was then performed in the Groningen (n = 341) and UBC (n = 287) cohorts. In silico analyses were also performed to investigate the functional impact of asthma-associated SNPs. The tools used in this study were Combined Annotation Dependent Depletion (CADD) V1.3 [17], SNP Function Prediction (FuncPred) [18], RegulomeDB V1.1 [19], and Haploreg V4 [20]. Further details are provided in Supplementary Material.

Results

Subjects.
A total of 240 asthma patients and 120 controls were considered in the pooled GWAS. All subjects were atopic women, with a mean age of 34.0 ± 13.9 years for cases and 35.2 ± 14.3 years for controls. Individual genotyping was performed in an extended cohort of 349 asthmatic women and 261 controls. Clinical characteristics are summarized in Table 1.

Pooled GWAS.
Genotyping data from DNA pools were analyzed with the GenePool software. All SNPs were ranked according to the likelihood of being genetically associated with asthma using the silhouette score (Figure 2). The top 20 SNPs selected for individual genotyping are shown in Table 2 and ranked by the silhouette score. One SNP, rs17093106, failed the Illumina design assay step for individual genotyping and was replaced by the 21st top SNP, rs17500510. The SNP rs12418753 was ranked first with a silhouette score of 0.709. The silhouette scores in Table 2 ranged from 0.636 to 0.709. All SNPs were located in noncoding regions except SNP rs12881815, located in an exon of the SYNE2 gene. None of these SNPs had been previously associated with asthma. Twenty-three additional SNPs located near or within asthma candidate genes were well ranked (i.e., top 2000) and selected for individual genotyping (Table 3).
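The two-tier selection described here (the best-ranked SNPs, plus well-ranked SNPs lying in or within 50 kb of a candidate gene) can be sketched as follows. The data classes, function names, identifiers, and coordinates are hypothetical and serve only to illustrate the logic; the toy call uses small cutoffs in place of 20 and 2000.

```python
from dataclasses import dataclass


@dataclass
class RankedSnp:
    snp_id: str
    rank: int   # 1 = best silhouette score in the pooled GWAS
    chrom: str
    pos: int


@dataclass
class Gene:
    name: str
    chrom: str
    start: int
    end: int


def near_gene(snp, gene, flank=50_000):
    """True if the SNP lies inside the gene or within `flank` bp of it."""
    return (snp.chrom == gene.chrom
            and gene.start - flank <= snp.pos <= gene.end + flank)


def select_for_genotyping(snps, candidate_genes, n_top=20, rank_cutoff=2000):
    """Return (top-ranked SNPs, additional well-ranked SNPs near candidate genes)."""
    ordered = sorted(snps, key=lambda s: s.rank)
    top = ordered[:n_top]
    top_ids = {s.snp_id for s in top}
    near = [s for s in ordered
            if s.rank <= rank_cutoff
            and s.snp_id not in top_ids
            and any(near_gene(s, g) for g in candidate_genes)]
    return top, near


# Hypothetical SNPs and one hypothetical candidate gene (not study data).
snps = [
    RankedSnp("snpA", 1, "1", 100),
    RankedSnp("snpB", 2, "1", 1_000_000),
    RankedSnp("snpC", 3, "2", 150_000),
    RankedSnp("snpD", 4, "2", 5_000_000),
    RankedSnp("snpE", 6, "2", 120_000),
]
genes = [Gene("GENE_X", "2", 100_000, 140_000)]

top, near = select_for_genotyping(snps, genes, n_top=2, rank_cutoff=5)
print([s.snp_id for s in top], [s.snp_id for s in near])
```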

Individual Genotyping.
From the 43 selected SNPs that were typed by individual genotyping, 38 passed quality controls. For the top 20 ranked SNPs in the pooled GWAS, 18 SNPs passed the quality controls. The genotyping assay failed for the only exonic SNP, rs12881815, and for rs12493799. For the 23 SNPs near candidate genes among the top 2000 ranked SNPs, 20 passed the quality controls.
The association with asthma was first tested in atopic women only to mirror the selection of women included in the pooled GWAS. Associations with asthma were suggestive for 29 SNPs out of 38 (76%) and 7 of them were significant after Bonferroni correction (Supplementary Table 1). All 7 SNPs were part of the top ranked SNPs in the pooled GWAS. For the SNPs located near candidate genes and among the top 2000 ranked SNPs, 13 showed suggestive evidence of association. Overall, only 9 SNPs had P values >0.05, ranging from 0.06 to 0.23.
The same 38 SNPs were then tested in an extended cohort containing all available case and control women genotyped in the QCCCAC. Twenty-one out of the 38 SNPs (55%) demonstrated suggestive associations with asthma (Table 4). Three SNPs were significant after correction for multiple testing, including two from the top ranked SNPs and one from the SNPs near candidate genes. Association tests were also performed in 355 men (213 cases, 142 controls) of the QCCCAC and all suggestive associations were stronger in women compared to men (Supplementary Figure 2). The two strongest associations involved two SNPs newly implicated in asthma, rs17655581 and rs7980829, located near SLC15A1 and in an intergenic region, respectively. In general, new SNPs identified in the top of the pooled GWAS showed stronger associations with asthma compared to those selected due to their proximity to candidate genes. However, three SNPs near candidate genes were part of the strongest associations. They were rs803010 near PTGDR and PTGER2, rs10932034 near ICOS, and rs17453235 near DPP10. Results for the 21 SNPs with P < 0.05 for association with asthma and those in LD are detailed in Supplementary Table 2, including information on the genomic context, the LD mapping, the pooled GWAS, the individual genotyping in all case and control women, the replication in the SLSJ family collection, and the in silico functional prediction.

Lung eQTL Analyses.
To investigate the potential function of SNPs associated with asthma in the QCCCAC, we analyzed a large-scale lung eQTL dataset [15]. SNPs in LD (n = 309) with the 21 asthma-associated SNPs were tested in the eQTL dataset. Fifty-five of these SNPs were genotyped in the lung eQTL dataset and the genotype information for the remaining SNPs was obtained by imputation. Each of them was tested against the expression levels of all noncontrol probesets interrogated by the gene expression microarray (n = 51,627). The most significant lung eQTL-SNP was rs75871129, which passed the Bonferroni-corrected threshold for multiple testing (3.1E−9) and was associated with a probeset interrogating C3AR1 (P = 4.06E−10). The SNP was in perfect LD with the asthma-associated SNP rs17801353, which had a similar association with C3AR1 mRNA levels (P = 7.90E−10). All the SNPs in this LD block (r² > 0.8) were also associated with mRNA expression of C3AR1 in the lung. Figure 3 illustrates the lung eQTL rs17801353-C3AR1 in Laval as well as replications in Groningen and UBC. rs17801353 is located in an intron of FOXJ2, 7.6 kb downstream of C3AR1.
The eQTL was significant and showed the same direction of effect in the two replication cohorts (P < 0.05). The asthma risk allele for rs17801353 in the QCCCAC corresponds to the allele associated with higher mRNA expression levels for C3AR1, suggesting that upregulation of this gene may increase asthma susceptibility. The most significant lung eQTL (P < 10⁻⁵) are shown in Supplementary Table 3.
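The multiple-testing threshold reported for the lung eQTL analysis is consistent with a Bonferroni correction over every SNP-probeset pair. The short sketch below reproduces that arithmetic under the assumption that all 309 LD SNPs were tested against all 51,627 probesets; the function and variable names are illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


n_snps = 309          # LD SNPs tested in the eQTL dataset
n_probesets = 51_627  # noncontrol probesets on the expression array
n_tests = n_snps * n_probesets

eqtl_threshold = bonferroni_threshold(0.05, n_tests)
print(f"{n_tests} tests, threshold = {eqtl_threshold:.2e}")

# P values reported for the C3AR1 probeset.
reported = {"rs75871129": 4.06e-10, "rs17801353": 7.90e-10}
passing = {snp: p < eqtl_threshold for snp, p in reported.items()}
print(passing)
```

Both reported C3AR1 associations fall below this threshold by roughly an order of magnitude.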

In Silico Functional Characterization.
We investigated whether any of the LD SNPs were potentially implicated in regulatory mechanisms using bioinformatics tools. First, we employed CADD to score the deleteriousness of the 309 LD SNPs. A scaled C-score higher than 10 was obtained for twenty-five of them, indicating that they are predicted to be in the 10% most deleterious genetic variants of the human genome. Of these 25 SNPs, six are part of an LD block located in an intronic region of the gene LINGO2 on chromosome 9p21.1. SNP rs2295190, located on chromosome 6q25.1, showed the greatest scaled C-score (24.5), implying that it is among the top 0.5% most deleterious SNPs of the human genome. This SNP is in LD with rs6934016, located near the ESR1 asthma susceptibility gene. The scaled C-score of each LD SNP is shown in Supplementary Table 2.
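The percentages quoted for scaled C-scores follow from CADD's PHRED-like scaling, under which a scaled score s places a variant in the top 10^(−s/10) fraction of all scored variants. A minimal sketch of that conversion (the function name is illustrative, not part of CADD):

```python
def top_fraction_from_scaled_cscore(scaled_score):
    """Fraction of variants ranked at least as deleterious as this score.

    Uses the PHRED-like convention: scaled score = -10 * log10(rank fraction).
    """
    return 10 ** (-scaled_score / 10)


threshold_fraction = top_fraction_from_scaled_cscore(10)      # cutoff used in the text
rs2295190_fraction = top_fraction_from_scaled_cscore(24.5)    # highest score reported

print(f"score 10   -> top {threshold_fraction:.1%}")
print(f"score 24.5 -> top {rs2295190_fraction:.2%}")
```

A score of 10 therefore marks the top 10% of variants, and a score of 24.5 falls within the top 0.5% stated above.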
The FuncPred software recognised 184 out of the 309 LD SNPs and revealed regulatory predictions (Supplementary Table 4). The intronic SNP rs13081182 showed the strongest regulatory potential score (0.406). This SNP is in LD (r² = 0.94) with SNP rs17016738 associated with asthma in this study (Table 4). The nonsynonymous SNP rs2295190 also showed a strong regulatory potential score (0.395) and is implicated in splicing events. Located in SYNE1, this SNP was found to be probably damaging for SYNE1 based on PolyPhen v2 with a score of 0.994/1.00 (sensitivity: 0.69; specificity: 0.97). This SNP is in LD (r² = 0.85) with SNP rs6934016 associated with asthma in this study (Table 4). The asthma risk allele (A) for rs2295190 corresponds to a missense change that occurs at position 8741 of the protein (L8741M). The FuncPred tool also revealed SNPs with high regulatory potential scores in the LD block located in an intronic region of the gene LINGO2 on chromosome 9p21.1, the same LD block mentioned above with several SNPs having high scaled C-scores.
RegulomeDB V1.1 attributed a valid score to 276 LD SNPs according to the number of regulatory elements they influence (Supplementary Table 5). Two SNPs in the same haplotype block of chromosome 12p13.31, rs10846377 and rs7955798, were ranked highest with important supporting data. The regulatory prediction for rs10846377 was supported by data including eQTL, transcription factor binding, any motif, and DNase peak score. rs7955798 was supported by eQTL and transcription factor binding/DNase peak. These two SNPs were associated with mRNA expression levels of C3AR1 in circulating monocytes [21]. Both SNPs are in the same haplotype block identified by the lung eQTL analyses. Furthermore, the second most significant association following individual genotyping, rs7980829, was ranked third in the scores attributed by RegulomeDB V1.1. Many motifs are predicted to be altered by this variant and it is thus likely to affect binding of transcription factors in this region.
The Haploreg V4 software recognised 283 out of the 309 LD SNPs and indicated that several LD SNPs were associated with mRNA expression levels (Supplementary Table 6). The SNPs rs7980829 and rs11177020 have been previously associated with the mRNA expression levels of IFNG-AS1 in various tissues including lymphoblastoid cell lines of Europeans. The LD SNPs on chromosome 12p13.31 were also shown to be associated with mRNA expression levels of C3AR1 in whole blood.

Validation in the SLSJ Asthma Family Collection.
To confirm the genetic associations detected in the QCCCAC, the 21 asthma-associated SNPs were analyzed in the SLSJ asthma study. Clinical characteristics of cases and controls are indicated in Table 1. Eleven out of the 21 SNPs were directly genotyped. None of these SNPs was associated with asthma (Supplementary Table 7). The lowest P value for association with asthma was with rs17016738 (P = 0.080). The remaining 10 SNPs were tested with proxies if available. None of them showed association with asthma (Supplementary Table 8).

Discussion
This pooled GWAS was performed on a relatively homogeneous subgroup of asthma patients defined by age, atopic status, ethnicity, and gender (i.e., adult, atopic, French Canadian women with doctor-diagnosed asthma). Several loci were associated with asthma and confirmed by individual genotyping in an extended sample of French Canadian women. The functional meaning of asthma-associated SNPs was then evaluated in a large lung eQTL study, which supported C3AR1 as an asthma susceptibility gene. In silico analyses also supported C3AR1 as well as additional genes neighbouring functional asthma-associated SNPs.
Several GWASs have been performed to study the genetics of asthma [4,5]. These GWASs revealed numerous susceptibility loci that explain only a small fraction of the total asthma heritability. GWASs have the potential to reveal additional loci using extended study designs [22]. Studying subgroups of phenotypically similar asthma patients instead of the traditional broad case-control format may reveal new susceptibility loci. Using this strategy, Bønnelykke et al. identified CDHR3 as a new susceptibility gene in childhood asthma with severe exacerbations [23]. Similarly, we used a homogeneous subgroup of asthma patients. Pooled genotyping was used as it is an economical approach to perform a preliminary screen for evidence of association. As recommended, this screen was followed by individual genotyping for confirmation [24]. SNPs identified with the pooled GWAS were in large part (76%) validated by individual genotyping in atopic women. This pooling-based approach has been used previously to study asthma and new risk loci were identified [7,8]. These two studies were conducted in allergic asthmatic children of white European descent and in a Chinese population, respectively. None of their findings were replicated in this study, which is likely explained by clinical, demographic, and genetic differences.
In this study, one functional SNP located 7.6 kb from C3AR1 was associated with asthma. rs17801353 was associated with asthma in the extended case-control sample (P = 0.028). Lung eQTL analysis demonstrated that rs17801353 is associated with expression levels of C3AR1. This SNP was selected for individual genotyping owing to its proximity to that candidate gene on chromosome 12p13.31. This eQTL is reported for the first time in lung tissue and demonstrated the functionality of rs17801353. However, in silico analyses identified a study conducted in circulating monocytes where two SNPs in LD with rs17801353 were found to act as eQTL for C3AR1 [21]. C3AR1 is involved in the pathogenesis of asthma via the complement system [25,26]. The expression of this receptor is increased during asthmatic lung inflammation [27]. The same direction of effect is observed in the current study as the asthma risk allele increased the mRNA expression levels of C3AR1 in the three cohorts, which is also consistent with the study on circulating monocytes.
The two strongest associations with asthma after individual genotyping in the QCCCAC were with rs17655581 and rs7980829. The first one is located on chromosome 13q32.3, 5.1 kb 3′ of SLC15A1. This SNP received no clear prediction for putative function from the bioinformatic tools employed and was not implicated in any significant eQTL. The closest gene, SLC15A1, is not known to be involved in asthma but is known to cause inflammation in the intestine by mediating intracellular uptake of bacterial products [28]. For rs7980829, located on chromosome 12q15, RegulomeDB V1.1 predicted that this SNP may influence protein binding. Furthermore, Haploreg V4 showed that the SNP is associated with the expression levels of the long intergenic noncoding RNA (lincRNA) IFNG-AS1. This lincRNA has been shown to influence and regulate the expression of IFNG [29,30]. Interestingly, IFNG is an inflammatory cytokine implicated in the pathophysiology of asthma [31].
One LD block located in an intronic region of the gene LINGO2 on chromosome 9p21.1 contained the most potentially deleterious SNPs predicted by in silico analyses. This leucine-rich repeat and Ig domain-containing 2 (LINGO2) gene is not known to be implicated in asthma. A recent study found that SNPs flanked by LINGO2 were associated with airway responsiveness in chronic obstructive pulmonary disease [32]. These SNPs are not in LD with any of the SNPs in the LD block identified in this study (all r² < 0.05).
Among the LD SNPs, only rs2295190 was located in an exon and PolyPhen v2 predicted this missense variant as probably damaging for SYNE1. The asthma risk allele for rs6934016 (A) identified by individual genotyping corresponds to the allele in LD with SNP rs2295190 (T) producing a missense change. SYNE1 has never been associated with asthma, but two intronic SNPs of this gene have been associated with forced vital capacity according to the Phenotype-Genotype Integrator (PheGenI) [33]. SYNE1 is expressed in skeletal and smooth muscle, particularly in the sarcomeres [34]. Thus, SYNE1 could potentially have an effect on airway smooth muscle and influence muscle function during asthma attacks. Interestingly, one missense SNP in a related gene, SYNE2, was among the top ranked SNPs in the pooled GWAS but failed individual genotyping. SYNE1 and SYNE2 are known to be implicated in several diseases such as lung cancer [35]. Their role in the pathophysiology of asthma remains to be confirmed.
Associations were revealed with SNPs among the top 2000 ranked SNPs and near candidate genes. The third strongest association with asthma was found with rs803010 (P = 4.4 × 10⁻⁴), located in the promoter of the prostaglandin D2 receptor (PTGDR). This gene is known to be involved in asthma in Caucasian populations [36,37]. PTGDR is activated by PGD2 and leads to an increase of intracellular cAMP [38]. This increase is notably associated with more pronounced Th2 inflammation [39]. PTGDR may be implicated in bronchial hyperreactivity since knockout mice (Ptgdr−/−) are protected against ovalbumin-induced hyperreactivity [40].
Genetic associations observed in the QCCCAC were not replicated in the SLSJ asthma family collection. This could be partly due to the relatively small sample size of both datasets. The replication attempt was worthwhile because these two cohorts share similar age, asthma definition, and genetic background (both being French Canadian). The analyses were also restricted to women. However, asthma heterogeneity, not captured by our study design, may also be responsible for the lack of replication.
The main limitation of this study is the small sample size in which we performed the initial pooled GWAS screen and individual genotyping. However, the identification of loci known to be associated with asthma demonstrated the validity of our study and lends credence to the findings of novel loci. The value and innovation of this study were to investigate the genetic factors involved in a phenotypically similar group of asthma patients. Only asthmatic and nonasthmatic women with allergy were selected for the pooled GWAS. This strategy is likely to be more powerful and to require smaller sample size in order to identify the genetic factors underpinning asthma.
In summary, we used pooling-based GWAS to study asthma in a homogeneous population of adult French Canadian women. We then took advantage of a large-scale lung eQTL dataset and bioinformatics tools to examine the functional significance of our discoveries. Our data supported the potential role of C3AR1 in asthma. This study also suggests a potential role for new loci, namely, SYNE1, LINGO2, and IFNG-AS1, in the pathogenesis of asthma.
Network (CRRN). Yohan Bossé was the recipient of a Junior 2 Research Scholar award from the FRQS and now holds a Canada Research Chair in Genomics of Heart and Lung Diseases. This study was partly supported by the Chaire de Pneumologie de la Fondation JD Bégin de l'Université Laval, the Fondation de l'Institut Universitaire de Cardiologie et de Pneumologie de Québec, the Respiratory Health Network of the FRQS, and the Canadian Institutes of Health Research (MOP-123369). Catherine Laprise is the director of the Inflammation and Remodeling Strategic Group of the Respiratory Health Network of the FRQS and a member of the Allergen network. Catherine Laprise holds a Canada Research Chair in Environment and Genetics of Respiratory Diseases and Allergy and is supported by CIHR.