Fast and Accurate Genome-Wide Association Test of Multiple Quantitative Traits

Multiple correlated traits are often collected in genetic studies. By jointly analyzing multiple traits, we can increase power by aggregating multiple weak effects and reveal additional insights into the genetic architecture of complex human diseases. In this article, we propose a multivariate linear regression-based method to test the joint association of multiple quantitative traits. It is flexible to accommodate any covariates, has very accurate control of type I errors, and offers very competitive performance. We also discuss fast and accurate significance p value computation especially for genome-wide association studies with small-to-medium sample sizes. We demonstrate through extensive numerical studies that the proposed method has competitive performance. Its usefulness is further illustrated with application to genome-wide association analysis of diabetes-related traits in the Atherosclerosis Risk in Communities (ARIC) study. We found some very interesting associations with diabetes traits which have not been reported before. We implemented the proposed methods in a publicly available R package.

The log likelihood is proportional to Hence we can easily show that the score vector for testing H 0 : β β β = 0 is where ⊗ means the matrix Kronecker product, and vec() is the vector operator which stacks the columns of a matrix into a column vector. Note Cov[vec(Y )] = I ⊗ Σ. Hence where S 11 = G T G/n is the sample variance of genotype. Denote S 21 = Y G/n, which is the sample covariance vector of multi-traits and genotype. The multi-trait association can be based on the following chi-square statistic, U T Cov(U ) −1 U , which is where we have plugin the estimated null covariance matrixΣ using the sample covariance matrix S 22 of multi-traits. The CCA test statistic used in Ferreira and Purcell (2009) is Therefore the proposed MLM based Wald test can be treated as a natural and flexible generalization of the CCA: (1) it can accommodate any covariates; (2) it is based on the more powerful Wald test instead of the Score test for association tests of quantitative traits; and (3) it has an exact F-distribution for the multivariate normally distributed multiple continuous traits, and hence has very accurate control of type I errors.
It is not hard to verify that previous derivations still hold when we replace Y and G by their residuals regressing on a common set of p covariates. Therefore operationally we can apply the popular PLINK tool (Purcell et al., 2007) to test multi-trait association as follows. We first obtain the residuals of multivariate traits and genotypes adjusting for all covariates. We then input the residuals into the CCA test approach (Ferreira and Purcell, 2009) implemented in PLINK. Technically we need to adjust the PLINK output p-value T using a F-distribution with different DFs as 1 − F m,n−m−1−p (F −1 m,n−m−1 (1 − T ) n−m−1−p n−m−1 ), where F d 1 ,d 2 (·) is the distribution function of F-distribution with (d 1 , d 2 ) DFs. Note that when the set of covariates are independent of the genotypes (e.g., age and gender), we can directly use the genotypes instead of the genotype residuals, which can further save some computation time.
2 Multivariate trait association detection using the

1-DF Wald test
Consider the linear combination U = a Tβ 1 , which follows a normal distribution, U ∼ N (a T γ, (G T e G e ) −1 a T Σa), where γ is the true value of β 1 . Assuming a common genotype effect across the multivariate traits, we have γ = η1 m . The effect size of U is then proportional to Taking b ∝ Σ −1/2 1 m will maximize the effect size (note b T b = 1). Therefore we use the following statistic With a common scaled genotype effect across the multivariate traits, we have γ = ηS, Similarly we can derive the following test statistic 3 Chi-square and F-distribution based p-value calculation The chi-square statistic n−p−1 n Q is commonly used in practice and referred to a m-DF chi-square distribution to compute multi-trait association test p-values, which can lead to significantly inflated type I errors at stringent genome-wide significance levels. Figure 1 shows the ratio of actual significance level of Wald test p-values computed using the chi-square distribution and F-distribution respectively. We can see that the type I error based on the chi-square distribution is inflated: more so for larger number of traits, smaller significance level, and smaller sample size. For example, when testing m = 8 traits with p = 2 covariates and n = 500 samples, under genome-wide significance level 5 × 10 −8 , the actual significance level of chi-square distribution p-value is 3.42 × 5 × 10 −8 = 1.7 × 10 −7 .
Using the chi-square distribution to compute p-values will lead to very small inflation only when the sample size is large, such as in the meta-analysis of multiple GWAS studies. Significance Inflation Factor q q q q q q q m=8,p=2 q n=500 n=1000 n=2000 n=5000 5e−8 5e−7 5e−6 5e−5 5e−4 5e−3 5e−2 size almost increases linearly with number of traits m. For typical GWAS with small to medium sample sizes, we recommend using the appropriate F-distribution to compute significance p-values to reduce false positive findings. Table 1 lists the 62 genome-wide significant SNPs that were identified in the ARIC joint association test of fasting glucose (FG), fasting insulin (FI) and 2 hour fasting glucose levels (2hFG), and were also significant in the MAGIC consortium meta-analyses of these three traits (Dupuis et al., 2010;Saxena et al., 2010). In Table 1   and FI, and relatively small p-values (around 10 −5 ) for the 2hFG. Interestingly 6 of them (rs4502156, rs7163757, rs8037894, rs6494307, rs7167878, rs7172432) were genome-wide significant in the MAGIC meta analysis of fasting proinsulin level (FP, Strawbridge et al., 2011), with meta-analysis p-values ranging from 3.8 × 10 −11 to 8.7 × 10 −11 . We have highlighted these 6 SNPs in yellow at Table 2. We also provided the corresponding MAGIC meta-analysis p-values for FG, FI, 2hFG, and FP.

Simulation study
We consider a common Bernoulli covariate Z with probability of 0.5 (population indicator), and separately simulate a standard normal covariate X k for each trait Y k . The SNP genotype score G is simulated from a Binomial distribution, Binom(2,f 0 ), where the minor allele frequency (MAF) f 0 = p 0 + p 1 Z.
We conducted simulations for testing m = 2, 4, 8 related traits of 1,000 unrelated individuals respectively. Each time we simulate the m traits from a multivariate normal distribution with a compound symmetry correlation matrix with correlation ρ. The first trait has a variance of 2 and all the other traits have unit variance. We set E(Y i ) = 1+0.5X i +0.5Z +γ i G for i = 1, . . . , m−1, and E(Y k ) = 1+X k +Z +γ k G for k = 2, . . . , m.
We used 10 million experiments to evaluate the type I error, and 10 5 experiments to evaluate the power under various combinations of (γ 1 , . . . , γ m ). We conducted simulations   Tables 5 and 6 summarize the power for m = 2 and m = 8 respectively. T is the most powerful when γ j are close to each other, and T is the most powerful when γ j /σ j are close to each other. In general the proposed MLM based Wald tests perform better than the corresponding GEE based score tests. This agrees with the general principle that the Wald test is typically more powerful than the GEE based test.  The developed algorithms are very efficient and extremely scalable to genome-wide association test. For example, it takes 30 minutes to conduct joint association tests for around 2.5 million HapMap SNPs in the ARIC data on a single Linux desktop with 3.0 GHz CPU and 24 GB memory. The eigen decompositions involved are computed very efficiently, since we just need to compute the top eigen vectors for the covariate matrix.
The covariance matrix involved in the Wald tests has the same dimension as the number of traits, and its inverse can also be computed efficiently.  T  T  Q  T  T  Q  T  T

Discussions
We note that the multi-trait test approach generally benefits the most when the marginal effects have different directions from the trait correlations. For example, consider a bivariate normal random vector Z = (z 1 , z 2 ) T with covariance matrix Σ having unit variance and 0.5 correlation. Consider the Wald test Z T Σ −1 Z. Under 5 × 10 −8 significance level, its test power is 0.5% with E(z 1 ) = 3, E(z 2 ) = 2, while the power is 24.8% when E(z 1 ) = 3, E(z 2 ) = −2.
The proposed Wald tests are generally robust to deviation from normality, partly because GWAS are based on large sample sizes, and the F-distribution based Wald test is robust. Here we conduct a simple simulation study to investigate the type I errors of the proposed Wald tests when we simulate the outcomes from the multivariate tdistribution instead. We consider 5000 individuals with two outcomes following K-DF bivariate t-distribution with unit variance and 0.5 correlation. The genotype is simulated from Binom(2, 0.2). We conducted 10 6 simulations to investigate the type I errors at significance levels α = 10 −2,−3,−4 under K = 5, 10, 20. Table 7 summarizes the results.
Overall we can see that all proposed Wald tests have well controlled type I errors.
GWAS are typically useful to identify common variants (MAF≥5%). The proposed Wald tests are applicable to any MAF and number of traits, in the sense that the Wald test F-distributions always hold. However we do recommend that the tests are only applied to proinsulin levels and provides new insights into the pathophysiology of type 2 diabetes.