Statistical Method Based on Bayes-Type Empirical Score Test for Assessing Genetic Association with Multilocus Genotype Data

Simultaneous testing of multiple genetic variants for association is widely recognized as a valuable complementary approach to single-marker tests. As such, principal component regression (PCR) has been found to have competitive power. We focus on exploring a robust test for an unknown genetic mode of all SNPs, an unknown Hardy-Weinberg equilibrium (HWE) in a population, and a large number of all SNPs. First, we propose a new global test by means of the use of codominant codes for all markers and PCR. The new global test is built on an empirical Bayes-type score statistic for testing marginal associations with each single marker. The new global test gains power by robustly exploiting the Hardy-Weinberg equilibrium in the control population and effectively using linkage disequilibrium among test markers. The new global test reduces to PCR when the genotype for each marker is coded as the number of minor alleles. This connection lends insight into the power of the new global test relative to PCR and some other popular multimarker test methods. Second, we propose a robust test method based on the new global test and the ordinary PCR test built on a prospective score statistic for testing marginal associations with each single marker when the genotype for each marker is coded as the number of minor alleles by taking the minimum p value of these two tests. Finally, through extensive simulation studies and analysis of the association between pancreatic cancer and some genes of interest, we show that the proposed robust test method has desirable power and can often identify association signals that may be missed by existing methods.


Introduction
Association analyses that test multiple genetic markers as a set rather than individually have been appreciated for their potential power. These statistical methods largely fall into three classes: those for summarizing p values from the tests of each single marker [1][2][3][4][5], those that synthesize singlemarker test statistics, such as Hotelling T 2 (standard Chisquared) statistic [6][7][8] and the burden test [9,10], and those based on a direct test of joint associations of multiple markers, such as variance component tests (VC) [11][12][13], the sequence kernel association test (SKAT) [14][15][16][17][18], and principal component regression (PCR) methods [19][20][21]. The relative performance of these methods has been comprehensively compared in previous work [22]. When the number of single-nucleotide polymorphisms (SNPs) is small, these methods have similar power; however, when the number of SNPs is large, the effects of SNPs are not constant and may have different directions, the linkage disequilibrium (LD) among multiple markers is somewhat strong, and the SNPs adopt additive genetic code. Three methods, namely, VC, SKAT, and PCR, have been found to have competitive power in this case [22,23]. A major reason is that all 3 methods can decrease the degrees of freedom of the test to some extent [12]. In this work, we focus on exploring a robust test for unknown genetic modes of SNPs of interest, unknown Hardy-Weinberg equilibrium (HWE) in a population, and a large number of SNPs of interest.
We first propose a novel multi-SNP test under the case-control study design, which we term the principal Chi-squared test. The principal Chi-squared test applies a two-degree-of-freedom score statistic based on the empirical Bayes method for each SNP and derives a global test based on the eigenvalue decomposition of the asymptotic variance-covariance matrix of each SNP test. The global test achieves improved power by robustly exploiting the HWE in the control population and effectively exploiting the LD among all SNPs. We denote the global test by PChiB (see Methods). In addition to competitive power, PChiB is conveniently implemented and is easily comprehensible by the nonstatistics community because of the well-known eigenvalue decomposition method. The global test is closely related to standard PCR in that it reduces to the score test of PCR when each SNP is coded as the number of minor alleles. This relation not only lends insight into its power relative to PCR but also into the connection between PCR and variancecomponent-based tests. We show that both classes of these methods are weighted combinations of uncorrelated Chisquared random variables, each of which is a weighted combination of a single SNP test with weights equal to the loadings of the eigenvectors of their joint asymptotic variance-covariance matrix. This observation, while supporting documented conclusions that none of the two classes of methods is uniformly more powerful than the other [22], reveals theoretically that the LD structure among SNPs plays a critical role in the powers of these methods. When a real disease causal SNP adopts recessive and dominate codes, test PChiB can gain desirable power. When a real disease causal SNP adopts an additive code, test PChiB may have somewhat lower power. Thus, we propose a robust test by taking the minimum p value of the new global test PChiB and the ordinary prospective score test of PCR in which each SNP is coded as the number of minor alleles, regardless of the actual genetic code of each SNP. We denote the robust test by Min2.
Suppose that q diallelic SNPs in a genomic region of interest are genotyped for n 1 case samples and n 0 control samples. Let Y i denote the binary case-control status (Y i = 1: case; Y i = 0: control) for sample i (i = 1, 2, ⋯, n), where n = n 1 + n 0 , the first n 1 samples are cases, and the remaining n 0 samples are controls. Denote G ik as the count of the minor alleles of SNP k from sample i for i = 1, 2, ⋯, n, and k = 1, 2, ⋯q. A new global test is designed to test the null hypothesis that the genomic region spanned by q SNPs is not associated with the phenotype status of interest against the general alternative that one or more SNPs, which may or may not be genotyped, are associated with the phenotype status of interest. We fit an ordinary logistic regression model for the binary case-control status and all SNPs.
Incorporating HWE constraints into the control population based on the retrospective likelihood for testing a diallelic marker may lead to increased power under dominant and recessive genetic models compared to standard prospective likelihood-based tests [24]. To address the issue that deviation from HWE may lead to an inflated type I error rate in this test, an empirical Bayes score test, which is a dataadaptive linear combination of the prospective likelihood score test and retrospective likelihood score test under the HWE constraint, was proposed [25]. This test can maintain nominal type I error rates under deviations from HWE that are observed in real settings and largely maintains the power gain under the recessive genetic model. Here, our new global statistic uses this test principal as the building block. We expect that our method achieves considerably improved power when aggregating the small power gains at each SNP.
The rest of this paper is organized as follows. In Results, we demonstrate, through simulation studies and analysis of pancreatic cancer data [26,27], that the proposed robust test can often have desirable power compared to some popular tests across a broad range of scenarios. In Discussion, we further discuss the merits and disadvantages of our proposed test method and note some directions for future research. In Methods, we present the new global test in detail and discuss its connections to PCR and other existing methods. We also briefly introduce the robust test by taking the minimum p value of the new global test and the score test of PCR, where each single SNP is coded as the number of minor alleles, regardless of the actual genetic code of each SNP.

A Robust Statistical Method
Based on Two Types of Principal Chi-Squared Tests. For real genotype data, we can first calculate the prospective score test, denoted byŨ P = ðŨ P,1 ,⋯,Ũ P,q Þ, in which all SNPs are supposed to adopt additive codes. We denote a consistently estimated covari-anceṼ P forŨ P and calculate the ordinary principal components regression (PCR) score statistic, which is denoted by PChiP (selecting the top PCs explaining 85% of genetic variability) based on the estimated covarianceṼ P , as in Gauderman et al. [19]. Second, we can obtain the p value of PChiP, which is denoted by PV A,P because PChiP follows a Chisquared distribution asymptotically under the null hypothesis. Third, we calculate the empirical Bayes score denoted by U B = ðU B,1 ,⋯,U B,q Þ and its consistently estimated covariance, which is denoted by V B , based on codominant codes (see Methods). Similarly, we calculate the new aforementioned principal Chi-squared statistic PChiB based on the estimated covariance V B . Note the dimension of U B is 2q, and we can estimate the p value of PChiB, which is denoted by PV C,B , because PChiB also follows a Chi-squared distribution asymptotically under the null hypothesis. Finally, we take the minimum of the two p values of PChiP and PChiB as a robust test, as follows: We estimate the p value of Min2 via statistical permutation. We conduct extensive simulations to investigate the power performance of Min2.
To view the performance of Min2 comprehensively, we can compare it to 4 other tests, namely, PChiB, PChiP, SSUP (see Methods) and GOLD, where GOLD is constructed as follows. Suppose the first SNP is the real causal SNP satisfying the logistic regression model logitPrðY = 1Þ = β 0 + β 1 * G 1 , where β 0 and β 1 represent log odds ratios. Other SNPs are correlated with the first SNP with genotypes G 2 , ⋯, G q . The Gold method (denoted by GOLD) is an ordinary score test based on the above real statistical model. Clearly, in real data analysis scenarios, we do not know the causal SNP. GOLD only has a value in simulation studies and is not 2 International Journal of Genomics practical in real data analysis. We consider 3 scenarios for analysing genotype data. First, we apply PChiB, Min2, PChiP, SSUP, and GOLD to analyse genotype data, including all SNPs. Second, we apply PChiB, Min2, PChiP, SSUP, and GOLD to analyse genotype data, excluding the first SNP, which is the causal SNP. Third, we apply PChiB, Min2, PChiP, SSUP and GOLD to analyse genotype data, including only labelled SNPs. To comprehensively assess the performance of these 5 methods, we designate every SNP among all q SNPs as the causal SNP in turn in the simulation procedure.

Simulation Procedure.
We conduct extensive simulation studies to assess the relative power of Min2 by comparing its performance with that of 4 other test statistics, namely, PChiB, PChiP, SSUP, and GOLD. We consider real LD structures defined by haplotypes inferred from the International Hapmap Project CEU samples. We set the haplotype information for gene NAT2 studied by Kwee et al. [28] as the basis of our simulations. To generate multilocus genotype data based on real haplotypes, we estimated haplotypes and their frequencies in a genomic region via HaploView software [29]. The LD structures plot based on the complete set of SNPs for gene NAT2 is displayed in Supplementary  Figure 1 The haplotype phase information was then deleted, and only locus-specific genotype data were retained. To generate multilocus genotype data for each case sample (total number n 1 ), we generated the pair of haplotypes ðH, H ′ Þ using the following probabilities: where R Aa and R aa are the odds ratios for genotypes "Aa" and "aa", 'A' is the major allele for the disease causal SNP, 'a' is the minor allele of the disease-causal SNP, and indicator functions IðHH ′ includes } Aa } Þ and IðHH ′ includes } aa } Þ refer to whether haplotype pair ðH, H′Þ has allele combinations (A,a) and (a, a), respectively, at the causal SNP.
To evaluate the impact of deviation from HWE on the power of PChiB, we additionally generated multilocus genotype data from real haplotypes based on gene NAT2, as described above, but with the frequency of haplotype pairs and F st is the fixation parameter, which represents mild deviation from HWE, as observed in real gene association analysis studies.
We set n 1 = 1000 and n 0 = 1000 and consider two scenarios with HWE indicator F st = 0 and 0:5 log ð2Þ, as in Luo et al. [25]. Furthermore, we designate every SNP as the causal SNP in turn. When the causal SNP adopts an additive code, we obtain the genotype and case-control status based on the logistic model with causal SNP odds ratio 1 for estimating the empirical type I error rates and with causal SNP odds ratio 1.2 for estimating the empirical power. When the causal marker adopts a dominant code, we obtain the genotype and case-control status based on the logistic model with causal SNP odds ratio 1.3 for estimating the empirical power. When the causal marker adopts a recessive code, we obtain the genotype and case-control status based on the logistic model with causal SNP odds ratio 1.5 for estimating the empirical power. With the genotype and case-control status information, we calculate the p value of Min2 via 200 permutations. The empirical type I error rates and powers of the 4 tests were considered under a significance level of 0.05 by means of 500 repetitions, as Kwee et al. [28] examined the type I error and power of the semiparametric and single-tag SNP approaches assuming a nominal significance level of 0.05.

Numerical Results.
To comprehensively assess the performance of Min2, we construct test statistics under 3 scenarios, namely, using all SNPs, using all SNPs except the causal SNP, and using only tag SNPs.
Because the empirical type I error rates are nearly the same when the real causal SNP adopts an additive code, dominant code, and recessive code, we present the empirical type I error rates for only the case where the real causal SNP adopts an additive code. The results based on all 18 SNPs with F st = 0 are displayed in Figure 1, and the results based on all 18 SNPs with F st = 0:05logð2Þ are displayed in   Figures 1 and 2, we can see that Min2 can control the type I error rate well when the HWE indicator coefficient F st equals 0 or 0.5log(2.0), but PChiB has a conservative empirical type I error rate when F st equals 0. We further investigate this phenomenon: when the real genetic model adopts additive code, PChiB adopts a codominant code with F st equal to 0, so the correlations between every two SNPs are decreased and test PChiB may absorb a large number of degrees of freedom. For example, when considering the scenario with all 18 SNPs and designating the 1st SNP as the causal SNP, PChiP absorbs 2 degrees of freedom and PChiB absorbs 5 degrees of freedom, according to the simulation data. When the real genetic model adopts recessive and dominant codes, all 5 tests control the type I error rate well, regardless of whether F st is 0 or 0.5log(2.0).
When the real causal SNP adopts an additive code, we display all results based on all 18 SNPs in Tables 6 and 7 for    5 International Journal of Genomics scenarios, the real genetic codes are additive, so it is not unexpected that the performance of PChiP is always a little better than that of Min2, regardless of which of the 18 SNPs is the causal SNP. Although SSUP sometimes has slightly better power than PChiP and Min2, it can sometimes have very low power. For example, in Table 6, the empirical powers of PChiP, SSUP, GOLD, Min2, and PChiB are 0.626, 0.402, 0.770, 0.616, and 0.400 when the 11th SNP is the causal SNP. In Table 7, for F st = 0:5logð2Þ, the empirical powers of PChiP, SSUP, GOLD, Min2, and PChiB are 0.660, 0.670, 0.800, 0.645, and 0.570, respectively, when the 9th SNP is the causal SNP.

The Analysis of High-Density Lipoprotein Cholesterol
(HDL-C) Data from GWAS Pancreatic Cancer Data. Herein, we present an analysis of HDL-C data from GWAS pancreatic cancer data [26,27] to illustrate our method. Plasma levels of high-density lipoprotein cholesterol are known to be heritable, but only a fraction of the heritability is explained. We developed a high-density genotyping array populated with HDL-C candidate loci selected based on the known biology of HDL metabolism, mouse genetic studies, human genetic association studies, and available GWAS data. SNP selection was based on tag SNPs but also included lowfrequency nonsynonymous SNPs. We performed association analysis on the majority of reported GWAS loci (including ABCA1, CETP, GALNT2, LCAT, LIPG, LIPC, and LPL).
The data set consists of 1231 samples (case: 625 and control: 606) with 64 SNPs from the above 13 genes. Basic information about the 13 genes is presented in Supplementary  Table 1 (additional file 2). We calculate the p values of 4 test methods, i.e., PChiP, SSUP, Min2, and PChiB, when analysing the data set. The numerical results are displayed in Table 8. From Table 8, we can see that the numerical results of Min2 are consistent with those of the other tests. Because the number of SNPs in each gene is not very large in the real data, the real data do not provide a good

Discussion
One key factor of the improved power of kernel-machinebased tests [17] and PCR is the reduced degrees of freedom. Kernel-machine-based tests make full use of possible correlations among score statistics, which is known to be advantageous for high-dimensional data [30], and are robust to the directions of association of different SNPs. Principal component analysis is a standard method of reducing the dimensionality of a large number of variables. Despite this seemingly obvious argument, the relative merits of PCR and kernel-machine-based tests remain understudied. We provide insights into the theoretical connection between kernel-machine-based tests and the PCR method. We find that when the LD extent of each pair of SNPs is somewhat strong, principal component analysis methods may have higher power than kernel-machine-based tests. PCR often has similar or higher power than kernel-machine-based tests, where the LD pattern is an important parameter for power. We will further explore the principle of selecting the number of PCs in future work.
In this work, we consider an association test between human complex diseases and genetic SNPs based on principal component analysis (PCA) since PCA is widely used in the recent literature. PCA accounts for linear combinations among SNPs. If this linearity exists, PCA is optimal. However, when how the multiple genetic SNPs influence the risk of disease is unknown, one alternative strategy is to use haplotype analysis since haplotypes can capture the LD information between markers [31][32][33][34][35][36][37].
We propose a novel global test (PChiB) based on the empirical Bayes score test, which is a data-adaptive linear combination of the prospective likelihood score and the retrospective likelihood score under the HWE constraint in the control population. PChiB can maintain desirable power when the real causal SNP adopts recessive and dominant codes under the HWE constraint in the control population. A small disadvantage of PChiB is that when the genetic code of the real causal SNP is additive, PChiB does not have desirable power because of the large degrees of freedom. Thus, we propose a robust test (Min2) that maintains the power gain under deviations from HWE observed in real settings, regardless of which genetic code the real causal SNP adopts. Min2 gains power by effectively using the LD among all the tested SNPs over all scenarios. Because PChiP is based on the assumption that all SNPs adopt an additive code, while PChiB and Min2 are based on the assumption that all SNPs adopt a codominant code, PChiP has low degrees of freedom and performs best when the causal SNP adopts an additive code. PChiB and Min2 may have less power than PchiP in this scenario. When the causal SNP adopts dominant or recessive codes, Min2 has desirable power, regardless of whether HWE is satisfied in the control population. We propose to use our new test Min2 for the association analysis of multilocus genotypes and complex diseases.
We propose the robust test Min2, where the p values are obtained via permutation and compared it with PChiB (empirical score based on all SNPs adopting codominant codes), PChiP (prospective score based on all SNPs adopting additive codes), and SSUP (a VC method based on the prospective score and all SNPs adopting an additive code). The main purpose of this article is to introduce the proposed test Min2, not to compare it with other existing tests for GWAS.
Notably, it would be a good idea to extend the proposed tests to include covariate adjustments in the logistic models. The derivation will be very complex and requires additional research. We will consider this problem in our future work. In simulations, we need to set a large sample size n as the number of MAF is low, so we have not considered rare variants. We may investigate the robustness about PChiB when the number of MAF is low in our further work.
For k = 1, ⋯, q, denotef k as the estimated minor allele frequency (MAF) for the kth SNP in the pooled casecontrol sample and denote g k as the number of minor allele in a genotype for the kth SNP in a population with values 0, 1, and 2. For k = 1, ⋯, q, denote P̂f k ðg k Þ as the estimated genotype frequency for the kth SNP. We can then  [25] and Chatterjee et al. [38] when an additive (dominant or recessive) code is adopted. The weight matrix W k is data adaptive. When codominant coding is adopted, by means of W k , we propose the empirical Bayes score for the kth ðk = 1, ⋯, qÞ SNP with the following form: where I 2×2 is an identity matrix with dimension 2.
Let U ðBÞ denote the vector of empirical Bayes scores for all q SNPs, namely, U ðBÞ = ðU B,1 , U B,2 , ⋯, U B,q Þ, which is of length 2q. Denote the estimated asymptotic covariance matrix by V B (See Supplementary File) for empirical Bayes score vector U ðBÞ . A common test for whether all q markers can be jointly built, similar to the Hotelling T 2 statistic, is indicates the transpose of a vector or matrix. Our proposed new global statistic is based on the eigenvalue decomposition of covariance matrix V B , as follows. For k = 1, 2, ⋯, 2q, denote λ k and ξ k (a 2q × 1 column vector) as the eigenvalue and corresponding eigenvector of covariance matrix V B . Let λ = ðλ 1 , λ 2 , ⋯, λ 2q Þ and ξ = ðξ 1 , ξ 2 , ⋯, ξ 2q Þ denote the eigenvalues and corresponding eigenvectors of covariance matrix V B . We then have ξ T V B ξ = diag ðλ 1 , λ 2 , ⋯, λ 2q Þ, and V B can be written as ξ diag ðλ 1 , λ 2 , ⋯, λ 2q Þξ T . Since the norm of the eigenvector is unity and V −1 B can be written as ξ diag ðλ −1 1 , λ −1 2 , ⋯, λ −1 2q Þξ T , the test statistic T 2 can be written as Note that U ðBÞ ξ k , is a linear combination of the score for each individual SNP U B,k with var ½U ðBÞ ξ k = λ k for k = 1, 2, ⋯, 2q: We propose to utilize the first s(1 ≤ s ≤ 2q) summands in T 2 to test the null hypothesis and denote the resultant test statistic as follows: Due to the orthogonality of ξ 1 , ⋯, ξ s , ðU ðBÞ ξ 1 Þ 2 /λ 1 , ⋯, ðU ðBÞ ξ s Þ 2 /λ s are independent. Because ðU ðBÞ ξ 1 Þ 2 /λ 1 , ⋯, ðU ðBÞ ξ s Þ 2 /λ s are all asymptotically normally distributed with mean 0 and variance 1 under the null hypothesis that the genomic region spanned by the q SNPs is not associated with the phenotype status of interest, PChiB is asymptotically distributed as a Chi-squared variable with s degrees of freedom under the null hypothesis.
A remaining issue is how to select the number of summands s. Note that PChiB is based on eigenvalue decomposition, similar to the standard PCR. Many criteria for selecting s have been introduced in the literature [39]. It has been shown that using the top principal components that explain 80~90% of the genetic variability is sufficient [19,20,23]. We select s according to the same principal, i.e., that the top s principal components can explain approximately 85% of the genetic variability. This strategy is supported by the connection between PChiB and PCR (see the next subsection). In fact, the number of principal components affects the power of the principal component test [40]. When the LD extent of each pair of SNPs is very strong, the top one principal component alone has desirable power. When the LD extent of each pair of SNPs is somewhat strong, using the top principal components that explain 80~90% of the genetic variability is a robust method.

Understanding PChiB through an Exposition of PCR.
We revisit PChiB based on only the standard prospective likelihood score under additive coding for kth (k = 1, ⋯, q) and establish its equivalence to PCR [19,20]. This equivalence sheds light on the promise of increased power of PChiB since PCR has been established to be a promising method for multi-SNP association analysis. In PCR, the phenotype variable is regressed on only a few of the top principal components (PCs) that summarize approximately 80-90% of the genetic variability. The PCs represent the directions in which most of the variability in the data occurs, as identified by the eigenvalue decomposition of the variancecovariance matrix of the centred raw genotype scores. Each principal component is a linear combination of genotype scores for all SNPs, and all principal components are uncorrelated with each other.
Here, we present the standard prospective likelihood score under additive genetic coding. The collection of all q prospective score functions, denoted byŨ ðPÞ = ðŨ P,1 ,Ũ P,2 , ⋯,Ũ P,q Þ, is asymptotically distributed as multivariate normal with mean ð0,⋯,0Þ q×1 and variance-covariance matrix V P under the null hypothesis. Let Y = ðY 1 , ⋯, Y n Þ T , Y = ∑ n i=1 Y i /n. For k = 1, 2, ⋯, q, let G k = ∑ n i=1 G ik /n and G = ð G 1 , ⋯, G q Þ. For i = 1, 2, ⋯, n, let G ðiÞ = ðG i1 , ⋯, G iq Þ T . Denote G as a genotype matrix with ith row and kth column element G ik for i = 1, ⋯, n,and k = 1, ⋯, q. Let 1 be a column vector with all elements 1 and length n. In matrix form,Ũ T ðPÞ = ∑ n i=1 ðY i − YÞG ðiÞ = G T ðY − Y 1Þ, and its covariance matrix ⋯, a q be a q × q matrix whose kth column is the characteristic vector of the matrixṼ P ðk = 1, ⋯, qÞ, and letλ 1 ≥λ 2 ≥ ⋯ ≥λ q be its eigenvalues. Denote orthogonal transformatioñ G = GA. The likelihood score based on a logistic regression of Y onG isŨ ðPÞ =G T ðY − Y 1Þ. The covariance matrix of U ðPÞ is a diagonal matrix with elementsλ 1 ,λ 2 , ⋯,λ q : Suppose that we consider the firsts(1 ≤s ≤ q) PCs as follows. Let A~s = ½a 1 , a 2 , ⋯, a~s be a q ×s matrix containing 8 International Journal of Genomics the firsts eigenvectors, and letG~s = GA~s. The standard PCA test based on the score statistic for testing the association between Y andG~s from the logistic regression model is Y 1Þ, which is denoted by PChiP, and is the same as our proposed method when the adopted genetic code is additive code. DenoteT k = ðY − Y 1Þ T Ga k a T k G T ðY − Y 1Þ/λ k , k = 1, 2, ⋯, q: When adopting additive code, the standard Hotelling T 2 statistic is equal to ∑ q k=1T k , and the PChiP statistic reduces to ∑~s k=1Tk .
The proposed statistic (in this situation, equivalent to PCR) can be shown to be closely related to a statistic called the sum of squared score test based on prospective likelihood [12], which is denoted by SSUP. SSUP is obtained as SSUP=Ũ ðPÞŨ T ðPÞ = ∑ q j=1Ũ 2 P,k , and it can be expressed as SSUP= ∑ q j=1λk ½ðŨ ðPÞ a T k Þ 2 /λ k . Therefore, SSUP and PChiB use different weights for the contributions of the PCs: SSUP weights all PCs by the eigenvalues, whereas PChiB assigns equal weights to the top PCs. SSUP allows PCs with small eigenvalues to make additional contributions to the test, but PChiB discards PCs with small eigenvalues to reduce the degrees of freedom. This difference has implications on their relative power, which depends critically on the structure of variance-covariance matrix and, therefore, the LD structure of the assessed genomic region.

Data Availability
Data available on request.

Conflicts of Interest
The authors declare no competing interests.