Journal of Probability and Statistics (JPS), ISSN 1687-9538, 1687-952X, Hindawi Publishing Corporation
2016, Article ID 3937056, doi:10.1155/2016/3937056
Research Article

Estimating the Proportion of True Null Hypotheses in Multiple Testing Problems

Oluyemi Oyeniran (1) and Hanfeng Chen (2)

(1) Manufacturing Toxicology and Applied Statistical Sciences, Janssen Research & Development, Spring House, PA 19002, USA (janssen.com)
(2) Department of Mathematics and Statistics, Bowling Green State University, Bowling Green, OH 43403, USA (bgsu.edu)

ORCID: http://orcid.org/0000-0002-1245-7644
Also named in the article metadata: Shein-Chung Chow
Article dates as recorded: 26 07 2016; 19 10 2016; 08 11 2016; 8122016

Copyright © 2016 Oluyemi Oyeniran and Hanfeng Chen. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The problem of estimating the proportion, $\pi_0$, of true null hypotheses in a multiple testing problem is important when large-scale parallel hypothesis tests are performed independently. While $\pi_0$ is a quantity of interest in its own right in applications, an estimate of $\pi_0$ can also be used for assessing or controlling an overall false discovery rate. In this article, we develop a nonparametric maximum likelihood approach to estimating $\pi_0$. The nonparametric likelihood is restricted to multinomial models, and an EM algorithm is developed to approximate the estimate of $\pi_0$. Simulation studies show that the proposed method outperforms existing methods. Using experimental microarray datasets, we demonstrate that the new method provides a satisfactory estimate in practice.

1. Introduction

Estimating the proportion $\pi_0$ of true null hypotheses in a multiple testing setup is crucial for assessing and/or controlling the false discovery rate, which matters greatly in genomics, disease discovery, and cancer discovery. Langaas et al. [1] remarked “An important reason for wanting to estimate $\pi_0$ is that it is a quantity of its own right. In addition, a reliable estimate of $\pi_0$ is important when we want to assess or control multiple error rates, such as the false discovery rate FDR of Benjamini and Hochberg [2].” In the case of testing for differential expression in DNA microarrays, the proportion of differentially expressed genes is $1 - \pi_0$, and it is important to know whether 5% or 35% of the genes, for example, are differentially expressed, even if we cannot identify these genes (see Langaas et al. [1]). Multiple testing refers to any instance that involves the simultaneous testing of several hypotheses. A common feature of genomic studies is the analysis of a large number of simultaneous measurements in a small number of samples. One must decide whether the findings are truly causative correlations or just byproducts of multiple hypothesis testing (Gyorffy et al. [3]). If one does not take the multiplicity of tests into account, then the probability that some of the true null hypotheses are rejected by chance alone may be unduly large.

In a multiple hypothesis testing problem, $m$ null hypotheses are tested simultaneously; that is, we test

$$H_{0i} \ \text{versus} \ H_{1i}, \quad i = 1, 2, \ldots, m, \tag{1}$$

simultaneously. Assume that the $m$ tests are constructed based on the observed $p$ values, $p_1, \ldots, p_m$, respectively. The unknown quantity $\pi_0$ to be estimated is the proportion of true null hypotheses among $H_{01}, \ldots, H_{0m}$. Introduce the i.i.d. Bernoulli random variables $H_1, \ldots, H_m$ with $P(H_i = 0) = \pi_0$. Then $H_i$ can be interpreted in terms of the multiple testing problem as follows:

$$H_i = \begin{cases} 0, & \text{if } H_{0i} \text{ is true}, \\ 1, & \text{otherwise}, \end{cases} \quad i = 1, \ldots, m. \tag{2}$$

We assume the $p$ values, $p_1, \ldots, p_m$, are continuous and independent random variables, so that the $p$ values are independently and identically distributed as $\mathrm{Unif}(0,1)$ when the null hypotheses are all true. One chooses to reject or fail to reject each null hypothesis based on the corresponding $p$ value. Consequences of the tests are summarized in Table 1.

Table 1: Outcomes from $m$ null hypothesis tests.

              Not rejected   Rejected   Total
$H_0$ true    $U$            $V$        $m_0$
$H_0$ false   $T$            $S$        $m - m_0$
Total         $W$            $R$        $m$

In Table 1, $m_0$ is the number of true null hypotheses and $R$ is the observable random variable representing the number of hypotheses rejected. Note that the other random variables $U$, $V$, $S$, and $T$ in Table 1 are unobservable.

The problem of estimating the proportion $\pi_0$ arises naturally in assessing or controlling an overall false rejection rate in simultaneous hypothesis testing problems. A reliable estimate of $\pi_0$ is crucial when we want to control and/or assess the false discovery rate (FDR) proposed by Benjamini and Hochberg [2], defined as

$$\mathrm{FDR} = E\bigl[Q \, I(R > 0)\bigr], \tag{3}$$

where $Q = V / R$ and

$$I(R > 0) = \begin{cases} 1, & \text{if } R > 0, \\ 0, & \text{otherwise}. \end{cases} \tag{4}$$

Benjamini and Hochberg [2] prove that Simes’ procedure (Simes [4]) controls the FDR at level $\alpha$ if the underlying test statistics and the corresponding $p$ values are continuous and independently and identically distributed. Specifically, they show

$$\mathrm{FDR} \le \frac{m_0}{m}\,\alpha = \pi_0 \alpha. \tag{5}$$

Consequently, if $\gamma$ is the FDR level that one wishes to achieve and if $\pi_0$ can be estimated efficiently, say by $\hat\pi_0$, then $\alpha$ can be chosen as $\alpha = \gamma / \hat\pi_0$ to gain additional testing power in the multiple testing problem while keeping the FDR under control. If $\hat\pi_0$ overestimates $\pi_0$ substantially, however, the value of $\alpha$ is underestimated substantially, leading to significantly wider simultaneous confidence intervals for multiple comparisons and a significant reduction of testing power for the multiple testing problem. On the other hand, if $\alpha$ is chosen via some other procedure, say the Bonferroni method in the multiple comparison problem, assessing the FDR accurately by estimating $\pi_0$ efficiently is then of great interest.
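As a rough illustration of how an estimate of $\pi_0$ enters the testing procedure, the following Python sketch applies the Benjamini-Hochberg step-up rule once at level $\gamma$ and once at the adjusted level $\alpha = \gamma / \hat\pi_0$. The function name `bh_reject`, the toy $p$ values, and the value $\hat\pi_0 = 0.5$ are assumptions made purely for illustration and are not taken from the paper.

```python
def bh_reject(pvalues, alpha):
    """Benjamini-Hochberg step-up rule: return sorted indices of rejected hypotheses."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    # Find the largest rank r (1-based) with p_(r) <= r * alpha / m.
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            cutoff = rank
    return sorted(order[:cutoff])


gamma = 0.05      # FDR level one wishes to achieve
pi0_hat = 0.5     # hypothetical estimate of pi_0 (assumption)
pvals = [0.001, 0.012, 0.03, 0.04, 0.2, 0.5, 0.7, 0.9]   # toy data (assumption)

plain = bh_reject(pvals, gamma)                # level gamma
adaptive = bh_reject(pvals, gamma / pi0_hat)   # level alpha = gamma / pi0_hat
print(plain, adaptive)
```

With these toy inputs the adjusted level rejects more hypotheses than the unadjusted one, which is the gain in power described above; the size of the gain depends entirely on the data and on the quality of $\hat\pi_0$.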

The paper is organized as follows. Section 2 is a review of existing estimating methods; Section 3 introduces the new estimating procedure; Section 4 contains simulation results and Section 5 presents application of the new estimating procedure to real life examples.

2. Existing Estimating Methods

2.1. Mixture Model Framework

A mixture model can be fitted to the $p$ values in order to estimate the proportion, $\pi_0$, of true null hypotheses when large-scale parallel hypothesis tests are performed independently. The estimate of $\pi_0$ can be based on the following mixture model for the common density $f$ of the $p$ values:

$$f(p) = \pi_0 + (1 - \pi_0)\, h(p), \tag{6}$$

where $h(p)$ is the conditional probability density function of the $p$ value under an alternative (see Langaas et al. [1]). Using this mixture representation, we are able to characterize the maximum likelihood estimate of $f$. The estimation method is derived under the assumption of independent and identically distributed $p$ values. The null $p$ values are uniformly distributed on $[0,1]$. Since $h(p)$ describes the configuration of the true alternative populations among the $m$ underlying populations, a nonparametric approach to the estimation of $\pi_0$ is more appealing. Without loss of generality, we define $H = 1$ as the alternative hypothesis, which will be used throughout this section. Three nonparametric estimators proposed recently by other authors are described and discussed in the following subsections.

2.2. Storey’s Method

Consider the common marginal mixture distribution of the $p$ values. For any $\lambda \in (0,1)$,

$$P(p > \lambda) = \pi_0 (1 - \lambda) + (1 - \pi_0)\, P(p > \lambda \mid H = 1). \tag{7}$$

Because $P(p > \lambda \mid H = 1)$ is typically small, a large majority of the $p$ values in the interval $[\lambda, 1]$ should correspond to true null hypotheses, whose $p$ values are uniformly distributed on $(0,1)$; $p$ values close to 1 arise mostly from true nulls, so that $(1 - \pi_0)\, P(p > \lambda \mid H = 1)$ is approximately zero. Let $W(\lambda) = \#\{p_i : p_i > \lambda\}$. Then $W(\lambda)$ should be approximately equal to the product of $m\pi_0$ and the length of the interval $[\lambda, 1]$; that is, $E\,W(\lambda) \approx m \pi_0 (1 - \lambda)$. Hence, Storey [6] proposes to estimate the proportion of true null hypotheses, $\pi_0$, with an appropriately chosen $\lambda$, by

$$\hat\pi_0^{s}(\lambda) = \frac{W(\lambda)}{(1 - \lambda)\, m}. \tag{8}$$

The value of $\lambda$ affects the behavior of $\hat\pi_0^{s}(\lambda)$, which has a large bias and small variance when $\lambda$ is small and a small bias and large variance when $\lambda$ is large. Because the choice of $\lambda$ therefore involves a bias-variance trade-off, Storey et al. [7] propose bootstrapping, a resampling technique, to choose $\lambda$ so as to minimize the mean square error of $\hat\pi_0^{s}(\lambda)$. The resulting estimator is denoted by $\hat\pi_0^{s}$.
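The estimator in (8) is simple enough to compute directly. The following Python sketch is a minimal illustration for a fixed $\lambda$; it does not implement the bootstrap choice of $\lambda$, and the function name `storey_pi0` and the toy $p$ values are assumptions for illustration only.

```python
def storey_pi0(pvalues, lam=0.5):
    """Storey's estimate W(lambda) / ((1 - lambda) * m) for a fixed lambda, as in (8)."""
    if not 0.0 < lam < 1.0:
        raise ValueError("lambda must lie in (0, 1)")
    m = len(pvalues)
    w = sum(1 for p in pvalues if p > lam)   # W(lambda) = #{p_i : p_i > lambda}
    return w / ((1.0 - lam) * m)


pvals = [0.01, 0.02, 0.03, 0.04, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9]   # toy data (assumption)
est = storey_pi0(pvals, lam=0.5)
print(est)
```

Note that (8) is not truncated at 1, so the value returned can exceed 1 for some data sets; whether to cap it is left to the user.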

Note that the idea leading to Storey’s estimator $\hat\pi_0^{s}$ is to treat the additional term $(1 - \pi_0)\, P(p > \lambda \mid H = 1)$ as zero, in order to remove the complication caused by the unknown alternative distribution. As a consequence, Storey’s estimator $\hat\pi_0^{s}$ will tend to overestimate $\pi_0$ by an amount $(1 - \pi_0)\, P(p > \lambda \mid H = 1) / (1 - \lambda)$, at least theoretically. However, the anticipated size of the overestimate becomes less significant when $\pi_0$ is closer to 1, that is, when $1 - \pi_0$ is closer to 0. The bias of $\hat\pi_0^{s}$ as an estimator of $P(p > \lambda) / (1 - \lambda)$ can then dominate when $\pi_0$ is close to 1, as is evident in the simulation results in Section 4 and as observed by other authors as well, for example, Langaas et al. [1].
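To give a feel for the size of this overestimate, the sketch below evaluates $P(p > \lambda)/(1 - \lambda) = \pi_0 + (1 - \pi_0)\, P(p > \lambda \mid H = 1)/(1 - \lambda)$ under the normal mixture used in the simulations of Section 4, where alternative $z$-scores are $N(1,1)$ and $p = 1 - \Phi(z)$. The function name `storey_limit` and its `shift` parameter are assumptions introduced for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()   # standard normal N(0, 1)


def storey_limit(pi0, lam, shift=1.0):
    """Value of P(p > lambda) / (1 - lambda) when alternative z-scores are N(shift, 1)."""
    # p > lambda  <=>  z < Phi^{-1}(1 - lambda); under the alternative, z - shift ~ N(0, 1).
    tail_alt = STD_NORMAL.cdf(STD_NORMAL.inv_cdf(1.0 - lam) - shift)
    return pi0 + (1.0 - pi0) * tail_alt / (1.0 - lam)


for pi0 in (0.25, 0.5, 0.75):
    print(pi0, round(storey_limit(pi0, 0.5), 3))
```

For $\lambda = 0.5$ this gives roughly 0.488, 0.659, and 0.829 for $\pi_0 = 0.25$, 0.5, and 0.75, so the theoretical overestimate shrinks from about 0.24 to about 0.08 as $\pi_0$ approaches 1.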

2.3. Convest Method

Let $f$ and $h$ be defined as in (6). Langaas et al. [1] prove that if $h(p)$ is twice differentiable, convex, and decreasing, then $f$ can be expressed as

$$f(p) = \int_0^1 f_\theta(p)\, \mu(d\theta), \tag{9}$$

where the kernel density is

$$f_\theta(p) = \begin{cases} \dfrac{2(\theta - p)}{\theta^2}\, I_{[0,\theta]}(p), & \text{if } \theta \in (0,1], \\ I_{[0,1]}(p), & \text{if } \theta = 0, \end{cases} \tag{10}$$

with $\mu$ being any probability measure on $[0,1]$. Thus Langaas et al. [1] are able to characterize the nonparametric maximum likelihood estimate of the density $f(p)$ as

$$\hat f(p) = \int_0^1 f_\theta(p)\, \hat\mu(d\theta), \tag{11}$$

where $\hat\mu$ is the nonparametric maximum likelihood estimate of $\mu$. Let $p_{(1)}, \ldots, p_{(m)}$ be the order statistics of the $p$ values. The nonparametric maximum likelihood estimator of $f(\cdot)$ is given by

$$\hat f = \arg\max_{f} \prod_{i=1}^{m} f\bigl(p_{(i)}\bigr), \tag{12}$$

where the maximum is taken over densities of the form (9). Langaas et al. [1] then propose to estimate the proportion $\pi_0$ by

$$\hat\pi_0^{c} = \hat f(1). \tag{13}$$

Because the estimator $\hat\pi_0^{c}$ is constructed from a density estimate $\hat f(p)$ at the upper bound of the support, namely $\hat f(1)$, it can be conservative and overestimate $\pi_0$ when the assumption $h(1) = 0$ is questionable or when $h(p) \to 0$ slowly as $p \to 1$. The overestimation can be even more severe when $\pi_0$ is not large, that is, when $1 - \pi_0$ is not so small. However, the remarks at the end of Section 2.2 are applicable to $\hat\pi_0^{c}$ as well.

2.4. Average Estimate Approach

The average estimate approach was motivated by Storey’s method. Jiang and Doerge [8] observe (and many other authors point out as well) that Storey’s estimator has a large bias and small variance when $\lambda$ is small and a small bias and large variance when $\lambda$ is large. Because neither extreme of $\lambda$ is satisfactory on its own, Jiang and Doerge [8] propose to combine Storey’s estimates over values of $\lambda$ that range from small to large.

Let $0 < \lambda_1 < \lambda_2 < \cdots < \lambda_n < 1$, and suppose that, for each $\lambda_i$, we have Storey’s estimate

$$\hat\pi_0(\lambda_i) = \frac{W(\lambda_i)}{(1 - \lambda_i)\, m}. \tag{14}$$

Jiang and Doerge [8] propose to estimate $\pi_0$ by the average of $\hat\pi_0(\lambda)$ over the values $\lambda_i$; that is,

$$\hat\pi_0^{a} = \frac{1}{n} \sum_{i=1}^{n} \hat\pi_0(\lambda_i). \tag{15}$$

This estimate aims to keep both the bias and the variance small at the same time, provided the $\lambda_i$’s are selected appropriately.
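A minimal Python sketch of the average in (15) follows. It assumes a user-supplied grid of $\lambda$ values and does not reproduce Jiang and Doerge’s data-driven choice of the grid; the function names and the toy data are illustrative assumptions.

```python
def storey_pi0(pvalues, lam):
    """Storey's estimate for a fixed lambda, as in (14)."""
    m = len(pvalues)
    w = sum(1 for p in pvalues if p > lam)   # W(lambda)
    return w / ((1.0 - lam) * m)


def average_pi0(pvalues, lambdas):
    """Average of Storey's estimates over a grid of lambda values, as in (15)."""
    if not lambdas or any(not 0.0 < lam < 1.0 for lam in lambdas):
        raise ValueError("lambdas must be a non-empty grid inside (0, 1)")
    return sum(storey_pi0(pvalues, lam) for lam in lambdas) / len(lambdas)


pvals = [0.01, 0.02, 0.03, 0.04, 0.1, 0.2, 0.4, 0.6, 0.8, 0.9]   # toy data (assumption)
avg = average_pi0(pvals, [0.25, 0.5, 0.75])
print(avg)
```

On the toy data the three individual estimates are 8/15, 3/5, and 4/5, which illustrates how strongly a single Storey estimate can depend on $\lambda$ and why averaging is attractive.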

Define $0 = t_1 < t_2 < \cdots < t_B < t_{B+1} = 1$ as equally spaced points in the interval $[0,1]$, so that $[0,1]$ is divided into $B$ small intervals of equal length $1/B$; specifically, $t_i = (i - 1)/B$. Let $b_i = \#\{p_k : p_k \ge t_i\}$ and $c_i = \#\{p_k : t_i \le p_k < t_{i+1}\}$ for $i = 1, \ldots, B$. Define

$$i^{*} = \min\Bigl\{ i : c_i \le \frac{b_i}{B - i + 1} \Bigr\}. \tag{16}$$

Jiang and Doerge [8] then propose to estimate $\pi_0$ by

$$\hat\pi_0(B) = \frac{1}{B - i^{*} + 1} \sum_{j=i^{*}}^{B} \hat\pi_0(t_j). \tag{17}$$

To apply the estimate, one has to choose a value of $B$. Jiang and Doerge [8] develop a bootstrapping algorithm to pick the optimal $B$. It should be noted that, as an average of Storey’s estimates, this estimate is expected to inherit the conservativeness of Storey’s method.

3. New Method

We propose a finite mixture model of the uniform distribution and a multinomial distribution $M(k; q_1, \ldots, q_k)$ with mixing proportion $\pi_0$ to fit the $p$ values. Denote this finite mixture distribution by $\mathrm{UM}(\pi_0; k, q_1, \ldots, q_k)$. In this approach, the alternative distribution $h$ defined in (6) is restricted to the multinomial distribution family $M(k; q_1, \ldots, q_k)$. Alternatively, the multinomial distribution can be viewed as a parametric approximation to the unknown nonparametric density $h$, similar in spirit to the empirical likelihood method (see Owen [9]). The procedure is therefore regarded as nonparametric.

To apply the approach, we need to settle two things first: (a) convert the continuous-type observations $p_1, \ldots, p_m$ into discrete data with $k$ categories and (b) select an integer $k$.

3.1. Selection of $k$

It is often the case in applications that the $p$ values are highly skewed (see Storey and Tibshirani [10], Zhao et al. [11], and Markitsis and Lai [12]). In this case, we recommend Doane’s modification of Sturges’ rule for the selection of $k$, to account for the skewness of the mixture distribution (6) (see Doane [13] and Sturges [14]):

$$\hat k = \log_2(1 + \hat\gamma), \tag{18}$$

where $\hat\gamma$ is an estimate of the skewness coefficient. In the case of symmetry, we recommend adopting Sturges’ rule to select $k$:

$$\hat k = 1 + \log_2 m. \tag{19}$$

3.2. Transformation of $p_i$’s

To transform the $p_i$’s, partition the unit interval into $\hat k$ subintervals of equal width $1/\hat k$. Define

$$\xi_{ij} = \begin{cases} 1, & \text{if } \dfrac{j - 1}{\hat k} \le p_i < \dfrac{j}{\hat k}, \\ 0, & \text{otherwise}, \end{cases} \quad i = 1, \ldots, m, \ j = 1, \ldots, \hat k. \tag{20}$$

Keep in mind that

$$\sum_{j=1}^{\hat k} \xi_{ij} = 1, \quad i = 1, \ldots, m, \qquad \sum_{i=1}^{m} \sum_{j=1}^{\hat k} \xi_{ij} = m. \tag{21}$$

In Storey’s Bayesian interpretation of the multiple testing problem, the alternative is true with probability $1 - \pi_0$ and, given that it is true, the $p$ value follows the distribution $h$. The transformed data $\xi_{ij}$ can be interpreted in the same way. The alternative is true with probability $1 - \pi_0$ and, given that it is true, $\xi_i = (\xi_{i1}, \ldots, \xi_{ik})$ is a multinomial random vector distributed as $M(k; q_1, \ldots, q_k)$, for $i = 1, \ldots, m$. Therefore, $\xi_1, \ldots, \xi_m$ are independently and identically distributed according to the finite mixture distribution $\mathrm{UM}(\pi_0; k, q_1, \ldots, q_k)$, and the maximum likelihood estimate $\hat\pi_0$ of $\pi_0$ is obtained from the transformed data $\xi_{ij}$. Explicitly, $\hat\pi_0$ together with $\hat q_1, \ldots, \hat q_k$ maximizes the log-likelihood function

$$l(\pi_0, q_1, \ldots, q_k) = \sum_{i=1}^{m} \log\Bigl[ \pi_0 + (1 - \pi_0) \prod_{j=1}^{k} q_j^{\xi_{ij}} \Bigr]. \tag{22}$$
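The transformation in (20) and the log-likelihood in (22) can be sketched in Python as follows. The sketch uses Sturges’ rule (19) and rounds it up to an integer, which is an assumption since the text does not say how $\hat k$ is made integral; the skewness-adjusted rule (18) is not implemented. Placing $p_i = 1$ in the last bin is likewise an assumption, and all function names are illustrative.

```python
import math


def sturges_k(m):
    """Sturges' rule (19), rounded up to an integer number of bins (rounding is an assumption)."""
    return math.ceil(1 + math.log2(m))


def indicator_matrix(pvalues, k):
    """Rows xi_i = (xi_i1, ..., xi_ik) of (20): a single 1 marks the bin containing p_i."""
    xi = [[0] * k for _ in pvalues]
    for i, p in enumerate(pvalues):
        j = min(int(p * k), k - 1)   # 0-based bin; p == 1 goes to the last bin (assumption)
        xi[i][j] = 1
    return xi


def log_likelihood(pi0, q, xi):
    """Log-likelihood (22): sum_i log(pi0 + (1 - pi0) * prod_j q_j ** xi_ij)."""
    total = 0.0
    for row in xi:
        prod = 1.0
        for qj, x in zip(q, row):
            if x:
                prod *= qj
        total += math.log(pi0 + (1.0 - pi0) * prod)
    return total


pvals = [0.05, 0.3, 0.6, 0.99]      # toy data (assumption)
xi = indicator_matrix(pvals, 4)
print(xi, log_likelihood(0.5, [0.25] * 4, xi))
```

Each row of `xi` sums to 1 and the whole matrix sums to $m$, matching (21).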

3.3. EM Algorithm

Note that maximizing the nonlinear log-likelihood function (22) can be complicated. The EM algorithm offers a convenient way to obtain an approximation to $\hat\pi_0$. To this end, we introduce a latent Bernoulli variable $w$ that indicates the component membership in the finite mixture distribution $\mathrm{UM}(\pi_0; k, q_1, \ldots, q_k)$. That is, $P(w = 1) = \pi_0$ and, given $w = 0$, $\xi_1$ follows $M(k; q_1, \ldots, q_k)$. Note that $(w, \xi_1)$ has the distribution

$$g(w, \xi_1 \mid \pi_0, q_1, \ldots, q_k) = \pi_0^{w} \Bigl[ (1 - \pi_0)\, q_1^{\xi_{11}} q_2^{\xi_{12}} \cdots q_k^{\xi_{1k}} \Bigr]^{1 - w}, \tag{23}$$

for $w = 0$ or 1 and $\xi_{1j} = 0$ or 1, $j = 1, \ldots, k$.

Let $(w_i; \xi_{i1}, \ldots, \xi_{ik})$, $i = 1, \ldots, m$, be a random sample of size $m$ from model (23). In the present problem, the $\xi_{ij}$’s are the only data available for analysis; the $w_i$’s are unobservable and are therefore treated as missing values. This defines a missing-value model. The log-likelihood with the complete data is given by

$$l(\pi_0, q_1, \ldots, q_k) = T \log \pi_0 + (m - T) \log(1 - \pi_0) + \sum_{j=1}^{k} (m - T)\, \xi_{\cdot j} \log q_j, \tag{24}$$

where $T = \sum_{i=1}^{m} w_i$ and $\xi_{\cdot j} = \sum_{i=1}^{m} \xi_{ij}$, for $j = 1, \ldots, k$. We are now ready to describe the EM algorithm.

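E-Step. Let $\pi_0^{(s)}$ and $q^{(s)}$ be the current approximations to the maximum likelihood estimates under the model $\mathrm{UM}(\pi_0; k, q_1, \ldots, q_k)$, and let $\eta = (\eta_1, \ldots, \eta_m)$ denote the observed data. For the next approximation, the E-step establishes the expected log-likelihood function as

$$\begin{aligned} Q(\pi_0, q) &= E_{\pi_0^{(s)}, q^{(s)}}\bigl[ l(\pi_0, q_1, \ldots, q_k) \mid \eta_1, \ldots, \eta_m \bigr] \\ &= E_{\pi_0^{(s)}, q^{(s)}}(T \mid \eta) \log \pi_0 + \bigl[ m - E_{\pi_0^{(s)}, q^{(s)}}(T \mid \eta) \bigr] \log(1 - \pi_0) \\ &\quad + \sum_{j=1}^{k} \bigl[ m - E_{\pi_0^{(s)}, q^{(s)}}(T \mid \eta) \bigr]\, \xi_{\cdot j} \log q_j. \end{aligned} \tag{25}$$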

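M-Step. In the M-step, $Q(\pi_0, q)$ is maximized to yield the next approximations $\pi_0^{(s+1)}$ and $q^{(s+1)}$. Consider

$$\frac{\partial Q}{\partial \pi_0} = 0 \ \Longrightarrow\ \pi_0^{(s+1)} = \frac{E_{\pi_0^{(s)}, q^{(s)}}(T \mid \eta)}{m} = \frac{1}{m} \sum_{i=1}^{m} E_{\pi_0^{(s)}, q^{(s)}}(w_i \mid \eta), \tag{26}$$

where

$$E_{\pi_0^{(s)}, q^{(s)}}(w_i \mid \eta) = P_{\pi_0^{(s)}, q^{(s)}}(w_i = 1 \mid \eta) = \frac{\pi_0^{(s)}}{\pi_0^{(s)} + \bigl(1 - \pi_0^{(s)}\bigr) \prod_{j=1}^{k} \bigl(q_j^{(s)}\bigr)^{\xi_{ij}}}. \tag{27}$$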

From $\partial Q / \partial q_j = 0$, we have

$$q_j^{(s+1)} = \frac{\xi_{\cdot j}}{m - E_{\pi_0^{(s)}, q^{(s)}}(T \mid \eta)} = \frac{\xi_{\cdot j}}{m \bigl(1 - \pi_0^{(s+1)}\bigr)}. \tag{28}$$

To sum up, let $\pi_0^{(s)}, q_1^{(s)}, \ldots, q_k^{(s)}$ be the $s$th approximations to the maximum likelihood estimates $\hat\pi_0, \hat q_1, \ldots, \hat q_k$ that maximize the log-likelihood function defined in (22). Then the $(s+1)$th approximation with the EM algorithm is given by

$$\pi_0^{(s+1)} = \frac{1}{m} \sum_{i=1}^{m} \frac{\pi_0^{(s)}}{\pi_0^{(s)} + \bigl(1 - \pi_0^{(s)}\bigr) \prod_{j=1}^{k} \bigl(q_j^{(s)}\bigr)^{\xi_{ij}}}, \qquad q_j^{(s+1)} = \frac{\xi_{\cdot j}}{m \bigl(1 - \pi_0^{(s+1)}\bigr)}. \tag{29}$$

The EM iteration then proceeds as shown in Algorithm 1.

Algorithm 1: EM iteration.

Input: $\xi_{ij}$, $i = 1, \ldots, m$, $j = 1, \ldots, k$, transformed from the observed $p$ values $p_1, \ldots, p_m$

Output: Estimate of $\pi_0$: $\hat\pi_0$

(1) begin

(2)  Initialization: Set $\pi_0 = \pi_0^{(0)}$, $q_j = q_j^{(0)}$, $1 \le j \le k$.

(3)  repeat

(4)     Let $\pi_0^{(s)}$, $q_j^{(s)}$ be the $s$th (current) approximation.

Compute

$$\pi_0^{(s+1)} = \frac{1}{m} \sum_{i=1}^{m} \frac{\pi_0^{(s)}}{\pi_0^{(s)} + \bigl(1 - \pi_0^{(s)}\bigr) \prod_{j=1}^{k} \bigl(q_j^{(s)}\bigr)^{\xi_{ij}}},$$

$$q_j^{(s+1)} = \frac{\xi_{\cdot j}}{m \bigl(1 - \pi_0^{(s+1)}\bigr)}$$

(5)  until $\bigl| l\bigl(\pi_0^{(s+1)}, q_1^{(s+1)}, \ldots, q_k^{(s+1)}\bigr) - l\bigl(\pi_0^{(s)}, q_1^{(s)}, \ldots, q_k^{(s)}\bigr) \bigr| \le \epsilon$;

(6) Then $\hat\pi_0 = \pi_0^{(s+1)}$.
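The iteration in Algorithm 1 can be sketched in Python as below. The sketch transcribes the two updates literally as they are written and stops when the change in the log-likelihood (22) is at most $\epsilon$. It works with the bin counts $\xi_{\cdot j}$ rather than the full indicator matrix, which is equivalent because every $p$ value in bin $j$ contributes the same term. The starting values ($\pi_0^{(0)} = 0.5$, $q_j^{(0)} = 1/k$), the iteration cap, and the function name `em_pi0` are assumptions made for illustration.

```python
import math


def em_pi0(counts, pi0_init=0.5, eps=1e-8, max_iter=10000):
    """Iterate the two updates of Algorithm 1 on bin counts xi_.j.

    Returns (pi0, q, iterations). Starting values and max_iter are assumptions.
    """
    m = sum(counts)
    k = len(counts)
    pi0 = pi0_init
    q = [1.0 / k] * k   # assumed starting value q_j^(0)

    def loglik(pi0, q):
        # (22): each p value in bin j contributes log(pi0 + (1 - pi0) * q_j).
        return sum(n * math.log(pi0 + (1.0 - pi0) * qj)
                   for n, qj in zip(counts, q) if n)

    old = loglik(pi0, q)
    for it in range(1, max_iter + 1):
        # Update for pi0: average over observations of the weights in (27).
        pi0_new = sum(n * pi0 / (pi0 + (1.0 - pi0) * qj)
                      for n, qj in zip(counts, q)) / m
        # Update for q_j, exactly as written in (28), using the new pi0.
        q = [n / (m * (1.0 - pi0_new)) for n in counts]
        pi0 = pi0_new
        new = loglik(pi0, q)
        if abs(new - old) <= eps:   # stopping rule of step (5)
            break
        old = new
    return pi0, q, it


pi0_hat, q_hat, n_iter = em_pi0([2, 2, 2, 2])   # toy counts (assumption)
print(pi0_hat, n_iter)
```

For the toy counts (2, 2, 2, 2) the iterates are 0.8, 0.7619, 0.7529, and so on, settling near 0.75. In practice the counts would come from binning the observed $p$ values as in Section 3.2.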

It is well known that each EM iteration moves closer to the maximum of the log-likelihood, but only at a linear rate of convergence. If the components are similar in their densities, then the convergence is extremely slow. The convergence will also be slow when the maximum likelihood solution requires some of the weight parameters to be zero, because the algorithm can never reach such a boundary point. An additional and related problem is that of deciding when to stop the algorithm. One risk for a naive user is the natural tendency to use a stopping rule based on the changes in the parameters or the likelihood being sufficiently small. Taking small steps does not indicate that we are close to the solution.

To combat this problem, Lindsay [15] exploited the regularity of the EM algorithm process to predict, via the device known as Aitken acceleration, the value of the log-likelihood at the maximum likelihood solution. Aitken’s acceleration rule is usually recommended for predicting the maximum efficiently whenever one is using a linearly convergent algorithm with a slow rate of convergence. If we let $l_{i-2}$, $l_{i-1}$, and $l_i$ be the log-likelihood values for three consecutive iterations, then the predicted final value is

$$l_i^{\infty} = l_{i-2} + \frac{l_{i-1} - l_{i-2}}{1 - c_i}, \tag{30}$$

where $c_i = (l_i - l_{i-1}) / (l_{i-1} - l_{i-2})$. Terminate the EM iteration when $|l_i^{\infty} - l_i|$ is sufficiently small.
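The prediction in (30) and the associated stopping rule can be sketched in Python as follows. The function names, the default tolerance, and the handling of the degenerate cases (no change between the two earlier iterates, or $c_i = 1$) are assumptions for illustration.

```python
def aitken_limit(l_prev2, l_prev1, l_curr):
    """Predicted final log-likelihood (30) from three consecutive values."""
    step_prev = l_prev1 - l_prev2
    if step_prev == 0.0:
        return l_curr   # earlier iterates already agree; nothing to extrapolate (assumption)
    c = (l_curr - l_prev1) / step_prev
    if c == 1.0:
        raise ZeroDivisionError("ratio c_i equals 1; the extrapolation is undefined")
    return l_prev2 + step_prev / (1.0 - c)


def should_stop(l_prev2, l_prev1, l_curr, tol=1e-4):
    """Stop when the predicted limit is within tol of the current value."""
    return abs(aitken_limit(l_prev2, l_prev1, l_curr) - l_curr) <= tol


# Toy sequence 10 - 0.5**n, which converges linearly to 10 (assumption).
print(aitken_limit(9.0, 9.5, 9.75), should_stop(9.0, 9.5, 9.75))
```

For an exactly geometric sequence such as the toy one, the prediction recovers the limit from only three values, while the current value is still 0.25 away from it.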

4. Simulation Studies

To investigate the properties of the estimators described in Section 2 and compare their performance with that of the newly developed estimating procedure described in Section 3, we conducted simulation experiments. The generation of simulated data and the calculation of the estimates were both done in R. Simulation studies were conducted with the $p$ values based on a one-sided $z$-test in the finite normal mixture model $z \sim \pi_0 N(0,1) + (1 - \pi_0) N(1,1)$, with various values of $m$ and $\pi_0$ for performance comparisons.
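The simulation design can be sketched as follows. This is a Python illustration rather than the R code used for the study; it draws $z$ from the two-component normal mixture and takes the one-sided $p$ value as the upper-tail probability $1 - \Phi(z)$. The function name `simulate_pvalues`, the `shift` parameter, and the seed are assumptions for illustration.

```python
import random
from statistics import NormalDist


def simulate_pvalues(m, pi0, shift=1.0, seed=None):
    """Draw m one-sided p values from z ~ pi0 * N(0,1) + (1 - pi0) * N(shift,1)."""
    rng = random.Random(seed)
    phi = NormalDist().cdf            # standard normal CDF
    pvals, is_null = [], []
    for _ in range(m):
        null = rng.random() < pi0     # H_i = 0 with probability pi0
        z = rng.gauss(0.0 if null else shift, 1.0)
        pvals.append(1.0 - phi(z))    # upper-tail p value of the one-sided z-test
        is_null.append(null)
    return pvals, is_null


pvals, is_null = simulate_pvalues(1000, 0.5, seed=1)
print(len(pvals), sum(is_null))
```

The returned `is_null` flags are available only in simulation; they make it possible to compare any estimate of $\pi_0$ with the realized proportion of true nulls.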

Monte Carlo data were simulated independently from $z \sim \pi_0 N(0,1) + (1 - \pi_0) N(1,1)$ and each $p$ value was computed as $p_i = 1 - \Phi(z_i)$, where $\Phi$ is the cumulative distribution function of the standard normal $N(0,1)$. For the generation of simulated data, three true values of $\pi_0$ were considered, namely, 0.25, 0.5, and 0.75, with sample sizes $m = 200$, 500, and 1,000. In each case, 1,000 Monte Carlo trials were performed. In the implementation of the EM algorithm to compute the proposed new estimate $\hat\pi_0$, the termination rule of the EM algorithm is $|l_i^{\infty} - l_i| \le 10^{-4}$. The EM algorithm converged for all the simulated datasets. In each case, the performance of the proposed new estimate $\hat\pi_0$ was compared with those of several existing procedures on the same set of Monte Carlo data. Specifically, in the simulations, we considered the following existing methods:

Storey’s bootstrap estimate [7], denoted by $\hat\pi_0^{s}$

Langaas et al.’s convex estimate [1], denoted by $\hat\pi_0^{c}$

Jiang and Doerge’s average method estimate [8], denoted by $\hat\pi_0^{a}$

Storey’s estimate $\hat\pi_0^{s}$ was computed using the R package nFDR, and the convex estimate $\hat\pi_0^{c}$ using the function convest in the R package limma. Table 2 summarizes the simulation results. The new estimate is closer to the true $\pi_0$ than the existing estimates in nearly all configurations, with comparable standard errors, while $\hat\pi_0^{s}(0.5)$ performs much worse than $\hat\pi_0^{s}$, as expected.

Table 2: Monte Carlo empirical estimates of $\pi_0$, with standard deviations in parentheses, by the new approach along with existing estimators. The Monte Carlo size is 1,000 in each case. Here $\hat\pi_0$ denotes the new estimator, $\hat\pi_0^{s}$ Storey’s estimator, $\hat\pi_0^{s}(0.5)$ Storey’s estimator with $\lambda = 0.5$, $\hat\pi_0^{c}$ the convex method estimator, and $\hat\pi_0^{a}$ the average method estimator.

$m$    $\pi_0$   $\hat\pi_0$      $\hat\pi_0^{s}$   $\hat\pi_0^{s}(0.5)$   $\hat\pi_0^{c}$   $\hat\pi_0^{a}$
200    0.25      0.235 (0.016)    0.30 (0.021)      0.500 (0.035)          0.314 (0.022)     0.430 (0.03)
200    0.5       0.489 (0.035)    0.613 (0.043)     0.660 (0.047)          0.584 (0.041)     0.626 (0.044)
200    0.75      0.78 (0.055)     0.717 (0.051)     0.79 (0.056)           0.685 (0.048)     0.773 (0.054)
500    0.25      0.249 (0.011)    0.34 (0.015)      0.528 (0.024)          0.33 (0.014)      0.491 (0.022)
500    0.5       0.52 (0.023)     0.646 (0.028)     0.720 (0.032)          0.623 (0.027)     0.704 (0.031)
500    0.75      0.72 (0.032)     0.806 (0.036)     0.860 (0.038)          0.785 (0.035)     0.819 (0.035)
1000   0.25      0.242 (0.005)    0.393 (0.012)     0.482 (0.015)          0.349 (0.011)     0.394 (0.013)
1000   0.5       0.51 (0.016)     0.61 (0.019)      0.66 (0.021)           0.61 (0.019)      0.63 (0.019)
1000   0.75      0.72 (0.023)     0.67 (0.021)      0.86 (0.027)           0.79 (0.025)      0.81 (0.026)

5. Application to Real Life Microarray Data

To further evaluate the performance of the new method in comparison with the three existing methods, consider the real-life data from the DNA microarray experiments reported in Golub et al. [5], which can be downloaded from the R package multtest. The dataset has been used by many authors (see [11, 16] and the references therein) to illustrate applications of proposed methods for estimating the proportion of true null hypotheses in multiple testing problems. The dataset consists of 38 bone marrow samples, 27 acute lymphoblastic leukemia (ALL) samples and 11 acute myeloid leukemia (AML) samples, obtained from acute leukemia patients at the time of diagnosis, before chemotherapy. RNA prepared from bone marrow mononuclear cells was hybridized to high-density oligonucleotide microarrays, produced by Affymetrix and containing probes for 7,129 human genes. After preprocessing, however, only 3,051 genes were accurately readable, resulting in a $3051 \times 38$ microarray matrix. To compare gene expression between the two groups, let $\mu_1$ and $\mu_2$ be the true mean intensities of a given gene in groups 1 and 2, respectively. To determine whether the gene is differentially expressed, we test

$$H_0 : \mu_1 = \mu_2 \ \text{vs} \ H_1 : \mu_1 \ne \mu_2. \tag{31}$$

For each of the 3,051 genes, a two-sample Welch $t$-statistic is computed and its two-sided $p$ value is computed under a central Student’s $t$ distribution with 36 degrees of freedom. A histogram of these $p$ values is displayed in Figure 1. It is clear that the $p$ values are highly skewed to the right.

Figure 1: Histogram of the $p$ values for the leukemia data reported in Golub et al. [5].

The four estimates of the proportion $\pi_0$ of nondifferentially expressed genes among the 3,051 genes, namely $\hat\pi_0$, $\hat\pi_0^{s}$, $\hat\pi_0^{c}$, and $\hat\pi_0^{a}$, are given in Table 3. The four estimates differ appreciably, ranging from 0.4150 to 0.4913. Since many authors have commented that the estimates $\hat\pi_0^{s}$, $\hat\pi_0^{c}$, and $\hat\pi_0^{a}$ are usually conservative, we conclude that the lower estimate of 0.42 given by the proposed new estimating procedure is as expected and might be closer to the true value of $\pi_0$.

Table 3: The estimates of the proportion of true null hypotheses with the microarray data of Golub et al. [5].

$\hat\pi_0$   $\hat\pi_0^{s}$   $\hat\pi_0^{c}$   $\hat\pi_0^{a}$
0.4150        0.4643            0.4701            0.4913

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] M. Langaas, B. H. Lindqvist, and E. Ferkingstad, "Estimating the proportion of true null hypotheses, with application to DNA microarray data," Journal of the Royal Statistical Society, Series B: Statistical Methodology, vol. 67, no. 4, pp. 555-572, 2005. doi:10.1111/j.1467-9868.2005.00515.x
[2] Y. Benjamini and Y. Hochberg, "Controlling the false discovery rate: a practical and powerful approach to multiple testing," Journal of the Royal Statistical Society, Series B: Methodological, vol. 57, no. 1, pp. 289-300, 1995.
[3] B. Gyorffy, A. Gyorffy, and Z. Tulassay, "The problem of multiple testing and solutions for genome-wide studies," Orvosi Hetilap, vol. 146, pp. 559-563, 2005.
[4] R. J. Simes, "An improved Bonferroni procedure for multiple tests of significance," Biometrika, vol. 73, no. 3, pp. 751-754, 1986. doi:10.1093/biomet/73.3.751
[5] T. R. Golub, D. K. Slonim, P. Tamayo, C. Huard, M. Gaasenbeek, J. P. Mesirov, H. Coller, M. L. Loh, J. R. Downing, M. A. Caligiuri, C. D. Bloomfield, and E. S. Lander, "Molecular classification of cancer: class discovery and class prediction by gene expression monitoring," Science, vol. 286, no. 5439, pp. 531-537, 1999. doi:10.1126/science.286.5439.531
[6] J. D. Storey, "A direct approach to false discovery rates," Journal of the Royal Statistical Society. Series B. Statistical Methodology, vol. 64, no. 3, pp. 479-498, 2002. doi:10.1111/1467-9868.00346
[7] J. D. Storey, J. E. Taylor, and D. Siegmund, "Strong control, conservative point estimation and simultaneous conservative consistency of false discovery rates: a unified approach," Journal of the Royal Statistical Society, Series B: Statistical Methodology, vol. 66, no. 1, pp. 187-205, 2004. doi:10.1111/j.1467-9868.2004.00439.x
[8] H. Jiang and R. W. Doerge, "Estimating the proportion of true null hypotheses for multiple comparisons," Cancer Informatics, vol. 6, pp. 25-32, 2008.
[9] A. Owen, "Empirical likelihood for linear models," The Annals of Statistics, vol. 19, no. 4, pp. 1725-1747, 1991. doi:10.1214/aos/1176348368
[10] J. D. Storey and R. Tibshirani, "Statistical significance for genomewide studies," Proceedings of the National Academy of Sciences of the United States of America, vol. 100, no. 16, pp. 9440-9445, 2003. doi:10.1073/pnas.1530509100
[11] H. Zhao, X. Wu, H. Zhang, and H. Chen, "Estimating the proportion of true null hypotheses in nonparametric exponential mixture model with application to the leukemia gene expression data," Communications in Statistics. Simulation and Computation, vol. 41, no. 9, pp. 1580-1592, 2012. doi:10.1080/03610918.2011.611308
[12] A. Markitsis and Y. Lai, "A censored beta mixture model for the estimation of the proportion of non-differentially expressed genes," Bioinformatics, vol. 26, no. 5, pp. 640-646, 2010. doi:10.1093/bioinformatics/btq001
[13] D. P. Doane, "Aesthetic frequency classifications," The American Statistician, vol. 30, no. 4, pp. 181-183, 1976. doi:10.2307/2683757
[14] H. A. Sturges, "The choice of a class interval," Journal of the American Statistical Association, vol. 21, no. 153, pp. 65-66, 1926. doi:10.1080/01621459.1926.10502161
[15] B. G. Lindsay, "Mixture models: theory, geometry and applications," in Proceedings of the NSF-CBMS Regional Conference Series in Probability and Statistics, Institute for Mathematical Statistics, 1995.
[16] Z. Guan, B. Wu, and H. Zhao, "Application of Bernstein polynomial in the estimation of false-discovery-rate," Statistica Sinica, vol. 18, pp. 905-923, 2008.