Control of the False Discovery Proportion for Independently Tested Null Hypotheses

Consider the multiple testing problem of testing m null hypotheses H1, . . . , Hm, among which m0 hypotheses are truly null. Given the P-values for the individual hypotheses, the question of interest is how to combine them to find out which hypotheses are false nulls and possibly to make a statistical inference on m0. Benjamini and Hochberg proposed a classical procedure that controls the false discovery rate (FDR). FDR control is somewhat unsatisfactory in that it only concerns the expectation of the false discovery proportion (FDP). The control of the actual random variable FDP has recently drawn much attention. For any level 1 − α, this paper proposes a procedure to construct an upper prediction bound (UPB) for the FDP for a fixed rejection region. When 1 − α = 50%, our procedure is very close to the classical Benjamini and Hochberg procedure. Simultaneous UPBs for the FDPs of all rejection regions and an upper confidence bound for the unknown m0 are presented consequently. The newly proposed procedure works for finite samples and hence avoids the slow convergence problem of the asymptotic theory.


Introduction
In this paper, we consider the problem of testing m null hypotheses H1, . . . , Hm, among which m0 hypotheses are truly null. We shall assume that P-values are available for the individual hypotheses. In a seminal paper, Benjamini and Hochberg [1] proposed the false discovery rate (FDR) as an alternative to the classically defined family-wise error rate (FWER). The FDR achieves a good balance between the raw P-value threshold and the FWER correction [2]; the former may give too many false positives, and the latter may give too many false negatives. However, FDR control is somewhat unsatisfactory in that it only concerns the expectation of the false discovery proportion (FDP). In practice, researchers may be interested in more detailed statistical inference on the actual random variable FDP, not just its expectation. The goal of this paper is to provide a simple procedure to control the FDP.
Let us first introduce some notation. Given m hypotheses H1, . . . , Hm, let the complete index set be M = {1, . . . , m}, M0 the unknown subset of M for which the null hypotheses are true, and M1 = M \ M0 the subset for which the null hypotheses are false. Write m0 = |M0| and m1 = |M1|, where | · | denotes the cardinality of a set. The P-values for testing the m hypotheses are P1, . . . , Pm. A fixed rejection region for the P-values can conveniently be taken as [0, t] (0 < t < 1). The value of t could be 0.05, for example. Define the number R(t) of all rejected hypotheses and the number V(t) of falsely rejected hypotheses, respectively, as

R(t) = #{i ∈ M : Pi ≤ t},  V(t) = #{i ∈ M0 : Pi ≤ t}. (1.1)

Following the notation of Korn et al. [3], Genovese and Wasserman [4], and Lehmann and Romano [5], the false discovery proportion is defined to be the proportion of falsely rejected null hypotheses among the rejected ones,

Q(t) = V(t)/R(t),

where the ratio is defined to be zero when the denominator is zero. For a given fixed rejection region [0, t], R(t), V(t), and Q(t) are random variables. R, V, and Q will be shorthand for R(t), V(t), and Q(t), respectively, when the rejection region [0, t] is clear from the context. The false discovery rate of Benjamini and Hochberg [1] is FDR = E[Q(t)]. A good understanding of Q will lead investigators to pick an appropriate rejection region [0, t] of P-values. As Q is an unobservable random variable depending on the observed P-values and the rejection region [0, t], the quantity FDR describes only the expectation of this random variable Q. One way to make a more detailed statistical inference on the random variable Q is to derive its distribution, which is very difficult unless a strong assumption is imposed on the P-values from the false null hypotheses. A conservative approach is to compute an upper prediction bound for Q so that we can safeguard against excessive type I errors. In Section 2, for a fixed rejection region [0, t], we compute an upper prediction bound Q_{1−α}(t). If Q_{1−α}(t) had been nonrandom, it would always be no less than the 1 − α quantile of the random variable Q(t). When 1 − α = 50%, our procedure is very close to the classical BH procedure of Benjamini and Hochberg [1]. In other words, the BH procedure gives an approximate 50% upper prediction bound (UPB) for Q. With different degrees of conservativeness, one can take 1 − α at 0.9, 0.95, or 0.99 to ensure high coverage of the false discovery proportion. We also describe how to compute an upper confidence bound (UCB) for m0, the number of true null hypotheses. The UCB for m0 can be used to improve the estimate Q_{1−α}(t). In practice, the rejection region [0, t] needs to be adapted to the actual dataset. In Section 3, we give a procedure to construct an upper prediction band for Q(t) for all t ∈ (0, 1); this band can be used to pick a data-defined rejection region [0, τ] of P-values such that the false discovery proportion Q is controlled at a target level γ with prediction accuracy 1 − α, that is,

pr(Q ≤ γ) ≥ 1 − α.

Thus, with probability at least 1 − α, the value of Q is γ or less. For independent true null P-values, Genovese and Wasserman [4] and Meinshausen and Rice [6] also worked on the control of the FDP in the sense of the above equation. However, their results are based on asymptotic theory, while our focus is on finite-sample results, which avoid the slow convergence problem of the asymptotic theory. Other works, such as Lehmann and Romano [5], Romano and Shaikh [7, 8], and van der Laan et al. [9], proposed procedures that allow dependence in the P-values but potentially lose statistical power as the dependence information is not exploited. Section 4 presents a focused statistical inference by restricting the rejection regions to {[0, t] : t ∈ [t_0, t^0]}, which unifies the results of Sections 2 and 3. Section 5 generalizes the results from independent data to less-independent situations, and Section 6 gives our discussion.

Finding a 1 − α UPB for the False Discovery Proportion for a Fixed Rejection Region
For the sake of simplicity, we will first assume that the P-values from the true null hypotheses are mutually independent and follow the uniform distribution U(0, 1). We make no further assumptions on the P-values from the false null hypotheses. This assumption is the same as in Benjamini and Hochberg [1]. In Section 5 we will generalize the result to less-independent situations. For a fixed rejection region [0, t] of the P-values, we would like to find the 1 − α upper prediction bound (UPB) for the false discovery proportion Q(t). As mentioned in Section 1, the distribution of Q(t) is unknown. However, for any given experimental data, the total number of rejections, R(t), can easily be obtained from (1.1). Under the assumption that the true null P-values are independently distributed as U(0, 1), V(t) has a binomial distribution Bin(m0, t). Let Ui, i = 1, . . . , m, be random variables mutually independently distributed as U(0, 1), and let N_{n,t} = #{i ≤ n : Ui ≤ t}. The 1 − α quantile of V(t) is the 1 − α quantile C_{1−α}(m0, t) of the distribution Bin(m0, t). Here C_{1−α}(m0, t) is defined as the smallest integer k such that pr(Bin(m0, t) ≤ k) ≥ 1 − α. As R(t) can be computed from the observed data, a 1 − α UPB for Q(t) can be estimated by C_{1−α}(m0, t)/R(t) (2.3). The values that g(k + 1) − g(k) and h(k + 1) − h(k) take can only be zero or one.
Applying the definition of C_{1−α}, we obtain the result for part (a).
Note that

(2.4)
Therefore, (2.5) holds. Similarly, we can obtain (2.6). Combining these observations leads to the result for part (b). Parts (a) and (b), respectively, say that g(k) and h(k) are both increasing functions of k. Simple algebra can establish that the differences g(k + 1) − g(k) and h(k + 1) − h(k) are nonnegative, due to the increasing property of the functions g(k) and h(k), and that 0 ≤ g(k + 1) − g(k) ≤ 1 and 0 ≤ h(k + 1) − h(k) ≤ 1. Since g(k) and h(k) take only integer values, the differences g(k + 1) − g(k) and h(k + 1) − h(k) can only take the values zero and one. Thus, we complete the proof of part (c).
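As a concrete numerical illustration of the construction of this section, here is a minimal Python sketch (function names are ours, not the paper's): it reads C_{1−α}(m0, t) as the smallest k with pr(Bin(m0, t) ≤ k) ≥ 1 − α, and forms the conservative UPB C_{1−α}(m, t)/R(t) obtained by replacing the unknown m0 with m.

```python
# Sketch of the fixed-rejection-region UPB, assuming only the paper's
# setup: true null P-values i.i.d. U(0,1), so V(t) ~ Bin(m0, t).
# Names (binom_cdf, binom_quantile, upb_fdp) are ours.
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Bin(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def binom_quantile(n, p, level):
    """C_level(n, p): smallest k with P(Bin(n, p) <= k) >= level."""
    k = 0
    while binom_cdf(k, n, p) < level:
        k += 1
    return k

def upb_fdp(pvalues, t, alpha, m0=None):
    """Conservative (1 - alpha) UPB for Q(t): C_{1-alpha}(m0, t) / R(t).
    With m0 unknown, the most conservative choice m0 = m is used."""
    m = len(pvalues)
    r = sum(p <= t for p in pvalues)   # R(t), number of rejections
    if r == 0:
        return 0.0                     # Q(t) := 0 when R(t) = 0
    if m0 is None:
        m0 = m                         # conservative replacement of m0
    return min(1.0, binom_quantile(m0, t, 1 - alpha) / r)
```

With m0 = m the bound is conservative; plugging in a UCB for m0, as developed later in this section, tightens it.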

Lemma 2.2. For any given t,

pr(Q(t) ≤ Q_{1−α}(t)) ≥ 1 − α. (2.9)

Proof. The proof is straightforward by using the fact that V(t) =_d N_{m0,t}. We have

(2.10)
In the third line, we have used the fact that the set {V(t) ≤ C_{1−α}(m0, t) and R(t) = 0} is the same as the set {R(t) = 0}, which follows by noting that V(t) must be zero when R(t) = 0. Following this proof, we can easily see the analogous statement (2.11). The basic construction of (2.3) is the central idea in formulating prediction inference for Q(t). In practice, m0 is an unknown parameter. The most conservative approach is to replace m0 with m, in which case we obtain a conservative 1 − α UPB for Q(t). The independence assumption among the true null P-values can be used to give a confidence inference for m0; thus, we can find a better estimate of the UPB for Q(t). For any given 0 < λ < 1, a 1 − α UCB for m0 is given by

By part (c), since h(0) = 0 and the increments h(k + 1) − h(k) take only the values zero and one, there exists at least one k with h(k) = m − R(λ), so that (2.12) is well defined. The parameter λ in (2.12) is used to construct a UCB for m0; more discussion about it can be found in Remark 2.6 following the theorem.
(b) In particular, if λ takes the same value as t in the P-value rejection region, then a simultaneous statement for Q(t) and m0 holds. Proof. Use m̂0 as shorthand for m̂_{0,1−α}(λ) in this proof. We want to establish (2.15). The fact that the function h(k) is increasing in k leads to {m0 ≤ m̂0} ⊆ {h(m0) ≤ h(m̂0)}. On the other hand, if m0 > m̂0, then m̂0 is strictly less than m, and we must have h(m̂0) = m − R(λ) according to (2.12). Since m̂0 is the maximum k such that h(k) = m − R(λ), and m0 > m̂0, we have h(m0) ≠ m − R(λ) = h(m̂0). The increasing property of h(k) leads to h(m0) ≥ h(m̂0); combining this with h(m0) ≠ h(m̂0), we obtain h(m0) > h(m̂0). Therefore, we conclude and complete the proof of (2.15). Note that

(2.18)
Hence, we have the proof of part (a). When λ is set equal to t, we have that the set

(2.19)
Thus, we have the proof of part (b) of this theorem. Part (a) gives a confidence inference for m0, and part (b) gives a simultaneous statement for Q(t) and m0, which is more interesting. Meinshausen [13] gives a confidence bound for m0 by using resampling methods, while ours exploits the independence information so that it works for finite samples.
Remark 2.5. Theorem 2.3 of Göb [14] implies that for a binomial distribution, the difference between the median and the mean is less than 1, that is, |C_{0.5}(m0, t) − m0 t| < 1. From (2.3), we know the 50% UPB for Q(t) can be estimated by C_{0.5}(m0, t)/R(t). This UPB for Q(t) is very close to m0 t/R(t), with a difference smaller than 1/R(t). Replacing m0 by m in m0 t/R(t) is equivalent to the classical BH procedure. For a very large R(t), the term 1/R(t) can be ignored, and the BH procedure offers an approximate estimate of the 50% UPB for Q(t).
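The closeness claim of Remark 2.5 is easy to verify numerically. The following self-contained sketch (our own, not from the paper) checks Göb's bound |C_{0.5}(m0, t) − m0 t| < 1 over a small grid of (m0, t) values:

```python
# Numerical check of Remark 2.5: for Bin(m0, t) the median C_{0.5}(m0, t)
# stays within 1 of the mean m0*t, so the 50% UPB C_{0.5}(m0, t)/R(t) is
# within 1/R(t) of the BH-style quantity m0*t/R(t). Grid values are ours.
from math import comb

def binom_median(n, p):
    """Smallest k with P(Bin(n, p) <= k) >= 1/2."""
    cdf, k = 0.0, 0
    while True:
        cdf += comb(n, k) * p**k * (1 - p)**(n - k)
        if cdf >= 0.5:
            return k
        k += 1

for m0 in (10, 100, 1000):
    for t in (0.01, 0.05, 0.2):
        assert abs(binom_median(m0, t) - m0 * t) < 1   # Göb's bound
```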
Remark 2.6. When k is large, the distribution Bin(k, λ) can be closely approximated by N(kλ, kλ(1 − λ)). Let z_{1−α} be the 1 − α quantile of a standard normal distribution. After some algebraic manipulations, we obtain a 1 − α UCB for m0:

(2.20)

Taking 1 − α = 0.5, we have (m − R(λ))/(1 − λ), which is equivalent to (2.3) of Storey et al. [10]. For most practical applications, one can set λ = 0.5. Fine tuning of the parameter λ will be discussed in Section 3.3.
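The 1 − α = 0.5 case of (2.20), that is, the estimate (m − R(λ))/(1 − λ) of Storey et al. [10], can be sketched as follows; the function name and the toy data are ours:

```python
# Sketch of the lambda-based estimate of m0, i.e. the 1 - alpha = 0.5 case
# of (2.20), equivalent to (2.3) of Storey et al. [10]. Names are ours.
def m0_estimate(pvalues, lam=0.5):
    """(m - R(lambda)) / (1 - lambda): P-values above lambda are mostly
    true nulls, and a fraction (1 - lambda) of true nulls lands there."""
    m = len(pvalues)
    r_lam = sum(p <= lam for p in pvalues)   # R(lambda)
    return (m - r_lam) / (1.0 - lam)

# Toy data: 8 clear signals and 12 roughly uniform nulls.
pvals = [0.001] * 8 + [0.06, 0.11, 0.19, 0.27, 0.33, 0.41,
                       0.52, 0.58, 0.66, 0.74, 0.87, 0.95]
print(m0_estimate(pvals, lam=0.5))  # 6 values above 0.5 -> estimate 12.0
```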
When the rejection region [0, t] is small, the UCB for m0 obtained from part (b) of Theorem 2.3 may be too conservative. It may be advantageous to have separate values for λ and t. Part (a) of Theorem 2.3 implies the following.

The Setup
In Section 2, the UPBs for Q are only valid for a fixed rejection region [0, t] of P-values. In practice, researchers will not fix the rejection region [0, t] but will adapt it to the actual data. The logic is the same as in single hypothesis testing. In single hypothesis testing with nested rejection regions {Γα : 0 < α < 1}, for an observed statistic T, one finds the rejection region that contains the observed statistic with the smallest type I error α, that is, the P-value min{α : T ∈ Γα}. The same logic can be applied to our false discovery proportion. In this case, we try to find the largest rejection region [0, t] such that the false discovery proportion Q is not more than γ, say 10%, with probability 1 − α. Define τ = max{t : Q_{1−α}(t) ≤ γ}. We then reject any hypothesis whose P-value is no greater than τ. If τ were independent of (Q, Q_{1−α}), then we could expect that pr(Q ≤ γ) ≥ 1 − α. Asymptotically, τ and (Q, Q_{1−α}) may be independent; this question is open for future research.
To avoid the independence assumption between τ and (Q, Q_{1−α}), we seek an alternative approach: finding simultaneous UPBs for all rejection regions [0, t], t ∈ (0, 1), that is, finding an upper prediction band {Q_{1−α}(t) : t ∈ (0, 1)} such that pr(Q(t) ≤ Q_{1−α}(t) for all t ∈ (0, 1)) ≥ 1 − α. Hence we have simultaneous inferences on Q(t) for every rejection region [0, t], t ∈ (0, 1).
Following the definition of C_{1−α}(n, t) in (2.2) used to construct the UPB for Q(t), we want to define simultaneous critical values for N_{n,t}. Using the distribution of max_{t∈(0,1)} N_{n,t} is unwise, since max_{t∈(0,1)} N_{n,t} = N_{n,1}, which takes the value n with probability one. A better approach is to center N_{n,t}, that is, to consider N_{n,t} − nt. This leads to a test statistic related to the Kolmogorov-Smirnov statistic, which gives an upper confidence band for a cumulative distribution function F(x). It turns out that this method leads to very high UPBs when t is close to zero or one. Therefore, we normalize N_{n,t}, that is, we consider

Z_n = sup_{t∈(0,1)} (N_{n,t} − nt) / √(nt(1 − t)).

Note that Z_n is continuously distributed even though each N_{n,t} is discretely distributed. Let z_{1−α}(n) denote the 1 − α quantile of Z_n. We can then redefine the UPB Q_{1−α}(m0, t) as in (3.8). Corresponding to Lemma 2.2 and Corollary 2.7, we have the similar results below.
Corollary 3.1. For any given 0 < t < 1, Q_{1−α}(m0, t) of (3.8) is an exact 1 − α upper prediction band for the false discovery proportion Q(t); that is, (3.9) holds.

Corollary 3.2. Denote m̂0 = m̂_{0,1−α/2}(λ), and define the band as in (3.11).

Remark 3.3. Using the same idea as in the proof of Lemma 2.2, the proof of the above corollaries is straightforward after converting the comparison between Q and Q_{1−α} into the comparison between V(t) and m0 t + z_{1−α}(m0)√(m0 t(1 − t)). This conversion provides a powerful tool for understanding the false discovery proportion.
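Remark 3.3's comparison suggests a plug-in sketch of the simultaneous band. Assuming, per the remark, that the event {Q(t) ≤ Q_{1−α}(m0, t)} reduces to {V(t) ≤ m0 t + z√(m0 t(1 − t))} with z = z_{1−α}(m0), the band at each observed threshold can be computed as below. This is our reading of (3.8), not a verbatim reproduction; the quantile z must be supplied (e.g., from the Monte Carlo of Section 3.2), and the function names are ours.

```python
# Sketch of the simultaneous band of Section 3 (our reading of (3.8)):
# bound(t) = (m0*t + z*sqrt(m0*t*(1 - t))) / R(t), capped at 1, evaluated
# at every observed P-value threshold. z = z_{1-alpha}(m0) is assumed
# given; distinct P-values are assumed so that R(t) = k at the k-th one.
from math import sqrt

def upb_band(pvalues, z, m0=None):
    """Simultaneous UPBs for Q(t) at every observed P-value threshold."""
    m = len(pvalues)
    if m0 is None:
        m0 = m                         # conservative replacement of m0
    band = []
    for k, t in enumerate(sorted(pvalues), start=1):
        r_t = k                        # R(t) at t = k-th smallest P-value
        bound = (m0 * t + z * sqrt(m0 * t * (1 - t))) / r_t
        band.append((t, min(1.0, bound)))
    return band
```

A data-defined rejection region then takes the largest t whose band value is at most the target level γ.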
Remark 3.4. The formulation of Q_{1−α}(m0, t) in (3.8) is motivated by the normal approximation of N_{n,t}. But our definition of Q_{1−α}(m0, t) gives exact UPBs simultaneously for all t ∈ (0, 1), due to the exactness of the quantile z_{1−α}.
Remark 3.5. Meinshausen and Rice [6] and Donoho and Jin [15] also utilize the empirical process Z_n. However, they focus on the asymptotic theory for Z_n, which may face the slow convergence problem described in the next section. Our focus is on finite-sample control.
Remark 3.6. Let f_t(k) = (k − nt)/√(nt(1 − t)). After some simplification, we have Z_n = max_{1≤k≤n} f_{U_(k)}(k), where U_(k) is the kth smallest among the n samples from the U(0, 1) distribution. This formulation facilitates the computation of the distribution of Z_n by Monte Carlo methods. The standard error associated with the Monte Carlo simulations in computing the probability in (3.7) is no greater than √(α(1 − α)/B), where B is the number of simulations.
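Under our reading of Remark 3.6, the order-statistic form Z_n = max_{1≤k≤n} f_{U_(k)}(k) yields a direct Monte Carlo sketch; the seed, the number of simulations B, and the function names below are our own choices:

```python
# Monte Carlo sketch of the distribution of Z_n (our reading of Remark
# 3.6): each draw sorts n uniforms and takes the maximum of the normalized
# deviations at the order statistics. Seed and B are arbitrary choices.
import random
from math import sqrt

def z_n_draw(n, rng):
    """One draw of Z_n = max_k (k - n*U_(k)) / sqrt(n*U_(k)*(1 - U_(k)))."""
    u = sorted(rng.random() for _ in range(n))
    return max((k - n * u[k - 1]) / sqrt(n * u[k - 1] * (1 - u[k - 1]))
               for k in range(1, n + 1))

def z_quantile(n, level, B=20000, seed=0):
    """Estimate the `level` quantile of Z_n from B simulated draws."""
    rng = random.Random(seed)
    draws = sorted(z_n_draw(n, rng) for _ in range(B))
    return draws[int(level * B)]
```

As the remark notes, the Monte Carlo standard error for an estimated probability is at most √(α(1 − α)/B), so B can be chosen to meet a target accuracy.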

Computing the Distribution of Z_n
In order to make simultaneous inferences, we need to know the distribution of Z_n defined in (3.6). Example 1 of Jaeschke [16] showed that, asymptotically, for any x,

(3.13)
This implies that Z_n/√(2 ln ln n) converges to 1 in probability as n goes to ∞. Jaeschke [16] claimed that this convergence in probability is of almost no practical use. This is where we need to be cautious in using asymptotic results. Figure 1 shows the poor approximation of the asymptotic result, even for a very large n = 10^5. Noe and Vandewiele [17] gave an iterative algorithm to compute the exact probability pr(Z_n ≤ z). Their algorithm is only good for very small n, due to the computational time and the propagation of precision errors in representing real numbers on a computer. Equation (24) of their paper gives an approximate formula for n = 1, . . . , 100,

(3.14)
This approximation is very good for z ≥ 4 but is away from the true probability when z < 4. For our applications, the 50% quantile (median) of Z_n is very useful, but the approximation of (3.14) is poor there. In order to overcome the above poor approximations, we propose to use the Monte Carlo method to obtain the probability pr(Z_n ≤ z). Figures 2 and 3 give the probability for z ∈ [1, 20] with 10^6 simulations for n = 1, 10, 10^2, 10^3, 10^4, and 10^5. The Monte Carlo generated quantiles z for n = 1, . . . , 100 are almost the same as the quantiles that can be computed by the exact algorithm of Noe and Vandewiele [17]. Our two figures show that the distribution of Z_n does not change dramatically from n = 1 to 10^5. This property is beneficial for a multiple testing problem with a large number of hypotheses, as such a problem will not be overpenalized. Table 1 gives the quantiles of Z_n with n = 10^5.

More about the Upper Confidence Bound for m0
In computing the UCB for m0, and consequently the UPB for Q(t), we rely on the unspecified parameter λ. A conventional choice of λ is 0.5. It is tempting to use min_{λ∈(0,1)} m̂_{0,1−α}(λ) as the best UCB for m0. This approach should be avoided, as it may lead to an overoptimistic UCB. We can use the same idea as in computing the simultaneous upper prediction bounds for Q(t) to find a UCB for m0. Equation (2.20) motivates the following theorem.
Proof. Note that when x > 0,

(3.21)
The last step follows since (i) z_{1−α}(m) is no less than z_{1−α}(n) for any n ≤ m, and (ii) m − R(λ) ≥ m0 − V(λ). Fact (ii) gives

(3.22)
Following the same idea as in the proof of part (b) of Theorem 2.3, we can show the corresponding result for the set in (3.23).

A Focused Inference on Q and m0: A Unified Approach
In many applications, it may be unnecessary to compute the simultaneous UPBs for Q(t) for all t in (0, 1), or to use m̂_{0,1−α}(λ) for all λ in (0, 1) to derive a 1 − α UCB for m0. In most applications, it may be reasonable to restrict the rejection regions to {[0, t] : t ∈ [t_0, t^0]}. The lower endpoint t_0 can take the value 0.01/m, based on Bonferroni FWER control at level 0.01; it is rare to consider a smaller rejection region than this. The upper endpoint t^0 can take the value 0.05, as it is rare to consider a rejection region larger than [0, 0.05] even in a single hypothesis testing problem. For the same reason, we can also restrict λ to [λ_0, λ^0] in (3.17). The interval [λ_0, λ^0] can be taken as a region close to one, since the minimum of m̂_{0,1−α}(λ) is reached when λ is close to 1 [18]; but if λ is too close to 1, m̂_{0,1−α}(λ) is not stable. One good choice is λ_0 = 0.8 and λ^0 = 0.95. The above scenario is a focused inference on Q and m0. The quantile z of Section 3.2 would be a little too conservative here. We can redefine Z*_n by taking the supremum only over these restricted intervals. From this Z*_n we can define the 1 − α quantile z*_{1−α}(n) and derive results similar to Theorem 3.7. Figure 4 shows the quantiles z*_{1−α}(n) for n = 1, . . . , 10^5 with [t_0, t^0] = [0.01/n, 0.05] and [λ_0, λ^0] = [0.8, 0.95]. Table 2 gives the numerical values of z*_{1−α}(n) for n = 10^5. It clearly shows that z*_{1−α}(n) is around 10% smaller than the unrestricted quantile z_{1−α}(n). For small values of α, say α ≤ 0.01, the former is at least 25% smaller than the latter. Note that the 1 − α UCB for m0 takes not only the minimum of m̂*_{0,1−α}(λ) for λ ∈ [λ_0, λ^0], but also the minimum of m̂*_{0,1−α}(t) for t ∈ [t_0, t^0]. This advantage is due to the construction of Z*_n, which takes the maximum over these two intervals. The details of the calculation are summarized in Algorithm 2. The proof of this corollary is the same as that of Theorem 3.7.
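Since the display defining Z*_n is not reproduced above, the following sketch encodes one plausible reading: the same normalized empirical process, with the supremum restricted to [t_0, t^0] ∪ [λ_0, λ^0] (the text says Z*_n takes the maximum over these two intervals). Treat the definition, the names, and the window values as assumptions rather than the paper's exact construction.

```python
# Hedged sketch of the focused statistic Z*_n of Section 4, read as
# sup of (N_{n,t} - n*t)/sqrt(n*t*(1-t)) over the two windows
# [t_0, t^0] and [lambda_0, lambda^0]. Window endpoints must lie in (0, 1).
import random
from math import sqrt

def z_star_draw(n, windows, rng):
    """One draw of the restricted supremum, under our assumed definition."""
    u = sorted(rng.random() for _ in range(n))

    def f(t):
        n_t = sum(x <= t for x in u)   # N_{n,t}
        return (n_t - n * t) / sqrt(n * t * (1 - t))

    # Between jumps the normalized deviation decreases in t, so the sup
    # over each window is attained at its left endpoint or at an order
    # statistic inside it; right endpoints are included for safety.
    cand = [a for a, b in windows] + [b for a, b in windows]
    cand += [x for a, b in windows for x in u if a <= x <= b]
    return max(f(t) for t in cand)

# Example: n = 40, [t_0, t^0] = [0.01/40, 0.05], [l_0, l^0] = [0.8, 0.95].
rng = random.Random(0)
draws = [z_star_draw(40, [(0.01 / 40, 0.05), (0.8, 0.95)], rng)
         for _ in range(1000)]
```

Quantiles z*_{1−α}(n) would then be read off the sorted draws, exactly as for Z_n in Section 3.2.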
By setting λ_0 = λ^0 = t and t_0 = t^0 = t, this corollary is equivalent to Theorem 2.3 through some algebraic manipulation, although Theorem 2.3 uses the exact confidence bound from the binomial distribution without relying on the quantiles of Z*_n. Furthermore, Theorem 3.7 is exactly a special case of Corollary 4.1, obtained by setting λ_0 = 0, λ^0 = 1, t_0 = 0, and t^0 = 1 and by considering open intervals rather than closed intervals. The focused inference thus unifies both the fixed rejection region approach and the simultaneous approach. We should be cautious about selecting [t_0, t^0] and [λ_0, λ^0] based on the observed data, which may result in overoptimistic false discovery proportions. These settings have to be decided before the data are generated. A careful study of choosing appropriate values for [t_0, t^0] and [λ_0, λ^0] is open for future research.

Generalizing the Results to Less-Independent Situations
The results of Sections 2, 3, and 4 are based on the assumption that the true null P-values are independently distributed as U(0, 1). Given this, we need no further assumptions concerning the false null P-values. This independence assumption can be weakened as in the following.

Figure 4: The probability distribution of Z*_n, computed with 10^6 Monte Carlo simulations. The interval [t_0, t^0] takes the value [0.01/n, 0.05], and the interval [λ_0, λ^0] takes the value [0.80, 0.95]. The curves from the top to the bottom correspond to n = 1, 10, 10^2, 10^3, 10^4, and 10^5.
Setting τ to the largest t in [t_0, t^0] whose focused UPB for Q(t) is no greater than γ, reject the hypotheses whose P-values are no greater than τ; this ensures that the false discovery proportion Q does not exceed γ with probability 1 − α.

Algorithm 2: Focused simultaneous inferences on the UPBs for the false discovery proportion and the UCB for m0.

Binomial Dominant Condition
The notation X ≤_d Y means that the random variable X is stochastically no greater than the random variable Y, that is, pr(X ≤ x) ≥ pr(Y ≤ x) for any x. Replacing the independence assumption by the binomial dominant condition, the results corresponding to Lemma 2.2, Theorem 2.3, and Corollary 2.7 in Section 2 still hold for a fixed rejection region. For the simultaneous UPBs in Sections 3 and 4, we need a stronger assumption than the binomial dominant condition, as the joint distribution of {V(t), t ∈ (0, 1)} needs to be specified. We can replace the binomial dominant condition by the following.
Replacing the independence assumption by this joint binomial dominant condition, the results in Sections 3 and 4 remain valid for the upper prediction band for Q and the UCB for m0. A special case of the joint binomial dominant condition occurs when the true null P-values are independent with distributions stochastically no smaller than U(0, 1). This happens when the null hypothesis is composite or when the statistic used to test the null hypothesis is not a continuous random variable.
More generally, we would like the construction of the upper prediction band for Q not to rely on the independence assumption or on any kind of weak dependence assumption (the binomial dominant condition or the joint binomial dominant condition). The method of Romano and Shaikh [7, 8] can be applied without any assumptions on the dependence, but it may lose power because the correlation structure of the data is not exploited. Resampling procedures [13, 19] have been proposed to address this limitation.

Discussion
The method of this paper applies to data where the true null P-values are independent, or to slightly dependent data where the joint binomial dominant condition is satisfied. The assumption does not rely on any specification of the false null P-values. In this paper we used the idea of considering a fixed rejection region to construct a UPB for Q(t) and a UCB for m0. By utilizing the normalized empirical process Z_n = sup_{t∈(0,1)} (N_{n,t} − nt)/√(nt(1 − t)), we find simultaneous UPBs for Q(t) for all t ∈ (0, 1) and can further modify the construction of the UCB for m0. The result of Theorem 3.7 gives a joint statement about the UCB for m0 and the simultaneous UPBs for the false discovery proportion Q. The focused approach in Corollary 4.1 unifies the fixed rejection region method and the simultaneous approach.
The method in this paper is based on finite samples and avoids the slow convergence problem of the asymptotic theory for the empirical process Z_n. The Monte Carlo simulations give very accurate estimates of the quantiles of Z_n. The standard error associated with the Monte Carlo simulations in computing the probability in (3.7) is no greater than √(α(1 − α)/B), where B is the number of simulations.
For datasets where the test statistics are not independent and do not satisfy the joint binomial dominant condition, the method in this paper is not guaranteed to work. One can alternatively use the methods proposed in Romano and Shaikh [7, 8], Meinshausen [13], or Ge et al. [19]. The method proposed in this paper can potentially be extended to dependent data by using resampling, and this work is open for future research.

Remark 2.4. Theorem 2.3 gives researchers a good sense of the total number m0 of true null hypotheses. Other papers, for example, Storey et al. [10], Benjamini and Hochberg [11], and Langaas et al. [12], gave only point estimates of m0 or of π0 = m0/m. Part (a) gives a confidence inference for m0.

Figure 1: Plot of the probability pr(Z_n ≤ z) for 1 ≤ z ≤ 20 with n = 10^5. The x-axis is plotted on a log scale. The black solid curve is computed from 10^6 Monte Carlo simulations. The red dashed curve is based on (3.13). The green dot-dashed curve is computed from (3.14).

Figure 2: Plot of the probability pr(Z_n ≤ z) for 1 ≤ z ≤ 20. The probability is computed with 10^6 Monte Carlo simulations. The curves from the top to the bottom correspond to n = 1, 10, 10^2, 10^3, 10^4, and 10^5.

Figure 3: A blowup of Figure 2, showing the part of the probability pr(Z_n ≤ z) that is close to 1; 1 − pr(Z_n ≤ z) is drawn on a log scale on the y-axis.

Table 1: The quantiles of Z_n of Figure 3, where n = 10^5. This table is estimated from 10^6 Monte Carlo simulations. The column 1 − α gives the probabilities, and the column z gives the quantiles of Z_n.

Table 2: The quantiles of Z*_n of Figure 4, where n = 10^5. The column 1 − α gives the probabilities, and the column z* gives the quantiles of Z*_n.