The problem of estimating the proportion,
Estimating the proportion
In a multiple hypothesis testing problem,
We assume the
Outcomes from
                 Not rejected   Rejected   Total
True null             U            V        m0
Non-true null         T            S      m - m0
Total                 W            R        m
In Table
The problem of estimating the proportion
Benjamini and Hochberg [
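An estimate of the proportion of true nulls is typically plugged into a false discovery rate procedure such as the Benjamini–Hochberg step-up rule, where replacing m by m times the estimated proportion gives the adaptive variant. A minimal sketch (function name is ours):

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.05, pi0=1.0):
    """Step-up BH procedure; passing an estimate pi0 < 1 gives the adaptive variant."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # compare the i-th smallest p-value with i * alpha / (m * pi0)
    thresh = alpha * np.arange(1, m + 1) / (m * pi0)
    passed = p[order] <= thresh
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])  # largest i meeting the step-up criterion
        reject[order[:k + 1]] = True       # reject all hypotheses up to that rank
    return reject
```
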
The paper is organized as follows. Section
A mixture model can be used to fit the
Consider the common marginal mixture density of the
Note that the idea leading to Storey’s estimator
Let
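Storey's estimator counts the p-values exceeding a tuning threshold lambda and rescales by the null expectation, since true-null p-values are uniform on (0, 1). A minimal sketch of that well-known estimator (capping at 1 is a common convention):

```python
import numpy as np

def storey_pi0(pvals, lam=0.5):
    # fraction of p-values above lambda, divided by the null mass (1 - lambda)
    p = np.asarray(pvals, dtype=float)
    return min(1.0, float(np.mean(p > lam) / (1.0 - lam)))
```
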
The average-estimate approach was motivated by Storey’s method. Jiang and Doerge [
Let
Define
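The definitions above are incomplete in this copy; under the reading that the average-estimate approach averages Storey-type estimates over a grid of lambda values rather than picking a single one, a sketch might look like:

```python
import numpy as np

def average_pi0(pvals, lams=None):
    # average Storey-type estimates over a grid of lambda values
    # (the grid below is an illustrative choice, not the paper's)
    p = np.asarray(pvals, dtype=float)
    if lams is None:
        lams = np.arange(0.05, 0.95, 0.05)
    ests = [np.mean(p > l) / (1.0 - l) for l in lams]
    return min(1.0, float(np.mean(ests)))
```
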
We propose a finite mixture model of the uniform distribution and a multinomial distribution
To apply the approach, we need to settle two things first: (a) convert the continuous-type observations
It is often the case in applications that the
To transform
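Converting the continuous p-values into counts amounts to binning them into k equal-width cells on [0, 1]; under the null each cell then has expected probability 1/k. A one-line sketch (the cell number k is an illustrative choice):

```python
import numpy as np

def pvalue_counts(pvals, k=20):
    # bin p-values into k equal-width cells on [0, 1]
    counts, _ = np.histogram(np.asarray(pvals, dtype=float), bins=k, range=(0.0, 1.0))
    return counts
```
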
Note that maximizing the nonlinear log-likelihood function (
Let
From
To sum up, let
(1)
(2) Initialization: Set
(3)
(4) Set
Compute
(5)
(6) Then
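The steps of the listing above are incomplete in this copy. To illustrate the kind of update involved, the following is a minimal EM sketch (our own, not the paper's exact algorithm) for a two-component mixture in which one component has the equal cell probabilities 1/k implied by the uniform and the other is a free multinomial with cell probabilities q:

```python
import numpy as np

def em_pi0(counts, n_iter=500, tol=1e-10):
    """EM for cell probabilities f_j = pi0 * (1/k) + (1 - pi0) * q_j."""
    n = np.asarray(counts, dtype=float)
    k, total = n.size, n.sum()
    pi0, q = 0.5, np.full(k, 1.0 / k)          # initial values (assumed)
    for _ in range(n_iter):
        f = pi0 / k + (1.0 - pi0) * q          # current mixture cell probabilities
        gamma = (pi0 / k) / f                  # E-step: null responsibility per cell
        new_pi0 = float(np.sum(n * gamma) / total)   # M-step: mixing weight
        w = n * (1.0 - gamma)                  # M-step: alternative cell masses
        if w.sum() > 0:
            q = w / w.sum()
        if abs(new_pi0 - pi0) < tol:
            pi0 = new_pi0
            break
        pi0 = new_pi0
    return pi0, q
```

By EM monotonicity, each iteration cannot decrease the log-likelihood of the observed counts.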
It is well known that each EM iteration increases the log-likelihood toward a maximum, but only at a linear convergence rate. If the components are similar in their densities, convergence is extremely slow. Convergence is also slow when the maximum likelihood solution requires some of the weight parameters to be zero, because the algorithm can never reach such a boundary point. An additional and related problem is deciding when to stop the algorithm. One risk to a naive user is the natural tendency to stop when the changes in the parameters or in the likelihood become sufficiently small; taking small steps does not indicate that we are close to the solution.
To combat this problem, Lindsay [
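The cited remedy is truncated in this copy. One standard device in the EM literature (not necessarily the one adopted here) is Aitken acceleration: use the last three log-likelihood values to extrapolate the limit of the linearly convergent sequence, and stop when the current value is close to that estimated limit rather than when steps become small:

```python
def aitken_stop(ll, tol=1e-6):
    """Given the last three log-likelihood values, decide whether to stop.

    Estimates the log-likelihood limit by Aitken acceleration; stopping on
    |l_inf - l_k| < tol guards against mistaking small steps for convergence.
    """
    l0, l1, l2 = ll
    denom = l1 - l0
    if denom == 0:
        return True                     # sequence has stalled completely
    c = (l2 - l1) / denom               # estimated linear rate
    if c >= 1:
        return False                    # not yet in the linear regime
    l_inf = l1 + (l2 - l1) / (1.0 - c)  # extrapolated limit
    return abs(l_inf - l2) < tol
```
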
In order to investigate the properties of the estimators described in Section
Monte Carlo data were simulated independently from
Storey’s bootstrap estimate denoted by
Langaas et al.’s convex estimate denoted by
Jiang and Doerge’s average method estimate denoted by
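The details of the simulation design are truncated in this copy. A common design in this literature, assumed here purely for illustration, draws a pi0 fraction of the test statistics under the null and the rest under a mean shift, then converts them to one-sided z-test p-values:

```python
import math
import numpy as np

def simulate_pvalues(m, pi0, effect=2.0, seed=0):
    """Simulate m one-sided z-test p-values; `effect` is an assumed mean shift."""
    rng = np.random.default_rng(seed)
    m0 = int(round(pi0 * m))            # number of true nulls
    z = np.concatenate([rng.standard_normal(m0),            # null statistics
                        rng.standard_normal(m - m0) + effect])  # alternatives
    # one-sided p-value: P(Z > z) = 0.5 * erfc(z / sqrt(2))
    return np.array([0.5 * math.erfc(zi / math.sqrt(2.0)) for zi in z])
```
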
Monte Carlo empirical estimates with standard deviations (in parentheses) by the new approach along with the existing estimators. The Monte Carlo size is 1,000 in each case.

m      pi0     Estimates of pi0
200    0.25    0.235 (0.016)   0.30 (0.021)    0.500 (0.035)   0.314 (0.022)   0.430 (0.03)
       0.5     0.489 (0.035)   0.613 (0.043)   0.660 (0.047)   0.584 (0.041)   0.626 (0.044)
       0.75    0.78 (0.055)    0.717 (0.051)   0.79 (0.056)    0.685 (0.048)   0.773 (0.054)
500    0.25    0.249 (0.011)   0.34 (0.015)    0.528 (0.024)   0.33 (0.014)    0.491 (0.022)
       0.5     0.52 (0.023)    0.646 (0.028)   0.720 (0.032)   0.623 (0.027)   0.704 (0.031)
       0.75    0.72 (0.032)    0.806 (0.036)   0.860 (0.038)   0.785 (0.035)   0.819 (0.035)
1000   0.25    0.242 (0.005)   0.393 (0.012)   0.482 (0.015)   0.349 (0.011)   0.394 (0.013)
       0.5     0.51 (0.016)    0.61 (0.019)    0.66 (0.021)    0.61 (0.019)    0.63 (0.019)
       0.75    0.72 (0.023)    0.67 (0.021)    0.86 (0.027)    0.79 (0.025)    0.81 (0.026)
To further evaluate the performance of the new method in comparison to the three existing methods, consider the real-life data from the DNA microarray experiments reported in Golub et al. [
Histogram of the
The four estimates for the proportion
The estimates of the proportion of true null hypotheses with the microarray data of Golub et al. [
0.4150  0.4643  0.4701  0.4913 
The authors declare that there is no conflict of interest regarding the publication of this paper.