Testing Normal Means: The Reconcilability of the P Value and the Bayesian Evidence

The problem of reconciling the frequentist and Bayesian evidence in testing statistical hypotheses has been extensively studied in the literature. Most of the existing work considers cases without nuisance parameters, which is restrictive because nuisance parameters are very common in practice. In this paper, we consider the reconcilability of the Bayesian evidence against the null hypothesis $H_0$, in terms of the posterior probability of $H_0$ being true, and the frequentist evidence against $H_0$, in terms of the P value, in testing normal means where nuisance parameters are present. The reconcilability of evidence is obtained both for testing a normal mean and for the Behrens-Fisher problem.


Introduction
In the problem of testing a statistical hypothesis $H_0$, a frequentist may give evidence against $H_0$ by the observed significance level, the P value, while a Bayesian may give it by the posterior probability that $H_0$ is true. Lindley [1] illustrated the possible discrepancy between the Bayesian and the frequentist evidence. The relationship between these two measures of evidence has since been extensively studied in the literature. Pratt [2] revealed that P values are usually approximately equal to the posterior probabilities in one-sided testing problems. Casella and Berger [3] considered testing the one-sided hypothesis for a location parameter and showed that the lower bounds of the posterior probability over some reasonable classes of priors are exactly equal to the corresponding P values in many cases. Other important papers that deal with the reconcilability of the Bayesian and frequentist evidence are Bartlett [4], Cox [5], Shafer [6], Berger and Delampady [7], and Berger and Sellke [8].
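The simplest instance of this agreement can be checked numerically. The sketch below assumes a normal mean with known standard deviation, which is a textbook case and not the setting of this paper; the function names and the example numbers are hypothetical. Under a flat prior the posterior probability of $H_0: \theta \le \theta_0$ equals the one-sided P value, and under a conjugate $N(\theta_0, \tau^2)$ prior it decreases toward the P value as $\tau$ grows.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def p_value_one_sided(xbar, theta0, sigma, n):
    """P value for H0: theta <= theta0 vs H1: theta > theta0 (sigma known)."""
    z = (xbar - theta0) * n ** 0.5 / sigma
    return 1.0 - STD_NORMAL.cdf(z)


def posterior_prob_h0(xbar, theta0, sigma, n, tau=None):
    """P(theta <= theta0 | data).

    tau=None: flat (improper) prior, so theta | data ~ N(xbar, sigma^2 / n).
    tau > 0:  conjugate N(theta0, tau^2) prior centered at the boundary theta0
              (an assumption made for this illustration).
    """
    data_precision = n / sigma ** 2
    if tau is None:
        post_mean, post_sd = xbar, data_precision ** -0.5
    else:
        prior_precision = 1.0 / tau ** 2
        precision = data_precision + prior_precision
        post_mean = (data_precision * xbar + prior_precision * theta0) / precision
        post_sd = precision ** -0.5
    return NormalDist(post_mean, post_sd).cdf(theta0)


if __name__ == "__main__":
    xbar, theta0, sigma, n = 0.5, 0.0, 1.0, 16  # hypothetical data summary
    print("P value:", p_value_one_sided(xbar, theta0, sigma, n))
    print("flat prior:", posterior_prob_h0(xbar, theta0, sigma, n))
    for tau in (0.1, 1.0, 10.0):
        print("tau =", tau, posterior_prob_h0(xbar, theta0, sigma, n, tau))
```

When the sample mean lies above $\theta_0$, every prior in this normal family gives a posterior probability at least as large as the P value, so the P value acts as the lower bound over the family.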
Although much research has been carried out on reconciling the Bayesian and frequentist evidence, and some of it shows that the evidence is reconcilable in several specific situations, most of the existing work assumes that no unknown parameters are present other than the parameters of interest. In practice, nuisance parameters arise in many situations. In location-scale settings, for example, when the location parameter is unknown, the scale parameter is generally unknown as well.
However, in significance testing of hypotheses with nuisance parameters, the classical P values are typically not available. Tsui and Weerahandi [9], considering the one-sided hypothesis of the form $H_0: \theta \le \theta_0$ versus $H_1: \theta > \theta_0$, where $\theta$ is the parameter of interest and $\theta_0$ is a fixed constant, introduced the concept of the generalized P value, which appears to be useful in situations where conventional frequentist approaches do not provide useful solutions. Tsui and Weerahandi [9] and some later related works formulated the generalized P values for many specific examples. Hannig et al. [10] provided a general method for constructing the generalized P value via fiducial inference.
In this paper, for the one-sided testing situations about normal means where nuisance parameters are present, we study the reconcilability of the Bayesian evidence and [...]
The Scientific World Journal
[...], from which we know that [...]. Consequently, the posterior probability of $H_0$ being true is [...], where $T_\nu$ is a $t$-variable with $\nu$ degrees of freedom. Notice [...]
Proof. Suppose that [...] is a nonpositive random variable obtained by the negative part of [...]; that is, the density of [...] is [...]. Then the density of [...] is [...]. By Theorem 3.3.2 in Lehmann [11], for any fixed nonpositive constant $c$, we have that $P([...] \le c)$ is nonincreasing in $\nu$, since it can be verified that the family of densities [...] has monotone likelihood ratio in [...]. This implies that Lemma 1 holds for the case when $c \le 0$, since $P(T_\nu \le c) = P([...] \le c)/2$. Since, when $c > 0$, we have $P(T_\nu \le c) = 1/2 + P(0 \le T_\nu \le c)$, the proof for the latter case is completely analogous if we introduce a nonnegative random variable obtained by the positive part of [...].
Now take $c$ = [...]. By Lemma 1, for $c \le 0$, we have [...]. Then comparing (3) and (10), for [...] and any fixed nonnegative $\nu_0$, we have [...], which implies that [...]. The reconcilability of the Bayesian and frequentist evidence is therefore obtained in this testing problem. We summarize this as the following theorem.

Theorem 2.
For testing the hypothesis of the form (2) under a normal distribution $N(\mu, \sigma^2)$ with $\sigma^2$ unknown, the Bayesian and frequentist lines of evidence are reconcilable under the conjugate class of priors (4).
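Two ingredients of this result can be checked numerically. The sketch below first checks that, for fixed $c \le 0$, $P(T_\nu \le c)$ is nonincreasing in $\nu$. It then compares the one-sided $t$-test P value with a Monte Carlo posterior probability of $H_0: \mu \le \mu_0$. The second part assumes the noninformative prior $\pi(\mu, \sigma^2) \propto 1/\sigma^2$, a standard limiting case chosen for illustration and not the conjugate class (4) itself. All function names and the sample are hypothetical.

```python
import math
import random
from statistics import mean, variance


def t_pdf(t, nu):
    """Density of Student's t distribution with nu degrees of freedom."""
    log_const = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(nu * math.pi))
    return math.exp(log_const - (nu + 1) / 2 * math.log1p(t * t / nu))


def t_cdf(c, nu, steps=2000):
    """P(T_nu <= c) via Simpson's rule on [0, |c|] plus symmetry (steps even)."""
    a = abs(c)
    if a == 0.0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, nu) + t_pdf(a, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    area = total * h / 3
    return 0.5 + area if c > 0 else 0.5 - area


def t_test_p_value(sample, mu0):
    """One-sided P value P(T_{n-1} >= sqrt(n)(xbar - mu0)/s) for H0: mu <= mu0."""
    n = len(sample)
    t_stat = math.sqrt(n) * (mean(sample) - mu0) / math.sqrt(variance(sample))
    return 1.0 - t_cdf(t_stat, n - 1)


def posterior_prob_h0_mc(sample, mu0, draws=200_000, seed=1):
    """Monte Carlo P(mu <= mu0 | data) under the assumed prior 1/sigma^2."""
    rng = random.Random(seed)
    n, xbar, s_sq = len(sample), mean(sample), variance(sample)
    hits = 0
    for _ in range(draws):
        # sigma^2 | data is (n-1) s^2 / chi2_{n-1}; chi2_k is Gamma(k/2, scale=2)
        sigma_sq = (n - 1) * s_sq / rng.gammavariate((n - 1) / 2, 2.0)
        # mu | sigma^2, data is N(xbar, sigma^2 / n)
        mu = rng.gauss(xbar, math.sqrt(sigma_sq / n))
        hits += mu <= mu0
    return hits / draws


if __name__ == "__main__":
    print([round(t_cdf(-1.5, nu), 4) for nu in (1, 2, 5, 10, 30)])
    sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 4.7, 5.9]  # hypothetical data
    print(t_test_p_value(sample, 5.0), posterior_prob_h0_mc(sample, 5.0))
```

The two printed probabilities should agree up to Monte Carlo error, since under this prior the posterior of $\sqrt{n}(\mu - \bar{x})/s$ is a $t$ distribution with $n - 1$ degrees of freedom.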

Behrens-Fisher Problem
Now we turn to the Behrens-Fisher problem, a classical testing situation in which nuisance parameters are present and no useful pivotal quantities are available. Suppose that $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_m$ are two independent random samples from two normal populations $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$, respectively, where both $\sigma_1^2$ and $\sigma_2^2$ are completely unspecified. We are interested in testing the hypothesis of the form [...], where [...] is a fixed constant. In situations where the traditional frequentist approaches fail to provide useful solutions, the concept of the generalized P value introduced by Tsui and Weerahandi [9] is helpful in deriving the frequentist evidence for testing a statistical hypothesis. For this specific problem of testing hypothesis (16), we can give the generalized P value as [...], where [...].
In this problem, we consider the reconcilability of evidence under the following conjugate class of prior distributions: [...] (18), where [...] (20). Then the posterior density of [...] is [...], so that the posterior probability of $H_0$ is [...]. It is straightforward to check that $\lim$ [...], where $\bar{x}$ and $\bar{y}$ are the observed values of the sample means $\bar{X}$ and $\bar{Y}$, respectively, $s_1^2$ and $s_2^2$ are those of the sample variances $S_1^2$ and $S_2^2$, respectively, [...] $\sim N(0, 1)$, [...] $\sim N(0, 1)$, $\chi^2_{n+\nu_{01}+1} \sim \chi^2(n + \nu_{01} + 1)$, and $\chi^2_{m+\nu_{02}+1} \sim \chi^2(m + \nu_{02} + 1)$.
Now we prove an interesting result that, when $n$ and $m$ are sufficiently large, the frequentist and Bayesian lines of evidence given respectively by (17) and (23) [...]
Proof. Let [...].
(I) We first prove that, given [...], as [...] is sufficiently large, we have [...]. In fact, for any $\varepsilon > 0$, as [...] is sufficiently large, [...], where [...] $\to 0$ as [...] $\to \infty$. On the other hand, we have [...]. Since $\chi^2_{m+\nu_{02}+1} > \chi^2_{m-1}$ holds for any [...], it follows that, as [...] $\to \infty$, [...]. That is, [...]. Similarly, as [...] $\to +\infty$, we have [...]. Therefore, we have [...]. Consequently, we have [...].
(II) We now show that the final conclusion holds.
In fact, if we let [...], then [...], where $\Phi(\cdot)$ stands for the cumulative distribution function of the standard normal distribution and the last equation is due to the fact that [...] and [...] are independent normal random variables. Similarly, for (24), we have $\lim$ [...]. Note that for each [...] in $(-\infty, 0)$, $\Phi([...]/[...])$ is increasing in [...] $\in (0, \infty)$. Therefore, by (35), we have [...]. Consequently, we have [...], as [...] $< 0$.
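The large-sample behavior of the frequentist evidence can be illustrated by simulation. The sketch below estimates a generalized P value for $H_0: \mu_1 - \mu_2 \le \delta$ by Monte Carlo, using the standard generalized-pivot construction in which $\sigma_i^2$ is replaced by $(n_i - 1)s_i^2/\chi^2_{n_i-1}$. This is an assumed form and is not claimed to match expression (17) term by term. The function names, the symbol `delta`, and the numbers are hypothetical. For large samples the estimate should be close to the normal approximation $\Phi(-(\bar{x} - \bar{y} - \delta)/\sqrt{s_1^2/n + s_2^2/m})$.

```python
import math
import random
from statistics import NormalDist


def generalized_p_value(xbar, ybar, s1_sq, s2_sq, n, m,
                        delta=0.0, draws=100_000, seed=0):
    """Monte Carlo generalized P value for H0: mu1 - mu2 <= delta.

    s1_sq and s2_sq are the unbiased sample variances (assumed convention).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        # Generalized pivots for the variances; chi2_k is Gamma(k/2, scale=2).
        var1 = (n - 1) * s1_sq / rng.gammavariate((n - 1) / 2, 2.0)
        var2 = (m - 1) * s2_sq / rng.gammavariate((m - 1) / 2, 2.0)
        # Generalized pivot for mu1 - mu2, normal given the two variance draws.
        diff = rng.gauss(xbar - ybar, math.sqrt(var1 / n + var2 / m))
        hits += diff <= delta
    return hits / draws


def normal_approximation(xbar, ybar, s1_sq, s2_sq, n, m, delta=0.0):
    """Large-sample value Phi(-(xbar - ybar - delta) / sqrt(s1^2/n + s2^2/m))."""
    z = (xbar - ybar - delta) / math.sqrt(s1_sq / n + s2_sq / m)
    return NormalDist().cdf(-z)


if __name__ == "__main__":
    large = (0.2, 0.0, 1.0, 2.0, 400, 400)  # hypothetical summaries
    small = (1.5, 0.0, 1.0, 2.0, 5, 5)
    for args in (large, small):
        print(generalized_p_value(*args), normal_approximation(*args))
```

With five observations per group the generalized P value is visibly larger than the normal approximation, which reflects the extra uncertainty from the two unknown variances.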
The following theorem shows that, even for fixed $n$ and $m$ with $2 < n, m < \infty$, we still obtain the reconcilability of the frequentist and Bayesian evidence.
The following simulation results show that even for small and fixed values of $n$ and $m$ or of $\nu_{01}$ and $\nu_{02}$, the generalized P value and the Bayesian evidence for testing the Behrens-Fisher problem are still reconcilable.
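A simulation of this kind can be sketched as follows. This is an independent illustration and not the simulation reported in the paper. It assumes the textbook normal-inverse-chi-square conjugate prior with hyperparameters (`mu0`, `kappa0`, `nu0`, `sigma0_sq`), applied identically to both populations, which may be parametrized differently from the class (18). It also assumes the generalized-pivot form of the generalized P value. All names and numbers are hypothetical.

```python
import math
import random


def chi_square(rng, df):
    """Draw from a chi-square distribution with df degrees of freedom."""
    return rng.gammavariate(df / 2.0, 2.0)


def pivot_draw_mean(rng, xbar, s_sq, n):
    """One draw of the generalized pivot for a normal mean."""
    sigma_sq = (n - 1) * s_sq / chi_square(rng, n - 1)
    return rng.gauss(xbar, math.sqrt(sigma_sq / n))


def posterior_draw_mean(rng, xbar, s_sq, n, mu0, kappa0, nu0, sigma0_sq):
    """One posterior draw of mu under the assumed conjugate prior
    mu | sigma^2 ~ N(mu0, sigma^2 / kappa0), sigma^2 ~ Inv-chi2(nu0, sigma0_sq)."""
    kappa_n = kappa0 + n
    nu_n = nu0 + n
    mu_n = (kappa0 * mu0 + n * xbar) / kappa_n
    scale = (nu0 * sigma0_sq + (n - 1) * s_sq
             + kappa0 * n * (xbar - mu0) ** 2 / kappa_n)  # equals nu_n * sigma_n^2
    sigma_sq = scale / chi_square(rng, nu_n)
    return rng.gauss(mu_n, math.sqrt(sigma_sq / kappa_n))


def compare_evidence(sample1, sample2, prior, delta=0.0, draws=100_000, seed=0):
    """Return (generalized P value, posterior probability of H0: mu1 - mu2 <= delta).

    sample1, sample2: (sample mean, unbiased sample variance, sample size).
    prior: (mu0, kappa0, nu0, sigma0_sq), shared by both populations.
    """
    rng = random.Random(seed)
    gen_hits = post_hits = 0
    for _ in range(draws):
        gen_diff = pivot_draw_mean(rng, *sample1) - pivot_draw_mean(rng, *sample2)
        post_diff = (posterior_draw_mean(rng, *sample1, *prior)
                     - posterior_draw_mean(rng, *sample2, *prior))
        gen_hits += gen_diff <= delta
        post_hits += post_diff <= delta
    return gen_hits / draws, post_hits / draws


if __name__ == "__main__":
    weak_prior = (0.0, 0.01, 0.01, 1.0)  # small kappa0 and nu0
    print(compare_evidence((0.9, 1.0, 6), (0.0, 1.5, 8), weak_prior))
```

For small samples and a weak prior the two numbers are close but not identical, because the assumed prior changes the degrees of freedom and the scale of the posterior slightly relative to the pivot.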

Conclusions
In the presence of nuisance parameters, we study the reconcilability of the P value and the Bayesian evidence in the one-sided hypothesis testing problem about normal means. For the problem of testing a normal mean where the nuisance parameter is present, it is shown that the Bayesian and frequentist lines of evidence are reconcilable. For the Behrens-Fisher problem, it is shown that if the sample sizes $n$ and $m$ tend to infinity, then for fixed prior parameters $\nu_{01}$ and $\nu_{02}$, both lines of evidence are reconcilable. Furthermore, it is shown that if the prior parameters $\nu_{01}$ and $\nu_{02}$ tend to infinity, then for any fixed sample sizes $n$ and $m$, the lines of evidence are reconcilable. Simulation results show that even for small and fixed values of the sample sizes $n$ and $m$, or for small values of the prior parameters $\nu_{01}$ and $\nu_{02}$, the reconcilability of the Bayesian and frequentist evidence still holds.
This provides another illustration of a testing situation in which the Bayesian and frequentist evidence can be reconciled, and it may therefore to some extent prevent people from debasing or even dismissing P values as evidence in hypothesis testing problems. Furthermore, our results on reconcilability in one-sided testing situations suggest that it may be arbitrary to assert the irreconcilability of the evidence in two-sided (point or interval) hypothesis testing problems, and that we should perhaps be more concerned about the appropriateness of the methods we employ to tackle a two-sided hypothesis in both the frequentist and the Bayesian frameworks.