This paper studies the relationship between Kendall's tau and the Pearson correlation coefficient under the so-called bivariate homogeneous shock (BHS) model. We find that a Capéraà-Genest-type inequality may fail for the general BHS model. Computational simulations suggest that a Daniels-type inequality is likely to hold.
1. Introduction
Pearson's correlation coefficient r is the most widely used measure of association between two random variables. Spearman's ρ and Kendall's τ are two other very useful measures of association. As is well known, Spearman's ρ is the ordinary Pearson correlation coefficient applied to the ranks.
The relationship between ρ and τ has received considerable attention recently. For instance, Hutchinson and Lai [1] conjecture that $-1+\sqrt{1+3\tau}\le\rho\le\min\{3\tau/2,\,2\tau-\tau^{2}\}$ for stochastically increasing random variables. Hürlimann [2] has shown that the entire Hutchinson and Lai conjecture holds for bivariate extreme value distributions. Munroe et al. [3] show that the Hutchinson and Lai bound $1+3\tau\le(1+\rho)^{2}$ can fail when X and Y are stochastically increasing. Fredricks and Nelsen [4] show that, under mild regularity conditions, the limit of the ratio ρ/τ is 3/2 as the joint distribution of the random variables approaches independence. Capéraà and Genest [5] have shown that ρ≥τ≥0 when one variable is simultaneously left-tail decreasing and right-tail increasing in the other. Genest and Nešlehová [6] give a short analytical proof of the classical Daniels inequality |3τ-2ρ|≤1, found by Daniels [7].
However, relatively little attention has been paid to the relationship between r and τ. One example in this direction is Edwardes [8], who showed that r=τ under the bivariate exponential (BVE) model introduced by Marshall and Olkin [9].
Based on a two-component series system subject to fatal shocks, Wang and Li [10] introduced a more general bivariate model, referred to as the bivariate homogeneous shock (BHS) model. In this paper, we study the relationship between r and τ under the BHS model. We find that most of the known relationships between ρ and τ no longer hold between r and τ. However, computational simulations suggest that the Daniels-type inequality, |3τ-2r|≤1, is likely to be true.
2. Main Results
For any pair of random variables (X,Y), let F(x,y) be its bivariate cumulative distribution function. The classical Pearson correlation coefficient r of (X,Y) is defined as follows:
$$r=\frac{\operatorname{Cov}(X,Y)}{\{\operatorname{Var}(X)\operatorname{Var}(Y)\}^{1/2}}, \tag{2.1}$$
and Kendall's τ is defined as follows:
$$\tau=4\iint F(x,y)\,dF(x,y)-1. \tag{2.2}$$
Consider a two-component series system subject to fatal shocks. Assume there are three kinds of fatal shocks: shock A, governed by the random variable U, destroys component 1; shock B, governed by the random variable V, destroys component 2; and shock C, governed by the random variable W, destroys both components simultaneously. We refer to such a system as a bivariate homogeneous shock (BHS) model. Clearly, under this model the lifetime of component 1 is X=min(U,W) and that of component 2 is Y=min(V,W). In particular, if the random variables U, V, and W are all exponential, the BHS model is just the BVE model proposed by Marshall and Olkin [9].
A prominent feature of the BHS model is its singularity. More specifically, even though U, V, and W are all continuous random variables, the joint distribution of X and Y is usually not absolutely continuous, because X=Y whenever shock C arrives first.
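The construction X=min(U,W), Y=min(V,W) and the resulting singularity can be illustrated with a minimal sketch. It assumes independent shocks, all uniform on [0,1]; the function name `sample_bhs`, the seed, and the sample size are hypothetical choices made only for this illustration.

```python
import random

def sample_bhs(draw_u, draw_v, draw_w, n, rng):
    """Draw n pairs (X, Y) with X = min(U, W) and Y = min(V, W)."""
    pairs = []
    for _ in range(n):
        u, v, w = draw_u(rng), draw_v(rng), draw_w(rng)
        pairs.append((min(u, w), min(v, w)))
    return pairs

rng = random.Random(0)  # fixed seed, chosen only for reproducibility

def uniform(r):
    # Assumption for this sketch: every shock time is uniform on [0, 1].
    return r.random()

pairs = sample_bhs(uniform, uniform, uniform, 20000, rng)

# Exact ties X == Y occur when shock C (variable W) arrives before A and B,
# so a positive share of the probability mass sits on the line x = y.
tie_fraction = sum(x == y for x, y in pairs) / len(pairs)
print(f"fraction of pairs with X == Y: {tie_fraction:.3f}")
```

With three independent, identically distributed continuous shocks, W is the smallest of the three with probability 1/3, so the printed fraction should be close to that value.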
Denote the survival functions of U, V, and W by u(x)=pr(U>x), v(x)=pr(V>x), and w(x)=pr(W>x), respectively. Wang and Li [10] have shown that under the BHS model,
$$\tau=-\int u^{2}v^{2}\,dw^{2}, \tag{2.3}$$
$$r=\frac{\iint u(x)v(y)w(x\odot y)\,dx\,dy}{\left\{\iint (uw)(x\odot y)\,dx\,dy\right\}^{1/2}\left\{\iint (vw)(x\odot y)\,dx\,dy\right\}^{1/2}}, \tag{2.4}$$
where, for any function f, $f(x\odot y)$ is defined as $f(x\odot y)=f(x\vee y)-f(x)f(y)$, with $\vee$ denoting the maximum.
In the same manner as Chen et al. [11], we define a proportional hazards model as a submodel of BHS. We say a BHS model is a proportional hazards model if there exist constants λ, α, and β such that
$$u(x)=u_0^{\lambda}(x),\qquad v(x)=u_0^{\alpha}(x),\qquad w(x)=u_0^{\beta}(x), \tag{2.5}$$
for some baseline survival function $u_0(x)$. Clearly, the BVE model is a proportional hazards model with $u_0(x)=e^{-x}$, x≥0.
We refer to the constants λ, α, and β as shape constants, since they are determined by the structural representation of the system. If an association measure depends only on the structural representation of the system, we say it is fully structure determined.
Under the proportional hazards model, by (2.3), we can easily obtain
$$\tau=\frac{\beta}{\lambda+\alpha+\beta}. \tag{2.6}$$
Thus, Kendall's τ is fully structure determined. The correlation coefficient r, as we can verify, is not fully structure determined.
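The step from (2.3) to (2.6) can be written out as follows. This is a sketch that assumes β>0 and a continuous baseline survival function decreasing from 1 to 0; the substitution variable s, equal to the baseline survival function evaluated at x, is introduced here only for illustration.

```latex
\begin{aligned}
\tau &= -\int u^{2}v^{2}\,dw^{2}
      = -\int u_0^{2(\lambda+\alpha)}(x)\,d\,u_0^{2\beta}(x) \\
     &= \int_{0}^{1} s^{2(\lambda+\alpha)}\cdot 2\beta\,s^{2\beta-1}\,ds
      = \frac{2\beta}{2(\lambda+\alpha+\beta)}
      = \frac{\beta}{\lambda+\alpha+\beta}.
\end{aligned}
```

The minus sign is absorbed when the limits are reversed, since s runs from 1 down to 0 as x increases. The baseline function drops out entirely, which is why τ depends only on the shape constants.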
The exact expression of r is not easy to obtain in general, except in two special cases: $u_0(x)=e^{-x}$, x≥0, and $u_0(x)=1-x$, 0≤x≤1, that is, when the baseline distribution is exponential or uniform. For convenience, we refer to the first model as model A and to the second as model B. Clearly, model A is just the BVE model. We now state the main results in the following theorem.
Theorem 2.1.
Let r and τ be Pearson's correlation coefficient and Kendall's tau. Then under model A or model B, the following hold.
(i) r≥τ.
(ii) As the model approaches independence, the limit of the ratio r/τ can be any number larger than 1.
(iii) |2r-3τ|≤1.
Proof.
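The results of Theorem 2.1 for model A are just those of Edwardes [8], so we only need to consider model B. Denote $g(\theta)=\iint u_0^{\theta}(x\odot y)\,dx\,dy$ and $f(\lambda,\alpha,\beta)=\iint u_0^{\lambda}(x)u_0^{\alpha}(y)u_0^{\beta}(x\odot y)\,dx\,dy$. When $u_0(x)=1-x$, 0≤x≤1, we have
$$g(\theta)=\iint u_0^{\theta}(x\odot y)\,dx\,dy=2\int_0^1(1-x)^{\theta}\,dx\int_0^x\{1-(1-y)^{\theta}\}\,dy=\frac{\theta}{(\theta+1)^{2}(\theta+2)}, \tag{2.7}$$
$$\begin{aligned}
f(\lambda,\alpha,\beta)&=\iint u_0^{\lambda}(x)u_0^{\alpha}(y)u_0^{\beta}(x\odot y)\,dx\,dy\\
&=\iint_{x\ge y}u_0^{\lambda}(x)u_0^{\alpha}(y)u_0^{\beta}(x)\{1-u_0^{\beta}(y)\}\,dx\,dy+\iint_{x<y}u_0^{\lambda}(x)u_0^{\alpha}(y)u_0^{\beta}(y)\{1-u_0^{\beta}(x)\}\,dx\,dy\\
&=\frac{1}{\alpha+1}\cdot\frac{1}{\lambda+\beta+1}-\frac{1}{\alpha+1}\cdot\frac{1}{\lambda+\alpha+\beta+2}-\frac{1}{\lambda+\beta+1}\cdot\frac{1}{\alpha+\beta+1}+\frac{1}{\alpha+\beta+1}\cdot\frac{1}{\lambda+\alpha+2\beta+2}\\
&\quad+\frac{1}{\lambda+1}\cdot\frac{1}{\alpha+\beta+1}-\frac{1}{\lambda+1}\cdot\frac{1}{\lambda+\alpha+\beta+2}-\frac{1}{\lambda+\beta+1}\cdot\frac{1}{\alpha+\beta+1}+\frac{1}{\lambda+\beta+1}\cdot\frac{1}{\lambda+\alpha+2\beta+2}\\
&=\frac{\beta}{(\alpha+\beta+1)(\lambda+\beta+1)(\lambda+\alpha+\beta+2)}.
\end{aligned} \tag{2.8}$$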
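Thus, since $r=f(\lambda,\alpha,\beta)/\{g(\lambda+\beta)\,g(\alpha+\beta)\}^{1/2}$ by (2.4), we obtain
$$r=\frac{\beta}{\lambda+\alpha+\beta+2}\sqrt{\frac{(\lambda+\beta+2)(\alpha+\beta+2)}{(\lambda+\beta)(\alpha+\beta)}}. \tag{2.9}$$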
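As a quick numerical cross-check of (2.9), the following sketch evaluates r both from the integrals (2.7) and (2.8) and from the closed form, for the uniform baseline of model B. The function names are hypothetical; the test point λ=α=β=1 corresponds to U, V, and W all uniform on [0,1], for which the paper reports r=0.4 and τ=1/3.

```python
import math

def tau_ph(lam, alpha, beta):
    """Kendall's tau under the proportional hazards model, formula (2.6)."""
    return beta / (lam + alpha + beta)

def g(theta):
    """Variance-type integral (2.7) for the uniform baseline."""
    return theta / ((theta + 1) ** 2 * (theta + 2))

def f(lam, alpha, beta):
    """Covariance-type integral (2.8) for the uniform baseline."""
    return beta / ((alpha + beta + 1) * (lam + beta + 1) * (lam + alpha + beta + 2))

def r_from_integrals(lam, alpha, beta):
    """Pearson's r as the ratio f / sqrt(g * g), following (2.4)."""
    return f(lam, alpha, beta) / math.sqrt(g(lam + beta) * g(alpha + beta))

def r_closed_form(lam, alpha, beta):
    """Pearson's r from the closed form (2.9), square root included."""
    u, v = alpha + beta, lam + beta
    return beta / (lam + alpha + beta + 2) * math.sqrt((u + 2) * (v + 2) / (u * v))

lam, alpha, beta = 1.0, 1.0, 1.0  # all three shocks uniform on [0, 1]
r_a = r_from_integrals(lam, alpha, beta)
r_b = r_closed_form(lam, alpha, beta)
tau = tau_ph(lam, alpha, beta)
print(r_a, r_b, tau)
```

The two routes to r agree only when the square root in (2.9) is kept, which is a convenient way to confirm the form of the formula.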
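We want to show that Γ=r/τ>1. Denote α+β=u, λ+β=v, and λ+α+β=w. Then,
$$\Gamma=\frac{r}{\tau}=\frac{w}{w+2}\sqrt{\frac{u+2}{u}\cdot\frac{v+2}{v}}=\frac{1}{1+2/w}\sqrt{\left(1+\frac{2}{u}\right)\left(1+\frac{2}{v}\right)}. \tag{2.11}$$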
Clearly, max{u,v}≤w≤u+v. With a slight abuse of notation, we relabel 2/u, 2/v, and 2/w as u, v, and w, respectively. Then w≤min{u,v}. Without loss of generality, we assume u≥v, and then
$$\Gamma=\frac{1}{1+w}\sqrt{(1+u)(1+v)}\ge\frac{1}{1+v}\sqrt{(1+v)(1+v)}=1. \tag{2.12}$$
As we can see, equality holds only when u=v=w, that is, when α=λ=0 (in which case X=Y=W). Hence, whenever α and λ are not both zero, Γ>1, and thus r>τ.
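Consider
$$\Phi=\lim_{\beta\to0^{+}}\frac{r}{\tau}=\frac{\lambda+\alpha}{\lambda+\alpha+2}\sqrt{\left(1+\frac{2}{\lambda}\right)\left(1+\frac{2}{\alpha}\right)}. \tag{2.13}$$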
In a similar way, we can show that Φ>1. In fact, Φ can be any number larger than 1. Relabel 2/λ and 2/α as λ and α, respectively, and let α=kλ; then
$$\Phi=\frac{\sqrt{(1+\lambda)(1+k\lambda)}}{1+\{k/(1+k)\}\lambda}. \tag{2.14}$$
As k→0, $\Phi\to\sqrt{1+\lambda}$, which can be any number larger than 1.
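The claim that Φ can take any value above 1 can be probed numerically. The sketch below works in the original (unrelabeled) parameters of (2.13): holding λ fixed and letting α grow, Φ approaches the square root of 1+2/λ, so a target value c>1 is reached by choosing λ=2/(c²-1). The helper `lam_for_target` and the large value standing in for α→∞ are assumptions made for this illustration.

```python
import math

def phi(lam, alpha):
    """Limit of r/tau as beta -> 0+ under model B, formula (2.13)."""
    return (lam + alpha) / (lam + alpha + 2) * math.sqrt((1 + 2 / lam) * (1 + 2 / alpha))

def lam_for_target(c):
    """Hypothetical helper: the lam for which sqrt(1 + 2/lam) equals c > 1."""
    return 2.0 / (c * c - 1.0)

big_alpha = 1e9  # stands in for alpha -> infinity
targets = (1.1, 2.0, 10.0)
results = {c: phi(lam_for_target(c), big_alpha) for c in targets}

for c, value in results.items():
    print(f"target {c}: phi = {value:.6f}")
```

Each computed value should sit very close to its target, which illustrates part (ii) of the theorem for model B.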
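Denote Ψ=2r-3τ; then
$$\Psi=\frac{2\beta}{\lambda+\alpha+\beta+2}\sqrt{\frac{(\lambda+\beta+2)(\alpha+\beta+2)}{(\lambda+\beta)(\alpha+\beta)}}-\frac{3\beta}{\lambda+\alpha+\beta}. \tag{2.15}$$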
Since Ψ is symmetric about α and λ, the minimum or maximum of Ψ will be attained on α=λ. So we just need to show that the minimum or maximum of Ψ will be between -1 and 1.
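When α=λ, Ψ becomes
$$\Psi=\frac{2\beta}{2\alpha+\beta+2}\cdot\frac{\alpha+\beta+2}{\alpha+\beta}-\frac{3\beta}{2\alpha+\beta}=\frac{2\beta(\alpha+\beta+2)(2\alpha+\beta)-3\beta(2\alpha+\beta+2)(\alpha+\beta)}{(\alpha+\beta)(2\alpha+\beta)(2\alpha+\beta+2)}=\frac{N}{D}, \tag{2.16}$$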
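where
$$N=-\left[\beta^{3}+\beta^{2}(3\alpha+2)+\beta(2\alpha^{2}-2\alpha)\right],\qquad D=\beta^{3}+\beta^{2}(5\alpha+2)+\beta(8\alpha^{2}+6\alpha)+4\alpha^{3}+4\alpha^{2}. \tag{2.17}$$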
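We have
$$\begin{aligned}
D+N&=2\alpha\beta^{2}+(6\alpha^{2}+8\alpha)\beta+4\alpha^{2}(\alpha+1)>0,\\
D-N&=2\beta^{3}+(8\alpha+4)\beta^{2}+(10\alpha^{2}+4\alpha)\beta+4\alpha^{2}(\alpha+1)>0.
\end{aligned} \tag{2.18}$$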
Hence, Ψ=N/D will be between -1 and 1.
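A numerical sanity check of this bound is sketched below. It confirms that N/D from (2.17) matches the direct expression for Ψ on the diagonal α=λ, and it also scans Ψ over a grid of all three shape constants rather than the diagonal only. The logarithmic grid and the function names are arbitrary choices for this illustration, and a grid scan is of course not a proof.

```python
import math

def psi_model_b(lam, alpha, beta):
    """Psi = 2r - 3*tau under model B, using (2.6) and (2.9)."""
    tau = beta / (lam + alpha + beta)
    r = beta / (lam + alpha + beta + 2) * math.sqrt(
        (lam + beta + 2) * (alpha + beta + 2) / ((lam + beta) * (alpha + beta)))
    return 2 * r - 3 * tau

def n_and_d(alpha, beta):
    """N and D from (2.17); only meaningful on the diagonal alpha == lam."""
    n = -(beta**3 + beta**2 * (3 * alpha + 2) + beta * (2 * alpha**2 - 2 * alpha))
    d = (beta**3 + beta**2 * (5 * alpha + 2) + beta * (8 * alpha**2 + 6 * alpha)
         + 4 * alpha**3 + 4 * alpha**2)
    return n, d

grid = [10.0 ** e for e in range(-3, 4)]  # 0.001, 0.01, ..., 1000

# Largest discrepancy between N/D and the direct formula on the diagonal.
max_gap = 0.0
for a in grid:
    for b in grid:
        n, d = n_and_d(a, b)
        max_gap = max(max_gap, abs(psi_model_b(a, a, b) - n / d))

# Range of Psi over the full three-parameter grid.
values = [psi_model_b(l, a, b) for l in grid for a in grid for b in grid]
psi_min, psi_max = min(values), max(values)
print(max_gap, psi_min, psi_max)
```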
3. Computational Simulations
In order to investigate the relationship between r and τ under the general BHS model, we conduct some computational simulations. For sample data $\{(x_i,y_i),\ i=1,2,\ldots,n\}$, the sample value of r is computed as
$$\hat r=\frac{\sum(x_i-\bar x)(y_i-\bar y)}{\sqrt{\sum(x_i-\bar x)^{2}\cdot\sum(y_i-\bar y)^{2}}}, \tag{3.1}$$
while the sample value of Kendall's τ is computed as
$$\hat\tau=\binom{n}{2}^{-1}\sum_{1\le i<j\le n}a_{ij}b_{ij}, \tag{3.2}$$
where $a_{ij}=1$ if $x_i<x_j$ and $-1$ if $x_i>x_j$, and $b_{ij}=1$ if $y_i<y_j$ and $-1$ if $y_i>y_j$. We compute the sample values of r, τ, r/τ, and 2r-3τ under several cases. The sample size is n=100, and each computation uses 100 iterations. Table 1 gives the results.
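Table 1

|        | r      | τ      | r/τ    | 2r-3τ   |
|--------|--------|--------|--------|---------|
| Case 1 | 0.3984 | 0.3312 | 1.2032 | −0.1966 |
| Case 2 | 0.1774 | 0.1788 | 0.9919 | −0.1817 |
| Case 3 | 0.5801 | 0.4271 | 1.3581 | −0.1212 |
| Case 4 | 0.6208 | 0.3924 | 1.5821 | 0.0644  |
| Case 5 | 0.1100 | 0.1788 | 0.6151 | −0.3165 |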
In case 1, we set U, V, and W as uniform variables on [0,1]. By (2.6) and (2.9), we obtain r=0.4 and τ=1/3, so the ratio is r/τ=1.2. In case 2, the variables U, V, and W follow exponential distributions with means 1, 2, and 3, respectively. In this case, r=τ=(1/3)/(1+1/2+1/3)=2/11≈0.1818, so the ratio is exactly 1. The computed values for these two cases are close to the exact ones, which shows that the numerical computations are quite precise. In case 3, all variables follow the standard normal distribution. In case 4, we set $U=Z^{2}$, $V=(Z+0.2)^{2}$, and $W=(Z+1.0)^{2}$, where Z is a standard normal variable. In case 5, we set $U=E_1^{2}$, $V=E_2^{2}$, and $W=E_3^{2}$, where $E_1\sim\exp(1)$, $E_2\sim\exp(2)$, and $E_3\sim\exp(3)$. Surprisingly, in this case the ratio r/τ is less than 1. In all the cases, the Daniels-type inequality |2r-3τ|≤1 holds.
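The simulation for cases 1 and 2 can be sketched as follows, using the estimators (3.1) and (3.2) with n=100 and 100 iterations. Only these two cases are reproduced because their exact values of r and τ are known. The seed, the function names, and the use of Python's standard `random` module are assumptions of this sketch and not the authors' actual code; independent shocks are also assumed.

```python
import math
import random

def pearson_r(xs, ys):
    """Sample correlation coefficient, formula (3.1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def kendall_tau(xs, ys):
    """Sample Kendall's tau, formula (3.2); tied pairs contribute 0."""
    n = len(xs)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = xs[j] - xs[i], ys[j] - ys[i]
            total += ((dx > 0) - (dx < 0)) * ((dy > 0) - (dy < 0))
    return total / (n * (n - 1) / 2)

def simulate(draw_shocks, n=100, reps=100, seed=0):
    """Average r-hat and tau-hat over `reps` samples of size n."""
    rng = random.Random(seed)
    r_sum = t_sum = 0.0
    for _ in range(reps):
        xs, ys = [], []
        for _ in range(n):
            u, v, w = draw_shocks(rng)
            xs.append(min(u, w))  # lifetime of component 1
            ys.append(min(v, w))  # lifetime of component 2
        r_sum += pearson_r(xs, ys)
        t_sum += kendall_tau(xs, ys)
    return r_sum / reps, t_sum / reps

def case1(rng):
    # U, V, W uniform on [0, 1]
    return rng.random(), rng.random(), rng.random()

def case2(rng):
    # Exponential shocks with means 1, 2, 3; expovariate takes the rate 1/mean.
    return rng.expovariate(1.0), rng.expovariate(1 / 2), rng.expovariate(1 / 3)

r1, t1 = simulate(case1)
r2, t2 = simulate(case2)
for name, r, t in (("Case 1", r1, t1), ("Case 2", r2, t2)):
    print(f"{name}: r={r:.4f} tau={t:.4f} r/tau={r / t:.4f} 2r-3tau={2 * r - 3 * t:.4f}")
```

The printed averages depend on the seed and will not match Table 1 digit for digit, but they should land near the exact values r=0.4, τ=1/3 for case 1 and r=τ=2/11 for case 2.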
4. Concluding Remarks
The relationship between Spearman's ρ, which is the ordinary Pearson correlation coefficient applied to the ranks, and Kendall's τ has been of interest for a long time. However, little attention has been paid to the relationship between the Pearson correlation coefficient and Kendall's τ. In this paper, we investigate their relationship under the so-called BHS model. We find that although the Capéraà-Genest-type inequality, r≥τ≥0, holds for some typical BHS models, it may fail for the general BHS model. Our simulation studies suggest that the Daniels-type inequality, -1≤2r-3τ≤1, holds under the BHS model. We thus conjecture that the Daniels-type inequality is valid in general. Theoretical confirmation of this conjecture merits further study.
References
[1] T. P. Hutchinson and C. D. Lai, 1990, Rumsby Scientific Publishing, Adelaide, Australia, xxxii+412 pp. MR1070715.
[2] W. Hürlimann, "Hutchinson-Lai's conjecture for bivariate extreme value copulas," 2003, 61(2), 191-198. doi:10.1016/S0167-7152(02)00349-8. MR1950669, Zbl 1101.62340.
[3] P. Munroe, T. Ransford, and C. Genest, "Un contre-exemple à une conjecture de Hutchinson et Lai," 2010, 348(5-6), 305-310. doi:10.1016/j.crma.2010.01.026. MR2600129, Zbl 1194.60022.
[4] G. A. Fredricks and R. B. Nelsen, "On the relationship between Spearman's rho and Kendall's tau for pairs of continuous random variables," 2007, 137(7), 2143-2150. doi:10.1016/j.jspi.2006.06.045. MR2325421, Zbl 1120.62045.
[5] P. Capéraà and C. Genest, "Spearman's ρ is larger than Kendall's τ for positively dependent random variables," 1993, 2(2), 183-194. doi:10.1080/10485259308832551. MR1256381.
[6] C. Genest and J. Nešlehová, "Analytical proofs of classical inequalities between Spearman's ρ and Kendall's τ," 2009, 139(11), 3795-3798. doi:10.1016/j.jspi.2009.05.017. MR2553765.
[7] H. E. Daniels, "Rank correlation and population models," 1950, 12, 171-181. MR0040629, Zbl 0040.22302.
[8] M. D. Edwardes, "Kendall's τ is equal to the correlation coefficient for the BVE distribution," 1993, 17(5), 415-419. doi:10.1016/0167-7152(93)90264-J. MR1237790.
[9] A. W. Marshall and I. Olkin, "A multivariate exponential distribution," 1967, 62, 30-44. MR0215400, Zbl 0147.38106.
[10] J. Wang and Y. Li, "Dependency measures under bivariate homogeneous shock models," 2005, 39(1), 73-80. doi:10.1080/02331880412331329863. MR2125229, Zbl 1075.62049.
[11] Y. Y. Chen, M. Hollander, and N. A. Langberg, "Small-sample results for the Kaplan-Meier estimator," 1982, 77(377), 141-144. MR648036, Zbl 0504.62033.