SRX Mathematics, Hindawi Publishing Corporation
ISSN 1687-8302, 1687-8299
Article ID 921574, doi:10.3814/2009/921574

Research Article

Multiple Hypotheses LAO Testing for Many Independent Objects

Evgueni Haroutunian and Parandzem Hakobyan

Institute for Informatics and Automation Problems, Armenian National Academy of Sciences, 1 P. Sevak Street, 0014 Yerevan, Armenia

Received 5 September 2008; Revised 27 May 2009; Accepted 15 July 2009

Copyright © 2009. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The procedure of many hypotheses logarithmically asymptotically optimal (LAO) testing for a model consisting of three or more independent objects is analyzed. It is supposed that $M$ probability distributions are known and that each object follows one of them independently of the others. The matrix of asymptotic interdependencies (reliability-reliability functions) of all possible pairs of the error probability exponents (reliabilities) in optimal testing for this model is studied. This problem was introduced (and solved for the case of two objects and two given probability distributions) by Ahlswede and Haroutunian; it is a generalization of the two hypotheses LAO testing problem for one object investigated by Hoeffding, Csiszár and Longo, Tusnády, Longo and Sgarro, Birgé, and others.

1. Introduction

In  Ahlswede and Haroutunian formulated an ensemble of new problems on multiple hypotheses testing for many objects and on identification of hypotheses. These problems are extensions of those investigated in the books [4, 5]. Problems of distribution identification and distribution ranking for one object were solved in . The problem of hypotheses testing for a model consisting of two independent or two strictly dependent objects (which cannot admit the same distribution) with two possible hypothetical distributions was also investigated in . In this paper we study the specific characteristics of the model consisting of $K$ ($\geq 3$) objects, each of which, independently of the others, follows one of $M$ ($\geq 2$) given probability distributions. The study concerns a certain number $K$ of similar objects (cities, institutions, schools, hospitals, factories, etc.), or one object in a series of $K$ different periods of time. The problem is a generalization of the two hypotheses testing investigated in  and of the testing of many hypotheses concerning one object solved in . The case of two independent objects with three hypotheses was examined earlier in , a local publication of small circulation.

Testing the distributions of many uniform objects is an interesting task that has not yet been accomplished. It is natural to begin this study with the simplest case of statistically independent objects.

Let $\mathcal{P}(\mathcal{X})$ be the space of all probability distributions (PDs) on a finite set $\mathcal{X}$. We are given $M$ distinct PDs $G_m\in\mathcal{P}(\mathcal{X})$, $m=\overline{1,M}$, which are known as possible distributions of the objects.

Let us recall the main definitions from  for the case of one object. The random variable (RV) $X$, which is a characteristic of the studied object, takes values in $\mathcal{X}$ and follows an unknown PD $G$, which is one of the $M$ given PDs $G_m$, $m=\overline{1,M}$. The statistician has to accept one of $M$ hypotheses $H_m: G=G_m$, $m=\overline{1,M}$, on the basis of a sequence of results $\mathbf{x}=(x_1,\dots,x_n,\dots,x_N)$, $x_n\in\mathcal{X}$, $n=\overline{1,N}$, of $N$ independent observations of the object. The decision procedure is a nonrandomized test $\varphi_N$, which can be defined by a partition of the sample space $\mathcal{X}^N$ into $M$ disjoint subsets $\mathcal{A}_l^N=\{\mathbf{x}:\varphi_N(\mathbf{x})=l\}$, $l=\overline{1,M}$. The set $\mathcal{A}_l^N$ contains all vectors $\mathbf{x}$ for which the hypothesis $H_l$ is adopted. The probability $\alpha_{l|m}(\varphi_N)$ of the erroneous acceptance of hypothesis $H_l$ provided that $H_m$ is true is equal to $G_m^N(\mathcal{A}_l^N)$, $l\neq m$, where $G_m^N(\mathbf{x})=\prod_{n=1}^{N}G_m(x_n)$. We define the probability to reject $H_m$, when it is true, as

$$\alpha_{m|m}(\varphi_N)\triangleq\sum_{l\neq m}\alpha_{l|m}(\varphi_N). \tag{1}$$
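The definitions above can be made concrete for small $N$ by enumerating the whole sample space $\mathcal{X}^N$. The following is only an illustrative sketch: the function names `error_matrix` and `max_likelihood_test` are hypothetical, the maximum-likelihood rule is just one convenient nonrandomized test (it is not claimed to be the LAO test), and the three binary distributions are borrowed from the example of Section 5. Indices are 0-based, with rows for the true hypothesis and columns for the accepted one.

```python
import itertools
import math


def error_matrix(dists, n_obs, decide):
    """alpha[m][l]: probability that `decide` accepts H_l when G_m is true.

    Rows are indexed by the true hypothesis and columns by the accepted one
    (0-based); the diagonal follows convention (1).
    """
    M = len(dists)
    alphabet = range(len(dists[0]))
    alpha = [[0.0] * M for _ in range(M)]
    for x in itertools.product(alphabet, repeat=n_obs):
        l = decide(x)  # the sample x belongs to the acceptance set A_l
        for m, G in enumerate(dists):
            alpha[m][l] += math.prod(G[s] for s in x)  # G_m^N(x)
    for m in range(M):
        # (1): probability of rejecting H_m when it is true
        alpha[m][m] = sum(alpha[m][l] for l in range(M) if l != m)
    return alpha


def max_likelihood_test(dists):
    """Hypothetical decision rule: accept the hypothesis with the largest likelihood."""
    def decide(x):
        scores = [sum(math.log(G[s]) for s in x) for G in dists]
        return max(range(len(dists)), key=lambda m: scores[m])
    return decide


G = [(0.10, 0.90), (0.85, 0.15), (0.23, 0.77)]
alpha = error_matrix(G, 8, max_likelihood_test(G))
for row in alpha:
    print(["%.4f" % a for a in row])
```

Exhaustive enumeration costs $|\mathcal{X}|^N$ evaluations, so this is suitable only for very short samples; it is meant to illustrate the definitions, not to estimate exponents.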

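The exponential decrease of the error probabilities as $N\to\infty$ is studied. The error probability exponents of the sequence of tests $\varphi$, which it is pertinent to call reliabilities, are defined as follows:

$$E_{l|m}(\varphi)\triangleq\overline{\lim}_{N\to\infty}-\frac{1}{N}\log\alpha_{l|m}(\varphi_N),\quad m,l=\overline{1,M}. \tag{2}$$

From (1) and (2) we see that

$$E_{m|m}(\varphi)=\min_{l\neq m}E_{l|m}(\varphi),\quad m=\overline{1,M}. \tag{3}$$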
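The passage from (1) and (2) to (3) can be spelled out in two lines. The sketch below is added for the reader's convenience and assumes, for simplicity, that the limits in (2) exist; it bounds the sum in (1) between its largest term and $M-1$ times that term.

```latex
% Sketch: why (1) and (2) give (3), assuming the limits in (2) exist.
\[
\max_{l\neq m}\alpha_{l|m}(\varphi_N)
\;\le\; \alpha_{m|m}(\varphi_N)
\;\le\; (M-1)\max_{l\neq m}\alpha_{l|m}(\varphi_N),
\]
\[
\min_{l\neq m}\Bigl(-\tfrac{1}{N}\log\alpha_{l|m}(\varphi_N)\Bigr)-\frac{\log(M-1)}{N}
\;\le\; -\frac{1}{N}\log\alpha_{m|m}(\varphi_N)
\;\le\; \min_{l\neq m}\Bigl(-\tfrac{1}{N}\log\alpha_{l|m}(\varphi_N)\Bigr).
\]
% Since log(M-1)/N -> 0, letting N -> infinity gives
\[
E_{m|m}(\varphi)=\min_{l\neq m}E_{l|m}(\varphi).
\]
```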

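The matrix

$$E(\varphi)=\begin{pmatrix}E_{1|1}&\dots&E_{l|1}&\dots&E_{M|1}\\ \vdots& &\vdots& &\vdots\\ E_{1|m}&\dots&E_{l|m}&\dots&E_{M|m}\\ \vdots& &\vdots& &\vdots\\ E_{1|M}&\dots&E_{l|M}&\dots&E_{M|M}\end{pmatrix} \tag{4}$$

is called the reliability matrix of the sequence $\varphi$ of tests. It was studied in . The question is how many elements of $E(\varphi)$ can be assigned values in advance, and which optimal values the best test can then guarantee for the others.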

The sequence of tests $\varphi^*$ is called logarithmically asymptotically optimal (LAO) if, for given positive values of the first $M-1$ diagonal elements of the matrix $E(\varphi^*)$, maximum possible values are provided to all its other elements. The concept of LAO test was introduced by Birgé  and also elaborated in [11, 12].

Let us now consider the model with three objects. Let $X_1$, $X_2$, and $X_3$ be independent RVs taking values in the same finite set $\mathcal{X}$, each following one of the $M$ PDs; these RVs are the characteristics of the corresponding independent objects. The random vector $(X_1,X_2,X_3)$ assumes values $(x^1,x^2,x^3)\in\mathcal{X}^3$.

Let $(\mathbf{x}^1,\mathbf{x}^2,\mathbf{x}^3)\triangleq((x_1^1,x_1^2,x_1^3),\dots,(x_n^1,x_n^2,x_n^3),\dots,(x_N^1,x_N^2,x_N^3))$, $x_n^k\in\mathcal{X}$, $k=\overline{1,3}$, $n=\overline{1,N}$, be a sequence of results of $N$ independent observations of the vector $(X_1,X_2,X_3)$. The test has to determine the unknown PDs of the objects on the basis of the observed data. The selection for each object should be made from the same set of hypotheses: $H_m: G=G_m$, $m=\overline{1,M}$. We call this procedure the compound test for three objects and denote it by $\Phi_N$; it can be composed of three individual tests $\varphi_N^1$, $\varphi_N^2$, $\varphi_N^3$, one for each of the three objects. We denote the infinite sequence of compound tests by $\Phi$. When we have $K$ independent objects the test $\Phi$ is composed of tests $\varphi^1,\varphi^2,\dots,\varphi^K$.

Let $\alpha_{l_1,l_2,l_3|m_1,m_2,m_3}(\Phi_N)$ be the probability of the erroneous acceptance of the triple of hypotheses $(H_{l_1},H_{l_2},H_{l_3})$ by the test $\Phi_N$ provided that the triple of hypotheses $(H_{m_1},H_{m_2},H_{m_3})$ is true, where $(m_1,m_2,m_3)\neq(l_1,l_2,l_3)$, $m_i,l_i=\overline{1,M}$, $i=\overline{1,3}$. The probability to reject a true triple of hypotheses $(H_{m_1},H_{m_2},H_{m_3})$ is, by analogy with (1), the following:

$$\alpha_{m_1,m_2,m_3|m_1,m_2,m_3}(\Phi_N)=\sum_{(l_1,l_2,l_3)\neq(m_1,m_2,m_3)}\alpha_{l_1,l_2,l_3|m_1,m_2,m_3}(\Phi_N). \tag{5}$$

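We study the corresponding reliabilities $E_{l_1,l_2,l_3|m_1,m_2,m_3}(\Phi)$ of the sequence of tests $\Phi$:

$$E_{l_1,l_2,l_3|m_1,m_2,m_3}(\Phi)\triangleq\overline{\lim}_{N\to\infty}-\frac{1}{N}\log\alpha_{l_1,l_2,l_3|m_1,m_2,m_3}(\Phi_N),\quad m_i,l_i=\overline{1,M},\ i=\overline{1,3}. \tag{6}$$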

Definitions (5) and (6) imply (cf. (3)) that

$$E_{m_1,m_2,m_3|m_1,m_2,m_3}(\Phi)=\min_{(l_1,l_2,l_3)\neq(m_1,m_2,m_3)}E_{l_1,l_2,l_3|m_1,m_2,m_3}(\Phi). \tag{7}$$

We call the test sequence $\Phi^*$ LAO for the model with $K$ objects if, for given positive values of a certain part of the elements of the reliability matrix $E(\Phi^*)$, the procedure provides maximal values for all its other elements.

Our aim is to analyze the reliability matrix $E(\Phi^*)=\{E_{l_1,l_2,l_3|m_1,m_2,m_3}(\Phi^*)\}$ of LAO tests for three objects.

We consider the problem for three objects for brevity; the generalization to $K$ independent objects is discussed along the way and in Section 4, but before that we recall the results for one object. The generalization to the case where the RVs $X_i$ take values in different sets $\mathcal{X}_i$ and have hypothetical PDs $G_m^i$, $m=\overline{1,M}$, $i=\overline{1,3}$, only complicates the notation.

2. LAO Testing of Many Hypotheses for One Object

We define the divergence (Kullback-Leibler distance) $D(Q\|G)$ for PDs $Q$ and $G$ from $\mathcal{P}(\mathcal{X})$ as usual:

$$D(Q\|G)=\sum_{x}Q(x)\log\frac{Q(x)}{G(x)}. \tag{8}$$
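For numerical experiments, definition (8) translates directly into a few lines of code. The following is a minimal sketch; the function name `kl_divergence` and its `base` parameter are assumptions made here for illustration (the text does not fix the base of the logarithm), and the sample distributions are those of the example in Section 5.

```python
import math


def kl_divergence(Q, G, base=math.e):
    """D(Q||G) = sum_x Q(x) log(Q(x)/G(x)), with the convention 0 log 0 = 0."""
    total = 0.0
    for q, g in zip(Q, G):
        if q == 0.0:
            continue  # 0 log 0 = 0
        if g == 0.0:
            return math.inf  # Q puts mass where G has none
        total += q * math.log(q / g, base)
    return total


G1, G2, G3 = (0.10, 0.90), (0.85, 0.15), (0.23, 0.77)
for name, (P, R) in {"D(G2||G1)": (G2, G1), "D(G3||G2)": (G3, G2)}.items():
    print(name, round(kl_divergence(P, R, base=2), 4), "bits")
```

The divergence is not symmetric, so the order of the arguments matters throughout the conditions that follow.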

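For given positive diagonal elements $E_{1|1},E_{2|2},\dots,E_{M-1|M-1}$ of the reliability matrix we consider the sets of PDs

$$\mathcal{R}_l\triangleq\{Q: D(Q\|G_l)\le E_{l|l}\},\quad l=\overline{1,M-1},\qquad \mathcal{R}_M\triangleq\{Q: D(Q\|G_l)>E_{l|l},\ l=\overline{1,M-1}\}, \tag{9}$$

and the following values for the elements of the future reliability matrix of the LAO test sequence:

$$E^*_{l|l}=E^*_{l|l}(E_{l|l})\triangleq E_{l|l},\quad l=\overline{1,M-1}, \tag{10}$$

$$E^*_{l|m}=E^*_{l|m}(E_{l|l})\triangleq\inf_{Q\in\mathcal{R}_l}D(Q\|G_m),\quad m=\overline{1,M},\ m\neq l,\ l=\overline{1,M-1}, \tag{11}$$

$$E^*_{M|m}=E^*_{M|m}(E_{1|1},\dots,E_{M-1|M-1})\triangleq\inf_{Q\in\mathcal{R}_M}D(Q\|G_m),\quad m=\overline{1,M-1}, \tag{12}$$

$$E^*_{M|M}=E^*_{M|M}(E_{1|1},\dots,E_{M-1|M-1})\triangleq\min_{l=\overline{1,M-1}}E^*_{M|l}. \tag{13}$$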
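To get a feeling for definition (11), the infimum over $\mathcal{R}_l$ can be approximated numerically. The sketch below is restricted to a binary alphabet, where $Q=(q,1-q)$ can simply be scanned on a uniform grid; the helper names `kl` and `e_star`, the grid size, the use of natural logarithms, and the 0-based indices are all assumptions of this illustration, not part of the paper's method.

```python
import math


def kl(Q, G):
    """D(Q||G) in nats, with 0 log 0 = 0; G is assumed strictly positive."""
    return sum(q * math.log(q / g) for q, g in zip(Q, G) if q > 0)


def e_star(l, m, E_ll, dists, grid=20000):
    """Approximate E*_{l|m}(E_{l|l}) = inf over R_l of D(Q||G_m), as in (11).

    Only for a binary alphabet: Q = (q, 1 - q) is scanned on a uniform grid.
    Indices l and m are 0-based.
    """
    best = math.inf
    for k in range(grid + 1):
        Q = (k / grid, 1 - k / grid)
        if kl(Q, dists[l]) <= E_ll:  # Q belongs to R_l
            best = min(best, kl(Q, dists[m]))
    return best


G = [(0.10, 0.90), (0.85, 0.15), (0.23, 0.77)]
for E11 in (0.01, 0.03, 0.05):
    print(E11, round(e_star(0, 1, E11, G), 4))
```

Enlarging $E_{l|l}$ enlarges $\mathcal{R}_l$, so the computed value can only decrease; it reaches zero once $G_m$ itself falls into $\mathcal{R}_l$, that is, once $E_{l|l}\ge D(G_m\|G_l)$.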

We recall the theorem concerning one object.

Theorem 1 ([11]).

If the distributions $G_m$ are different, that is, all divergences $D(G_l\|G_m)$, $l\neq m$, $l,m=\overline{1,M}$, are strictly positive, then two statements hold:

(a) when the given numbers $E_{1|1},E_{2|2},\dots,E_{M-1|M-1}$ satisfy the conditions

$$0<E_{1|1}<\min_{l=\overline{2,M}}D(G_l\|G_1), \tag{14}$$

$$0<E_{m|m}<\min\Bigl[\min_{l=\overline{1,m-1}}E^*_{l|m}(E_{l|l}),\ \min_{l=\overline{m+1,M}}D(G_l\|G_m)\Bigr],\quad m=\overline{2,M-1}, \tag{15}$$

then there exists an LAO sequence of tests $\varphi^*$, the elements of whose reliability matrix $E(\varphi^*)=\{E^*_{l|m}\}$ are defined in (10)–(13), and all of them are strictly positive;

(b) even if one of the conditions (14) or (15) is violated, then the reliability matrix of any such test includes at least one element equal to zero.
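Conditions (14) and (15) have to be checked in the order $m=1,2,\dots,M-1$, because the bound for $E_{m|m}$ depends on the previously chosen $E_{l|l}$, $l<m$. The sketch below illustrates this for a binary alphabet under the same assumptions as before (grid approximation of the infimum, natural logarithms, 0-based indices); `conditions_hold` and the helpers are hypothetical names introduced here.

```python
import math


def kl(Q, G):
    """D(Q||G) in nats, with 0 log 0 = 0; G is assumed strictly positive."""
    return sum(q * math.log(q / g) for q, g in zip(Q, G) if q > 0)


def e_star(l, m, E_ll, dists, grid=2000):
    """Grid approximation of inf over {Q: D(Q||G_l) <= E_ll} of D(Q||G_m), binary alphabet."""
    best = math.inf
    for k in range(grid + 1):
        Q = (k / grid, 1 - k / grid)
        if kl(Q, dists[l]) <= E_ll:
            best = min(best, kl(Q, dists[m]))
    return best


def conditions_hold(E_diag, dists):
    """Check (14)-(15) for given E_{1|1}, ..., E_{M-1|M-1} (list E_diag, 0-based)."""
    M = len(dists)
    for m in range(M - 1):
        # divergences D(G_l||G_m) for l > m; for m = 0 this is condition (14)
        bounds = [kl(dists[l], dists[m]) for l in range(m + 1, M)]
        # values E*_{l|m}(E_{l|l}) for l < m, which enter condition (15)
        bounds += [e_star(l, m, E_diag[l], dists) for l in range(m)]
        if not 0 < E_diag[m] < min(bounds):
            return False
    return True


G = [(0.10, 0.90), (0.85, 0.15), (0.23, 0.77)]
print(conditions_hold([0.05, 0.3], G))
print(conditions_hold([0.20, 0.3], G))
```

In the second call the first given value exceeds $D(G_3\|G_1)$ (about 0.071 in natural-logarithm units), so condition (14) fails.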

Corollary 1.

The diagonal elements of the reliability matrix of the LAO test in each row are equal only to the element in the same row and in the last column:

$$E^*_{m|m}=E^*_{M|m},\qquad E^*_{m|m}<E^*_{l|m},\quad l=\overline{1,M-1},\ l\neq m,\ m=\overline{1,M}. \tag{16}$$

That is, the elements of the last column are equal to the diagonal elements of the same row and, due to (3), are minimal in this row. Consequently, the first $M-1$ elements of the last column can also be considered as given parameters for the construction of the LAO test.

Proof.

For $m=M$, (16) is a consequence of (3). From the conditions (14) and (15) we see that $E^*_{m|m}<E^*_{l|m}(E^*_{l|l})$, $m=\overline{2,M-1}$, $l=\overline{1,m-1}$; hence $E^*_{m|m}$ can be equal only to one of the $E^*_{l|m}$ with $l=\overline{m+1,M}$. Assume that (16) is not true, that is, $E^*_{m|m}=E^*_{l|m}$ for some $l\in[m+1,M-1]$.

Applying the Kuhn-Tucker theorem to (11) we can derive (the proof is not difficult, but long, so we omit the exposition) that the elements $E^*_{l|l}$, $l=\overline{1,M-1}$, may be determined by the elements $E^*_{l|m}$, $m\neq l$, $m=\overline{1,M}$, through the following inverse functions:

$$E^*_{l|l}(E^*_{l|m})=\inf_{Q:\,D(Q\|G_m)\le E^*_{l|m}}D(Q\|G_l),\quad l=\overline{1,M-1}. \tag{17}$$

Then it follows from (11) and our provisional supposition that

$$E^*_{l|l}(E^*_{l|m})=\inf_{Q:\,D(Q\|G_m)\le E^*_{l|m}}D(Q\|G_l)=\inf_{Q:\,D(Q\|G_m)\le E^*_{m|m}}D(Q\|G_l)=E^*_{m|l},\quad m=\overline{1,l-1}, \tag{18}$$

but one can see from conditions (14) and (15) that $E^*_{l|l}<E^*_{m|l}$ for $m=\overline{1,l-1}$. Our assumption is not correct; hence (16) is valid and equality (3) implies $E^*_{m|m}=E^*_{M|m}$.

3. LAO Testing of Hypotheses for Three Independent Objects

Now let us consider the model of three independent objects and $M$ hypotheses. It was noted that the compound test $\Phi_N$ may be composed from separate tests $\varphi_N^1$, $\varphi_N^2$, $\varphi_N^3$. Let us denote by $E(\varphi^i)$ the reliability matrices of the sequences of tests $\varphi^i$, $i=\overline{1,3}$, for each of the objects. The following lemma is a generalization of lemmas from [2, 12].

Lemma 1.

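If the elements $E_{l|m}(\varphi^i)$, $m,l=\overline{1,M}$, $i=\overline{1,3}$, are strictly positive, then the following equalities hold for $\Phi=(\varphi^1,\varphi^2,\varphi^3)$:

$$E_{l_1,l_2,l_3|m_1,m_2,m_3}(\Phi)=\sum_{i=1}^{3}E_{l_i|m_i}(\varphi^i),\quad l_i,m_i=\overline{1,M},\ m_i\neq l_i,\ i=\overline{1,3}, \tag{19}$$

$$E_{l_1,l_2,l_3|m_1,m_2,m_3}(\Phi)=\sum_{i\in[[1,2,3]-k]}E_{l_i|m_i}(\varphi^i),\quad m_k=l_k,\ m_i\neq l_i,\ i\neq k,\ k=\overline{1,3}, \tag{20}$$

$$E_{l_1,l_2,l_3|m_1,m_2,m_3}(\Phi)=E_{l_i|m_i}(\varphi^i),\quad m_k=l_k,\ m_i\neq l_i,\ i=\overline{1,3},\ k\in[[1,2,3]-i]. \tag{21}$$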

Equalities (19) are valid also if $E_{l_i|m_i}(\varphi^i)=0$ for several pairs $(m_i,l_i)$ and several $i$'s.
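Lemma 1 says, in effect, that a compound reliability is the sum of the individual reliabilities of those objects for which a wrong hypothesis is accepted. A small sketch of this bookkeeping follows; the function name `compound_reliability`, the 0-based indices, and the numbers in the matrix `E1` are hypothetical and serve only to illustrate (19)–(21) together with the diagonal relation (7).

```python
import itertools


def compound_reliability(l, m, E_objs):
    """E_{l_1..l_K | m_1..m_K}(Phi) from per-object matrices E_objs[i][m_i][l_i] (0-based)."""
    l, m = tuple(l), tuple(m)
    if l != m:
        # (19)-(21): add the reliabilities of the objects that are decided wrongly
        return sum(E[mi][li] for E, li, mi in zip(E_objs, l, m) if li != mi)
    # (7): the diagonal element is the minimum over all wrong decisions
    M = len(E_objs[0])
    return min(
        compound_reliability(cand, m, E_objs)
        for cand in itertools.product(range(M), repeat=len(m))
        if cand != m
    )


# Hypothetical reliability matrix of one object
# (rows: true hypothesis, columns: accepted hypothesis; diagonals obey (3)).
E1 = [[0.1, 0.4, 0.1],
      [0.5, 0.2, 0.2],
      [0.3, 0.6, 0.3]]
E_objs = [E1, E1, E1]
print(compound_reliability((1, 1, 1), (0, 0, 0), E_objs))  # all three objects wrong
print(compound_reliability((2, 0, 0), (0, 0, 0), E_objs))  # only the first object wrong
print(compound_reliability((0, 0, 0), (0, 0, 0), E_objs))  # rejection of the true triple
```

Because the function works with tuples of arbitrary length, the same sketch covers any number $K$ of objects.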

Proof.

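It follows from the independence of the objects that

$$\begin{aligned}
\alpha_{l_1,l_2,l_3|m_1,m_2,m_3}(\Phi_N)&=\prod_{i=1}^{3}\alpha_{l_i|m_i}(\varphi_N^i),\quad \text{if } m_i\neq l_i,\\
\alpha_{l_1,l_2,l_3|m_1,m_2,m_3}(\Phi_N)&=\bigl(1-\alpha_{l_k|m_k}(\varphi_N^k)\bigr)\times\prod_{i\in[[1,2,3]-k]}\alpha_{l_i|m_i}(\varphi_N^i),\quad m_k=l_k,\ m_i\neq l_i,\ k=\overline{1,3},\ i\neq k,\\
\alpha_{l_1,l_2,l_3|m_1,m_2,m_3}(\Phi_N)&=\alpha_{l_i|m_i}(\varphi_N^i)\prod_{k\in[[1,2,3]-i]}\bigl(1-\alpha_{l_k|m_k}(\varphi_N^k)\bigr),\quad m_k=l_k,\ m_i\neq l_i,\ i=\overline{1,3}.
\end{aligned} \tag{22}$$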

Note that here we also consider the probabilities of right (not erroneous) decisions. Because the $E_{l|m}(\varphi^i)$ are strictly positive, the error probability $\alpha_{l|m}(\varphi_N^i)$ tends to zero as $N\to\infty$. Accordingly, we have

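$$\overline{\lim}_{N\to\infty}-\frac{1}{N}\log\bigl(1-\alpha_{l|m}(\varphi_N^i)\bigr)=\overline{\lim}_{N\to\infty}\frac{\alpha_{l|m}(\varphi_N^i)}{N}\cdot\frac{\log\bigl(1-\alpha_{l|m}(\varphi_N^i)\bigr)}{-\alpha_{l|m}(\varphi_N^i)}=0. \tag{23}$$

From definitions (5) and (6), equalities (22), and applying (23), we obtain relations (19)–(21).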
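The role of (23) is that factors of the form $1-\alpha$ in (22) contribute nothing to the exponent. A quick numerical check of this, under the illustrative assumption that the error probability decays exactly as $e^{-NE}$ for a hypothetical reliability $E$, might look as follows (`correct_decision_exponent` is a name introduced here).

```python
import math


def correct_decision_exponent(N, E):
    """-(1/N) log(1 - alpha_N) for the model error probability alpha_N = exp(-N E)."""
    alpha = math.exp(-N * E)
    # log1p(-alpha) evaluates log(1 - alpha) accurately for small alpha
    return -math.log1p(-alpha) / N


E = 0.5  # hypothetical reliability
for N in (5, 10, 20, 50):
    print(N, correct_decision_exponent(N, E))
```

The printed values shrink rapidly towards zero as $N$ grows, in agreement with (23).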

Now we show how the LAO test can be constructed from the set of compound tests when $3(M-1)$ strictly positive elements of the last column of the reliability matrix, $E_{M,M,M|m,M,M}$, $E_{M,M,M|M,m,M}$, and $E_{M,M,M|M,M,m}$, $m=\overline{1,M-1}$, are given in advance.

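The following subset of tests:

$$\mathcal{D}=\{\Phi: E_{m|m}(\varphi^i)>0,\ m=\overline{1,M},\ i=\overline{1,3}\} \tag{24}$$

is distinguished by the property that when $\Phi\in\mathcal{D}$ the elements $E_{M,M,M|m,M,M}(\Phi)$, $E_{M,M,M|M,m,M}(\Phi)$, and $E_{M,M,M|M,M,m}(\Phi)$, $m=\overline{1,M-1}$, of the reliability matrix are strictly positive.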

Indeed, because $E_{m|m}(\varphi^i)>0$, in view of (3) the elements $E_{M|m}(\varphi^i)$ are also strictly positive. From equalities (23), keeping in mind (6), (16), and (22), we obtain that the noted elements are strictly positive for $\Phi\in\mathcal{D}$ and

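$$E_{M,M,M|m,M,M}(\Phi)=E_{M|m}(\varphi^1),\quad E_{M,M,M|M,m,M}(\Phi)=E_{M|m}(\varphi^2),\quad E_{M,M,M|M,M,m}(\Phi)=E_{M|m}(\varphi^3),\quad m=\overline{1,M-1}. \tag{25}$$

Define the following family of decision sets of PDs for given positive elements $E_{M,M,M|m,M,M}$, $E_{M,M,M|M,m,M}$, and $E_{M,M,M|M,M,m}$, $m=\overline{1,M-1}$:

$$\begin{aligned}
\mathcal{R}_m^{(i)}&\triangleq\{Q: D(Q\|G_m)\le E_{M,M,M|m_1,m_2,m_3},\ m_i=m,\ m_j=M,\ i\neq j,\ j=\overline{1,3}\},\quad m=\overline{1,M-1},\ i=\overline{1,3},\\
\mathcal{R}_M^{(i)}&\triangleq\{Q: D(Q\|G_m)>E_{M,M,M|m_1,m_2,m_3},\ m_i=m,\ m_j=M,\ i\neq j,\ j=\overline{1,3}\},\quad m=\overline{1,M-1},\ i=\overline{1,3}.
\end{aligned} \tag{26}$$

Define also the values of the reliability matrix of the LAO test for three objects:

$$E^*_{M,M,M|m,M,M}\triangleq E_{M,M,M|m,M,M},\quad E^*_{M,M,M|M,m,M}\triangleq E_{M,M,M|M,m,M},\quad E^*_{M,M,M|M,M,m}\triangleq E_{M,M,M|M,M,m}, \tag{27}$$

$$E^*_{l_1,l_2,l_3|m_1,m_2,m_3}\triangleq\inf_{Q\in\mathcal{R}_{l_i}^{(i)}}D(Q\|G_{m_i}),\quad m_k=l_k,\ m_i\neq l_i,\ i\neq k,\ i=\overline{1,3},\ k\in[[1,2,3]-i], \tag{28}$$

$$E^*_{l_1,l_2,l_3|m_1,m_2,m_3}\triangleq\sum_{i\neq k}\inf_{Q\in\mathcal{R}_{l_i}^{(i)}}D(Q\|G_{m_i}),\quad m_k=l_k,\ m_i\neq l_i,\ k=\overline{1,3},\ i\in[[1,2,3]-k], \tag{29}$$

$$E^*_{l_1,l_2,l_3|m_1,m_2,m_3}\triangleq E^*_{l_1,m_2,m_3|m_1,m_2,m_3}+E^*_{m_1,l_2,m_3|m_1,m_2,m_3}+E^*_{m_1,m_2,l_3|m_1,m_2,m_3},\quad m_i\neq l_i. \tag{30}$$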

The following theorem is the main result of the present paper. It is a generalization and improvement of the corresponding theorem proved in  for the case $K=2$, $M=2$.

Theorem 2.

If all distributions $G_m$, $m=\overline{1,M}$, are different (equivalently, $D(G_l\|G_m)>0$, $l\neq m$, $l,m=\overline{1,M}$), then the following statements are valid:

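(a) when the given strictly positive elements $E_{M,M,M|m,M,M}$, $E_{M,M,M|M,m,M}$, and $E_{M,M,M|M,M,m}$, $m=\overline{1,M-1}$, meet the following conditions:

$$\max\bigl(E_{M,M,M|1,M,M},\,E_{M,M,M|M,1,M},\,E_{M,M,M|M,M,1}\bigr)<\min_{l=\overline{2,M}}D(G_l\|G_1), \tag{31}$$

$$E_{M,M,M|m,M,M}<\min\Bigl[\min_{l=\overline{1,m-1}}E^*_{l,m,m|m,m,m},\ \min_{l=\overline{m+1,M}}D(G_l\|G_m)\Bigr],\quad m=\overline{2,M-1}, \tag{32}$$

$$E_{M,M,M|M,m,M}<\min\Bigl[\min_{l=\overline{1,m-1}}E^*_{m,l,m|m,m,m},\ \min_{l=\overline{m+1,M}}D(G_l\|G_m)\Bigr],\quad m=\overline{2,M-1}, \tag{33}$$

$$E_{M,M,M|M,M,m}<\min\Bigl[\min_{l=\overline{1,m-1}}E^*_{m,m,l|m,m,m},\ \min_{l=\overline{m+1,M}}D(G_l\|G_m)\Bigr],\quad m=\overline{2,M-1}, \tag{34}$$

then there exists an LAO test sequence $\Phi^*\in\mathcal{D}$, the reliability matrix of which, $E(\Phi^*)=\{E_{l_1,l_2,l_3|m_1,m_2,m_3}(\Phi^*)\}$, is defined in (27)–(30), and all its elements are positive;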

(b) when even one of the inequalities (31)–(34) is violated, then there exists at least one element of the matrix E(Φ*) equal to 0.

Proof.

The test $\Phi^*=(\varphi^{1,*},\varphi^{2,*},\varphi^{3,*})$, where $\varphi^{i,*}$, $i=\overline{1,3}$, are LAO tests of the objects $X_i$, belongs to the set $\mathcal{D}$. Our aim is to prove that such a $\Phi^*$ is a compound LAO test. Conditions (31)–(34) imply that inequalities analogous to (14) and (15) hold simultaneously for the tests for the three separate objects.

Let the test $\Phi\in\mathcal{D}$ be such that

$$E_{M,M,M|m,M,M}(\Phi)=E_{M,M,M|m,M,M},\quad E_{M,M,M|M,m,M}(\Phi)=E_{M,M,M|M,m,M},\quad E_{M,M,M|M,M,m}(\Phi)=E_{M,M,M|M,M,m},\quad m=\overline{1,M-1}. \tag{35}$$

Taking into account (25) and (28) we can see that conditions (31)–(34) may be replaced by the following inequalities:

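$$E_{M|m}(\varphi^i)<\min\Bigl[\min_{l=\overline{1,m-1}}\ \inf_{Q:\,D(Q\|G_m)\le E_{M|m}(\varphi^i)}D(Q\|G_l),\ \min_{l=\overline{m+1,M}}D(G_l\|G_m)\Bigr],\quad m=\overline{1,M-1}. \tag{36}$$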

According to Corollary 1, in the case of the LAO tests $\varphi^{i,*}$, $i=\overline{1,3}$, we obtain that (36) meets conditions (14)-(15) of Theorem 1. For each test $\Phi\in\mathcal{D}$ we have $E_{m|m}(\varphi^i)>0$, $i=\overline{1,3}$; hence it follows from (3) that the $E_{m|l}(\varphi^i)$ are also strictly positive. Thus for a test $\Phi\in\mathcal{D}$ the conditions of Lemma 1 are fulfilled and the elements of the reliability matrix $E(\Phi)$ coincide with elements of the matrices $E(\varphi^i)$, $i=\overline{1,3}$, or with sums of them (see (19)–(21)). Then from the definition of the LAO test it follows that $E_{l|m}(\varphi^i)\le E_{l|m}(\varphi^{i,*})$, and therefore $E_{l_1,l_2,l_3|m_1,m_2,m_3}(\Phi)\le E_{l_1,l_2,l_3|m_1,m_2,m_3}(\Phi^*)$. Consequently $\Phi^*$ is an LAO test and the $E_{l_1,l_2,l_3|m_1,m_2,m_3}(\Phi^*)$ satisfy (27)–(30).

(b) When even one of the inequalities (31)–(34) is violated, then at least one of the inequalities (36) is violated. Then by Theorem 1 one of the elements $E_{m|l}(\varphi^{i,*})$ is equal to zero. Suppose $E_{3|2}(\varphi^{1,*})=0$; then the elements $E_{3,m,l|2,m,l}(\Phi^*)=E_{3|2}(\varphi^{1,*})=0$.

Theorem 2 is proved.

4. On the Case of K (> 3) Objects

When we consider the model with $K$ independent objects the generalization of Lemma 1 will take the following form.

Lemma 2.

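If the elements $E_{l_i|m_i}(\varphi^i)$, $m,l=\overline{1,M}$, $i=\overline{1,K}$, are strictly positive, then the following equalities hold for $\Phi=(\varphi^1,\varphi^2,\dots,\varphi^K)$:

$$E_{l_1,l_2,\dots,l_K|m_1,m_2,\dots,m_K}(\Phi)=\sum_{i=1}^{K}E_{l_i|m_i}(\varphi^i),\quad \text{if } m_i\neq l_i,\ i=\overline{1,K}, \tag{37}$$

$$E_{l_1,l_2,\dots,l_K|m_1,m_2,\dots,m_K}(\Phi)=\sum_{i:\,m_i\neq l_i}E_{l_i|m_i}(\varphi^i),\quad \text{if } m_j=l_j,\ m_i\neq l_i,\ i,j=\overline{1,K},\ i\neq j. \tag{38}$$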

For given $K(M-1)$ strictly positive elements $E_{M,M,\dots,M|m,M,\dots,M}$, $E_{M,M,\dots,M|M,m,\dots,M}$, $\dots$, $E_{M,M,\dots,M|M,M,\dots,m}$, $m=\overline{1,M-1}$, for $K$ independent objects we can find the LAO test $\Phi^*$ in a way similar to the case of three independent objects. So the problem of many hypotheses testing for the model with $K$ independent objects can be solved by means of the corresponding sets $\mathcal{R}_m^{(i)}$, $i=\overline{1,K}$, $m=\overline{1,M}$, as in (27)–(30), and conditions analogous to (31)–(34).

5. Example

Some illustrations of the results presented are given by examples concerning two objects. The set $\mathcal{X}=\{0,1\}$ contains two elements and the following PDs are given on $\mathcal{X}$: $G_1=\{0.10;\,0.90\}$, $G_2=\{0.85;\,0.15\}$, $G_3=\{0.23;\,0.77\}$. As follows from relations (28)–(30) of Lemma 2, several elements of the reliability matrix are functions of one of the given elements, and there are also elements which are functions of two or three given elements. For example, for the case of two objects, Figures 1 and 2 present the results of calculations of the functions $E_{1,2|2,1}(E_{3,3|1,3},E_{3,3|3,2})$ and $E_{1,2|2,2}(E_{3,3|1,3})$. For these distributions we have $\min(D(G_2\|G_1),D(G_3\|G_1))\approx 2.2$ and $\min(E_{2,2|2,1},D(G_3\|G_2))\approx 1.4$. We see that when the inequalities (32) or (33) are violated, then $E_{1,2|2,1}=0$ and $E_{1,2|2,2}=0$.

6. Conclusion

We have presented a solution of the multiple hypotheses LAO testing problem for many objects. A first idea might be to study the matrix $E(\Phi)$ by renumbering the $K$-vectors of PDs from 1 to $M^K$ as PDs of one complex object. We can then give $M^K-1$ diagonal elements of such a matrix $E(\Phi)$ and apply Theorem 1 concerning one object. In this case the number of elements of the matrix $E(\Phi)$ given in advance would be greater (because $M^K-1>K(M-1)$ for $M\geq 2$, $K\geq 2$), and the calculations would be longer than in our algorithm presented in Section 3.
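The difference between the two parameter counts grows quickly with $K$. The following few lines tabulate both numbers for small $M$ and $K$; the function name `given_elements` is introduced here only for illustration.

```python
def given_elements(M, K):
    """Number of preassigned elements: (renumbering approach, approach of Section 3)."""
    return M ** K - 1, K * (M - 1)


for M in (2, 3, 4):
    for K in (2, 3, 4):
        renumbered, per_object = given_elements(M, K)
        print(f"M={M} K={K}: M^K-1={renumbered:3d}  K(M-1)={per_object:2d}")
```

For instance, with three hypotheses and three objects the renumbering approach requires 26 given elements, against 6 for the per-object approach.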

The proposed approach also makes it possible to define the LAO tests for each of the separate objects. It must be noted that the approach with renumbering of the triples of hypotheses does not offer this possibility.

In applications either of the two approaches may be used, according to the preferences of the investigator.

References

[1] R. F. Ahlswede and E. A. Haroutunian, "Testing of hypothesis and identification," Electronic Notes in Discrete Mathematics, vol. 21, pp. 185–189, 2005. doi:10.1016/j.endm.2005.07.020

[2] R. F. Ahlswede and E. A. Haroutunian, "On logarithmically asymptotically optimal testing of hypotheses and identification," in General Theory of Information Transfer and Combinatorics, vol. 4123 of Lecture Notes in Computer Science, pp. 462–478, Springer, New York, NY, USA, 2006.

[3] E. A. Haroutunian, "Reliability in multiple hypotheses testing and identification problems," in Proceedings of the NATO-ASI Conference, vol. 198 of NATO Science Series III: Computer and Systems Sciences, pp. 189–201, IOS Press, Yerevan, Armenia, 2005.

[4] R. E. Bechhofer, J. Kiefer, and M. Sobel, Sequential Identification and Ranking Procedures, The University of Chicago Press, Chicago, Ill, USA, 1968.

[5] R. F. Ahlswede and I. Wegener, Search Problems, John Wiley & Sons, New York, NY, USA, 1987.

[6] W. Hoeffding, "Asymptotically optimal tests for multinomial distributions," Annals of Mathematical Statistics, vol. 36, pp. 369–401, 1965.

[7] I. Csiszár and G. Longo, "On the error exponent for source coding and for testing simple statistical hypotheses," Studia Scientiarum Mathematicarum Hungarica, vol. 6, pp. 181–191, 1971.

[8] G. Tusnády, "On asymptotically optimal tests," Annals of Statistics, vol. 5, no. 2, pp. 385–393, 1977.

[9] G. Longo and A. Sgarro, "The error exponent for the testing of simple statistical hypotheses, a combinatorial approach," Journal of Combinatories, Informational System Sciences, vol. 5, no. 1, pp. 58–67, 1980.

[10] L. Birgé, "Vitesses maximales de décroissance des erreurs et tests optimaux associés," Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, vol. 55, pp. 261–273, 1981.

[11] E. A. Haroutunian, "Logarithmically asymptotically optimal testing of multiple statistical hypotheses," Problems of Control and Information Theory, vol. 19, no. 5-6, pp. 413–421, 1990.

[12] E. A. Haroutunian and P. M. Hakobyan, "On logarithmically asymptotically optimal hypothesis testing of three distributions for pair of independent objects," Mathematical Problems of Computer Science, vol. 24, pp. 76–81, 2005.