Binary Classification of Multigranulation Searching Algorithm Based on Probabilistic Decision

Multigranulation computing, which adequately embodies the model of human intelligence in the process of solving complex problems, aims to decompose a complex problem into many subproblems in different granularity spaces; the subproblems are then solved and synthesized to obtain the solution of the original problem. In this paper, an efficient binary classification of multigranulation searching algorithm is established, which attains the optimal mathematical expectation of classification times for classifying the objects of the whole domain. It can solve binary classification problems, such as the blood analysis case, based on both the multigranulation computing mechanism and the principles of probability and statistics. Given the binary classifier, the negative sample ratio, and the total number of objects in the domain, this model can search for the minimum mathematical expectation of classification times and the optimal classification granularity spaces for mining all the negative samples. The experimental results demonstrate that, as the granules are divided into more subgranules, the efficiency of the proposed method gradually increases and tends to be stable. In addition, the complexity of solving the problem is greatly reduced.


Introduction
With the rapid development of modern science and technology, the amount of information that people face every day is increasing dramatically, and it is urgent to find a simple and effective way to process complex information. This demand has promoted the rise and development of multigranulation computing [1][2][3][4][5][6][7][8]. Information granulation has attracted great attention from researchers since Professor Zadeh published a paper discussing information granulation in 1997 [9]. Earlier, in 1985, Professor Hobbs published a paper named "Granularity" at the International Joint Conference on Artificial Intelligence held in Los Angeles, United States; it focuses on the granulation of decomposition and synthesis and on how to obtain and generate different granularities [10]. These studies play a leading role not only in granular computing methodology [11][12][13][14][15][16][17], but also in dealing with complex information [18][19][20][21][22]. Subsequently, the number of studies focused on granular computing has increased rapidly. Many scholars have successfully used the basic theoretical model of multigranulation computing to deal with practical problems [23][24][25][26][27][28][29]. Currently, multigranulation computing has become a basic theoretical model for solving complex problems and discovering knowledge from massive information [30,31].
The multigranulation computing method mainly aims to establish a multilevel or multidimensional computing model, in which we need to find the solving approach and the synthesis strategies in different granularity spaces for solving a complex problem. Complex information is subdivided into many pieces of simple information in the different granularity spaces [32][33][34][35]; then, effective solutions are obtained by applying data mining and knowledge discovery techniques to the simple information. In this way, complex problems can be solved at different granularities or dimensions. For large-scale binary classification problems, it is an important issue to determine the class of all objects in the domain using as few classification times as possible, and this issue has attracted the attention of a large number of researchers [36][37][38][39]. Suppose we have a binary classification algorithm, $N$ is the number of all objects in the domain, and $p$ is the probability of negative samples in the domain. We then need to determine the class of every object in the domain. The traditional method classifies the objects one by one with the binary classification algorithm; it is simple but entails a heavy workload when facing a large set of objects. Therefore, some researchers have proposed group classification, in which each tested object is composed of many samples, and this method improves the searching efficiency [40][41][42]. For example, how can we effectively complete the binary classification task with the minimum classification times when facing massive blood analysis? The traditional method is to test each person once, which is a heavy workload when facing many thousands of objects. To a certain extent, the single-level granulation method (namely, the single-level group testing method) can reduce the workload.
In 1986, Professor Mingmin and Junli proposed that the single-level group testing method can reduce the workload of massive blood analysis when the prevalence rate $p$ of a sickness is less than about 0.3 [43]. In this method, all objects are subdivided into many small subgroups, and then every subgroup is tested. If the testing result of a subgroup is sickness, we diagnose each object by testing all the objects in this subgroup one by one. Otherwise, if the result is health, we can conclude that all objects in this subgroup are healthy. But for millions or even billions of objects, can this kind of single-level granulation method still solve the complex problem effectively? At present, there are many methods based on the single-level granulation searching model [44][45][46][47][48][49][50][51][52][53][54], but studies on the multilevels granulation searching model are few [55][56][57][58][59]. In this paper, a binary classification of multilevels granulation searching algorithm is proposed; namely, an efficient multigranulation binary classification searching model is established based on the hierarchical quotient space structure. This algorithm combines the falsity and truth preserving principles in quotient space theory with the theory of mathematical expectation. Assuming that $p$ is the probability of negative samples in the domain, the smaller $p$ is, the higher the efficiency of this algorithm. A large number of experimental results indicate that the proposed algorithm has high efficiency and universality.
The rest of this paper is organized as follows. First, some preliminary concepts and conclusions are reviewed in Section 2. Then, the binary classification of multigranulation searching algorithm is proposed in Section 3. Next, the experimental analysis is discussed in Section 4. Finally, the paper is concluded in Section 5.

Preliminary
For convenience, some preliminary concepts are reviewed or defined first.

Definition 1 (quotient space model [60]). Suppose that a triplet describes a problem space, consisting of the universe, the structure of the universe, and the attributes (or features) of the universe. Suppose that the universe is represented with the finest granularity. When we view the same universe from a coarser granularity, we obtain a coarse-granularity universe, and the corresponding problem space is called a quotient space of the original space.

Theorem 3 (truth preserving principle [61]). Suppose that a problem has a solution on the quotient space and that, for every element $[x]$ of the coarse-granularity universe, $h^{-1}([x])$ is a connected set on the original universe. Then the problem has a solution on the original space. Here $h$ is the natural projection from the universe onto the coarse-granularity universe, which is defined as follows:

$$h(x) = [x]. \quad (1)$$

Definition 4 (expectation [62]). Let $X$ be a discrete random variable. The expectation or mean of $X$ is defined as $\mu = E(X) = \sum_{x} x P(X = x)$, where $P(X = x)$ is the probability of $X = x$.

In the case that $X$ takes values from an infinite number set, $\mu$ becomes an infinite series. If the series converges absolutely, we say the expectation $E(X)$ exists; otherwise, we say that the expectation of $X$ does not exist.
Lemma 5 is the basis of the following discussion.

Binary Classification of Multigranulation Searching Model Based on Probability and Statistics
Granulation is seen as a way of constructing simple theories out of more complex ones [1]. At the same time, the transformation between two different granularity layers is mainly based on the falsity and truth preserving principles in quotient space theory, and it can solve many classical problems such as the scale ball game and branch decisions. These instances clearly embody the idea of solving complex problems with multigranulation methods.
In the second section, the relevant concepts of multigranulation computing theory and probability theory were reviewed. A multigranulation binary classification searching model not only solves practical complex problems at lower cost, but also can be constructed easily. This model may also provide some inspiration for the applications of multigranulation computing theory.
Example 8. Suppose that many people need a blood analysis in a physical examination for diagnosing a disease (there are two classes: normal stands for health and abnormal stands for sickness). The domain $U = \{x_1, x_2, \ldots, x_n\}$ stands for all the people. Let $N$ denote the number of all people, and let $p$ stand for the prevalence rate. The quotient space of the blood analysis is then built on the domain $U$ and its power set $2^U$. Besides, we also have a binary classifier (or a reagent) that diagnoses the disease by testing a blood sample. How can we complete all the blood analyses with the minimal classification times? Namely, how can we determine the class of all objects in the domain? There are usually three ways, as follows.
Method 9 (traditional method). In order to accurately diagnose all objects, every blood sample is tested one by one, so this method needs $N$ classification times. This method is just like the classification process of machine learning.
Method 10 (single-level granulation method). This method mixes $k$ blood samples into a group, where $k$ may be $1, 2, 3, \ldots, N$; namely, the original quotient space is randomly partitioned into $(U_1, U_2, \ldots, U_t)$ with $\bigcup_{i=1}^{t} U_i = U$ and $U_i \cap U_j = \emptyset$ for $i, j \in \{1, 2, \ldots, t\}$, $i \neq j$. Each mixed blood group is then tested once. If the testing result of a group is abnormal, then according to Theorem 2 this group contains abnormal object(s), and, in order to make a diagnosis, all of the objects in this group must be tested again one by one. Similarly, if the testing result of a group is normal, then according to Theorem 3 all of the objects in this group are normal, so all $k$ objects in this group need only one test to be diagnosed. In this process, the binary classifier can also classify the new blood sample that is mixed from $k$ blood samples.
If every group is mixed from a large number of blood samples (namely, $k$ is a large number) and some groups are tested to be abnormal, many objects must be tested again one by one. In order to reduce the classification times, this paper proposes a multilevels granulation model.

Method 11 (multilevels granulation method). Firstly, each mixed blood group, which contains $k_1$ samples (objects), is tested once, where $k_1$ may be $1, 2, 3, \ldots, N$; namely, the original quotient space is randomly partitioned into $(U_1, U_2, \ldots, U_t)$ with $\bigcup_{i=1}^{t} U_i = U$ and $U_i \cap U_j = \emptyset$ for $i, j \in \{1, 2, \ldots, t\}$, $i \neq j$. Next, if a group is tested to be normal, all objects in that group are normal (healthy), so all $k_1$ objects in this group need only one test to be diagnosed. Otherwise, if a group is tested to be abnormal, it is subdivided into smaller subsets (subgroups) that contain $k_2$ ($k_2 < k_1$) objects; namely, the quotient space of an abnormal group $U_i$ is randomly partitioned into $(U_{i1}, U_{i2}, \ldots, U_{il})$ with $\bigcup_{j=1}^{l} U_{ij} = U_i$ and $U_{ij} \cap U_{im} = \emptyset$ for $j, m \in \{1, 2, \ldots, l\}$, $j \neq m$, $l < k_1$. Each subgroup is then tested once again. Similarly, if a subgroup is tested to be normal, all objects in it are confirmed to be healthy, and if a subgroup is tested to be abnormal, it is subdivided once again into smaller subgroups that contain $k_3$ ($k_3 < k_2$) objects. Therefore, the testing results of all objects can be determined by repeating the above process within a group until a subgroup contains only one object or its testing result is normal.

The searching efficiency of the above three methods is analyzed as follows.
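The three methods can be contrasted with a short Python sketch. It is only an illustration under stated assumptions: the function names `group_is_abnormal` and `count_tests` are hypothetical, samples are encoded as 1 for normal and 0 for abnormal, the pooled classifier is idealized as error-free, and every group is cut into consecutive chunks of the size given for its layer.

```python
from typing import List, Sequence

def group_is_abnormal(samples: Sequence[int]) -> bool:
    """Hypothetical pooled classifier: a group is abnormal if it holds a 0."""
    return sum(samples) < len(samples)

def count_tests(samples: Sequence[int], sizes: List[int]) -> int:
    """Count the classifier calls needed to label every sample.

    sizes[0] is the group size in the 1st layer, sizes[1] in the 2nd, and
    so on. Groups still abnormal after the last layer are tested one by one.
    """
    if not sizes:
        # No grouping layer left: test every object individually.
        return len(samples)
    size, rest = sizes[0], sizes[1:]
    tests = 0
    for start in range(0, len(samples), size):
        group = samples[start:start + size]
        tests += 1  # one test on the mixed group
        # A single abnormal object is already diagnosed by that test.
        if group_is_abnormal(group) and len(group) > 1:
            tests += count_tests(group, rest)
    return tests

samples = [1] * 16
samples[5] = 0  # a single abnormal object

traditional = count_tests(samples, [])          # Method 9
single_level = count_tests(samples, [8])        # Method 10
multi_level = count_tests(samples, [8, 4, 2])   # Method 11
print(traditional, single_level, multi_level)   # prints: 16 10 8
```

With one abnormal object among 16, the normal half is cleared by a single test in both grouped methods, and the multilevels variant additionally clears normal subgroups inside the abnormal half.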
In Method 9, every object has to be tested once for diagnosing the disease, so $N$ classification times are needed in total.
In Method 10, the original problem space is subdivided into many disjoint subspaces (subsets). If a subset is tested to be normal, all of its objects are diagnosed by a single test. Therefore, the classification times can be reduced if the probability $p$ is small enough [9].
In Method 11, the key is to find the optimal multigranulation space for searching all objects, so a multilevels granulation model needs to be established. There are two questions. One is the grouping strategy: namely, how many objects should a group contain? The other is the optimal number of granulation levels: namely, how many levels should be granulated from the original problem space?
In this paper, we mainly solve these two questions for Method 11. According to the truth and falsity preserving principles in quotient space theory, all normal parts of the blood samples can be ignored. Hence, the original problem is simplified to a smaller subspace. This idea not only reduces the complexity of the problem, but also improves the efficiency of searching for abnormal objects.
Algorithm Strategy. Example 8 can be regarded as a tree structure in which each node (which stands for a group) is a tuple of objects. The searching problem on the tree is thus transformed into a hierarchical reasoning process over a monotonic relation sequence. The original space is transformed into a hierarchical structure in which all subproblems are solved at different granularity levels.
Firstly, the general rule can be derived by analyzing the simplest hierarchy and grouping case, which is the single-level granulation. Secondly, we can calculate the mathematical expectation of the classification times of the blood analysis.
Finally, an optimal hierarchical granulation model is established by comparing the expectations of classification times.
Analysis. Suppose that there is an object set $U = \{x_1, x_2, \ldots, x_n\}$ and the prevalence rate is $p$. Then the probability that an object appears normal in the blood analysis is $q = 1 - p$.
The probability that a group is tested to be normal is $q^{k_1}$, and the probability that it is tested to be abnormal is $1 - q^{k_1}$, where $k_1$ is the number of objects in a group.

3.1. The Single-Level Granulation. Firstly, the domain (which contains $N$ objects) is randomly subdivided into many subgroups, where each subgroup contains $k_1$ objects. In other words, a new quotient space is obtained based on an equivalence relation. Suppose that the average classification times of each object is a random variable $X_1$; the probability distribution of $X_1$ is shown in Table 1. Thus, the mathematical expectation of $X_1$ is obtained as follows:

$$E_1(X_1) = \frac{1}{k_1} q^{k_1} + \left(1 + \frac{1}{k_1}\right)\left(1 - q^{k_1}\right) = 1 + \frac{1}{k_1} - q^{k_1}.$$

Then, the total mathematical expectation over the domain is obtained as follows:

$$N \times E_1(X_1) = N \left(1 + \frac{1}{k_1} - q^{k_1}\right).$$

When the probability $p$ is kept unchanged and $k_1$ satisfies the inequality $E_1(X_1) < 1$, the single-level granulation method reduces the classification times. For example, if $p = 0.5$ and $k_1 > 1$, then according to Lemma 5, $E_1(X_1) > 1$ no matter what the value of $k_1$ is, so the single-level granulation method is worse than the traditional method (namely, testing every object in turn). Conversely, if $p = 0.001$ and $k_1 = 32$, $E_1(X_1)$ reaches its minimum value, and the classification times of the single-level granulation method are fewer than those of the traditional method. Let $N = 10000$; the total classification times are approximately equal to 628:

$$N \times E_1(X_1) = 10000 \times \left(1 + \frac{1}{32} - 0.999^{32}\right) \approx 628.$$

This shows that the single-level granulation method can greatly improve the efficiency of diagnosing and reduce the classification times by 93.72%. If the prevalence rate is extremely low, for example, $p = 0.000001$, the total classification times reach their minimum value when each group contains 1001 objects (namely, $k_1 = 1001$). If every group is subdivided into many smaller subgroups again and the above method is repeated, can the total classification times be further reduced?
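The single-level figures can be checked numerically. The following is a minimal sketch, assuming the per-object expectation has the closed form $1 + 1/k_1 - q^{k_1}$ with $q = 1 - p$; the helper names `single_level_expectation` and `best_group_size` are illustrative, and the optimum is found by brute force rather than by Lemma 6.

```python
def single_level_expectation(p: float, k1: int) -> float:
    """Expected tests per object: 1 + 1/k1 - q**k1 with q = 1 - p."""
    q = 1.0 - p
    return 1.0 + 1.0 / k1 - q ** k1

def best_group_size(p: float, max_size: int) -> int:
    """Brute-force the group size in [2, max_size] that minimizes E1."""
    return min(range(2, max_size + 1),
               key=lambda k: single_level_expectation(p, k))

N, p = 10000, 0.001
k1 = best_group_size(p, 200)
total = N * single_level_expectation(p, k1)
print(k1, round(total))  # prints: 32 628

# For p = 0.5 every group size is worse than testing one by one.
worse = all(single_level_expectation(0.5, k) > 1.0 for k in range(2, 51))
print(worse)  # prints: True
```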
3.2. The Double-Levels Granulation. After the objects of the domain are granulated by the method of Section 3.1, the original object space becomes a new quotient space in which each group has $k_1$ objects. According to the falsity and truth preserving principles in quotient space theory, if a group is tested to be abnormal, it can be granulated into many smaller subgroups. The double-levels granulation is shown in Figure 2.
Then, the probability distribution of the double-levels granulation is discussed as follows.
If each group contains $k_1$ objects and is tested once in the 1st layer, the average classification times is $1/k_1$ for each object. Similarly, the average classification times of each object is $1/k_2$ in the 2nd layer. When a subgroup contains $k_2$ objects and is tested to be abnormal, every object in this subgroup has to be retested one by one, so the total classification times of each object in the 2nd layer is equal to $1/k_2 + 1$.
For simplicity, suppose that every group in the 1st layer is subdivided into two subgroups, which contain $k_{21}$ and $k_{22}$ objects, respectively, in the 2nd layer.
The classification time is shown in Table 2 (M represents the testing result which is abnormal and ✓ represents normal).
Case 1. If a group is tested to be normal in the 1st layer, the total classification times of this group is $1$.

Case 2. If a group is tested to be abnormal in the 1st layer, and in the 2nd layer its first subgroup is tested to be abnormal while the other subgroup is tested to be normal, the total classification times of this group is $1 + 2 + k_{21}$.

Case 3. If a group is tested to be abnormal in the 1st layer, and in the 2nd layer its first subgroup is tested to be normal while the other subgroup is tested to be abnormal, the total classification times of this group is $1 + 2 + k_{22}$.

Case 4. If a group is tested to be abnormal in the 1st layer, and both of its subgroups are tested to be abnormal in the 2nd layer, the total classification times of this group is $1 + 2 + k_{21} + k_{22}$.

Suppose each group contains $k_1$ objects in the 1st layer and every subgroup has $k_2$ objects in the 2nd layer. Suppose that the average classification times of each object is a random variable $X_2$; then the probability distribution of $X_2$ is shown in Table 3.
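Thus, in the 2nd layer, the mathematical expectation of $X_2$, which is the average classification times of each object, is obtained as follows:

$$E_2(X_2) = \frac{1}{k_1} q^{k_1} + \left(\frac{1}{k_1} + \frac{1}{k_2}\right)\left(1 - q^{k_1}\right) q^{k_2} + \left(\frac{1}{k_1} + \frac{1}{k_2} + 1\right)\left(1 - q^{k_1}\right)\left(1 - q^{k_2}\right) = \frac{1}{k_1} + \left(1 - q^{k_1}\right)\left(\frac{1}{k_2} + 1 - q^{k_2}\right). \quad (6)$$

Once the number of granulation levels increases to 2, the average classification times of each object is further reduced, for instance, when $p = 0.001$ and $N = 10000$.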
As we know, the minimum expectation of the total classification times is about 628 with $k_1 = 32$ in the single-level granulation. According to (6) and Lemma 6, $E_2(X_2)$ reaches its minimum value when $k_2 = 16$. The minimum mathematical expectation of each object's average classification times is obtained from (6) with $k_1 = 32$, $k_2 = 16$, and $q = 0.999$:

$$E_2(X_2) = \frac{1}{32} + \left(1 - 0.999^{32}\right)\left(\frac{1}{16} + 1 - 0.999^{16}\right).$$

The mathematical expectation of classification times is reduced by 96.62% compared with the traditional method and by 46.18% compared with the single-level granulation method. Next we discuss the $i$-levels granulation ($i = 3, 4, 5, \ldots, L$).
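The double-levels case can be evaluated in the same way. This is a hedged sketch, assuming that formula (6) takes the form $E_2 = 1/k_1 + (1 - q^{k_1})(1/k_2 + 1 - q^{k_2})$; the function name `double_level_expectation` is illustrative, and only second-layer sizes that divide $k_1 = 32$ evenly are compared.

```python
def double_level_expectation(p: float, k1: int, k2: int) -> float:
    """Expected tests per object for two layers with group sizes k1 > k2."""
    q = 1.0 - p
    return 1.0 / k1 + (1.0 - q ** k1) * (1.0 / k2 + 1.0 - q ** k2)

N, p, k1 = 10000, 0.001, 32

# Compare the equal-sized subdivisions of a 32-object group.
candidates = [2, 4, 8, 16]
best_k2 = min(candidates, key=lambda k2: double_level_expectation(p, k1, k2))
total_two_levels = N * double_level_expectation(p, k1, best_k2)
total_one_level = N * (1.0 + 1.0 / k1 - (1.0 - p) ** k1)

print(best_k2, round(total_two_levels))  # prints: 16 337
print(total_two_levels < total_one_level)  # prints: True
```

Under this closed form the expected total is roughly 337 tests for 10000 objects, which is close to the savings quoted above; small differences can come from rounding.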

3.3. The i-Levels Granulation.
For the blood analysis case, the granulation strategy in the $i$th layer is determined by the known numbers of objects of each group in the previous layers (namely, $k_1, k_2, \ldots, k_{i-1}$ are known and only $k_i$ is unknown). Following the double-levels granulation method, suppose that the classification times of each object is a random variable $X_i$ in the $i$-levels granulation; the probability distribution of $X_i$ is shown in Table 4.
Obviously, the sum of probability distribution is equal to 1 in each layer.

Proof.
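Case 1 (the single-level granulation). One has

$$q^{k_1} + \left(1 - q^{k_1}\right) = 1.$$

Case 2 (the double-levels granulation). One has

$$q^{k_1} + \left(1 - q^{k_1}\right) q^{k_2} + \left(1 - q^{k_1}\right)\left(1 - q^{k_2}\right) = 1.$$

Case 3 (the $i$-levels granulation). One has

$$q^{k_1} + \left(1 - q^{k_1}\right) q^{k_2} + \cdots + \left(1 - q^{k_1}\right) \cdots \left(1 - q^{k_{i-1}}\right) q^{k_i} + \left(1 - q^{k_1}\right) \cdots \left(1 - q^{k_i}\right) = 1.$$

The proof is completed.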
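The telescoping sum can also be checked numerically. The sketch below enumerates the per-object distribution layer by layer, assuming that a group reaching layer $j$ is normal with probability $q^{k_j}$ and abnormal with probability $1 - q^{k_j}$; the function name `level_distribution` and the sample group sizes are illustrative only.

```python
from math import isclose

def level_distribution(p, sizes):
    """Return (classification times, probability) pairs for one object."""
    q = 1.0 - p
    pairs = []
    cost = 0.0   # tests per object accumulated so far
    reach = 1.0  # probability that the current layer is reached
    for k in sizes:
        cost += 1.0 / k
        pairs.append((cost, reach * q ** k))  # group normal: stop here
        reach *= 1.0 - q ** k                 # group abnormal: go deeper
    pairs.append((cost + 1.0, reach))         # last layer: retest one by one
    return pairs

p, sizes = 0.001, [32, 16, 8]
pairs = level_distribution(p, sizes)
total_probability = sum(prob for _, prob in pairs)
expectation = sum(value * prob for value, prob in pairs)

print(isclose(total_probability, 1.0))  # prints: True
print(len(pairs))                       # prints: 4
```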

Definition 12 (classification times expectation of granulation).
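In a probability quotient space, a multilevels granulation model is established on the domain $U = \{x_1, x_2, \ldots, x_n\}$, which is a nonempty finite set; the health rate is $q$, the maximum number of granulation levels is $L$, and the number of objects of each group in the $i$th layer is $k_i$, $i = 1, 2, \ldots, L$. Then the average classification times of each object in the $i$th layer is

$$E_i(X_i) = \frac{1}{k_1} + \sum_{j=2}^{i} \frac{1}{k_j} \prod_{m=1}^{j-1} \left(1 - q^{k_m}\right) + \prod_{m=1}^{i} \left(1 - q^{k_m}\right). \quad (11)$$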
In this paper, we mainly focus on establishing a minimum granulation expectation model of classification times by the multigranulation computing method. For simplicity, the mathematical expectation of classification times is used as the measure for comparing searching efficiency. According to Lemma 5, the multilevels granulation model can simplify the complex problem only if the prevalence rate $p \in (0, 1 - e^{-e^{-1}})$ in the blood analysis case.
Theorem 13. Let the prevalence rate $p \in (0, 0.3)$; if a group is tested to be abnormal in the 1st layer (namely, this group contains abnormal objects), the average classification times of each object is further reduced by subdividing this group once again.
Theorem 13 illustrates that the classification times can be reduced by continuing to granulate the abnormal groups into the 2nd layer when $k_1 > 1$. We next attempt to prove that the total classification times is further reduced by continuously granulating the abnormal groups into the $i$th layer until the number of objects in a group is no less than 1.

Theorem 14. Suppose the prevalence rate $p \in (0, 0.3)$. If a group is tested to be abnormal (namely, this group contains abnormal objects), the average classification times of each object is reduced by continuously subdividing the abnormal group until the number of objects in its subgroups is no less than 1.
Proof. The expectation difference between the $(i-1)$-levels granulation $E_{i-1}(X_{i-1})$ and the $i$-levels granulation $E_i(X_i)$ reflects their relative efficiency. Under the conditions $e^{-e^{-1}} < q < 1$ and $1 \le k_i < k_{i-1}$, and according to (11), the expectation difference $E_{i-1}(X_{i-1}) - E_i(X_i)$ is as follows:

$$E_{i-1}(X_{i-1}) - E_i(X_i) = \left(1 - q^{k_1}\right) \times \cdots \times \left(1 - q^{k_{i-1}}\right) \left[1 - \left(\frac{1}{k_i} + 1 - q^{k_i}\right)\right].$$

Because $(1 - q^{k_1}) \times \cdots \times (1 - q^{k_{i-1}}) > 0$ is known and, according to Lemma 5 and $k_i \ge 1$, $(1/k_i + 1 - q^{k_i}) < 1$, we obtain $E_{i-1}(X_{i-1}) - E_i(X_i) > 0$.

Theorem 14 shows that this method continuously improves the searching efficiency in the process of granulating abnormal groups from the 1st layer to the $i$th layer, because $E_{i-1}(X_{i-1}) - E_i(X_i) > 0$ always holds. However, it is found that the classification times cannot be reduced when the number of objects in an abnormal group is less than or equal to 4, so the objects of such an abnormal group should be tested one by one. In order to achieve the best efficiency, we next explore how to determine the optimum granulation, namely, how to determine the optimum number of objects of each group and how to obtain the optimum number of granulation levels.
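The sign of the expectation difference can be illustrated numerically. This is a sketch under the assumption that the $i$-levels expectation accumulates $1/k_j$ weighted by the probability of reaching layer $j$, plus the probability that the last layer is abnormal; the function name `expectation` and the chosen group sizes are hypothetical.

```python
from math import isclose, prod  # math.prod needs Python 3.8 or later

def expectation(p, sizes):
    """Expected tests per object for the given per-layer group sizes."""
    q = 1.0 - p
    total, reach = 0.0, 1.0
    for k in sizes:
        total += reach / k        # one group test shared by k objects
        reach *= 1.0 - q ** k     # probability the group is abnormal
    return total + reach          # abnormal at the last layer: retest

p = 0.001
q = 1.0 - p
sizes = [32, 16, 8]

gains = []
for i in range(2, len(sizes) + 1):
    gain = expectation(p, sizes[:i - 1]) - expectation(p, sizes[:i])
    k_i = sizes[i - 1]
    # Difference predicted by the product form used in the proof.
    predicted = prod(1.0 - q ** k for k in sizes[:i - 1]) * (q ** k_i - 1.0 / k_i)
    gains.append((gain, predicted))

print(all(isclose(g, pr) and g > 0 for g, pr in gains))  # prints: True
```

The gain from adding layer $i$ is positive exactly when $q^{k_i} > 1/k_i$, which is the condition $(1/k_i + 1 - q^{k_i}) < 1$ in another form.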

3.4. The Optimum Granulation.
Finding an appropriate granularity space for dealing with a complex problem is both difficult and essential. It requires us not only to keep the integrity of the original information but also to simplify the complex problem. In this paper, we take the blood analysis case as an example to explain how to obtain the optimum granularity space. Suppose the condition $e^{-e^{-1}} < q < 1$ always holds.
Case 1 (granulating abnormal groups from the 1st layer to the 2nd layer). (a) If $k_1$ is an even number, every group that contains $k_1$ objects in the 1st layer is subdivided into two subgroups in the 2nd layer.

Scheme 15. Suppose one subgroup of the 2nd layer has $x$ ($1 \le x < k_1/2$) objects; according to formula (6), the expectation of classification times for each object is $E_2(x)$. The other subgroup then has $(k_1 - x)$ objects, so the expectation of classification times for each object is $E_2(k_1 - x)$. The average expectation of classification times for each object in the 2nd layer is obtained from $E_2(x)$ and $E_2(k_1 - x)$.

Scheme 16. Suppose every abnormal group in the 1st layer is evenly subdivided into two subgroups; namely, each subgroup has $k_1/2$ objects in the 2nd layer. According to formula (6), the average expectation of classification times for each object in the 2nd layer is $E_2(k_1/2)$.

The expectation difference between the above two schemes reflects their relative efficiency. In order to prove that Scheme 16 is more efficient than Scheme 15, we only need to prove that the average expectation under Scheme 16 is smaller than that under Scheme 15.

Proof. Let $f(x) = q^x$ ($e^{-e^{-1}} < q < 1$); the required inequality then follows from Lemma 7. The proof is completed.
Therefore, if every group in the 1st layer has $k_1$ objects ($k_1$ is an even number and $k_1 > 1$) and needs to be subdivided into two subgroups, Scheme 16 is more efficient than Scheme 15.
The experimental results in Table 5 verify the above conclusion. Let $p = 0.004$ and $k_1 = 16$. When every subgroup contains 8 objects in the 2nd layer, the expectation of classification times for each object attains its minimum value, where $k_{21}$ is the number of objects of one subgroup in the 2nd layer, $k_{22}$ is the number of objects of the other subgroup, and $E_2$ is the corresponding expectation of classification times for each object.
(b) If $k_1$ is an even number, every group that contains $k_1$ objects in the 1st layer is subdivided into three subgroups in the 2nd layer.

Scheme 17. In the 2nd layer, if the first subgroup has $x$ ($1 \le x < k_1/2$) objects, the average expectation of classification times for each object is $E_2(x)$. If the second subgroup has $y$ ($1 \le y < k_1/2$) objects, the expectation of classification times for each object is $E_2(y)$. The third subgroup then has $(k_1 - x - y)$ objects, and the average expectation of classification times for each object is $E_2(k_1 - x - y)$. The average expectation of classification times for each object in the 2nd layer is obtained from $E_2(x)$, $E_2(y)$, and $E_2(k_1 - x - y)$.

Similarly, it is easy to prove that Scheme 16 is also more efficient than Scheme 17. In other words, we only need to prove that the average expectation under Scheme 16 is smaller than that under Scheme 17.

Proof. Let $f(x) = q^x$ ($e^{-e^{-1}} < q < 1$); the required inequality then follows from Lemma 7. The proof is completed.
Therefore, if every group in the 1st layer, which contains $k_1$ objects ($k_1$ is an even number and $k_1 > 1$), needs to be subdivided into three subgroups, Scheme 16 is more efficient than Scheme 17.
The experimental results in Table 6 verify the above conclusion. Let $p = 0.004$ and $k_1 = 16$. When every subgroup contains 8 objects in the 2nd layer, the average expectation of classification times for each object reaches its minimum value. In Table 6, the first row stands for the number of objects of the first group in the 2nd layer, the first column stands for the number of objects of the second group, and each entry stands for the corresponding average expectation of classification times. For example, (1, 1, 7.7143) expresses that the numbers of objects of the three groups are 1, 1, and 14, respectively, and the average classification times for each object is $E_2 = 0.077143$ in the 2nd layer.
(c) When an abnormal group contains $k_1$ (even) objects and needs to be further granulated into the 2nd layer, Scheme 16 still has the best efficiency.
There are two granulation schemes. In Scheme 18, the abnormal groups in the 1st layer are randomly subdivided into $s$ ($s < k_1$) subgroups; in Scheme 16, the abnormal groups in the 1st layer are evenly subdivided into two subgroups.
Scheme 18. Suppose an abnormal group is subdivided into $s$ ($s < k_1$) subgroups. The first subgroup has $x_1$ ($1 \le x_1 < k_1/2$) objects, and the average expectation of classification times for each object is $E_2(x_1)$; the 2nd subgroup has $x_2$ ($1 \le x_2 < k_1/2$) objects, and the average expectation of classification times for each object is $E_2(x_2)$; in general, the $i$th subgroup has $x_i$ ($1 \le x_i < k_1/2$) objects, and the average expectation of classification times for each object is $E_2(x_i)$; finally, the $s$th subgroup has $x_s$ ($1 \le x_s < k_1/2$) objects, and the average expectation of classification times for each object is $E_2(x_s)$. Hence, the average expectation of classification times for each object in the 2nd layer is obtained from $E_2(x_1), E_2(x_2), \ldots, E_2(x_s)$.

Similarly, in order to prove that Scheme 16 is more efficient than Scheme 18, we only need to prove that the average expectation under Scheme 16 is smaller than that under Scheme 18.

Proof. Let $f(x) = q^x$ ($e^{-e^{-1}} < q < 1$); the required inequality then follows from Lemma 7. The proof is completed.
Therefore, when every abnormal group in the 1st layer, which contains $k_1$ objects ($k_1$ is an even number and $k_1 > 1$), needs to be granulated into many subgroups, Scheme 16 is more efficient than the other schemes.
(d) In a similar way, when every abnormal group in the 1st layer, which contains $k_1$ objects ($k_1$ is an odd number and $k_1 > 1$), is granulated into many subgroups, the best scheme is to subdivide every abnormal group as uniformly as possible into two subgroups: namely, each subgroup contains $(k_1 - 1)/2$ or $(k_1 + 1)/2$ objects in the 2nd layer.
Case 2 (granulating abnormal groups from the 1st layer to the $i$th layer).

Theorem 19. In the $i$th layer, if the number of objects of each abnormal group is more than 4, then the total classification times can be reduced by continuing to subdivide each abnormal group into two subgroups that contain, as far as possible, equal numbers of objects. Namely, if each group contains $k_i$ objects in the $i$th layer, then each subgroup contains $k_i/2$ or $(k_i - 1)/2$ or $(k_i + 1)/2$ objects in the $(i + 1)$th layer.
Proof. In the multigranulation method, the number of objects of each subgroup in the next layer is determined by the number of objects of the group in the current layer. In other words, the number of objects of each subgroup in the $(i + 1)$th layer is determined by the known number of objects of each group in the $i$th layer.
According to the recursive idea, the process of granulating an abnormal group from the $i$th layer into the $(i + 1)$th layer is similar to that from the 1st layer into the 2nd layer. It is known that, when granulating an abnormal group from the 1st layer into the 2nd layer, the most efficient way is to subdivide the abnormal group in the current layer as uniformly as possible into two subgroups in the next layer. Therefore, the most efficient way is also to subdivide each abnormal group in the $i$th layer as uniformly as possible into two subgroups in the $(i + 1)$th layer. The proof is completed.
Based on $k_1$, which is the optimum number of objects of each group in the 1st layer, the number of objects of each group in every subsequent layer and the optimum number of granulation levels can then be determined. Namely, in this multilevels granulation method, the final structure of granulating an abnormal group from the 1st layer to the last layer is similar to a binary tree, and the original space can be granulated into a structure that contains many binary trees.
According to Theorem 19, the multigranulation strategy can be used to solve the blood analysis case. For different prevalence rates, such as $p_1 = 0.01$, $p_2 = 0.001$, $p_3 = 0.0001$, $p_4 = 0.00001$, and $p_5 = 0.000001$, the best searching strategy, that is, the number of objects of each group in the different layers, is shown in Table 7 ($k_i$ stands for the number of objects of each group in the $i$th layer and $E_i$ stands for the average expectation of classification times for each object).

Theorem 20. In the above multilevels granulation method, if $p$, which is the prevalence rate of a sickness (or the negative sample ratio in the domain), tends to 0, the average classification times for each object tends to $1/k_1$; in other words, the following equation always holds:

$$\lim_{p \to 0} E_i(X_i) = \frac{1}{k_1}.$$

Proof. According to Definition 12, let $q = 1 - p$, $q \to 1$; we have

$$E_i(X_i) = \frac{1}{k_1} + \left(1 - q^{k_1}\right)\left[\frac{1}{k_2} + \cdots\right].$$

According to Lemma 6, $k_1 = [1/\sqrt{p + p^2/2}]$ or $k_1 = [1/\sqrt{p + p^2/2}] + 1$. Then let $\Delta_i = E_i(X_i) - 1/k_1$, so $\lim_{q \to 1} \Delta_i = 0$ and $\lim_{q \to 1} E_i(X_i) = 1/k_1$. The proof is completed.
Let $\ell = (1/k_1)/E_i$. The changing trend between $\ell$ and $E_i$ with the variable $p$ is shown in Figure 3.
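The halving strategy and the ratio of $1/k_1$ to the expectation can be explored with a small script. It is a sketch under several assumptions that are not taken from the tables of this paper: the first-layer size is the better of $[1/\sqrt{p + p^2/2}]$ and that value plus 1, odd group sizes are halved with integer division, subdivision stops once a group has at most 4 objects, and the function names are hypothetical.

```python
from math import floor, sqrt

def expectation(p, sizes):
    """Expected tests per object for the given per-layer group sizes."""
    q = 1.0 - p
    total, reach = 0.0, 1.0
    for k in sizes:
        total += reach / k
        reach *= 1.0 - q ** k
    return total + reach

def first_layer_size(p):
    """Pick k1 from floor(1/sqrt(p + p^2/2)) or that value plus 1."""
    base = floor(1.0 / sqrt(p + p * p / 2.0))
    return min((base, base + 1), key=lambda k: expectation(p, [k]))

def halving_schedule(k1):
    """Halve the group size layer by layer while a group has more than 4 objects."""
    sizes = [k1]
    while sizes[-1] > 4:
        sizes.append(sizes[-1] // 2)  # simplification: odd sizes are floored
    return sizes

ratios = []
for p in (0.01, 0.001, 0.0001, 0.00001, 0.000001):
    sizes = halving_schedule(first_layer_size(p))
    e = expectation(p, sizes)
    ratios.append((1.0 / sizes[0]) / e)  # how close E is to 1/k1
    print(p, sizes, round(e, 6), round(ratios[-1], 4))
```

In this sketch the ratio grows toward 1 as the prevalence rate shrinks, which agrees with the trend stated in Theorem 20.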

3.5. Binary Classification of Multigranulation Searching Algorithm.
In this paper, an efficient binary classification of multigranulation searching algorithm (BCMSA) is proposed by discussing the best testing strategy for the blood analysis case. The algorithm is described as follows.
Output. The expectation $E$ of the average classification times of each object.
The algorithm flowchart is shown in Figure 4.

Complexity Analysis of Algorithm 21. In this algorithm, the best case is that the prevalence rate $p$ tends to 0; the classification times are then $N \times E_i \approx N/k_1 \approx N \times \sqrt{p + p^2/2}$, which tends to 1, so the time complexity of computing is $O(1)$. The worst case is that $p$ tends to 0.3; the classification times then tend to $N$, so the time complexity of computing is $O(N)$.

Comparative Analysis on Experimental Results
In order to verify the efficiency of the proposed BCMSA, suppose there are two large domains with $N_1 = 1 \times 10^4$ and $N_2 = 100 \times 10^4$ objects and five different prevalence rates, $p_1 = 0.01$, $p_2 = 0.001$, $p_3 = 0.0001$, $p_4 = 0.00001$, and $p_5 = 0.000001$. In the experiment on the blood analysis case, the number "0" stands for a sick sample (negative sample) and "1" stands for a healthy sample (positive sample). $N$ numbers are randomly generated to stand for all the objects of the domain, where the probability of generating "0" is $p$ and the probability of generating "1" is $1 - p$. The binary classifier sums all numbers in a group (subgroup): if the sum is smaller than the number of objects in the group, the group contains at least one "0" and is tested to be abnormal; if the sum equals the number of objects in the group, the group is tested to be normal.
The experimental environment is 4 GB of RAM, a 2.5 GHz CPU, and the Windows 8 operating system; the programming language is Python. The experimental results are shown in Table 8.
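The experimental setup described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' program. It assumes that a sick sample is encoded as 1 (the reverse of the encoding stated above) so that a positive group sum flags an abnormal group. The layer sizes `[100, 10]` are hypothetical and are not the optimal sizes of Table 7, and all function names are illustrative.

```python
import random


def generate_domain(n, p, rng):
    """Generate n labels; 1 marks a sick sample, drawn with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


def group_is_abnormal(group):
    """Group-level binary classifier: abnormal if the group sum is positive."""
    return sum(group) > 0


def count_classifications(samples, layer_sizes):
    """Count classifier calls needed to decide every object in samples.

    layer_sizes lists the (hypothetical) group size used at each layer;
    once the list is exhausted, remaining objects are classified one by one.
    """
    if not layer_sizes:
        return len(samples)  # one classification per object
    size = layer_sizes[0]
    calls = 0
    for start in range(0, len(samples), size):
        group = samples[start:start + size]
        calls += 1  # classify the whole group once
        if len(group) > 1 and group_is_abnormal(group):
            # Skip deeper layers that would not actually split this group.
            deeper = [s for s in layer_sizes[1:] if s < len(group)]
            calls += count_classifications(group, deeper)
    return calls


if __name__ == "__main__":
    rng = random.Random(0)
    domain = generate_domain(10**4, 0.0001, rng)
    print("traditional :", count_classifications(domain, []))
    print("single-level:", count_classifications(domain, [100]))
    print("multilevel  :", count_classifications(domain, [100, 10]))
```

The counts depend on the random seed and on the chosen layer sizes, so this sketch illustrates the counting procedure and does not reproduce Table 8.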
In Table 8, item "$p$" stands for the prevalence rate, item "levels" stands for the granulation levels of the different methods, and item "$E(X)$" stands for the average expectation of classification times for each object. Item "$k_1$" stands for the number of objects in each group at the 1st layer. Item "ℓ" stands for how close $E(X)$ is to $1/k_1$. $N_1 = 1 \times 10^4$ and $N_2 = 1 \times 10^6$ stand for the numbers of objects in the two original domains, respectively. Items "Method 9" and "Method 10" stand for the efficiency improvement of Method 11 over Method 9 and Method 10, respectively.
From Table 8, when $N_1 = 1 \times 10^4$ and $p = 0.0001$, confirming the testing results of all objects takes 10000 classification times with Method 9 (traditional method), 201 classification times with Method 10 (single-level grouping method), and only 113 classification times with Method 11 (multilevel grouping method). Obviously, the proposed algorithm is more efficient than Method 9 and Method 10: the classification times are reduced by 98.89% and 47.33%, respectively. At the same time, as the probability $p$ gradually decreases, BCMSA becomes increasingly more efficient than Method 10, and ℓ tends to 100%; that is to say, the average classification times for each object tend to $1/k_1$ in BCMSA. In addition, BCMSA can save 0%∼50% of the classification times compared with Method 10. The efficiency of Method 10 (single-level granulation method) and Method 11 (multilevel granulation method) is shown in Figure 5; the $x$-axis stands for the prevalence rate (or the negative sample rate) and the $y$-axis stands for the average expectation of classification times for each object.

In this paper, BCMSA is proposed, and it can greatly improve searching efficiency when dealing with complex searching problems. If there is a binary classifier that is valid not only for a single object but also for a group of many objects, the efficiency of searching all objects will be enhanced by BCMSA, as in the blood analysis case. At the same time, it may play an important role in promoting the development of granular computing. Of course, this algorithm also has some limitations. For example, if the prevalence rate of a sickness (or the occurrence rate of an event) satisfies $p > 0.3$, it has no advantage over the traditional method. In other words, the original problem need not be subdivided into many subproblems when $p > 0.3$. And when the prevalence rate of a sickness (or the negative sample rate in the domain) is unknown, this algorithm needs to be further improved so that it can adapt to the new environment.
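The limit of about 0.3 can be recovered with a short calculation for a single grouping layer. It is a sketch under the assumption that a group of $k$ objects costs one classification plus $k$ further classifications when it is abnormal, so that the expected classifications per object are $E(k) = 1/k + 1 - (1 - p)^k$; this expression is assumed here rather than quoted from the section.

```latex
E(k) < 1
\iff (1 - p)^k > \frac{1}{k}
\iff p < 1 - k^{-1/k},
\qquad
1 - k^{-1/k} \approx
\begin{cases}
0.2929, & k = 2,\\
0.3066, & k = 3,\\
0.2929, & k = 4,\\
0.2752, & k = 5.
\end{cases}
```

The bound is largest at $k = 3$, so under this assumption grouping can only beat one-by-one classification when $p < 1 - 3^{-1/3} \approx 0.3066$, which is consistent with the limit of about 0.3.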

Conclusions
With the development of intelligent computation, multigranulation computing has gradually become an important tool for processing complex problems. In particular, in the process of knowledge cognition, granulating a huge problem into many small subproblems simplifies the original complex problem, and these subproblems are then dealt with in different granularity spaces [64]. This hierarchical computing model is very effective for obtaining a complete or approximate solution of the original problem because of its divide-and-conquer idea. Recently, many scholars have paid attention to efficient searching algorithms based on granular computing theory. For example, an algorithm for dealing with complex networks on the basis of the quotient space model was proposed by L. Zhang and B. Zhang [65]. In this paper, by combining the hierarchical multigranulation computing model with the principles of probability statistics, a new efficient binary classification of multigranulation searching algorithm is established on the basis of mathematical expectation, and the searching algorithm is defined recursively in multigranulation spaces. The experimental results show that the proposed method is effective and can save many classification times. These results may promote the development of intelligent computation and speed up the application of multigranulation computing.

However, this method also has some shortcomings. On the one hand, it places a strict limitation on the probability value $p$, namely, $p < 0.3$. If $p > 0.3$, the proposed searching algorithm is probably not the most effective method, and improved methods need to be found. On the other hand, it needs a binary classifier that is valid not only for a single object but also for a group of many objects. Finally, as the probability value $p$ decreases (even approaching zero), the mathematical expectation of searching time for every object gradually approaches $1/k_1$.
In our future research, we will focus on how to granulate a huge granule space when the probability value of each object is unknown, and we will try to establish an effective searching algorithm for the case in which the probability of negative samples in the domain is not known. We hope this research can promote the development of artificial intelligence.

Figure 3: The changing trend of $\ell$ and $E_i$ with $p$.

Figure 5: Comparative analysis between 2 kinds of methods.
Then we can have a new problem space $([X], [f], [T])$. The coarser universe $[X]$ can be defined by an equivalence relation $R$ on $X$. That is, an element in $[X]$ is equivalent to a set of elements in $X$, namely, an equivalence class. So $[X]$ consists of all equivalence classes induced by $R$. From $f$ and $T$, we can define the corresponding $[f]$ and $[T]$. Then we have a new space $([X], [f], [T])$ called a quotient space of the original space $(X, f, T)$.

(Falsity preserving principle [61].) If a problem $[A] \to [B]$ on the quotient space $([X], [f], [T])$ has no solution, then the problem $A \to B$ on its original space $(X, f, T)$ has no solution either. In other words, if $A \to B$ on $(X, f, T)$ has a solution, then $[A] \to [B]$ on $([X], [f], [T])$ has a solution as well.

Table 1 :
The probability distribution of  1

Table 2: The average classification times of each object with different results.

Table 4 :
The probability distribution of   .

Table 5: The changes of the average expectation with different numbers of objects in two groups.

Table 6: The changes of the average expectation with different numbers of objects in three groups.

Table 8: Comparative result of efficiency among 3 kinds of methods.