Uncertainty Analysis of Knowledge Reductions in Rough Sets

Uncertainty analysis is a vital issue in intelligent information processing, especially in the age of big data. Rough set theory has attracted much attention in this field since it was proposed. Relative reduction is an important problem in rough set theory. Different relative reductions have been investigated to preserve specific classification abilities in various applications. This paper examines the uncertainty of five different relative reductions in four aspects, namely, the relationships among reducts, boundary region granularity, rule variation, and uncertainty measure, using a constructed decision table.


Introduction
Uncertainty is associated with randomness, fuzziness, vagueness, roughness, and incomplete knowledge. Probability theory, information theory, fuzzy set theory, evidence theory, rough set theory, and so forth have been used for uncertainty analysis [1]. Rough set theory (RST), a relative newcomer among the methods for representing uncertainty, has gained increasing attention from both the theoretical and the applied points of view. Uncertainty is inherent in the real world, and many factors affect the uncertainty of practical problems. Examining and analyzing the characteristics of uncertainty in various situations is vital in intelligent information processing.
Attribute reduction is an important problem in rough set theory. A reduct is a minimal subset of attributes that provides the same description or classification ability as the entire set of attributes [2]. A relative reduct in the Pawlak rough set model is a minimal subset of attributes that keeps the positive region of the classification unchanged; it can also be defined as a minimal subset of condition attributes that provides the same classification ability as the entire set of condition attributes in a decision table. Many objective functions for attribute reduction have been proposed and examined in order to find reducts that preserve a specific property [3-10].
The relationships among various reductions have attracted the interest of some researchers [3, 11-15]. This also motivates us to investigate how uncertainty varies under a reduction and how it differs among various relative reducts.
Two types of uncertainty are inherent in RST. The first type arises from the indiscernibility relation. It increases as the granularity of the partition becomes coarser. The second type results from the approximation regions of rough sets, since the lower approximation is the certain region and the upper approximation is the possible region. This suggests a direction for analyzing uncertainty in relative reductions.
In this paper, the uncertainty of five relative reductions is investigated in four aspects. Firstly, the relationships among different reducts are summarized from existing research results. Secondly, the boundary areas of the five reductions are described in detail using a constructed decision table. Thirdly, the quality and the variation of the classification rules generated from these five relative reductions are discussed. Lastly, a definition of an uncertainty measure for classification is proposed.
In the remainder of this paper, some related notations are reviewed and uncertainty analysis of five relative reductions is discussed.

Basic Notions
An information table is used by Pawlak for raw data representation [2]. For classification tasks, we consider a special information table with a set of decision attributes. Such an information table is also called a decision table. In this part, only the related notations are reviewed.

Definition 1.
A decision table is given as the following tuple: $DT = (U, At = C \cup D, \{V_a \mid a \in At\}, \{I_a \mid a \in At\})$, where $U$ is a finite nonempty set of objects, $At$ is a finite nonempty set of attributes including a set $C$ of condition attributes that describe the objects and a set $D$ of decision attributes that indicate the classes of objects, $V_a$ is a nonempty set of values of $a \in At$, and $I_a : U \to V_a$ is an information function that maps an object in $U$ to exactly one value in $V_a$.
For simplicity, we assume $D = \{d\}$ in this paper, where $d$ is a decision attribute which labels the decision for each object. A table with multiple decision attributes can be easily transformed into a table with a single decision attribute by considering the Cartesian product of the original decision attributes.
A partition $\pi_D = U/D = \{D_1, D_2, \ldots, D_{|U/D|}\}$ is used to denote the partition of the universe $U$ defined by the decision attribute set $D$, and $\pi_A$ denotes the partition defined by a condition attribute set $A \subseteq C$. The equivalence classes induced by the partition $\pi_A$ are the basic blocks for constructing the Pawlak rough set approximations. For a decision class $D_i \in \pi_D$, the lower and upper approximations of $D_i$ with respect to a partition $\pi_A$ are defined by Pawlak [2]:
$$\underline{apr}_A(D_i) = \{x \in U \mid [x]_A \subseteq D_i\}, \qquad \overline{apr}_A(D_i) = \{x \in U \mid [x]_A \cap D_i \neq \emptyset\},$$
where $[x]_A$ is the equivalence class of $\pi_A$ that contains $x$.
Decision rule generation is an important issue in a decision table. The rough set approach offers all solutions to the problem of decision table simplification, and many real applications have been found in various fields [17]. Rule sets may include certain and uncertain (or possible) rules. Theorem 3 can be written as Theorem 5 in the view of decision rules.
Theorem 5. In a decision table $DT$, the rule sets have the following properties.
(1) If the decision table is consistent, then every rule in the rule sets is a certain rule.
(2) If the decision table is inconsistent, then the rule sets consist of certain rules and uncertain rules, and the number of uncertain rules is greater than zero.
Definition 6. In a decision table $DT$, the confidence of the uncertain rules of a decision class $D_i \in \pi_D$ with respect to a condition attribute set $A \subseteq C$ and an equivalence class $X \in \pi_A$ is defined as
$$conf(D_i \mid X) = \frac{card(X \cap D_i)}{card(X)},$$
where $0 \le conf(D_i \mid X) \le 1$ obviously. $conf(D_i \mid X)$ is written as $conf$ when $X$ is known. All the rules with respect to $X$ for different $D_i$ are called one pair of rules. The confidences of one pair of rules sum to 1. However, the range of the index $i$ may differ for different blocks $X$ in the partition $\pi_A$. A rule with premises (preconditions) $X$, conclusions (postconditions) $D_i$, and confidence $conf$ is denoted as $X \xrightarrow{conf} D_i$. If $conf = 1$, then the rule is certain; if $conf = 0$, then the rule is impossible; otherwise, it is possible or uncertain.
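The confidence described in Definition 6 can be illustrated with a minimal Python sketch. The function name `rule_confidences` and the decision labels are hypothetical and are not taken from Table 1; the sketch only assumes that the confidence of a rule is the fraction of objects in an equivalence class that fall into a given decision class.

```python
from collections import Counter
from fractions import Fraction


def rule_confidences(block_decisions):
    """Return the confidence of each rule X -> D_i for one equivalence class X.

    block_decisions lists the decision value of every object in X, so the
    confidence of X -> D_i is card(X ∩ D_i) / card(X).
    """
    counts = Counter(block_decisions)
    size = len(block_decisions)
    return {decision: Fraction(n, size) for decision, n in counts.items()}


# Hypothetical inconsistent block: 11 objects with two different decisions.
uncertain_pair = rule_confidences(["minor"] * 3 + ["serious"] * 8)

# Hypothetical consistent block: every object has the same decision.
certain_rule = rule_confidences(["minor"] * 2)

# The confidences within one pair of rules always sum to 1.
pair_total = sum(uncertain_pair.values())
```

Exact fractions are used instead of floating-point numbers so that the confidences of one pair of rules can be checked to sum to exactly 1.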
In order to express the degree of completeness and incompleteness of knowledge $R$ about a nonempty set $X$, a pair of measures is defined by Pawlak [18]. They are the accuracy measure $\alpha_R(X) = card(\underline{R}X)/card(\overline{R}X)$ and the $R$-roughness $\rho_R(X) = 1 - \alpha_R(X) = 1 - card(\underline{R}X)/card(\overline{R}X) = card(BN_R(X))/card(\overline{R}X)$, where $\underline{R}X$, $\overline{R}X$, and $BN_R(X)$ are the lower approximation, the upper approximation, and the boundary region of $X$. Pawlak also defines a measure to evaluate the quality of approximation of a classification $F = \{X_1, \ldots, X_n\}$ by $R$ [18], $\gamma_R(F) = \sum_{i=1}^{n} card(\underline{R}X_i)/card(U)$; a variant of $\gamma_R(F)$ can be defined as $\rho_R(F) = 1 - \gamma_R(F)$ and referred to as the $R$-roughness of classification or approximation. It expresses the degree of inexactness of knowledge $R$ about the classification $F$. This measure of uncertainty depends only on the numbers of objects in the boundary region and in the universe of discourse, not on the granularity of the discourse. A modified measure, the rough entropy, is proposed by Beaubouef et al. [19], which combines the roughness of a set with the granularity of the approximation:
$$E_r(X) = -\rho_R(X) \left[ \sum_{i=1}^{n} Q_i \log(P_i) \right].$$
The Scientific World Journal
According to Beaubouef et al., the summation ranges over the $n$ equivalence classes that belong either wholly or in part to the set $X$. There is no ordering associated with individual class members. Therefore the probability of any one value of the class being named is the reciprocal of the number of elements in the class. If $c_i$ is the cardinality of, or the number of elements in, equivalence class $i$ and all members of a given equivalence class are equal, then $P_i = 1/c_i$ represents the probability of one of the values in class $i$. $Q_i$ denotes the probability of equivalence class $i$ within the universe; it is computed by taking the number of elements in class $i$ and dividing by the total number of elements in all equivalence classes combined.
The rough entropy $E_r(X)$ indicates the uncertain percentage with granularity taken into account, but it is only defined for a set. By generalizing it to a partition, a definition of rough entropy for a classification is proposed in Section 3.4.
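The rough entropy of a set, as described above, multiplies the roughness of the set by a granularity term built from the class probabilities. The following Python sketch is one possible reading under stated assumptions: the logarithm is taken to base 2, the sum runs over the equivalence classes that intersect the set (those belonging wholly or in part to it), and the class probability within the universe is the class size divided by the size of the universe. The function name `rough_entropy` and the toy partition are hypothetical.

```python
import math


def rough_entropy(blocks, target, base=2):
    """Rough entropy of the set `target` for a partition given as `blocks`.

    Assumptions of this sketch: log base 2, and the sum is taken over the
    blocks that intersect `target`.
    """
    target = set(target)
    universe_size = sum(len(block) for block in blocks)
    lower = [b for b in blocks if set(b) <= target]   # wholly inside target
    upper = [b for b in blocks if set(b) & target]    # wholly or partly inside
    card_lower = sum(len(b) for b in lower)
    card_upper = sum(len(b) for b in upper)
    if card_upper == 0:
        raise ValueError("target must intersect at least one block")
    roughness = 1 - card_lower / card_upper
    # Q_i = c_i / |U| and P_i = 1 / c_i for each block of cardinality c_i.
    granularity = sum(
        (len(b) / universe_size) * math.log(1 / len(b), base) for b in upper
    )
    return -roughness * granularity


# Hypothetical partition of a five-object universe.
toy_blocks = [{1, 2}, {3, 4}, {5}]
entropy_rough_set = rough_entropy(toy_blocks, {1, 2, 3})  # roughness 1/2
entropy_crisp_set = rough_entropy(toy_blocks, {1, 2})     # roughness 0
```

A set that is a union of equivalence classes has roughness 0, so its rough entropy vanishes regardless of how fine the partition is.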
To support the discussion, a road traffic accident table was constructed by our group and is shown in Table 1. It is obviously inconsistent: of its 15 data items, only items 1 and 2 are consistent, and the others are inconsistent.
From Table 1, the partition $\pi_D$ derived can be written as
The partition, or set of equivalence classes, IND($C$) derived by the condition attribute set $C$ can be written as
The positive and boundary regions of the decision attribute set $D$ with respect to the condition attribute set $C$ are as follows.
The positive region consists of items 1 and 2, and the boundary region consists of the remaining 13 items. The quality of classification, or the degree of dependency of $D$ on $C$, defined by Pawlak is, for Table 1, $\gamma_C(D) = 2/15$. The $C$-roughness of classification, whose value denotes the uncertain percentage of the discourse, is $\rho_C(D) = 1 - \gamma_C(D) = 13/15$.
Notes: $conf$ denotes confidence; the other symbols are the same as in Table 1.
The quality of classification, $\gamma_C(D)$, denotes the percentage of objects in the universe that are certainly classified to a decision class. We can use $\rho_C(D)$, namely, $1 - \gamma_C(D)$, to denote the percentage of uncertainly classified objects in the universe and call it the quality of classification uncertainty. For an inconsistent decision table, these two parameters remain constant under reductions that preserve the positive region.
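The positive region, the boundary region, and the two quality parameters discussed above can be computed as in the following Python sketch. The five-object table, its attribute names, and the helper names `partition` and `regions` are hypothetical and do not reproduce Table 1; the sketch only assumes that a block of the condition partition is in the positive region exactly when all of its objects share one decision value.

```python
from collections import defaultdict
from fractions import Fraction


def partition(table, attrs):
    """Group objects that agree on every attribute in `attrs`."""
    blocks = defaultdict(set)
    for obj, row in table.items():
        blocks[tuple(row[a] for a in attrs)].add(obj)
    return list(blocks.values())


def regions(table, cond_attrs, dec_attr):
    """Return (positive region, boundary region) of the decision attribute."""
    positive, boundary = set(), set()
    for block in partition(table, cond_attrs):
        decisions = {table[obj][dec_attr] for obj in block}
        # A block with a single decision value is consistent (certain).
        (positive if len(decisions) == 1 else boundary).update(block)
    return positive, boundary


# Hypothetical decision table with condition attributes a, b and decision d.
toy_table = {
    "x1": {"a": 0, "b": 0, "d": "yes"},
    "x2": {"a": 0, "b": 1, "d": "no"},
    "x3": {"a": 1, "b": 1, "d": "yes"},
    "x4": {"a": 1, "b": 1, "d": "no"},
    "x5": {"a": 1, "b": 1, "d": "no"},
}

pos, bnd = regions(toy_table, ["a", "b"], "d")
gamma = Fraction(len(pos), len(toy_table))  # quality of classification
rho = 1 - gamma                             # roughness of classification
```

In this toy table, the objects x3, x4, and x5 share the same condition values but carry different decisions, so they form the boundary region.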
The rules generated from Table 1 are listed in Table 2. These are primitive rules obtained from the original data. A rule has the form $X \xrightarrow{conf} D_i$, denoting that the confidence of the classification rule is $conf$ for the values that the objects of $X$ take on the condition attribute set. The first rule, in line one from the top, combines the values of six condition attributes in its premise and has confidence 1.
From all 15 data items, 10 rules are obtained, as shown in Table 2. The first two of them are certain rules; the other 8 are possible (uncertain) rules and appear in four pairs.

Uncertainty Analysis of Relative Reductions
There are various definitions of relative reductions in rough set theory [2-8, 11, 12]. All of them can be classified into three categories: region preservation, information preservation, and partition preservation. Five relative reductions [2-6] chosen from the three categories will be investigated for uncertainty analysis in the following four aspects (Table 3). They are the classical positive region reduction proposed by Pawlak [2], the mutual information preservation reduction proposed by Miao [4], the distribution reduction proposed by Slezak [5], the general decision reduction proposed by Kryszkiewicz [6], and the boundary partition reduction proposed by Miao et al. [3]. The cited articles may be consulted for a fuller explanation of these relative reductions.

The Relationships among Relative Reducts.
A reduct is a minimal subset of attributes that provides the same description or classification ability as the entire set of attributes. The relationships among different relative reducts have been studied by many researchers. In order to describe the relationships among the five property preservation reductions, a definition is given first.
Definition 7. Let $R_1$ and $R_2$ be reducts of the reductions that preserve properties $P_1$ and $P_2$, respectively.
(1) If $R_2$ always preserves property $P_1$ simultaneously, then $R_2$ is defined as a $P_1$ reduct container.
(2) If $R_2$ is a $P_1$ reduct container, then the $P_2$ reduction is defined as the stronger one and the $P_1$ reduction as the weaker one.
Because the reduct of a property preservation reduction is not unique, we cannot say that the reduct of a weaker reduction is included in the reduct of a stronger one. We can only say that if the $P_2$ reduction is stronger than the $P_1$ reduction, then there exist reducts $R_2$ and $R_1$ of the $P_2$ and $P_1$ reductions such that $R_1 \subseteq R_2$.
According to Definition 7, some research results can be assembled here as the following theorems for inconsistent decision tables. The proofs of these theorems can be found in the cited references. In particular, a distribution preservation reduction is equivalent to a mutual information preservation reduction. In the rest of this paper, the distribution preservation reduction is used to stand for both of these reductions.
For clarity, the relationships among the five relative reductions in decision tables are shown in Figure 1. A two-headed arrow expresses the equivalence relation and a one-way arrow expresses the container relation. As shown in the figure, a distribution preservation reduct is equivalent to a mutual information preservation reduct. A boundary partition preservation reduct must be a container for any of the other four reducts. In other words, among the five relative reductions, the boundary partition preservation reduction is the strongest and the positive region reduction is the weakest.
The five reducts of the decision table (Table 1) can be computed according to their reduction definitions. The computed results are as follows.
(i) A positive region preservation reduct is
(ii) A general decision preservation reduct is
(iii) A distribution or a mutual information preservation reduct is
(iv) A boundary partition preservation reduct and boundary region are
In the process of computing the above five reducts, the attribute adding guideline is adopted, so the reducts satisfy a set inclusion relationship as the reduction becomes stronger. That is, the reduct in (i) is contained in the reduct in (ii), which is contained in the reduct in (iii), which in turn is contained in the reduct in (iv). This inclusion does not hold in general, because a reduction may have more than one reduct.

The Boundary Areas of Five Reductions.
Now, the problem will be investigated further from another view. From Theorem 3, we know that the universe of discourse is the union of the positive region and the boundary region. Because all five reductions discussed in this paper preserve the same positive region as the original table (see Section 3.1), the boundary regions of the five reductions are also the same when viewed as object sets. Now we examine the granularity of the boundary regions from the viewpoint of granular computing. The computation results are as follows.
The boundary regions of the five relative reductions are illustrated in Figure 2. By observing the granularity of the boundary regions, we can draw the following conclusion.
Proposition 12. If the $P_1$ reduction is weaker than the $P_2$ reduction and $R_1 \subseteq R_2$, then the knowledge on the $P_2$ reduction is finer than the knowledge on the $P_1$ reduction, or equivalently the knowledge on the $P_1$ reduction is coarser than the knowledge on the $P_2$ reduction.
In RST, if $R_1$ and $R_2$ are two attribute sets and $R_1 \subseteq R_2$, then we have IND($R_1$) $\supseteq$ IND($R_2$). So Proposition 12 follows directly from Pawlak's definition of a knowledge granule in [18].
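The inclusion between indiscernibility relations used in this argument can be checked on a small example. In the Python sketch below, the table, the attribute names, and the helper name `ind` are hypothetical; the indiscernibility relation is represented explicitly as a set of ordered pairs of objects, which is only practical for small universes.

```python
def ind(table, attrs):
    """Indiscernibility relation of `attrs` as a set of ordered object pairs."""
    objects = list(table)
    return {
        (x, y)
        for x in objects
        for y in objects
        if all(table[x][a] == table[y][a] for a in attrs)
    }


# Hypothetical information table with two condition attributes.
toy_table = {
    "x1": {"a": 0, "b": 0},
    "x2": {"a": 0, "b": 1},
    "x3": {"a": 1, "b": 0},
    "x4": {"a": 1, "b": 1},
    "x5": {"a": 1, "b": 1},
}

coarse = ind(toy_table, ["a"])        # smaller attribute set, larger relation
fine = ind(toy_table, ["a", "b"])     # larger attribute set, smaller relation
is_refinement = fine <= coarse        # set inclusion between the relations
```

Adding the attribute b splits the blocks induced by a alone, so fewer pairs of objects remain indiscernible and the knowledge becomes finer.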

The Classification Rules in Five Relative Reductions.
The quality of classification rules may be measured by rule confidence. Confidence is used to estimate the degree of validity of rules. It is the fraction of objects satisfying both the premises and the conclusions of a rule among all objects satisfying the premises of the rule. Within one rule pair, the higher the confidence of a rule is, the more valid the rule is.
All rules generated from the five relative reducts and their confidences are shown in Table 4. In that table, the condition attributes aligned with the cell labeled boundary partition reduct, in the first row from the bottom, form the reduct of the boundary partition reduction, $RED_{BPA}$, which contains five condition attributes. The confidences of the ten rules generated from the boundary partition reduct are listed in the first column from the right under the confidence heading. They are the same in the second and third rows from the bottom.
The change of uncertain rules across different relative reductions follows a regular pattern. The number of pairs of possible rules equals the number of distinct value combinations of the reduct attributes among the inconsistent data in the table. The number of rules in one pair equals the number of decision values that occur with the same reduct attribute values.
Uncertain rules that result from different reductions can be compared only under certain conditions. Because the reduct in rough sets is not unique, only uncertain rules that result from reducts related by set inclusion can be compared. This condition is satisfied for the rules shown in Table 4.
Considering the rules from two different reductions, pairs of uncertain rules in the stronger reduction may be merged into pairs of uncertain rules in the weaker one. In other words, the number of pairs of uncertain rules in the stronger reduction is larger than or equal to that in the weaker one. The number of rules in one pair remains unchanged or becomes larger in the weaker reduction because more decision values are involved.
The rule confidences of a newly formed pair change because some rules are combined into new pairs and the data items supporting those rules are combined at the same time. As an example, when changing from the GDR reduction to the POS reduction, two pairs of possible rules become one pair. When the rule pairs with confidences (3/11, 8/11) and (1/2, 1/2) are combined, the number of reduct values decreases from two to one and the number of decision values increases from two to three. Combining the data items of the previous two pairs that have the same decision value, the new rule confidences are calculated. They are 9/13 = (8 + 1)/(11 + 2), 3/13 = (3 + 0)/(11 + 2), and 1/13 = (1 + 0)/(11 + 2), as shown in Table 4.
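The arithmetic of this example can be reproduced with a short Python sketch. The support counts below are read off the confidences quoted in the text (3 and 8 objects out of 11, and 1 and 1 objects out of 2); the decision labels `d1`, `d2`, `d3` and the function name `merge_pairs` are hypothetical placeholders, since the actual decision values appear only in Table 4.

```python
from collections import Counter
from fractions import Fraction


def merge_pairs(*pairs):
    """Merge rule pairs given as {decision value: number of supporting objects}.

    The supports of rules with the same decision value are added, and the
    confidences are recomputed over the combined block of objects.
    """
    support = Counter()
    for pair in pairs:
        support.update(pair)  # adds counts for matching decision values
    size = sum(support.values())
    return {decision: Fraction(n, size) for decision, n in support.items()}


# Two pairs under the stronger reduction, with confidences (3/11, 8/11)
# and (1/2, 1/2); the decision value d2 occurs in both pairs.
pair_one = {"d1": 3, "d2": 8}
pair_two = {"d2": 1, "d3": 1}

# One pair with three rules under the weaker reduction.
merged = merge_pairs(pair_one, pair_two)
```

The merged pair has three rules instead of two, and its confidences again sum to 1 over the combined 13 objects.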
The rule confidence only makes sense within its own pair. Let us inspect the last row in the upper part: the rule confidence based on positive region preservation is 1/13 and that based on general decision preservation is 1/2. This does not mean that the latter rule is more reliable than the former. Rule confidence states that if an object meets the preconditions, then the conclusion holds with that probability. Rule confidences in different pairs cannot be compared.

The Uncertainty Measure of Rough Sets.
The uncertainty measure is an important subject in RST. Various definitions of uncertainty measures have been proposed by many researchers [16, 18-21]. All these measures can reflect the uncertainty of relative reductions to some extent.
Based on the rough entropy definition of Beaubouef et al. [19] from 1998 and the $R$-roughness of classification or approximation in Section 2, we give Definition 13, the rough entropy for a classification task. Both the roughness and the granularity are taken into account in this measure.