A New Knowledge Characteristics Weighting Method Based on Rough Set and Knowledge Granulation

Knowledge characteristics weighting plays an extremely important role in classifying knowledge effectively and accurately. Most existing characteristics weighting methods rely heavily on the experts' a priori knowledge, while the rough set weighting method does not rely on it and can therefore meet the need for objectivity. However, the current rough set weighting methods cannot obtain a balanced redundant characteristic set. Too much redundancy might cause inaccuracy, and too little redundancy might cause ineffectiveness. In this paper, a new method based on rough set and knowledge granulation theories is proposed to ascertain the characteristics weight. Experimental results on several UCI data sets demonstrate that the weighting method can effectively avoid subjective arbitrariness and avoid taking nonredundant characteristics as redundant characteristics.


Introduction
In data mining, in order to classify knowledge effectively, we need to properly assess the knowledge characteristics sets. Therefore, it is very important to compute the weights of the characteristics sets. Weights reflect the role of characteristics in the classification process and directly affect the validity and accuracy of the classifier. The common weighting methods include the expert scoring method, the fuzzy statistics method [1][2][3], the Analytic Hierarchy Process (AHP) method [4][5][6], and the Principal Component Analysis (PCA) method [7,8]. All of these methods require a priori knowledge.
In recent years, the rough set method has been studied as a way to calculate the characteristics weight. For instance, based on the concept of characteristics importance, Wang et al. proposed a method to determine the characteristics weights. However, this method did not consider the influence of decision characteristics on conditional characteristics [37]. Cao and Liang combined the characteristics importance of the rough set with the experts' a priori knowledge to determine the characteristics weight [38]. This method unified the subjective a priori knowledge with the objective situation, but it ignored the internal differences within the equivalent partitions. Therefore, some nonredundant characteristics would be treated as redundant characteristics. Bao et al. proposed a method for ascertaining the characteristics weight based on rough set and conditional information entropy. It prevents some nonredundant characteristics from being treated as redundant characteristics. However, in this method the characteristics importance obtained for redundant characteristics was higher than that obtained for nonredundant characteristics [39]. Zhu and Chen constructed a priority queue of characteristics importance to improve on the work of Bao et al. They presented a weighting method based on the conditional information entropy and rough set, but that method also involved additional costs [40].
Computational Intelligence and Neuroscience
In this paper, a new knowledge characteristics weighting method based on the rough set and knowledge granulation theory is proposed. The accuracy of equivalent partitions in knowledge characteristics is studied and the difference in equivalence classes is analyzed. Experimental results on several UCI data sets confirm our theoretical results. By comparing the numerical results with those of the AHP method, the PCA method, and two rough set based methods, we can draw the conclusion that our new method can effectively avoid taking nonredundant characteristics as redundant characteristics and can improve classification accuracy.
The rest of the paper is structured as follows. Some basic concepts about rough set are briefly introduced in Section 2. In Section 3, a new knowledge characteristics weighting method is proposed and studied. Some experimental results are given in Section 4 to show the effectiveness of the proposed weighting method. Finally, we end this paper with some conclusions in Section 5.

Rough Set.
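Rough set theory takes knowledge as a partition of the objects domain. The equivalence relations and the equivalence classes produced by them are valid information or knowledge about the objects domain. Let $U$ denote the universe of objects, which is a nonempty set. $R \subseteq U \times U$ is an equivalence relation on $U$, called the knowledge on the universe $U$. The equivalence relation $R$ divides $U$ into disjoint subsets; the resulting partition is denoted by $U/R$. For a set $X \subseteq U$, the lower and upper approximations of $X$ with respect to $R$ are
$$\underline{R}(X) = \bigcup \{ Y \in U/R : Y \subseteq X \}, \qquad \overline{R}(X) = \bigcup \{ Y \in U/R : Y \cap X \neq \emptyset \}.$$
The lower approximation of the set $X$ is also defined as the positive region $\mathrm{POS}_R(X) = \underline{R}(X)$. The set $\mathrm{BND}_R(X) = \overline{R}(X) - \underline{R}(X)$ is referred to as the $R$-boundary region of $X$. Obviously, the larger the boundary region is, the rougher the set $X$ divided by $R$ is. Therefore, the roughness of $X$ with respect to the equivalence relation $R$ can be defined; it is denoted by
$$\rho_R(X) = 1 - \frac{|\underline{R}(X)|}{|\overline{R}(X)|}. \tag{1}$$
The accuracy of $X$ with respect to the equivalence relation $R$ is defined as
$$\alpha_R(X) = \frac{|\underline{R}(X)|}{|\overline{R}(X)|}, \tag{2}$$
where $|\cdot|$ represents the number of elements in a collection and $0 \le \alpha_R(X) \le 1$. When $\alpha_R(X) = 1$, $X$ is called an exact set with respect to the equivalence relation $R$. When $\alpha_R(X) < 1$, $X$ is called a rough set with respect to the equivalence relation $R$.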
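The approximations, roughness, and accuracy described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the eight-object universe, and the target set `X` are hypothetical and do not come from the paper, and the target set is assumed to be nonempty so that the upper approximation is nonempty.

```python
from fractions import Fraction


def partition(universe, key):
    """Group objects into equivalence classes: x ~ y iff key(x) == key(y)."""
    classes = {}
    for x in universe:
        classes.setdefault(key(x), set()).add(x)
    return list(classes.values())


def lower_approx(classes, target):
    """Union of the equivalence classes fully contained in the target set."""
    return set().union(*(c for c in classes if c <= target))


def upper_approx(classes, target):
    """Union of the equivalence classes that intersect the target set."""
    return set().union(*(c for c in classes if c & target))


def accuracy(classes, target):
    """|lower| / |upper|; assumes a nonempty target set."""
    return Fraction(len(lower_approx(classes, target)),
                    len(upper_approx(classes, target)))


def roughness(classes, target):
    """One minus the accuracy."""
    return 1 - accuracy(classes, target)


# Hypothetical universe with classes {1,2}, {3,4}, {5,6}, {7,8}.
U = range(1, 9)
classes = partition(U, key=lambda x: (x - 1) // 2)
X = {1, 2, 3, 5, 6}

print(lower_approx(classes, X))   # positive region of X
print(upper_approx(classes, X))
print(accuracy(classes, X), roughness(classes, X))
```

Here the class `{3, 4}` forms the boundary region, since it intersects `X` without being contained in it.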
Suppose $P$ and $Q$ are two equivalence relations on the universe $U$. If $P \subseteq Q$, then $[x]_P \subseteq [x]_Q$ for all $x \in U$. Thus, the equivalence classes $U/P$ can be considered finer than the equivalence classes $U/Q$, and the knowledge $(U, P)$ is more accurate than the knowledge $(U, Q)$; see [37][38][39][40] for details.

Knowledge Granularity.
By the rough set theory, knowledge is related to the equivalence classes, which shows that knowledge is granular. For this reason, some scholars identify the structure of knowledge granularity by the equivalence classes and calculate the size of the knowledge granularity [39].
Assume that $R$ is an equivalence relation on $U$ and $K = (U, R)$ is a knowledge base. The knowledge granularity of $R$ is
$$GD(R) = \frac{|R|}{|U|^2}, \tag{3}$$
where $|R|$ is the number of pairs in $R \subseteq U \times U$. If $(u, v) \in R$, the objects $u$ and $v$ belong to the same equivalence class of the equivalence relation $R$; they are indiscernible. Obviously, the smaller $GD(R)$ is, the stronger the discernibility of $R$ becomes. Let $U/R = \{X_1, X_2, \ldots, X_n\}$ be the equivalence classes. According to (3), the knowledge granularity can be expressed as
$$GD(R) = \frac{1}{|U|^2} \sum_{i=1}^{n} |X_i|^2. \tag{4}$$
The discernibility of $R$ is defined as
$$\mathrm{Dis}(R) = 1 - GD(R). \tag{5}$$
According to (4), there is
$$\mathrm{Dis}(R) = 1 - \frac{1}{|U|^2} \sum_{i=1}^{n} |X_i|^2.$$
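As a small numerical illustration of knowledge granularity, the sketch below assumes the common formulation in which the granularity of a partition $\{X_1, \ldots, X_n\}$ of $U$ is $\sum_i |X_i|^2 / |U|^2$ and the discernibility is one minus the granularity. The function names and the six-object partitions are hypothetical.

```python
from fractions import Fraction


def granularity(classes, n):
    """Knowledge granularity: sum of squared class sizes over |U|^2 (assumed form)."""
    return Fraction(sum(len(c) ** 2 for c in classes), n ** 2)


def discernibility(classes, n):
    """Discernibility, assumed here to be 1 - granularity."""
    return 1 - granularity(classes, n)


n = 6  # size of a hypothetical universe {1, ..., 6}
coarsest = [{1, 2, 3, 4, 5, 6}]            # everything indiscernible
mixed = [{1, 2, 3}, {4, 5}, {6}]
finest = [{i} for i in range(1, 7)]        # everything discernible

for name, classes in [("coarsest", coarsest), ("mixed", mixed), ("finest", finest)]:
    print(name, granularity(classes, n), discernibility(classes, n))
```

Under these assumptions, finer partitions give a smaller granularity and a larger discernibility, which matches the qualitative statement in the text.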

Knowledge Characteristics Weighting Based on Rough Set and Knowledge Granulation
Cao and Liang calculated the characteristics weights as the cardinality of the positive region divided by the cardinality of the universe of discourse, but the results may be inaccurate [38]. Consider two knowledge characteristics $c_1$ and $c_2$. The weight of the knowledge characteristic $c_1$ is $w_{c_1}(X) = \mathrm{Card}(\mathrm{POS}_{c_1}(X))/\mathrm{Card}(U) = 4/9$, in which $\mathrm{Card}(\cdot)$ represents the number of elements in a collection. The weight of $c_2$ is likewise $w_{c_2}(X) = \mathrm{Card}(\mathrm{POS}_{c_2}(X))/\mathrm{Card}(U) = 4/9$. Thus $w_{c_2}(X) = w_{c_1}(X)$. The characteristics weights are the same, although the equivalence classes of these two characteristics are different.
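The issue can be reproduced with a hypothetical example; the nine-object universe, the target set, and the two partitions below are invented for illustration and are not the example used in the paper. Both partitions yield the same positive-region weight of 4/9, yet their knowledge granularities (assumed here to be the sum of squared class sizes over $|U|^2$) differ.

```python
from fractions import Fraction


def lower_approx(classes, target):
    """Union of the equivalence classes fully contained in the target set."""
    return set().union(*(c for c in classes if c <= target))


def positive_region_weight(classes, target, universe):
    """Card(POS(X)) / Card(U), the positive-region based weight."""
    return Fraction(len(lower_approx(classes, target)), len(universe))


def granularity(classes, universe):
    """Sum of squared class sizes over |U|^2 (assumed definition)."""
    return Fraction(sum(len(c) ** 2 for c in classes), len(universe) ** 2)


U = set(range(1, 10))                      # hypothetical universe of 9 objects
X = {1, 2, 3, 4, 5}                        # hypothetical target set
classes_c1 = [{1, 2}, {3, 4}, {5, 6}, {7, 8, 9}]
classes_c2 = [{1, 2, 3, 4}, {5, 6, 7, 8, 9}]

w1 = positive_region_weight(classes_c1, X, U)
w2 = positive_region_weight(classes_c2, X, U)
gd1 = granularity(classes_c1, U)
gd2 = granularity(classes_c2, U)
print(w1, w2)     # identical weights
print(gd1, gd2)   # different granularities
```

The positive-region ratio cannot tell the two characteristics apart, whereas a granularity-based measure can, which is the motivation for the approach that follows.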
To solve the problems above, we use the knowledge granularity to study the relationship between the various subsets in the complex sets of the equivalence classes and propose a method based on the knowledge granularity to compute the discernibility of knowledge characteristics. The knowledge characteristics weights are then determined according to the relationship between the discernibility and the weights of knowledge characteristics.

The Discernibility of Knowledge Characteristics.
We first give a definition about the discernibility of the knowledge characteristics.
Definition 1. Suppose that $K = (U, C)$ is a knowledge base, $R$ is the equivalence relation, and $c \in C$ is a characteristic. The discernibility of $c$ is denoted by $\mathrm{Dis}(c)$.
By Definition 1, the larger $\mathrm{Dis}(c)$ is, the stronger the discerning ability of $c$ becomes. When two objects are selected at random from $U$, there are $|U|^2$ possible ways. After the characteristic $c$ is added to $C - \{c\}$, the number of equivalence classes changes from $|U/(C - \{c\})|$ to $|U/C|$, which is greater than or equal to the original number. Thus, the discerning ability is improved, and the discernibility increases.
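The effect of adding a characteristic can be checked on a toy table. The sketch below is hypothetical throughout: the attribute names `a` and `b`, the four rows, and the use of one minus the knowledge granularity as the discernibility of a set of characteristics are assumptions made for illustration, not the formula of Definition 1.

```python
from collections import defaultdict
from fractions import Fraction


def classes(rows, attrs):
    """Partition row indices by their values on the given characteristics."""
    groups = defaultdict(list)
    for i, row in enumerate(rows):
        groups[tuple(row[a] for a in attrs)].append(i)
    return list(groups.values())


def discernibility(rows, attrs):
    """1 - (sum of squared class sizes) / |U|^2 (assumed definition)."""
    n = len(rows)
    return 1 - Fraction(sum(len(c) ** 2 for c in classes(rows, attrs)), n * n)


# Hypothetical table with two characteristics.
rows = [
    {"a": 0, "b": 0},
    {"a": 0, "b": 1},
    {"a": 1, "b": 0},
    {"a": 1, "b": 0},
]

before = classes(rows, ["a"])        # partition induced by {a}
after = classes(rows, ["a", "b"])    # partition after adding b
print(len(before), discernibility(rows, ["a"]))
print(len(after), discernibility(rows, ["a", "b"]))
```

Adding `b` splits one class into two, so the number of equivalence classes does not decrease and the discernibility does not decrease either.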

Method to Determine Characteristics Weight.
To propose our new characteristics weight method, we further give two definitions.
According to (2) and (5), a formulation of $\mathrm{KCDis}(c)$ is obtained. Based on Definitions 3 and 4, a new formula is then defined to compute the weight of each characteristic $c$. The detailed computation process is shown in Algorithm 1.

Experimental Results
In this section, experiments are used to show the effectiveness of our new method. The data used in our experiments come from the Pima Indians Diabetes Data Set, which includes a total of 768 cases. Of these, 392 are valid; the remaining cases have missing characteristics values. Note that the Pima Indians Diabetes Data Set is no longer available due to permission restrictions.
In actual computations, we use these 392 cases for experimentation. The condition characteristics include "plasma glucose concentration at 2 hours in an oral glucose tolerance test", "diastolic blood pressure (mm Hg)", "triceps skin fold thickness (mm)", "2-hour serum insulin (mu U/ml)", and "body mass index (weight in kg/(height in m)^2)". The data set is given in Table 1, where $c_1$, $c_2$, $c_3$, $c_4$, and $c_5$ denote these condition characteristics, respectively, and $d$ stands for the decision characteristic "class variable (0 or 1)". The condition characteristics values are then discretized into three or four levels; see Table 2. According to Algorithm 1, the characteristics weights of the proposed method are obtained; they are reported in Tables 3 and 4 together with the results of the other methods.
Two experiments are conducted to show the advantages of our new method. The first experiment compares different rough set based methods with our method. The second compares the AHP and PCA methods with our method. Both comparisons show that the proposed method is more effective than those methods.
In the first experiment, we choose two rough set-based methods for comparison. One calculates the characteristics weight based on the dependence in rough set theory. The other is based on rough sets and conditional information entropy.
In the knowledge base $K = (U, R)$ with $R = C \cap D$, the dependence of the characteristic $D$ is defined as $\gamma_C(D) = |\mathrm{POS}_C(D)|/|U|$. The characteristics importance is $\mathrm{Sig}(c) = \gamma_C(D) - \gamma_{C - \{c\}}(D)$. Then the characteristics weight is $w_1(c) = \mathrm{Sig}(c)/\sum_{c \in C} \mathrm{Sig}(c)$ [39]. By calculation, we have, for instance, $w_1(c_1) = 0.5$.
Table 3 lists the weighting results of the three methods based on rough set, and Figure 1 shows their comparison. Table 3 and Figure 1 show that, when the method based on the dependence of rough set and the method based on rough set and conditional information entropy are used to calculate the characteristics weights, the characteristics $c_2$ and $c_5$ are identified as redundant. When the proposed method is used, they are not. "Diastolic blood pressure (mm Hg)" and "body mass index (weight in kg/(height in m)^2)" are only weakly related to diabetes, but they are related nonetheless. From this point of view, the new method is more accurate than the other two rough set-based methods.
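The dependence-based weighting used as a baseline can be sketched as follows. The decision table is hypothetical (it is not the Pima data), and the helper names are invented for illustration. In this table the decision `d` depends on both `c1` and `c2`, but `c3` duplicates `c2`, so removing either one alone leaves the dependence unchanged and both receive zero importance.

```python
from collections import defaultdict
from fractions import Fraction


def blocks(rows, attrs):
    """Partition row indices by their values on the given characteristics."""
    groups = defaultdict(set)
    for i, row in enumerate(rows):
        groups[tuple(row[a] for a in attrs)].add(i)
    return list(groups.values())


def dependence(rows, cond, dec):
    """gamma_C(D) = |POS_C(D)| / |U|."""
    dec_blocks = blocks(rows, [dec])
    pos = sum(len(b) for b in blocks(rows, cond)
              if any(b <= d for d in dec_blocks))
    return Fraction(pos, len(rows))


def dependence_weights(rows, cond, dec):
    """Normalize Sig(c) = gamma_C(D) - gamma_{C-{c}}(D) into weights."""
    full = dependence(rows, cond, dec)
    sig = {c: full - dependence(rows, [a for a in cond if a != c], dec)
           for c in cond}
    total = sum(sig.values())
    if total == 0:
        raise ValueError("all importances are zero; weights are undefined")
    return {c: s / total for c, s in sig.items()}


# Hypothetical decision table: d = c1 XOR c2, and c3 is a copy of c2.
table = [
    {"c1": 0, "c2": 0, "c3": 0, "d": 0},
    {"c1": 0, "c2": 1, "c3": 1, "d": 1},
    {"c1": 1, "c2": 0, "c3": 0, "d": 1},
    {"c1": 1, "c2": 1, "c3": 1, "d": 0},
]
weights = dependence_weights(table, ["c1", "c2", "c3"], "d")
print(weights)
```

Although `c2` clearly carries information about `d`, this scheme assigns it a weight of zero, which illustrates how a characteristic that matters can be treated as redundant.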
In the second experiment, the AHP method and the PCA method are used to calculate the characteristics weight. We also compare their results with ours.
For the AHP method, we construct the analytic hierarchy matrix according to the opinion of medical experts [41]. Then we obtain weights including $w_3(c_1) = 0.0604$, $w_3(c_2) = 0.1012$, $w_3(c_3) = 0.3103$, and $w_3(c_4) = 0.1815$. For the PCA method, we select the representative variables through the transformation of multiple variables. Then the SPSS software is used to obtain the total variance explained and the component matrix. We take the variance contribution rates of the principal components as weights [41] and finally normalize them.
The weighting results are given in Table 4. Figure 2 shows the comparison between the proposed method and the two well-known methods. From Table 4 and Figure 2, it is easy to check that the ranking of the weights calculated with our method is $c_1 > c_4 > c_3 > c_5 > c_2$. This shows that there is a close relation between "plasma glucose concentration at 2 hours in an oral glucose tolerance test" and diabetes, and only a weak relation between "diastolic blood pressure (mm Hg)" and diabetes. As Figure 2 shows, these results are a synthetic optimization of the results calculated by AHP and PCA. According to the medical experts consulted, the results calculated by our method are more consistent with the actual situation.
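For readers unfamiliar with how AHP turns expert judgments into weights, the following sketch derives weights from a pairwise comparison matrix. The paper does not state which prioritization technique was applied, so the geometric-mean (row) method is used here as an assumption; it coincides with the principal-eigenvector weights when the matrix is perfectly consistent. The 3 x 3 matrix is hypothetical and is not the matrix built from the medical experts' opinions.

```python
import math


def ahp_weights(matrix):
    """Weights from a pairwise comparison matrix via row geometric means."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical, perfectly consistent judgments: item 1 is twice as
# important as item 2 and four times as important as item 3.
comparisons = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(comparisons)
print(weights)  # proportional to 4 : 2 : 1
```

Every entry of the comparison matrix is a subjective judgment, which is why the resulting weights inherit the experts' a priori knowledge.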
However, the Analytic Hierarchy Process (AHP) method is based on the subjective judgment of the experts, and the Principal Component Analysis (PCA) method needs to extract representative principal components and introduces additional a priori information and evaluation criteria. Therefore, these two methods cannot objectively reflect the weight distribution. The new method does not need prior knowledge, yet the obtained weights are in line with the actual situation.
From the above discussion, the weighting method based on rough set can avoid the arbitrariness of subjective judgment. In addition, the weighting method with granularity theory can effectively avoid taking nonredundant characteristics as redundant characteristics. We can conclude that our new method reasonably distributes the weight for each characteristic. The weights basically reflect the importance of each characteristic and can also objectively reflect the actual situation of the patient's body. Thus, the proposed method is a powerful method in knowledge classification.

Conclusions
Knowledge characteristics can help us have a good understanding of the knowledge base. The determination of knowledge characteristics weight can help us effectively classify the knowledge base, so as to achieve the purpose of knowledge management and decision making. In this paper, based on rough set theory and knowledge granularity theory, the weights of knowledge characteristics are determined. Experimental results show that the proposed method can effectively avoid taking nonredundant characteristics as redundant characteristics and can effectively determine the weights of knowledge characteristics.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.