Parallel Attribute Reduction Algorithm for Complex Heterogeneous Data Using MapReduce

College of Automation, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
Jiangsu Engineering Lab. of Big Data Analysis & Control for Active Distribution Network, Nanjing 210023, China
College of Information Engineering, Nanjing University of Finance and Economics, Nanjing 210023, China
School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200072, China


Introduction
With the rapid development of information technology, especially in sensing, communication, networking, and computation, the amount of data accumulated in many fields is increasing at a striking speed. The inestimable value of big data has become a common understanding in academia and industry [1], and big data technology development has been announced as a national strategy by many countries [2]. However, much of the vast array of data that reaches us may be chaotic, irrelevant, and redundant. How to extract the implicit information hidden in such complex information systems and express it in the form of explicit knowledge has been an active area of research over the past decades. In practice, rough sets [3] have been widely used as a mathematical tool for dealing with uncertain data. As one of the core research topics of rough set theory, attribute reduction removes redundant attributes and reduces data dimensionality while keeping the dependency between the decision attributes and the condition attributes of a decision table unchanged. Scholars have designed a large number of attribute reduction algorithms in recent years, which can generally be divided into three categories [4]: positive region [5-7], discernibility matrix [8-11], and information entropy [12-15]. However, these algorithms are typical serial procedures. Although such serial algorithms can handle small data efficiently, their computational complexity, which depends on the attribute number |C| and the sample size |U|, inevitably leads to low efficiency or even complete failure when facing massive data.
To solve this problem, some scholars have proposed parallel algorithms for high-dimensional or large-scale data. Based on the divide-and-conquer strategy, Xiao et al. [16] used parallel computing to divide the reduction task among multiple processors running simultaneously. Rough entropy was used to measure attribute significance by Lv et al. [17], and a parallel minimum reduction set algorithm was proposed. However, these algorithms require the whole dataset to be loaded into memory at once. To make up for this defect, the Google File System- (GFS-) based distributed file system and the MapReduce parallel programming model were adopted by Qian et al. [18, 19]. The decision table was divided into several subdecision tables, so that a large amount of data did not need to be loaded into a single memory bank when calculating attribute reduction. Moreover, the machines in the cluster can cooperate to solve problems that a single machine could not address. Zhang et al. [20] proposed a parallel method for computing rough set approximations based on the MapReduce technique to deal with massive data: the equivalence classes for each subdecision table were calculated in parallel in the Map step, and these equivalence classes were combined in the Reduce step if their information sets were the same. Qian et al. [21] further observed that the key to improving reduction efficiency is the effective computation of equivalence classes and attribute significance; consequently, a structure of <key, value> pairs was designed to speed up these computations, and the traditional attribute reduction process was parallelized with the MapReduce mechanism. However, all of the abovementioned algorithms are based on Pawlak's classic rough set model with an equivalence relation, which is suitable only for categorical data.
To break the limit of the equivalence relation, Lin [22] proposed the concept of the neighborhood model and adopted the neighborhood relation instead of the equivalence relation, which can directly deal with numerical data through neighborhood granulation of the universe. The monotonic relation between the positive region and the attribute set in the neighborhood rough set model was proved by Hu et al. [23, 24], and an attribute reduction algorithm with lower computational complexity, suitable for heterogeneous data including categorical and numerical attributes, was put forward. Qian et al. [25] extended Pawlak's rough set model to a multigranulation rough set (MGRS) model, where the set approximations are defined using multiple equivalence relations on the universe. Based on the pessimistic multigranulation rough set model, Sang and Qian [26] analyzed granular space selection under multiple granular spaces, defined an importance measure of the granular space, and designed a granular space reduction algorithm. Subsequently, Lin et al. [27] extended the neighborhood rough set model to multiple granular spaces and proposed the neighborhood multigranulation rough set (NMGRS) model by constructing the universe through a hierarchical division of attribute set sequences. Furthermore, Yong et al. [28] hashed a dataset by dividing the data into a series of hash buckets according to the Euclidean distance, which dramatically decreased the calculation time and reduced the time complexity of obtaining positive regions to O(m|U|); a quick and efficient attribute reduction algorithm with time complexity O(m²|U|) was also given. In addition, Qian et al. discussed the local rough set for dealing with big data [29, 30]. However, all of these algorithms based on extended rough set models are still serial computations.
To the best of our knowledge, performing parallel attribute reduction on complex and massive data is still a challenging task. In particular, the existing algorithms cannot effectively deal with complex heterogeneous data, which include categorical and numerical attributes, in a parallel way from the multiple granular computing perspective. Owing to the rampant existence of heterogeneous datasets in real-life applications, it is therefore necessary to investigate effective parallel approaches to this issue. For the purpose of parallelizing the traditional attribute reduction algorithm for complex heterogeneous data, the neighborhood multigranulation rough set model is considered in this paper, and the parallelization points of hashing, positive region calculation, and boundary object pruning are analyzed based on the MapReduce mechanism. Thereafter, a fast parallel attribute reduction algorithm is developed. The effectiveness and superiority of this parallel algorithm are demonstrated by theoretical analysis and comparison experiments.
Different from available algorithms, the contribution of this paper is twofold: (1) motivated by the aforementioned MapReduce technology, hash algorithm, and neighborhood multigranulation rough set model, parallelization methods of multiple granular spaces and hashing Map/Reduce functions for heterogeneous data are brought to light; (2) a neighborhood multigranulation rough set model-based parallel attribute reduction algorithm using MapReduce, which has never been done before, is proposed.
The paper is organized as follows. Section 2 outlines some preliminary knowledge. In Section 3, we present the parallelization strategies of multiple granular spaces for heterogeneous data and the parallel fast attribute reduction algorithm based on the neighborhood multigranulation rough set model. Next, experiments are conducted to evaluate the efficiency of the proposed algorithm in Section 4. Finally, in Section 5, we present the conclusion and future work.

Preliminary Knowledge
In this section, the 1-type and 2-type neighborhood multigranulation rough sets and the MapReduce programming model are briefly described.
2.1. Neighborhood Multigranulation Rough Set. The neighborhood rough set model uses a neighborhood relation in place of the equivalence relation, so it can directly process numerical data and heterogeneous data. To further process heterogeneous data from the perspective of multiple granular spaces and multiple levels of granularity, neighborhood rough set theory has been extended from a single attribute subset to multiple attribute subsets, and two types of neighborhood multigranulation rough set models have been developed [27].

1-Type Neighborhood Multigranulation Rough Sets (1-Type NMGRS)
Definition 1 (see [25]). Let <U, Δ> be a nonempty metric space; U = {x₁, x₂, …, xₙ} is a nonempty finite set of objects, called the universe. A closed ball taking x ∈ U as its center and δ as its radius is called the δ-neighborhood of x and is defined as

\delta(x) = \{ y \in U \mid \Delta(x, y) \le \delta \},

where δ ≥ 0 and Δ is a distance function. For two points in the universe, x_i = (x_{i1}, x_{i2}, …, x_{in}) and x_j = (x_{j1}, x_{j2}, …, x_{jn}), the distance function can usually be the Euclidean distance

\Delta(x_i, x_j) = \sqrt{\sum_{k=1}^{n} (x_{ik} - x_{jk})^2}.
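As a toy illustration of Definition 1 (the sample values here are hypothetical, not from the paper's decision table), the δ-neighborhood of a sample is just a closed-ball membership test under the Euclidean distance:

```python
import math

def euclidean(x, y):
    """Euclidean distance between two numerical samples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def delta_neighborhood(U, i, delta):
    """Indices of all samples lying within distance delta of sample i (a closed ball)."""
    return [j for j, y in enumerate(U) if euclidean(U[i], y) <= delta]

U = [(0.10, 0.20), (0.13, 0.22), (0.60, 0.80), (0.11, 0.19)]
print(delta_neighborhood(U, 0, 0.08))  # [0, 1, 3]: samples 0, 1, 3 fall inside the 0.08-ball
```

Note that a sample always belongs to its own δ-neighborhood, since Δ(x, x) = 0 ≤ δ.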
Definition 2 (see [25]). Let <U, Δ> be a nonempty metric space. When categorical and numerical attributes coexist, let B₁ and B₂ be the categorical and numerical attribute subsets, respectively. The neighborhood of x can be defined as

\delta_B(x) = \{ y \in U \mid \Delta_{B_1}(x, y) = 0 \wedge \Delta_{B_2}(x, y) \le \delta \}.

Definition 3 (see [25]). Given a decision system NIS = (U, C ∪ D, V, f, N), C is the set of condition attributes and D is the set of decision attributes, and C ∩ D = ∅. V = ∪_{a∈R} V_a is the set of attribute values and V_a is the domain of the attribute a. f: U × R → V is a function such that f(x, a) ∈ V_a for every a ∈ C and x ∈ U. N is the neighborhood relation. Let A ⊆ C be a categorical attribute set and B ⊆ C be a numerical attribute set, so A ∪ B ⊆ C is a mixed attribute set; U/A, U/B, and U/(A ∪ B) represent two partitions and a covering of the universe U, respectively. For any X ⊆ U, the optimistic multigranulation lower and upper approximations of X with respect to A and B in U are defined as

\underline{N}^{O}_{A+B}(X) = \{ x \in U \mid [x]_A \subseteq X \vee \delta_B(x) \subseteq X \},
\overline{N}^{O}_{A+B}(X) = \{ x \in U \mid [x]_A \cap X \ne \emptyset \wedge \delta_B(x) \cap X \ne \emptyset \},

whereas the pessimistic multigranulation lower and upper approximations of X are defined as

\underline{N}^{P}_{A+B}(X) = \{ x \in U \mid [x]_A \subseteq X \wedge \delta_B(x) \subseteq X \},
\overline{N}^{P}_{A+B}(X) = \{ x \in U \mid [x]_A \cap X \ne \emptyset \vee \delta_B(x) \cap X \ne \emptyset \}.

Compared with the 1-type neighborhood multigranulation rough sets, in which only a single neighborhood relation is used, multiple neighborhood relations are fully considered in the 2-type neighborhood multigranulation rough sets, denoted 2-type NMGRS by Qian et al. [25].
Definition 4 (see [27]). Given a decision system NIS = (U, C ∪ D, V, f, N), let N = {n₁, n₂, …, n_m} be a neighborhood m-relation on the universe U induced by A_i and B_i, where A_i is a categorical attribute subset and B_i is a numerical attribute subset. For any X ⊆ U, the optimistic lower and upper approximations of X in U are defined as

\underline{N}^{O}(X) = \{ x \in U \mid \exists i \le m,\; n_i(x) \subseteq X \},
\overline{N}^{O}(X) = \{ x \in U \mid \forall i \le m,\; n_i(x) \cap X \ne \emptyset \},

whereas the pessimistic multigranulation lower and upper approximations of X are defined as

\underline{N}^{P}(X) = \{ x \in U \mid \forall i \le m,\; n_i(x) \subseteq X \},
\overline{N}^{P}(X) = \{ x \in U \mid \exists i \le m,\; n_i(x) \cap X \ne \emptyset \}.

Definition 5 (see [31]). Given a decision system NIS = (U, C ∪ D, V, f, N), U/D is the partition of the universe U induced by the decision attribute D, and δ = {δ₁, δ₂, …, δₙ} is a set of n neighborhood radii. The attribute dependency of B_i for decision class Y with the neighborhood radius δ_j is defined as

\gamma^{\delta_j}_{B_i}(Y) = \frac{\left| POS^{\delta_j}_{B_i}(Y) \right|}{|U|}.

Definition 6 (see [31]). Given a decision system NIS = (U, C ∪ D, V, f, N) and an attribute a ∈ B_i, if \gamma_{B_i - \{a\}}(Y) = \gamma_{B_i}(Y), then regardless of whether the attribute a is removed from B_i, the decision positive region of the system is unchanged; in other words, the attribute a is redundant for B_i.

Definition 7 (see [19]). Given a decision system NIS = (U, C ∪ D, V, f, N), if U = ∪_{i=1}^{m} U_i and U_j ∩ U_k = ∅ (j ≠ k), the decision system NIS can be divided into m subdecision systems, and S_i is called a subsystem of S.
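The optimistic/pessimistic distinction in Definitions 3 and 4 can be sketched on a toy universe. This is only an illustration with hypothetical data: `g1` and `g2` stand for two granular structures (e.g., a partition induced by a categorical attribute and the δ-neighborhoods of a numerical attribute), each given directly as a map from a sample to its neighborhood:

```python
def lower_approx_optimistic(grans, X, U):
    # x is in the optimistic lower approximation if SOME granulation
    # places its neighborhood entirely inside X (the "exists" quantifier)
    return {x for x in U if any(g[x] <= X for g in grans)}

def lower_approx_pessimistic(grans, X, U):
    # x is in the pessimistic lower approximation if EVERY granulation
    # places its neighborhood entirely inside X (the "for all" quantifier)
    return {x for x in U if all(g[x] <= X for g in grans)}

U = {0, 1, 2, 3, 4}
g1 = {0: {0, 1}, 1: {0, 1}, 2: {2}, 3: {3, 4}, 4: {3, 4}}  # e.g. a categorical partition
g2 = {0: {0}, 1: {1, 2}, 2: {1, 2}, 3: {3}, 4: {4}}        # e.g. numerical neighborhoods
X = {0, 1, 3}

print(sorted(lower_approx_optimistic([g1, g2], X, U)))   # [0, 1, 3]
print(sorted(lower_approx_pessimistic([g1, g2], X, U)))  # [0]
```

The run confirms the expected inclusion: the pessimistic lower approximation is always a subset of the optimistic one, since "for all" implies "exists".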

MapReduce Programming Model.
MapReduce is a parallel processing framework that breaks a large task down into many small tasks that are independent of each other and differ from the original task only in size. The MapReduce parallel programming model correspondingly breaks the computational process down into two main stages: the Map stage and the Reduce stage.
In the MapReduce model, the whole dataset is split into many splits in natural sequence and then passed to the Map stage. Data in the MapReduce programming model are represented as <key, value> pairs. The Map function takes <K₁, V₁> pairs as input and generates a set of intermediate <K₂, V₂> pairs. The Reduce function groups together all intermediate values V₂ associated with the same K₂, merges them into a possibly smaller set of values, and finally outputs <K₃, V₃> pairs. The Map and Reduce functions can be written as

Map: <K₁, V₁> → list(<K₂, V₂>),
Reduce: <K₂, list(V₂)> → list(<K₃, V₃>).

Here, K_i and V_i (i = 1, …, 3) represent user-defined data types, and list(·) denotes a list.
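The map-shuffle-reduce pipeline above can be mimicked in a few lines. This is only a single-machine simulation of the programming model (using word count, the canonical MapReduce example), not Hadoop itself:

```python
from collections import defaultdict

def run_mapreduce(records, map_fn, reduce_fn):
    # Map stage: each input record yields intermediate <K2, V2> pairs
    intermediate = defaultdict(list)
    for rec in records:
        for k2, v2 in map_fn(rec):
            intermediate[k2].append(v2)   # shuffle: group values by key
    # Reduce stage: merge the value list of each key into an output pair
    return {k2: reduce_fn(k2, vs) for k2, vs in intermediate.items()}

text = ["map reduce map", "reduce reduce"]
counts = run_mapreduce(
    text,
    map_fn=lambda line: [(w, 1) for w in line.split()],
    reduce_fn=lambda word, ones: sum(ones),
)
print(counts)  # {'map': 2, 'reduce': 3}
```

In a real Hadoop job the shuffle is performed by the framework between distributed Mapper and Reducer tasks; only the two user-defined functions change per application.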

Parallel Attribute Reduction Algorithm for NMGRS
Aiming at numerical or heterogeneous data, many attribute reduction algorithms based on neighborhood multigranulation rough sets have been developed. However, parallelizing these attribute reduction algorithms for massive heterogeneous data remains a challenging task. Motivated by the works of Qian et al. [21] and Yong et al. [28], quick parallelization strategies to speed up the computation of neighborhood classes and positive regions are proposed, and a parallel attribute reduction algorithm is designed in this section.

Parallelization Strategies.
To parallelize the attribute reduction algorithm based on the neighborhood multigranulation rough set model, the MapReduce model is adopted. The key point is thus how to design the Map and Reduce functions so that neighborhood classes and positive regions can be obtained quickly. The work of Yong et al. [28] demonstrated that the neighborhood of a sample can exist only in its own hash bucket or the adjacent hash buckets. Therefore, to find possible neighbors, it is only necessary to group the samples according to their hash values. So, in the Map function, the hash value of each sample is first calculated, and the hash values and sample IDs are output; in the Reduce function, the sample IDs with the same hash value are merged into one hash bucket. The Map and Reduce functions for hash bucket calculation are designed as follows.
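The hashing Map/Reduce pair (Algorithms 1 and 2) can be sketched as plain functions. The sample values and the radius below are hypothetical, and the real implementation runs as distributed Hadoop tasks rather than local calls; the hash function follows the paper's definition, hash(xᵢ) = ⌊Δ(x₀, xᵢ)/δ⌋ with x₀ the attribute-wise minimum sample:

```python
import math
from collections import defaultdict

def hash_map(samples, delta):
    """Map stage (Algorithm 1 style): emit a <hash value, sample ID> pair per sample."""
    x0 = [min(col) for col in zip(*samples)]  # x0 takes the minimum of every attribute over U
    dist = lambda x, y: math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return [(int(dist(x0, x) / delta), i) for i, x in enumerate(samples)]

def hash_reduce(pairs):
    """Reduce stage (Algorithm 2 style): merge sample IDs sharing a hash value into
    one bucket; each bucket would become one output file named after its key."""
    buckets = defaultdict(list)
    for key, sample_id in pairs:
        buckets[key].append(sample_id)
    return dict(buckets)

samples = [(0.10, 0.20), (0.13, 0.22), (0.60, 0.80), (0.11, 0.19)]
buckets = hash_reduce(hash_map(samples, delta=0.08))
print(buckets)  # {0: [0, 1, 3], 9: [2]}
```

Because the hash value is a scaled distance to the common anchor x₀, samples whose mutual distance is at most δ can differ by at most one bucket index, which is exactly the property exploited later when searching neighborhoods.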
Example 1. We take the decision table shown in Table 1 as an example to illustrate the calculation process of Algorithms 1 and 2, where the decision attribute is listed in the last column.
According to Definition 7, the decision information system is divided into subdecision systems. The neighborhood radius is given as 0.08, and a condition attribute subset is selected. The Map process: the <KEY_HM, VALUE_HM> pairs output from Map 1 are <0,1> and <1,2>.
After Algorithms 1 and 2, samples were hashed into three hash buckets with hash values of 0, 1, and 3.
Next, we calculate the positive regions under the current attribute subset. According to Definition 4, the process of neighborhood computation and positive region judgment based on the multigranulation neighborhood rough sets can be divided into two parts: first, calculate the neighborhood of x under each single condition attribute subset; then take the intersection of the positive region sets of the multiple condition subsets (for the pessimistic neighborhood multigranulation rough set model) or their union (for the optimistic neighborhood multigranulation rough set model).
As to the neighborhood calculation under a single condition attribute subset, according to the work in literature [26], whether a sample belongs to the positive region can be judged by a distance function after traversing only the hash buckets where its neighbors can possibly exist. The hash value of a sample is first calculated by the Map function, and the candidate hash buckets can then be located among the output files (named after the hash values) produced by Algorithm 2. Consequently, whether the sample belongs to the positive region can be judged by scanning the found hash buckets. The positive region of the whole universe U under multiple granularities can then be obtained through the intersection or union of the single-granularity positive region sequence. Thus, the most significant attribute can be determined and added to the current reduction subset.
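The single-granularity positive-region test (in the spirit of Algorithm 3) can be sketched as follows; the data are hypothetical and continue the earlier toy example. The point is that only buckets k−1, k, and k+1 are scanned for a sample in bucket k, instead of the whole universe:

```python
import math

def in_positive_region(i, samples, labels, buckets, bucket_of, delta):
    """True if no neighbor of sample i carries a different decision label."""
    dist = lambda x, y: math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    k = bucket_of[i]
    for b in (k - 1, k, k + 1):          # only the adjacent buckets can hold neighbors
        for j in buckets.get(b, []):
            # a neighbor with a different decision label puts x_i in the boundary
            if dist(samples[i], samples[j]) <= delta and labels[j] != labels[i]:
                return False
    return True

samples = [(0.10, 0.20), (0.13, 0.22), (0.60, 0.80)]
labels = ["Yes", "No", "Yes"]
buckets = {0: [0, 1], 9: [2]}            # output of the hashing stage
bucket_of = {0: 0, 1: 0, 2: 9}
pos = [i for i in range(3) if in_positive_region(i, samples, labels, buckets, bucket_of, 0.08)]
print(pos)  # [2]: samples 0 and 1 are neighbors with conflicting labels, so both are boundary
```

Here samples 0 and 1 lie within δ of each other but disagree on the decision, so both fall into the boundary region; the isolated sample 2 is consistently labeled and enters the positive region.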
Example 2. We continue using the decision table in Table 1 and the same conditions as in Example 1 to illustrate the operation process.
The above operation results show that the positive and boundary regions of the current universe can be acquired by Algorithms 3 and 4, where the positive region includes {4} and the boundary region includes {1, 2, 3}.
According to the monotonicity proved in the work of Ma et al. [32], a sample that already belongs to a certain positive region will not leave it when additional attributes are added. In other words, it is unnecessary to recalculate these samples in the second Reduce stage, so the Reduce function can focus on the boundary samples only.
The Map and Reduce functions for updating positive regions are designed as follows.
Example 3. Here, we take Table 1 to illustrate the operation process. For convenience, this algorithm is denoted as PARA_NMG in this paper.

Time Complexity Analysis.
It is assumed that the neighborhood decision information system has |U| samples and m condition attributes. The positive region calculation is still the key step of the proposed PARA_NMG algorithm. In step 2.1, the calculation method in literature [28] is used to compute the positive region of each attribute set, the time complexity of which is O(m|U|). As to step 2.3, suppose that l attributes are eventually selected; as each attribute is added into the reduction subset, about |U|/l samples convert from boundary samples into the positive region (in expectation). Therefore, the time complexity of the serial calculation is O(m|U|(l + 1)). Furthermore, the MapReduce model is used in the PARA_NMG algorithm to parallelize the attribute reduction; assuming there are k nodes, the time complexity of the algorithm is O(m|U|(l + 1)/k), which is lower than the time complexities of the parallel algorithm in literature [22] and of O(|C|²|U|²/k) in literature [33].
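The accounting above can be summarized as a loose upper bound; this is only a restatement of the stated complexities, ignoring constant factors and the gradual shrinkage of the boundary set:

```latex
% One positive-region pass over all m candidate attributes costs O(m|U|).
% One such pass is made initially, plus one after each of the l selected
% attributes, giving at most l + 1 passes in total:
\begin{aligned}
T_{\text{serial}}   &\le (l+1)\cdot O\!\bigl(m\,|U|\bigr) = O\!\bigl(m\,|U|\,(l+1)\bigr),\\[2pt]
T_{\text{parallel}} &= O\!\left(\frac{m\,|U|\,(l+1)}{k}\right) \quad \text{with } k \text{ computing nodes.}
\end{aligned}
```

The pruning of boundary samples in step 2.3 only tightens this bound, since later passes scan progressively fewer samples.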

Experiment Analysis
In this section, we conducted numerical experiments to assess the efficiency of our proposed algorithm. The experiments were implemented on a PC cluster of nine nodes, where one was set as the master node and the rest were configured as slave nodes. Each node is equipped with an Intel Core i5-2400M CPU (four cores in all, each 3.1 GHz), 4 GB of RAM, and the software of Ubuntu 14.0, Hadoop 2.6.0, and Java 1.6.20. All algorithms were coded in Java.
To illustrate the efficiency of our proposed PARA_NMG algorithm, the representative parallel attribute reduction algorithm based on the positive region proposed in literature [21], denoted PAAR_PR, was used for comparison. The difference is that the PAAR_PR algorithm is based on the classical rough set model, while the PARA_NMG algorithm is based on the neighborhood multigranulation rough set model.
To test the efficiencies of the above two algorithms on different types of data, the experiments were carried out with the real datasets Soybean, US Census Data (1990), Susy, PAMAP2 Physical Activity Monitoring, and Poker Hand from the UCI Machine Learning Repository [34] and another dataset, KDD99. Here, Soybean and US Census Data (1990) are categorical datasets, Susy and PAMAP2 Physical Activity Monitoring are numerical datasets, and Poker Hand and KDD99 are heterogeneous datasets. To create a big data environment, the dataset Soybean was duplicated 100,000 times to form a new dataset. For convenience, these six datasets are denoted DS1~DS6, respectively. The characteristics of these datasets are shown in Table 2.

Comparison and Analysis of Reduction Results.
For the neighborhood rough set model, it is important to select a proper neighborhood radius when calculating neighborhood classes. According to the work of Hu et al. [24], a reasonable neighborhood radius should be selected in the interval [0.1, 0.3]. Qian et al. [29] analyzed the monotonicity of the positive region with the neighborhood radius and found that the classification accuracy decreases as the neighborhood radius increases. Considering these factors and the characteristics of the selected datasets, the neighborhood radius for our PARA_NMG algorithm was set to 0.1 when facing numerical data. The reduction results of the PARA_NMG algorithm and the PAAR_PR algorithm are shown in Table 3.
It can be seen from Table 3 that the PAAR_PR algorithm obtained effective reduction results on the categorical datasets DS1 and DS2. Notwithstanding that there are a few numerical attributes in DS6, equivalence classes could still be obtained, so the PAAR_PR algorithm remains practicable there. However, for the numerical datasets DS3 and DS4 and the heterogeneous dataset DS5, PAAR_PR could not produce reduction results because equivalence classes could not be obtained on these datasets. Thus, the applicability of the PAAR_PR algorithm depends on the characteristics of the datasets. Comparatively speaking, the PARA_NMG algorithm is not limited by data type when calculating attribute reduction on different datasets. Considering the rampant existence of heterogeneous datasets in real-life applications, the neighborhood multigranulation rough set-based PARA_NMG algorithm has better applicability.
In addition, for datasets DS1, DS2, and DS6, although attribute reduction results were obtained by both algorithms, there was a small difference between the selected attribute subsets. To further analyze the two algorithms' effects on the reduction results from the perspective of classification performance, the classification accuracies obtained with the two algorithms' reduction subsets were compared using several classifiers, as shown in Table 4.
We can see from Table 4 that the classification accuracies obtained with the reduction subsets of the PARA_NMG algorithm are better for most of these classifiers. In fact, classification accuracy is an important factor that should be considered in real-life applications. So, from a practical point of view, our neighborhood multigranulation rough set-based PARA_NMG algorithm also has better applicability.

Comparative Analysis on Computational Time.
To illustrate the influence of the number of computer nodes on the two algorithms' computational time, the experiments were implemented on clusters with different numbers of nodes. The average running times of the two algorithms were recorded and are shown in Table 5. For datasets DS3~DS5, only the results of the PARA_NMG algorithm are given.
As can be seen from Table 5, PAAR_PR is faster than PARA_NMG because different rough set models are used. PAAR_PR is based on the classical rough set model, where the time complexity of the classical heuristic serial reduction algorithm based on the positive region is O(m|U|²). Conversely, PARA_NMG is based on the neighborhood multigranulation rough set model, where the time complexity of the classical serial reduction algorithm is O(m²|U|²). To minimize the number of computations needed for obtaining positive regions, the hash function was introduced into the Map and Reduce stages for neighborhood multigranulation rough sets, and the time complexity of our parallel attribute reduction algorithm was reduced to O(m|U|(l + 1)/k). To some extent, the computational time of our algorithm is therefore still comparable.
In fact, besides the computational time, the speedup is an important performance index for evaluating the efficiency of a parallel algorithm, defined as

S_p = \frac{T_1}{T_p},

where p is the number of nodes, T₁ is the execution time on one node, and T_p is the execution time on p nodes. The speedup of the two algorithms was tested with different numbers of nodes. To be more intuitive, the average speedup of the two algorithms on each dataset with different numbers of computer nodes is presented in Figure 1, where the x axis represents the number of computer nodes, the y axis represents the speedup, and the red star points labeled "linear" represent the theoretical (ideal linear) speedup of a parallel algorithm.
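Given measured running times, the speedup curve plotted in Figure 1 is obtained by dividing each time into the single-node baseline; the times below are hypothetical, not the paper's recorded values:

```python
def speedup(times_by_nodes):
    """Speedup S_p = T_1 / T_p for each node count p, from measured running times."""
    t1 = times_by_nodes[1]                     # single-node execution time T_1
    return {p: t1 / tp for p, tp in sorted(times_by_nodes.items())}

# Hypothetical running times (seconds) on 1, 2, 4, and 8 nodes
s = speedup({1: 800.0, 2: 420.0, 4: 230.0, 8: 130.0})
print(s)
```

Ideal linear speedup would give S_p = p; real curves fall below the linear line because of communication and shuffle overheads, which is exactly the gap visible in Figure 1.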
As shown in Figure 1, the parallel reduction algorithm proposed in this paper achieves better speedup on different data types. As the number of nodes increases, the superiority of our PARA_NMG algorithm in speedup becomes more and more obvious. Therefore, the PARA_NMG algorithm is more suitable for processing massive heterogeneous data in parallel on a large number of computing nodes.

Conclusion
Attribute reduction is one of the important research issues in rough set theory. In the current big data era, traditional attribute reduction algorithms face big challenges when dealing with massive data. Most existing parallel algorithms have seldom taken granular computing into consideration, especially for complex heterogeneous data including categorical and numerical attributes. To address these issues, a quick parallel attribute reduction algorithm for heterogeneous data, using MapReduce in the framework of neighborhood multigranulation rough sets, was developed in this paper. The hash function was introduced into the Map and Reduce stages to speed up the positive region calculation. The effectiveness and superiority of the developed algorithm were verified by comparison analysis.
However, only static data was considered in this paper; in fact, datasets in real-world applications often vary dynamically over time. How to parallelize incremental attribute reduction in the framework of neighborhood multigranulation rough sets is a focus of future research.

Figure 1: Average speedup on different numbers of nodes.

Table 1: Decision table.

the distance function and decision attribute, where key values of samples in the positive region are assigned 1, while key values of samples in the boundary are assigned 0. The positive region of the whole universe U can be obtained by combining each positive region of S_i in the Reduce function. The Map and Reduce functions for neighborhood calculation under a single condition attribute subset are designed as follows.
and the results of Example 2 to illustrate the operation process on parallel boundary set updating and forming a new decision table.

Input: condition attribute subset C; a data split S_i
Output: <KEY_HM, VALUE_HM> // let KEY_HM be the set of hash values of the samples, and VALUE_HM the set of sample IDs
begin
  <KEY_HM, VALUE_HM> = ∅
  for each x_i ∈ U do
    let key = hash(x_i) // hash(x_i) = ⌊Δ(x_0, x_i)/δ⌋ = key, where x_0 is a special sample in the universe U satisfying ∀a ∈ C, x_i ∈ U: f(x_0, a) = min f(x_i, a), and f is the information function
    let value = the ID of x_i
    <KEY_HM, VALUE_HM> = <KEY_HM, VALUE_HM> ∪ <key, value>
  end for
end
Algorithm 1: Hash-Map.

Input: <KEY_HM, VALUE_HM>
Output: <KEY_HR, VALUE_HR> // let KEY_HR be the set of distinct hash values key′, and VALUE_HR the set of sample-ID subsets value′ with the same hash value key′
begin
  <KEY_HR, VALUE_HR> = ∅
  for <key, value> in <KEY_HM, VALUE_HM> do
    if key has not appeared in <KEY_HR, VALUE_HR>
      <key′, value′> = <key, value>
    else if key = key′_k
      <KEY_HR, VALUE_HR> = <KEY_HR, VALUE_HR> − <key′, value′>
      value′_k = value′_k ∪ value // combine samples with the same hash value to obtain the hash bucket
    end if
    <KEY_HR, VALUE_HR> = <KEY_HR, VALUE_HR> ∪ <key′, value′>
  end for
  // output as multiple files; a file named after a hash value is a hash bucket
end
Algorithm 2: Hash-Reduce.

The <KEY_UM, VALUE_UM> pair output from Map 2 is <1, (0.14 0.23 0.40 0.31 No)>. The Reduce process: the <KEY_UR, VALUE_UR> pairs output from Reduce 1 are <1 (0.10 0.20 0.61 0.20 Yes)>, <2 (0.13 0.22 0.56 0.10 Yes)>, and <0.14 (0.23 0.40 0.31 No)>.

3.2. Parallel Attribute Reduction Algorithm. On the basis of the parallel algorithms given in Section 3.1, a neighborhood multigranulation rough set-based parallel attribute reduction algorithm using MapReduce is presented.

Input: single condition attribute subset C; the hash bucket B; and a data split S_i
Output: <KEY_M, VALUE_M> // key* represents whether the sample belongs to the positive region; let key* of the sample in the positive region be 1, and
key* of the sample in the boundary be 0; value* represents the IDs of all samples that have the same key*.
begin
  <KEY_M, VALUE_M> = ∅
  for each x_i ∈ U do
    let key*_i = 0 // assume this sample does not belong to the positive region under C
    let value* = the ID of x_i, and suppose x_i ∈ B_k
    for each x_j ∈ B_{k−1} ∪ B_k ∪ B_{k+1} do // traverse the hash buckets where a neighbor can possibly exist
      if x_j is in the neighborhood of x_i but they have different decision attribute values
        key*_i = 0 // x_i is a boundary sample
      end if
    end for
    if no such x_j is found, let key*_i = 1 // x_i belongs to the positive region
    <KEY_M, VALUE_M> = <KEY_M, VALUE_M> ∪ <key*_i, value*>
  end for
end
Algorithm 3: Single condition attribute subset neighborhood-Map.

Input: <KEY_M, VALUE_M>
Output: <KEY_R, VALUE_R> // let KEY_R be the set of distinct key* values, and VALUE_R the set of sample-ID subsets value*′ with the same key*′
begin
  <KEY_R, VALUE_R> = ∅
  for <key*, value*> in <KEY_M, VALUE_M> do
    if key* has not appeared in <KEY_R, VALUE_R>
      <key*′, value*′> = <key*, value*>
    else if key* = key*′_k
      <KEY_R, VALUE_R> = <KEY_R, VALUE_R> − <key*′, value*′>
      value*′_k = value*′_k ∪ value* // combine samples with the same key*
    end if
    <KEY_R, VALUE_R> = <KEY_R, VALUE_R> ∪ <key*′, value*′>
  end for
end
Algorithm 4: Single condition attribute subset neighborhood-Reduce.

Table 3: Reduction results of the two algorithms.
Input: a data split S_i
Output: <KEY_UM, VALUE_UM> // key^ represents whether the sample does not belong to the positive region; value^ represents the sequence of the sample's attribute values
begin
  <KEY_UM, VALUE_UM> = ∅
  for each x_i ∈ S_i do
    if x_i ∉ POS
      key^_i = 1, value^_i = x_i
    else
      key^_i = 0, value^_i = x_i
    end if
    <KEY_UM, VALUE_UM> = <KEY_UM, VALUE_UM> ∪ <key^_i, value^_i>
  end for
end

Table 4: Classification accuracies with the two algorithms' reduction results.

Step 2: if (C − reduct) = ∅, go to Step 3; while if (C − reduct) ≠ ∅, execute the following loop operations:
  Step 2.1: for each condition attribute c_k ∈ (C − reduct), use Algorithms 1~4 to calculate the positive region POS_k of reduct ∪ {c_k};
  Step 2.2: compare the positive regions POS_k of the condition attribute subsets reduct ∪ {c_k} after each attribute c_k is added, and find the current maximum positive region Max_POS; if Max_POS = ∅, keep the attribute reduction invariant and go to Step 3; otherwise, add c_k into reduct, reduct = reduct ∪ {c_k};
  Step 2.3: update the boundary sample set Q with Algorithms 5~6, Q = Q − Max_POS, and return to Step 2.1.
Step 3: output the reduction reduct.
end

Input: <KEY_UM, VALUE_UM>
Output: <KEY_UR, VALUE_UR> // let KEY_UR be the sequence number in the case of key^ = 1, and VALUE_UR the set of value^ sequence subsets
begin
  KEY_UR = 0 and VALUE_UR = ∅
  if key^_k = 1
    KEY_UR = KEY_UR + 1
    VALUE_UR = VALUE_UR ∪ value^
  end if
end

Table 5: Average computational time of the two algorithms (s).