Fuzzy Covering-Based Three-Way Clustering

This paper investigates three-way clustering involving fuzzy covering, threshold acquisition, and boundary region processing. First of all, a valid fuzzy covering of the universe is constructed on the basis of an appropriate fuzzy similarity relation, which helps capture the structural information and internal connections of the dataset from a global perspective. Due to the advantages of the valid fuzzy covering, we explore the valid fuzzy covering instead of the raw dataset for RFCM algorithm-based three-way clustering. Subsequently, from the perspective of a semantic interpretation that balances the uncertainty changes in fuzzy sets, a partition threshold acquisition method combining linear and nonlinear fuzzy entropy theory is proposed. Furthermore, the boundary regions in three-way clustering correspond to abstaining decisions and generate uncertain rules. In order to improve the classification accuracy, the k-nearest neighbor (kNN) algorithm is utilized to reduce the number of objects in the boundary regions. The experimental results show that the performance of the proposed three-way clustering based on fuzzy covering and the kNN-FRFCM algorithm is better than that of the compared algorithms in most cases.


Introduction
Three-way decisions (3WD), proposed by Yao [1,2], have been a hot topic in various fields in recent years. Since the idea was put forward, tripartition has attracted many scholars to do research. Especially recently, great progress has been made in the theoretical research and model building of three-way decisions based on rough sets. For example, Liang and Liu et al. [3][4][5][6] proposed fuzzy three-way decision models and stochastic three-way decision models to deal with real-valued or linguistic-valued decision-making problems. Qian et al. [7] established a multigranulation decision-theoretic rough set model based on granular computing theory. Hu [8,9] introduced the concept of three-way decision space and established a three-way decision model based on partially ordered sets. Qi et al. [10] investigated the 3WD model in the framework of lattice theory. Li et al. [11] constructed a cost-sensitive sequential three-way decision model to simulate the decision-making process from coarse granularity (high cost) to fine granularity (low cost); please refer to [12][13][14] for further generalizations and applications of this model. Yao et al. [15] constructed an optimization-based framework for three-way approximations of fuzzy sets. Meanwhile, for dynamic objects and attributes, some algorithms and incremental 3WD models have been designed for the classification of dynamic data [16,17]. From the viewpoint of application, three-way decisions have been widely used in research fields such as pattern recognition [18,19], artificial intelligence [20][21][22], engineering, management [23], and social communities [24].
Based on the above background and work on three-way decisions, a novel method for three-way clustering based on fuzzy covering is discussed. First, the fuzzy covering of the dataset is constructed according to a reasonable fuzzy similarity relation. The fuzzy covering of the universe requires that the more similar two objects in the universe are, the more similar the corresponding fuzzy classes are. A fuzzy covering established in this way can better reflect the intrinsic relationships between objects in the universe. Therefore, clustering results will be more accurate with a valid fuzzy covering. One of the inevitable problems of clustering is threshold calculation. As is well known, for most of the three-way decision models mentioned above, we first need to obtain the pair of partition thresholds α and β. Different thresholds lead to different decision results: appropriate partition thresholds make the decision more accurate, whereas inappropriate thresholds distort the decision. Traditionally, the partition thresholds are selected in advance according to experts' experience [25][26][27]. Based on loss functions, Yao et al. [1] proposed a method to determine the thresholds by Bayesian risk decision theory. By using Shannon entropy as a measure of uncertainty, Deng et al. [28] presented an information-theoretic approach to explain and calculate the thresholds. Zhou et al. [29] explored the shadowed set to automatically obtain the partition thresholds of the three-way decisions but could not theoretically give a reasonable semantic explanation. To address this issue, inspired by the idea of balancing the uncertainty change of fuzzy sets, a threshold calculation method combining linear fuzzy entropy with nonlinear fuzzy entropy is proposed. This method provides a new scientific explanation for the generation of thresholds.
Then, the boundary regions of three-way clustering are processed by the kNN algorithm to reduce uncertainty and improve decision accuracy. The structure of the rest of this paper is as follows: Section 2 briefly introduces the necessary notions of three-way decisions. Section 3 focuses on constructing the fuzzy covering of the raw dataset according to the fuzzy similarity relation and some necessary conditions and discusses its related properties. In Section 4, a novel rough fuzzy C-means (FRFCM) algorithm based on the valid fuzzy covering is established. Then, we investigate the partition thresholds by combining linear and nonlinear fuzzy entropy. Furthermore, the framework for processing the boundary region of three-way clustering using the kNN algorithm is introduced. In Section 5, the validity and practicability of the algorithm are evaluated by experiments. Concluding remarks are given in Section 6.

Preliminaries
The basic concepts of three-way decisions are briefly reviewed in this section.
An information system is defined as a 4-tuple (U, At, V, f), where U = {x_1, x_2, ..., x_n} denotes a finite nonempty universe; At = C ∪ D, where C is a nonempty finite set of condition attributes and D is a nonempty finite set of decision attributes; V = ∪_{a∈At} V_a, where V_a is the domain of attribute a; and f: U × At → V is an information function such that f(x, a) ∈ V_a for every x ∈ U, a ∈ At. If V_a consists of membership function values, then the value of object x under attribute a can be expressed as μ_a(x) ∈ [0, 1]. The trisecting-and-acting framework of three-way decisions extends binary decision-making in order to overcome some of its shortcomings. The traditional binary decision model has only acceptance and rejection options, which can easily lead to errors when the available information is insufficient for an accurate judgment, and the cost of a wrong decision can be very high. Therefore, a deferment decision is necessary: it allows decision makers to collect more information and make a more accurate judgment. This is a strategy that people often adopt in decision-making, and deferment is consistent with human cognition. A three-way decision model based on an evaluation function and a pair of thresholds is given as follows.
Definition 1 (see [30]). Let U be a finite nonempty universe, v an evaluation function, and (α, β) a pair of thresholds with 0 ≤ β < α ≤ 1. The positive, negative, and boundary regions of any subset A ⊆ U are defined as follows:

POS_(α,β)(A) = {x ∈ U | v(x) ≥ α},
NEG_(α,β)(A) = {x ∈ U | v(x) ≤ β},   (1)
BND_(α,β)(A) = {x ∈ U | β < v(x) < α}.

The evaluation function is the key to the decision: different evaluation functions lead to different decision results, and various evaluation functions can be adopted. If a fuzzy membership function μ_A is used as the evaluation function, then the induced three regions are defined by the following equations [31]:

POS_(α,β)(A) = {x ∈ U | μ_A(x) ≥ α},
NEG_(α,β)(A) = {x ∈ U | μ_A(x) ≤ β},   (2)
BND_(α,β)(A) = {x ∈ U | β < μ_A(x) < α}.

The three-valued approximation of a fuzzy set is described by Zadeh [32] as follows: x belongs to A if μ_A(x) ≥ α, x does not belong to A if μ_A(x) ≤ β, and x has an indeterminate status relative to A if β < μ_A(x) < α. (3) These three cases correspond to the three-way decisions on the above fuzzy set. When α = 1 and β = 0, we obtain the qualitative three-way decisions of a fuzzy set. However, the qualitative decision model of a fuzzy set is very restrictive, and we generally do not select these two thresholds.
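As an illustration, the trisection rule of Definition 1 can be sketched in a few lines of Python. The function name and the NumPy-based representation are ours, not from the paper; this is a minimal sketch of the rule, not the paper's implementation.

```python
import numpy as np

def three_way_regions(mu, alpha, beta):
    """Trisect a universe by an evaluation (membership) function.

    mu:    array of evaluation values v(x) in [0, 1], one per object
    alpha, beta: partition thresholds with 0 <= beta < alpha <= 1
    Returns index arrays for the positive, negative, and boundary regions.
    """
    mu = np.asarray(mu)
    pos = np.where(mu >= alpha)[0]                  # accept: v(x) >= alpha
    neg = np.where(mu <= beta)[0]                   # reject: v(x) <= beta
    bnd = np.where((mu > beta) & (mu < alpha))[0]   # defer:  beta < v(x) < alpha
    return pos, neg, bnd
```

Every object falls into exactly one of the three regions, mirroring equations (1) and (2).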

Fuzzy Covering and Its Validity
The focus of this section is the method of constructing a valid fuzzy covering of the raw data; the properties of the fuzzy covering are also discussed. Let us first recall some concepts that help us better understand fuzzy covering.
Definition 2 (see [33,34]). Let U = {x_1, x_2, ..., x_N} be a finite universe and F(U) the fuzzy power set of U. For each c ∈ (0, 1], we call P = (P_1, P_2, ..., P_m), with P_i ∈ F(U), a fuzzy c-covering of U if (∪_{i=1}^m P_i)(x) ≥ c for each x ∈ U; then, (U, P) is called a fuzzy c-covering approximation space. If (∪_{i=1}^m P_i)(x) ≥ 1 for each x ∈ U, then P is called a fuzzy covering of U, and (U, P) is called a fuzzy covering approximation space. If ∑_{i=1}^m P_i(x) = 1 for each x ∈ U, then P is called a fuzzy partition of U, and we call (U, P) a fuzzy partition approximation space.
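A minimal sketch of checking Definition 2 for a family of fuzzy sets given as a membership matrix. Treating the fuzzy union as the pointwise maximum and the partition condition as the Ruspini sum-to-one condition is our reading of the paper's notation, so both are assumptions:

```python
import numpy as np

def covering_type(P, c=1.0, tol=1e-9):
    """Classify a family of fuzzy sets P (an m x N membership matrix,
    P[i, j] = P_i(x_j)) against Definition 2.

    Assumptions: fuzzy union = pointwise max; 'fuzzy partition'
    means memberships sum to 1 (Ruspini condition)."""
    union = P.max(axis=0)  # (union of P_i)(x_j) for every object x_j
    if np.all(np.abs(P.sum(axis=0) - 1.0) < tol):
        return "fuzzy partition"
    if np.all(union >= 1.0 - tol):
        return "fuzzy covering"
    if np.all(union >= c - tol):
        return f"fuzzy {c}-covering"
    return "not a covering"
```

For the fuzzy similarity classes used later, every object belongs fully to its own class, so the covering condition holds automatically.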
Definition 3 (see [35]). Let σ be a mapping σ: F(U) × F(U) → [0, 1]. σ(A, B) is called the degree of similarity between fuzzy sets A and B if σ(A, B) satisfies the following properties: Some similarity measures are listed as follows: The fuzzy set in this paper is constructed by a fuzzy similarity relation R, which satisfies the following properties.
For any x, y ∈ U, R(x, x) = 1 (reflexivity) and R(x, y) = R(y, x) (symmetry). The fuzzy set [x_i]_R with membership [x_i]_R(x_j) = R(x_i, x_j) (4) is called a fuzzy similarity class associated with R on U. Therefore, the set of fuzzy similarity classes {[x_i]_R : i = 1, 2, ..., |U|} constructed by the relation R is a fuzzy covering of the universe U.
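The construction of the fuzzy similarity classes can be sketched as follows. The concrete similarity relation R(x, y) = 1 − ||x − y|| / max ||x′ − y′|| is an illustrative assumption of ours (the paper only requires a reflexive, symmetric relation); any relation satisfying those properties could be substituted:

```python
import numpy as np

def fuzzy_similarity_classes(X):
    """Build the fuzzy covering {[x_i]_R} from a data matrix X (N x d).

    Assumed similarity relation: R(x, y) = 1 - ||x - y|| / max ||x' - y'||,
    which is reflexive and symmetric. Row i of the returned matrix is the
    fuzzy similarity class [x_i]_R, i.e., [x_i]_R(x_j) = R(x_i, x_j)."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise distances
    R = 1.0 - D / D.max()              # normalize distances into [0, 1]
    return R                           # R[i] is the fuzzy class [x_i]_R

# Every object belongs fully to its own class ([x_i]_R(x_i) = 1), so the
# family of similarity classes is a fuzzy covering of U.
```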
In the following, we investigate the validity and related properties of the fuzzy covering of the raw dataset.
Definition 4. Let U = {x_1, x_2, ..., x_N} be a universe, R the fuzzy similarity relation on U, and σ a similarity measure; P = {[x_1], [x_2], ..., [x_N]} is the fuzzy covering of U constructed by the fuzzy similarity relation R. For each x_i ∈ U, the set of objects similar to x_i is considered. P is defined as a valid fuzzy covering of U with respect to θ if the validity index satisfies I(P) ≥ θ, where θ ∈ (0.5, 1]. It is easy to see that the value of I(P) depends on σ, R, and the choice of φ; φ is generally assigned a value no less than 0.8. The closer I(P) is to 1, the better the relation R expresses the structure of the sample space. If θ were less than 0.5, the fuzzy covering of the universe would not be considered valid. The fuzzy covering P = {[x_1], [x_2], ..., [x_N]} ensures that similar objects in U have correspondingly similar fuzzy classes, so the fuzzy covering P faithfully reflects the original distribution of the objects in U.
Proof. It can easily be verified from the definition. □

Remark 1. Let P_1 and P_2 be two valid fuzzy coverings of U with respect to the same θ. We choose the fuzzy covering with the larger validity index I(·) as the research data.

Rough Fuzzy C-Means Algorithm Based on Fuzzy Covering. In this section, we discuss the rough fuzzy C-means algorithm with fuzzy covering. The reason for clustering with the fuzzy covering is that each fuzzy similarity class reflects the relationship of an object with the whole dataset, which avoids the excessive loss of clustering information incurred with the raw data. The combination of fuzzy sets and rough sets provides an important direction for uncertain reasoning. Lingras [36] developed rough C-means (RCM) by combining the C-means clustering algorithm with rough set theory. The new clustering center is related only to the positive region and the boundary region, unlike fuzzy C-means (FCM) [37], where it is related to all objects. Since no membership degrees are involved, rough C-means (RCM) cannot effectively deal with the uncertainty caused by overlapping boundaries. In such circumstances, Mitra et al. [25] proposed a rough fuzzy C-means (RFCM) algorithm, which combines the advantages of both fuzzy sets and rough sets within the framework of the C-means clustering algorithm. When dividing objects into approximation regions, the innovation of rough fuzzy C-means is to replace the absolute distance with a fuzzy membership. This adjustment enhances the robustness of the clustering in overlapping situations. Maji et al. [26] modified the calculation of the new clustering center in the RFCM model by assuming that the objects in the lower approximation have definite weights and the objects in the boundary have fuzzy weights. In what follows, we discuss the rough fuzzy C-means of fuzzy covering (FRFCM) algorithm, which is an RFCM algorithm based on the fuzzy covering of the universe. Suppose the objective function is J = ∑_{i=1}^C ∑_{j=1}^N μ_ij^m d_ij^2, where d_ij is the distance between [x_j] and v_i, μ_ij ∈ [0, 1], and ∑_{i=1}^C μ_ij = 1. The parameter m is the fuzzifier, which is greater than 1.
A two-category dataset is taken to explain the influence of different values of the parameter m on classification. The membership degree of each object in each cluster can be considered as a function of the relative distances and the fuzzifier parameter. Then, formula (6) translates into the form μ(a) = 1/(1 + a^(2/(m−1))), where a denotes the relative distance of an object with respect to one of the clusters. The uncertainty caused by different values of the fuzzifier parameter m is illustrated in Figure 1.
It is easy to see that, as the value of m tends to 1, the memberships become crisper and the uncertainty of the system is reduced, which is suitable for three-way clustering. In this circumstance, only objects that are approximately the same distance from each cluster center are divided into the boundary regions. On the other hand, the parameter m cannot be assigned a very large value, because as m increases, the membership degrees flatten, most objects are divided into the boundary regions, and the uncertainty of the system and the error rate of decision-making increase. Furthermore, the positive region of a cluster may become empty. The center vectors v_1, v_2, ..., v_C are updated as follows: where the two terms can be considered as the contributions to the center v_i by the fuzzy lower region and the fuzzy boundary region, respectively. R_b Q_i = R̄Q_i − R̲Q_i denotes the boundary region of cluster Q_i, where R̲Q_i and R̄Q_i are the lower and upper approximations of cluster Q_i with respect to relation R, respectively. The weights w_il and w_ib usually satisfy w_il + w_ib = 1 and w_il > w_ib; in this paper, we take specific values satisfying these conditions. The approximation regions are determined by the FRFCM algorithm with the following principle: let μ_pj and μ_qj be the two largest membership degrees of [x_j]; if their difference is smaller than a given partition threshold, then [x_j] ∈ R̄Q_p and [x_j] ∈ R̄Q_q, and in this case, x_j cannot be divided into the positive region of any cluster. Otherwise, [x_j] ∈ R̄Q_p and [x_j] ∈ R̲Q_p. Due to the particular structure of the fuzzy covering of U, the results of clustering the fuzzy covering with the above FRFCM algorithm reflect the clustering results of the raw dataset well.
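The two update steps above can be sketched in Python. The membership update is the standard FCM rule (which formula (6) follows in spirit); the center update combines lower-approximation and boundary contributions with weights w_l and w_b, whose values 0.95/0.05 are an assumption of ours, not taken from the paper:

```python
import numpy as np

def update_memberships(X, V, m):
    """Standard FCM membership update, applied here to the fuzzy
    classes [x_j] (rows of X) against the cluster centers V (C x d)."""
    d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)  # C x N distances
    d = np.fmax(d, 1e-12)                    # avoid division by zero
    ratio = (d[:, None, :] / d[None, :, :]) ** (2.0 / (m - 1))
    return 1.0 / ratio.sum(axis=1)           # mu[i, j]; each column sums to 1

def update_centers(X, mu, lower, boundary, m, w_l=0.95, w_b=0.05):
    """Weighted center update: lower-approximation objects contribute with
    weight w_l, boundary objects with weight w_b (w_l + w_b = 1, w_l > w_b).
    `lower` and `boundary` are C-length lists of index arrays."""
    C = mu.shape[0]
    V = np.empty((C, X.shape[1]))
    for i in range(C):
        def wavg(idx):
            w = mu[i, idx] ** m
            return (w[:, None] * X[idx]).sum(0) / w.sum()
        if len(boundary[i]) == 0:
            V[i] = wavg(lower[i])            # no boundary contribution
        else:
            V[i] = w_l * wavg(lower[i]) + w_b * wavg(boundary[i])
        # note: if lower[i] were also empty, an extra guard would be needed
    return V
```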

Acquisition of Thresholds for Three-Way Clustering. In this section, we first review the shadowed set model for computing thresholds. Then, a novel method of calculating thresholds is proposed by combining linear and nonlinear fuzzy entropy. The FRFCM algorithm is an important tool for dealing with imprecise, incomplete, and inconsistent data. The thresholds in FRFCM, which determine the formation of the approximation regions, should be carefully selected. Unreasonable thresholds may distort the partition of the approximation regions, and the clustering centers may deviate from their expected locations. Therefore, we should compute the partition thresholds scientifically, according to well-founded principles.
There are many methods to obtain the thresholds, and the most popular one is the shadowed set [38]. The shadowed set adopts the method of elevating and reducing membership degrees, which divides the domain of a fuzzy set into three regions. The corresponding membership function is as follows: where μ_A is the membership function of fuzzy set A.
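The elevate-and-reduce idea can be sketched as follows. This is a Pedrycz-style symmetric search in which we assume β = 1 − α and balance the total elevated and reduced membership against the cardinality of the shadow; the exact objective used in the paper's formula is not reproduced here, so this is an illustrative assumption:

```python
import numpy as np

def shadowed_set_threshold(mu, grid=999):
    """Search for (alpha, beta) with beta = 1 - alpha that balances the
    membership elevated to 1 and reduced to 0 against the shadow size."""
    mu = np.asarray(mu)
    best, best_alpha = np.inf, None
    for alpha in np.linspace(0.5, 1.0, grid, endpoint=False)[1:]:
        beta = 1.0 - alpha
        elevated = (1.0 - mu[mu >= alpha]).sum()      # memberships raised to 1
        reduced = mu[mu <= beta].sum()                # memberships lowered to 0
        shadow = np.count_nonzero((mu > beta) & (mu < alpha))
        v = abs(elevated + reduced - shadow)          # balance objective
        if v < best:
            best, best_alpha = v, alpha
    return best_alpha, 1.0 - best_alpha
```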
In the following study, only discrete fuzzy systems are considered; similar models and conclusions can be obtained for continuous fuzzy systems. According to shadowed set theory, the optimal thresholds α and β are obtained by minimizing the value V defined by the following formula: However, the semantic interpretation of the threshold pairs obtained by this method is not very clear. Because the shadowed set model cannot reasonably explain the relationship between the obtained shadowed set and the fuzziness of the raw fuzzy set, further research is needed. Various methods for measuring uncertainty are described in the literature [39]. Fuzzy entropy is an important tool to measure the uncertainty of a fuzzy set and meets the following requirements.
Definition 5 (see [40]). Let A = {(x_i, μ_A(x_i)) : x_i ∈ U} be a fuzzy set on the universe of discourse U = {x_1, x_2, ..., x_N}. The fuzzy entropy of fuzzy set A is a mapping E: F(U) → R^+ that satisfies the following four conditions: It is easy to verify that if μ_A(x_i) = 0 or μ_A(x_i) = 1 for every x_i ∈ U, the value of the corresponding entropy function is 0, and the fuzzy entropy of the fuzzy set equals 0; i.e., the uncertainty of the fuzzy set is minimal. When μ_A(x_i) = 1/2 holds for every x_i ∈ U, the value of the corresponding entropy function is 1, and the fuzzy set has maximum uncertainty. Commonly used linear and nonlinear fuzzy entropy functions are listed as follows [41][42][43]: With the above fuzzy entropy functions, the corresponding fuzzy entropy of the fuzzy set A can easily be obtained as follows: The basic idea of calculating the thresholds by fuzzy entropy is to reduce to 0 the uncertainty of the memberships of the objects subjected to the elevating or reducing operation of the shadowed set, while the memberships of the objects corresponding to the middle part of the shadowed set are adjusted toward maximal uncertainty; i.e., their fuzzy degree increases to 1. In what follows, we propose a flexible fuzzy entropy method that combines the linear fuzzy entropy function E1_A(x_i) and the nonlinear fuzzy entropy function E2_A(x_i) to obtain the clustering thresholds. The calculation model is as follows:

Mathematical Problems in Engineering
where λ ∈ [0, 1] is a parameter adjusting the impacts of linear entropy and nonlinear entropy.
In equation (13), when λ = 1, only the linear fuzzy entropy function E1_A(x_i) is used to calculate the thresholds; if λ = 0, only the nonlinear fuzzy entropy function E2_A(x_i) is used. The smaller the value of λ, the greater the influence of the nonlinear fuzzy entropy, and vice versa. In the subsequent experiments of this study, we assign λ = 0.5. Figure 2 illustrates the increase and decrease in fuzziness for the linear fuzzy entropy function E1_A(x), the nonlinear fuzzy entropy function E2_A(x), and the flexible fuzzy entropy function. It can be seen from Figure 2 that the curve of the flexible fuzzy entropy function lies between the curves of the linear and nonlinear entropy functions. Using flexible fuzzy entropy to obtain the thresholds prevents the uncertainty of the fuzzy set, as measured by the linear or nonlinear fuzzy entropy alone, from being too small or too large, which would make the partition thresholds unreasonable. The thresholds used in RFCM and its related algorithms are usually user-defined. In contrast, the thresholds calculated by the above model can not only be interpreted through the change in the fuzziness of the fuzzy set but also be adjusted and optimized automatically.
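A sketch of the flexible combination, under the assumption that E1 is the common linear form 1 − |2u − 1| and E2 is the nonlinear De Luca-Termini form (the paper's exact E1/E2 formulas are not reproduced in the text, so these specific choices are ours):

```python
import numpy as np

def flexible_entropy(u, lam=0.5):
    """Flexible fuzzy entropy per membership value u, combining an assumed
    linear form E1(u) = 1 - |2u - 1| with the nonlinear De Luca-Termini
    form E2(u) = -u log2 u - (1 - u) log2(1 - u).
    lam = 1 -> purely linear, lam = 0 -> purely nonlinear."""
    u = np.clip(np.asarray(u, dtype=float), 1e-12, 1 - 1e-12)
    e1 = 1.0 - np.abs(2.0 * u - 1.0)
    e2 = -u * np.log2(u) - (1.0 - u) * np.log2(1.0 - u)
    return lam * e1 + (1.0 - lam) * e2

# Both components vanish at u = 0 or u = 1 and peak (value 1) at u = 0.5,
# so the combination preserves the boundary conditions of Definition 5
# for any lam in [0, 1].
```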
According to α opt and β opt , the positive, boundary, and negative regions of each cluster Q i can be expressed as where μ ij is the membership degree of the jth object belonging to the ith class.

Boundary Region Processing of Three-Way Clustering Based on the kNN Algorithm. Following the above discussion on automatically selecting the optimal partition thresholds based on fuzzy entropy theory, this section presents the processing of the objects in the boundary regions of three-way clustering.
In three-way clustering, the boundary region objects are rarely processed further. The k-nearest neighbor (kNN) algorithm [44] is a well-known nonparametric classifier, considered one of the simplest methods in data mining and pattern recognition. The principle of the kNN algorithm is to find the k nearest neighbors of a query object in the dataset and then assign the query to the majority class among those k neighbors. In this paper, the kNN algorithm is utilized to process the objects in the boundary regions. If an object cannot be assigned to a positive region, it remains in the boundary region. The uncertainty of the boundary region therefore decreases as the number of objects it contains decreases, and reclassifying the objects in the boundary region can improve the accuracy of the three-way clustering. The details of updating the boundary region with the kNN algorithm are as follows.
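A simplified sketch of this reassignment step. The label convention (−1 marks a boundary object) and the unique-majority rule are our reading of the algorithm; an object stays in the boundary when no single positive-region cluster wins the vote outright:

```python
import numpy as np
from collections import Counter

def refine_boundary(X, labels, boundary_idx, k=7):
    """Reassign boundary objects by majority vote among k nearest neighbors.

    labels[j] is the cluster index of object j if it lies in a positive
    region, or -1 if it lies in the boundary region. An object keeps the
    label -1 when its neighborhood has no unique positive-region majority."""
    new_labels = labels.copy()
    for j in boundary_idx:
        d = np.linalg.norm(X - X[j], axis=1)
        d[j] = np.inf                        # exclude the object itself
        nn = np.argsort(d)[:k]               # indices of the k nearest objects
        votes = Counter(labels[nn])
        winner, count = votes.most_common(1)[0]
        # unique majority from a positive region -> move the object there
        if winner != -1 and list(votes.values()).count(count) == 1:
            new_labels[j] = winner
    return new_labels
```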
Because the kNN algorithm relies mainly on a limited number of adjacent objects for classification, it is more suitable than other methods when class domains overlap or when the objects to be classified lie in the boundary region. Therefore, Algorithm 1 can handle the uncertainty arising from the boundary region. Of course, dealing with the boundary region by the k-nearest neighbor algorithm adds extra computational burden and may also entail a risk of misclassifying objects.
In what follows, based on the valid fuzzy covering and the FRFCM and kNN algorithms, we propose a three-way clustering algorithm, called the kNN-FRFCM algorithm, which is presented as Algorithm 2.
Thus, according to Algorithm 2, we obtain the three-way clustering results of the original dataset by using the valid fuzzy covering.

Experiment Analysis
The three-way clustering method based on fuzzy covering proposed in this paper is suitable for datasets with few objects and dimensions, or for datasets in which the numbers of objects and dimensions are comparable. Otherwise, clustering with a fuzzy covering constructed from a dataset with many objects and few dimensions will cause the curse of dimensionality. In this paper, six datasets from the UCI Machine Learning Repository [45] are used for the empirical study: Iris, Breast Cancer Wisconsin (Original) (BCWO, with missing data removed), New thyroid, Seeds, Forest-type mapping (FTM), and CT. On these datasets and their corresponding fuzzy coverings, the results of the clustering methods FCM, RCM, RFCM, kNN-RCM, and kNN-RFCM are compared. To distinguish the results on the raw datasets from those on the fuzzy coverings under the same algorithm, the clustering algorithms on the fuzzy coverings are denoted FFCM, FRCM, FRFCM, kNN-FRCM, and kNN-FRFCM, respectively. Details of the six datasets are given in Table 1.
The partition threshold related to RCM and its related algorithms is set to 0.001. The parameters φ and θ involved in the fuzzy covering are set to 0.8 and 0.9, respectively. The value of k in the kNN algorithm is set to 7, and the evaluation indexes normalized mutual information (NMI) [47], ACC [48], and Rand index (RI) [49] are used to assess the validity of the algorithms. Furthermore, reasonable values of the fuzzifier m in all comparison algorithms are greater than 1; m = 1.03 and m = 1.1 are selected, and the experimental comparison results are listed in Tables 2-7.
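Two of the evaluation indexes can be sketched directly. The Rand index below is the plain pairwise-agreement definition; the ACC sketch resolves the label-permutation ambiguity by brute force, which is feasible here because the datasets have at most four classes (the permutation approach is our illustrative choice, not necessarily the paper's):

```python
from itertools import combinations, permutations

def rand_index(y_true, y_pred):
    """Rand index: fraction of object pairs on which the two partitions
    agree (both together or both apart)."""
    n = len(y_true)
    agree = sum(
        (y_true[i] == y_true[j]) == (y_pred[i] == y_pred[j])
        for i, j in combinations(range(n), 2)
    )
    return agree / (n * (n - 1) / 2)

def clustering_accuracy(y_true, y_pred):
    """ACC: best classification accuracy over all relabelings of the
    predicted clusters (brute force; fine for a small number of clusters)."""
    clusters = sorted(set(y_pred))
    best = 0.0
    for perm in permutations(sorted(set(y_true))):
        mapping = dict(zip(clusters, perm))
        hits = sum(mapping[p] == t for p, t in zip(y_pred, y_true))
        best = max(best, hits / len(y_true))
    return best
```

NMI additionally normalizes the mutual information between the two partitions; a standard implementation is available in common machine-learning libraries.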
From Tables 2-7, it can easily be concluded that the selected fuzzifier values have a significant impact on the performance of all comparison algorithms on the same dataset. Since the boundary region is the main cause of system uncertainty, overly large boundary regions are not desirable in three-way clustering, and we need to pay attention to the uncertainty caused by the fuzzifier m in the implementation of the algorithms. Moreover, the clustering results show that the kNN-FRFCM algorithm performs better than the other algorithms in most cases. This is mainly because it reduces the uncertainty of the system by reprocessing the objects in the boundary regions. From the clustering results, we can also observe that clustering based on the fuzzy covering is mostly better than clustering on the raw data. Therefore, the valid fuzzy covering can replace the raw dataset for clustering, with better results. The premise for replacing the raw dataset by the fuzzy covering is the selection of an appropriate fuzzy similarity relation [46].

Algorithm 1: Boundary region processing with kNN.
Input: a set of objects U = {x_1, x_2, ..., x_N}, the cluster centers V = {v_1, v_2, ..., v_C}, the positive region POS = ∪_{i=1}^C POS(Q_i), the boundary region BND = ∪_{i=1}^C BND(Q_i), and the optimal value of k.
Output: the updated positive region POS(X) and boundary region BND(X).
Step 1: calculate the distance between x_i and the other objects, where x_i ∈ BND;
Step 2: find the regions in which the k nearest objects are located;
Step 3: let n_{Q_i} be the number of the k objects lying in the positive region of cluster Q_i (i = 1, 2, ..., C) and n_{Q_{C+1}} the number lying in the boundary region, so that n_{Q_1} + n_{Q_2} + ... + n_{Q_C} + n_{Q_{C+1}} = k. If there is only one cluster Q_j such that n_{Q_j} is the maximum, then x_i is moved to the positive region of Q_j; otherwise, x_i remains in the boundary region.

Algorithm 2: The kNN-FRFCM algorithm.
Input: the fuzzy covering [x_j] (j = 1, 2, ..., N), the cluster centers v_i (i = 1, 2, ..., C), and the initial fuzzy membership degrees μ_ij (i = 1, 2, ..., C, j = 1, 2, ..., N).
Output: the positive, boundary, and negative regions of each cluster.
Step 1: compute the optimal partition thresholds α_iopt and β_iopt for each cluster Q_i using formula (13);
Step 2: according to formula (14), determine the positive region POS(Q_i), boundary region BND(Q_i), and negative region NEG(Q_i) of each cluster Q_i from α_iopt, β_iopt, and the fuzzy partition matrix (μ_ij)_{C×N};
Step 3: update each clustering region by Algorithm 1;
Step 4: update the membership partition matrix (μ_ij)_{C×N} by formula (6);
Step 5: update the cluster centers v_i (i = 1, 2, ..., C) with formula (8);
Step 6: repeat Step 1 to Step 5 until convergence is reached;
Step 7: replace the results of the fuzzy covering clustering with the corresponding objects of the universe.

Table 1: Details of the six datasets.

No. | Dataset     | Objects | Attributes | Classes
1   | Iris        | 150     | 4          | 3
2   | BCWO        | 683     | 10         | 2
3   | New thyroid | 215     | 5          | 3
4   | Seeds       | 210     | 7          | 3
5   | FTM         | 326     | 27         | 4
6   | CT          | 221     | 36         | 2

Conclusions
In this paper, a valid fuzzy covering of the raw dataset is constructed according to several principles. Because the similarity between fuzzy similarity classes in the valid fuzzy covering can be used to measure the similarity between objects in the raw dataset, and each fuzzy similarity class reflects the connection of an object with the whole dataset, clustering the valid fuzzy covering instead of the raw data can improve the precision of clustering. From the perspective of the semantic explanation of uncertainty change in fuzzy sets, we investigated a method that combines linear and nonlinear fuzzy entropy to obtain the decision threshold pairs. The advantage of this threshold calculation method is that it objectively obtains the classification thresholds from the intrinsic relations of the objects, its formula is simple and easy to understand, and it avoids inappropriate subjective assignment. Additionally, the objects in the boundary region obtained by the FRFCM algorithm are reprocessed by the kNN algorithm to reduce the uncertainty of the system. In future work, we will continue to investigate threshold acquisition and boundary region processing methods for three-way clustering following the ideas of this paper. Three-way clustering in incremental information systems is another future research direction.
Data Availability

The experimental data supporting the findings of this study are available at the website provided in this article.

Conflicts of Interest
The author declares that there are no conflicts of interest.