FCM-Type Fuzzy Coclustering for Three-Mode Cooccurrence Data: 3FCCM and 3Fuzzy CoDoK

Cocluster structure analysis is a basic technique for revealing intrinsic structural information from cooccurrence data among objects and items, in which coclusters are composed of mutually familiar pairs of objects and items. In many real applications, we have not only cooccurrence information among objects and items but also intrinsic relations among items and other ingredients. For example, in food preference analysis, users' preferences on foods should be found by considering not only user-food cooccurrences but also the implicit relation among users and cooking ingredients. In this paper, two FCM-type fuzzy coclustering models, FCCM and Fuzzy CoDoK, are extended for revealing intrinsic cocluster structures from three-mode cooccurrence data, where the aggregation degree of three elements in each cocluster is maximized through iterative updating of three types of fuzzy memberships for objects, items, and ingredients. The characteristic features of the proposed methods are demonstrated through a numerical experiment.


Introduction
In many web data analyses, we often have cooccurrence information among objects and items instead of multidimensional observations on objects. For example, web document summarization and web market purchase summarization are reduced to document-keyword cooccurrence analysis and customer-product basket analysis, respectively. FCM-type fuzzy coclustering is an extension of fuzzy $c$-Means (FCM) [1], where the degree of belongingness to clusters is represented by fuzzy memberships under the fuzzy partition concept [2]. Fuzzy clustering for categorical multivariate data (FCCM) [3] replaced the FCM clustering criterion with the aggregation degree of objects and items in coclusters by adopting entropy-based fuzzification [4,5]. In fuzzy coclustering of documents and keywords (Fuzzy CoDoK) [6], the FCCM criterion was maximized with quadratic regularization-based fuzzification [7], so that it can be applied to large data sets.
Despite their usefulness in many applications, the conventional fuzzy coclustering models may not work well when the data are strongly influenced by other intrinsic features. For example, in food preference analysis, users' preferences on foods cannot be revealed by considering only user-food cooccurrences but should be found by considering the implicit relation among users and the cooking ingredients that compose the foods. Thus, when we have not only cooccurrence information among objects and items but also intrinsic relations among items and other ingredients, we can expect to find more useful cocluster structures in three-mode cooccurrence information data.
In this paper, two FCM-type fuzzy coclustering models are extended for analyzing three-mode cooccurrence information data, in which FCM-like alternative optimization schemes are performed considering the cooccurrence relation among objects, items, and other ingredients. First, the FCCM algorithm is extended to the three-mode FCCM (3FCCM) algorithm by utilizing three types of fuzzy memberships for objects, items, and ingredients, where the aggregation degree of three features in each cocluster is maximized through iterative updating of memberships supported by the entropy-based fuzzification. Second, the 3FCCM algorithm is further extended to the three-mode Fuzzy CoDoK (3Fuzzy CoDoK) by introducing the quadratic regularization-based fuzzification. The characteristic features of the proposed methods are demonstrated through a numerical experiment.
The remainder of this paper is organized as follows: Section 2 gives a brief review of the conventional FCM-type fuzzy coclustering models and Section 3 proposes the novel extensions of the FCM-type coclustering models for three-mode cooccurrence information data. The experimental result is shown in Section 4 and a summary conclusion is presented in Section 5.

FCM-Type Fuzzy Coclustering
Fuzzy $c$-Means (FCM) [1,5] is a fuzzy extension of the conventional crisp $k$-Means [8] that introduces the fuzzy partition concept [2]. When we have multidimensional observations on $n$ objects $\boldsymbol{x}_i$, $i = 1, \ldots, n$, they are partitioned into $C$ fuzzy clusters by estimating fuzzy memberships $u_{ci}$ for each object, where $u_{ci}$ represents the degree of belongingness of object $i$ to cluster $c$ and is generally calculated under the probabilistic constraint $\sum_{c=1}^{C} u_{ci} = 1$. In FCM, each cluster is represented by a prototypical centroid and objects are partitioned so that the membership-weighted within-cluster errors from prototypes are minimized in the multidimensional data space. On the other hand, in the coclustering context, we have only relational information among elements and do not use any cluster prototypes in multidimensional space. In this paper, two variants of FCM-type fuzzy coclustering are considered.

FCCM.
Assume that we have $n \times m$ cooccurrence information $R = \{r_{ij}\}$ among $n$ objects and $m$ items; for example, in document-keyword analysis, $r_{ij}$ can be the frequency of keyword (item) $j$ in document (object) $i$. The goal is to extract coclusters composed of mutually familiar pairs of objects and items by simultaneously estimating fuzzy memberships of objects $u_{ci}$ and items $w_{cj}$ such that mutually familiar objects and items with large $r_{ij}$ tend to have large memberships in the same cluster, considering the aggregation degree of each cocluster. The sum of aggregation degrees to be maximized is defined as [3]

$$L_{agg} = \sum_{c=1}^{C} \sum_{i=1}^{n} \sum_{j=1}^{m} u_{ci} w_{cj} r_{ij}. \tag{1}$$

This objective function is based on a concept similar to such relational matrix decomposition methods as correspondence analysis (CA) [9] and nonnegative matrix factorization (NMF) [10], where relational matrices $R = \{r_{ij}\}$ are decomposed into two component matrices having orthogonal columns. Whereas both objects and items are equally forced to be exclusive in the matrix decomposition methods, FCM-type coclustering models adopt different kinds of partition constraints [11]. Here, object memberships $u_{ci}$ have a similar role to those of FCM under the same condition, such that $\sum_{c=1}^{C} u_{ci} = 1$. If item memberships $w_{cj}$ also obeyed a similar condition of $\sum_{c=1}^{C} w_{cj} = 1$, the aggregation criterion would have a trivial maximum of $u_{ci} = w_{cj} = 1$, $\forall i, j$ in a particular cluster $c$. Then, in order to avoid trivial solutions, $w_{cj}$ are forced to be exclusive in each cluster, such that $\sum_{j=1}^{m} w_{cj} = 1$, and so $w_{cj}$ represent the relative typicalities of items in each cluster. As a result, object partitioning is mainly targeted in FCM-type coclustering, while CA and NMF equally force an exclusive nature on the partitions of both objects and items.
Because of the linear nature with respect to $u_{ci}$ and $w_{cj}$, (1) is maximized with crisp memberships $u_{ci} \in \{0, 1\}$ and $w_{cj} \in \{0, 1\}$ in a similar manner to $k$-Means. In order to find a fuzzy partition, some fuzzification mechanism must be introduced as in FCM.
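In [3], the linear aggregation criterion of (1) was nonlinearized with respect to $u_{ci}$ and $w_{cj}$ by entropy-based penalties [4,5] for fuzzification of the two types of memberships, and the objective function for Fuzzy Clustering for Categorical Multivariate data (FCCM) was proposed as

$$L_{fccm} = \sum_{c=1}^{C} \sum_{i=1}^{n} \sum_{j=1}^{m} u_{ci} w_{cj} r_{ij} - \lambda_u \sum_{c=1}^{C} \sum_{i=1}^{n} u_{ci} \log u_{ci} - \lambda_w \sum_{c=1}^{C} \sum_{j=1}^{m} w_{cj} \log w_{cj}, \tag{2}$$

where $\lambda_u$ and $\lambda_w$ are the fuzzification weights for object memberships and item memberships, respectively. Larger $\lambda_u$ and $\lambda_w$ bring fuzzier partitions of objects and items.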
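Based on the alternative optimization principle, $u_{ci}$ and $w_{cj}$ are iteratively updated until convergence using the following updating rules:

$$u_{ci} = \frac{\exp\left(\lambda_u^{-1} \sum_{j=1}^{m} w_{cj} r_{ij}\right)}{\sum_{l=1}^{C} \exp\left(\lambda_u^{-1} \sum_{j=1}^{m} w_{lj} r_{ij}\right)}, \qquad w_{cj} = \frac{\exp\left(\lambda_w^{-1} \sum_{i=1}^{n} u_{ci} r_{ij}\right)}{\sum_{l=1}^{m} \exp\left(\lambda_w^{-1} \sum_{i=1}^{n} u_{ci} r_{il}\right)}. \tag{3}$$

Although the two updating rules always yield memberships that satisfy the constraints, they can be numerically unstable due to overflows because the $\exp(\cdot)$ function can take extremely large values with very large $n$ or $m$.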
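The two entropy-based updating rules can be sketched in NumPy as follows. This is a minimal illustration under assumed conventions rather than the authors' implementation: `R` is the objects-by-items cooccurrence matrix, `U` (clusters by objects) and `W` (clusters by items) hold the memberships, `lam_u` and `lam_w` are the fuzzification weights, and the function names are hypothetical. Subtracting the largest exponent before calling `exp` is a standard softmax stabilization that is not part of the original formulation; it leaves the ratios unchanged while avoiding the overflow mentioned above.

```python
import numpy as np

def softmax(a, axis):
    # Shifting by the maximum keeps the ratios intact but prevents overflow in exp.
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def fccm_step(R, U, lam_u, lam_w):
    """One alternating FCCM update (assumed shapes: R (n, m), U (C, n))."""
    # Item memberships: exponent is sum_i u_ci r_ij, normalized over items j.
    W = softmax((U @ R) / lam_w, axis=1)
    # Object memberships: exponent is sum_j w_cj r_ij, normalized over clusters c.
    U = softmax((W @ R.T) / lam_u, axis=0)
    return U, W

# Toy data, purely illustrative.
rng = np.random.default_rng(0)
R = rng.integers(0, 2, size=(6, 8)).astype(float)
U = rng.random((3, 6))
U /= U.sum(axis=0, keepdims=True)
for _ in range(20):
    U, W = fccm_step(R, U, lam_u=0.5, lam_w=0.5)
```

With this layout, each column of `U` sums to one (probabilistic constraint on objects) and each row of `W` sums to one (typicality constraint on items).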

Fuzzy CoDoK.
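As an alternative approach, Kummamuru et al. [6] extended FCCM by introducing the quadratic term-based fuzzification mechanism [7] instead of the entropy-based fuzzification, so that it can handle larger data sets. The objective function of fuzzy coclustering of documents and keywords (Fuzzy CoDoK) was proposed as

$$L_{codok} = \sum_{c=1}^{C} \sum_{i=1}^{n} \sum_{j=1}^{m} u_{ci} w_{cj} r_{ij} - \lambda_u \sum_{c=1}^{C} \sum_{i=1}^{n} u_{ci}^2 - \lambda_w \sum_{c=1}^{C} \sum_{j=1}^{m} w_{cj}^2, \tag{4}$$

where $\lambda_u$ and $\lambda_w$ play similar roles to those in FCCM.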

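Based on the Lagrangian multiplier method, the updating rules are obtained as

$$u_{ci} = \frac{1}{C} + \frac{1}{2\lambda_u} \left( \sum_{j=1}^{m} w_{cj} r_{ij} - \frac{1}{C} \sum_{l=1}^{C} \sum_{j=1}^{m} w_{lj} r_{ij} \right), \qquad w_{cj} = \frac{1}{m} + \frac{1}{2\lambda_w} \left( \sum_{i=1}^{n} u_{ci} r_{ij} - \frac{1}{m} \sum_{l=1}^{m} \sum_{i=1}^{n} u_{ci} r_{il} \right). \tag{5}$$

These updating rules are more numerically stable than those of FCCM because their values grow only linearly with respect to $n$ and $m$. However, $u_{ci}$ and $w_{cj}$ can be negative, in which case they are not valid memberships even though the sum-to-one constraints hold. Then, in practice, the negative memberships are set to zero, and the remaining positive memberships are renormalized so that their sum is one.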
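The quadratic-regularization updates together with the practical handling of negative memberships might look like the following NumPy sketch. It assumes the penalties enter the objective as `lam_u * sum(u**2)` and `lam_w * sum(w**2)`, which is what produces the factor `2 * lam` below; the array layout (`U` as clusters by objects, `W` as clusters by items) and the function names are assumptions for illustration only.

```python
import numpy as np

def clip_and_renormalize(M, axis):
    # Negative memberships are set to zero; the rest are rescaled to sum to one.
    # The raw values sum to one along `axis`, so at least one entry stays positive.
    M = np.clip(M, 0.0, None)
    return M / M.sum(axis=axis, keepdims=True)

def codok_step(R, U, lam_u, lam_w):
    """One alternating Fuzzy CoDoK update (assumed shapes: R (n, m), U (C, n))."""
    C = U.shape[0]
    m = R.shape[1]
    A = U @ R                                   # (C, m): sum_i u_ci r_ij
    W = 1.0 / m + (A - A.mean(axis=1, keepdims=True)) / (2.0 * lam_w)
    W = clip_and_renormalize(W, axis=1)         # each cluster's items sum to one
    B = W @ R.T                                 # (C, n): sum_j w_cj r_ij
    U = 1.0 / C + (B - B.mean(axis=0, keepdims=True)) / (2.0 * lam_u)
    U = clip_and_renormalize(U, axis=0)         # each object's clusters sum to one
    return U, W

# Toy data, purely illustrative.
rng = np.random.default_rng(0)
R = rng.integers(0, 2, size=(6, 8)).astype(float)
U = rng.random((3, 6))
U /= U.sum(axis=0, keepdims=True)
for _ in range(20):
    U, W = codok_step(R, U, lam_u=0.5, lam_w=5.0)
```

No exponential is evaluated, so the arithmetic stays within the range of the cooccurrence sums, at the cost of the clipping step, which tends to produce exact zeros.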
Despite the usefulness of these fuzzy coclustering models in handling two-mode cooccurrence information, their cocluster structures may be influenced by other third elements. Specifically, if each item is related to some other ingredients, the partition quality is expected to be improved by considering the intrinsic relation among three-mode elements. In the following section, the FCM-type coclustering algorithms are extended for analyzing such three-mode cooccurrence information data.

Extension of FCM-Type Coclustering for Three-Mode Cooccurrence Data Analysis
Assume that we have $n \times m$ cooccurrence information $R = \{r_{ij}\}$ among $n$ objects and $m$ items, and the items are characterized by other ingredients, where cooccurrence information among $m$ items and $p$ other ingredients is summarized in an $m \times p$ matrix $S = \{s_{jk}\}$ with $s_{jk}$ representing the cooccurrence degree of item $j$ and ingredient $k$. For example, in food preference analysis, $R$ can be an evaluation matrix by $n$ users on $m$ foods and $S$ may be the presence/absence of $p$ cooking ingredients in $m$ foods. The goal of three-mode cocluster analysis is to reveal the cocluster structures among the objects, items, and ingredients considering $R$ and $S$ and the intrinsic relation among objects and ingredients.
In order to extend the conventional FCCM and Fuzzy CoDoK algorithms to three-mode cocluster analysis, additional memberships $z_{ck}$ are introduced for representing the membership degree of ingredient $k$ to cocluster $c$. Besides the familiar pairs of objects and items occurring simultaneously in the same cluster, typical ingredients of the items should also belong to the same cluster. Then, the aggregation degree to be maximized in the three-mode coclustering can be written as

$$L_{agg3} = \sum_{c=1}^{C} \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{p} u_{ci} w_{cj} z_{ck} r_{ij} s_{jk}, \tag{6}$$

where each cluster should be composed of a familiar group of objects, items, and ingredients such that they are assigned to the same cluster when object $i$ cooccurs with item $j$ composed of ingredient $k$, implying an intrinsic connection between object $i$ and ingredient $k$.
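As an illustration, the three-mode aggregation degree can be evaluated without materializing a four-way sum. The sketch below assumes the array names `U` (clusters by objects), `W` (clusters by items), `Z` (clusters by ingredients), `R` (objects by items), and `S` (items by ingredients); these names and the function name are hypothetical. It relies on the observation that, for a fixed cluster and item, the sums over objects and over ingredients factorize.

```python
import numpy as np

def aggregation_degree(U, W, Z, R, S):
    """Sum over c, i, j, k of u_ci * w_cj * z_ck * r_ij * s_jk."""
    object_side = U @ R          # (C, m): sum_i u_ci r_ij
    ingredient_side = Z @ S.T    # (C, m): sum_k z_ck s_jk
    return float(np.sum(object_side * W * ingredient_side))

# Toy data, purely illustrative.
rng = np.random.default_rng(1)
C, n, m, p = 2, 3, 4, 5
U = rng.random((C, n))
W = rng.random((C, m))
Z = rng.random((C, p))
R = rng.random((n, m))
S = rng.random((m, p))

value = aggregation_degree(U, W, Z, R, S)
# The same quantity written as a single Einstein summation, for cross-checking.
reference = np.einsum("ci,cj,ck,ij,jk->", U, W, Z, R, S)
```

The factored form costs on the order of C(nm + mp) multiplications, which matters when the criterion is monitored at every iteration.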
In the following parts of this section, the conventional FCCM and Fuzzy CoDoK algorithms are extended to their three-mode versions utilizing the above aggregation criterion.
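Three-Mode Extension of FCCM.
First, the FCCM algorithm is extended by using the modified aggregation criterion of (6) supported by the entropy-based fuzzification scheme. The objective function for three-mode FCCM (3FCCM) is constructed by modifying the FCCM objective function of (2) as

$$L_{3fccm} = \sum_{c=1}^{C} \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{p} u_{ci} w_{cj} z_{ck} r_{ij} s_{jk} - \lambda_u \sum_{c=1}^{C} \sum_{i=1}^{n} u_{ci} \log u_{ci} - \lambda_w \sum_{c=1}^{C} \sum_{j=1}^{m} w_{cj} \log w_{cj} - \lambda_z \sum_{c=1}^{C} \sum_{k=1}^{p} z_{ck} \log z_{ck}, \tag{7}$$

where $\lambda_z$ is the additional penalty weight for fuzzification of ingredient memberships $z_{ck}$. The larger the value of $\lambda_z$ is, the fuzzier the ingredient memberships are.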
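Here, it should be noted that we can adopt two different types of constraints on ingredient memberships $z_{ck}$: the object-type probabilistic constraint $\sum_{c=1}^{C} z_{ck} = 1$, $\forall k$, or the item-type typicality constraint $\sum_{k=1}^{p} z_{ck} = 1$, $\forall c$. In such cases as food preference analysis, some common ingredients may be widely used in many foods while other rare ingredients can be negligible in all clusters. Then, from the viewpoint of typical ingredient selection for characterizing cocluster features, the item-type typicality constraint is adopted in this paper, such that $\sum_{k=1}^{p} z_{ck} = 1$, $\forall c$. The clustering algorithm is an iterative process of updating $u_{ci}$, $w_{cj}$, and $z_{ck}$ under the alternative optimization principle. Considering the necessary conditions for optimality, $\partial L_{3fccm} / \partial u_{ci} = 0$, $\partial L_{3fccm} / \partial w_{cj} = 0$, and $\partial L_{3fccm} / \partial z_{ck} = 0$, under the sum-to-one constraints, the updating rules for the three memberships are given as

$$u_{ci} = \frac{\exp\left(\lambda_u^{-1} \sum_{j=1}^{m} \sum_{k=1}^{p} w_{cj} z_{ck} r_{ij} s_{jk}\right)}{\sum_{l=1}^{C} \exp\left(\lambda_u^{-1} \sum_{j=1}^{m} \sum_{k=1}^{p} w_{lj} z_{lk} r_{ij} s_{jk}\right)}, \tag{8}$$

$$w_{cj} = \frac{\exp\left(\lambda_w^{-1} \sum_{i=1}^{n} \sum_{k=1}^{p} u_{ci} z_{ck} r_{ij} s_{jk}\right)}{\sum_{l=1}^{m} \exp\left(\lambda_w^{-1} \sum_{i=1}^{n} \sum_{k=1}^{p} u_{ci} z_{ck} r_{il} s_{lk}\right)}, \tag{9}$$

$$z_{ck} = \frac{\exp\left(\lambda_z^{-1} \sum_{i=1}^{n} \sum_{j=1}^{m} u_{ci} w_{cj} r_{ij} s_{jk}\right)}{\sum_{l=1}^{p} \exp\left(\lambda_z^{-1} \sum_{i=1}^{n} \sum_{j=1}^{m} u_{ci} w_{cj} r_{ij} s_{jl}\right)}. \tag{10}$$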
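For readers who want the intermediate step, the ingredient-membership rule can be derived as sketched below. The notation is assumed for illustration: $u_{ci}$, $w_{cj}$, and $z_{ck}$ denote object, item, and ingredient memberships, $r_{ij}$ and $s_{jk}$ the cooccurrence entries, $\lambda_z$ the ingredient fuzzification weight, and $\gamma_c$ a Lagrange multiplier introduced here for the typicality constraint $\sum_k z_{ck} = 1$. The other two rules follow in the same way with their own constraints.

```latex
% Sketch of the stationarity argument for z_{ck}; gamma_c is an auxiliary multiplier.
\begin{align*}
\mathcal{L} &= \sum_{c}\sum_{i}\sum_{j}\sum_{k} u_{ci} w_{cj} z_{ck} r_{ij} s_{jk}
  - \lambda_z \sum_{c}\sum_{k} z_{ck}\log z_{ck}
  - \sum_{c}\gamma_c\Bigl(\sum_{k} z_{ck} - 1\Bigr)
  + (\text{terms without } z_{ck}) \\
\frac{\partial \mathcal{L}}{\partial z_{ck}} &=
  \sum_{i}\sum_{j} u_{ci} w_{cj} r_{ij} s_{jk}
  - \lambda_z\bigl(\log z_{ck} + 1\bigr) - \gamma_c = 0 \\
z_{ck} &= \exp\Bigl(\lambda_z^{-1}\sum_{i}\sum_{j} u_{ci} w_{cj} r_{ij} s_{jk}\Bigr)
  \exp\bigl(-1 - \gamma_c/\lambda_z\bigr) \\
z_{ck} &= \frac{\exp\bigl(\lambda_z^{-1}\sum_{i}\sum_{j} u_{ci} w_{cj} r_{ij} s_{jk}\bigr)}
  {\sum_{l}\exp\bigl(\lambda_z^{-1}\sum_{i}\sum_{j} u_{ci} w_{cj} r_{ij} s_{jl}\bigr)}
\end{align*}
```

The last line follows by summing the third line over the ingredients of cluster $c$ and using the constraint, which eliminates the factor that depends on $\gamma_c$.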

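Three-Mode Extension of Fuzzy CoDoK.
Next, Fuzzy CoDoK is extended to the three-mode coclustering model named three-mode Fuzzy CoDoK (3Fuzzy CoDoK). The objective function of (4) is modified as

$$L_{3codok} = \sum_{c=1}^{C} \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{p} u_{ci} w_{cj} z_{ck} r_{ij} s_{jk} - \lambda_u \sum_{c=1}^{C} \sum_{i=1}^{n} u_{ci}^2 - \lambda_w \sum_{c=1}^{C} \sum_{j=1}^{m} w_{cj}^2 - \lambda_z \sum_{c=1}^{C} \sum_{k=1}^{p} z_{ck}^2, \tag{11}$$

where $\lambda_z$ plays a similar role to that in 3FCCM and the three types of fuzzy memberships follow the same constraints as in 3FCCM.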
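The updating rules are given in a similar manner to the previous section as follows:

$$u_{ci} = \frac{1}{C} + \frac{1}{2\lambda_u} \left( \sum_{j=1}^{m} \sum_{k=1}^{p} w_{cj} z_{ck} r_{ij} s_{jk} - \frac{1}{C} \sum_{l=1}^{C} \sum_{j=1}^{m} \sum_{k=1}^{p} w_{lj} z_{lk} r_{ij} s_{jk} \right), \tag{12}$$

$$w_{cj} = \frac{1}{m} + \frac{1}{2\lambda_w} \left( \sum_{i=1}^{n} \sum_{k=1}^{p} u_{ci} z_{ck} r_{ij} s_{jk} - \frac{1}{m} \sum_{l=1}^{m} \sum_{i=1}^{n} \sum_{k=1}^{p} u_{ci} z_{ck} r_{il} s_{lk} \right), \tag{13}$$

$$z_{ck} = \frac{1}{p} + \frac{1}{2\lambda_z} \left( \sum_{i=1}^{n} \sum_{j=1}^{m} u_{ci} w_{cj} r_{ij} s_{jk} - \frac{1}{p} \sum_{l=1}^{p} \sum_{i=1}^{n} \sum_{j=1}^{m} u_{ci} w_{cj} r_{ij} s_{jl} \right). \tag{14}$$

As with Fuzzy CoDoK, the above updating rules are computationally more stable than those of 3FCCM because they do not involve the $\exp(\cdot)$ function. However, $u_{ci}$, $w_{cj}$, and $z_{ck}$ can be negative. Then, in practice, the negative memberships should be set to zero, and the remaining positive memberships can be renormalized so that their sum is one.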
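One sweep of the three quadratic-regularization updates, including the zeroing and renormalization of negative memberships, could be sketched as follows. The sketch assumes the penalties are `lam * sum(membership**2)`, which gives the `2 * lam` divisor, and uses hypothetical names: `U`, `W`, `Z` are clusters-by-objects, clusters-by-items, and clusters-by-ingredients arrays, while `R` and `S` are the two cooccurrence matrices. The order in which the three memberships are refreshed is also an assumption.

```python
import numpy as np

def quad_update(score, lam, axis):
    """Compute 1/size + (score - mean) / (2 * lam) along `axis`, then clip and renormalize."""
    size = score.shape[axis]
    raw = 1.0 / size + (score - score.mean(axis=axis, keepdims=True)) / (2.0 * lam)
    raw = np.clip(raw, 0.0, None)                  # negative memberships set to zero
    return raw / raw.sum(axis=axis, keepdims=True)

def codok3_step(R, S, U, W, Z, lam_u, lam_w, lam_z):
    """One alternating sweep (assumed shapes: R (n, m), S (m, p))."""
    # Objects: score is sum_j sum_k w_cj z_ck r_ij s_jk, normalized over clusters.
    U = quad_update((W * (Z @ S.T)) @ R.T, lam_u, axis=0)
    # Items: score is sum_i sum_k u_ci z_ck r_ij s_jk, normalized over items.
    W = quad_update((U @ R) * (Z @ S.T), lam_w, axis=1)
    # Ingredients: score is sum_i sum_j u_ci w_cj r_ij s_jk, normalized over ingredients.
    Z = quad_update(((U @ R) * W) @ S, lam_z, axis=1)
    return U, W, Z

# Toy data, purely illustrative.
rng = np.random.default_rng(0)
n, m, p, C = 8, 10, 6, 3
R = (rng.random((n, m)) < 0.4).astype(float)
S = (rng.random((m, p)) < 0.4).astype(float)
U = rng.random((C, n))
U /= U.sum(axis=0, keepdims=True)
W = np.full((C, m), 1.0 / m)
Z = np.full((C, p), 1.0 / p)
for _ in range(30):
    U, W, Z = codok3_step(R, S, U, W, Z, lam_u=0.5, lam_w=5.0, lam_z=5.0)
```

Because the unclipped values already sum to one along the normalized axis, at least one entry remains positive after clipping, so the renormalization never divides by zero.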

A Sample Algorithm for FCM-Type Fuzzy Coclustering for Three-Mode Cooccurrence Data.
Following the above derivation, a sample algorithm is represented as follows:
[FCM-Type Fuzzy Coclustering for Three-Mode Cooccurrence Data: 3FCCM and 3Fuzzy CoDoK]
(1) Given the $n \times m$ cooccurrence matrix $R$ and the $m \times p$ cooccurrence matrix $S$, let $C$ be the number of clusters. Choose the fuzzification weights $\lambda_u$, $\lambda_w$, and $\lambda_z$.
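A complete alternating loop for the entropy-based three-mode variant might be organized as in the sketch below. Everything beyond the first step is an assumption made for illustration: the random initialization of object memberships, the uniform initialization of ingredient memberships, the update order, the stopping rule on the change of `U`, and all function and parameter names are hypothetical choices rather than the procedure specified by the authors.

```python
import numpy as np

def softmax(a, axis):
    # Shift by the maximum for numerical stability; the normalized result is unchanged.
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def fit_3fccm(R, S, C, lam_u, lam_w, lam_z, max_iter=100, tol=1e-6, seed=0):
    """Alternating optimization sketch (assumed shapes: R (n, m), S (m, p))."""
    rng = np.random.default_rng(seed)
    n = R.shape[0]
    p = S.shape[1]
    U = rng.random((C, n))
    U /= U.sum(axis=0, keepdims=True)      # random probabilistic object memberships
    Z = np.full((C, p), 1.0 / p)           # uniform ingredient typicalities
    for _ in range(max_iter):
        U_old = U
        UR = U @ R                                             # (C, m)
        W = softmax(UR * (Z @ S.T) / lam_w, axis=1)            # item update
        Z = softmax(((UR * W) @ S) / lam_z, axis=1)            # ingredient update
        U = softmax(((W * (Z @ S.T)) @ R.T) / lam_u, axis=0)   # object update
        if np.abs(U - U_old).max() < tol:                      # assumed stopping rule
            break
    return U, W, Z

# Toy data, purely illustrative.
rng = np.random.default_rng(0)
R = (rng.random((12, 15)) < 0.3).astype(float)
S = (rng.random((15, 9)) < 0.3).astype(float)
U, W, Z = fit_3fccm(R, S, C=3, lam_u=0.1, lam_w=0.2, lam_z=0.3)
labels = U.argmax(axis=0)   # crisp assignment of each object to its largest membership
```

Since the result depends on the random start, running the function with several seeds and comparing the resulting partitions is a natural way to use it.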

Experimental Results
Experimental Design.
In order to demonstrate the characteristics of the proposed algorithms, a numerical experiment was performed with an artificially generated three-mode data set, in which 40 objects ($n = 40$) have relational connections with 50 items ($m = 50$) and the items are related to 30 ingredients ($p = 30$). The artificial three-mode cooccurrence matrices were generated under the assumption that objects and ingredients have intrinsic (unknown) connections, as shown in the $40 \times 30$ matrix $X = \{x_{ik}\}$ of Figure 1(a), where black and white cells represent full-connection ($x_{ik} = 1$) and no-connection ($x_{ik} = 0$), respectively. (Note that all the following gray-scale figures depict visual images of matrices, where black and white cells represent maximum and minimum values.) A $50 \times 30$ cooccurrence matrix $S = \{s_{jk}\}$ among items and ingredients was constructed, as shown in Figure 1(b), where ingredient $k$ randomly occurred ($s_{jk} = 1$) in each item with 10% probability, whereas the others remained as $s_{jk} = 0$. Then, a $40 \times 50$ cooccurrence matrix $R = \{r_{ij}\}$ among objects and items was generated, such that $r_{ij} = 1$ if item $j$ cooccurs with several ingredients that are connected with object $i$ in $X$, and $r_{ij} = 0$ otherwise. Figure 1(c) shows the cooccurrence matrix $R$. For example, in food preference analysis, 50 foods are made from 30 cooking ingredients (matrix $S$) and each of 40 users chooses some foods (matrix $R$) considering their intrinsic preferences on cooking ingredients (matrix $X$).
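A data set of this kind can be generated along the following lines. The sketch is not the authors' generator: the block pattern chosen for the hidden object-ingredient connections is a stand-in for the pattern of Figure 1(a), and the threshold `min_shared = 2` is one hypothetical reading of "several"; the array names `X`, `S`, and `R` are likewise assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, p = 40, 50, 30   # objects, items, ingredients

# Hidden object-ingredient connections (assumed block pattern, three groups of objects).
X = np.zeros((n, p))
X[0:13, 0:5] = 1.0
X[13:26, 5:10] = 1.0
X[26:40, 10:15] = 1.0

# Each ingredient occurs in each item independently with probability 0.1.
S = (rng.random((m, p)) < 0.10).astype(float)

# An object cooccurs with an item when the item contains at least `min_shared`
# ingredients connected to that object (hypothetical threshold).
min_shared = 2
shared = X @ S.T                      # (n, m): count of connected ingredients per item
R = (shared >= min_shared).astype(float)
```

Only `R` and `S` would be passed to a coclustering routine; `X` is kept aside for checking whether the recovered ingredient memberships match the hidden pattern.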
The goal of this experiment is to extract the intrinsic cocluster structure among objects, items, and ingredients from cooccurrence matrices $R$ and $S$ without utilizing the intrinsic (unknown) connection $X$ among objects and ingredients; that is, $X$ is withheld in the following experiments.

Cocluster Extraction by 3FCCM and 3Fuzzy CoDoK.
First, the proposed 3FCCM and 3Fuzzy CoDoK algorithms were applied to $R$ and $S$ with $C = 3$ and their results are compared. Fuzziness penalties were $\lambda_u = 0.1$, $\lambda_w = 0.2$, and $\lambda_z = 0.3$ for 3FCCM, and $\lambda_u = 0.05$, $\lambda_w = 10.0$, and $\lambda_z = 15.0$ for 3Fuzzy CoDoK. The derived three types of memberships, which were the most frequent solutions in 100 trials with different random initializations, are depicted in the gray-scale figures of Figures 2 and 3, where each row represents the membership degree of objects, items, or ingredients for a cluster.
Figures 2(a) and 2(b) indicate that the 40 objects were successfully partitioned into three clusters by the 3FCCM algorithm, in which some meaningful ingredients, that is, cluster-wise typical ingredients in Figure 1(a), have large memberships for characterizing each cocluster, even though the intrinsic information $X$ was withheld in the experiment. Additionally, some typical items of each cluster were also indicated by large $w_{cj}$, as shown in Figure 2(c); for example, items 4, 5, 9, 14, 25, and 37 are typical in cluster 1.
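Figure 3(a) indicates that the 3Fuzzy CoDoK algorithm also extracted almost the same object clusters as in Figure 2(a), but the ingredient memberships shown in Figure 3(b) have slightly different features from Figure 2(b). Only a few ingredients have very large memberships while many others have exactly zero memberships because of the negativity of (14). Additionally, some meaningless ingredients had nonzero memberships in contrast to the result of 3FCCM. A similar feature can also be seen in Figure 3(c).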
These results imply that the 3FCCM algorithm is more suitable for clearly capturing the intrinsic connections although 3Fuzzy CoDoK has an advantage in computational stability.

Comparison with Conventional Two-Mode Fuzzy Coclustering.
Second, the above clustering results are compared with the conventional FCCM and Fuzzy CoDoK, which are designed only for two-mode cooccurrence information. Although the intrinsic connection $X$ is withheld in this experiment, similar intrinsic information can be reconstructed by multiplying the two cooccurrence matrices $R$ and $S$, such that $R \times S$ gives an $n \times p$ relational matrix on objects and ingredients. Figure 4 shows the estimated $40 \times 30$ intrinsic connection matrix $\tilde{X} = R \times S$.
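The two-mode baseline input is simply a matrix product, as the short sketch below illustrates with assumed array names (`R` for objects by items, `S` for items by ingredients, `X_est` for the product). Each entry counts, over all items, how often an object's items contain a given ingredient, which is why every item contributes with equal weight.

```python
import numpy as np

# Toy binary cooccurrence matrices, purely illustrative.
rng = np.random.default_rng(0)
n, m, p = 5, 7, 4
R = (rng.random((n, m)) < 0.4).astype(float)   # objects x items
S = (rng.random((m, p)) < 0.3).astype(float)   # items x ingredients

# Estimated object-ingredient relation: x_ik = sum_j r_ij * s_jk.
X_est = R @ S

# For binary inputs, entry (i, k) is the number of items chosen by object i
# that contain ingredient k.
i, k = 0, 0
count = sum(1 for j in range(m) if R[i, j] == 1.0 and S[j, k] == 1.0)
```

The resulting `X_est` can then be fed to any two-mode coclustering routine in place of an object-item matrix.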
The conventional FCCM and Fuzzy CoDoK were applied to $\tilde{X}$. Fuzziness penalties were $\lambda_u = 0.05$ and $\lambda_w = 50.0$ for FCCM and $\lambda_u = 0.1$ and $\lambda_w = 100.0$ for Fuzzy CoDoK. Here, item memberships $w_{cj}$ are identified with ingredient memberships $z_{ck}$ in the algorithms. Figures 5 and 6 show the derived memberships, which most frequently appeared in 100 trials with different random initializations. The figures imply that the 40 objects were partitioned into three clusters similar to those of 3FCCM and 3Fuzzy CoDoK. However, ingredient memberships $z_{ck}$ were slightly contaminated and it is hard to intuitively select meaningful ingredients compared with the result of 3FCCM. This may be because all items are embedded into $\tilde{X}$ with equal responsibilities, so that the estimated $\tilde{X} = R \times S$ was contaminated by noise, as shown in Figure 4, in contrast to $X$ of Figure 1(a). In contrast, the typical ingredients can be extracted in 3FCCM by selecting only meaningful items in each cluster.
Next, the robustness of the algorithms against random initialization is studied by comparing the frequencies of the plausible solutions. Table 1 compares the frequencies of the above results in 100 trials with different random initializations and indicates that the proposed three-mode coclustering models are more robust to random initialization than the conventional two-mode ones by utilizing three-mode cooccurrence information. That is, the optimal selections of both items and ingredients contribute to reducing the influence of randomness.
Therefore, the proposed algorithms are useful in analyzing three-mode cooccurrence information because they simultaneously consider the typicality of three elements.

Comparison with Multiple Correspondence Analysis.
Finally, the partition characteristic of the proposed coclustering models is compared with the relational matrix decomposition method. Multiple correspondence analysis (MCA) [9] is a technique for revealing the structural information of categorical data, where mutual relations among objects and multiple categories are summarized into low-dimensional plots. In this experiment, an enlarged cross-tabulation was constructed by combining the two cooccurrence matrices $R$ and $S$ into an $m \times (n + p)$ matrix $[R^{\top}, S]$ so that the three elements are summarized in a single plot. Figure 7 shows the 2D plot given by MCA. Although MCA does not necessarily aim at object-targeting partition, the $n$ objects were clearly separated into three subgroups in a similar manner to the proposed coclustering models because the objects had almost crisp boundaries. However, many other items and ingredients were distributed in the middle area and their contribution to the clusters was not emphasized, as in the case of the two-mode fuzzy coclustering of the previous subsection.
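The enlarged table and a two-dimensional embedding of it can be sketched as follows. The sketch computes a plain correspondence analysis of the stacked table through the singular value decomposition of its standardized residuals, which is one common way such plots are obtained; whether the original experiment used exactly this variant is not stated, and the function name, the array names, and the toy data are assumptions.

```python
import numpy as np

def correspondence_coords(N, dims=2):
    """Principal row and column coordinates of a correspondence analysis of table N."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()
    r = P.sum(axis=1)                       # row masses
    c = P.sum(axis=0)                       # column masses
    if (r == 0).any() or (c == 0).any():
        raise ValueError("every row and column needs a nonzero total")
    expected = np.outer(r, c)
    resid = (P - expected) / np.sqrt(expected)          # standardized residuals
    left, sv, right_t = np.linalg.svd(resid, full_matrices=False)
    rows = (left[:, :dims] * sv[:dims]) / np.sqrt(r)[:, None]
    cols = (right_t.T[:, :dims] * sv[:dims]) / np.sqrt(c)[:, None]
    return rows, cols, c

# Toy binary data, purely illustrative; a few entries are forced to one so that
# no row or column of the stacked table is empty.
rng = np.random.default_rng(0)
n, m, p = 8, 10, 6
R = (rng.random((n, m)) < 0.3).astype(float)
S = (rng.random((m, p)) < 0.3).astype(float)
R[:, 0] = 1.0
S[:, 0] = 1.0
S[0, :] = 1.0

T = np.hstack([R.T, S])                     # m x (n + p): items by (objects, ingredients)
item_xy, col_xy, col_mass = correspondence_coords(T, dims=2)
object_xy, ingredient_xy = col_xy[:n], col_xy[n:]
```

Plotting `object_xy`, `item_xy`, and `ingredient_xy` on shared axes gives a joint map of the three kinds of elements in the spirit of the plot described above.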
The proposed algorithms have advantages in handling three-mode elements by emphasizing their contributions to each cocluster. Additionally, while the implicit fuzziness degree of MCA is fixed (unchangeable), the proposed coclustering models can improve the interpretability of the cluster partition by tuning the fuzziness degrees.

Conclusion
In this paper, novel coclustering models were proposed for analyzing three-mode cooccurrence information with the goal of improving the partition quality of the conventional two-mode analysis. The proposed 3FCCM and 3Fuzzy CoDoK algorithms extended the conventional FCCM and Fuzzy CoDoK algorithms by introducing an additional membership for ingredients into the aggregation degree of three elements: objects, items, and ingredients. A numerical experiment with an artificial data set demonstrated that 3FCCM is more useful in capturing the intrinsic connection among objects and ingredients while 3Fuzzy CoDoK is suitable for handling large data sets with its computational stability.
Despite the simplicity of FCM-type coclustering, FCCM and Fuzzy CoDoK sometimes have difficulty in the tuning of fuzziness degrees. In the conventional two-mode coclustering, an MMMs-induced model [12] showed a better utility than FCCM and Fuzzy CoDoK. A potential future work is to improve the proposed FCM-type three-mode coclustering by introducing a statistical concept for easy tuning of fuzziness degrees. Another direction of future work is to develop a validity measure [13] for selecting the optimal cluster partitions.

Table 1: Comparison of frequencies of plausible solutions in 100 trials.