A Similarity Classifier with Bonferroni Mean Operators

A similarity classifier based on Bonferroni mean based operators is introduced. The new Bonferroni mean based variant of the similarity classifier is also extended to cover a new Bonferroni-OWA variant. The new Bonferroni-OWA based similarity classifier raises the question of how to accomplish the weighting needed, and for this reason we also examine a number of linguistic quantifiers for weight generation. The new proposed similarity classifier variants are tested on four real world medical research related data sets. The results are compared with results from two previously presented similarity classifiers, one based on the generalized mean and another based on an arithmetic mean operator. The results show that comparatively better classification accuracy can be reached with the proposed new similarity classifier variants.


Introduction
In this paper we introduce a new generalization of the similarity classifier that is based on using Bonferroni mean operators in the aggregation of similarities. The Bonferroni mean aggregation operator was introduced in [1] and extended in [2-6]. Currently, research on the Bonferroni mean is increasingly active (see, e.g., [7-10]). The Bonferroni mean operator consists of two parts: each argument of the outer arithmetic mean is the product of one argument and the average of all the remaining arguments; this feature makes it a unique operator in terms of aggregation [2]. The arithmetic mean and the generalized mean are special cases of the Bonferroni mean (see, e.g., [2]), which makes it a flexible and versatile operator. Both the generalized mean and the arithmetic mean have previously been used in similarity classifiers [11].
In this paper we also apply an ordered weighted averaging (OWA) based variant of the Bonferroni mean, the so-called Bonferroni-OWA operator, proposed by Yager [5]. The basic OWA operator has previously been studied in connection with similarity classifiers in [12], but the Bonferroni-OWA operator is applied in this context for the first time. To use the OWA operator, a vector of associated weights is required; here we use linguistic quantifiers to generate these weights. Linguistic quantifiers give a parametrized way of producing weights for the Bonferroni-OWA operator, which adds flexibility but also introduces a need to find a proper parameter value. Good parameter values can be found by, for example, sensitivity analysis. For the interested reader, more on linguistic quantifiers and their applications can be found, for example, in [13-18]. By using different linguistic quantifiers, we show how several new variants of the Bonferroni-OWA based similarity classifier can be created, and we examine these variants. The algorithms examined here have been implemented with the MATLAB6 software, and the new classifiers with their variants are tested on four different medical research data sets.
In the field of medical research, classification is a key concept and the use of classifiers is warranted in many practical problems, such as patient diagnosis and inevitably also the prognosis of various human conditions and pathologies [19].

Preliminaries
Aggregation Operators.
The choice of the aggregation operator used in a similarity classifier is a fundamental issue, as it affects the final classification accuracy of the classifier. Several aggregation operators are available in the existing literature; in this paper we concentrate on averaging type aggregation operators [30]. In what follows, we briefly go through the aggregation operators that we use in our new classifier; the interested reader may find more information on aggregation operators, for example, in [2, 4, 18, 30-38].
One of the most common aggregation operators is the arithmetic mean, of which several generalizations exist, for example, the generalized mean and the ordered weighted average (OWA). The aggregation operator is an important component of a similarity classifier, and in this paper we specifically propose and examine the use of the Bonferroni mean and the Bonferroni-OWA as aggregation operators in order to create new similarity classifiers. The presented new variants of the similarity classifier are compared with previously presented methods that use the generalized mean and the arithmetic mean. Both the generalized mean and the arithmetic mean are special cases of the Bonferroni mean [2]. The generalized mean is defined as follows.
Definition 1. The generalized mean of a vector $x = (x_1, x_2, \ldots, x_n)$ with parameter $t \neq 0$ is defined by
$$M_t(x) = \left(\frac{1}{n}\sum_{i=1}^{n} x_i^{t}\right)^{1/t}. \tag{1}$$
By varying the value of the parameter $t$, several other means can be derived from the generalized mean (e.g., the arithmetic mean when $t = 1$, the harmonic mean when $t = -1$, and the geometric mean when $t \to 0$).
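The special cases listed above can be checked numerically. The following Python sketch is only an illustration: the function name `generalized_mean`, the exponent name `t`, and the sample vector are assumptions made here, and the limit case at zero is handled explicitly through the geometric mean.

```python
import math


def generalized_mean(values, t):
    """Generalized (power) mean of positive values with exponent t.

    t = 1 gives the arithmetic mean, t = -1 the harmonic mean, and
    t = 0 is treated as the limit case, the geometric mean.
    """
    n = len(values)
    if t == 0:
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** t for v in values) / n) ** (1.0 / t)


x = [0.2, 0.5, 0.8]  # illustrative data, not taken from the paper
arithmetic = generalized_mean(x, 1)
harmonic = generalized_mean(x, -1)
geometric = generalized_mean(x, 0)
```

For positive inputs the three values are ordered as harmonic, geometric, arithmetic, which is a convenient sanity check when the exponent is later tuned as a classifier parameter.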
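Another generalization of the arithmetic mean is the ordered weighted averaging (OWA) operator, introduced by Yager in [16]. Later on, several researchers have developed new aggregation operators based on the OWA; for example, see [4, 39, 40]. The OWA operator is also an averaging operator, characterized by a reordering step that allows emphasizing the importance of selected data values. The OWA operator is defined as follows.
Definition 2. An OWA operator of dimension $n$ is a mapping $\mathbb{R}^n \to \mathbb{R}$ with an associated weighting vector $w = (w_1, w_2, \ldots, w_n)$, $w_j \in [0, 1]$, $\sum_{j=1}^{n} w_j = 1$, such that
$$\mathrm{OWA}(x_1, x_2, \ldots, x_n) = \sum_{j=1}^{n} w_j b_j, \tag{2}$$
where $b_j$ is the $j$th largest of the arguments $x_1, x_2, \ldots, x_n$.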
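The reordering step is what separates the OWA from an ordinary weighted mean: weights attach to rank positions, not to particular arguments. A minimal Python sketch, with the function name `owa` and the example values chosen here purely for illustration:

```python
def owa(values, weights):
    """Ordered weighted average: weights apply to the values sorted in descending order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(values, reverse=True)  # the reordering step
    return sum(w * b for w, b in zip(weights, ordered))


x = [0.3, 0.9, 0.6]  # illustrative data
owa_max = owa(x, [1.0, 0.0, 0.0])    # all weight on the largest value
owa_min = owa(x, [0.0, 0.0, 1.0])    # all weight on the smallest value
owa_mean = owa(x, [1 / 3] * 3)       # uniform weights give the arithmetic mean
```

The three weight vectors show the range the operator spans, from the maximum through the arithmetic mean to the minimum.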
As it is our intention to apply the OWA together with the Bonferroni mean, we next present the Bonferroni mean operator and its OWA extension, the so-called Bonferroni-OWA operator, following the work by Yager in [5]. The Bonferroni mean operator was formally introduced in [1] and discussed extensively by other researchers in, for example, [2-5]. Recently, several researchers have successfully utilized the generalized Bonferroni mean in practical problems [41-45]. The Bonferroni mean is defined as follows.
Definition 3. Let $x = (x_1, x_2, \ldots, x_n)$, $x_i \in [0, 1]$ for all $i = 1, 2, \ldots, n$, be a vector with at least one $x_i \neq 0$, and let $p, q \geq 0$ be parameters. The general Bonferroni mean of $x$ is defined by
$$B^{p,q}(x) = \left(\frac{1}{n}\sum_{i=1}^{n} x_i^{p}\left(\frac{1}{n-1}\sum_{j=1,\, j\neq i}^{n} x_j^{q}\right)\right)^{1/(p+q)}. \tag{3}$$
It has been shown that the Bonferroni mean is an averaging operator and that it satisfies the necessary axioms (see [5]). Following (3), the Bonferroni mean operator can be viewed as the $(p+q)$th root of an arithmetic mean in which each argument is the product of $x_i^{p}$ with the arithmetic mean of the remaining $x_j^{q}$; see [2]. Equation (3) has been further modified to include several other means by replacing either the inner or the outer mean. One of the results uses the OWA operator in place of the inner mean and is called the Bonferroni-OWA; for more details see [2, 5]. The Bonferroni-OWA is defined as follows.
Definition 4. Let $x$, $p$, and $q$ be as in Definition 3. The Bonferroni-OWA of $x$ is defined by
$$\mathrm{BON\text{-}OWA}(x) = \left(\frac{1}{n}\sum_{i=1}^{n} x_i^{p}\,\mathrm{OWA}_w(V^{i})\right)^{1/(p+q)}, \tag{4}$$
where $V^{i}$ is the vector of all $x_j^{q}$ with $j \neq i$ and $w$ is an OWA weighting vector of dimension $n-1$ with $w_j \geq 0$ and $\sum_{j=1}^{n-1} w_j = 1$.
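To make the two-level structure concrete, the sketch below implements the Bonferroni mean and a Bonferroni-OWA in plain Python. It is a hedged illustration, not the authors' MATLAB implementation: the function names, the parameter names `p` and `q`, and the example vector are assumptions made here, and `p + q > 0` is assumed so that the outer root is defined.

```python
def bonferroni_mean(x, p, q):
    """Bonferroni mean: outer mean of x_i^p times the inner mean of the other x_j^q."""
    n = len(x)
    total = 0.0
    for i in range(n):
        inner = sum(x[j] ** q for j in range(n) if j != i) / (n - 1)
        total += x[i] ** p * inner
    return (total / n) ** (1.0 / (p + q))


def bonferroni_owa(x, p, q, weights):
    """Bonferroni-OWA: the inner mean is replaced by an OWA with n - 1 weights."""
    n = len(x)
    total = 0.0
    for i in range(n):
        # Remaining arguments raised to q, sorted in descending order for the OWA.
        others = sorted((x[j] ** q for j in range(n) if j != i), reverse=True)
        inner = sum(w * b for w, b in zip(weights, others))
        total += x[i] ** p * inner
    return (total / n) ** (1.0 / (p + q))


x = [0.2, 0.5, 0.8]  # illustrative data
as_arithmetic = bonferroni_mean(x, 1, 0)          # q = 0 reduces to the mean with exponent p
uniform = [1.0 / (len(x) - 1)] * (len(x) - 1)
same_as_bm = bonferroni_owa(x, 1, 1, uniform)     # uniform inner weights recover the Bonferroni mean
```

Setting `q = 0` makes the inner mean equal to one, which is how the arithmetic and generalized means arise as special cases; uniform inner weights make the two functions coincide.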

Linguistic Quantifiers and OWA Weight Generation.
Linguistic quantifiers are quantifiers that use a scale of linguistic expressions to summarize the properties of a class of objects without enumerating them; in this way they offer an imprecise and flexible methodology for the quantification of objects. Ying [47] offers a compact review of the literature on linguistic quantifiers for the interested reader. Yager [15] classified linguistic quantifiers into three main categories: Regular Increasing Monotone (RIM), Regular Decreasing Monotone (RDM), and Regular Unimodal (RUM) quantifiers. Each of these categories can be used as the basis of a weight generation scheme; here we concentrate on RIM quantifiers and apply them. RIM quantifiers were defined by Yager [14] as follows.
Definition 5. A fuzzy subset $Q$ of the unit interval, $Q : [0, 1] \to [0, 1]$, is called a RIM quantifier if $Q(0) = 0$, $Q(1) = 1$, and $Q(r_1) \geq Q(r_2)$ whenever $r_1 > r_2$.
During the ordered weighted aggregation process, terms like most, at least, many, and all are captured by an appropriate linguistic quantifier with parameter $\alpha \in \mathbb{R}$. Following [14, 16], for any RIM quantifier $Q$, weights for the OWA operator are calculated from
$$w_i = Q\!\left(\frac{i}{n}\right) - Q\!\left(\frac{i-1}{n}\right), \quad i = 1, 2, \ldots, n, \tag{5}$$
where $w_i \geq 0$ and $\sum_{i=1}^{n} w_i = 1$. In this paper we consider five different RIM quantifiers; these are the "basic," "polynomial," "quadratic," "exponential," and "trigonometric" RIM quantifiers. In what follows, we denote these with the subscripts 1-5 in the order given above. Next we briefly present each of the five selected RIM quantifiers and show how they can be applied in creating weight generating schemes for the OWA.
(1) The basic linguistic quantifier, $Q_1$, is defined by the equation
$$Q_1(r) = r^{\alpha}, \tag{6}$$
which is associated with the weights $w_i$; by application of (5) and (6) we obtain
$$w_i = \left(\frac{i}{n}\right)^{\alpha} - \left(\frac{i-1}{n}\right)^{\alpha}. \tag{7}$$
(2) The linguistic quantifier $Q_2$, proposed by Schweizer and Sklar [48], is called the polynomial quantifier for the purposes of this research. When $\alpha = 1$, the polynomial and the basic RIM quantifiers coincide; otherwise they behave differently. Its weights are obtained by applying $Q_2$ in (5).
(3) The quadratic linguistic quantifier, $Q_3$, was suggested by Ribeiro and Marques Pereira in [49]. $Q_3$ has two parameters: one controls the maximum value of weight generation, and the other controls the ratio between the maximum and the minimum values of the generating function; see [49]. Its weights are obtained by applying $Q_3$ in (5). For the purposes of practical implementation, we have fixed one of the two parameters at 0.5, but we acknowledge that this value can be tuned for optimal performance.
(4) The exponential linguistic quantifier, $Q_4$, is applied to weight generation in the same way, through (5).
(5) The trigonometric linguistic quantifier, $Q_5$, is likewise applied to weight calculation through (5).
These operators, with the generated weighting vectors, are applied in the aggregation of similarities.
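The weight generation scheme can be sketched in a few lines of Python. Only the basic quantifier, in its standard power form $Q(r) = r^{\alpha}$ with $\alpha > 0$, is shown; the function names `rim_weights` and `basic_quantifier` are assumptions made for this illustration, and any other RIM quantifier could be passed in as a callable in the same way.

```python
def rim_weights(quantifier, n):
    """OWA weights from a RIM quantifier Q: w_i = Q(i/n) - Q((i-1)/n), i = 1..n."""
    return [quantifier(i / n) - quantifier((i - 1) / n) for i in range(1, n + 1)]


def basic_quantifier(alpha):
    """Basic RIM quantifier Q(r) = r**alpha, assuming alpha > 0."""
    return lambda r: r ** alpha


uniform = rim_weights(basic_quantifier(1.0), 4)  # alpha = 1: equal weights
skewed = rim_weights(basic_quantifier(2.0), 4)   # alpha = 2: more weight on later ranks
```

Because the OWA sorts its arguments in descending order, putting more weight on later ranks emphasizes the smaller values: a large $\alpha$ moves the aggregation toward the minimum, a small $\alpha$ toward the maximum.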

Similarity Measures.
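In this paper we use similarity measures in a generalized Łukasiewicz-structure (see [50]) to compare objects. The motivation for this choice is that, in the Łukasiewicz-structure, the mean of many similarities has been shown to be a similarity [51]. This approach has also been used previously for determining similarities in similarity classifiers; see details in [21, 22]. By choosing the Łukasiewicz-structure, two objects can be compared over all participating features. Let $x_1$ and $x_2$ be two objects in a set $X$ with entries across all features $1, 2, \ldots, n$. We get $n$ similarities when the two objects are compared, that is, $x_1(i) \leftrightarrow x_2(i)$, $i = 1, 2, \ldots, n$. Thus, the similarity $S\langle x_1, x_2\rangle$ between $x_1$ and $x_2$ is defined as follows [50]:
$$S\langle x_1, x_2\rangle = \frac{1}{n}\sum_{i=1}^{n}\bigl(x_1(i) \leftrightarrow x_2(i)\bigr). \tag{16}$$
An equivalence relation, $x_1 \leftrightarrow x_2$, between two objects in the Łukasiewicz-structure was defined in [52] as
$$x_1 \leftrightarrow x_2 = 1 - |x_1 - x_2|.$$
It was shown in [50] that this relation can be generalized as
$$x_1 \leftrightarrow x_2 = \bigl(1 - |x_1^{m} - x_2^{m}|\bigr)^{1/m}. \tag{17}$$
Combining (16) and (17) leads to a similarity measure that can be used to calculate the similarity between two vectors with $n$ features. This has been discussed earlier in [50] and further applied in [11, 12, 21]. Thus, with the arithmetic mean, we can write the similarity between two objects $x_1$ and $x_2$ as
$$S\langle x_1, x_2\rangle = \frac{1}{n}\sum_{i=1}^{n}\bigl(1 - |x_1(i)^{m} - x_2(i)^{m}|\bigr)^{1/m}, \tag{18}$$
where $m$ is the parameter for the similarity measure in the generalized Łukasiewicz-structure.
Several other means can be used instead of the arithmetic mean in (18). With the generalized mean, the parameter $t$ of the generalized mean is included to obtain
$$S\langle x_1, x_2\rangle = \left(\frac{1}{n}\sum_{i=1}^{n}\bigl(1 - |x_1(i)^{m} - x_2(i)^{m}|\bigr)^{t/m}\right)^{1/t}. \tag{19}$$
If one replaces the generalized mean with the Bonferroni mean, and writes $s_i = \bigl(1 - |x_1(i)^{m} - x_2(i)^{m}|\bigr)^{1/m}$ for the similarity in feature $i$, one arrives at a similarity of the following form:
$$S\langle x_1, x_2\rangle = \left(\frac{1}{n}\sum_{i=1}^{n} s_i^{p}\left(\frac{1}{n-1}\sum_{j=1,\, j\neq i}^{n} s_j^{q}\right)\right)^{1/(p+q)}. \tag{20}$$
Now, to apply the Bonferroni-OWA to the similarity, the inner mean in (20) is replaced with the OWA operator and $p = q = 1$, and the similarity can be rewritten as
$$S\langle x_1, x_2\rangle = \left(\frac{1}{n}\sum_{i=1}^{n} s_i \sum_{j=1}^{n-1} w_j\, b_j^{(i)}\right)^{1/2}, \tag{21}$$
where $w$ is a weighting vector such that $\sum_{j=1}^{n-1} w_j = 1$ and $b_j^{(i)}$ is the $j$th largest element of the reordered similarities $s_k$, $k \neq i$. In the next section, we explain how classification based on the presented similarity measures is done.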
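The chain from feature-wise similarities to an aggregated similarity can be sketched as follows. This is a minimal Python illustration under stated assumptions: inputs are scaled to [0, 1], `m` names the parameter of the generalized Łukasiewicz similarity, `p` and `q` name the Bonferroni mean parameters, and all function names and example vectors are hypothetical.

```python
def feature_similarities(x1, x2, m):
    """Generalized Lukasiewicz similarity per feature: (1 - |a^m - b^m|)^(1/m)."""
    return [(1.0 - abs(a ** m - b ** m)) ** (1.0 / m) for a, b in zip(x1, x2)]


def similarity_arithmetic(x1, x2, m):
    """Arithmetic mean of the feature-wise similarities."""
    s = feature_similarities(x1, x2, m)
    return sum(s) / len(s)


def similarity_bonferroni(x1, x2, m, p, q):
    """Bonferroni mean of the feature-wise similarities."""
    s = feature_similarities(x1, x2, m)
    n = len(s)
    total = 0.0
    for i in range(n):
        inner = sum(s[j] ** q for j in range(n) if j != i) / (n - 1)
        total += s[i] ** p * inner
    return (total / n) ** (1.0 / (p + q))


def similarity_bonferroni_owa(x1, x2, m, weights):
    """Bonferroni-OWA similarity with p = q = 1 and n - 1 inner OWA weights."""
    s = feature_similarities(x1, x2, m)
    n = len(s)
    total = 0.0
    for i in range(n):
        others = sorted((s[j] for j in range(n) if j != i), reverse=True)
        total += s[i] * sum(w * b for w, b in zip(weights, others))
    return (total / n) ** 0.5


a = [0.2, 0.9, 0.5]  # illustrative objects with three features
b = [0.4, 0.4, 0.5]
s_mean = similarity_arithmetic(a, b, 1)
s_bonf = similarity_bonferroni(a, b, 1, 1, 1)
s_bowa = similarity_bonferroni_owa(a, b, 1, [0.5, 0.5])
```

With `m = 1` the feature-wise similarities of `a` and `b` are 0.8, 0.5, and 1.0, so the three aggregations can be checked by hand; with uniform inner weights the Bonferroni-OWA value equals the Bonferroni value.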

Similarity Classifier with Bonferroni Mean Operators
A new Bonferroni mean based similarity classifier and its OWA variant are introduced in this section. Before going into the details of these new classifiers, we briefly describe the main components typically found in similarity classifiers. The main idea is to compare samples in a given data set and to express the result of the comparison as a numerical value that represents their similarity. Typically for similarity classifiers, values closer to 1 indicate high similarity between objects and values closer to 0 indicate low similarity. For classification tasks, the challenge is typically to partition the attribute space in such a way that samples with the same characteristics are allocated to the same classes; for example, see [53]. Once the samples have been properly assigned to individual classes, the classification procedure can proceed.
Suppose a data matrix $X$, with entries in $[0, 1]$, is to be classified into $N$ different classes $C_1, C_2, \ldots, C_N$ across $n$ attributes, $1, 2, \ldots, n$. The initial step is to find mean vectors for each class; these are often called ideal vectors. For class $C_i$, such a vector is denoted as $v_i = (v_i(1), v_i(2), \ldots, v_i(n))$, where, for example, the entry $v_i(1)$ is the mean value of the first attribute over the samples in class $C_i$. There are several ways of determining the ideal vectors $v_i$; for example, one can use the generalized mean; see also [31] for other methods of computing means that can be applied as ideal vectors in this context. The generalized mean, as it is usable in this context, is defined as
$$v_i(j) = \left(\frac{1}{\sharp C_i}\sum_{x \in C_i} x(j)^{t}\right)^{1/t}, \tag{22}$$
where the parameter $t$ (that comes from the generalized mean) is fixed for all $i, j$ and $\sharp C_i$ denotes the number of samples in class $C_i$. To determine to which class an arbitrary sample $x \in X$ belongs, it is compared to the ideal vectors of the different classes. The comparison can be done by computing the similarity for the attributes in the earlier described generalized Łukasiewicz-structure [50]. The similarity between a sample $x$ and the ideal vector $v$ of a given class with the Bonferroni mean based similarity measure is given by
$$S\langle x, v\rangle = \left(\frac{1}{n}\sum_{i=1}^{n} s_i^{p}\left(\frac{1}{n-1}\sum_{j=1,\, j\neq i}^{n} s_j^{q}\right)\right)^{1/(p+q)}, \quad s_i = \bigl(1 - |x(i)^{m} - v(i)^{m}|\bigr)^{1/m}, \tag{23}$$
for $x, v \in [0, 1]^{n}$, where $m$ is a parameter from the similarity measure and $p$ and $q$ are parameters from the Bonferroni mean operator; see [1]. In the same manner, we write the similarity measure with the Bonferroni-OWA as
$$S\langle x, v\rangle = \left(\frac{1}{n}\sum_{i=1}^{n} s_i \sum_{j=1}^{n-1} w_j\, b_j^{(i)}\right)^{1/2}, \tag{24}$$
where $w$ is a weighting vector such that $\sum_{j=1}^{n-1} w_j = 1$ and $b_j^{(i)}$ is the $j$th largest element of the reordered similarities $s_k = \bigl(1 - |x(k)^{m} - v(k)^{m}|\bigr)^{1/m}$, $k \neq i$.
The sample $x \in X$ is assigned to the class with which it has the highest similarity value, for example, in accordance with
$$\mathrm{class}(x) = \arg\max_{i = 1, 2, \ldots, N} S\langle x, v_i\rangle. \tag{25}$$
A pseudocode algorithm for the main part of the process is given in Algorithm 1.
In order to use the similarity with the Bonferroni-OWA in Algorithm 1, we need to replace $S\langle x, v\rangle$ with the Bonferroni-OWA based similarity. In this case, all the other steps are the same, but the intermediate sum (sum1 in Algorithm 1) and $S\langle x, v\rangle$ are computed from the Bonferroni-OWA form, with $w_j$ and the reordered similarities defined in accordance with (24). For the purposes of finding $w_j$, different weight generating linguistic quantifiers were used.
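The classification procedure described above (ideal vectors per class, similarity of a sample to each ideal vector, assignment to the most similar class) can be sketched in Python as follows. This is not a transcription of Algorithm 1, which is not reproduced in the text; the function names, the parameter names `t`, `m`, `p`, `q`, and the toy data are all assumptions made for illustration.

```python
def ideal_vectors(X, y, t=1.0):
    """One ideal vector per class: the generalized mean (exponent t != 0) of each attribute."""
    ideals = {}
    for label in sorted(set(y)):
        members = [row for row, lab in zip(X, y) if lab == label]
        ideals[label] = [
            (sum(row[j] ** t for row in members) / len(members)) ** (1.0 / t)
            for j in range(len(X[0]))
        ]
    return ideals


def bonferroni_similarity(x, v, m, p, q):
    """Bonferroni mean of the generalized Lukasiewicz similarities between x and v."""
    s = [(1.0 - abs(a ** m - b ** m)) ** (1.0 / m) for a, b in zip(x, v)]
    n = len(s)
    total = sum(
        s[i] ** p * sum(s[j] ** q for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    )
    return (total / n) ** (1.0 / (p + q))


def classify(x, ideals, m, p, q):
    """Assign x to the class whose ideal vector it is most similar to."""
    return max(ideals, key=lambda label: bonferroni_similarity(x, ideals[label], m, p, q))


# Toy data scaled to [0, 1]; two well separated classes (hypothetical values).
X_train = [[0.1, 0.2, 0.15], [0.2, 0.1, 0.25], [0.8, 0.9, 0.85], [0.9, 0.8, 0.75]]
y_train = [0, 0, 1, 1]
ideals = ideal_vectors(X_train, y_train, t=1.0)
low = classify([0.2, 0.2, 0.2], ideals, m=1, p=1, q=1)
high = classify([0.7, 0.9, 0.8], ideals, m=1, p=1, q=1)
```

Swapping `bonferroni_similarity` for a Bonferroni-OWA based similarity leaves `ideal_vectors` and `classify` unchanged, which mirrors the remark that only the similarity computation differs between the two classifier families.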

Experimental Setting for Examination of Results.
The experiments were carried out by splitting each studied data set into two parts, one part for training and the other for testing. The data set divisions were repeated randomly 30 times in each experiment, and the resulting classification accuracies with the corresponding means and variances (from the thirty runs) were recorded. Individual surface plots for each new similarity classifier variant were also plotted; these help to identify proper parameter values. Statistical comparison of the classification accuracies of the new classifiers with the accuracies of the two benchmarks was made by using typical sample statistics (t-test).
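The protocol above can be sketched with the Python standard library. Several details are assumptions, because the text does not specify them: the training fraction (one half here), the random seed, the use of Welch's form of the t statistic, and the trivial majority-class scorer that stands in for a real classifier. All names are hypothetical.

```python
import random
import statistics


def repeated_holdout(X, y, train_and_score, runs=30, train_fraction=0.5, seed=0):
    """Return one test accuracy per random train/test division of the data."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    cut = int(len(indices) * train_fraction)  # assumed split size
    accuracies = []
    for _ in range(runs):
        rng.shuffle(indices)
        train, test = indices[:cut], indices[cut:]
        accuracies.append(train_and_score(
            [X[i] for i in train], [y[i] for i in train],
            [X[i] for i in test], [y[i] for i in test]))
    return accuracies


def welch_t(a, b):
    """Welch's t statistic for two samples of accuracies (an assumed test variant)."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / ((va / len(a) + vb / len(b)) ** 0.5)


def majority_baseline(X_train, y_train, X_test, y_test):
    """Placeholder scorer: predict the most frequent training label for every test sample."""
    majority = max(set(y_train), key=y_train.count)
    return sum(label == majority for label in y_test) / len(y_test)


X_toy = [[i / 20] for i in range(20)]  # hypothetical data
y_toy = [0] * 12 + [1] * 8
accs = repeated_holdout(X_toy, y_toy, majority_baseline)
mean_acc, var_acc = statistics.mean(accs), statistics.variance(accs)
```

In an actual comparison, `train_and_score` would fit one of the similarity classifiers on the training part and return its accuracy on the test part; two lists of thirty accuracies could then be passed to `welch_t` and the statistic compared against the critical value for the chosen confidence level.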

Data Sets.
Data sets used in testing our new classifier were retrieved from the UCI Machine Learning Repository [54]. Properties of each set used, including the number of classes, attributes, and instances, are summarized in Table 1.
Further detailed attribute information for the fertility, blood transfusion service center, and echocardiogram data sets is presented in Tables 2, 3, and 4.

Experimental Results.
In this section we present the results obtained from the experiments. Mean accuracies from 30 separate runs were computed for each data set and for each classifier combination. The resulting classification accuracies and the variances obtained are reported in Tables 5, 6, 7, and 8 separately for each data set. For the purposes of benchmarking the new similarity classifiers and their variants, we also report results obtained with arithmetic mean and generalized mean based similarity classifiers. For the fertility data set, the highest classification accuracy of 74.20% was obtained using the Bonferroni mean based similarity classifier. The Bonferroni-OWA based classifier produced a classification accuracy of 70.63%. The statistical significance of the differences to the results obtained with the arithmetic mean and generalized mean based similarity classifiers was assessed at a 95% confidence level. The results with the new Bonferroni mean based similarity classifier exhibit a statistically significant difference (improved classification performance) to both benchmark cases. The Bonferroni-OWA based similarity classifier that uses the polynomial quantifier for weight generation was found to be statistically significantly better than the arithmetic mean based similarity classifier. For the rest of the Bonferroni-OWA variants, a statistically significant difference to the benchmark cases could not be established.
Figure 1 shows the surface of the classification results, and the corresponding variances, obtained with the Bonferroni mean based similarity classifier from a set of runs with different combinations of parameter values. The similarity measure parameter and one Bonferroni mean parameter were each varied over [0, 8]. The other Bonferroni mean parameter was fixed at the value 6; this value was not set arbitrarily but was the result of an optimization. The best performance was reached with both varied parameters set to approximately 6.
When the new similarity classifiers were used to classify the blood transfusion service center data set, the highest mean accuracy, 76.91%, was obtained with the basic linguistic quantifier variant of the Bonferroni-OWA based similarity classifier. A classification accuracy of 76.86% was achieved with the Bonferroni mean based similarity classifier. With this data set, all results with the new proposed similarity classifiers differed statistically significantly from the results of the arithmetic mean based similarity classifier (indicated with ⋆). There was also a statistically significant difference between the results of the generalized mean based method and three variants of the Bonferroni-OWA based similarity classifier (denoted with a circle in Table 6). Figure 2 shows how different combinations of parameter values affect the resulting classification accuracy and the corresponding variance of the classification accuracy.
With the echocardiogram data set, the highest mean classification accuracy, 91.13%, was obtained by using the trigonometric linguistic quantifier variant of the Bonferroni-OWA based similarity classifier. The Bonferroni mean based similarity classifier gave a mean accuracy of 90.90%. All results obtained with the new proposed similarity classifiers and their variants were statistically significantly different from the results obtained with the arithmetic mean based similarity classifier, at a 95% confidence level. A statistically significant difference could not be verified between the new methods and the generalized mean based approach. In Table 7, mean accuracies and variances are reported for all new similarity classifiers and their variants and for the benchmark cases; ⋆ indicates that the achieved accuracies are statistically significantly different from the accuracies obtained with the arithmetic mean based classifier.
Figure 3 shows how different parameter value combinations affect the mean classification accuracy and the variance of the accuracy when the trigonometric linguistic quantifier variant of the Bonferroni-OWA based similarity classifier is used.
When the lung cancer data set was examined, the highest mean accuracy, 82.94%, was obtained with the Bonferroni mean based similarity classifier. We obtained a classification accuracy of 82.75% with the best functioning Bonferroni-OWA variant, which uses the quadratic linguistic quantifier. Detailed results are presented in Table 8. A statistically significant difference could not be verified between the new methods and the benchmark cases. Figure 4 shows how different parameter value combinations affect the mean classification accuracy and the variance of the accuracy when the Bonferroni mean based similarity classifier is used. To produce the figure, one Bonferroni mean parameter was fixed at 3 after a set of experimental runs to optimize performance. The highest accuracy reported was achieved with the two remaining parameters both set to 4.
In Table 9 we summarize the best achieved mean classification accuracies with the new similarity classifiers (and variants) and both the benchmark cases.As can be seen from the table there is no clear "winner," but for all cases the new proposed similarity classifiers outperformed the benchmark cases; specifically the Bonferroni mean based similarity classifier was the best in two out of four cases and also for the remaining two cases beat both benchmark cases.In a way the overperformance with regard to the selected benchmarks might have been expected, as both the benchmarks are subcases of the Bonferroni mean.

Discussion
In this paper we have proposed a new Bonferroni mean based similarity classifier and a new Bonferroni-OWA based similarity classifier with five variants, which are based on different linguistic quantifier based weight generation schemes for the OWA used in the classifier. The classification performance of the proposed new similarity classifiers was tested on four real world medical data sets, for each of which thirty sets of runs were made and the mean classification performance was recorded. As a benchmark, we have compared the results from the proposed new similarity classifiers with two previously presented similarity classifiers, based on the generalized mean and on the arithmetic mean. The mean classification performance of the proposed new similarity classifiers was better than the performance of the benchmarks; however, the difference in performance was not statistically significant on all data sets. Nevertheless, the evidence suggests that the proposed new similarity classifiers perform at least as well as, and often better than, the benchmark similarity classifiers. We note that the performance of these classifiers is data dependent.
In future research on similarity classifiers, a multiclassifier approach could be considered, where each classifier would have one vote on a sample's class and the final class of the sample would be decided by the consensus of the classifiers.

Figure 1: Mean classification accuracies (a) and the variances (b) obtained from the fertility data set with the Bonferroni mean based classifier, with one Bonferroni mean parameter fixed at 6.

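Figure 2: Mean classification accuracies (a) and the variances of the accuracies (b) obtained from the blood transfusion service center data set with the Bonferroni-OWA based similarity classifier with the basic linguistic quantifier variant.

Figure 3: Mean classification accuracies and the variances of the accuracies obtained from the echocardiogram data set with the trigonometric linguistic quantifier variant of the Bonferroni-OWA based similarity classifier.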

Figure 4: Mean classification accuracies (a) and the variances (b) obtained from the lung cancer data set with the Bonferroni mean based similarity classifier, with one Bonferroni mean parameter fixed at 3.

Table 1: Data sets used and their main properties.

Table 2: Fertility data set attribute information.

Table 3: Blood transfusion service center data set attribute information.

Table 4: Echocardiogram data set attribute information.

Table 5: Classification results with the fertility data set.

Table 6: Classification results with the blood transfusion service center data set.

Table 7: Classification results with the echocardiogram data set.
For the fertility data set, a benchmark mean based classifier gave an accuracy of 66.87%. Complete results and their corresponding variances are presented in Table 5. In that table, accuracies marked with a star (⋆) are statistically significantly different from the result with the arithmetic mean based similarity classifier, and those marked with a circle (∘) are significantly different from the result with the generalized mean based similarity classifier.

Table 8: Classification results with the lung cancer data set.

Table 9: A summary of classification accuracies for each of the similarity classifiers.