Optimal feature subset selection is an important and difficult task in pattern classification, data mining, and machine intelligence applications. The objective of feature subset selection is to eliminate irrelevant and noisy features in order to select an optimal feature subset and increase accuracy. A large number of features in a dataset increases computational complexity and thus degrades performance. In this paper, to overcome this problem, the angle modulation technique is used to reduce the feature subset selection problem to a four-dimensional continuous optimization problem instead of presenting it as a high-dimensional bit vector. To demonstrate the effectiveness of the angle modulation problem representation and to determine the efficiency of the proposed method, six variants of the Artificial Bee Colony (ABC) algorithm employ angle modulation for feature selection. Experimental results on six high-dimensional datasets show that the Angle Modulated ABC algorithms improve classification accuracy with smaller feature subsets.
Many data mining and machine learning applications suffer from the curse of dimensionality in which a dataset usually involves a large number of features, often including relevant and irrelevant features [
Filter methods and wrapper methods are two main strategies in feature selection [
When a wrapper method is used, the problem of optimal feature subset selection can be seen as NP-hard because the number of possible feature subsets in the search space is 2
In this study, ABC algorithms employ angle modulation based bit vector generation for feature selection for the first time. In this approach, an ABC algorithm, called the Angle Modulated Artificial Bee Colony (AMABC) algorithm, selects candidate feature sets with a bit vector obtained from a bit string generator employing a trigonometric function. The main advantage of this approach is that an AMABC algorithm optimizes a trigonometric function that has only four parameters in the continuous domain. Thus, a high-dimensional binary search space can be represented by a four-dimensional continuous search space for any dataset. Consequently, any ABC variant in the literature that has been applied to continuous optimization problems can be used for the feature selection problem. To do so, we adapted angle modulation for six ABC variants to show its significant effect on finding relevant feature subsets in datasets with many features. The comparison shows that ABC algorithms with angle modulated feature selection significantly improve classification accuracy using fewer features.
This paper is organized as follows. Section
The Artificial Bee Colony (ABC) algorithm, which is inspired by the foraging behaviour of real bee colonies, was proposed for tackling optimization problems. It was first introduced by Karaboga [
Colony life is organized by a division of labour. A colony comprises three types of bees, employed bees, onlooker bees, and scout bees, which are specialized for different tasks. The employed bees forage outside the hive and communicate with onlooker bees through a series of dances when they return to the hive with news of a discovered food source. The onlooker bees obtain remarkably accurate information about the location and the quality of the discovered food sources. The attractiveness of the dance, which is treated as the selection probability of a food source, recruits onlooker bees to help find new food sources in the vicinity of the discovered one. A food source is abandoned when its quality is low; its employed bee then turns into a scout bee, which flies around looking for food in promising spots. Based on this phenomenon, the ABC algorithm is composed of four main steps:
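The phases above can be sketched as a minimal illustrative implementation; the colony size, limit value, and neighbourhood update `v = x + phi * (x - x_k)` follow the standard ABC formulation, while the function names and the sphere objective used below are our own assumptions, not the exact code of the paper.

```python
import random

def abc_optimize(objective, dim, lower, upper, colony_size=20, limit=100, max_evals=10000):
    """Minimal Artificial Bee Colony loop: employed, onlooker, and scout phases."""
    n_food = colony_size // 2                       # one food source per employed bee
    foods = [[random.uniform(lower, upper) for _ in range(dim)] for _ in range(n_food)]
    fits = [objective(f) for f in foods]
    trials = [0] * n_food
    evals = n_food

    def try_neighbour(i):
        nonlocal evals
        k = random.choice([j for j in range(n_food) if j != i])   # random partner source
        d = random.randrange(dim)                   # perturb one randomly chosen dimension
        v = foods[i][:]
        # v_id = x_id + phi * (x_id - x_kd), with phi drawn from [-1, 1]
        v[d] = min(max(v[d] + random.uniform(-1, 1) * (v[d] - foods[k][d]), lower), upper)
        fv = objective(v); evals += 1
        if fv < fits[i]:                            # greedy selection keeps the better source
            foods[i], fits[i], trials[i] = v, fv, 0
        else:
            trials[i] += 1

    while evals < max_evals:
        for i in range(n_food):                     # employed bee phase
            try_neighbour(i)
        weights = [1.0 / (1.0 + f) for f in fits]   # dance attractiveness (minimization)
        for _ in range(n_food):                     # onlooker phase: fitness-proportional choice
            i = random.choices(range(n_food), weights=weights)[0]
            try_neighbour(i)
        for i in range(n_food):                     # scout phase: abandon exhausted sources
            if trials[i] > limit:
                foods[i] = [random.uniform(lower, upper) for _ in range(dim)]
                fits[i] = objective(foods[i]); evals += 1; trials[i] = 0

    best = min(range(n_food), key=lambda i: fits[i])
    return foods[best], fits[best]
```

On a simple sphere function, `abc_optimize(lambda x: sum(v * v for v in x), dim=5, lower=-5.0, upper=5.0)` converges to a near-zero objective value within the evaluation budget.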
In this section, we briefly describe five ABC algorithms which we considered here as feature selection methods on various datasets.
Modified ABC (MABC) was proposed by Akay and Karaboga [
Gbest guided ABC (GABC) [
GbestDist guided ABC (GDABC) [
Chaotic ABC (CABC) [
Enhanced ABC (EABC) [
Angle Modulated Artificial Bee Colony (AMABC) algorithms find an optimal solution of a binary optimization problem by reducing it to a four-dimensional continuous optimization problem. To do so, the algorithm generates bit strings by employing a trigonometric function derived from angle modulation [ ABC algorithms were originally presented for bound-constrained continuous optimization. They perform well and are competitive with contemporary algorithms on continuous optimization problems. However, the superiority of binary variants of ABC algorithms has not yet been established in the literature. With this approach, ABC algorithms try to find appropriate values of the coefficients in continuous space instead of evolving bit strings in binary space. This decreases the dimension of the problem: for example, a large-scale binary problem instance can be represented by a four-dimensional problem instance in continuous space. Several ABC variants proposed for continuous optimization can thus be applied to a binary optimization problem without modifying the original implementation of the algorithm.
Generating a bit vector by sampling the function with
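Such a bit string generator can be sketched as follows. We assume the standard angle-modulation generating function from the literature, g(x) = sin(2π(x − a) · b · cos(2π(x − a) · c)) + d, sampled at x = 0, 1, …, n − 1, with a bit set to 1 when g(x) ≥ 0; the exact sampling points and sign convention used by the authors may differ.

```python
import math

def angle_modulated_bits(a, b, c, d, n_bits):
    """Decode a four-tuple (a, b, c, d) into an n-bit vector by sampling the
    angle-modulation generating function at x = 0, 1, ..., n_bits - 1:
        g(x) = sin(2*pi*(x - a) * b * cos(2*pi*(x - a) * c)) + d
    Bit i is 1 (feature selected) when g(i) >= 0, otherwise 0."""
    bits = []
    for x in range(n_bits):
        g = math.sin(2 * math.pi * (x - a) * b * math.cos(2 * math.pi * (x - a) * c)) + d
        bits.append(1 if g >= 0 else 0)
    return bits
```

Because only (a, b, c, d) are evolved, a dataset with dozens of features still yields a four-dimensional continuous search space; the vertical shift d, for instance, biases the generator toward selecting more or fewer features.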
Angle Modulated ABC algorithms are wrapper methods to tackle feature selection. The basic process of feature selection with AMABC algorithms is presented in Figure
Summarization of the whole AMABC process for feature selection.
In an AMABC algorithm, each solution contains four-tuple value (
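The wrapper evaluation of such a four-tuple can be sketched as below: decode the tuple into a bit mask, keep only the selected features, and score the subset with an induction algorithm. This is a hypothetical sketch; the paper uses SVM and another classifier via cross-validation, whereas here we use a simple leave-one-out 1-NN scorer so the example stays self-contained, and `knn_accuracy`, `fitness`, and the `decode` callback are names we introduce for illustration.

```python
def knn_accuracy(X, y, mask):
    """Leave-one-out 1-NN accuracy using only the features where mask[j] == 1."""
    cols = [j for j, m in enumerate(mask) if m]
    if not cols:
        return 0.0                                  # empty feature subset: worst fitness
    correct = 0
    for i, xi in enumerate(X):
        best, best_d = None, float("inf")
        for k, xk in enumerate(X):
            if k == i:
                continue                            # leave the query point out
            d = sum((xi[j] - xk[j]) ** 2 for j in cols)
            if d < best_d:
                best, best_d = y[k], d
        correct += best == y[i]
    return correct / len(X)

def fitness(tuple4, X, y, decode):
    """Wrapper fitness of a candidate (a, b, c, d): decode it to a bit mask
    over the features, then score the induced subset with the classifier."""
    mask = decode(*tuple4, n_bits=len(X[0]))
    return knn_accuracy(X, y, mask)
```

On a toy dataset whose second feature is pure noise, masking that feature out raises the accuracy, which is exactly the signal the AMABC search exploits.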
In this section, we analyse the feature selection performance of the Angle Modulated ABC algorithms. Six ABC algorithms are selected for performance comparison, namely, the original ABC (OABC), modified ABC (MABC), enhanced ABC (EABC), Gbest guided ABC (GABC), chaotic ABC (CABC), and GbestDist guided ABC (GDABC). Six real-world datasets from the University of California [
Datasets used for experiments.
ID | Dataset name | # Features | # Instances | # Classes |
---|---|---|---|---|
HC | Horse Colic | 22 | 368 | 2 |
AN | Anneal | 38 | 898 | 6 |
IS | Ionosphere | 34 | 351 | 2 |
SR | Sonar | 60 | 208 | 2 |
SN | Soybean | 35 | 683 | 19 |
DE | Dermatology | 34 | 366 | 6 |
In the experiments, the number of function evaluations for all algorithms is set to
Parameter setting for ABC algorithms.
Control parameters | Value | Algorithm |
---|---|---|
Population Size | 20 | All algorithms |
limitF | 1 | All algorithms |
mu | −1 | CABC |
C | −1 | GABC, GDABC |
MR | −1 | MABC |
SF | −1 | MABC |
All algorithms are run 20 times, and mean values are calculated to obtain the feature selection accuracy. For the experiments, we conducted a nonparametric test, the Wilcoxon signed-rank test, to detect significant differences between the classification accuracies with and without feature selection. The statistical analyses are performed with a significance level
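The Wilcoxon signed-rank statistic can be computed as in the minimal pure-Python sketch below (in practice a library routine such as `scipy.stats.wilcoxon` would be used, and the resulting statistic is compared against a critical value for the chosen significance level and sample size):

```python
def wilcoxon_w(a, b):
    """Wilcoxon signed-rank statistic for paired samples a, b: rank the
    absolute differences |a_i - b_i| (zero differences dropped, tied values
    given average ranks) and return the smaller of the positive-rank and
    negative-rank sums; a small W indicates a significant difference."""
    diffs = [x - y for x, y in zip(a, b) if x != y]
    ranked = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(ranked):                          # assign average ranks to ties
        j = i
        while j + 1 < len(ranked) and abs(diffs[ranked[j + 1]]) == abs(diffs[ranked[i]]):
            j += 1
        avg = (i + j) / 2 + 1                       # mean of the tied rank positions
        for k in range(i, j + 1):
            ranks[ranked[k]] = avg
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus)
```

For 20 paired runs per dataset, as in these experiments, the computed W is compared with the tabulated critical value at the chosen significance level to decide whether feature selection changed the accuracy significantly.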
Tables
Classification accuracy of each ABC algorithm on the tested datasets by using SVM as an induction algorithm.
| | | SN | SR | IS | DE | HC | AN |
|---|---|---|---|---|---|---|---|
| Without feature selection | Accuracy (%) | | | 87.09 ± 0.48 | 94.64 ± 0.39 | 79.10 ± 1.21 | |
| | Average number of features | 35 | 60 | 34 | 34 | 22 | 38 |
| CABC | Accuracy (%) | 85.35 ± 0.72 | 70.67 ± 0.93 | 94.30 ± 1.45 | 97.26 ± 5.22 | 85.86 ± 1.37 | 97.77 ± 0.15 |
| | Average number of features | 12.3 | 17.6 | 9.2 | 13.6 | 4.5 | 8.4 |
| EABC | Accuracy (%) | 85.94 ± 0.52 | 73.07 ± 0.01 | 94.01 ± 0.36 | 97.26 ± 0.08 | 85.86 ± 0.5 | 97.77 ± 0.21 |
| | Average number of features | 7.8 | 10.2 | 8.9 | 13.4 | 4.1 | 7.9 |
| GABC | Accuracy (%) | 85.51 ± 0.31 | 72.11 ± 1.23 | 94.30 ± 0.55 | 96.45 ± 0.88 | 86.14 ± 0.1 | 97.77 ± 0.23 |
| | Average number of features | 9.1 | 18.0 | 6.7 | 8.3 | 5.6 | 13.7 |
| GDABC | Accuracy (%) | 85.36 ± 0.59 | 73.55 ± 1.03 | 94.30 ± 0.18 | 96.72 ± 0.17 | 85.86 ± 0.93 | 97.77 ± 0.08 |
| | Average number of features | 9.4 | 22.2 | 11.8 | 8.9 | 4.2 | 15.8 |
| MABC | Accuracy (%) | 86.67 ± 0.62 | 75.96 ± 1.22 | | | | 97.77 ± 0.01 |
| | Average number of features | 12.3 | 8.5 | 7.4 | 8.5 | 3.9 | 10.0 |
| OABC | Accuracy (%) | 84.77 ± 0.65 | 73.08 ± 0.61 | 94.30 ± 0.47 | 96.99 ± 0.30 | 85.87 ± 1.35 | 97.77 ± 0.35 |
| | Average number of features | 5.8 | 10.3 | 5.7 | 4.9 | 5.9 | 15.7 |
Classification accuracy of each ABC algorithm on the tested datasets by using
| | | SN | SR | IS | DE | HC | AN |
|---|---|---|---|---|---|---|---|
| Without feature selection | Accuracy (%) | 91.20 ± 0.24 | 86.15 ± 0.87 | 87.09 ± 0.48 | 94.64 ± 0.39 | 79.10 ± 1.21 | 99.13 ± 0.11 |
| | Average number of features | 35 | 60 | 34 | 34 | 22 | 38 |
| CABC | Accuracy (%) | 93.92 ± 0.34 | 88.32 ± 0.90 | 92.19 ± 0.46 | 97.43 ± 0.29 | | |
| | Average number of features | 24.3 | 34 | 9.3 | 20.6 | 9.5 | 25.2 |
| EABC | Accuracy (%) | 93.66 ± 0.31 | 87.40 ± 0.54 | 91.31 ± 0.33 | 95.57 ± 0.31 | 84.46 ± 0.50 | |
| | Average number of features | 26.8 | 35.5 | 10.3 | 20.8 | 12.1 | 26.3 |
| GABC | Accuracy (%) | 93.79 ± 0.37 | 88.89 ± 1.07 | 91.57 ± 0.50 | 96.58 ± 0.58 | | 99.42 ± 0.13 |
| | Average number of features | 24.5 | 33.9 | 11.7 | 21.2 | 9.3 | 29.2 |
| GDABC | Accuracy (%) | 93.66 ± 0.29 | 88.51 ± 0.94 | | 96.04 ± 0.51 | | 99.41 ± 0.10 |
| | Average number of features | 24.4 | 32.7 | 9.8 | 22.5 | 10 | 28.2 |
| MABC | Accuracy (%) | | 88.70 ± 1.09 | 91.54 ± 0.53 | | | 99.41 ± 0.10 |
| | Average number of features | 25 | 29.8 | 8.4 | 20.2 | 7.1 | 27.8 |
| OABC | Accuracy (%) | 93.48 ± 0.34 | | 91.40 ± 0.35 | 97.43 ± 0.29 | 85.27 ± 0.28 | 99.41 ± 0.10 |
| | Average number of features | 25.1 | 33.1 | 11.6 | 7.8 | 7.8 | 25.8 |
When Table
According to Table
When the results are compared in Tables
In this paper, we introduced the angle modulation technique for feature subset selection. Its main advantage is that a high-dimensional problem can be represented by a low-dimensional continuous optimization problem, so any optimization technique working in continuous space can be applied to optimal feature subset selection with little extra effort.
As a case study, six variants of the ABC algorithm employed angle modulation for feature selection. In an experimental study, the feature selection performance of the original ABC algorithm and five other ABC variants was compared on six UCI datasets. The results show that feature selection with AMABC algorithms significantly improved classification accuracy with smaller feature subsets.
Further research will apply angle modulation to other evolutionary computation approaches, such as Particle Swarm Optimization and Differential Evolution, in order to improve performance.
The authors declare that there are no conflicts of interest regarding the publication of this paper.