Application of the Artificial Bee Colony Algorithm for Solving the Set Covering Problem

The set covering problem is a formal model for many practical optimization problems. In the set covering problem the goal is to choose a subset of the columns of minimal cost that covers every row. Here, we present a novel application of the artificial bee colony algorithm to solve the non-unicost set covering problem. The artificial bee colony algorithm is a recent swarm metaheuristic technique based on the intelligent foraging behavior of honey bees. Experimental results show that our artificial bee colony algorithm is competitive in terms of solution quality with other recent metaheuristic approaches for the set covering problem.


Introduction
The set covering problem (SCP) is a classic problem in combinatorial analysis, computer science, and theory of computational complexity. It is a problem that has led to the development of fundamental technologies for the field of the approximation algorithms. Also it is one of the problems from the list of 21 Karp's NP-complete problems; its NPcompleteness was demonstrated in 1972 [1].
SCP has many applications, including those involving routing, scheduling, stock cutting, electoral redistricting, and other important real life situations [2]. Although the best known application of the SCP is airline crew scheduling [3], where a given set of trips has to be covered by a minimumcost set of pairings, a pairing being a sequence of trips that can be performed by a single crew.
Different solving methods have been proposed in the literature for the set covering problem. Exact algorithms are mostly based on branch-and-bound and branch-and-cut techniques [4][5][6], linear programing, and heuristic methods [7]. However, these algorithms are rather time consuming and can only solve instances of very limited size. For this reason, many research efforts have been focused on the development of heuristics to find a good result or nearoptimal solutions within a reasonable period of time.
Classical greedy algorithms are very simple, fast, and easy to code in practice, but they rarely produce high quality solutions for their myopic and deterministic nature [8]. In [9] a greedy algorithm was improved incorporating randomness and memory into it and obtained promising results. Compared with classical greedy algorithms, heuristics based on Lagrangian relaxation with subgradient optimization are much more effective. The most efficient ones are those proposed by [10,11].
Metaheuristics were also applied to the SCP as top-level general search strategies. An incomplete list of this kind of metaheuristics for the SCP includes genetic algorithms [12,13], simulated annealing [14], tabu search [15], cultural algorithms [16,17], and ant colony optimization [18]. For a deeper comprehension of effective algorithms for the SCP in the literature, we refer the interested reader to the survey by [7].
In this paper we propose a novel application of artificial bee colony (ABC) to solve SCP. This paper is organized as 2 The Scientific World Journal  follows. In Section 2, we explain the problem. In Section 3, we describe the ABC framework. Our ABC proposal is described in Section 4. In Section 5, we present the experimental results obtained. Finally, in Section 6, we conclude the paper and give some perspectives for further research.

Problem Description
The set covering problem is a fundamental problem in the class of covering problems. Given a finite set and a family = 1 , 2 , . . . , of subsets of (i.e., ⊆ , = 1, . . . , ), the SCP aims to find a minimum cardinality ⊆ {1, . . . , } such that ⋃ ∈ = . The elements of are called points. Given a ⊆ {1, . . . , }, a point is set to be covered if belongs to ⋃ ∈ . In the minimum-cost set covering problem each set , 1 ≤ ≤ , has a cost and the problem is to find a ⊆ {1, . . . , }, where each point is covered and ∑ ∈ is minimum. This minimum-cost optimization version of SCP is NP-hard.
Let us define the incidence matrix of a set covering problem as follows. There are | | rows in , one for each point of ∈ , and columns in , one for each set . The entry at (the entry at the intersection of the th row and the th column) is one if point is in set ; otherwise is zero. Table 1 shows an example of an incidence matrix.
Equation (1) is the objective function of the SCP, where refers to the weight or cost of -column and is the decision variable. Equation (2) is a constraint to ensure that each row is covered by at least one column; × matrix = ( ) is a constraint coefficient matrix whose elements can be "1" or "0" to indicate the covering possibilities. Finally, (3) is the integrality constraint where the value can be "1" if column is activated (selected) or "0" otherwise.

Artificial Bee Colony Algorithm
ABC is one of the most recent algorithms in the domain of the collective intelligence. It was created by Dervis Karaboga in 2005, who was motivated by the intelligent behavior observed in the domestic bees to take the process of foraging [19]. ABC is an algorithm of combinatorial optimization based on populations, in which the solutions of the problem of optimization, the sources of food, are modified by the artificial bees that function as operators of variation. The aim of these bees is to discover the food sources with major nectar.
In the ABC algorithm, an artificial bee moves in a multidimensional search space choosing sources of nectar depending on its past experience and its companions of beehive or fitting its position. Some bees (exploratory) fly and choose food sources randomly without using experience. When they find a source of major nectar, they memorize their positions and forget the previous ones. Thus, ABC combines methods of local search and global search, trying to balance the process of the exploration and exploitation of the search space.
Although, the performance of different optimization algorithm is dependent on applications, some recent works demonstrate that the artificial bee colony is more rapid than either genetic algorithm or particle swarm optimization solving certain problems [20][21][22][23][24]. Additionally, ABC has demonstrated an ability to attack problems with a lot of variables (high-dimensional problems) [25].

Elements and Behavior.
The model defines three principal components which are enunciated as follows.
Food Source. The value of a food source depends on many different factors, as its proximity to the beehive, wealth or the concentration of the energy, and the facility of extraction of this energy.
Employed Bees or Workers. They are associated with a current food source, or in exploitation, they take with them information about this source, especially the distance, location, and profitability, to share this with a certain probability with other companions.
Unemployed or Exploratory Bees. They are in constant search of a food source. There are two types: (i) scouts: they are the ones in charge of searching in the environment that surrounds the beehive for new sources of food. (ii) onlookers (curious or in wait): they look for a food source across the information shared by the employees or by other explorers in the nest.

Biological
Behavior. The exchange of information between the bees is the most important incident in the formation of a collective knowledge, since the meaning of this The Scientific World Journal 3 Generation of food sources A solution to the optimization problem is a food source. It moves in a random way and with base in the low and superior limits of every variable of the problem

Working bees
Their number is proportional to the number of food sources, where for every source there is a working bee, and its function is to evaluate and to modify the current solutions to improve them (it looks for new sources near the current one). If the new position is not better than the current, it will keep the original position

Bees in wait
The number of bees in wait must be proportional to the number of sources. These bees will choose a food source, based on the information of the working bees by means of the waggle dance, where the food source with better value of objective function is selected Scout bees These bees generate new sources of food in a random way to replace existing sources that have not been improved Limit It defines the maximum number of cycles that a food source can keep without improving before being replaced. The limit increases from the source that is not modified by the bees; already they are used or in wait, up to obtaining its maximum allowed value. After this, the scout bees take charge of initializing the limit to 0 for every new generated position. The limit is initialized to 0 whenever a source is modified (improved) by a used bee or in wait Column adding When solving SCP, it defines the number of columns to be added to the current food source Column elimination When solving SCP, it defines the number of columns to be eliminated from the current food source Select sites for the local search (5) Recruit bees for the selected sites and to evaluate fitness (6) Select the bee with the best fitness (7) Assign the remaining bees to looking for randomly (8) Evaluate the fitness of remaining bees (9) UpdateOptimum() (10) End While (11) Return BestSolution (12) End Algorithm 1: ABC pseudocode.
interaction the bees will decide the behavior that must take the beehive. The principal ways of bee behaviour are (i) the incorporation to a source of nectar: the communication between the bees related to the quality of food sources is realized in the zone of dance (dance of the bees), where with the information obtained about all the sources of food that are available, they decide which of all the sources is the most profitable to join. (ii) the abandon of a source: by means of the dance it is determined if a source is no longer profitable and consequently it must be abandoned.

Artificial Behavior.
In Table 2 the elements of the ABC are described in a general way. The pseudocode of artificial bee colony is as in Algorithm 1.
The procedure for determining a food source in the neighborhood of a particular food source which depends on the nature of the problem. Karaboga [26] developed the first ABC algorithm for continuous optimization. The method for determining a food source in the neighborhood of a particular food source is based on changing the value of one randomly chosen solution variable while keeping other variables unchanged. This is done by adding to the current value of the chosen variable the product of a uniform variable in [−1, 1] and the difference in values of this variablecurrent food source-and some other randomly chosen food source. This approach cannot be used for discrete optimization problems for which it generates at best a random effect.
Singh [27] subsequently proposed a method, which is appropriate for subset selection problems. In his model, to generate a neighboring solution, an object is randomly dropped from the solution and in its place another object, which is not already present in the solution, is added. The object to be added is selected from another randomly chosen solution. If there are more than one candidate object for addition, then ties are broken arbitrarily.
This approach is based on the idea that if an object is present in one good solution, then it is highly likely that this object is present in many good solutions. This method 4 The Scientific World Journal provides another advantage, consisting in that if the method fails to find an object different from the others objects in the original solution, this means that the two solutions are equal, such situation is called "collision" and it is resolved by making the employed bee associated with the original solution, a scout bee eliminating duplication.

Description of the Proposed Approach to Solve SCP
Step 1 (initialization). This step includes initializing the parameters of ABC as size of the colony, number of workers and curious (onlookers or "in wait") bees, limit of attempts, and maximum number of cycles.
Step 2 (generation of initial population). To generate the initial population by every row-SCP constraint-a column-SCP variable-is selected at random from the set of columns with covering possibilities. A solution is represented by means of an entire vector as shown in Figure 1 keeping the columns considered in the solution. Then, we use an integer encoding as the encoding rule.
Step 3 (evaluation of the fitness of the population). The fitness function is equal to the objective function of the SCP (1).
Step 4 (modification of position and selection of sites for worker bees). A hard-working bee modifies its position by means of the creation of a new solution based on a different food source selected randomly. It sees if at least it has a different column, in case of having not even a different column, the hard-working bee is transformed to an explorer in order to eliminate duplicated solutions. In opposite case, it proceeds to add a certain random number of columns between 0 and the maximum number of columns to be added. After this, it proceeds to eliminate a certain random number of columns between 0 and the maximum number of columns to be eliminated. In case that new solution does not meet constraints, it is repaired. The fitness of the solution is evaluated; if the fitness (cost) is minor than the solution obtained in the beginning, the solution is replaced. In opposite case, it increases the number of attempts for improving this solution (limit).
Step 5 (recruiting curious bees for the selected sites). A curious bee evaluates the information of the nectar through the workers and it chooses a source of food with the fitness proportionate selection method or roulette-wheel selection.
Step 6 (modification of position for the curious bees). They work alike to hard-working bees in Step 4.
Step 7 (leaving a source exploited by the bees). If the solution representing a source of food does not improve for a predetermined number of attempts (limit), then the source of food is left and is replaced by a new source of food generated as in Step 1.
Step 8. This step involves memorizing the best solution and increasing the counter of the cycle.
Step 9. The process stops if the criteria of satisfaction expire; in opposite case return to Step 3.

Experimental Results
The ABC algorithm has been implemented in C in a 2.5 GHz Dual Core with 4 GB RAM computer, running windows 7.
Parameter values have a profound influence on the performance of ABC. The parameters were empirically adjusted, we determined their values in an experimental way, and for each parameter, a set of candidate values were considered. We modified the value of one parameter while keeping the others fixed. According to the best results, as parameter values in our experiments, we use ABC runs 1000 iterations with a population of 200 bees, where 100 corresponds to hardworking and 100 to curious. Limit = 50, maximum number of columns to add = 0.5% of columns in the SCP instance, and maximum number of columns to eliminate = 1.2% of the SCP instance.
These parameters showed good results, but they cannot be the ideal ones for all the instances. ABC has been tested on 65 standard non-unicost SCP instances available from OR Library at http://people.brunel.ac.uk/∼mastjjb/jeb/info.html. Table 3 summarizes the characteristics of each of these sets of instances, each set contains 5 or 10 problems and the column labeled Density shows the percentage of nonzero entries in the matrix of each instance. ABC was executed 30 times on each instance, each trial with a different random seed.

Comparison with Other Works.
In comparison with very recent works solving SCP-with cultural algorithms [16] and ant colony + constraint programming techniques [28]-our proposal performs better than the SCP instances reported in those works. In order to bring out the efficiency of our proposal, the solutions of the complete set of instances have been compared with other metaheuristics. We compared our algorithm solving the complete set of 65 standard nonunicost SCP instances from OR Library with the newest ACO-based algorithm for SCP in the literature: ant-cover + local search (ANT + LS) [29], genetic algorithm (GA) proposed by Beasley and Chu (1996) [13], and simulated annealing (SA) proposed by Brusco et al. (1999) [14]. Tables 4 and 5 show the detailed results obtained by four algorithms. Column 2 reports the optimal or the best known solution value of each instance. The third and fourth columns show the best value and the average obtained by our ABC algorithm in the 30 runs (trials). The next columns show the average values obtained by GA, SA, and ANT + LS, respectively. The last column shows the relative percentage deviation (RPD) value over the instances tested with ABC. The quality of solutions can be evaluated using the RPD; its value quantifies the deviation of the objective value from The Scientific World Journal 5  opt which in our case is the best known cost value for each instance. This measure is computed as follows: Examining Tables 4 and 5, we observe the following.
(i) ABC is able to find the optimal solution consistently, that is, in every trial, for 43 of 65 problems.
(ii) ABC is able to find the best known value in all instances of Table 5.
(iii) ABC is able to find the best known value in all trials of Table 5.
(iv) ABC has higher success rate compared to genetic algorithm, simulated annealing, and ants in sets NRE, NRF, NRG, and NRH. The RPD of BEE is 0.00%, the RPD of GA is 1.04%, the RPD of SA is 0.72%, and the RPD of ANT + LS is 0.86%.
(v) ABC can obtain optimal solutions in some instances where the other metaheuristics failed.

Convergence to the Best Solution.
Our approach shows an excellent tradeoff between the quality of the solutions obtained and the computational effort required. In all cases, ABC converged very quickly (mainly from the 10th iteration) and its computation time in the runs was less than 2 seconds (except for NRG and NRH instances where the computation time was less than 30 secs). Figures 2 and 3 illustrate how ABC converges through the iterations to a better solution. We consider only 3 problems per chart in favor of clarity and readability: scp41, scp42, and scp43 for the first chart and scp51, scp52, and spc53 for the second one. -axis represents the iteration number whileaxis represents the reached fitness value.

Conclusion
In this paper we have presented an ABC algorithm for the SCP. We have performed experiments throught several ORLIB instances; our approach has been shown to be very effective, providing an unattended solving method, for quickly producing solutions of a good quality. Experiments 6 The Scientific World Journal The Scientific World Journal 7 showed interesting results in terms of robustness, where using the same parameters for different instances gave good results. The promising results of the experiments open up opportunities for further research. We visualize different directions for future work as follows.
(i) The fact that the presented algorithm is easy to implement clearly implies that ABC could also be effectively applied to other combinatorial optimization problems.
(ii) An interesting proposal by Teodor Crainic et al. at [30] involves parallelizing strategies for metaheuristics. The author sets a basis on the idea that the central goal of parallel computing is to speed up computation by dividing the work load among several threads of simultaneous execution; then a type of metaheuristic parallelism could come from the decomposition of the decision variables into disjoint subsets. The particular heuristic is applied to each subset and the variables outside the subset are considered fixed.
(iii) An interesting extension of this work would be related to hybridization with other metaheuristics or applying a hyperheuristic approach [31].
(iv) The use of autonomous search (AS) represents a new research field, and it provides practitioners with systems that are able to autonomously self-tune their performance while effectively solving problems. Its major strength and originality consist in the fact that problem solvers can now perform self-improvement operations based on analysis of the performances of the solving process [32][33][34].
(v) Furthermore, we are considering to use different preprocessing steps from the OR literature, which allow to reduce the problem size [35].