A New Genetic Algorithm Encoding for Coalition Structure Generation Problems

Genetic algorithms have proved to be a useful improvement heuristic for tackling several combinatorial problems, including the coalition structure generation problem. In this case, the focus lies on selecting the best partition from a discrete set. A relevant issue when designing a genetic algorithm for coalition structure generation problems is to choose a proper genetic encoding that enables an efficient computational implementation. In this paper, we present a novel hybrid encoding, and we compare its performance against several genetic encodings proposed in the literature. We show that even in difficult instances of the coalition structure generation problem, the proposed approach is a competitive alternative for obtaining good-quality solutions in reasonable computing times. Furthermore, we also show that the relevance of the encoding increases as the number of players increases.


Introduction
Let $N = \{a_1, \ldots, a_n\}$ be a finite set, $P$ a nonempty subset of $N$, and $P_n$ the power set of $N$. A partition $\Pi \subset P_n$ is a set of pairwise disjoint subsets of $N$ whose union is all of $N$ [1]. Partitions arise as solutions for problems in many fields, including Game Theory [2], Multiagent Systems [3], Logistics [4], and Mathematical Programming [5].
In this paper, we focus on selecting the best partition from a discrete set. This problem relates to the complete set partitioning problem [6] in the Operations Research and Management Science (OR/MS) literature and is commonly formulated as a coalition structure generation problem [2] in cooperative game theory. Although there is a vast body of game theory literature dealing with coalition structure generation problems, the OR/MS literature is sparser in this stream [4,7,8].
For the sake of exposition, we state our problem using concepts of cooperative game theory. In this context, $N$ represents a set of players, $P$ is called a coalition, and $\Pi$ refers to a coalition structure. We denote by $H_n$ the set of all the partitions of $N$. Let $\Pi \in H_n$ be a given partition of $N$. Each partition $\Pi$ represents a possible coalition structure.
Our motivation in coalition structure generation problems comes from the field of horizontal collaboration in logistics, where stakeholders located at the same level of the supply chain cooperate to reduce logistics costs [4,7,9,10]. Consequently, we formulate the problem as cost minimization instead of profit maximization. Thus, let $\nu: H_n \to \mathbb{R}_+$ be a function that computes the cost of each coalition structure $\Pi$, which is usually defined as the sum of the costs of each coalition that belongs to the partition. Specifically, let $c: P_n \times H_n \to \mathbb{R}_+$ be a function that computes the cost of each coalition $P$ that belongs to the coalition structure $\Pi$, which is called the characteristic function. The coalition structure generation problem, which is the problem we focus on, is given by

$$\min_{\Pi \in H_n} \nu(\Pi) = \sum_{P \in \Pi} c(P, \Pi). \tag{1}$$

The number of partitions of a set, $|H_n|$, is given by the Bell numbers [11,12], which can be computed using the recurrence formula (2), with $B_0 = B_1 = 1$:

$$B_{n+1} = \sum_{k=0}^{n} \binom{n}{k} B_k. \tag{2}$$

Note that Bell numbers increase quite fast with respect to the number of players $n$ (e.g., $B_{15} = 1{,}382{,}958{,}545$), which makes the coalition structure generation problem a difficult one to solve. In fact, this problem is NP-complete [13]. Although exact algorithms have been successfully implemented for tackling this problem [14][15][16], methods based on a specialized enumeration of the search space are impractical even for small-sized instances. For tackling these problems, the general practice is to develop an appropriate heuristic solution method rather than to rely on optimal solution approaches [17].
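To make the growth of the search space concrete, the following minimal sketch evaluates the Bell recurrence numerically. The function name `bell_numbers` is illustrative only and does not come from the paper.

```python
from math import comb


def bell_numbers(n_max):
    """Return [B_0, ..., B_{n_max}] via B_{m+1} = sum_k C(m, k) * B_k."""
    bell = [1]  # B_0 = 1
    for m in range(n_max):
        # B_{m+1} is built from all previously computed Bell numbers.
        bell.append(sum(comb(m, k) * bell[k] for k in range(m + 1)))
    return bell


def number_of_coalition_structures(n):
    """Size of the partition space |H_n| for n players."""
    return bell_numbers(n)[n]
```

Calling `number_of_coalition_structures(15)` reproduces the value of $B_{15}$ quoted in the text, which illustrates why plain enumeration quickly becomes impractical.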
Genetic algorithms (GAs) [18] have proved to be a useful improvement procedure for several combinatorial problems, including the coalition structure generation problem [19,20]. These algorithms consist of four main aspects: an encoding, a fitness function, a replacement policy, and genetic operators. An important issue when designing a genetic algorithm is to choose a proper genetic encoding that enables an efficient computational implementation. The choice of a given encoding is a crucial step in applying GAs [21], which also affects the genetic operators that act on each encoding [22].
In this paper, we review several genetic encodings proposed in the literature. The objective is to identify which of these representations provides the best performance when exploring the search space of coalition structure generation problems. The main contribution of this paper is the development of a new hybrid encoding for tackling this combinatorial problem. We show that the proposed encoding outperforms other encodings when tested on random instances drawn from common distributions.
The rest of this paper is organized as follows. Section 2 reviews the relevant literature. Section 3 reviews several encoding strategies and presents a new hybrid encoding for coalition structure generation problems. Section 4 shows numerical experiments and results. Finally, concluding remarks are provided in Section 5.

Related Work
Coalition structure generation problems can be solved for two different games: characteristic function games (CFGs) and partition function games (PFGs) [3]. In CFGs, the cost of each coalition is given by the characteristic function of the game [23]. In these problems, the objective is to minimize the sum of the costs of the formed coalitions [13]. In the OR/MS literature, characteristic function games arise in the form of set partitioning problems (SPP) and complete set partitioning problems (CSPP). Real-world applications of SPP and CSPP include, respectively, the air-crew scheduling problem [24,25] and the optimal aggregation of corporate subsidiaries in corporate tax structuring [26]. In PFGs, by contrast, the cost of a coalition may differ depending on how nonmembers are partitioned [3]. In these games, the cost of each partition is given by the partition function, which lacks a particular structure [3,13]. Applications of PFGs include, among others, oligopolies [27], communication networks [28], and collaborative logistics [29].
Several exact algorithms have been proposed for tackling coalition structure generation problems for CFGs. Particularly, Yeh [14] proposes the so-called DP algorithm, which is based on the dynamic programming approach of Bellman [30]. The DP algorithm evaluates, for every coalition, whether it is beneficial to split it, and if so, what the best such split is. Björklund et al. [31] propose a dynamic programming algorithm based on the principle of inclusion-exclusion of combinatorics and the zeta transform [12]. Michalak et al. [16] develop the so-called ODP, an optimal version of the algorithm developed by Yeh [14], which avoids its redundant operations. By contrast, Sandholm et al. [13] propose an anytime algorithm that focuses on establishing a worst-case bound on the quality of the coalition structure while only searching a small fraction of the coalition structures. Rahwan et al. [15] propose a tree-search algorithm called IP. This anytime algorithm partitions the solution space into subspaces, prunes those that have no potential of containing the optimal solution, and searches through the remaining subspaces using a branch-and-bound technique. More recently, Michalak et al. [16] develop the so-called ODP-IP, a hybrid algorithm that combines techniques from the dynamic programming approach and the anytime approach. According to the authors, this is the fastest exact algorithm for coalition structure generation problems in CFGs in practice.
To develop exact algorithms for coalition structure generation problems in PFGs, i.e., games with externalities, some authors have considered additional assumptions such as nonpositive or nonnegative externalities to avoid examining every single coalition structure [3]. Rahwan et al. [32] develop the first anytime algorithm for coalition structure generation problems in PFGs with either positive or negative externalities. In their work, the authors identify the minimum search that is required to establish a bound on the quality of the best coalition structure found. Besides, they develop an anytime algorithm that improves this bound with further search. Banerjee and Kraemer [33] extend the algorithm of Rahwan et al. [32] by considering that entities are grouped into types, which are used to define the externality imposed upon a coalition by other coalitions merging. Finally, Rahwan et al. [34] present an extended version of the algorithm of Rahwan et al. [32].
More recently, research has focused on developing new methods for concisely representing CFGs. These compact representations capture the interaction between agents in order to reduce the representation size [35]. Examples of compact representations include Marginal Contribution Nets (MC-nets) [36], logic-based representations [37], and Partition Decision Trees (PDT) [38], among others. Ideally, a compact representation should be able to represent any coalitional game using small data structures that could be used by efficient algorithms. Examples of algorithms that work with MC-nets and develop mixed-integer linear programming formulations can be found in Ueda et al. [35] and Liao et al. [39], where an improved version of the Weighted Partial MaxSAT (WPM) encoding alongside an off-the-shelf WPM solver was developed. A similar WPM-based technique was proposed in Zha et al. [40] but in the context of the PDT compact representation. The basic idea behind compact representations is to define rules that are used to compute the value of each coalition. Keeping in memory a reduced number of rules could be much more efficient than keeping all the $2^n$ values, i.e., one for each possible coalition. Even though many applications fit this framework, in general, either the problem cannot be described using a reduced number of rules or finding such a representation can be a challenging problem itself [36]. In this work, we focus on the general case where the characteristic function is considered as a black box.
Coalition structure generation problems are NP-complete [13]. Therefore, exact algorithms are useful only for solving small-sized instances. For tackling large-sized instances, nonexact algorithms, or heuristics, become a suitable alternative to obtain good-quality solutions in reasonable computing times. In a nutshell, these algorithms return solutions relatively quickly and scale up well as the number of players increases [34]. However, they provide no guarantee on the quality of their solutions. Several nonexact algorithms have been proposed for coalition structure generation problems. Shehory and Kraus [41] propose a greedy distributed algorithm with low ratio bounds. Sen and Dutta [42] develop an order-based genetic algorithm for exploring the search space. Keinänen [43] proposes a simulated-annealing approach for tackling coalition structure generation problems in CFGs. Di Mauro et al. [44] propose a GRASP approach, which is then compared with exact methods considering randomly generated characteristic functions. From a different perspective, Dos Santos and Bazzan [45] propose the so-called bee clustering algorithm, a distributed clustering algorithm inspired by swarm intelligence techniques. Later, Farinelli et al. [46] propose a hierarchical agglomerative clustering approach.
In this paper, we focus on a particular type of nonexact algorithm, the genetic algorithm (GA), which can be used for tackling coalition structure generation problems in both CFGs and PFGs [42]. Developed by Holland [18], GAs are stochastic search algorithms that emulate biological evolution based on Charles Darwin's theory of natural selection [47]. For surveys on GAs, see Whitley and Sutton [48] and Srinivas and Patnaik [49]. Moreover, a comprehensive overview can be found in Beasley et al. [50,51]. GAs are also included in the Evolutionary Computation field [52]. A review of bioinspired computing algorithms can be found in Kar [53]. GAs have been effective in tackling several NP-complete combinatorial optimization problems, including the coalition structure generation problem. Particularly, Levine [19] proposes a parallel genetic algorithm with a penalization function for solving the set partitioning problem. Chu and Beasley [20] present a steady-state GA in conjunction with a specialized heuristic improvement operator for solving the same problem. Other GA applications include Venugopal and Narendran [54] for machine-component grouping, Gonçalves and Resende [55] for manufacturing cell formation, and Boulif [56] for graph partitioning.
As stated in Section 1, one of the main decisions when designing a GA is to choose a genetic encoding. The selection of a proper encoding is a crucial step when implementing GAs [21], which also affects the genetic operators that act on each encoding [22]. Particularly, the encoding of solutions should implicitly give a semantic description of what a good solution is [57]. Some encodings have been proposed for tackling coalition structure generation problems within a GA framework. For example, Sen and Dutta [42] propose an order-based encoding; Chu and Beasley [58] and Levine [19] propose a column-based encoding; and Gonçalves and Resende [55] propose a fractional encoding. As shown in Section 3, however, several other encodings can be applied for tackling this problem.

GAs for Coalition Structure Generation Problems
In this section, we detail GAs for the specific setting of coalition structure generation problems. Particularly, in Section 3.1, we provide an overview of GAs; in Section 3.2, we review several encoding schemes for coalition structure generation problems; in Section 3.3, we discuss the new encoding scheme; and in Section 3.4, we summarize the features of the encodings analyzed.

Overview of GAs.
GAs are stochastic search algorithms that mimic the genetic evolution of species. In a GA framework, the main components are an encoding scheme, a fitness function, a replacement policy, and genetic operators. GAs start with an initial population that evolves, yielding a new population of the same size. The main operators used to create a new generation are selection, recombination, and mutation. A peculiarity of this approach is that genetic operators do not work directly on the solution space or phenotype; solutions have to be coded as strings over a particular alphabet. In this case, an encoding scheme is used to represent the solutions as chromosomes or individuals that belong to the genotype space. Algorithm 1 shows a pseudocode for a basic GA version [20]. We refer the reader to Beasley et al. [50,51], Pirlot [57], and Boussaïd et al. [59] for comprehensive overviews.
The performance of GAs depends on the relationship between phenotype and genotype. There are three types of relations: one-to-one, one-to-many, and one-to-none. The one-to-one relation seems to be the ideal case. Unfortunately, in many instances, it is quite hard to find a simple encoding with this property.
The one-to-many relation occurs when it is possible to represent a solution through many different chromosomes. In this case, the GA is said to face redundancies. Redundancies are detrimental to the performance of GAs since the genotype space expands and a larger space needs to be explored. Finally, in the one-to-none relation, some solutions cannot be represented as chromosomes. In this case, the GA is said to face blindness since promising sectors of the search space may not be analyzed. For further insight into both definitions, we refer the reader to Boulif [56].

Encoding Schemes for Coalition Structure Generation Problems.
We review several encoding schemes and discuss some genetic operators and their variants. For the sake of clarity, we focus solely on encoding schemes that do not face blindness.

Column-Based Encoding.
Column-based encoding was first proposed for tackling set partitioning, set covering, and graph partitioning problems [19,58,60,61]. In this scheme, the genes are 0-1 bits. Bits correspond to coalitions, and the values 1 or 0 indicate whether the coalition is in the coalition structure or not. An example of this encoding/decoding is depicted in Figure 1 (Gen 1).
Column-based encoding does not face redundancies. However, the size of the genotype space is much larger than the size of the phenotype space (i.e., $2^m \gg |H_n|$, where $m = 2^n - 1$ is the number of coalitions). The excess of individuals in the genotype space is due to the existence of chromosomes that cannot be decoded into valid partitions (see Figure 1, Gen 2). This phenomenon is a major drawback of column-based encoding since many iterations are required to get feasible individuals. To deal with this issue, some authors propose the use of penalty functions [19,60] and feasibility repair operators [20,62].
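The gap between genotype and phenotype space can be checked on a tiny instance. The sketch below decodes a column-based chromosome and counts how many of the $2^m$ bit strings are valid partitions. The mapping from gene index to coalition (gene $j$ stands for the coalition whose members are the set bits of the integer $j$) is an assumption made for illustration; the paper does not fix a particular ordering of coalitions.

```python
from itertools import product


def decode_column_based(bits, n):
    """Decode a 0-1 chromosome of length 2**n - 1 into a list of coalitions.

    Assumption (not from the source): gene j (1-based) stands for the coalition
    whose members are the set bits of j, with players numbered 1..n.
    Returns None when the selected coalitions do not form a partition.
    """
    if len(bits) != 2 ** n - 1:
        raise ValueError("chromosome must have 2**n - 1 genes")
    covered = 0  # bitmask of players already assigned to a coalition
    coalitions = []
    for j, bit in enumerate(bits, start=1):
        if not bit:
            continue
        if covered & j:  # some player would belong to two coalitions
            return None
        covered |= j
        coalitions.append({p + 1 for p in range(n) if (j >> p) & 1})
    if covered != 2 ** n - 1:  # some player is left without a coalition
        return None
    return coalitions


def count_feasible(n):
    """Brute-force count of chromosomes that decode into valid partitions."""
    m = 2 ** n - 1
    return sum(
        1 for bits in product((0, 1), repeat=m)
        if decode_column_based(bits, n) is not None
    )
```

For three players there are $2^7 = 128$ chromosomes, and under this mapping only as many of them are feasible as there are partitions of the player set, which is consistent with the absence of redundancies and the large share of infeasible individuals described above.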

Row-Based Encoding.
A row-based encoding works by creating empty clusters. Then, the clusters are filled up with players, and the formed coalitions correspond to the nonempty clusters. Since coalition structures are formed by at most $n$ coalitions, exactly $n$ clusters are enough to represent any possible solution. We next review two standard row-based encodings: integer and fractional.
Integer row-based encoding was proposed by Venugopal and Narendran [54] in the context of Machine-Component Grouping. In this scheme, solutions are represented using an array of $n$ components with values in $\{1, \ldots, n\}$. Each component represents a player, and its value indicates the cluster in which the player is included. An example of this encoding/decoding is depicted in Figure 2.
Fractional row-based encoding was proposed by Gonçalves and Resende [55] in the context of Manufacturing Cell Formation. In this setting, each solution is represented by an array of $(n + 1)$ components with values in the interval $[0, 1]$. The decoding process is as follows. The last component of the array is multiplied by the number of players $n$, which yields the number of clusters. Let $n_c$ be this number. Now, the interval $[0, 1]$ is divided into $n_c$ subintervals of equal length: $[0, 1/n_c], (1/n_c, 2/n_c], \ldots, ((n_c - 1)/n_c, 1]$. Each subinterval represents one of the $n_c$ clusters. Finally, the players, represented by the first $n$ components of the array, are assigned to the clusters according to the component value and the subinterval to which it belongs. An example of this encoding/decoding is depicted in Figure 3.
The integer and fractional row-based encodings share some common features. First, they are phenotype feasible, which means that chromosomes always generate valid partitions. Moreover, conventional crossover and mutation operators can be directly applied. However, these encodings face redundancies due to the interchangeable role of clusters. For example, consider the coalition structure $\{\{1, 3, 4\}, \{2, 5\}\}$. In row-based integer encoding, this partition is represented not only by the chromosome $(1, 2, 1, 1, 2)$ but also by $(2, 1, 2, 2, 1)$, $(1, 3, 1, 1, 3)$, and so on. In general, the number of redundant chromosomes for a coalition structure is given by equation (3), where $n$ represents the number of clusters and $|\Pi|$ the number of formed coalitions:

$$\frac{n!}{(n - |\Pi|)!}. \tag{3}$$

By contrast, in the row-based fractional encoding, the amount of redundancy is larger since we deal with both cluster interchangeability and machine precision. For example, consider the chromosome of Figure 3. In this case, the value of the first allele could be replaced by any number in the interval $[0, 0.5)$ without changing the resulting coalition structure. Let $M$ be the number of values that a machine can distinguish in the interval $[0, 1]$. Then, the number of redundant chromosomes that map to the coalition structure $\Pi$ is given by equation (4), where $(M/n)^{n+1}$ denotes the number of ways that every component can be replaced by a value in the same subinterval.
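The two row-based decoders can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the text does not say how the product of the last gene and $n$ is rounded to an integer number of clusters, so rounding up with a minimum of one cluster is assumed here, and the function names are hypothetical.

```python
import math


def decode_integer_rows(chromosome):
    """Integer row-based decoding: gene i holds the cluster label of player i + 1."""
    clusters = {}
    for player, label in enumerate(chromosome, start=1):
        clusters.setdefault(label, set()).add(player)
    # Only nonempty clusters become coalitions; sort them for a stable output.
    return sorted(clusters.values(), key=min)


def decode_fractional_rows(chromosome):
    """Fractional row-based decoding: n player genes plus one cluster-count gene."""
    n = len(chromosome) - 1
    # Assumption: round up and keep at least one cluster (rounding rule not in the text).
    n_c = max(1, math.ceil(chromosome[-1] * n))
    clusters = {}
    for player, value in enumerate(chromosome[:n], start=1):
        # Subintervals are [0, 1/n_c], (1/n_c, 2/n_c], ..., so value 0 maps to cluster 1.
        index = max(1, math.ceil(value * n_c))
        clusters.setdefault(index, set()).add(player)
    return sorted(clusters.values(), key=min)
```

With these definitions, relabeling the clusters of an integer chromosome leaves the decoded partition unchanged, and perturbing a fractional gene inside its subinterval does the same, which is the redundancy discussed in the text.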

Order-Based Encoding.
Order-based encoding was proposed by Sen and Dutta [42] for tackling coalition structure generation problems. In this scheme, coalition structures are coded using permutations on the set $\{1, \ldots, 2n - 1\}$. In the chromosome, the numbers $\{1, \ldots, n\}$ represent the players, whereas the numbers $\{n + 1, \ldots, 2n - 1\}$ stand for separators or coalition breakers, which determine where a coalition ends and the next one begins. An example of this encoding/decoding is depicted in Figure 4.
The idea behind order-based encoding has been widely applied in the Traveling Salesman Problem (TSP) setting [63]. In this scheme, the conventional operators, such as multipoint crossover or bit-flip mutation, fail to create new valid chromosomes since the children are typically not permutations. Specific operators have been developed to deal with this issue. Examples are the PMX-crossover (PMX) [64], the Edge Recombination (ER), and the Swap-mutation [65].

Algorithm 1: A basic GA.
(1) Generate an initial population;
(2) Evaluate fitness of individuals in population;
(3) repeat
(4)   Select parents from the population;
(5)   Apply a crossover operation to get new individuals;
(6)   Mutate a random proportion of these new individuals;
(7)   Evaluate the current population by its fitness;
(8)   Create a new population using a replacement strategy;
(9) until a termination condition is satisfied
(10) Return the current population.
Sen and Dutta [42] computed a lower bound for the number of redundancies faced by order-based encoding. To compare with the other encodings, we improve the analysis by determining the exact number of redundancies. Let $\Pi$ be a coalition structure formed by $|\Pi|$ coalitions. Proposition 1 provides the number of redundancies for the order-based encoding.

Proposition 1. Let $\Pi$ be a coalition structure formed by $|\Pi|$ coalitions. Then, the number of chromosomes representing $\Pi$ in the order-based encoding is given by

$$(n - 1)! \left(\prod_{P \in \Pi} |P|!\right) |\Pi|! \binom{n}{|\Pi|}. \tag{5}$$

Proof. There are four sources of redundancies in the order-based encoding. First, the role of separators is interchangeable, so we have to include the factor $(n - 1)!$. Second, players within the same coalition can be reordered without affecting the formed coalition, so the factor $|P|!$ appears for each formed coalition $P$. Similarly, the position of coalitions in the chromosome is also interchangeable, so we have to include the factor $|\Pi|!$. Finally, note that to represent $\Pi$, we need $|\Pi| - 1$ separators. As the encoding always considers $n - 1$ separators, the extra $n - |\Pi|$ can be placed either next to an already fixed separator or at the extremes of the chromosome. This yields a combinatorial number that is equivalent to the number of vectors $(x_1, \ldots, x_{|\Pi|+1})$ of nonnegative integers that satisfy $x_1 + \cdots + x_{|\Pi|+1} = n - |\Pi|$.
This problem is well known in combinatorics, and its general solution is

$$\binom{n' + k' - 1}{k' - 1},$$

where $k'$ is the size of the vector and $n'$ is the amount we want to sum. In our case, $k' = |\Pi| + 1$ and $n' = n - |\Pi|$, which gives $\binom{n}{|\Pi|}$, the last factor in the expression.
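The four factors named in the proof can be checked numerically on small instances. The following sketch decodes every permutation of $\{1, \ldots, 2n - 1\}$ and compares the brute-force count with the product $(n-1)! \, (\prod_{P} |P|!) \, |\Pi|! \, \binom{n}{|\Pi|}$. The function names are illustrative, and the decoder assumes that consecutive or boundary separators simply produce no coalition.

```python
from itertools import permutations
from math import comb, factorial, prod


def decode_order_based(chromosome, n):
    """Split a permutation of 1..2n-1 at the separators (values greater than n)."""
    coalitions, current = [], set()
    for gene in chromosome:
        if gene <= n:
            current.add(gene)
        elif current:  # a separator closes the current nonempty coalition
            coalitions.append(current)
            current = set()
    if current:
        coalitions.append(current)
    return frozenset(frozenset(c) for c in coalitions)


def count_order_based(partition, n):
    """Brute-force number of chromosomes that decode into `partition`."""
    target = frozenset(frozenset(c) for c in partition)
    return sum(
        1 for perm in permutations(range(1, 2 * n))
        if decode_order_based(perm, n) == target
    )


def formula_order_based(partition, n):
    """Product of the four redundancy factors discussed in the proof."""
    k = len(partition)
    return (
        factorial(n - 1)
        * prod(factorial(len(c)) for c in partition)
        * factorial(k)
        * comb(n, k)
    )
```

For instance, `count_order_based([{1, 2}, {3}], 3)` and `formula_order_based([{1, 2}, {3}], 3)` can be compared directly; the enumeration grows as $(2n - 1)!$, so this check is only practical for very small $n$.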

Random-Key Encoding.
Random-key encoding was proposed by Bean [66] for tackling several sequencing problems. In this scheme, a solution is encoded with random numbers drawn from $[0, 1]$. The decoding process is carried out by sorting the alleles in ascending order, which results in a permutation of the allele positions. Thus, in the context of coalition structure generation problems, a random-key encoding with $2n - 1$ alleles decodes into a permutation on the set $\{1, \ldots, 2n - 1\}$ that can be interpreted as a chromosome of the order-based encoding. An example of this encoding/decoding is depicted in Figure 5.
An advantage of using a random-key encoding is that it enables the use of simple genetic operators since the chromosome is represented quite clearly. However, redundancy increases considerably due to the use of real-valued genes.
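A minimal sketch of the two-stage decoding follows: the keys are first sorted to obtain a permutation, which is then read as an order-based chromosome. The convention that the permutation lists allele positions in ascending key order is an assumption, as is the function name.

```python
def decode_random_key(keys, n):
    """Decode 2n - 1 random keys into (permutation, coalitions).

    Assumption: the permutation lists the 1-based allele positions sorted by
    ascending key value; values greater than n then act as separators.
    """
    if len(keys) != 2 * n - 1:
        raise ValueError("expected 2n - 1 keys")
    order = sorted(range(len(keys)), key=keys.__getitem__)
    permutation = [i + 1 for i in order]
    coalitions, current = [], []
    for gene in permutation:
        if gene <= n:
            current.append(gene)
        elif current:  # a separator closes the current nonempty coalition
            coalitions.append(current)
            current = []
    if current:
        coalitions.append(current)
    return permutation, coalitions
```

Because any small perturbation of the keys that preserves their relative order decodes into the same permutation, many real-valued chromosomes map to one order-based chromosome, which is the additional redundancy mentioned above.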

A New Codification Scheme: The Order-Based Bit-Key Encoding.
In this paper, we propose a new encoding scheme for tackling coalition structure generation problems: order-based bit-key encoding (OBBK). The proposed encoding is based on both order-based and random-key encoding. In this scheme, partitions are encoded using an order-based chromosome of length $n$ and a bit-string key with $n - 1$ components. The order-based segment provides an order for the players, while the values in the key indicate the coalition breaks. Therefore, if player $x_i$ is next to player $x_{i+1}$ in the order-based chromosome, the value of the $i$th component of the key indicates whether the players belong to the same coalition or not. An example of this encoding/decoding is depicted in Figure 6.
Despite the similarities with order-based encoding, OBBK encoding avoids several sources of redundancies. Proposition 2 provides the number of redundancies for this encoding.

Proposition 2. Let $\Pi$ be a coalition structure formed by $|\Pi|$ coalitions. Then, the number of chromosomes representing $\Pi$ in the OBBK encoding is given by

$$\left(\prod_{P \in \Pi} |P|!\right) |\Pi|!. \tag{6}$$

Proof. Following the same argument as in the proof of Proposition 1, we note that OBBK encoding faces neither the first nor the fourth source of redundancies, namely, separator interchangeability and excess of separators. By contrast, OBBK still suffers from player and coalition interchangeability; thus, the number of redundancies is given by $\left(\prod_{P \in \Pi} |P|!\right) |\Pi|!$.
We also develop specific genetic operators for the OBBK encoding. These operators involve the use of conventional operators for order-based and bit-string encodings, which are applied separately to each segment of the array. For instance, in the crossover step, it is possible to recombine the order-based section using the PMX operator, while the key section is recombined through the single-point operator. An advantage of this strategy is that the use of standard operators, such as PMX or ER, is quite simple because the order-based segment is smaller, namely, $n$ components for OBBK against $2n - 1$ in the order-based encoding. Figure 7 depicts the use of genetic operators when using OBBK encoding.
Table 1 summarizes the encodings reviewed and their main features. Column G-space indicates the size of the genotype space, which is the search space for GAs. A straightforward comparison reveals that OBBK encoding provides the smallest search space, hence the smallest number of redundancies.
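The OBBK decoding and its redundancy count can be sketched and checked by brute force as follows. The convention that a key bit equal to 1 marks a coalition break between consecutive players is an assumption (the text only states that the bit indicates whether the two players share a coalition), and the function names are illustrative.

```python
from itertools import permutations, product
from math import factorial, prod


def decode_obbk(order, key):
    """Decode an OBBK chromosome.

    order: permutation of the players 1..n.
    key: n - 1 bits; assumption: key[i] == 1 means a break between
    order[i] and order[i + 1], and 0 means they share a coalition.
    """
    coalitions = [{order[0]}]
    for player, bit in zip(order[1:], key):
        if bit:
            coalitions.append({player})  # start a new coalition
        else:
            coalitions[-1].add(player)   # extend the current coalition
    return frozenset(frozenset(c) for c in coalitions)


def count_obbk(partition, n):
    """Brute-force number of (order, key) pairs that decode into `partition`."""
    target = frozenset(frozenset(c) for c in partition)
    return sum(
        1
        for order in permutations(range(1, n + 1))
        for key in product((0, 1), repeat=n - 1)
        if decode_obbk(order, key) == target
    )


def formula_obbk(partition):
    """Player and coalition interchangeability: (prod |P|!) * |Pi|!."""
    return prod(factorial(len(c)) for c in partition) * factorial(len(partition))
```

As a usage example, `decode_obbk((3, 1, 4, 2, 5), (0, 0, 1, 0))` yields the structure with coalitions {1, 3, 4} and {2, 5}, and `count_obbk` can be compared with `formula_obbk` for small $n$, since the genotype space has only $n! \, 2^{n-1}$ elements.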

Numerical Experiments
In this section, we detail the experiments and numerical results. Particularly, in Section 4.1, we show how to generate instances for the coalition structure generation problems; in Section 4.2, we discuss the implementation of genetic algorithms in this context; and in Section 4.3, we analyze the insights gathered from the numerical experiments.

Generating Instances for Coalition Structure Generation Problems.
Instances for coalition structure generation problems can be divided into four groups: structured CFGs, structured PFGs, irregular CFGs, and irregular PFGs. We define as structured instances those in which the objective function defines a simple structure on the partition space. For this case, due to the regularity induced, the optimal solution can be easily computed a priori. Several real-life applications, however, involve problems with less or even no structure at all. We define as irregular instances those instances constructed with a specific objective function in which the cost of individual coalitions is sampled from random values. In this case, the optimal solution is unknown a priori.
In either case, structured or not, the GAs developed in this section have no preliminary notions about the data, nor do they make any assumption about the objective function.

Structured CFGs and PFGs Instances.
Sen and Dutta [42] propose a novel framework for generating CFG and PFG instances with some regularity in the partition space. In this paper, we use their approach to generate instances for both structured CFGs and PFGs.
For CFGs, on the one hand, structured instances can be obtained by using a coalition cost that depends on both the size of the coalition and an intracoalition distance function. Consider a set $\{1, \ldots, n\}$ of players. Then, given a coalition $P = \{i_1, \ldots, i_{|P|}\}$, the cost of $P$, $c(P)$, is computed according to equation (7). In equation (7), the function $\phi: P_n \to \mathbb{R}_+$ works as a penalty function that establishes preferences for the size of the formed coalitions. Particularly, if we seek coalitions with $K$ players in the optimal coalition structure, then we can accomplish this by using the function $\phi$ given in equation (8).
For generating structured PFG instances, on the other hand, the intracoalition distance function of equation (7) is replaced by an intercoalition distance function. The new distance function penalizes coalition structures in which the formed coalitions are far from each other. For this case, given a coalition structure $\Pi$ and a coalition $P \in \Pi$, we compute the cost of $P$ by equation (9). Note that the use of an intracoalition distance makes the cost of a coalition independent of the other formed coalitions. By contrast, the intercoalition distance depends explicitly on the other coalitions within the coalition structure, so the problem becomes a PFG. In both cases, i.e., structured CFG and PFG instances, the optimal solution is given by equation (10), assuming a function $\phi$ given by equation (8).

Irregular CFGs and PFGs Instances.
For generating irregular CFG instances, we build on the approach of Rahwan et al. [15]. In their work, the cost of each coalition $P$ is sampled from an independent normal distribution with parameters $\mu_P = |P|$ and $\sigma_P = \sqrt{|P|}$. With this choice, the value of a coalition structure is proved to satisfy $\nu(\Pi) \sim N(n, n)$ (Theorem 6 in Rahwan et al. [15]). In this paper, we generalize their approach so that the value of a coalition structure satisfies $\nu(\Pi) \sim N(\mu, \sigma^2)$ for any parameter values $\mu$ and $\sigma$ defined by the user. Consider Proposition 3.

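Proposition 3. Let $\Pi$ be a coalition structure and let $P$ be a coalition of size $|P|$. If the cost of coalition $P$, $c(P)$, is sampled from a normal distribution $N(\mu_P, \sigma_P^2)$, with mean and variance given by $\mu_P = (|P|/n)\mu$ and $\sigma_P^2 = (|P|/n)\sigma^2$, then the cost of partition $\Pi$, $\nu(\Pi)$, follows a normal distribution $N(\mu, \sigma^2)$.

Proof. Let $\Pi$ be a coalition structure. Since the coalition costs are sampled independently and a sum of independent normal random variables is again normal, the cost of this partition satisfies

$$\nu(\Pi) = \sum_{P \in \Pi} c(P) \sim N\left(\frac{\mu}{n} \sum_{P \in \Pi} |P|, \; \frac{\sigma^2}{n} \sum_{P \in \Pi} |P|\right). \tag{11}$$

Now, for any coalition structure $\Pi$, it holds that $\sum_{P \in \Pi} |P| = n$. Therefore, $\nu(\Pi) \sim N(\mu, \sigma^2)$, and all coalition structures are equally likely to be optimal.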
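The scaling used in Proposition 3 can be illustrated with a short simulation. The sketch below repeatedly resamples the coalition costs of one fixed coalition structure and reports the empirical mean and variance of the total cost; in an actual instance each coalition cost would be drawn once and then kept fixed. All function names and the choice of Python's `random` module are assumptions made for illustration.

```python
import random
import statistics


def coalition_cost_parameters(size, n, mu, sigma2):
    """Mean and variance of the cost of a coalition with `size` players."""
    return (size / n) * mu, (size / n) * sigma2


def sample_structure_cost(partition, n, mu, sigma2, rng):
    """Draw one cost per coalition and add them up."""
    total = 0.0
    for coalition in partition:
        mean, variance = coalition_cost_parameters(len(coalition), n, mu, sigma2)
        total += rng.gauss(mean, variance ** 0.5)  # gauss takes a standard deviation
    return total


def empirical_moments(partition, n, mu, sigma2, samples=20000, seed=0):
    """Empirical mean and variance of the structure cost over many resamplings."""
    rng = random.Random(seed)
    costs = [sample_structure_cost(partition, n, mu, sigma2, rng) for _ in range(samples)]
    return statistics.fmean(costs), statistics.pvariance(costs)
```

For any partition of the same player set, the per-coalition means add up to $\mu$ and the variances to $\sigma^2$, so the empirical moments returned by `empirical_moments` should be close to the requested parameters regardless of how the players are grouped.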

Table 1: Summary of the reviewed encodings and their main features (columns: Encoding, Genes, Feasibility, G-space, Redundancies).

For generating irregular PFG instances, by contrast, we assign a partial cost $c(P)$ to each coalition $P$. We sample the partial costs independently from the same distribution. Then, for each coalition $P \in \Pi$, the cost of $P$ in $\Pi$ is given by equation (12). Finally, the cost of a coalition structure is computed by adding up the costs of all the coalitions, which is given by equation (13). Since the cost depends on the number of formed coalitions, this kind of problem belongs to the PFG class. Moreover, if we assume (12) and (13), the problem corresponds to a PFG with negative externalities. To prove this, let $\Pi'$ denote a coalition structure resulting from merging two coalitions in $\Pi$.
In this paper, we sample the partial costs $c(P)$ independently from three distributions, namely, a uniform distribution with values in $[0, 100]$; an exponential distribution with parameter $\lambda = 1/50$; and a normal distribution with parameters $\mu = 50$ and $\sigma^2 = 100$.

Implementation Details.
In this section, we discuss further the implementation of GAs for the coalition structure generation problem. Consider Algorithm 2.
Step 1 creates the initial population by randomly selecting chromosomes in the genotype space.
Step 4 copies the best individuals into the next population to preserve the good chromosomes.
Step 6 performs a fitness-based random selection. Note that the individuals with the best fitness values are more likely to participate in the recombination procedure. For the rest of the individuals, Step 11 performs a uniformly random selection, which gives individuals with poor fitness values a chance to appear in the next generation and helps avoid premature homogenization.
Step 5 uses the crossover rate parameter CR, which establishes the number of individuals that should be obtained as mutated children in the next generation.
Step 8 considers the mutation probability parameter MP, which usually is taken as a small value. Note that the size of the population remains constant and equal to N. Finally, Step 14 states the stopping criterion. Table 2 summarizes the value of the parameters used in for all the algorithms. Note that the population size depends on the number of players n. We proceed in that way, considering that, as the numbers of players increases, the feasible set size grows according to Bell numbers (see Section 1).
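The steps discussed above can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation. Several parts are assumptions made only for illustration: the label-vector encoding (gene i is the coalition label of player i), the rank-based parent selection, the one-point crossover, the single-gene mutation, the toy cost function, and all parameter defaults.

```python
import math
import random


def toy_cost(labels):
    """Hypothetical cost: penalizes coalitions whose size differs from 2."""
    sizes = {}
    for label in labels:
        sizes[label] = sizes.get(label, 0) + 1
    return sum((size - 2) ** 2 for size in sizes.values())


def run_ga(cost, n_players, pop_size=20, generations=50,
           cr=0.6, mp=0.05, n_elite=2, seed=0):
    """Generational GA for a minimization problem (requires n_players >= 2)."""
    rng = random.Random(seed)
    # Step 1: random initial population (assumed label-vector encoding).
    pop = [[rng.randrange(n_players) for _ in range(n_players)]
           for _ in range(pop_size)]
    for _ in range(generations):                   # loop closed by Step 14
        pop.sort(key=cost)                         # lowest cost first
        nxt = [ind[:] for ind in pop[:n_elite]]    # Step 4: keep the best
        # Assumed rank-based weights: better rank, higher selection chance.
        weights = [pop_size - rank for rank in range(pop_size)]
        # Assumption: clamp the number of children so the size stays constant.
        n_children = min(math.ceil(cr * pop_size), pop_size - n_elite)
        for _ in range(n_children):                # Step 5
            p1, p2 = rng.choices(pop, weights=weights, k=2)   # Step 6
            cut = rng.randrange(1, n_players)      # Step 7: one-point crossover
            child = p1[:cut] + p2[cut:]
            if rng.random() < mp:                  # Step 8: mutate with prob. MP
                child[rng.randrange(n_players)] = rng.randrange(n_players)
            nxt.append(child)                      # Step 9
        while len(nxt) < pop_size:                 # Steps 10-12: uniform fill
            nxt.append(rng.choice(pop)[:])
        pop = nxt                                  # Step 13
    return min(pop, key=cost)                      # Step 15


best = run_ga(toy_cost, n_players=6)
print(best, toy_cost(best))
```

Because the best individuals are copied unchanged into each new population, the best cost in the population never increases from one generation to the next.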

Numerical Results.
In this section, we perform several numerical experiments for the problems defined in the previous sections, that is, for the structured CFG and PFG problems and the irregular CFG and PFG problems. As stated before, the optimal solution for the structured problems is known in advance and given by equation (10). For the irregular CFG, on the contrary, although the solutions are not known in advance, the optimal solutions can be computed using the ODP-IP algorithm as long as the number of players does not exceed 25. In the case of irregular PFG problems, we define three types of instances associated with three different probability distributions: uniform, exponential, and normal. In summary, we have six different types of problems, and for each of them, we generate 23 instances. Since each instance has a different number of players, varying between 8 and 30, we solved 138 instances in total.
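To give a sense of how quickly the feasible set grows over the range of 8 to 30 players, the Bell numbers mentioned in the implementation details can be computed with the Bell triangle. The function name `bell_numbers` is ours and the snippet is only an illustrative sketch.

```python
def bell_numbers(n_max):
    """Return [B_0, ..., B_n_max], where B_n counts the partitions of n items."""
    bells = [1]   # B_0
    row = [1]     # first row of the Bell triangle
    for _ in range(n_max):
        # Each new row starts with the last entry of the previous row;
        # every further entry adds the entry above-left to its left neighbor.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells


bells = bell_numbers(30)
print(bells[8])   # 4140 coalition structures for 8 players
print(bells[10])  # 115975 coalition structures for 10 players
```

Python integers have arbitrary precision, so the same function also returns the exact count for 30 players via `bells[30]`.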
Moreover, we compare the performance of six GAs, one for each reviewed encoding scheme. Note that each encoding may work with several crossover and mutation operators, which yields a combinatorial number of cases. For the sake of simplicity, we limited the computational experiments to testing each encoding with its classic operators. Table 3 summarizes the encodings and the operators used.
In order to mimic a multistart technique, the genetic algorithms are executed ten times, starting from different initial populations at each run. Note that the generation of the initial population (Step 1 of Algorithm 2) depends on the encoding scheme. Thus, all algorithms start from different initial populations.
We compute the relative gaps in order to measure the performance of the algorithms [16]. For the instances whose optimal solutions are known, the relative optimality gaps are computed using equation (14). In this equation, ν denotes the best value reported by the algorithm, whereas ν* denotes the known optimal solution. On the contrary, for those instances where the optimal solution ν* is not known, i.e., for irregular PFGs, we use the same expression (equation (14)), but replacing ν* by the best lower bound:
First, we discuss the results for structured instances, which are depicted in Figure 8. The x-axis shows the number of players. The y-axis shows the average GAP attained by each algorithm in each instance. At the top of each bar, we point out the algorithm that achieves the lowest average GAP and the corresponding value. Since the instances are independent, it is expected that the average GAP differs from one instance to another. However, there is a general trend toward larger GAPs for instances with a greater number of players. The OBBK algorithm shows the best performance in all instances for coalition structure generation problems in both CFGs and PFGs. Even for a large number of players, this algorithm keeps the GAP below 5%. The OBBK algorithm, moreover, reaches a zero GAP in many of the instances, which means that the algorithm finds the optimal solution in all ten runs. Besides, in cases where the average GAP is not zero, the algorithm still finds the optimal solution in some of the runs. Specifically, this happens in 40% of the tested instances for the CFG with 30 players, and in 20% of the instances for the PFG with 27 players.
Now, we focus on the results for the irregular CFG instances. In these instances, the optimal solution is unknown a priori. To compute the optimality GAP, we use the ODP-IP method proposed by Michalak et al. [16] to search for the optimal solution. Note that this method and its implementation are defined for maximization problems. However, our formulation is stated as a minimization problem. To cope with this issue, for M large enough, we use as input the reward function: The ODP-IP method works, however, only for instances of up to 25 players. For instances with more than 25 players, instead of computing the relative error, we compute the average minimum value for each algorithm. These values are reported in Table 4.
Figure 9 shows that the average GAP increases when the number of players increases. In general terms, the Bit-col and Introw algorithms show poor performance. Again, for the same initial population size and the same number of generations, the OBBK algorithm has the best performance in all instances. The differences in the average GAPs are now much clearer, especially for larger instances, for which the other algorithms have a GAP up to five times larger than that of OBBK. Even though the OBBK algorithm shows the best performance, the GAP increases up to 24% for the largest instance, far from the 5% reached in the case of structured instances. Figure 10 shows 95% confidence intervals for the GAP of the OBBK algorithm when solving irregular CFG problems. We considered instances with at most 25 players since, as stated before, this is the maximum number of players for which the ODP-IP algorithm computes the optimal solution [16]. Note that the length of the confidence intervals grows as the class of problems becomes more complex, that is, as the number of players increases.
We now focus on the results for the irregular PFG instances. Consider Figures 11-13. The interpretation of these results is not straightforward. For the three distributions considered, we obtain bar graphs with different characteristics. For the uniform case, the GAP roughly decreases as the number of players increases. For the exponential case, the behavior is similar to that of the structured instances, while, for the normal case, there is no correlation between the GAP and the number of players. An explanation for this phenomenon could be the procedure used to compute the GAP. As we mentioned before, for this type of instance, the optimum is unknown, and there are no tools for computing it in reasonable computing time. When there is no guarantee of reaching the optimal solution, the error is computed using different lower bounds, which depend on each distribution. In the case of the uniform and exponential distributions, the minimum value is 0, which is set as the lower bound. For the normal distribution, we use as lower bound the value 20, which corresponds to μ − 3σ.
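For reference, the lower bound used in the normal case follows directly from the sampling parameters stated earlier, μ = 50 and σ² = 100:

\[
\mu - 3\sigma = 50 - 3\sqrt{100} = 50 - 30 = 20 .
\]

As a general property of the normal distribution, roughly 99.9% of the sampled partial costs lie above this value, so it acts as a practical rather than a strict lower bound on a single draw.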
(1) Create an initial population Q with N individuals.
(5) for j = 1 to ⌈CR · N⌉ do
(6)   Select two parents from the current population Q.
(7)   Generate one child by applying the crossover operator.
(8)   Mutate the generated child with probability MP.
(9)   Add the child into the next population Q_next.
(10) while the size of Q_next is less than N do
(11)   Select an individual x in the current population Q.
(12)   Add x into Q_next.
(13) Update Q ← Q_next
(14) until the generations limit is reached
(15) Return the best individual in Q.

ALGORITHM 2

The OBBK algorithm reports the best performance in almost all instances. The Frac-row algorithm also performs well, reaching better optimality GAPs than OBBK in some instances with few players.
This last algorithm was also competitive in the Normal CFG instances. Nevertheless, for the structured instances, the second-best algorithm is OB.
Finally, we point out that the OBBK algorithm achieves an effectiveness of 80% in 15 of the 18 studied instances, i.e., in 8 of 10 runs, this algorithm finds the optimal solution. Frac-row is the closest algorithm in terms of effectiveness, reaching the same 80% but only in 4 of the 18 instances studied.

Concluding Remarks
In this paper, we review and compare several encodings used in the literature for tackling coalition structure generation problems using GAs. Moreover, we propose a novel hybrid encoding that outperforms the existing ones in all the studied instances. Even for difficult cases, such as the irregular CFG and PFG instances, the proposed approach allows us to obtain good solutions in reasonable computing times. We also show that the relevance of the encoding increases as the number of players increases. For the specific case of irregular CFG, and for small instances, our solution approach can find near-optimal solutions. For larger instances, namely, those with more than 25 players, which is the limit of the previous exact methodologies, our solution method performs well on very different instances, which indicates the robustness of our procedure. This robustness is a distinctive feature of the OBBK encoding since, in general, the other encodings work well on specific structures and tend to provide worse results when these structures change. Finally, it is worth mentioning that GAs provide a solution approach for the unstructured PFG, which, until now, cannot be solved to optimality for large instances.

Data Availability
The simulated data used to support the findings of this study are available from the corresponding author upon request.