A High-Performance Genetic Algorithm: Using Traveling Salesman Problem as a Case

This paper presents a simple but efficient algorithm for reducing the computation time of the genetic algorithm (GA) and its variants. The proposed algorithm is motivated by the observation that genes common to all the individuals of a GA have a high probability of surviving the evolution and ending up as part of the final solution; as such, they can be saved away to eliminate redundant computations at the later generations of a GA. To evaluate the performance of the proposed algorithm, we not only use it to solve the traveling salesman problem but also provide an extensive analysis of the impact it may have on the quality of the end result. Our experimental results indicate that the proposed algorithm can significantly reduce the computation time of GA and GA-based algorithms while limiting the degradation of the quality of the end result to a very small percentage compared to traditional GA.


Introduction
In the area of combinatorial optimization research [1], the traveling salesman problem (TSP) [2] has been widely used as a yardstick by which the performance of a new algorithm is evaluated, for TSP is NP-complete [3]. As such, any efficient solution to the TSP can be applied to solve many real-world problems, such as transportation control [4], network management [5], and scheduling [6]. Assuming that $d(c_i, c_j)$ represents the distance between each pair of cities $c_i$ and $c_j$, the TSP asks for a solution, that is, a permutation $\langle c_{\pi(1)}, c_{\pi(2)}, \ldots, c_{\pi(n)} \rangle$ of the given $n$ cities, that minimizes

$$D = \left( \sum_{i=1}^{n-1} d\left(c_{\pi(i)}, c_{\pi(i+1)}\right) \right) + d\left(c_{\pi(n)}, c_{\pi(1)}\right). \quad (1)$$

In short, (1) gives the distance of the tour that starts at city $c_{\pi(1)}$, visits each city in sequence, and then returns directly to $c_{\pi(1)}$ from the last city $c_{\pi(n)}$. Since the brute force method is impractical for the TSP except when the number of cities is small, the research direction for the TSP has been using heuristic search methods [7][8][9] to find a near-optimal solution.
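As a small illustration of the tour distance in (1), the following sketch sums the consecutive legs of a tour and then adds the return leg. The function names, the coordinate table, and the Euclidean distance are assumptions made for the example only; they are not taken from the paper.

```python
import math

def tour_length(tour, dist):
    """Length of the closed tour: consecutive legs plus the return leg."""
    n = len(tour)
    total = sum(dist(tour[i], tour[i + 1]) for i in range(n - 1))
    return total + dist(tour[-1], tour[0])

# Hypothetical data: four cities on the corners of a unit square.
coords = {1: (0, 0), 2: (1, 0), 3: (1, 1), 4: (0, 1)}

def euclid(a, b):
    """Euclidean distance between two cities of the hypothetical instance."""
    return math.dist(coords[a], coords[b])

print(tour_length([1, 2, 3, 4], euclid))  # walks around the square
print(tour_length([1, 3, 2, 4], euclid))  # uses both diagonals
```

The first permutation walks the perimeter, while the second uses both diagonals and is therefore longer; a TSP solver searches over such permutations for the shortest one.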
The Scientific World Journal

Since the 1950s, heuristic algorithms have been developed for finding an approximate solution to the TSP and other complex optimization problems in a reasonable time [10]. Among the most widely used heuristic algorithms are evolutionary algorithms, swarm intelligence, and many others [11][12][13][14][15][16]. These algorithms have had a strong impact on modern computer science research because they help researchers solve problems in a variety of domains for which solutions in their full generality cannot be found in a reasonable time, even with the world's fastest computers. For reasons such as being inherently parallel, being a global search heuristic, and being easy to implement, GA [17,18] has become one of the most popular heuristic algorithms. Moreover, Holland's schema theorem [17], which says that "short, low-order, above-average schemata receive exponentially increasing trials in subsequent generations of a GA," and Goldberg's building block hypothesis [18], which says that "a GA seeks near-optimal performance through the juxtaposition of short, low-order, high-performance schemata, called the building blocks," tell us that good subsolutions (or partial solutions) of a GA have a high probability of surviving the evolution and ending up as part of the final solution. This is further supported by Glover's proximate optimality principle (POP) [19], which says that "good solutions at one level are likely to be found close to good solutions at an adjacent level," that is, good solutions have similar structures. The crucial observation here is that good subsolutions of a simple GA (or simply GA; since no confusion is possible, we will use GA to represent the simple or traditional GA throughout this paper) become more and more similar to each other during the evolution process. This, in turn, implies that many of the computations on good subsolutions at the later generations of a GA are essentially redundant.
The question is how to eliminate these redundant computations, beginning at the early generations of a GA, so that the computation time can be significantly reduced while the quality of the end result is retained or even enhanced.
To make the idea more concrete, a simple example is given in Figure 1 to demonstrate how it works. As Figure 1 shows, let us suppose that there are two chromosomes, $C_1$ and $C_2$, each of which is composed of $\ell$ genes. Let us further suppose that $\ell = 4$ and that each gene can take only two possible values, namely, 0 and 1. Now let us assume that, at a certain point in the evolution process, the value of $C_1$ is 0-0-1-0 and the value of $C_2$ is 1-1-1-0, where the hyphen is used to separate the genes. Then what would be the values of $C_1$ and $C_2$ in the later generations? There are two answers to this question, depending on how the mutation operator is treated. The first answer is that if we use one-point crossover and disregard the mutation operator altogether, then we are guaranteed that the values of the third and fourth genes of $C_1$ and $C_2$ will remain intact in the evolution process of a GA and will thus show up as part of the final solution. In other words, if the third and fourth genes of $C_1$ and $C_2$ (i.e., the genes common to $C_1$ and $C_2$) are saved away, the number of genes will be cut in half, and the computation time required by the crossover and mutation operators and by the evaluation of the fitness function will be reduced. The second answer is that if we take the mutation operator into account, then the values of the third and fourth genes of $C_1$ and $C_2$ have a small chance of not being 1-0. The probability that these values change is, however, very small because only the mutation operator can change them, and for GA the mutation rate has almost always been set to a very small value, say, 1 or 2 percent.
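The first answer can be checked with a few lines of code. The sketch below, whose function names and seed are chosen only for illustration, applies one-point crossover repeatedly to the two chromosomes of the example, with mutation disabled, and verifies that the genes common to both never change.

```python
import random

def one_point_crossover(a, b, cut):
    """Swap the tails of two equal-length chromosomes after position `cut`."""
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def common_positions(a, b):
    """Indices at which both chromosomes carry the same value."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x == y]

chrom1 = [0, 0, 1, 0]
chrom2 = [1, 1, 1, 0]
shared = common_positions(chrom1, chrom2)   # third and fourth genes
original = {i: chrom1[i] for i in shared}

rng = random.Random(0)  # arbitrary seed, used only for reproducibility
for _ in range(100):
    cut = rng.randint(1, len(chrom1) - 1)
    chrom1, chrom2 = one_point_crossover(chrom1, chrom2, cut)
    # Without mutation, crossover can only exchange equal values at shared positions.
    assert all(chrom1[i] == chrom2[i] == original[i] for i in shared)

print(shared, chrom1, chrom2)
```

Because both parents hold the same value at a shared position, whichever parent contributes that position to an offspring contributes the same value, which is why only mutation can disturb it.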
The remainder of the paper is organized as follows. Section 2 gives a brief introduction to the genetic algorithm and the approaches taken to enhance its performance. Section 3 provides a detailed description of the proposed algorithm and a simple example to demonstrate how the proposed algorithm works. Performance evaluation of the proposed algorithm is presented in Section 4. Analysis of the proposed algorithm is given in Section 5. Conclusion is drawn in Section 6.

Figure 1: A simple example illustrating the difference between GA and PREGA. Note that genes common to chromosomes $C_1$ and $C_2$ are saved away by PREGA at generation $t = 1$ but not by TGA.

Related Work
It is well known that GA, a particular class of evolutionary algorithms, is a search technique aimed at finding true or approximate solutions to optimization problems. The operations used to emulate the evolution process of a GA are selection, crossover, and mutation. The simple or traditional GA [18] can be outlined as given in Algorithm 1. The selection operator is responsible for guiding the search of GA toward high-quality or even optimal solutions. The crossover operator exchanges information between the individuals in the population, while the mutation operator is used to keep GA from falling into local optima.
Research on genetic algorithms focuses not only on improving the quality of the end result but also on reducing the computation time of GA. Among the approaches taken are the parallel genetic algorithm, the hybrid genetic algorithm, and radical modification of the evolutionary procedure or the design of GA.
(1) Parallel genetic algorithm (PGA) [20,21] is a very important technique for reducing the computation time of large problems, such as TSP [22]. The three distribution models [23] that have been proposed are master-slave model, fine-grained model (cellular model), and coarse-grained model (island model). However, in [24,25], the authors indicate that the migration rate and strategy of the island model may affect its performance.
(2) Hybrid genetic algorithm (HGA) [26] refers to the process of combining GA with other effective approaches for finding a better solution in terms of either the quality or the computation time. In general, the design of HGA may either integrate other heuristic algorithms [27] or combine local search methods [28] with GA. For instance, for an HGA that is a combination of GA and a local search method, GA is responsible for finding the global minima or pointing out the particular direction that may lead to a better solution while the local search method is used to find the local minima. For this reason, HGA will enhance the quality of the end result.
GA1. Randomly generate an initial population of chromosomes.
GA2. Use the fitness function to select the fitter chromosomes.
GA3. Apply the crossover and mutation operators in order.
GA4. If a stopping criterion is satisfied, then stop and output the best chromosome.
GA5. Go to step 2.
Algorithm 1: Outline of simple genetic algorithm.
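The outline in steps GA1 to GA5 can be sketched as a short program. The version below is only a minimal illustration on bit-string chromosomes; the tournament selection of size 3, the one-point crossover, the bit-flip mutation, the fixed generation budget used as the stopping criterion, and all default parameter values are assumptions made for the sketch rather than details fixed by the outline.

```python
import random

def simple_ga(fitness, length, pop_size=20, generations=50,
              crossover_rate=0.5, mutation_rate=0.01, seed=0):
    """Minimal bit-string GA that maximizes `fitness` (illustrative sketch)."""
    rng = random.Random(seed)
    # GA1: random initial population.
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):  # GA4/GA5: stop after a fixed number of generations
        # GA2: tournament selection (size 3); copies avoid aliasing between slots.
        pop = [max(rng.sample(pop, 3), key=fitness)[:] for _ in range(pop_size)]
        # GA3, first half: one-point crossover on consecutive pairs.
        for i in range(0, pop_size - 1, 2):
            if rng.random() < crossover_rate:
                cut = rng.randint(1, length - 1)
                pop[i][cut:], pop[i + 1][cut:] = pop[i + 1][cut:], pop[i][cut:]
        # GA3, second half: per-gene bit-flip mutation.
        for chrom in pop:
            for g in range(length):
                if rng.random() < mutation_rate:
                    chrom[g] ^= 1
        best = max(pop + [best], key=fitness)
    return best

# Usage: maximize the number of ones in a 16-bit chromosome.
print(simple_ga(sum, 16))
```

Every operator in the loop touches every gene of every chromosome in every generation, which is the cost that removing common genes is meant to cut.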
In research on fast HGA (FHGA), Misevicius [29] points out that the design of FHGA should satisfy the following principles: (1) FHGA should arrive at the solutions quickly; (2) the populations should be compact to save computation time; and (3) the diversity of the populations has to be maintained to avoid falling into a local minimum at early generations in the evolution process.
(3) Another way to reduce the computation time is to radically change the evolutionary procedure or the design of GA. Michalski [30] presents a non-Darwinian-type evolution called learnable evolution model (LEM) that divides the whole population into two groups: high-performance group (H-group) and low-performance group (L-group). LEM first finds descriptions about why the H-group can obtain a better result and why the L-group may degrade the quality of the end result. Then, it uses the descriptions to generate chromosomes to replace those in L-group.
Michalski also points out that LEM can speed up evolution, in terms of the number of evolutionary steps, by a factor of two or more. Yet, when this kind of fast convergence method [30,31] is used, care must be taken with the convergence speed, or the algorithm may face the premature convergence problem. One possible solution to this problem is to use fitness sharing [32] to keep the diversity of the population from being cut down too early.
The improvements that the abovementioned methods can achieve are limited intrinsically by the operators of GA. For example, the crossover and other genetic operators may disrupt the high-quality subsolutions (building blocks, or BBs for short) that are found in the previous generations [33]. As a result, the convergence time of GA may increase [34]. Over the past two decades or so, various competent genetic algorithms (competent GAs) [33, 35, 36, 37] have been developed to tackle the linkage and scalability problems of GA. They can be broadly divided into two classes [36]. The first class, also referred to as the perturbation technology, is based on evolving the representation of solutions or adapting recombination operators among individual solutions. Among this class are the messy genetic algorithm, the fast messy genetic algorithm (fmGA) [33,38], and the ordering messy GA (OmeGA) [36]. The fmGA differs from simple GA in several aspects. (1) Each gene of the fmGA is represented by its value and locus. (2) The fmGA uses variable-length chromosomes to represent the population. (3) The fmGA attempts to find the building blocks by repeatedly performing selection of solutions and random deletion of genes [36]. (4) A so-called competitive template is required to fill in the missing genes of underspecified messy chromosomes so that the fitness values can be evaluated.

The Proposed Algorithm
In this section, we present a simple but efficient technique for eliminating the redundant computations of GA and GA-based algorithms based on the notion of pattern reduction. Algorithm 2 gives an outline of the pattern reduction enhanced genetic algorithm (PREGA). As Algorithm 2 shows, PREGA is built on the framework of GA; thus, it can be considered as an enhancement of GA with two operators: the common genes detection (CGD) and common genes compression (CGC) operators. If we disregard steps 3 and 4, PREGA as given in Algorithm 2 falls back to GA, as shown in Algorithm 1. The underlying idea of PREGA is to detect and compress genes common to all the chromosomes at the early generations of a GA so as to eliminate the redundant computations at the later generations in the evolution process.
In what follows, we will give a detailed description of the proposed algorithm.

Common Genes Detection (CGD).
The common genes detection operator of PREGA is responsible for detecting genes that are common to all the individuals in the population and thus are unlikely to be changed at later generations of the GA. Nevertheless, for different problems, the representation of chromosomes may have to be modified or even redesigned. From a different point of view, the example given in Figure 1 can be considered as a special case in that all the genes encode only two possible values, 0 and 1, and all the genes are uncorrelated. In some other situations, such as the traveling salesman problem, however, the value of each gene will certainly affect the other genes, and genes at the same position of all the chromosomes do not necessarily represent identical subsolutions. For the TSP, each chromosome can be used to encode a different tour, that is, a different permutation $\langle c_{\pi(1)}, c_{\pi(2)}, \ldots, c_{\pi(n)} \rangle$ of the given $n$ cities. In other words, for every chromosome and for all $i$ and $j$ with $1 \le i, j \le n$, $i \neq j$ implies $c_{\pi(i)} \neq c_{\pi(j)}$. Alternatively, each chromosome can be used to encode the edges of a tour, each of which corresponds to a road connecting a pair of cities.
PREGA1. Randomly generate an initial population of chromosomes.
PREGA2. Use the fitness function to select the fitter chromosomes.
PREGA3. Apply common genes detection (CGD) algorithm to find the common genes.
PREGA4. Apply common genes compression (CGC) algorithm to reserve the common genes.
PREGA5. Apply the crossover and mutation operators in order.
PREGA6. If a stopping criterion is satisfied, then stop and output the best chromosome.
PREGA7. Go to step 2.
Algorithm 2: Outline of pattern reduction enhanced genetic algorithm (PREGA).

In this paper, we use binary encoding for finding edges common to all the chromosomes. First, let us suppose that $(i, j)$ (here, we are assuming that node $i$, for all $i$, $1 \le i \le n$, in the graph representing the TSP, is labeled by city $c_i$; thus, we will use $i$ and $c_i$ interchangeably) is the edge connecting the pair of cities $c_i$ and $c_j$. Without loss of generality, let us further suppose that $i < j$. Otherwise, we can swap $i$ and $j$ or $c_i$ and $c_j$, since insofar as this paper is concerned, only the symmetric TSP is considered. Then, all the $k = n(n-1)/2$ edges $(i, j)$, $1 \le i < j \le n$, can be assigned unique numbers in the range of 0 to $k - 1$, which can be computed as

$$\mathrm{index}(i, j) = (i - 1)n - \frac{i(i - 1)}{2} + (j - i - 1).$$

To make the idea more concrete, (2) gives an example to show how all the $7(7-1)/2 = 21$ edges $(i, j)$, $1 \le i < j \le 7$, are assigned unique numbers in the range of 0 to 20, with rows $i = 1, \ldots, 6$ and columns $j = 2, \ldots, 7$:

$$\begin{array}{cccccc}
0 & 1 & 2 & 3 & 4 & 5 \\
  & 6 & 7 & 8 & 9 & 10 \\
  &   & 11 & 12 & 13 & 14 \\
  &   &    & 15 & 16 & 17 \\
  &   &    &    & 18 & 19 \\
  &   &    &    &    & 20
\end{array} \quad (2)$$

As the example shows, (1, 2) is assigned the number 0, (1, 3) the number 1, (1, 4) the number 2, and so on all the way up until (6, 7) is assigned the number 20. In other words, all the $n(n-1)/2$ edges can be mapped to a one-dimensional array with exactly $n(n-1)/2$ elements. Compared with a full $n \times n$ matrix, this saves a little more than half of the space or, more precisely, $n(n+1)/2$ entries. Now, to find edges common to all the chromosomes, we apply the common genes detection algorithm given in Algorithm 3.
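The edge numbering can be sketched in code as follows. The closed form used here is one row-by-row numbering of the upper triangle that agrees with the values quoted in the text ((1, 2) is 0, (1, 3) is 1, (1, 4) is 2, and (6, 7) is 20 for seven cities); the function name and the exact form of the expression are assumptions made for the illustration.

```python
def edge_index(i, j, n):
    """Map the undirected edge (i, j) between cities 1..n to a number in 0..n(n-1)/2 - 1."""
    if i > j:  # symmetric TSP: (i, j) and (j, i) denote the same edge
        i, j = j, i
    if not (1 <= i < j <= n):
        raise ValueError("need two distinct cities in the range 1..n")
    # Rows 1..i-1 of the upper triangle hold (i-1)(2n-i)/2 entries in total;
    # the offset inside row i is j - i - 1.
    return (i - 1) * (2 * n - i) // 2 + (j - i - 1)

n = 7
indices = [edge_index(i, j, n) for i in range(1, n) for j in range(i + 1, n + 1)]
print(indices)  # every number from 0 to 20 appears exactly once
```

Since the numbering is a bijection onto 0 to $n(n-1)/2 - 1$, a plain one-dimensional counter array indexed by it suffices for tallying edges.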
Obviously, as Algorithm 3 shows, steps 1 and 3 take $O(n^2)$ time, and step 2 takes $O(mn) = O(n)$ time, where $n$ is the number of cities and the population size $m$ is assumed to be a constant. Thus, both the time and space complexities of the CGD algorithm are $O(n^2)$. It is worth mentioning that the CGD algorithm described in Algorithm 3 can be made even more efficient if, while the last chromosome is being scanned in step 2, we keep track of the edges common to all the chromosomes in a stack or an array (of size no more than $n$); then step 3 can be eliminated altogether. Going one step further, the CGD algorithm can be made much more efficient and scalable than outlined in Algorithm 3 by using a more sophisticated data structure such as a balanced tree (the basic operations of which, such as member, insert, and delete, take $O(\log N)$ time, where $N$ is the number of nodes in the tree). Again, assuming that $m$ is a constant and that a balanced tree is used, the time complexity of the CGD algorithm can be cut from $O(n^2)$ down to $O(mn \log n) = O(n \log n)$ and the space complexity from $O(n^2)$ down to $O(mn) = O(n)$. As the number of generations increases, the number of cities left in the chromosomes decreases quickly. This implies that the CGD operator is in general much faster than specified by the above bounds, which will in turn enhance the performance of the CGC operator to be discussed next.
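The counting scheme of the CGD algorithm can be sketched as below. This is an illustrative reading of the three steps, not the authors' implementation: the helper names are hypothetical, the edge numbering is one row-by-row closed form consistent with the examples in the text, and the edge that closes the tour is assumed to be counted as well.

```python
def edge_index(i, j, n):
    """Row-by-row number of the undirected edge (i, j), 1 <= i != j <= n (assumed form)."""
    if i > j:
        i, j = j, i
    return (i - 1) * (2 * n - i) // 2 + (j - i - 1)

def tour_edges(tour):
    """Edges of a closed tour, including the edge back to the first city (assumption)."""
    return [(tour[k], tour[(k + 1) % len(tour)]) for k in range(len(tour))]

def detect_common_edges(population, n):
    """Return the edges that occur in every chromosome of the population."""
    m = len(population)
    counts = [0] * (n * (n - 1) // 2)      # step 1: zero the counter array
    for tour in population:                # step 2: tally every edge of every tour
        for i, j in tour_edges(tour):
            counts[edge_index(i, j, n)] += 1
    common = []                            # step 3: keep edges counted m times
    for i in range(1, n):
        for j in range(i + 1, n + 1):
            if counts[edge_index(i, j, n)] == m:
                common.append((i, j))
    return common

# Hypothetical population of two tours over six cities.
population = [[1, 2, 3, 4, 5, 6], [1, 2, 3, 5, 4, 6]]
print(detect_common_edges(population, 6))
```

The array has $n(n-1)/2$ entries, so steps 1 and 3 dominate with quadratic cost, whereas step 2 touches only $mn$ entries; this is the imbalance that the stack or balanced-tree variants described above remove.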

Common Genes Compression (CGC).
The common genes compression operator of PREGA is responsible for compressing and removing the common genes detected by CGD. As outlined in Algorithm 4, the CGC algorithm will first compress the common genes detected by CGD (by choosing a representative for, and saving away the information associated with, all or each segment of the common genes, depending on the application) and then remove the common genes compressed so that later generations of the GA will only see the chosen representatives. Fewer genes are thus used to represent the common genes, each of which represents a segment of the common genes. For instance, using TSP as an example and assuming that the common genes detected, $g_3$, $g_4$, and $g_5$, form a segment of the path, these genes and the information associated with them, such as the segment of the path they form as well as the length and direction of the segment, can be compressed, that is, represented by a single composite gene, say, $g_3$. Once this is done, GA will see only the gene $g_3$ at later generations during its convergence process. In other words, each detected segment of the path can be represented by a single composite gene, regardless of the number of cities of which the segment is composed. Moreover, all the composite genes can be compressed again in the same way as the other "noncomposite" genes. It is worth mentioning that we have to take into consideration the relationships between subsolutions to see whether they are dependent or independent before they are compressed. If they are independent, all the common genes can be compressed into a single gene. Otherwise, how they are compressed depends on the problem in question and the way the solutions are encoded.
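For the TSP case, segment-by-segment compression might look like the sketch below. It is only an illustration under several assumptions: the `CompositeGene` record and the function names are hypothetical, the smallest city number in a segment is taken as its representative so that every chromosome picks the same one, and runs that wrap around the end of the chromosome are not merged.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CompositeGene:
    representative: int  # city chosen to stand for the whole segment
    cities: tuple        # the segment, in the visiting order of this chromosome
    length: float        # total length of the edges inside the segment

def _close(run, dist):
    """Turn a finished run of cities into an ordinary gene or a composite gene."""
    if len(run) == 1:
        return run[0]
    length = sum(dist(a, b) for a, b in zip(run, run[1:]))
    return CompositeGene(min(run), tuple(run), length)

def compress_tour(tour, common_edges, dist):
    """Collapse each maximal run of common edges in `tour` into one composite gene."""
    common = {frozenset(e) for e in common_edges}
    genes, run = [], [tour[0]]
    for city in tour[1:]:
        if frozenset((run[-1], city)) in common:
            run.append(city)              # the edge is common: extend the segment
        else:
            genes.append(_close(run, dist))
            run = [city]
    genes.append(_close(run, dist))
    return genes

# Hypothetical instance: cities on a line, with common edges (3, 4) and (4, 5).
line_dist = lambda a, b: abs(a - b)
print(compress_tour([1, 2, 3, 4, 5, 6, 7], [(3, 4), (4, 5)], line_dist))
```

After compression the seven-gene chromosome has five genes, and the saved segment length can be reused by the fitness function instead of being recomputed in every generation.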

An Example.
CGD1. Initialize the values of all the array elements to 0.
CGD2. For each chromosome, we scan from left to right all the edges encoded within it, calculate the index for each edge scanned, and increment the value of the corresponding array element by one.
CGD3. The result array is scanned from left to right looking for all the elements whose values are equal to $m$, the population size. Edges corresponding to the indices of these array elements are common to all the chromosomes.
Algorithm 3: Outline of common genes detection algorithm.

CGC1. Compress the common genes detected by CGD by choosing a representative for, and saving away the information associated with, each segment of the common genes.
CGC2. Remove the common genes compressed in step 1 so that the later generations of the GA will only see the representatives chosen in step 1.
Algorithm 4: Outline of common genes compression algorithm.

In this section, we present a simple example to illustrate exactly how PREGA works for the TSP. As Figure 2 shows, the very first step of PREGA is exactly the same as that of GA, namely, to randomly generate a population of chromosomes. For the purpose of illustration, a population of two chromosomes is generated in this case, and each gene is randomly assigned a distinct city number. Then, the selection operator is applied to select the "good" chromosomes in terms of the fitness value of each chromosome. After that, the CGD and CGC operators, as described in Section 3, are applied for the detection and compression of the common genes.
As Figure 2 shows, PREGA differs from GA by adding the CGD and CGC operators as described in Section 3 to eliminate the redundant computations encountered by GA. By doing this, the performance of GA can be significantly enhanced. The example given in Figure 2 shows that the common genes indicated by 1, 2, and 3 are first detected by the CGD operator of PREGA and then compressed by the CGC operator of PREGA. In other words, after compression, we can choose any one of the three common genes 3, 4, and 5 as the representative to indicate the segment compressed. In this case, we choose 3. To avoid confusion, the composite gene is marked differently from the ordinary gene 3 in Figure 2. After that, the crossover and mutation operators as well as the evaluation of the fitness function will treat each compressed segment as a single pattern until the termination condition is met. Note that if the genes detected are consecutive, they will be compressed into a single gene. Otherwise, they will be compressed into as few genes as needed; that is, they will be compressed segment by segment.

Performance Evaluation
In this section, we evaluate the performance of the proposed algorithm by using it to solve the traveling salesman problem. The empirical analysis was conducted on an IBM X3400 machine with a 2.0 GHz Xeon CPU and 8 GB of memory using CentOS 5.0 running Linux 2.6.18. All the programs are written in C++ and compiled using g++ (GNU C++ compiler). The benchmarks for the TSP are shown in Table 1. Unless stated otherwise, all the simulations are carried out for 30 runs, with the population size fixed at 80, the crossover probability at 0.5, the per-gene mutation probability at 0.01, the number of generations at 100, and the tournament size at 3 (i.e., 1 out of 3). For all the simulations, pattern reduction (PR) is started at the second generation.
To improve the quality of the end results of GA, PR, and other evolutionary algorithms, we use several useful techniques to solve the TSP. The nearest-neighbor method [39] is used in creating the initial solution for all the algorithms involved in the simulation. The 2-opt mutation operator [40] is employed as the local search method for fine-tuning the quality of the end results. Unless stated otherwise, all the simulations use HX as the crossover operator by default.
To simplify the discussion of the simulation results of TSP in Tables 2, 3, and 4, we will use the following conventions. Let TGA (traditional GA) [41], HeSEA (heterogeneous selection evolutionary algorithm) [42], SA (simulated annealing) [10], UMDA (univariate marginal distribution algorithm) [43], EHBSA (edge histogram based sampling) [44], ACS (ant colony system) [45], DPSO (discrete particle swarm optimization) [46], and PREGA denote the algorithms involved in the simulation. Let $\beta \in \{D, T\}$ denote either the traveling distance ($\beta = D$) or the computation time ($\beta = T$). Let $\Delta_\beta$ denote the enhancement of $\beta_\phi$ with respect to $\beta_\psi$ in percentage. $\Delta_\beta$ is defined as follows:

$$\Delta_\beta = \frac{\beta_\phi - \beta_\psi}{\beta_\psi} \times 100\%, \quad (3)$$

where $\beta$ is either $D$ or $T$ for the TSP, and the subscripts $\phi$ and $\psi$ denote the two algorithms being compared.
Note that for $\beta \in \{D, T\}$, the more negative the value of $\Delta_\beta$, the greater the enhancement.
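As a worked illustration of this measure, the sketch below assumes that the enhancement is the relative change of the new value with respect to the baseline, expressed in percent, which is consistent with the remark that more negative values mean greater enhancement. The function names and the sample numbers are hypothetical.

```python
def enhancement(new_value, baseline):
    """Relative change of `new_value` with respect to `baseline`, in percent."""
    return (new_value - baseline) / baseline * 100.0

def enhancement_from_speedup(factor):
    """Percentage change in time when the time is reduced by the given factor."""
    return enhancement(1.0 / factor, 1.0)

# Hypothetical timings: 200 s for the baseline, 25 s with pattern reduction.
print(enhancement(25.0, 200.0))               # negative: the time went down
print(round(enhancement_from_speedup(8), 2))  # the same saving, stated as a factor of 8
```

Under this reading, a reduction "by a factor of $f$" and a percentage enhancement are two views of the same saving, related by $\Delta = (1/f - 1) \times 100\%$.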

Impact of Different Removal Strategies.
To better understand the impact of the removal bound on the performance of PREGA, we tested several removal bounds, from 0% to 100% with an increment of 10%. A bound of 100% means that PREGA may remove all the genes of the chromosomes in the convergence process, whereas 0% means that no genes will be removed, and thus PREGA falls back to GA. More precisely, to simplify the implementation, after step 2 but before step 3 as shown in Algorithm 2, we check to see if the removal bound is exceeded. If it is exceeded, then steps 3 and 4 will be bypassed. Otherwise, all the common genes detected at step 3 will be removed at step 4 even if doing so exceeds the removal bound. In other words, we may end up removing a few more genes than the removal bound says.

Table 2 notes: $T$: time in seconds; $C_v$: coefficient of variation, defined as $C_v = \sigma/\mu$, the ratio of the standard deviation to the mean of the reported quantity.
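The bound check described above can be sketched as follows. The function and its inputs are hypothetical, and one interpretation is assumed: steps 3 and 4 run only while the fraction of genes already removed is strictly below the bound, which makes a bound of 0% behave exactly like plain GA.

```python
def simulate_removal(total_genes, bound_percent, detected_per_generation):
    """Count genes removed over several generations under a removal bound."""
    removed = 0
    for detected in detected_per_generation:
        # The check happens before detection, so it only sees past removals.
        if removed * 100 < bound_percent * total_genes:
            # All detected common genes are removed, even past the bound.
            removed += min(detected, total_genes - removed)
        # Otherwise steps 3 and 4 are bypassed for this generation.
    return removed

# Hypothetical run: 100 genes, and 20, 15, then 10 common genes detected.
print(simulate_removal(100, 30, [20, 15, 10]))
print(simulate_removal(100, 0, [20, 15, 10]))
```

In the first run the second generation pushes the total past the 30% bound before the check can stop it, which is the overshoot the text refers to.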
The experimental results showed that setting the removal bound to 0% (GA) or 100% is better than the others. Although setting the removal bound to 10%, 20%, and up to 90% can also reduce the computation time, setting the removal bound to 100% seems to give a good balance between the computation time and the quality of the end results. It shows that PREGA using 100% removal bound can obtain the best results compared to the other removal bound settings, that is, 10%, 20%, and up to 90%.
A particularly interesting result is that the end result of PREGA using the 100% removal bound is better than the others. This result shows that the quality of the end results of PREGA is not linearly proportional to the removal bound. The main reason for this phenomenon is that the local search has to be split into two parts: one for the common genes and the other for the noncommon genes. This is required because the common genes have been compressed and thus cannot be mixed up with the noncommon genes. Otherwise, the common genes would become noncommon genes. This situation eventually affects the ability of the local search methods. In other words, with the 100% and 0% removal bounds, the search ability of the local search methods is maximized because all of the genes are either common or noncommon. In the case of 10%, 20%, and up to 90%, however, all the chromosomes are composed of two parts, thus limiting the local search methods to finding better subsolutions in a smaller search space instead of the whole search space. This will degrade the quality of the end results, causing the quality of the end results of PREGA to be not linearly proportional to the removal bound.

Impact of Different Kind of Crossover Operators.
There are several different crossover operators [48,49] for the TSP, such as PMX, OX, ERX, and HX. PMX is the most popular and simplest crossover operator, but it lacks a searching direction. More recently, many researchers have focused their attention on finding and keeping the building blocks to enhance the performance of GA by either modifying or replacing the operators of GA. In [50], Ruiz et al. designed new crossover operators to identify and maintain the building blocks. In this paper, we use the PMX, OX, HX, and ERX operators to examine the search ability of PREGA when different crossover methods are used. In addition, we have also tested the crossover operators SBOX, SJOX, SB2OX, and SJ2OX [50] to better understand the performance of PREGA with other efficient crossover operators that are designed to avoid disrupting the building blocks during the convergence process. Note that, for the TSP in this paper, we use the 2-opt mutation method for reversing two segments (the size of which must be the same) of a tour encoded in a chromosome. For each segment, the edges to the left and right of that segment will be replaced by two new edges (if we consider a chromosome as a ring, then the last gene is next to the first gene and vice versa, and thus there is always a gene to the left and to the right of a segment).

As Table 2 shows, for the TSP, PREGA can effectively reduce the computation time by 80% up to 93.8% using PMX, by 81.82% up to 96.43% using OX, by 87.50% up to 95.91% using HX, by 85.71% up to 95.93% using ERX, by 92.10% up to 94.78% using SBOX, by 91.24% up to 94.57% using SJOX, by 90.78% up to 94.67% using SB2OX, and by 90.17% up to 94.54% using SJ2OX compared to those of traditional GA and GA-based algorithms alone. The simulation results further show that PREGA not only preserves the accuracy rate of the end results but can even give solutions that are better than those found by the traditional GA and GA-based algorithms alone.
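To show why only the two edges bordering a reversed segment change, the following sketch implements the standard 2-opt move on a single segment. It is the textbook form of the move, which may differ in detail from the variant used in the experiments; the function names and the unit-square instance are assumptions made for the example, and a symmetric distance function is assumed.

```python
import math

def two_opt_move(tour, i, j):
    """Reverse tour[i..j] (inclusive); requires 0 <= i < j < len(tour)."""
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]

def two_opt_delta(tour, i, j, dist):
    """Change in tour length caused by the move; the segment must not be the whole ring."""
    n = len(tour)
    a, b = tour[i - 1], tour[i]        # edge to the left of the segment (ring view)
    c, d = tour[j], tour[(j + 1) % n]  # edge to the right of the segment
    return dist(a, c) + dist(b, d) - dist(a, b) - dist(c, d)

def tour_length(tour, dist):
    """Length of the closed tour, used here only to cross-check the delta."""
    return sum(dist(tour[k], tour[(k + 1) % len(tour)]) for k in range(len(tour)))

# Hypothetical instance: four cities on the corners of a unit square.
coords = {1: (0, 0), 2: (1, 0), 3: (1, 1), 4: (0, 1)}
dist = lambda a, b: math.dist(coords[a], coords[b])

tour = [1, 3, 2, 4]                    # uses both diagonals
better = two_opt_move(tour, 1, 2)      # reversing the middle segment uncrosses the tour
print(better, two_opt_delta(tour, 1, 2, dist))
```

Because the change in length depends on four distances only, a local search can evaluate a candidate move in constant time without recomputing the whole tour.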
The amount of time that can be saved and the extent to which the end results can be improved depend, to a large extent, on the size of the problem. Our simulation results indicate that the larger the problem, the better the performance of the proposed algorithm. Table 2 also shows that PREGA can even improve the performance of most of the crossover operators, including a crossover operator as complex as HX. This can be easily justified by the following observation. The more complex the crossover operator, the more computation time is required per gene. If the chromosome length or the number of genes can be reduced, it will in turn save the overall computation time. The results in Table 2 show that PREGA is robust even when combined with other efficient crossover operators (e.g., SBOX and SJOX) that use a different method to perform the crossover. Our experimental results also showed that if the original GA or GA-based algorithms do not give a solution that is close to the optimal, PREGA will help arrive at a better solution. For example, for the benchmark u2152 using the HX crossover operator, the final result is 73,339.08, which is worse than those using the other crossover operators. PREGA(HX) can, however, save most of the computation time and even improve the quality of the end result by about 2.46%, compared to the others.

Table notes: the subscript 30 denotes the best solution in 30 runs; $C_v$ is the coefficient of variation as defined in Table 2.

Comparison with Evolutionary-Based Algorithms.
Finally, for completeness, we compare the performance of traditional GA [41], HeSEA [42], LEM [30], SA [10], UMDA [43], EHBSA [44], ACS [45], and DPSO [46] by applying PR to all of them. Tables 3 and 4 show that PR can not only vastly reduce the computation time of these algorithms, especially for very large data sets, but also greatly reduce the computation time of evolutionary algorithms for which each iteration takes a great deal of computation time. Note that the cunning length of EHBSA is 1/3. The inertial weight of DPSO is 0.5, and the two random numbers for determining the influence of the personal best and the global best are, respectively, 0.3 and 0.7. For ACS, the settings are based on those specified in [45]. That is, the population size is 25; the importance of exploitation versus exploration $q_0$ is 0.9; the importance of pheromone $\beta$ is 2.0; the pheromone decay parameter $\rho$ is 0.1; and the number of generations is 320.

The results in Table 3 show that because HeSEA takes more computation time than GA per generation, the computation time saved for HeSEA is more than for GA. For instance, the simulation results of the largest benchmark usa13509 show that using GA, the computation time is reduced by a factor of 12.77, whereas using HeSEA, the computation time is reduced by a factor of 22.07. In addition, the results of SA and PRESA highlight a different concept of removing redundant patterns. Because SA is a single-solution-based iterative algorithm, the procedures CGD and CGC have to be modified accordingly. A very simple approach is to remove patterns that are not changed for, say, 1,000 iterations in succession. Furthermore, the simulations of SA and PRESA are carried out for 30 runs, with the initial temperature 1.0 and the change probability $P(\Delta E) = \exp(-\Delta E / (kT))$, where $T$ is the temperature and $k$ is Boltzmann's constant [10].
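The change probability used for SA can be illustrated with the short sketch below. The rule of always accepting improving moves is the usual Metropolis convention and is an assumption here, as are the function name and the default $k = 1$, which folds the constant into the temperature as is common in optimization codes.

```python
import math

def acceptance_probability(delta_e, temperature, k=1.0):
    """Probability of accepting a move that changes the cost by `delta_e`."""
    if delta_e <= 0:
        return 1.0  # improving (or neutral) moves are always accepted
    return math.exp(-delta_e / (k * temperature))

# At the initial temperature of 1.0, worse moves are accepted fairly often;
# the same move becomes unlikely once the temperature has dropped.
for temperature in (1.0, 0.5, 0.1):
    print(temperature, round(acceptance_probability(0.5, temperature), 4))
```

As the temperature falls, fewer parts of the solution keep changing, which is what makes a frequency-based rule such as "unchanged for 1,000 iterations" a workable stand-in for population-wide commonality.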
The results in Table 3 show that the more solutions are used in an iteration (i.e., the larger the population size of the population-based approach), the better the end results are and the longer the computation time is. The results in Table 3 further show that the pattern reduction method can be applied not only to population-based but also to single-solution-based algorithms, where the former find the common subsolutions to be removed by spatial distribution while the latter find the common subsolutions to be removed by frequency.
The results in Table 4 show that the proposed algorithm can not only reduce a great deal of the computation time of other efficient evolutionary algorithms such as UMDA and EHBSA but also reduce the computation time of swarm intelligence algorithms such as ACS and DPSO while limiting the loss of the quality of the end result. In other words, the results show that PR can cut down the computation time of evolutionary algorithms that are themselves either faster or able to provide better results than GA. For instance, even though UMDA, EHBSA, and DPSO are faster than GA by about 44.45%, 29.94%, and 56.35%, respectively, for usa13509, PR can further reduce the computation time of UMDA by 58.61% up to 81.97%, the computation time of EHBSA by 67.46% up to 79.02%, and the computation time of DPSO by 75.00% up to 91.16%. The experimental results show that the proposed algorithm can be used to speed up all the abovementioned efficient algorithms.

Diversity Analysis.
Two of the most important issues in using the pattern reduction method for enhancing the performance of GA or GA-based algorithms are how to ensure the pattern reduction method can effectively reduce the computation time and how to maintain the diversity of the population, that is, the quality of the end results. In this paper, we will discuss the impact of the pattern reduction method on the performance of GA or GA-based algorithms based on three different measures: (1) the average number of genes compressed, (2) the average quality of the end results, and (3) the average size of the search space. In other words, these measures provide an indication of the search ability and the speed of convergence of PREGA.
The search space or diversity of solutions can help us understand whether or not an algorithm is capable of avoiding falling into a local minimum at early generations in the evolution process. In this paper, we use the outdegree of cities as shown in Figure 3 to indicate the search ability of an algorithm. In other words, the higher the outdegree of a city, the higher the search ability. Now, by assuming that the cities next to each other are represented as an adjacency matrix as given in Figure 3(b), the average size of the search space at generation $t$, denoted by $\bar{S}^{t}$, is defined as
$$\bar{S}^{t} = \frac{1}{2n} \sum_{i=1}^{n} \sum_{j=1}^{n} e_{i,j}, \qquad (4)$$
where $n$ is the number of genes (cities) left in each chromosome and $e_{i,j} = 1$ if there exists an edge between cities $i$ and $j$; otherwise, $e_{i,j} = 0$. That is, (4) represents the average number of outgoing paths of all the cities currently encoded in all the chromosomes (i.e., not removed). For instance, as Figure 3 shows, sixteen edges exist in all the chromosomes encoding six cities of TSP. The average size of the search space can be computed as $(1/(2 \times 6)) \times 16 = 1.33$. This number can help us measure the diversity of the search space of a genetic algorithm at a particular generation.
Figure 4 compares the performance of GA and PREGA for solving the TSP using the simulation result of the benchmark pr1002 as an example. Figure 4(a) indicates that PREGA can find and remove more common edges than GA, and Figure 4(b) shows that PREGA can find better solutions than GA before generation 542. Figure 4(c) shows that PREGA can maintain more diversity than GA in the early generations of its evolution process. These results convey a very important message: PREGA finds higher quality results with higher diversity (a larger search space) at the early generations of the convergence process.
At the later generations of PREGA, the diversity will become small because it is converging to a stable solution or the global optimum, but our simulation results show that even in this case, PREGA can still reduce most of the redundant computations.
For instance, Figure 4(c) shows that at about generation 39, the curves of the average diversities of GA and PREGA cross over. That is, the search diversity of PREGA becomes smaller than that of GA at about generation 39, and the gap between the two methods widens as the number of generations increases. It may thus seem that the search ability of PREGA becomes worse than that of GA. However, the result of Figure 4(b) shows that the search ability of PREGA does not actually decrease between generations 39 and 542. More precisely, in terms of the distance, PREGA finds the solution 277,508.5 at generation 133, even though it is unable to arrive at a better solution afterwards. GA, however, requires about 542 generations to arrive at the same solution 277,508.5 as PREGA. In addition, the final result found by GA is 277,079.43 at generation 914. After that, GA has a very small probability of finding a better solution because the search diversity tends to 1 at generation 916. Now, the most important question is whether, if most of the genes are compressed by PREGA at generation 133 (Figure 4(a)) or later, this will prevent PREGA from finding better solutions at later generations. According to our simulation results, if either the population size or the problem size is increased, then not all the genes will be removed at the early generations, so the problem will not exist, and PREGA will still outperform GA.
[Figure 4]
Another measure is how closely the optimal solution can be reached by an algorithm. For instance, let us suppose that the path $\Psi = \{\Psi_1, \Psi_2, \ldots, \Psi_n\}$, where $\Psi_i$ is the optimal subsolution of an optimal solution for TSP. Now, by assuming that each chromosome is represented as a ring and letting $j = (i + 1) \bmod n$, the rate of edges that are optimal in the best chromosome at generation $t$, denoted by $R^{t}$, is defined as
$$R^{t} = \frac{1}{n} \sum_{i=0}^{n-1} o_{i,j},$$
where $n$ is the number of genes (cities), and $o_{i,j} = 1$ if there exists an edge that is the optimal subsolution between the pair of genes $i$ and $j$; otherwise, $o_{i,j} = 0$.
Figure 4(d) shows the probability of edges that are optimal and may end up being in the final solution using GA and PREGA. As indicated in Figure 4(d), PREGA has a higher probability of finding the optimal subsolutions than GA. Figure 4(d) also indicates that even though the average diversity of GA and PREGA crosses over at about generation 513, the final results of GA and PREGA are very similar. More precisely, the difference is about 0.86 optimal edges for a problem of size 1,002. In summary, the downside of PREGA is that it may quickly converge to a suboptimal solution, but the upside is that the quality of the end result is very close to that of GA. For both GA and PREGA, the number of generations required for the diversity to converge to 1 is in general unpredictable. Using the benchmark pr1002 as an example, if the number of generations performed is 100, PREGA not only reduces the computation time by about 94.84% but also enhances the quality of the end result by about 5.08%. However, the average diversities of GA and PREGA at generation 100 are both greater than 1, which indicates that if we let them run longer, they may be able to find a solution that is better than the current one. More precisely, as Figure 4 shows, the computation time can be reduced by a factor of 119.23 compared to GA, while the quality of the end results remains very close. In other words, for a large problem, the number of generations required by GA to converge to even a suboptimal solution could be large and is unpredictable. On the other hand, PREGA can quickly provide a solution the quality of which is very close to that of GA even if the size of the problem is large.
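The two measures used in this subsection, the average size of the search space in (4) and the rate of optimal edges in the best chromosome, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes that the adjacency matrix is the union of the undirected tour edges of all chromosomes (so that each edge contributes two entries), and the function names are hypothetical.

```python
from typing import List


def average_search_space(population: List[List[int]]) -> float:
    """Sum of the adjacency-matrix entries divided by 2n, as in (4).

    An entry is 1 if the two cities are adjacent in at least one chromosome
    (assumption: edges are undirected, so the matrix is symmetric)."""
    cities = sorted(set(population[0]))
    index = {city: k for k, city in enumerate(cities)}
    n = len(cities)
    adjacency = [[0] * n for _ in range(n)]
    for tour in population:
        for k in range(len(tour)):
            a = index[tour[k]]
            b = index[tour[(k + 1) % len(tour)]]  # chromosome treated as a ring
            adjacency[a][b] = 1
            adjacency[b][a] = 1
    return sum(map(sum, adjacency)) / (2 * n)


def optimal_edge_rate(best: List[int], optimal: List[int]) -> float:
    """Fraction of edges of the best chromosome that also appear in a known
    optimal tour, with both tours treated as rings."""
    n = len(optimal)
    optimal_edges = {
        frozenset((optimal[i], optimal[(i + 1) % n])) for i in range(n)
    }
    hits = sum(
        frozenset((best[i], best[(i + 1) % n])) in optimal_edges
        for i in range(n)
    )
    return hits / n


if __name__ == "__main__":
    # Six cities and eight distinct undirected edges give sixteen matrix entries.
    population = [[0, 1, 2, 3, 4, 5], [0, 1, 2, 3, 5, 4]]
    print(round(average_search_space(population), 2))
    print(optimal_edge_rate([0, 1, 2, 3, 5, 4], [0, 1, 2, 3, 4, 5]))
```

Under this reading, a fully converged population in which every chromosome encodes the same tour yields a value of exactly 1, which matches the statement that the search diversity tends to 1 at convergence.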

Time Complexity of PREGA.
The time complexity of the genetic algorithm is a very important issue and has attracted the attention of many researchers [51][52][53]. In [51], Ambati et al. used information exchange probability, reproduction time, and fitness computation time to estimate the time complexity of GA. Based on these results, Ambati et al. presented a GA-based algorithm for solving the TSP, the expected running time of which is $O(n \log n)$, where $n$ is the number of cities. This is due to the fact that their simulations indicate that "good" solutions can be obtained by GA in $O(\log n)$ generations, even if the size of the TSP is large. In another study [53], Tseng and Yang showed that the time complexity of GA is (ℓ 2 ) for data clustering problem, where ℓ is the number of generations, the population size, and the number of patterns.
In this paper, we assume that the time complexity of the traditional genetic algorithm is $O(nm\ell)$, where $n$ is the number of genes, $m$ the number of chromosomes, and $\ell$ the number of generations. This can be justified by the following analysis of the time complexity of the fitness function and the selection, crossover, and mutation operators used by the traditional genetic algorithm, provided that certain conditions are met. For instance, suppose that tournament selection is used as the selection operator and that its size is $s$ (a constant that is far less than $m$). Let us further suppose that one-point crossover with probability $p_c$ and one-point mutation in which each gene is mutated with probability $p_m$ are used, where $p_c$ and $p_m$ are less than 1. The selection operator takes $sm$ time at each iteration because GA needs to randomly select $s$ chromosomes from a set of $m$ chromosomes to find the best one and performs this procedure $m$ times. The one-point crossover, the mutation operator, and the fitness function each take at most on the order of $nm$ time. The overall complexity of the traditional genetic algorithm is thus $O(nm\ell)$, where $n$ and $m$ are as defined above and $\ell$ is the number of generations required to converge, provided that no operator takes more than time linear in $n$ or $m$. (The parameters $s$, $p_c$, and $p_m$ are constants chosen before a simulation is carried out; all the simulation results given in Section 4 use $s = 3$ ($\ll m = 80$), $p_c = 0.5$ ($< 1$), and $p_m = 0.01$ ($\ll 1$).) Otherwise, the time complexity could be $O(n^2 m\ell)$ or $O(nm^2\ell)$.
In the ideal case, the pattern reduction algorithm can reduce the time complexity of GA from $O(nm\ell)$ to $O(nm)$. This can be easily proved by letting $\Delta$ ($0 < \Delta < 1$) be a constant indicating the percentage of patterns retained at each iteration. In other words, $1 - \Delta$ is the percentage of genes removed at each iteration in all chromosomes. Then,
$$\sum_{i=0}^{\ell-1} \Delta^{i} nm = nm \cdot \frac{1 - \Delta^{\ell}}{1 - \Delta} < \frac{nm}{1 - \Delta} = O(nm).$$
In summary, the time complexity of PREGA is bounded from above by $O(nm\ell)$ and from below by $O(nm)$.
In the best case, when the PREGA algorithm is started at the very first iteration and the removal bound is set to 100%, the time complexity will be $O(nm)$. In the worst case, if PREGA cannot detect any common genes to be removed, then PREGA will fall back to GA, and the time complexity will be $O(nm\ell)$. In other words, the time complexity of PREGA depends on (1) the iteration at which PREGA starts, (2) the number of patterns removed at each iteration, and (3) the removal bound, which specifies the maximum percentage of the detected genes that can be removed, though in practice slightly more genes than the removal bound allows can be removed to simplify the implementation (more details can be found in Section 4.1). Our simulation results showed that PREGA can reduce the computation time of GA by about 80% to 95.32% when the removal bound is set to 100% for complex data sets. The results further showed that if the number of generations of GA is set to an even larger value, the time complexity of GA can be reduced to approach that of the ideal case, that is, $O(nm)$.
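The ideal-case argument can also be checked numerically. The following is a minimal sketch under the same idealized assumption as the analysis above, namely that a constant fraction Δ of the genes is retained at each iteration and that one generation costs the number of remaining genes times the number of chromosomes. The function names and the parameter values in the usage example are hypothetical and are not taken from the experiments.

```python
def ga_cost(n: int, m: int, generations: int) -> float:
    """Work of plain GA: every generation processes all n genes of m chromosomes."""
    return float(n * m * generations)


def pr_cost(n: int, m: int, generations: int, retained: float) -> float:
    """Work with pattern reduction in the idealized model: only a fraction
    `retained` (the constant Delta) of the genes survives each iteration."""
    total = 0.0
    genes = float(n)
    for _ in range(generations):
        total += genes * m
        genes *= retained
    return total


def pr_cost_bound(n: int, m: int, retained: float) -> float:
    """Limit of the geometric series, valid for 0 < retained < 1; it does not
    depend on the number of generations."""
    return n * m / (1.0 - retained)


if __name__ == "__main__":
    n, m, generations, retained = 1000, 80, 500, 0.9
    print(ga_cost(n, m, generations))
    print(pr_cost(n, m, generations, retained))
    print(pr_cost_bound(n, m, retained))
```

With these illustrative values the work of plain GA grows linearly with the number of generations, whereas the work under pattern reduction stays within a constant multiple, 1/(1 − Δ), of the cost of a single generation. Setting `retained` to 1.0 in `pr_cost` reproduces the worst case in which nothing is removed and PREGA falls back to GA.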

Conclusion
This paper presents a novel technique for reducing the computation time of GA or GA-based algorithms based on the notion of pattern reduction. To evaluate the performance of the proposed algorithm, we use it to solve the traveling salesman problem, the benchmarks of which range in size from 130 to 13,509 cities. All our simulation results showed that the proposed algorithm can effectively cut down the computation time of GA and its variants, especially in cases where the data sets are large. Our simulation results further showed that the proposed algorithm can significantly reduce the computation time of the state-of-the-art heuristic algorithms we compared in the paper, such as ACO and PSO, even though these algorithms themselves are very efficient in solving the combinatorial optimization problems. In the future, our focus will be on enhancing the performance of the proposed algorithm and widening the domains of its application.