Improving Genetic Algorithm with Fine-Tuned Crossover and Scaled Architecture

Genetic Algorithm (GA) is a metaheuristic used in solving combinatorial optimization problems. Inspired by evolutionary biology, GA uses selection, crossover, and mutation operators to efficiently traverse the solution search space. This paper proposes nature inspired fine-tuning to the crossover operator using the untapped idea of Mitochondrial DNA (mtDNA). mtDNA is a small subset of the overall DNA. It differentiates itself by inheriting entirely from the female, while the rest of the DNA is inherited equally from both parents. This unique characteristic of mtDNA can be an effective mechanism to identify members with similar genes and restrict crossover between them. It can reduce the rate of dilution of diversity and result in delayed convergence. In addition, we scale the well-known Island Model, where instances of GA are run independently and population members exchanged periodically, to a Continental Model. In this model, multiple web services are executed with each web service running an island model. We applied the concept of mtDNA in solving Traveling Salesman Problem and to train Neural Network for function approximation. Our implementation tests show that leveraging these new concepts of mtDNA and Continental Model results in relative improvement of the optimization quality of GA.


Introduction
Genetic Algorithm is a nature inspired metaheuristic used to solve optimization and search problems which would otherwise take a long time to solve using brute force methods. GA provides us the means to traverse the solution search space intelligently and to come up with a near optimal solution in a substantially shorter amount of time. Genetic Algorithms are used beyond computer science, engineering, and mathematics, in areas such as economics, bioinformatics, life sciences, and manufacturing. GA is well suited for combinatorial optimization problems. One such problem where we can deploy GA is the Traveling Salesman Problem (TSP).
The goal of Genetic Algorithm is to come as close as possible to the optimal solution. Since the solution search space is so huge, the major difficulty in reaching this goal is convergence into local minima before the search space has been explored for the global minimum. This is where we could exploit the concept of mtDNA to help add some order to the random search for a near optimal solution.

Genetic Algorithm
The idea of GA was proposed by Holland in his 1975 book [1]. Since then GA has been an active field of research and there have been numerous publications on it. TSP is one of the problems where GA has been successfully used.
As shown in Figure 1, GA has two primary functions: population selection and crossover. The selection algorithm describes the methodology to pick parents that will create children for the next generation. There are four strategies shown in the diagram: elite, roulette, rank, and tournament. The elite strategy gives preference to selecting the best members from the current population itself [2]. In roulette selection, members are mapped to a roulette wheel occupying space that is proportional to their fitness and members are selected randomly from it avoiding duplicates [3]. The rank selection method is similar to roulette, but instead of proportional representation of the pie based on fitness, members are ranked in ascending order based on their fitness [2]. In tournament selection, a fixed number of population members are chosen to compete and the best one from that pool is selected to be a parent [2]. This process is continued until all members have been examined. Crossover is the process of intermixing the genetic representation of the parent population members with the intention of creating better fitness in the resulting offspring. There are different types of crossover operators based on the tour representation. Partially mapped crossover (PMX), proposed by Goldberg and Lingle [4], is a popular operator. Here a section of one parent's genes is mapped to the other parent's and the rest are interchanged to produce the offspring [5].
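Tournament selection, as described above, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `tournament_select` and the `pool_size` parameter are hypothetical, and fitness is assumed to be a cost where lower is better (as with tour length in TSP).

```python
import random

def tournament_select(population, fitness, pool_size, rng=random):
    """Pick one parent by tournament.

    Samples `pool_size` distinct members at random and returns the one
    with the lowest fitness value (assumption: lower is better).
    """
    pool = rng.sample(population, pool_size)
    return min(pool, key=fitness)
```

Calling the function twice yields a pair of parents; a larger `pool_size` increases selection pressure because weak members are less likely to win a tournament.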
Mutation adds additional value to GA by introducing random change, which could assist in overcoming local minima in the search exploration. One way to implement mutation is to randomly select a small percentage of population members and interchange a unit of genes with an adjacent one.
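The adjacent-swap mutation described above might look like the following sketch. The function names and the `rate` parameter are assumptions made for illustration; the text only states that a small percentage of members is mutated.

```python
import random

def mutate_adjacent_swap(tour, rng=random):
    """Return a copy of `tour` with one random adjacent pair of genes swapped."""
    child = list(tour)
    i = rng.randrange(len(child) - 1)  # requires at least two genes
    child[i], child[i + 1] = child[i + 1], child[i]
    return child

def mutate_population(population, rate, rng=random):
    """Mutate each member independently with probability `rate` (hypothetical parameter)."""
    return [mutate_adjacent_swap(t, rng) if rng.random() < rate else list(t)
            for t in population]
```

Because only two neighboring genes change places, the mutated member remains a valid permutation of the same cities.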
The solution can be enhanced by utilizing the Island Model. Here the population is broken into smaller groups or islands and GA is run separately on each in separate threads, allowing us to exploit multiple processors or even multiple distributed servers to solve a large problem. This not only speeds up the processing time but also improves the quality of the solution in most cases because it eliminates the sampling bias, if any, present in the initial population [3] when run in a single thread. In addition, a small number of population members from different islands can be exchanged to exploit diversity and prevent premature convergence.
Another method to improve the quality of the solution is to perform a more thorough local search. We can achieve this by using the k-Opt algorithm, where k is a numerical value, which is usually 2 or 3. In the 2-Opt method, two edges are removed from the tour and reconnected in the only other way that retains a valid tour [7]. The advantage of 2-Opt is that it is fast and efficient. When used in combination with standard GA operators, the risk of 2-Opt getting stuck in local minima is also mitigated. The cost of every execution of 2-Opt grows with both the number of cities and the number of members it is applied to. Therefore, 2-Opt should not be run on every iteration but rather once every fixed number of iterations (e.g., 100) to limit the execution time.
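A minimal sketch of 2-Opt is given below. It assumes cities are given as planar coordinates and that the tour is closed (the salesman returns to the first city); the function names are hypothetical. Reversing the segment between positions i and j is equivalent to removing two edges and reconnecting the tour the other way.

```python
import math

def tour_length(tour, coords):
    """Length of the closed tour, including the edge back to the first city."""
    return sum(math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(tour, coords):
    """Keep reversing segments while a reversal shortens the tour."""
    best = list(tour)
    best_len = tour_length(best, coords)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                # Reverse best[i..j]: swaps edges (i-1, i) and (j, j+1).
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                cand_len = tour_length(candidate, coords)
                if cand_len < best_len - 1e-12:
                    best, best_len = candidate, cand_len
                    improved = True
    return best
```

This version recomputes the full tour length for every candidate, which keeps the sketch short; a practical implementation would evaluate only the two changed edges.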

Using GA to Solve TSP.
In a symmetric TSP problem, a salesman has to visit a number of cities and return to the original (first) city with the shortest route [5]. TSP is a classic NP-hard problem and the worst case run time for solving it exhaustively increases superpolynomially or exponentially with the number of cities.
When using GA to solve TSP, every city is denoted by a unique number. Every solution is a random sequence (or a population member in case of GA) of unique nonrepeating numbers, representing a possible route or tour. Every population member with a unique genetic makeup represents a solution to GA. Likewise every route represents a solution in TSP. GA starts out with an initial set of randomly generated population members that go through several iterations of selection and crossover function in the hope of improving the solution in the subsequent iteration or generation. For TSP, we generate random sets of routes, each of which consists of a vector sequence of cities. The selection method picks pairs of routes or population members which are allowed to become inputs to the crossover function. The crossover function then exchanges the unique numbers (cities) of each route pair to generate two children (new population members) for each pair of parents. The children, who themselves represent solutions (routes) to the TSP, then replace the parents as the new set of routes (population members). This iterative process of selection and crossover can continue until we do not get any better results in the next generation.
Additional enhancements are provided by mutations and island models. To mimic mutations in GA, a small percentage of route solutions are randomly picked to have their sequence interchanged by switching two adjacent cities. Mutation contributes to retaining diversity [9]. In the Island Model, multiple GAs are independently run and a small number of population members (routes in case of TSP) are exchanged between these islands after a certain number of iterations (generations). The Island Model also allows us to exploit all available (abundant) computing resources by running multiple GAs in parallel [3]. Both these mechanisms have proven to positively impact the outcome of the solution.

Using GA to Train Artificial Neural Network (ANN).
An ANN consists of nodes arranged in layers as shown in Figure 2. Each node's incoming connections have weights assigned to them, and the weighted sum of the incoming signals is processed through the node and the result fed to the subsequent node[s] in the next layer.
In Figure 2, each node in the input layer is connected to all the nodes in the hidden layer and subsequently all nodes in the hidden layer are connected to the output layer nodes. Each incoming connection of the node $i$ is represented by $x_j$ (i.e., the $j$th input to the node) and each connection also has a weight, $w_{ij}$, associated with it [10].
The following equation shows the mathematical representation of $y_i$, the output of the processing of a node with $n$ inputs, where $\theta_i$ is the bias and $f_i$ is a nonlinear function such as the sigmoid function [10].

ANN Node Function. Consider
$$y_i = f_i\left(\sum_{j=1}^{n} w_{ij} x_j - \theta_i\right)$$
(see [10]). The ANN in Figure 2 is a Feedforward Neural Network, where the connections between the nodes do not form a feedback loop. There are different ways to train a Neural Network. During the training phase the values of the weights and biases are optimized to solve a particular problem. Gradient descent is typically used to adjust the weights based on the difference between the desired output and the current one [11]. Genetic Algorithm can also be used to train the Neural Network. The selection and fitness criteria can be aptly applied in training the Neural Network. Function approximation is one of the areas where Neural Networks can be used effectively.
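The processing done by a single node can be illustrated with the short sketch below. It assumes the bias acts as a threshold that is subtracted from the weighted sum (some formulations add it instead) and uses the sigmoid as the nonlinear function; the names `sigmoid` and `node_output` are chosen here for illustration.

```python
import math

def sigmoid(z):
    """Logistic sigmoid, written via tanh so it cannot overflow for large |z|."""
    return 0.5 * (1.0 + math.tanh(0.5 * z))

def node_output(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs minus the bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) - bias
    return activation(z)

# Example: the weighted sum is 0.5*1.0 + (-0.25)*2.0 = 0, so the output is sigmoid(0).
print(node_output([1.0, 2.0], [0.5, -0.25], 0.0))  # 0.5
```

When GA is used for training, the weights and biases of all nodes together play the role that the city sequence plays in TSP: they are the genes being selected and crossed over.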

Related Work
There has been a considerable amount of research to improve the GA operators to solve TSP. The development of several selection strategies mentioned earlier, that is, elite, roulette, rank, and tournament, is a testimony of that effort. These strategies have been implemented and run against TSPLIB benchmarks [8] by different researchers. Each of these selection operators has its own characteristics, benefits, and shortcomings. Razali and Geraghty [6] concluded that the rank based selection strategy yielded better results but took more computation time, while the tournament method is faster for small sized problems. Selection methods represent only one side of the TSP problem. The other major side is the crossover function, which contributes significantly to the success of the algorithm. About eleven crossover operators are reviewed by Larrañaga et al. [5] in their paper. The majority of them are based on specific patterns of information mixing and interchange between the parents. For example, Order Crossover (OX1) introduces several uniform length cut points in the path of the parents and produces offspring with several subpaths from the parents intact and assimilated in the children [12]. Another crossover operator, Genetic Edge Recombination crossover, adds more meaningful logic in its workings by assuming the edges of the tour are important and attempts to preserve them in the offspring [13].
There is published literature on restrictive crossover. Galán et al. [14] proposed a mating strategy that balances between exploration (selection criteria) and exploitation (fitness criteria) by developing a parameter called mating index, which controls the degree of exploration (or diversity) of parents based on the hardness of the problem. Strategies like incest prevention [15] prevent mating between similar individuals. Assortative mating is another strategy used to improve GA results. Ochoa et al. [16] demonstrate the relation between mutation rates and assortative mating choices; that is, higher mutation rates work well with assortative mating whereas lower mutation rates work well with disassortative mating to confer better fitness. The idea behind these strategies is based on the principle that offspring of similar individuals do not result in higher fitness. Introducing controlled mating based on similarity of genes does yield better results, but it is also computationally costly as the lengthy chromosomes have to be compared.
This paper presents a further optimized idea of restrictive mating to complement the standard crossover operators. The idea is based on the premise that it would not be beneficial to select the offspring of the same parents (or close lineage) as new parents to cross over with each other. In fact, it could be detrimental to maintaining diversity and exploring a greater search space. As an alternative to exhaustive comparison of the genes to determine genetic diversity between the parents, we present an algorithm that is computationally lean. We exploit the concept of mtDNA to enhance the GA.

Mitochondrial DNA (mtDNA).
Humans have 23 pairs of chromosomes with one copy of each pair inherited separately from each parent [6]. The DNA in these chromosomes is referred to as nuclear DNA [17]. In addition, humans also have mtDNA [17], which makes up only 1% of the total DNA [18], thus coding for far fewer genes. Though insignificant by orders of magnitude when compared to the nuclear DNA in its contribution to inheritable traits (genes), mtDNA's unique characteristic in inheritance can play an important role in guiding the search for an optimal solution. The DNA sequence in the 23 pairs of chromosomes is inherited equally by the offspring from both parents during reproduction, whereas the sequence in mtDNA is inherited only from the maternal side [18]. This allows us to keep track of population members with similar genetic traits and common inheritance via maternal lineage. Diversity is the key to preventing premature convergence and achieving a near optimal solution. Crossovers between similar population members with close DNA proximity will not yield results better than the prior generation in most cases. The idea in this paper is to create a data structure to tag and track the mtDNA in every population member and restrict the crossover between population members with similar mtDNA. mtDNA is widely used in evolutionary genetics and population study [19], and its concept could potentially be beneficial to GA search exploration.

Using mtDNA GA and Scaled Architecture to Solve TSP.
The primary objectives of GA are to obtain a better solution after every iteration and to prevent solution exploration from prematurely converging into local minima. The primary way to address the latter goal is to introduce the right amount of diversity in the parents.
Most of the crossover operators tend to be very refined and granular at the city or node information level and seem to overlook the bigger picture. As the GA undergoes several iterations of crossover, the risk of convergence increases too, and in some cases crossovers between offspring of the same population members would not yield results any better than the previous generation because their parents would have similar genetic information to begin with. With less genetic variance in the parents, we cannot expect better or different results in the offspring. It is self-evident that genetic variability sows the seed for evolution and newer offspring [20]. One way to track genetic similarity is by tracking the family lineage, and the most effective way to track inheritance in the real world is through mtDNA [21].
The concept of mtDNA (Mitochondrial DNA) is implemented in this paper to control the crossover function by preventing population members with the same mtDNA from reproducing for a set number of generations. To avoid overly restrictive consequences of this condition, the requirement is imposed on only a percentage of crossovers. mtDNA is defined as a separate attribute of the population member class. Since mtDNA gets inherited solely from the female parent, it does not alter as it is passed down to the offspring. This attribute was exploited to guide and control crossovers. Algorithm 1 gives a high level overview of the mtDNA algorithm as pseudocode.
All four selection methods (tournament, elite, roulette, and rank) described earlier were utilized during the implementation. To transfer genes to children during crossover, a 1/4 to 3/4 tour cut was made on parent one and transmitted to the children. The rest was transferred in cyclic order from the second parent, skipping any cities that were already derived from the first parent, thus ensuring every city is represented in the child with no repetition. In addition to leveraging mtDNA in the implementation, various selection methods, the Island Model, 2-Opt, and distributed processing using multiple servers (Continental Model) were also utilized.
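The cut-and-fill crossover described above can be sketched as follows. This is an illustration rather than the implementation used in the experiments: it assumes the "1/4 to 3/4 tour cut" means the slice between positions n/4 and 3n/4 of parent one, and that the remaining cities are read from the second parent cyclically, starting just after the cut. The function name `quarter_cut_crossover` is hypothetical.

```python
def quarter_cut_crossover(parent1, parent2):
    """Copy the middle half of parent1, then fill the rest cyclically from parent2."""
    n = len(parent1)
    start, end = n // 4, (3 * n) // 4
    child = [None] * n
    child[start:end] = parent1[start:end]
    used = set(parent1[start:end])
    # Cities of parent2 in cyclic order from `end`, skipping those already copied.
    fill = [parent2[(end + k) % n] for k in range(n)
            if parent2[(end + k) % n] not in used]
    # Open positions of the child, visited in the same cyclic order.
    slots = [(end + k) % n for k in range(n) if child[(end + k) % n] is None]
    for pos, city in zip(slots, fill):
        child[pos] = city
    return child

print(quarter_cut_crossover([0, 1, 2, 3, 4, 5, 6, 7], [7, 6, 5, 4, 3, 2, 1, 0]))
# [7, 6, 2, 3, 4, 5, 1, 0]
```

Swapping the roles of the two parents produces the second child. Every city appears exactly once because cities copied from the first parent are skipped while reading the second.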
Figure 3 provides a high-level workflow of the GA implementation in this paper. A custom version of GA was run on each of the four threads on each server.
The Island Model was implemented on multicore servers by running multiple threads in parallel. Each thread ran its own version of GA. Periodically, after a fixed number of iterations/generations on each of the threads running the GA, a handful of randomly selected population members were exchanged between the threads. This process not only added more computing resources and improved the execution time of GA but also increased diversity and reduced initial sampling bias.
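The periodic exchange between islands could be sketched as below. The text does not specify the exchange topology or replacement policy, so this sketch assumes a ring (each island sends to the next one) and random replacement; `migrate` and `n_migrants` are hypothetical names. The same routine could in principle be applied one level up, between servers, for the Continental Model.

```python
import random

def migrate(islands, n_migrants, rng=random):
    """Exchange members between islands in place.

    Each island sends copies of `n_migrants` randomly chosen members to the
    next island in a ring, where they overwrite randomly chosen members.
    """
    # Choose all emigrants first so an island never forwards what it just received.
    outgoing = [[list(m) for m in rng.sample(isl, n_migrants)] for isl in islands]
    for src, migrants in enumerate(outgoing):
        dest = islands[(src + 1) % len(islands)]
        slots = rng.sample(range(len(dest)), n_migrants)
        for slot, member in zip(slots, migrants):
            dest[slot] = member
```

In a threaded implementation this step would run at a synchronization point, once every fixed number of generations, with each island's GA loop running independently in between.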
2-Opt was implemented by selecting, once every fixed number of iterations/generations, the population member with the best fitness so far in the particular thread of GA execution. The selected member then underwent local optimization. Two links/edges of the best member were swapped exhaustively to check whether the swap improved the solution.
The Island Model was further scaled with distributed processing by executing the abovementioned implementation on several servers using Web Services (Service Oriented Architecture (SOA)). We named it the Continental Model. Population members were randomly exchanged between these independently run Island Model GA implementations on different servers after a fixed number of iterations to achieve diversity and to reduce the likelihood of premature convergence.
The implementation was run against known TSP instances (dantzig42, eil51, rd100, ch150, and kroB200) from TSPLIB [8]. The numerical value in the name of the benchmark denotes the number of cities in it; for example, eil51 has 51 cities. The results from the mtDNA implementation of GA were compared against the results of the implementation by Razali and Geraghty [6] and the known best solutions.

mtDNA GA in Artificial Neural Network.
We used GA to train the Neural Network for function approximation. A multilayer feedforward ANN, with 1 node in the input layer, 26 nodes in the first hidden layer, 26 nodes in the second hidden layer, and 1 node in the output layer, was chosen. The GA implementation for Neural Network is similar to the GA implementation for TSP. In place of the city numbers (in TSP), the values of the weights are randomly initialized in a solution set and crossed over with another set of weights in the case of the Neural Network. But unlike in TSP, the values of the weights do not need to be unique within a solution set. mtDNA was introduced in GA here just like in TSP. The resulting children from crossovers were tagged with the same mtDNA attribute as the mother for a fixed number of iterations as defined in Algorithm 1, and crossovers were prevented between those with the same mtDNA. The mtDNA value was reset after that number of iterations to ensure that crossovers were not too restrictive. We used the mtDNA implementation of GA to train the Neural Network for function approximation.
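To make the encoding concrete, the sketch below treats one population member as a flat list of weights and biases for the 1-26-26-1 network and evaluates its fitness as a summed squared error. Several details are assumptions not stated in the text: the flat layout (each node's weights followed by its bias), sigmoid hidden nodes, a linear output node, and the bias being subtracted as a threshold. All function names are hypothetical.

```python
import math

LAYERS = [1, 26, 26, 1]  # layer sizes stated in the text

def chromosome_length(layers=LAYERS):
    """Number of genes: one weight per incoming connection plus one bias per node."""
    return sum((n_in + 1) * n_out for n_in, n_out in zip(layers, layers[1:]))

def forward(chromosome, x, layers=LAYERS):
    """Evaluate the network on a scalar input using genes unpacked in order."""
    activations = [x]
    pos = 0
    for depth, (n_in, n_out) in enumerate(zip(layers, layers[1:])):
        is_output = depth == len(layers) - 2
        nxt = []
        for _ in range(n_out):
            weights = chromosome[pos:pos + n_in]
            bias = chromosome[pos + n_in]
            pos += n_in + 1
            z = sum(w * a for w, a in zip(weights, activations)) - bias
            # Sigmoid on hidden nodes (via tanh, overflow-safe); linear output node.
            nxt.append(z if is_output else 0.5 * (1.0 + math.tanh(0.5 * z)))
        activations = nxt
    return activations[0]

def squared_error(chromosome, samples):
    """Fitness to minimize: summed squared error over (x, target) pairs."""
    return sum((forward(chromosome, x) - y) ** 2 for x, y in samples)

print(chromosome_length())  # 781
```

With this layout, crossover can operate on the flat list exactly as it does on a tour, except that no uniqueness constraint has to be repaired afterward.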
Here are the details from the implementation: (1) Population size = 200.
GA was used to train the ANN for the four functions shown in Figure 4, with and without the mtDNA logic.

GA in Traveling Salesman Problem.
The TSPLIB column in Table 1 indicates the benchmark names. The 2nd and the 3rd columns represent the fitness from the known best solution and from Razali and Geraghty [6], respectively. The last column lists the fitness value from the GA implementation with mtDNA together with 2-Opt and the Continental Model in this paper. In the case of TSP, the fitness function is the distance traveled by the salesman through all the cities and back to the first one. Solutions are evaluated based on the fitness value; that is, the lower the value the better.
Figure 5 shows the comparison between this paper's mtDNA GA implementation and the known solutions [8] of the TSP benchmarks. The mtDNA GA implementation scored better on benchmarks with relatively fewer cities, that is, dantzig42 (42 cities), eil51 (51 cities), and eil76 (76 cities).
Figures 6, 7, and 8 provide the results of the mtDNA GA implementation on TSPLIB benchmarks with a higher number of cities (100, 150, and 200). While the results of mtDNA GA on these larger benchmarks are slightly behind the known best solution, they are significantly better than the results when mtDNA logic and scaled (multinode) architecture were not used. Thus, introducing these two concepts adds value to solving TSP by consistently lowering the fitness of the solution, even in TSPLIB benchmarks with greater than 99 cities.
Table 2 provides the best route/tour results that were obtained by running GA with mtDNA for the respective benchmarks. The results in Table 1 demonstrate that mtDNA yields results that are better than those of Razali and Geraghty [6] and the published solution posted on TSPLIB [8] for the dantzig42 and eil51 benchmarks. When mtDNA and the Continental Model were used with other known operators and algorithms, the result was a solution that was better than the published solution for eil76 and very close to the known best solutions for the rd100, ch150, and kroB200 benchmarks.

GA in Artificial Neural Networks.
The ANN was trained separately using the mtDNA implementation of GA and GA by itself. After the weights were set, the ANN was used to approximate the four functions (Figure 4) and the squared error was computed. The results from the mtDNA implementation of GA as listed in Table 3 were better than the results when GA was used by itself across all four functions. Table 3 shows the results after several numbers of iterations/executions (100, 200, 300, 500, 1000, and 5000) using both training methods. Figures 9-12 show the results from Table 3 in graphical form.
The ANN trained with the mtDNA incorporated GA consistently outperforms the GA-only trained ANN for the given four function approximations.

Conclusion
We have presented two important ideas, mtDNA and a Continental Model, for improving the optimization quality of GA. The mtDNA logic introduced in the paper is a novel idea and is inspired by nature just like many of the optimization algorithms, for example, Genetic Algorithm, Swarm Intelligence, Ant Colony Optimization, and Neural Network. Like these nature inspired algorithms and systems, the concept of mtDNA is not very complex but can be instrumental in improving the quality of metaheuristics. Maintaining diversity is the key to preventing premature convergence into local minima. The characteristics of mtDNA can be exploited to track diversity and restrict crossover between parents of the same genetic traits, thus yielding better fitness value in the offspring. The mtDNA concept articulated and implemented in this paper mimics the natural order where it is an established fact that biodiversity favors evolution and produces more adaptable offspring. Thanks to faster hardware, parallel/distributed processing, and algorithms, TSP benchmarks with a smaller number of cities have been solved optimally in short runtime. Larger benchmarks/problems still provide opportunities to improve algorithms. The implementation of mtDNA on small to medium sized TSP benchmarks in this paper supports its contribution in relatively improving the quality of the solution. The goal of this implementation is not necessarily to beat the runtime record of algorithms on benchmarks that have already been optimally solved, but rather to provide a proof of concept of a technique that can be exploited to get better results. Likewise with the Continental Model, we improve our results with greater exploration of the search space afforded by an additional layer of randomness and exchanges between independent implementations of Island GAs. The Continental Model multiplies the benefits of the Island Model by injecting more diversity and reduces the negative impact of any inherent initial biases in the individual silos of GA implementations in different systems. In addition, we were able to use the concept of mtDNA in GA beyond TSP to improve the outcome of Neural Network learning. Thus, it can be concluded from the results of this paper that the Continental Model and the incorporation of mtDNA to control crossover are constructive modifications that contribute to further optimizing the GA by yielding relatively better results.

Future Work
To extend the validity of mtDNA in GA as a generally more acceptable technique, it can be implemented and tested in other combinatorial optimization problems with much larger data (population) sets. Other novel methods of distributed and parallel computation algorithms can also be leveraged to get closer to the optimal solution. The idea of mtDNA to guide the crossover function can be further refined and ingrained into the GA algorithm to achieve better results. The value of mtDNA can be made relative to the variance between the nuclear DNA sequences (city route sequences) of population members and we can restrict crossovers between members with close mtDNA proximity in addition to members with the same mtDNA. We can combine the concept of mtDNA with other crossover operators and explore further optimization strategies. In addition, instead of outright prevention of crossovers between population members with the same mtDNA, we can apply special operators to such crossovers to maximize diversity. To further validate the use of mtDNA concepts, we can extend the scope of the tests with more experiments and other optimization problems.

Figure 3: High-level workflow of the GA implementation.
Master: polls each server and exchanges members after N iterations; keeps track of the best fitness.
Server: (1) each thread in the server runs an instance of GA; (2) members are exchanged between threads every M iterations; (3) each server keeps track of its best member.
Thread: (1) each thread in the server runs a custom version of GA; (2) crossover between selected members is permitted only if they satisfy the mtDNA requirements; (3) population members go through a number of iterations of selection, crossover, 2-Opt, and mutation; (4) the best member is reported to the server.

Algorithm 1: mtDNA pseudocode and algorithm.
(1) Initialization
    Assign mtDNA attribute to each population member
    If population = initial then mtDNA ← random unique value
    If population = offspring of crossover then mtDNA ← mtDNA of female (2nd parent)
(2) For each iteration from 1 to Maximum Iterations
    (a) Selection
        Check mtDNA attributes of crossover pairs while (total iteration mod 100) is below a chosen threshold, where the threshold < 100
        If parent 1 mtDNA = parent 2 mtDNA then abort crossover & find another pair
        If parent 1 mtDNA ≠ parent 2 mtDNA then allow crossover
        Reset mtDNA attribute of all members to unique values after a fixed number of iterations, chosen to be less than log2 of the total population
    (b) Crossover & mtDNA transfer
        Children's mtDNA ← mtDNA of the female parent
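The bookkeeping in Algorithm 1 can be sketched in Python as follows. This is an illustration under stated assumptions: the class and function names are hypothetical, `check_window` stands for the threshold below which the mtDNA check is active within each 100-iteration period, and unique tags are drawn from a counter rather than generated at random.

```python
import itertools
from dataclasses import dataclass

_next_tag = itertools.count()  # source of unique mtDNA values (assumption)

@dataclass
class Member:
    genes: list
    mtdna: int

def new_member(genes):
    """Initial population: every member starts with its own unique mtDNA."""
    return Member(list(genes), next(_next_tag))

def make_child(genes, mother):
    """Offspring inherit mtDNA unchanged from the female (2nd) parent."""
    return Member(list(genes), mother.mtdna)

def may_cross(p1, p2, iteration, check_window, period=100):
    """Enforce the mtDNA rule only in the first `check_window` iterations of each period."""
    if iteration % period >= check_window:
        return True
    return p1.mtdna != p2.mtdna

def reset_mtdna(population):
    """Give every member a fresh unique mtDNA so the restriction does not persist."""
    for member in population:
        member.mtdna = next(_next_tag)
```

In a GA loop, the selection step would call `may_cross` on each candidate pair and draw another pair when it returns `False`. Comparing a single integer per pair is what keeps this check cheap relative to comparing whole chromosomes.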

Figure 4: Four functions used for training Neural Network.

Figure 7: Graph shows various GA/mtDNA results along with the published best (known) solution for ch150.

Figure 10: Graph shows error with GA mtDNA and standard GA for Function B.

Figure 11: Graph shows error with GA mtDNA and standard GA for Function C.

Table 2: Best route/tour for mtDNA fitness values for various benchmarks.

Table 3: Squared error of the ANN trained with mtDNA GA and with GA alone for the four functions, after 100, 200, 300, 500, 1000, and 5000 iterations.