Simulation Experiment Exploration of Genetic Algorithm’s Convergence over the Relationship Advantage Problem

Concentrating on the convergence analysis of the Genetic Algorithm (GA), this study distinguishes, for the first time, two types of advantage sources: value advantage and relationship advantage. Accordingly, three features of fitness evaluation are proposed: the quantitative feature, the complete quantization feature, and the partial quantization feature. Seven simulation experiments show that the two types of advantages have different convergence properties. For value advantage problems, GA converges well. For relationship advantage problems, however, a feasible or even satisfactory solution may be obtained in practice through large-scale searching, but in theory the search process is not convergent. Therefore, GA is not reliable for solving relationship advantage problems, to which most engineering problems involving combinatorial optimization belong. This study systematically shows the convergence properties of “relationship advantage” through simulation experiments, which opens a new area for further study of GA.


Introduction
There are now many operations research tools for solving combinatorial optimization problems, such as the Ant Colony Optimization (ACO) algorithm, the Particle Swarm Optimization (PSO) algorithm, and the Artificial Neural Network (ANN). However, no method has been proven to be universally better than the others, since each has its own advantages and disadvantages. GA, as one of the oldest and most widely used heuristic algorithms for NP-hard problems, has attracted a large number of scholars to study its properties and applications. This research is motivated by a flexible job shop reconstruction employing GA [1], whose objective is to lay out machines so as to achieve a high logistic efficiency. However, the search was not stable and it was difficult to get a satisfactory solution. An extensive literature review [2][3][4][5] shows that the same problem occurs in other engineering cases, such as the path planning problem and the bin packing problem, in which a feasible solution may be obtained by large-scale searching, but convergence cannot be guaranteed and the efficiency is very low. To explore the reasons for the low search efficiency and poor convergence, this research carries out a series of simulation experiments, since simulation is a good approach to finding the root cause.
Since the GA is a simulation method that imitates the process of biological evolution [6], its effectiveness has long been demonstrated through experiments and application results based on the experience design theory [7], and so far there is no strict mathematical proof of its convergence. Even in Adaptation in Natural and Artificial Systems [8], originally published in 1975 and republished in 1992, in which the GA was first formally proposed by Professor Holland of the University of Michigan, no strict mathematical proof was given. GA is a heuristic optimization algorithm that simulates Darwin's natural selection and the genetic mechanism in evolution [9]. From a practical perspective, research on GA in recent years has focused on how to improve it to solve actual industrial problems such as traffic scheduling, packing and layout optimization, and job shop scheduling.
For example, in the field of vehicle and path planning, Vidal et al. [10] present an efficient hybrid genetic search with advanced diversity control, which is shown to be feasible by outperforming all current state-of-the-art approaches on classical literature benchmark instances for any combination of periodic, multi-depot, site-dependent, and duration-constrained vehicle routing problems with time windows. Tasan and Gen [11] propose a GA-based approach to the vehicle routing problem with simultaneous pickup and deliveries, which is evaluated by solving several test problems.
In the field of packing and layout optimization, Gonçalves and Resende [12] present a multipopulation biased random-key GA for the single container loading problem; the proposed algorithm is extensively tested on the complete set of benchmark instances and is compared with 13 other approaches to demonstrate its better performance. Moradi and Abedini [13] present a novel combined GA and Particle Swarm Optimization model for the optimal location and sizing of distributed generation on distribution systems, the effectiveness of which is demonstrated by a detailed performance analysis carried out on bus systems.
In the field of job shop scheduling, Yusof et al. [14] propose a new hybrid parallel GA based on a combination of asynchronous colony and autonomous immigration GAs to solve benchmark job shop scheduling problems, whose better performance is shown by a considerable decrease in the makespan compared with the conventional GA. For the hybrid flow shop scheduling problem with multiprocessor tasks, Engin et al. [15] propose a new mutation operator and use a full factorial experimental design to determine the best values of the control parameters and the operators. Zhang et al. [16] study a job shop scheduling problem with two new objective functions based on the setup and synergy costs, besides the traditional total weighted tardiness criterion, and present a Pareto-based GA incorporating a local search module, whose effectiveness is verified by computational experiments on both real-world and randomly generated scheduling instances.
So far, it can be seen that applications of GA do not carry out a strict mathematical proof of convergence, and its validity and practicability are demonstrated mainly in the following four aspects: (1) Taking a standard test case as the instance to test its validity.
This study is motivated by a flexible machine layout problem [1,17]. Several studies apply GA to flexible machine layout optimization [18][19][20]. However, in a flexible workshop layout optimization study involving 6 machines [1], we found that although feasible or even satisfactory solutions could appear in the process of search, better solutions showed no sign of spreading in the population, and the search space did not appear to narrow; in other words, the search process was not convergent. After a detailed investigation of the individuals in each generation, we found that the values of design variables may be completely different between those better solutions, while the relative position relations between those variables are similar. These better position relations cannot be kept and spread in the subsequent iterations using GA.
In this paper, the advantage derived from the relationship between variables is defined as relationship advantage, and the advantage derived from the values of design variables is defined as value advantage. This study discusses the convergence properties of the value and relationship advantages by designing a series of simulation experiments.

Value Advantage and Relationship Advantage
In the biological world, two different types of advantages can generate higher fitness: value advantage and relationship advantage. Value advantage comes from the value of a single trait. For example, in Darwin's theory of evolution, a giraffe with a longer neck has a stronger ability to reach food on the tree; this is a value advantage produced by "neck length." Another instance of value advantage in Darwin's theory of evolution is the moth: the darker the color of the moth, the stronger its survival ability; this is a typical value advantage too. The degree of superiority depends on the value of a single trait.
Most advantages in the biological world, however, do not come from the value of a single trait but from the relationship between multiple traits, which is defined as relationship advantage in this paper. For example, the cheetah is the running champion of the animal world, but the cheetah's leg is not longer than that of the horse, and its body is not stronger than that of the tiger; the cheetah's superior running ability comes not from longer legs or a more robust body but from the high degree of coordination among its locomotive organs; this is a typical relationship advantage. Again, the dragonfly's superior balance does not come from the size of its head or body but from the proportion between head size and body length, which is a typical relationship advantage. In fact, in the world of biology, most advantages belong to the type of relationship advantage; even in the giraffe example from Darwin's theory of evolution, without coordination among the long neck muscles, heart function, and other traits, a long neck alone cannot form a survival advantage.
In the field of engineering, there are value and relationship advantages as well. The motivation for this study is as follows: in the process of using a GA for machine layout optimization considering the workshop logistics characteristics, we found that the standard GA is not convergent for this problem. The reason is that the quality of a machine layout depends on the relative position relationship among machines and its compatibility with the workshop logistics characteristics, and does not depend on the specific place of a single machine. That is to say, even if the locations of multiple machines change, the logistics characteristic value does not change as long as the relative position relationship among machines remains unchanged. The machine layout problem is a typical relationship advantage problem.
Figure 1 is a circuit including a slide rheostat, a light bulb, and a set of batteries. Taking the slide rheostat's value as the design variable, the battery voltage and the maximum resistance of the slide rheostat as constraint conditions, and maximizing the brightness of the light bulb as the optimization objective, an optimization model can be constructed. The optimization model is as follows.
Determine $R$ (the value of the slide rheostat) so as to maximize the current $I$ of the circuit, subject to $0 \le R \le R_{\max}$, where $R_{\max}$ is the maximum resistance of the slide rheostat, $U$ is the voltage of the battery pack, and $R_1$ is the resistance of the bulb.
Obviously, the brightness of the light bulb is determined only by the value of the slide rheostat; the smaller the value, the brighter the light bulb. This is a problem of value advantage.
Figure 2 is a circuit including a light bulb, a set of batteries, and 4 switches, each of which has two optional state values, 0 or 1. Taking the state values of the switches as design variables, the range of optional states as the constraint condition, and turning the bulb on as the optimization objective, an optimization model can be constructed. The optimization model is as follows.
Determine $x_1, x_2, x_3, x_4$ (the state values of the switches) such that the bulb is on. Obviously, the state of the bulb is not determined by any single value of $x_1, x_2, x_3, x_4$, but by whether the four design variables are equal or not. This is a relationship advantage problem.
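The contrast between the two circuits can be illustrated with a small sketch. It assumes, for illustration only, that Figure 1 is a series circuit obeying Ohm's law and uses made-up numeric values for the voltage and bulb resistance; the function names are hypothetical and do not come from the paper.

```python
def bulb_current(rheostat_ohms, voltage=6.0, bulb_ohms=2.0):
    """Figure 1 style objective (assumed series circuit, assumed values).

    The current grows monotonically as the single design variable shrinks,
    so the advantage comes from the value of one variable.
    """
    return voltage / (rheostat_ohms + bulb_ohms)


def bulb_is_on(switches):
    """Figure 2 style objective: 1 if all switch states are equal, else 0.

    No single switch value decides the outcome; only the relationship
    among the four values does.
    """
    return 1 if len(set(switches)) == 1 else 0


if __name__ == "__main__":
    print(bulb_current(0.0), bulb_current(5.0))
    print(bulb_is_on((0, 0, 0, 0)), bulb_is_on((1, 1, 1, 0)))
```

Note that `bulb_is_on` returns the same value for (0, 0, 0, 0) and (1, 1, 1, 1), although the two solutions share no gene value.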
It can be concluded that not only can a trait value generate advantages, but the relationship among traits can generate advantages too; furthermore, the latter is an even more common form of survival advantage. However, relationship advantage is not mentioned, and the two types of advantages are not distinguished, in Darwin's The Origin of Species. So far, in the literature related to the GA, there is no systematic research on relationship advantage.

Quantitative Feature, Complete Quantization Feature, and Partial Quantization Feature in Fitness Evaluation
In the theory of evolution, "nature" in "natural selection" cannot distinguish different fitness levels behind the same survival result, and the fitness of individuals is judged by their survival results. Therefore, fitness evaluation has three features: the quantitative feature, the complete quantization feature, and the partial quantization feature.
Quantitative feature: there is a cumulative correlation between the individual trait values and the individual fitness; namely, every change of a trait value immediately changes the fitness. The example shown in Figure 1 has a typical quantitative feature: taking the value of the current $I$ (the voltage divided by the resistance) as the fitness, the smaller the value of the slide rheostat $R$, the greater the individual's fitness, and there is a continuous correlation between the design variable and the objective value.
Complete quantization feature: the fitness value is only 0 or 1, and there is no intermediate or cumulative relation between the design variables and the objective value. Figure 2 is a typical example of the complete quantization feature: for the two design vectors $X^{(1)} = (1, 1, 1, 0)$ and $X^{(2)} = (1, 1, 0, 0)$, the light is off in both cases and the objective value is 0. Although from a smart designer's viewpoint the former is closer to the optimum than the latter, in the process of natural selection "nature" cannot tell the difference between the two solutions' "distances" to the optimal solution. In the optimization model simulating the evolution process, taking the value of the current as the fitness value likewise cannot distinguish the two solutions' "distances" to the optimal solution. The fitness of an individual is either 0 or 1, which is a typical complete quantization feature.
Partial quantization feature: when the design variable values are within a certain range, the fitness value changes with the design variable values, but when the design variables are outside this range, the fitness value does not change with them. Between the design variables and the fitness there are both a quantitative feature and a quantization feature, which is defined as the partial quantization feature in this paper. Figure 3 is a circuit including a light bulb, a set of batteries, three step-adjustment slide rheostats, each with a total resistance of $4R$, and four switches, each of which has five optional state nodes 0, 1, 2, 3, 4. When the resistance in the circuit is greater than $(3R + R_1)$, with $R_1$ the resistance of the bulb, the light bulb is off, and when the resistance is less than or equal to $(3R + R_1)$, the brightness of the light bulb increases as the resistance decreases. Taking the state values of the switches as design variables, the range of optional states as the constraint condition, and the light bulb brightness as the optimization objective, an optimization model can be constructed as follows.
Determine $x_1, x_2, x_3, x_4$ as follows: the objective value is the brightness of the bulb when the circuit resistance does not exceed the threshold stated above; else the objective value is 0.
When the four state values of the switches are within a certain range, the brightness of the light bulb changes along with the switch values, and the evaluation of fitness has the quantitative feature. When the four state values of the switches are outside this range, the light bulb is off, and a change of the switch state values does not affect the light bulb's state; for example, when $x_1 = 0$ and $x_2 = 4$, whatever the values of $x_3$ and $x_4$ are, they do not affect the light bulb's state, and the evaluation of fitness has the quantization feature. Therefore, the case of Figure 3 has a partial quantization feature.
Relationship advantage often has the complete quantization feature or the partial quantization feature. This study carries out systematic experimental research on the behavior of GA when it is employed to solve the value advantage problem and the relationship advantage problem with the quantitative feature, the complete quantization feature, and the partial quantization feature.
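A partial quantization fitness can be sketched as below. The circuit model here is a hypothetical reading of Figure 3, not the paper's stated model: it assumes each of the three rheostats sits between two adjacent switches and contributes as many resistance units as the difference of their positions, that the bulb goes off above three units, and that the bulb itself counts as one unit. All names and constants are illustrative.

```python
from itertools import product

THRESHOLD_UNITS = 3   # assumed: bulb is off above three rheostat units
BULB_UNITS = 1.0      # assumed bulb resistance, in the same units


def rheostat_units(switches):
    # Hypothetical model: each rheostat between adjacent switches adds
    # |difference of the two switch positions| resistance units.
    return sum(abs(a - b) for a, b in zip(switches, switches[1:]))


def partial_quantization_fitness(switches):
    units = rheostat_units(switches)
    if units > THRESHOLD_UNITS:
        return 0.0                      # quantized region: fitness is flat
    return 1.0 / (units + BULB_UNITS)   # quantitative region: fitness varies


if __name__ == "__main__":
    # With x1 = 0 and x2 = 4 the fitness is 0 for every choice of x3, x4.
    flat = {partial_quantization_fitness((0, 4, c, d))
            for c, d in product(range(5), repeat=2)}
    print(flat)
    print(partial_quantization_fitness((2, 2, 2, 2)),
          partial_quantization_fitness((2, 2, 2, 3)))
```

Under these assumptions the sketch reproduces the behaviour described in the text: inside the range the fitness responds to each change, and outside it the fitness gives the search no gradient at all.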

Two Simulation Experiments of Value Advantage
Experiment One. As shown in Figure 4, the circuit contains a set of batteries, a light bulb, and four identical step-adjustment slide rheostats, each of which has 16 optional states given by the natural numbers 0 to 15. Taking maximization of the brightness of the bulb as the objective, the optimization model can be developed as follows. Determine $x_1, x_2, x_3, x_4$ such that $\min f(X) = x_1 + x_2 + x_3 + x_4$. In the optimization model, the variables' definition domain is shown in Table 1.
Step 2. We randomly form 600 individuals as the initial population and then randomly divide the 600 individuals into 300 couples.Through crossover operation, each couple generates four offspring, and thus a new expanded population containing 1200 individuals is obtained.
Step 3. From the 1200 * 16 genes of the 1200 individuals, we randomly select 10 genes for the "anti" operation (flipping the gene value), which realizes mutation with a small probability.
Step 4. The fitness is calculated from $f(X)$, the sum of the four decimal numbers corresponding to the four groups of 4-bit binary code in an individual; the expression is Fitness = $1/(f(X) + 1)$. Namely, the greater the sum of the four design variables, the smaller the fitness. When the sum of the four design variables is 0, the fitness reaches its maximum of 1; namely, $x_1 = x_2 = x_3 = x_4 = 0$ is the optimal solution.
Step 5. Returning to the first step, we repeat the loop until the number of identical individuals in the population reaches a given evaluation criterion. When the criterion is met, the optimal individual is output as the optimal solution.
The above is a strict simulation experiment using the standard GA. Figure 5 shows how the number of optimal individuals in the population changes during the iterative process. The abscissa iter indicates the iterations, and the ordinate indicates the number of optimal individuals in the population after each iteration. It can be seen that after 649 iterations there are 571 individuals of "0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0" in the population. Namely, after 649 iterations, 95.17% of the individuals in the population have become optimal individuals. After about 700 iterations, the share of optimal individuals in the population fluctuates around 98%. The slight fluctuation is caused by the mutation operation, but the share of optimal individuals stays above 95%; namely, the advantage can be kept and spread in the population.
Repeated experiments show that a stable population eventually forms when the standard GA is used to solve the above problem. The optimal individuals account for an overwhelming majority of the population; consequently, the search process converges well.
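Steps 2 to 5 of experiment one can be sketched roughly as follows. The sketch makes assumptions that the text does not state: single-point crossover applied twice per couple to obtain four offspring, and a selection rule that keeps the 600 fittest of the 1200 offspring. The function names are illustrative, and the run is far shorter than the experiment reported above.

```python
import random

GENES = 16        # four design variables, 4 bits each
POP_SIZE = 600
MUTATIONS = 10    # genes flipped per generation in the expanded population


def decode(ind):
    # Read each group of 4 bits as one integer in 0..15.
    return [int("".join(map(str, ind[i:i + 4])), 2) for i in range(0, GENES, 4)]


def fitness(ind):
    # Fitness = 1 / (f(X) + 1), with f(X) the sum of the four variables.
    return 1.0 / (sum(decode(ind)) + 1)


def crossover(a, b, rng):
    # Assumed operator: single-point crossover, returning two children.
    cut = rng.randint(1, GENES - 1)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def next_generation(pop, rng):
    rng.shuffle(pop)
    expanded = []
    for a, b in zip(pop[0::2], pop[1::2]):       # 300 couples
        expanded.extend(crossover(a, b, rng))    # two children
        expanded.extend(crossover(a, b, rng))    # two more, four per couple
    for _ in range(MUTATIONS):                   # "anti" operation on 10 genes
        i = rng.randrange(len(expanded))
        g = rng.randrange(GENES)
        chrom = list(expanded[i])
        chrom[g] ^= 1
        expanded[i] = chrom
    # Assumed selection rule: keep the 600 fittest of the 1200 offspring.
    expanded.sort(key=fitness, reverse=True)
    return expanded[:POP_SIZE]


def run(generations=80, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(GENES)] for _ in range(POP_SIZE)]
    for _ in range(generations):
        pop = next_generation(pop, rng)
    # Number of all-zero (optimal) individuals in the final population.
    return sum(1 for ind in pop if sum(ind) == 0)


if __name__ == "__main__":
    print(run())
```

Because the selection rule is an assumption, the number of generations needed and the final share of optimal individuals need not match Figure 5; the sketch only shows the mechanism by which a value advantage spreads.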
Experiment Two. Some studies claim that GA can generate art images [21,22]; in view of this, an experiment using GA to generate an image of the Mona Lisa is designed in this research. Taking a 60 * 60 standard image of the Mona Lisa as the comparison standard and the values of the 60 * 60 pixels as design variables, a chromosome is made up of 3600 genes, each of which has a state of either 0 or 1: 0 indicates that the pixel is white, and 1 indicates that the pixel is black. For a 3 * 3 image, for example, if the chromosome is (1 0 0 1 1 1 0 0 1), the rows of the image matrix are (1 0 0), (1 1 1), and (0 0 1), and the corresponding image is Figure 6. According to this coding scheme, a standard chromosome composed of 3600 genes can be obtained from the 60 * 60 Mona Lisa image. The fitness of any image, as an individual, can be worked out by comparing its genes with the standard chromosome and taking the number of equal genes as the fitness value. Taking the 3 * 3 image in Figure 7 as an example of the fitness calculation, if Figure 6 is the standard comparison image, namely, the standard chromosome is (1 0 0 1 1 1 0 0 1), then the fitness of the chromosome (1 1 1 1 0 0 0 0 1) corresponding to Figure 7 is 5.
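The coding and fitness calculation for the 3 * 3 example can be checked with a short sketch. The function names are illustrative, and the row-by-row layout of genes is an assumption that matches the row and column rule used later for the 60 * 60 image.

```python
def to_matrix(chromosome, width):
    """Lay the genes out row by row (assumed order) as an image matrix."""
    return [list(chromosome[i:i + width])
            for i in range(0, len(chromosome), width)]


def image_fitness(candidate, standard):
    """Fitness = number of genes equal to the standard chromosome."""
    if len(candidate) != len(standard):
        raise ValueError("chromosomes must have the same length")
    return sum(1 for c, s in zip(candidate, standard) if c == s)


if __name__ == "__main__":
    standard = (1, 0, 0, 1, 1, 1, 0, 0, 1)    # Figure 6
    candidate = (1, 1, 1, 1, 0, 0, 0, 0, 1)   # Figure 7
    print(to_matrix(standard, 3))
    print(image_fitness(candidate, standard))  # 5 equal genes
```

Each gene contributes to the fitness independently of all others, which is why this experiment behaves as a value advantage problem.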
The iterative process of GA can be formed using the above coding and fitness calculation. Figure 8 shows the image corresponding to the optimal individual of each generation during the iterations. It can be seen that the contour of the Mona Lisa image is formed after about 7000 generations, a clear image is formed after 60000 generations, and a stable, clear image of the Mona Lisa is formed after about 100000 generations. It can be concluded that image evolution has an obvious convergence feature if an existing image can be employed as the comparison standard.
The two experiments above are typical value advantage problems with the quantitative feature, for which the standard GA is shown to converge well.

Experiments of the Relationship Advantage Problem with Complete Quantization Feature
Experiment three takes the expanded Figure 2 as the case.
To keep these experiments at a similar complexity, the state range of each switch in Figure 2 is expanded. In the optimization model, the variables' feasible domain is shown in Table 2.
Three individuals are listed in Table 2, only one of which is the optimal solution. Although the other two individuals have different gene values, they have the same objective value of 0; namely, the objective value does not change immediately with the individual. This is a relationship advantage problem with the complete quantization feature.
Experimental scheme is as follows.
Steps 1, 2, 3, and 5 are the same as in experiment one; only Step 4 which involves fitness calculation and selection operation is different.
Step 4. The current corresponding to an individual is taken as the individual's fitness. We choose 600 individuals whose fitness does not equal 0. If fewer than 600 individuals have nonzero fitness, we select randomly among the rest of the individuals to make up the number, and if more than 600 individuals have nonzero fitness, we randomly select 600 of them. Thus the selection operation is realized and forms an updated population with higher fitness.
The number of optimal individuals in each generation over 100000 iterations is shown in Figure 9. It can be seen that the number of optimal individuals in the population stays between 10 and 30, and in 100000 iterations the optimal properties do not spread in the population. During the search, the optimal individuals account for no more than about 5% of the population. Therefore, the searching process is not convergent.
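The selection rule of Step 4 can be sketched as follows. The function name and signature are hypothetical, and the fitness function is passed in as a parameter because the circuit model of experiment three is not reproduced here.

```python
import random


def select_nonzero_first(expanded, fitness, size=600, rng=random):
    """Keep `size` individuals, preferring those with nonzero fitness."""
    nonzero = [ind for ind in expanded if fitness(ind) != 0]
    if len(nonzero) >= size:
        # More than enough fit individuals: sample `size` of them at random.
        return rng.sample(nonzero, size)
    # Too few: keep all of them and fill up randomly from the rest.
    rest = [ind for ind in expanded if fitness(ind) == 0]
    return nonzero + rng.sample(rest, size - len(nonzero))


if __name__ == "__main__":
    all_equal = lambda ind: 1 if len(set(ind)) == 1 else 0
    pool = [(0, 0), (1, 1), (0, 1), (1, 0), (0, 1), (1, 0)]
    print(select_nonzero_first(pool, all_equal, size=4, rng=random.Random(1)))
```

With a 0 or 1 fitness, every individual that passes the filter is already optimal, so selection can only preserve optimal individuals; it cannot rank the remaining ones by closeness to the optimum.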
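In order to show the convergence property of the complete quantization relationship advantage problem more clearly, experiment four is designed with two design variables: the objective value is 1 when the two variables are equal; else it is 0.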
Compared with experiment three, experiment four is a simpler relationship advantage problem, since the chromosome expressing the design variables can be represented by only two genes. Using the same coding method and experimental process as in experiment three, the search process shown in Figure 11 is obtained. It can be seen that the number of optimal individuals is around 300, that is, around 50% of the population size. Even if the share of optimal individuals occasionally exceeds 50% of the population, this larger share cannot be kept stable. The entire search process shows no convergence feature.
In experiment four the feasible domain has only four types of individuals, namely, (0, 0), (1, 1), (0, 1), and (1, 0), and the first two are the optimal solutions. If each gene independently takes the value 0 or 1 with probability 0.5, the probability that an individual is (0, 0) or (1, 1) is 50%; that is, the search process using GA behaves like random sampling, and there is no tendency to converge to the optimal solution. So, the search process is randomly divergent. The following analysis shows the reason for the divergence.
Assume that there are four individuals, $X_1 = (0, 0)$, $X_2 = (1, 1)$, $X_3 = (0, 1)$, and $X_4 = (1, 0)$. Obviously, $X_1$ and $X_2$ are the optimal individuals, and the fitness of $X_3$ and $X_4$ is 0. Taking $X_1$ and $X_2$ for the crossover operation, however, the possible newborn individuals are (0, 1) and (1, 0), whose fitness is 0. Taking $X_3$ and $X_4$ for the crossover operation, the likely newborn individuals are (0, 0) and (1, 1), whose fitness is 1. That is, an individual with a higher fitness has no higher probability of producing an offspring with higher fitness. This is the reason why relationship advantage cannot be kept and spread in a population and, consequently, why the search process is not convergent.
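The crossover argument can be verified by enumeration. The sketch below assumes single-point crossover between the two genes and parents drawn uniformly from each fitness class; the function names are illustrative.

```python
def fitness(ind):
    # Experiment four: fitness is 1 when the two genes are equal, else 0.
    return 1 if ind[0] == ind[1] else 0


def crossover(a, b):
    # Single-point crossover between the two genes (assumed operator).
    return (a[0], b[1]), (b[0], a[1])


def optimal_child_fraction(parents_a, parents_b):
    """Share of optimal children over all ordered parent pairs."""
    children = [c for a in parents_a for b in parents_b
                for c in crossover(a, b)]
    return sum(fitness(c) for c in children) / len(children)


if __name__ == "__main__":
    optimal = [(0, 0), (1, 1)]
    others = [(0, 1), (1, 0)]
    print(optimal_child_fraction(optimal, optimal))
    print(optimal_child_fraction(others, others))
    print(optimal_child_fraction(optimal, others))
```

Under these assumptions all three parent combinations yield the same share of optimal children, one half, so selecting fitter parents does not raise the expected fitness of the offspring.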

Experiments for the Relationship Advantage Problem with Partial Quantization Feature
Many relationship advantages have the complete quantization feature, but some have a significant partial quantization feature. To keep these experiments at a similar complexity, experiment five is designed according to Figure 3. Using GA to conduct 100000 iterations, the number of optimal individuals in the population, which never exceeds 30, changes during the iterations as shown in Figure 12. The optimal features show no trend to spread on a larger scale in the population, and the optimal individuals account for only around 5% of the population size. The iterative process shows no tendency to converge. A partial quantization relationship advantage cannot be kept in a population because a better individual has no greater probability of generating a better offspring.
That the Mona Lisa image can be obtained through GA seems to be a marvelous thing. But a detailed analysis of experiment two shows that it is still a value advantage experiment, in which individual fitness depends on the accumulated number of pixels equal to those of the comparison standard. An artistic image, however, is clearly a relationship advantage problem. When the Mona Lisa image is zoomed in or out, the pixel values at specific locations change, but the whole image is still an image of the Mona Lisa. That is to say, the image's artistic value does not depend on the color of a specific pixel but on the relationship among the positions and colors of pixels. Can a Mona Lisa image be obtained by GA without a comparison standard of the Mona Lisa? To address this question, a simpler image evolution experiment is designed, experiment six: the collinear experiment.
Taking a 60 * 60 black and white image as the instance, the objective of optimization is that at least one vertical black line appears in the image. The chromosome is expressed by 3600 genes, each of which takes the value 0 or 1 and corresponds to the color of one pixel. The row number of the pixel to which a gene corresponds equals the quotient of the gene's number in the chromosome divided by 60, plus 1, and the remainder is the column number of the pixel; for example, if the 705th gene equals 1, the pixel in the 12th row and 45th column is black. When all genes whose numbers share the same remainder equal 1, a vertical black line appears in the image and the individual is an optimal individual. Using GA to conduct the simulation experiment for this problem, after 200000 iterations, the optimal individual does not appear on a large scale; in fact, not a single optimal individual appeared among the total of 200000 * 600 individuals.
To further illustrate that relationship advantage cannot be maintained using GA, experiment seven is designed with a more relaxed initial condition. The optimization models of experiments seven and six are the same, but in experiment seven the 600 original individuals of the first generation are all set as optimal individuals; namely, the image corresponding to each individual has at least one vertical black line. The iterative process using GA is shown in Figure 13, where the abscissa indicates the generations of iteration and the ordinate indicates the number of optimal individuals in a population of 600 individuals. As shown in Figure 13, after 153 iterations all the optimal individuals in the population have disappeared, and no optimal individual appears in any subsequent iteration. Repeated experiments show that all the optimal individuals in the population disappear after around 150 iterations. Experiment seven indicates that GA does not converge for relationship advantage problems and also has no ability to maintain genetic advantages and characteristics in the search process. Thus, the evolutionary mechanism fails for relationship advantage.
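The gene-to-pixel mapping and the optimality test of the collinear experiment can be sketched as below. The function names are illustrative. The mapping uses a 1-based variant of the quotient and remainder rule that agrees with the 705th-gene example and, as an assumption, places gene numbers that are multiples of 60 in column 60.

```python
SIZE = 60  # the image is SIZE x SIZE pixels, stored row by row


def gene_to_pixel(n):
    """Map the 1-based gene number n to a 1-based (row, column) pair."""
    row, col = divmod(n - 1, SIZE)
    return row + 1, col + 1


def has_vertical_black_line(chromosome):
    """True if at least one column consists entirely of black (1) pixels."""
    return any(
        all(chromosome[r * SIZE + c] == 1 for r in range(SIZE))
        for c in range(SIZE)
    )


if __name__ == "__main__":
    print(gene_to_pixel(705))            # (12, 45)
    image = [0] * (SIZE * SIZE)
    for r in range(SIZE):
        image[r * SIZE + 44] = 1         # paint column 45 black
    print(has_vertical_black_line(image))
```

For genes drawn independently and uniformly at random, a given column is entirely black with probability 2^-60, and the fitness gives no credit to a column that is almost black, which is consistent with no optimal individual appearing in the experiment.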

Conclusions
Although the GA is one of the main ways to solve NP-hard problems in industrial applications, the convergence of GA used to solve NP-hard problems has so far never been rigorously proved, and in most cases the GA does not show good convergence characteristics. Considering the significantly different convergence behavior of GAs employed to solve different problems, this study differentiates two types of advantage sources: value advantage and relationship advantage. The quantitative feature, the complete quantization feature, and the partial quantization feature in fitness evaluation are also analyzed. Through the design of seven simulation experiments and the study of how high-fitness traits spread in the population, it is shown that the convergence of value and relationship advantages has the following features: (1) GA applied to value advantage problems converges well. (2) Relationship advantage can be divided into partial quantization relationship advantage and complete quantization relationship advantage, and the search is divergent for both when GA is used to solve the problem.
For the relationship advantage problem, from a purely practical view, a feasible or even satisfactory solution is likely to be obtained through large-scale search. Theoretically speaking, however, the search process is not convergent, and it cannot be ensured that the optimal solution will be obtained. So, reliability cannot be ensured when GA is used to solve relationship advantage problems.
Although this research reveals one of the root causes of the poor convergence of GA, the work has its limits. The convergence of GA has always been a vague question. Very little literature is devoted to the convergence analysis of GA; what exists concentrates on the convergence rate [23], employs the basic GA model [24], or addresses a special problem [25]. So far there is no systematic mathematical research on the convergence reliability of GA. Like other current research, this work shows the convergence property of GA by simulation rather than by a strict mathematical proof. Giving mathematical proofs of the different properties of the two types of advantages and finding a good approach to overcome the poor convergence for relationship advantage will be future work.

(2) Comparing with the standard GA to illustrate the superiority of the improved algorithm. (3) Comparing with related research to illustrate the superiority of the proposed method. (4) Using a large number of simulation experiments to show the reliability of the improved algorithm.

Figure 1: Value advantage problem of the circuit.

Figure 3: The example circuit with partial quantization feature.

Figure 4: Circuit example of value advantage problem.

Figure 5: One result of evolution process in experiment one.

Figure 7: An instance to calculate the image fitness.

Figure 8: The experiment results for the convergence property of "Mona Lisa."

Figure 11: The change of amount of optimal individuals in experiment four.
by expanding rheostat optional nodes to 16 covering integer values from 0 to 15. The optimization model is established as follows.

Figure 12: The change of amount of optimal individuals in experiment five.

Figure 13: The change of amount of optimal individuals in experiment seven.

Table 1: Experiment one: the definition domain for design variables.

Table 2: Feasible domain of experiment three.