The Effect of Entropy on the Performance of Modified Genetic Algorithm Using Earthquake and Wind Time Series

The dynamic complexity of time series of natural phenomena was used to improve the performance of a genetic algorithm in optimizing benchmark mathematical functions. The stochastically generated initial populations of the genetic algorithm were replaced with populations built from wind and earthquake time series. The determinism of the time series contributes more information to the search for the global optimum of the functions, which reduces computation time and improves the results. The information in the initial populations was measured with the Shannon entropy, which made it possible to establish the importance of the entropy of the initial populations and its relation to obtaining better results. This research establishes a new methodology that uses deterministic time series to improve the performance of optimization models based on genetic algorithms.


Introduction
Genetic algorithms (GA) have produced optimal solutions to engineering problems related to image processing, time series prediction, voice, language, and audio processing, and location models [1][2][3][4][5][6][7][8][9]. GA take their mutation, crossover, and selection operators from nature, which allows them to maximize the search for information in n-dimensional spaces. These algorithms use stochastic functions to recreate evolutionary models [10][11][12][13].
The authors of [14] developed a toolbox to optimize benchmark mathematical functions using evolutionary computation. This tool makes it possible to compare the optimization performance of bioinspired methods on multiple continuous convex functions. In [15,16], the optimization methods are improved through dynamic parameter adaptation using fuzzy logic. The dynamic adaptation of the algorithms improves their performance, delays premature convergence, and opens new search spaces. The best parameter configuration obtained with fuzzy logic yields better results than the original method.
Recent studies have established the need to incorporate chaotic determinism into the genetic algorithm. It is increasingly clear that evolution in nature is chaotic: populations are dynamic in size, and mutation, crossover, and selection are defined by the chaotic determinism of each species and its environment. This structural modification gave rise to the chaotic genetic algorithms (CGA), which use chaotic maps to generate the populations of individuals and have made it possible to address new applications in the search for the global optimum [17][18][19][20][21].
Natural phenomena such as earthquakes and winds [22][23][24][25][26][27], represented by time series, have attracted considerable prediction efforts. The production of electric power in wind farms and the capacity to predict telluric movements [28][29][30] to protect the population are the main lines of investigation. These time series have three properties: (1) they have nonlinear dynamic characteristics associated with chaotic systems; (2) their reconstructed phase space and its high fractal dimension indicate a very limited prediction capacity for earthquakes, which improves in the case of wind series; and (3) their behavior corresponds to chaotic determinism [31,32].
The motivation of this work is to use wind and earthquake time series to establish new GA-based methods for the optimization of engineering problems, exploiting the chaotic determinism of the series. The main objective of the research is to study the relation between the entropy of the initial computational solutions and the performance of the modified GA. The characteristics of the time series studied make it possible to generate better-quality initial solutions, which improve the results of the GA.
The research (1) establishes the chaotic characterization of the time series, (2) generates the initial populations of the genetic algorithm from the chaotically characterized time series, and (3) establishes the relation between the entropy of the initial populations generated by the different chaotic series and the performance of the optimization method.

Literature Review
The efficiency of GA is limited by premature convergence, which favors low-quality solutions, and by excessive search times for the global optimum. Recent studies have related the Lyapunov entropy to improved performance of metaheuristics [33]. Measuring the information during the run of the algorithm makes it possible to select populations with a higher probability of reaching better-quality solutions. The Shannon entropy is the expected value of the information content of a message [34,35]. Given the probability of each event, the entropy is calculated as

$$H(x) = -\sum_{i} p(x_i) \ln p(x_i), \qquad (1)$$

where H(x) is the entropy of information and p(x_i) is the probability of each event. For a physical system that can be decomposed into two statistically independent systems A and B, the Shannon entropy has the property of additivity:

$$H(A + B) = H(A) + H(B). \qquad (2)$$

Finally, for physical systems with long-range interactions, long memory times, and fractal structures, the entropy must be evaluated through a generalization of the Boltzmann-Gibbs-Shannon (BGS) statistics. Tsallis proposes the mathematical form

$$S_q = \frac{1 - \sum_{i} p(x_i)^q}{q - 1}, \qquad (3)$$

where q expresses the degree of nonextensivity of the entropy. The expression reduces to the Shannon entropy in the limit q → 1:

$$\lim_{q \to 1} S_q = -\sum_{i} p(x_i) \ln p(x_i) = H(x). \qquad (4)$$

Analyzing the entropy of the populations of feasible solutions makes it possible to calculate the level of information given by the set of solutions, which is directly related to the diversity of the solutions [36,37]. The entropy can also be used to obtain important information about the parameters of the algorithm. The greatest difficulty in GA is the tuning of the parameters, which must be set correctly to obtain good performance in the search space. Some parameters influence the results more than others, so the tuning effort should concentrate on the most critical parameters in order to reduce the run times or to increase the size of the problem that can be optimized.
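The entropy measures above can be illustrated with a minimal sketch. It assumes that a real-valued population is discretized with a fixed-width histogram before the probabilities are estimated; the source does not state how the probabilities p(x_i) are obtained, so the function `population_entropy`, its `bins` parameter, and the bounds `low` and `high` are hypothetical choices made only for illustration.

```python
import math
from collections import Counter

def shannon_entropy(symbols):
    """Shannon entropy (natural log) of a sequence of discrete symbols."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def tsallis_entropy(symbols, q):
    """Tsallis entropy S_q; uses the Shannon form when q is (numerically) 1."""
    if abs(q - 1.0) < 1e-12:
        return shannon_entropy(symbols)
    n = len(symbols)
    counts = Counter(symbols)
    return (1.0 - sum((c / n) ** q for c in counts.values())) / (q - 1.0)

def population_entropy(population, bins=10, low=0.0, high=1.0):
    """Entropy of a real-valued population after histogram binning.

    Assumption: genes lie in [low, high] and are binned with equal widths.
    """
    width = (high - low) / bins
    labels = [min(int((gene - low) / width), bins - 1)
              for individual in population for gene in individual]
    return shannon_entropy(labels)
```

For two independent samples, the entropy of the joint sample equals the sum of the individual entropies, and `tsallis_entropy` approaches `shannon_entropy` as q approaches 1.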

Proposed Method
The proposed method combines GA with deterministic time series and controls the new populations of solutions using the entropy information in each generation. The investigation uses nine continuous multimodal functions (Figure 1) to analyze the performance of the modified algorithm with respect to the traditional genetic algorithm. A multimodal function contains several local optima and one global optimum; these functions make it possible to evaluate the capacity of the algorithm to avoid converging to a local optimum. All the functions tested have a global minimum equal to zero. The proposed genetic algorithm (GAP) was validated by comparison with the GA [38].
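As an illustration of a multimodal benchmark with a global minimum of zero, the following sketch implements the Ackley function, which is one of the test functions named later in the text. The constants a = 20, b = 0.2, and c = 2π are the commonly used defaults and are an assumption here; the source does not list the constants or the definitions of the nine functions.

```python
import math

def ackley(x, a=20.0, b=0.2, c=2.0 * math.pi):
    """Ackley function; global minimum 0 at the origin (default constants assumed)."""
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(c * v) for v in x) / n
    return -a * math.exp(-b * math.sqrt(mean_sq)) - math.exp(mean_cos) + a + math.e
```

The many cosine ripples create local minima around the single global minimum, which is why this kind of function is useful for testing whether an algorithm escapes local optima.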
The GA uses random processes in the generation of the initial populations, in crossover, and in mutation. The proposed algorithm replaces the stochastic processes with the wind and earthquake time series. This modification makes it possible to study the effect of the dynamic characteristics of these series: ergodicity, irreversibility, nonlinearity, and high sensitivity to initial conditions.
3.1. Proposed Algorithm. Figure 2 shows the flowchart of the proposed algorithm and indicates which parts of the GA are modified and which are standard; the standard GA parts are shown in blue and the modified parts in green.
The GAP replaces the random generation of the initial population with a transformation of the time series that constructs the chromosomes of the population. Afterwards, each new population is controlled through the calculation of its entropy (1) and the subsequent adjustment.
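One possible form of the transformation from a time series to chromosomes is sketched below. The source does not specify the transformation, so the min-max rescaling into the search bounds and the slicing of consecutive samples into chromosomes are assumptions; the name `series_to_population` and its parameters are hypothetical.

```python
def series_to_population(series, pop_size, n_genes, low, high):
    """Build an initial population from consecutive samples of a time series.

    Assumption: samples are min-max rescaled into [low, high] and cut into
    chromosomes of n_genes values each.
    """
    needed = pop_size * n_genes
    if len(series) < needed:
        raise ValueError("time series too short for the requested population")
    window = series[:needed]
    s_min, s_max = min(window), max(window)
    span = (s_max - s_min) or 1.0  # avoid division by zero for a flat series
    scaled = [low + (high - low) * (v - s_min) / span for v in window]
    return [scaled[i * n_genes:(i + 1) * n_genes] for i in range(pop_size)]
```

For example, `series_to_population(wind_speeds, 100, 2, -5.0, 5.0)` would yield 100 two-gene chromosomes, where `wind_speeds` stands for any sufficiently long list of measurements.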
Each algorithm is executed 50 times. When the stopping criteria of the simulation are met, that is, when the maximum number of generations or a low population entropy is reached, the averages, the standard deviation of the results [39], and the success rate or performance [40,41] are obtained (5).

$$P = 100 \cdot \frac{\text{optimal results}}{\text{total of simulations}} \qquad (5)$$
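The success rate (5) and the summary statistics can be computed as in the following sketch. The tolerance `tol` that decides when a run counts as having reached the optimum is an assumption; the source does not state the threshold used.

```python
import statistics

def success_rate(results, optimum=0.0, tol=1e-6):
    """Percentage of runs whose final result is within tol of the optimum (tol assumed)."""
    hits = sum(1 for r in results if abs(r - optimum) <= tol)
    return 100.0 * hits / len(results)

def summarize(results):
    """Mean and sample standard deviation of the final results of repeated runs."""
    return statistics.mean(results), statistics.stdev(results)
```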
To compare the results of the experiments obtained with the three metaheuristics (GA, GAP winds, and GAP earthquake), it is necessary to use a nonparametric statistical test that provides a rigorous comparison among algorithms [42][43][44][45][46]. The Wilcoxon nonparametric test compares the differences between the results of two algorithms and takes the equality of the medians as the null hypothesis.
The procedure for comparing two paired quantitative variables is to (1) calculate the differences between the results, (2) order the absolute values of the differences from lowest to highest, and (3) ignore the null differences and rank the remaining ones. Then R+ is calculated as the sum of the ranks where the first algorithm had better fitness than the second, and R− as the sum of the ranks for the opposite situation. The statistic T is the smaller of the two sums, T = min(R+, R−). Finally, the p value is calculated; a p value greater than 0.05 means that the null hypothesis cannot be rejected.
The main advantage of the Wilcoxon test over a t-test is that exceptionally good or bad performances in a few experiments have a smaller effect on the statistic during the comparison of the algorithms.
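The ranking procedure described above can be sketched as follows. This is a minimal illustration of the rank sums only, not of the p value; tied absolute differences receive their average rank, which is the usual convention but is not stated in the source. Here R+ collects the ranks of the positive differences a − b, so which algorithm "wins" depends on whether the fitness is minimized or maximized. In practice a library routine such as `scipy.stats.wilcoxon` would normally be used to obtain the p value.

```python
def wilcoxon_rank_sums(a, b):
    """Return (R_plus, R_minus, T) for paired samples a and b."""
    # Steps 1 and 3: paired differences, ignoring null differences.
    diffs = [x - y for x, y in zip(a, b) if x != y]
    # Step 2: order by absolute value.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0  # ties share the average 1-based rank
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    r_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    r_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return r_plus, r_minus, min(r_plus, r_minus)
```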
Finally, analyzing the evolution of the entropy of the populations of solutions and the entropy of the initial populations made it possible to study the effect of the Shannon entropy on the performance of the modified algorithms. In particular, the relation between the entropy of the initial populations and performance is analyzed through a density plot. The density plot displays the fitness of the solutions for all the test functions and both time series. The experiment required 45,000 simulations per time series.

3.2. Pseudocode for the Proposed Algorithm. The GAP searches for the global optimum of the multimodal functions as GAP(A, B, C), where A transforms the time series into the chromosomes of the individuals of the initial generation, B evolves the initial population provided by A through the GA, and C delivers the results when the stopping criterion is satisfied. The results delivered are the entropy of the populations and the fitness of the solutions (Pseudocode 1).
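The structure GAP(A, B, C) can be sketched as a short program. This is an illustrative sketch under several assumptions and not the authors' implementation: the transformation A uses min-max rescaling, step B uses tournament selection, single-point crossover, uniform mutation from a seeded pseudorandom generator, and elitism, and the entropy is estimated with a histogram. The function names, the sphere test function, and every default parameter value are hypothetical.

```python
import math
import random
from collections import Counter

def sphere(x):
    """Simple test function with global minimum 0 (stand-in benchmark)."""
    return sum(v * v for v in x)

def transform(series, pop_size, n_genes, low, high):
    """Step A (assumed form): rescale the series and cut it into chromosomes."""
    window = series[:pop_size * n_genes]
    s_min, s_max = min(window), max(window)
    span = (s_max - s_min) or 1.0
    scaled = [low + (high - low) * (v - s_min) / span for v in window]
    return [scaled[i * n_genes:(i + 1) * n_genes] for i in range(pop_size)]

def entropy(pop, low, high, bins=20):
    """Shannon entropy of the binned genes of a population (binning assumed)."""
    width = (high - low) / bins
    labels = [min(int((g - low) / width), bins - 1) for ind in pop for g in ind]
    n = len(labels)
    return -sum(c / n * math.log(c / n) for c in Counter(labels).values())

def evolve(pop, fitness, low, high, rng, p_mut=0.05):
    """Step B, one generation: tournament selection, crossover, mutation."""
    def pick():
        a, b = rng.sample(pop, 2)
        return a if fitness(a) <= fitness(b) else b
    children = [min(pop, key=fitness)[:]]  # elitism keeps the best individual
    while len(children) < len(pop):
        p1, p2 = pick(), pick()
        cut = rng.randrange(1, len(p1))  # single-point crossover
        child = p1[:cut] + p2[cut:]
        child = [rng.uniform(low, high) if rng.random() < p_mut else g for g in child]
        children.append(child)
    return children

def gap(series, fitness, pop_size=20, n_genes=2, low=-5.0, high=5.0,
        max_gen=50, min_entropy=0.1, seed=0):
    """GAP(A, B, C): returns best individual, its fitness, and entropy history."""
    rng = random.Random(seed)
    pop = transform(series, pop_size, n_genes, low, high)   # A
    history = [entropy(pop, low, high)]
    for _ in range(max_gen):                                # B
        if history[-1] < min_entropy:                       # low-entropy stop
            break
        pop = evolve(pop, fitness, low, high, rng)
        history.append(entropy(pop, low, high))
    best = min(pop, key=fitness)
    return best, fitness(best), history                     # C
```

Because of elitism, the best fitness never gets worse from one generation to the next, so the entropy history can be read alongside a monotone fitness curve.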
The parameters used in the investigation are presented in Table 1. Empirical studies by [47] have established that the selection of the parameters depends on the problem to be solved and on the particular structure of the GA. This author considers a population size of 100 to 200 individuals adequate for studies with GA; larger populations decrease the speed of convergence and increase the resolution times. For very small populations, the increase in performance between cycles will be low. According to [38], finding an appropriate population size is very important; an oversized or undersized population generates low-quality final solutions.
Regarding the mutation rate, low rates may fail to explore the whole search space, whereas very high rates discard well-performing candidates and replace them with lower-quality solutions. The population size and the mutation rate must be calibrated against the increase in performance of the fitness function between cycles [48].
The implemented algorithm uses single-point crossover; this technique crosses the parents at only one point to create the child solutions, and its effect on the performance of the algorithm is directly related to the solution space [49].
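Single-point crossover can be illustrated with a short sketch. The cut point is drawn here from a pseudorandom generator; in the proposed method the stochastic source would be replaced by values derived from the time series, a detail that this sketch does not attempt to reproduce.

```python
import random

def single_point_crossover(parent_a, parent_b, rng=random):
    """Cross two equal-length parents at one cut point; returns two children."""
    if len(parent_a) != len(parent_b) or len(parent_a) < 2:
        raise ValueError("parents must have equal length of at least 2")
    cut = rng.randrange(1, len(parent_a))  # cut strictly inside the chromosome
    child_1 = parent_a[:cut] + parent_b[cut:]
    child_2 = parent_b[:cut] + parent_a[cut:]
    return child_1, child_2
```

Each child keeps the head of one parent and the tail of the other, so the genes of both parents are preserved across the pair of children.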
Finally, the generation parameter is related to the asymptotic convergence to the ideal solution. Such convergence can be visualized by plotting fitness against generation (Figures 3-5). In general, the stopping of the cycles of the algorithm can be set by processing time and/or number of generations.

Hypothesis of the Proposed Algorithm.
From a theoretical point of view, GAP is capable of approaching the global optimum of optimization problems for the following reasons:
(i) The exploration of the search space is guaranteed by the crossover of initial populations with deterministic characteristics.
(ii) The sensitivity of the time series to initial conditions makes it possible to modify the exploration in each experiment.
(iii) Monitoring the Shannon entropy ensures the diversity of the population and establishes a stopping condition for very small entropies.
(iv) The deterministic characteristics of the species are better represented by deterministic time series of natural phenomena than by stochastic functions. The species reproduce chaotically, so the new population is chaotic [38,47].
(v) The parameters of the algorithm have different impacts on the optimization results; the algorithm has few parameters to adjust.

Analysis of Time Series.
Chaotic systems are abundant in nature and can be represented by time series. Analytic tools make it possible to study them without describing their nonlinear dynamic equations. Chaos is characterized by unpredictability and sensitivity to initial conditions, and its orbits form a compact region around a strange attractor [48,49].
The maximum Lyapunov exponent (6) makes it possible to corroborate that a series is chaotic. A positive Lyapunov exponent is a strong sign of chaos [50,51]. In its standard form it is expressed as follows:

$$\lambda_{\max} = \lim_{t \to \infty} \frac{1}{t} \ln \frac{\lVert \delta(t) \rVert}{\lVert \delta(0) \rVert}, \qquad (6)$$

where δ(0) is an infinitesimally small initial separation between two trajectories and δ(t) is their separation at time t.
The Fourier spectrum makes it possible to check the periodicity of a time series by transforming the series into a frequency spectrum. The frequency spectra of both series studied show nonperiodic characteristics associated with chaotic determinism. The spectral lines are sharp and the scanning is continuous. The wind time series, its Fourier power spectrum, and its spectrogram are shown in Figures 6-8. The reconstruction of the phase space makes it possible to describe the chaotic attractor. The delay-coordinate method of Takens constructs a vector u(t) of m components (7).
$$u(t) = \big(x(t),\, x(t + \tau),\, \ldots,\, x(t + (m - 1)\tau)\big), \qquad (7)$$

where x(t) is the observed series, t = t_0 + kΔt, τ is the time delay, an integer multiple of Δt, and m is the embedding dimension. Both parameters are essential for the reconstructed vector u(t) to represent the true trajectory of the attractor. The time series studied could be reconstructed. With the graphical method called the recurrence plot (Figure 9), it is possible to confirm the chaotic determinism of the wind and earthquake series through patterns that are not homogeneously distributed [52][53][54].
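The delay-coordinate reconstruction and the recurrence plot can be sketched in a few lines. The delay is expressed in samples (`tau`), and the recurrence threshold `eps` is a free parameter whose value is an assumption; the source does not report the embedding dimension, delay, or threshold used for the wind and earthquake series.

```python
import math

def delay_embed(series, m, tau):
    """Takens delay vectors u_i = (x_i, x_{i+tau}, ..., x_{i+(m-1)tau})."""
    n_vectors = len(series) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this m and tau")
    return [[series[i + j * tau] for j in range(m)] for i in range(n_vectors)]

def recurrence_matrix(vectors, eps):
    """Binary matrix: entry (i, j) is 1 when vectors i and j are within eps."""
    n = len(vectors)
    return [[1 if math.dist(vectors[i], vectors[j]) <= eps else 0
             for j in range(n)] for i in range(n)]
```

Plotting the matrix as an image gives the recurrence plot; uniformly scattered points suggest noise, whereas diagonal structures and inhomogeneous patterns suggest deterministic dynamics.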
Finally, the analysis of the time series with (1) the Fourier power spectrum, (2) the spectrogram, (3) the recurrence plot, and (4) a positive Lyapunov exponent characterizes them as deterministic time series associated with a dynamic system.

Experimental Results
The experimental results of the two GAP variants were compared with those of the GA. The results show the good performance of the algorithms modified with the chaotic series with respect to the traditional algorithm (Table 2). The Wilcoxon nonparametric test is used to verify this improvement. GAP earthquake shows a significant improvement over GA at a significance level of alpha = 0.01 and over GAP winds at alpha = 0.05. GAP winds shows a significant improvement over GA at a significance level of alpha = 0.05, with the nine test functions in each case.

Complexity
The evolution of the results of each algorithm when optimizing the nine benchmark functions is presented in Figures 3-5. The plots show the average performance of each group of experiments. The experiments using the GAP earthquake algorithm are drawn with long-dashed lines, those using GAP winds with continuous lines, and those using the GA with short-dashed lines.
The degree of clustering or disorder of the simulation results can be shown using the concept of entropy and a scatter density plot. Figure 10 represents the relation between entropy and fitness for GAP with the earthquake time series when optimizing Ackley's continuous multimodal benchmark function.
For the two modified algorithms, 9,000 optimizations were simulated for each test function. Each optimization calculated the Shannon entropy of the initial population. The vector (function, entropy, fitness) makes it possible to plot the density function. Figure 11 shows the results of the simulations for the GAP modified with the wind time series. The plot confirms the positive correlation between the entropy of the initial populations and the performance of the optimization solutions. An increase in the entropy of the initial population increases the probability of obtaining a solution with good fitness.
Chaotic maps have been used to improve the performance of GA. Chaotic dynamic systems and their characteristics can be represented by these maps. An infinitesimal modification of the parameters of these maps generates time series that diverge exponentially. The main chaotic maps used in CGA optimization research are the Lorenz, Rössler, Ikeda, Hénon, and logistic maps.
Authors such as [55] employ the logistic map to generate chaotic values instead of random values in the GA processes. In our case study, the GA modified with the logistic map (CGA) behaves similarly to the GAP modified with the earthquake and wind time series. The CGA exhibits a concentration of good optimization results related to the increase of the entropy of the initial populations. Figure 12 presents the improvement of the CGA with respect to the GA. To ensure the replicability of the experiment, 45,000 simulations were run for each algorithm.
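A minimal sketch of the logistic map as a generator of chaotic values is given below, together with a numerical estimate of its Lyapunov exponent. The control parameter r = 4 and the initial value x0 = 0.3 are assumptions chosen for illustration; the source does not report the values used in the CGA. For r = 4 the exponent is known analytically to be ln 2, so a positive estimate close to that value is expected.

```python
import math

def logistic_series(n, x0=0.3, r=4.0):
    """Generate n values of the logistic map x -> r * x * (1 - x) (r, x0 assumed)."""
    values, x = [], x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values

def logistic_lyapunov(n=10000, x0=0.3, r=4.0, burn_in=100):
    """Estimate the Lyapunov exponent as the orbit average of ln|f'(x)|."""
    x = x0
    for _ in range(burn_in):           # discard the transient
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n):
        total += math.log(abs(r * (1.0 - 2.0 * x)))  # f'(x) = r * (1 - 2x)
        x = r * x * (1.0 - x)
    return total / n
```

The values returned by `logistic_series` lie in the unit interval and can be rescaled to the search bounds in the same way as a measured time series.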

Conclusions
The proposed modification of the CGA through deterministic time series yields a simple optimization method with good performance. The time series are studied to verify their nonlinear dynamic characteristics associated with chaotic systems. The series are studied with analytic tools, without the need to describe their nonlinear dynamic equations. The tools are the maximum Lyapunov exponent, the Fourier spectrum, and the reconstruction of the phase space to describe the chaotic attractor.
The chaotic characteristics of the wind and earthquake time series are established, and the series are used to build the initial populations of the chaotic genetic algorithm. The information contained in the initial populations is calculated with the Shannon entropy. These initial populations contain greater entropy of information than those generated by the random function of the GA.
The analysis of the Shannon entropy makes it possible to study the effect of the entropy on the performance of the CGA. The research establishes a positive correlation between the entropy of the initial populations and the performance of the solution. The results obtained with the earthquake time series are better than those obtained with the wind velocity time series; these results are correlated with the greater entropies of the initial populations generated by the earthquake series. Likewise, the dynamic adjustment establishes minimum levels of entropy for the initial population and discards populations with very low entropy, which reduces the processing times of the experiments.
The results establish an improvement of the algorithms built with these series over a genetic algorithm built with random functions. The density plots, shown in Figure 11, relate the fitness of each benchmark function to the entropy variable. Likewise, the Wilcoxon nonparametric statistical method was used to compare the performance of the modified GA with that of the traditional GA. The improvement of the results is explained by the reduction of early convergence of the solutions and the greater search information contained in the initial populations. Figures 3-5 show superior asymptotes in all the results with respect to the GA, explained by the reduction of the convergence and the increase of the fitness of the initial populations of the modified GA. The experimental results on nine benchmark mathematical functions demonstrate that the proposed CGA outperforms the GA optimization method. Although the entropies are measured with different complexity metrics, the results of this research agree with the study of Liu and Abraham. These authors use the Lyapunov complexity metrics to establish a relation between entropy and the optimization results of a swarm intelligence metaheuristic. Both studies found a direct relation between entropy and performance. The proposed method makes it possible, in future investigations, to study the performance of the modified genetic algorithm with other deterministic time series. Likewise, the chaotic deterministic characteristics of the series studied would make it possible to modify other bioinspired optimization methods to obtain better performance in experiments with benchmark mathematical functions.

Figure 2: Flowchart of the proposed GA algorithm.

Figure 3: Evolution of the results in each generation with test functions F1, F2, and F3 using GA, GAP winds, and GAP earthquake.

Figure 4: Evolution of the results in each generation with test functions F4, F5, and F6 using GA, GAP winds, and GAP earthquake.

Figure 5: Evolution of the results in each generation with test functions F7, F8, and F9 using GA, GAP winds, and GAP earthquake.

Figure 9: Graphic recurrence of the wind series.

Figure 10: Entropy scatter density plot of earthquake time series.

Figure 12: Performance of the CGA with respect to GA.

Table 1: Parameters used in GAP.

Pseudocode 1:
Start the first population with the transformation function.
Calculate the entropy of the population.
While the stopping criterion is not satisfied:
    For each individual:
        Use the selection criteria of the new population.
    Calculate the new entropy of the population and the adjustment of all the new population.

Table 2: Wilcoxon signed-rank test results. GAP earthquake shows an improvement over GAP winds at a significance level of alpha = 0.05 and over GA at alpha = 0.01. GAP winds shows an improvement over GA at a significance level of alpha = 0.05.