Differential Cloud Particles Evolution Algorithm Based on Data-Driven Mechanism for Applications of ANN

Computational scientists have designed many useful algorithms by exploring biological processes or imitating natural evolution. These algorithms can be used to solve engineering optimization problems. Inspired by the change of matter state, we propose a novel optimization algorithm called differential cloud particles evolution algorithm based on data-driven mechanism (CPDD). In the proposed algorithm, the optimization process is divided into two stages, namely, the fluid stage and the solid stage. In the fluid stage, the algorithm integrates global exploration with local exploitation. In the solid stage, it mainly carries out local exploitation. The quality of the solution and the efficiency of the search are influenced greatly by the control parameters. Therefore, the data-driven mechanism is designed to obtain better control parameters and ensure good performance on numerical benchmark problems. In order to verify the effectiveness of CPDD, numerical experiments are carried out on all the CEC2014 contest benchmark functions. Finally, two application problems of artificial neural networks are examined. The experimental results show that CPDD is competitive with eight other state-of-the-art intelligent optimization algorithms.


Introduction
Optimization problems in engineering are often very complex and difficult to solve. Early on, many deterministic algorithms based on mathematical programming theory were used for engineering optimization problems. They obtain good results for relatively simple and ideal models. Unfortunately, these deterministic algorithms show poor performance on complex real-world problems. Inspired by natural evolution, more and more researchers are interested in developing nature-inspired algorithms by exploring natural phenomena [1]. These natural phenomena mainly include the biological evolutionary process, animal behavior, and physical phenomena. Nature-inspired algorithms can solve difficult design and optimization problems by building solutions that are more fit relative to desired properties [2]. Nature-inspired algorithms have been successful in optimization, design, constrained, large-scale, multiobjective, clustering, and forecasting problems [3][4][5][6][7][8][9].
Evolutionary algorithms (EAs), such as Genetic Algorithm (GA) [10], Differential Evolution (DE) [11], and Derandomized Evolution Strategy with Covariance Matrix Adaptation (CMA-ES) [12], are inspired by the biological evolutionary process. GA, proposed by Fraser and popularized by Holland, has been widely studied. It solves optimization problems by simulating Darwinian evolution concepts, such as crossover, mutation, and selection. DE, proposed by Storn and Price, is a simple yet powerful population-based evolutionary algorithm. CMA-ES, proposed by Hansen and Ostermeier, adapts the complete covariance matrix of the normal mutation distribution. Swarm intelligence (SI) algorithms, such as Particle Swarm Optimization (PSO) [13,14], Artificial Bee Colony (ABC) [15], Teaching-Learning-Based Optimization (TLBO) [16,17], and Jaya (a Sanskrit word meaning victory) algorithm [18], are inspired by various kinds of animal behavior. PSO explores the search space according to pbest and gbest, which are the past best position and the global best position achieved by particles, respectively. ABC, proposed by Karaboga, simulates the foraging behavior of the honeybee swarm and has been applied to solve many engineering optimization problems [19,20]. The TLBO method, proposed by Rao, is based on the influence of a teacher on the output of learners in a class [16]. In order to reduce the complexity of the algorithm, Rao [18] proposed the Jaya algorithm, which uses one phase instead of the two phases of the TLBO algorithm. The Jaya algorithm tries to move closer to the best solution and away from the worst solution. Other heuristic methods, such as biogeography-based optimization (BBO) [21], Simulated Annealing (SA) [22,23], Chemical Reaction Optimization (CRO) [24], and Brain Storm Optimization (BSO) [25], simulate physical phenomena or rules.
BBO, proposed by Simon in 2008, is a metaheuristic algorithm. In BBO, mathematical models are used to describe the evolution process of species, such as migration, mutation, and distribution of species. A solution is regarded as an island (habitat) with a habitat suitability index (HSI). Islands with a high HSI are well suited for species, and vice versa. Suitability index variables (SIV), which refer to the features correlated with HSI, and HSI are considered as the search space and objective function, respectively [26]. SA is a heuristic algorithm based on an analogy with the way metals cool and anneal in thermodynamics [27]. CRO is a chemical-reaction-inspired metaheuristic which mimics the characteristics of chemical reactions in solving optimization problems [24]. BSO mimics the brainstorming process in which a group of people solves a problem together [28].
The balance between exploration and exploitation is the major factor that influences the performance of an evolutionary algorithm. Exploration helps to find new potential solutions and improve the convergence rate of the algorithm. Exploitation helps to improve the quality of the solutions found so far. However, overexploration may lead to slow convergence, and overexploitation may increase the risk of premature convergence. Therefore, numerous ideas have been proposed to balance the exploitation and exploration search processes of EAs [29]. For example, a large scaling factor in DE is required at the early stage of the evolution to ensure strong exploration capability, while a small one is preferred at the later stage to improve exploitation capability [30]. Therefore, many improved DE variants, such as adaptive differential evolution with optional external archive (JADE) [31], self-adaptive control parameters in differential evolution (jDE) [32], a self-adaptive DE (SaDE) algorithm [33], and a composite DE (CoDE) algorithm [34], have been proposed to improve the relationship between exploration and exploitation through proper settings of the control parameters [35][36][37][38][39]. For PSO, many modified variants, including those that adjust the inertia weight and the acceleration coefficients, have been proposed to enhance PSO's exploration capability and alleviate the premature convergence problem [40][41][42][43][44][45][46].
Designing suitable evolutionary strategies and control parameters is important to realize a good balance between exploration and exploitation. In this paper, we propose a novel nature-inspired algorithm called differential cloud particles evolution algorithm based on data-driven mechanism (CPDD) for solving global optimization problems. The CPDD algorithm simulates the change of matter state and the cloud transformation process. The optimization process is divided into two stages, namely, the fluid stage and the solid stage. In the fluid stage, the CPDD algorithm integrates global exploration with local exploitation. In the solid stage, it mainly carries out local exploitation. The data-driven mechanism is designed to obtain better control parameters and keep a better balance between exploration and exploitation.
The rest of the paper is organized as follows. Section 2 introduces the proposed CPDD and the concepts behind it in detail. In Section 3, the performance of the proposed CPDD is validated on different optimization problems. Section 4 shows the applications to training artificial neural networks. Finally, the conclusions and ideas for future research are presented in Section 5.

Differential Cloud Particles Evolution Algorithm Based on Data-Driven Mechanism

Algorithm Background.
The idea of the differential cloud particles evolution algorithm based on data-driven mechanism is inspired by the phenomena of matter state transition and cloud formation. The principles of both are explained as follows. Substances commonly exist in three states: gaseous, liquid, and solid. For example, heat evaporates water into steam, while low temperature turns water into ice. Similarly, three states are involved in the formation and change of clouds. Water evaporates into vapor. Water vapor condenses to form clouds. The clouds transform into snow or hailstones with decreasing temperature. Therefore, water vapor, cloud, and snow can be regarded as gaseous, liquid, and solid, respectively. The cloud transformation process and matter state transition are illustrated in Figure 1. As a result, matter shows different states at different temperatures.
CPDD loosely simulates the cloud transformation process and matter state transition. Because both gas and liquid have fluidity, they are treated together, so two types of operation are implemented in CPDD, namely, the fluid stage operation and the solid stage operation. We use the term "phase transition" to describe the transition between the fluid stage and the solid stage. A phase transition is the transformation of a thermodynamic system from one phase or state of matter to another by heat transfer.

The Proposed CPDD.
Similar to other metaheuristic algorithms, the proposed algorithm begins with an initial population called the cloud particles. Each cloud particle in the CPDD algorithm represents a solution in the population. The population size plays a role analogous to temperature in the real world. Liquefaction refers to the process in which gas is transformed into liquid. Solidification is a phase transition in which liquid is turned into solid. Both liquefaction and solidification give off heat, so the temperature gradually decreases during these exothermic processes. In order to improve the performance of the algorithm, EAs should start with exploration and then gradually change to exploitation [47]. Therefore, the CPDD algorithm has a larger population at the beginning of the evolution to ensure strong exploration ability. During evolution, the population size gradually decreases and the exploitation ability gradually improves. In the later evolution, the CPDD algorithm has a smaller population to encourage stronger exploitation. The change of the population size is similar to the change of temperature; namely, matter has a high temperature in the gas state and a low temperature in the solid state.
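The shrinking population can be illustrated with a short sketch. The text states only that the population decreases during evolution and eventually reaches four particles; the linear dependence on the number of function evaluations, the initial size of 100, and the name `population_size` are assumptions made here for illustration, not the schedule used by CPDD.

```python
def population_size(fes, max_fes, np_init=100, np_min=4):
    """Population size after `fes` evaluations under an ASSUMED linear schedule."""
    if not 0 <= fes <= max_fes:
        raise ValueError("fes must lie in [0, max_fes]")
    frac = fes / max_fes
    # Interpolate between the initial and the minimum size, never going below np_min.
    return max(np_min, round(np_init - (np_init - np_min) * frac))


if __name__ == "__main__":
    for fes in (0, 25_000, 50_000, 75_000, 100_000):
        print(fes, population_size(fes, 100_000))
```

Any monotonically decreasing rule would serve the same purpose; the linear one is chosen only because it is the simplest to read.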

Fluid Stage.
The fluid stage, which mimics the movement of fluid, includes the gaseous state and the liquid state. The initial state is the gaseous state, which is composed of many cloud particles. Each cloud particle has good fluidity and can move freely. Therefore, the algorithm shows strong exploration ability in the gaseous state. During the evolutionary process, the population size gradually decreases, which improves the local exploitation ability. In addition, the state of the population transforms from gaseous to liquid as the cloud particles move. The movement of cloud particles in the liquid state causes macrolocal exploitation. The movement of cloud particles in the fluid stage is given in Figure 2(a). Inspired by JADE, an improved search strategy is introduced in the fluid stage. Similar to other evolutionary algorithms for optimization problems, a CPDD population is represented as a set of real parameter vectors $x_i = (x_{i,1}, \ldots, x_{i,D})$, $i = 1, \ldots, NP$, where $D$ is the dimensionality of the optimization problem and $NP$ is the population size. In the fluid stage, the search strategy is based on DE/current-to-best with optional archive. In this strategy, $i \in \{1, \ldots, NP\}$; $r_1$, $r_2$, and $r_3$ are mutually different random integer indices selected from $\{1, \ldots, NP\}$; and $x^p_{best}$ is randomly chosen as one of the top $100p\%$ cloud particles in the current population, where $p$ corresponds to 15% of the population size. $x_{r_3}$ is selected from the union of the population and the archive. The archive is the set of archived inferior solutions and is used for maintaining diversity in JADE [34]. If the archive size exceeds 150% of the population size, some solutions are randomly removed from the archive so that newly generated cloud particles can be inserted. The mutation factor $F$, which differs from that in JADE, controls the speed of the algorithm and is generated at each generation by the data-driven mechanism introduced later in (4).
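For orientation, the sketch below implements the JADE mutation DE/current-to-pbest/1 with an optional archive, which is the strategy the fluid stage is said to build on. It is not the CPDD formula itself: the CPDD strategy uses three random indices, whereas the JADE form shown here uses two. The function names, the list-based data layout, and the default `p = 0.15` are assumptions for illustration; the 150% archive limit follows the text.

```python
import random


def current_to_pbest_mutation(pop, fitness, i, F, archive, p=0.15, rng=random):
    """JADE-style DE/current-to-pbest/1 mutant for individual i (minimization)."""
    n = len(pop)
    # x_pbest: drawn uniformly from the best 100p% individuals (at least one).
    n_top = max(1, int(round(p * n)))
    ranked = sorted(range(n), key=lambda k: fitness[k])
    pbest = pop[rng.choice(ranked[:n_top])]
    # r1: index in the population, different from i.
    r1 = rng.choice([k for k in range(n) if k != i])
    # r2: index in population + archive, different from i and r1.
    union = pop + archive
    r2 = rng.choice([k for k in range(len(union)) if k not in (i, r1)])
    x_i, x_r1, x_r2 = pop[i], pop[r1], union[r2]
    return [xi + F * (pb - xi) + F * (a - b)
            for xi, pb, a, b in zip(x_i, pbest, x_r1, x_r2)]


def trim_archive(archive, pop_size, factor=1.5, rng=random):
    """Randomly remove entries until the archive holds at most factor * pop_size."""
    limit = int(factor * pop_size)
    while len(archive) > limit:
        archive.pop(rng.randrange(len(archive)))
    return archive
```

The population must contain at least three individuals so that `i`, `r1`, and `r2` can differ.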

Solid Stage.
After global exploration and macrolocal exploitation in the fluid stage, the algorithm has found the potential area of the optimal solution. Then the phase transition is carried out; namely, the population transforms from fluid into solid. The cloud particles vibrate only in a small region and carry out microlocal exploitation. The process in which most of the cloud particles gather toward the optimal solution (microlocal exploitation) is analogous to solidification. Finally, most of the cloud particles gather at one position, that is, the location of the optimal solution. Figure 2(b) shows the movement of cloud particles in the solid stage. The search strategy used in the solid stage relies on the following quantities. Similar to the fluid stage, $x^p_{best}$ is randomly chosen as one of the top $100p\%$ cloud particles in the current population. $x_{r_2}$ is selected from the union of the population and the archive. $r_1$ and $r_2$ are mutually different random integer indices selected from $\{1, \ldots, NP\}$. $F$ is generated at each generation by the data-driven mechanism.
The population size is reduced from its initial value to four in order to balance exploration and exploitation. In the fluid stage, the algorithm carries out global exploration and macrolocal exploitation. In the solid stage, the algorithm performs microlocal exploitation. Consequently, the algorithm can balance exploration and exploitation through the transformation from the fluid stage to the solid stage.

Control Parameters Assignments.
In the CPDD algorithm, the parameter $F$ controls the diversity of the population. A higher value of $F$ increases the diversity of the cloud particles and enhances the convergence speed. On the contrary, a smaller value of $F$ results in premature convergence and a slow convergence rate. In addition, different problems need different parameter values at different evolution stages. Based on our previous research [48], in this paper $F$ is produced by a data-driven mechanism, described as follows. The initial $F$ is generated according to (4), in which $F_0$ is a parameter set to 0.068. In each generation, an $F$ value that succeeds in generating a trial vector better than the parent individual $x_i$ is preserved and recorded in $F_G$. The size of $F_G$ is recorded as $|F_G|$. If $|F_G|$ exceeds the current population size $NP$, randomly selected elements are deleted. If $|F_G|$ is less than the current population size $NP$, $(NP - |F_G|)$ new $F$ values are generated according to (4). Consequently, $F$ values that show better performance are preserved for the next generation. In other words, the $F$ values from the last generation drive the evolution of the new generation.
CR, the crossover rate, is another important control parameter in the CPDD algorithm. The initial CR follows a Gaussian distribution with mean 0.5 and standard deviation 0.1. Similar to the parameter $F$, CR values that have performed well in generation $G$ are preserved for the next generation. The preserved CR values are recorded as $CR_G$, and their number as $|CR_G|$. If $|CR_G| = 0$, CR is regenerated; in addition, the cloud particles transfer from the fluid stage to the solid stage when $CR_G$ is empty.
If $|CR_G|$ is less than the current population size $NP$, the $(NP - |CR_G|)$ new CR values are generated from the standard deviation of $CR_G$ and rand, a uniformly selected random number from [0, 1). If $|CR_G| > NP$, extra elements are randomly selected and deleted. Therefore, the next generation of CR is produced from $CR_G$, which showed better performance in the last generation.
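The bookkeeping shared by both control parameters can be sketched as follows: successful values are kept, a surplus is trimmed at random, and a shortfall is filled with newly generated values. The generating rules themselves are not reproduced in this sketch, so the generator is passed in as a callable; the name `next_parameter_pool` and the clipped Gaussian used in the example are assumptions for illustration only.

```python
import random


def next_parameter_pool(successful, pop_size, generate, rng=random):
    """Build next generation's parameter values from the successful ones."""
    pool = list(successful)
    # More survivors than particles: delete randomly selected elements.
    while len(pool) > pop_size:
        pool.pop(rng.randrange(len(pool)))
    # Fewer survivors than particles: top up with freshly generated values.
    while len(pool) < pop_size:
        pool.append(generate())
    return pool


if __name__ == "__main__":
    rng = random.Random(1)

    def gen_cr():
        # Placeholder generator: Gaussian with mean 0.5 and standard deviation 0.1,
        # clipped to [0, 1] (the clipping is an assumption).
        return min(1.0, max(0.0, rng.gauss(0.5, 0.1)))

    print(next_parameter_pool([0.42, 0.61], 5, gen_cr, rng))
```

An empty list of successful CR values would, according to the text, additionally trigger the transition from the fluid stage to the solid stage; that switch is left to the caller here.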
The pseudocode of CPDD is illustrated in Algorithm 1. Φ(0.5, 0.1) refers to a Gaussian distribution with mean 0.5 and standard deviation 0.1. Popsize is the current population size. FES stands for the number of function evaluations. maxFES stands for the maximum number of function evaluations.

Experiments and Discussions
The CEC2014 [49] benchmark problems are used to test the performance of CPDD. The CEC2014 benchmark set consists of 30 test problems. According to their shape characteristics, these benchmark problems can be broadly classified into four kinds of optimization problems [49]:
(i) Unimodal problems $f_1$ to $f_3$
(ii) Simple multimodal problems $f_4$ to $f_{16}$
(iii) Hybrid problems $f_{17}$ to $f_{22}$
(iv) Composite problems $f_{23}$ to $f_{30}$
For all of the problems, the search space is $[-100, 100]^D$. In this paper, the dimension ($D$) of all problems is set to 10 and 30.

Experimental Platform and Termination Criterion.
For all experiments, 30 independent runs are carried out on the same machine with a Celeron 3.40 GHz CPU, 4 GB of memory, and the Windows 7 operating system with Matlab R2009b. Each run is terminated after $D \times 10{,}000$ function evaluations (FES).

Performance Metrics.
In our experimental studies, the mean value ($F_{mean}$), standard deviation (SD), maximum value (Max), and minimum value (Min) of the solution error measure [50], which is defined as $f(x) - f(x^*)$, are recorded for evaluating the performance of each algorithm, where $f(x)$ is the best fitness value found by an algorithm in a run and $f(x^*)$ is the real global optimum value of the tested problem. In order to statistically compare the proposed algorithm with its peers, the $t$-test [16] at a 0.05 significance level is used to evaluate whether the median fitness values of two sets of obtained results are statistically different from each other. Three marks, "+," "−," and "≈," are also used to report the results clearly. They denote that the performance of CPDD is better than, worse than, and similar to that of the corresponding algorithm, respectively.
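The four summary statistics of the solution error can be computed as in the sketch below. The function name `error_summary` and the use of the sample standard deviation (rather than the population standard deviation) are assumptions; the text does not say which variant is reported.

```python
import statistics


def error_summary(best_values, f_star):
    """Mean, SD, max, and min of the solution error f(x) - f(x*) over runs."""
    errors = [f - f_star for f in best_values]
    return {
        "mean": statistics.mean(errors),
        # Sample standard deviation (assumed); undefined for a single run.
        "sd": statistics.stdev(errors) if len(errors) > 1 else 0.0,
        "max": max(errors),
        "min": min(errors),
    }


if __name__ == "__main__":
    # Three hypothetical runs on a problem whose global optimum value is 100.
    print(error_summary([101.0, 103.0, 102.0], 100.0))
```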

Comparison with Eight State-of-the-Art Optimization Algorithms on 10 and 30 Dimensions.
In this part, CPDD is compared with PSO, PSOcf (PSO with constriction factor) [44], TLBO, DE, JADE, CMA-ES, ABC, and BBO. Appropriate parameters are important for the performance of optimization algorithms. Therefore, the parameter settings of the different algorithms are given in the following.
Algorithm 1: CPDD algorithm.
[...] and 0.9, respectively. For PSO, the population size is set to 40, a linearly decreasing inertia weight from 0.9 to 0.4 is adopted over the course of the search, and the acceleration coefficients $c_1$ and $c_2$ are both set to 1.49445. For ABC, the colony size is set to 20, and the number of food sources is set to half of the colony size. For JADE, the population size is set to 100; $p = 0.05$ and $c = 0.1$. The parameters of the other algorithms are the same as those used in the corresponding references.
The statistical results, in terms of $F_{mean}$, SD, Max, and Min obtained in 30 independent runs by each algorithm, are reported in Tables 1 to 7.

Unimodal problems $f_1$ to $f_3$. From Table 1, we can see that CMA-ES and CPDD achieve the optimal solution in each run for unimodal problems $f_1$ to $f_3$ for 10 dimensions. CPDD performs better than other algorithms on $f_2$ and $f_3$ and achieves the second best performance on $f_1$ for 30 dimensions. The outstanding performance of CPDD may be due to the data-driven mechanism, which helps to obtain better control parameters.

Simple multimodal problems $f_4$ to $f_{16}$. From Table 2, we observe from the statistical results that CPDD is significantly better than other algorithms on $f_5$ to $f_7$, $f_9$, $f_{13}$, $f_{14}$, and $f_{16}$. ABC performs well on $f_4$; JADE performs well on $f_{10}$, $f_{11}$, and $f_{15}$. PSOcf performs well on $f_{12}$. CPDD and JADE achieve the optimal solution on $f_8$. Table 3 shows that CPDD obtains better solutions than other algorithms on $f_4$, $f_6$, $f_{12}$, and $f_{13}$. CMA-ES performs well on $f_5$. JADE performs well on $f_8$ to $f_{11}$, $f_{15}$, and $f_{16}$. ABC performs well on $f_{14}$. DE, JADE, and CPDD perform well and achieve similar solutions on $f_7$.

Hybrid problems $f_{17}$ to $f_{22}$. In the case of $f_{17}$ to $f_{22}$, we find from Tables 4 and 5 that CPDD achieves very competitive results. It beats PSO, PSOcf, TLBO, DE, JADE, CMA-ES, ABC, and BBO on these hybrid problems for 10 and 30 dimensions except $f_{17}$. DE performs well on $f_{17}$ for 10 dimensions. This may be due to the fact that the scheme of matter state change helps CPDD keep a better balance between exploration and exploitation.

Composite problems $f_{23}$ to $f_{30}$. From Tables 6 and 7, we find that these composite problems are very time-consuming for fitness evaluation compared to the others because they combine multiple test problems into a complex landscape. Therefore, it is extremely difficult for state-of-the-art optimization algorithms to obtain relatively ideal results. Table 6 shows that CPDD obtains the better solutions on [...].

According to the experimental results on thirty test problems from Table 8, we find that CPDD outperforms PSO, PSOcf, TLBO, DE, JADE, CMA-ES, ABC, and BBO on twenty-six, twenty-six, twenty-seven, twenty, eighteen, twenty-five, twenty-three, and twenty-five test problems, respectively, for 10 dimensions. CPDD outperforms PSO, PSOcf, TLBO, DE, JADE, CMA-ES, ABC, and BBO on twenty-eight, twenty-six, twenty-six, twenty, fourteen, twenty-two, twenty-one, and twenty-two test problems, respectively, for 30 dimensions. Moreover, Figures 3 and 4 display the convergence graphs of different benchmark problems in terms of the mean errors (in logarithmic scale) achieved by each of the nine algorithms on CEC2014 problems versus the number of FES for 10 and 30 dimensions.

In summary, the results suggest that CPDD beats PSO, PSOcf, TLBO, DE, JADE, CMA-ES, ABC, and BBO in 15 out of 30 benchmark problems for 10 dimensions. CPDD achieves better performance than the other eight algorithms in 14 out of 30 benchmark problems for 30 dimensions. The experimental results reveal that CPDD works well for most benchmark problems. This can be attributed to the data-driven mechanism and the phase transition mechanism used in CPDD. The data-driven mechanism uses the better control parameters found in the last generation to produce new control parameters for the next generation. The experimental results indicate that the control parameters obtained by the data-driven mechanism are appropriate for most benchmark problems and are helpful for finding better solutions. The cloud particles carry out the phase transition according to the extent of evolution, so the exploration ability and exploitation ability of the algorithm are dynamically adjusted by the phase transition mechanism. Therefore, the mechanism can both improve the convergence rate of the algorithm and decrease the risk of premature convergence.

The Real-World Optimization Problem
In this section, the proposed CPDD algorithm is applied to estimate parameters of a real-world problem. The artificial neural network trained by our CPDD algorithm is a three-layer feed-forward network which includes input units, hidden units, and output units. The basic structure of the proposed scheme is depicted in Figure 5.
In the three-layer feed-forward network, an input vector is mapped to an output vector, which is compared with a desired output vector of the same length to measure the difference between the network output and the real demand. The aim of neural network training is to find a set of weights with the smallest error measure. The objective function is the mean sum of squared errors (MSE) over all training patterns, computed from the number of training data sets, the number of output units, the desired output, and the output inferred from the neural network.
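A minimal sketch of such an objective function is given below, so that an optimizer can treat the flat weight vector as the decision variables. Several details are assumptions not taken from the text: the logistic sigmoid in both layers, the flat layout of weights and biases, the normalization by the number of patterns times the number of output units, and all function names. With one input, five hidden units, and one output, `n_weights` returns 16, which agrees with the number of variables reported for the SISO example.

```python
import math


def n_weights(n_in, n_hidden, n_out):
    """Number of weights plus biases in a three-layer feed-forward network."""
    return (n_in + 1) * n_hidden + (n_hidden + 1) * n_out


def sigmoid(z):
    """Numerically stable logistic function (assumed activation)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def forward(w, x, n_in, n_hidden, n_out):
    """Network output for one pattern x; w stores, per unit, its weights then its bias."""
    k = 0
    hidden = []
    for _ in range(n_hidden):
        z = w[k + n_in] + sum(w[k + j] * x[j] for j in range(n_in))
        hidden.append(sigmoid(z))
        k += n_in + 1
    out = []
    for _ in range(n_out):
        z = w[k + n_hidden] + sum(w[k + j] * hidden[j] for j in range(n_hidden))
        out.append(sigmoid(z))
        k += n_hidden + 1
    return out


def mse(w, inputs, targets, n_in, n_hidden, n_out):
    """Mean squared error over all patterns and output units (assumed normalization)."""
    total = 0.0
    for x, d in zip(inputs, targets):
        y = forward(w, x, n_in, n_hidden, n_out)
        total += sum((dk - yk) ** 2 for dk, yk in zip(d, y))
    return total / (len(inputs) * n_out)
```

An optimizer such as CPDD would call `mse` with a candidate vector of length `n_weights(...)` and minimize the returned value.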

SISO Nonlinear Function Approximation.
In this example, the three-layer feed-forward ANN has one input unit, five hidden units, and one output unit. The network is constructed to model the curve of a nonlinear function taken from [21]. The sigma function is used as the activation function in the output layer. The number (dimension) of the variables is 16. To assess the performance of each algorithm in noise, 30 dB additive Gaussian noise is added to the experiment. For each algorithm, 50 runs are performed. The other parameters are the same as those of the previous investigations. Table 9 shows that CPDD performs better than the other compared algorithms in terms of the mean MSE and the standard deviation of MSE. The approximation curves for training and test using different algorithms are shown in Figure 6, which indicates that CPDD outperforms the other algorithms for training the model.

Lorenz Chaotic Time Series Prediction.
In this example, the three-layer feed-forward ANN has three input units, five hidden units, and one output unit. It is used to forecast a Lorenz chaotic time series, which is described by the following equations [51]:
$$\dot{x} = \sigma (y - x), \qquad \dot{y} = x(\rho - z) - y, \qquad \dot{z} = xy - \beta z,$$
where $\sigma = 10$, $\rho = 28$, and $\beta = 8/3$.
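A series of this kind can be generated numerically, for example as in the sketch below, which uses the standard Lorenz equations with the parameter values 10, 28, and 8/3. The fourth-order Runge-Kutta integrator, the step size of 0.01, the initial state (1, 1, 1), and the function names are assumptions; the text does not state how the data were produced.

```python
def lorenz_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz system for state (x, y, z)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shift(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))

    k1 = lorenz_rhs(state)
    k2 = lorenz_rhs(shift(state, k1, dt / 2))
    k3 = lorenz_rhs(shift(state, k2, dt / 2))
    k4 = lorenz_rhs(shift(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def lorenz_series(n, dt=0.01, start=(1.0, 1.0, 1.0)):
    """Return n successive x-components of the simulated trajectory."""
    state, xs = start, []
    for _ in range(n):
        state = rk4_step(state, dt)
        xs.append(state[0])
    return xs


if __name__ == "__main__":
    print(lorenz_series(5))
```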

In order to train the ANN, 10000 data points are generated according to the real model, of which the first 8000 are discarded; the remaining 2000 normalized data points are used as experiment data. The first 1500 points are chosen as the training data, and the remaining 500 points are chosen as the testing data to test the validity of the model. The simulation goal is to build the single-step-ahead prediction model of the chaotic time series in the following form [51]: $x(t + 1) = f(x(t - 6), x(t - 3), x(t))$.
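The single-step-ahead model takes the values at times t − 6, t − 3, and t as inputs and the value at t + 1 as the target. The sketch below builds such input/target pairs from a series and splits them into training and test parts. The min-max normalization to [0, 1], the function names, and the way the lagged windows are aligned with the 1500/500 split are assumptions, since the text does not specify them.

```python
def normalize(series):
    """Min-max scale a sequence to [0, 1] (assumed normalization)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series is constant and cannot be scaled")
    return [(v - lo) / (hi - lo) for v in series]


def make_pairs(series, lags=(6, 3, 0)):
    """Inputs (x(t-6), x(t-3), x(t)) and targets x(t+1) for every valid t."""
    start = max(lags)
    ts = range(start, len(series) - 1)
    inputs = [[series[t - lag] for lag in lags] for t in ts]
    targets = [[series[t + 1]] for t in ts]
    return inputs, targets


def split(inputs, targets, n_train=1500):
    """First n_train pairs for training, the remainder for testing."""
    train = (inputs[:n_train], targets[:n_train])
    test = (inputs[n_train:], targets[n_train:])
    return train, test


if __name__ == "__main__":
    toy = normalize([float(k % 7) for k in range(40)])
    (x_tr, y_tr), (x_te, y_te) = split(*make_pairs(toy), n_train=25)
    print(len(x_tr), len(x_te), x_tr[0], y_tr[0])
```

Note that a series of 2000 points yields 1993 pairs with these lags, so the test part is slightly shorter than 500 under this alignment.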
In our experiment, the population size is set to 50, and the maximal number of function evaluations (FES = 10000) is used as the termination condition of each algorithm. The results are shown in Table 10 in terms of the mean MSE and the standard deviation obtained in 50 independent runs for the nine optimization algorithms. Table 10 indicates that CPDD shows better performance than the other compared methods in terms of the mean MSE and the standard deviation. Figure 7 shows the prediction of the Lorenz chaotic time series for training and test with the different optimization algorithms. The curves show that the prediction obtained by CPDD is better than that of the other algorithms. The experimental results indicate that CPDD has better prediction performance for the Lorenz chaotic time series than the other optimization algorithms.

Conclusion
A new metaheuristic optimization algorithm, CPDD, which is inspired by the phenomenon of cloud transformation and the transition of matter state, is proposed in this paper. A data-driven mechanism is introduced into the differential cloud particles evolution algorithm, and the algorithm is applied to 30 benchmark problems from the CEC2014 Special Session on Real-Parameter Single Objective Optimization benchmark suite. The experimental results show that the phase transition and the data-driven mechanism not only balance the exploration and exploitation capacity of CPDD but also accelerate the convergence rate. It can also be concluded that CPDD shows [...]. However, these methods show poor performance on composite problems. This indicates that the control parameters produced by the data-driven mechanism are not appropriate for these problems. Therefore, the exploitation of CPDD is poor on these problems. How to improve the exploitation ability of CPDD needs further investigation. It is necessary to introduce new techniques into CPDD to improve its exploration and exploitation ability on hybrid composition problems. In addition, CPDD will be used to solve real-world engineering problems.

Conflicts of Interest
The author declares that there are no conflicts of interest regarding the publication of this article.