In this paper, a hybrid approach that combines a population-based method, adaptive elitist differential evolution (aeDE), with a powerful gradient-based method, spherical quadratic steepest descent (SQSD), is proposed and then applied to clustering analysis. This combination not only inherits the advantages of both the aeDE and the SQSD but also reduces the computational cost significantly. First, thanks to the aeDE's globally explorative behavior in the initial steps, the proposed approach can quickly reach a region that contains the global optimum. Next, thanks to the SQSD's locally effective exploitative behavior in the later steps, the proposed approach can find the global optimal solution rapidly and accurately, which further reduces the computational cost. The proposed method is first tested on 32 benchmark functions to verify its robustness and effectiveness. It is then applied to clustering analysis, one of the problems of interest in statistics, machine learning, and data mining. In this application, the proposed method is utilized to find the positions of the cluster centers at which an internal validity measure is optimized. For both the benchmark functions and the clustering problem, the numerical results show that the hybrid approach for aeDE (HaeDE) outperforms the others in both accuracy and computational cost.
Optimization has been widely applied in many fields such as economics, finance, and engineering. Although many optimization algorithms have been developed in various ways, they can be divided into two major categories: population-based algorithms and gradient-based searching algorithms.
Population-based algorithms, including evolutionary algorithms and swarm-based algorithms, are global searching techniques. Evolutionary algorithms [
On the other hand, gradient-based techniques maintain only a single solution at any time and move it to a better one through iterations based on gradient information. In comparison with population-based techniques, gradient-based techniques have advantages in terms of locally exploitative behavior and computational cost, but the solutions they return often get stuck at local extrema.
To avoid the disadvantages of both population-based and gradient-based algorithms, this paper proposes a hybrid approach that combines them. Specifically, the state-of-the-art population-based algorithm, aeDE, is performed in the initial steps to explore the global search space. In the later steps, when the aeDE nearly converges to a critical value, spherical quadratic steepest descent (SQSD) [
The remainder of this paper is organized as follows. The following section presents the aeDE, the SQSD, and the proposed algorithm. Section
This section reviews the aeDE algorithm and the SQSD algorithm and then proposes the hybrid HaeDE algorithm. The details are presented as follows.
To clarify the notation used throughout this article, we refer to the minimization of the objective function
The initialization phase of the aeDE is similar to that of the original DE, in which an initial population of NP individuals is generated through a random sampling technique. Specifically, each individual is represented as a chromosome containing
In the case of the original DE, a mutant vector $\mathbf{v}_i$ is generated for each target vector $\mathbf{x}_i$ by one of the following commonly used mutation schemes:

rand/1: $\mathbf{v}_i = \mathbf{x}_{r_1} + F(\mathbf{x}_{r_2} - \mathbf{x}_{r_3})$

rand/2: $\mathbf{v}_i = \mathbf{x}_{r_1} + F(\mathbf{x}_{r_2} - \mathbf{x}_{r_3}) + F(\mathbf{x}_{r_4} - \mathbf{x}_{r_5})$

best/1: $\mathbf{v}_i = \mathbf{x}_{best} + F(\mathbf{x}_{r_1} - \mathbf{x}_{r_2})$

best/2: $\mathbf{v}_i = \mathbf{x}_{best} + F(\mathbf{x}_{r_1} - \mathbf{x}_{r_2}) + F(\mathbf{x}_{r_3} - \mathbf{x}_{r_4})$

current to best/1: $\mathbf{v}_i = \mathbf{x}_i + F(\mathbf{x}_{best} - \mathbf{x}_i) + F(\mathbf{x}_{r_1} - \mathbf{x}_{r_2})$

where the integers $r_1, r_2, r_3, r_4, r_5$ are mutually distinct random indices drawn from $\{1, 2, \dots, NP\}$ and different from $i$, $F$ is the scaling factor, and $\mathbf{x}_{best}$ is the best individual of the current population.
In the case of the aeDE, a new adaptive mutation scheme for the mutation phase of the DE is proposed. In this scheme, two mutation operators, "rand/1" and "current to best/1", are utilized. The "rand/1" operator aims to ensure the diversity of the population and to prevent the population from getting stuck in a local optimum, while the "current to best/1" operator aims to accelerate the convergence speed of the population by guiding it toward the best individual. These two mutation operators are adaptively chosen based on
In the above pseudocode, delta is defined by
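The exact definition of delta appears in the paper's pseudocode; as an illustration only, the sketch below assumes delta measures the relative gap between the population's mean and best fitness values, which is one plausible convergence indicator for switching between the two operators:

```python
import numpy as np

def choose_mutation(fitness, threshold=1e-2):
    """Adaptively pick a mutation operator (illustrative sketch).

    `delta` here is assumed to be the relative gap between the mean and
    best fitness of the population; the paper's exact definition may differ.
    """
    f_best = np.min(fitness)
    f_mean = np.mean(fitness)
    delta = abs(f_mean - f_best) / max(abs(f_mean), 1e-12)
    # Large gap -> population still diverse -> keep exploring with rand/1;
    # small gap -> population converging -> accelerate with current-to-best/1.
    return "rand/1" if delta > threshold else "current-to-best/1"
```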
After completing mutation, each target vector is recombined with its corresponding mutant vector through the crossover operation to generate a trial vector.
In the selection process of the classical DE, each trial vector is compared with its corresponding target vector, and the vector with the better objective value survives into the next generation.
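The mutation-crossover-selection cycle described above can be sketched as follows. This is a minimal, self-contained illustration of one classical DE generation with the rand/1 scheme; the parameter values are illustrative, not the paper's settings:

```python
import numpy as np

rng = np.random.default_rng(0)

def de_step(pop, fit, func, F=0.8, CR=0.9):
    """One generation of classical DE with the rand/1 mutation scheme."""
    NP, D = pop.shape
    new_pop, new_fit = pop.copy(), fit.copy()
    for i in range(NP):
        # Mutation (rand/1): three mutually distinct indices, all != i.
        r1, r2, r3 = rng.choice([j for j in range(NP) if j != i], 3, replace=False)
        v = pop[r1] + F * (pop[r2] - pop[r3])
        # Binomial crossover: at least one component comes from the mutant.
        mask = rng.random(D) < CR
        mask[rng.integers(D)] = True
        u = np.where(mask, v, pop[i])
        # Selection: the trial vector replaces the target if it is no worse.
        fu = func(u)
        if fu <= fit[i]:
            new_pop[i], new_fit[i] = u, fu
    return new_pop, new_fit
```

Because selection is greedy per individual, the best objective value in the population never worsens from one generation to the next.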
The SQSD algorithm is briefly summarized in the following pseudocode (Algorithm
In the above algorithm, the step limit
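For readers unfamiliar with the SQSD, the sketch below illustrates the idea behind Snyman's method: each iterate minimizes a spherical quadratic approximation of the objective, whose minimizer has a closed form, with the step length capped by the step limit. The curvature update and the positivity safeguard here are simplified assumptions, not the paper's exact algorithm:

```python
import numpy as np

def sqsd(func, grad, x0, rho=0.5, tol=1e-8, max_iter=500):
    """Spherical quadratic steepest descent (simplified sketch).

    Each iterate minimizes the local model
        f(x_k) + g_k.(x - x_k) + (c_k/2) * ||x - x_k||^2,
    whose minimizer is x_k - g_k / c_k, with steps capped at `rho`.
    """
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    c = np.linalg.norm(g) / rho          # initial curvature estimate
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        step = -g / max(c, tol)
        # Enforce the step limit rho.
        n = np.linalg.norm(step)
        if n > rho:
            step *= rho / n
        x_new = x + step
        g_new = grad(x_new)
        # Curvature from interpolation at the previous point
        # (clamped to stay positive; a simplifying assumption).
        d = x - x_new
        c = max(2.0 * (func(x) - func(x_new) - g_new @ d) / (d @ d), tol)
        x, g = x_new, g_new
    return x
```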
As mentioned earlier, both the aeDE and the SQSD have their own advantages and disadvantages; therefore, we propose a hybrid approach for the aeDE, called HaeDE, to combine their advantages while avoiding their disadvantages. The proposed algorithm is presented in the following pseudocode (Algorithm
Flow chart of the HaeDE.
In the above algorithm, NP is the population size. In the case of low dimensions, according to [
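The two-phase structure of the HaeDE can be sketched as follows. Note that this is an illustrative stand-in: plain DE replaces the aeDE, fixed-step gradient descent replaces the SQSD, and the switching criterion (a small spread of population fitness values) is an assumption rather than the paper's exact rule:

```python
import numpy as np

def haede(func, grad, lb, ub, NP=20, switch_tol=1e-4, seed=0):
    """Two-phase hybrid (illustrative sketch, not the paper's exact algorithm).

    Phase 1: DE-style global search until the population's fitness spread
    is small (assumed switching criterion). Phase 2: a gradient descent
    refinement standing in for the SQSD.
    """
    rng = np.random.default_rng(seed)
    D = len(lb)
    pop = rng.uniform(lb, ub, size=(NP, D))
    fit = np.apply_along_axis(func, 1, pop)
    for _ in range(200):                      # phase 1: global exploration
        if fit.max() - fit.min() < switch_tol:
            break                             # nearly converged -> switch
        for i in range(NP):
            r = rng.choice([j for j in range(NP) if j != i], 3, replace=False)
            v = np.clip(pop[r[0]] + 0.8 * (pop[r[1]] - pop[r[2]]), lb, ub)
            mask = rng.random(D) < 0.9
            mask[rng.integers(D)] = True
            u = np.where(mask, v, pop[i])
            fu = func(u)
            if fu <= fit[i]:
                pop[i], fit[i] = u, fu
    x = pop[np.argmin(fit)]
    for _ in range(500):                      # phase 2: local refinement
        g = grad(x)
        if np.linalg.norm(g) < 1e-10:
            break
        x = x - 0.1 * g                       # fixed step as a stand-in for SQSD
    return x
```

The design intent matches the paper: the population carries the search into the basin of the global optimum, after which a cheap gradient step polishes the best individual instead of spending further function evaluations on the whole population.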
This section presents two examples to illustrate and test the performance of the proposed method. In particular, the first example describes the details of the proposed method when dealing with a well-known function, Bohachevsky1. The purpose of this experiment with a simple bivariate function is to analyze the behavior of the proposed method. As a result, we can illustrate how the new method works and why it is better than the aeDE. In the second example, the performance of the proposed HaeDE algorithm is evaluated on 32 benchmark functions. These functions are drawn from the set of 50 functions first used by [
In this subsection, we perform an experiment on the function Bohachevsky1, a simple bivariate function, to analyze the proposed method in detail. In the first phase, we run the aeDE algorithm with
It can be seen from Figure
Surface and contour of function Bohachevsky1.
The convergence behavior of aeDE and HaeDE.
In addition to the comparison with the aeDE, the convergence behavior of the HaeDE itself under different step limit values is examined. It can be seen from Figure
Convergence behavior of HaeDE with different step limit values.
In this subsection, we compare the performance of the proposed method with those of five well-known algorithms, namely, the aeDE, PSO, SHO, SSA, and DA, over 32 benchmark functions. The control parameter settings for the HaeDE are chosen as mentioned in Section

aeDE: the population size NP = 20, the stop criterion

PSO: the population size NP = 20, the stop criterion

SHO, SSA, DA: the population size NP = 20, the stop criterion
For each benchmark function, we run the HaeDE, aeDE, PSO, SHO, SSA, and DA 30 independent times. To ensure fairness, the same initial population is used for all comparative methods. To determine whether the HaeDE reaches a statistically better solution than the other methods, Wilcoxon signed-rank tests are performed. During the test, if the optimal value is below
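A paired Wilcoxon signed-rank test of this kind can be carried out with SciPy. The data below are synthetic, and the test configuration (two-sided alternative, default zero handling) is an assumption, since the paper does not spell out these details:

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(42)

# Hypothetical final objective values from 30 paired runs of two methods
# (illustrative stand-ins, e.g. HaeDE vs a rival algorithm).
fbest_a = rng.normal(loc=0.00, scale=0.01, size=30)
fbest_b = fbest_a + rng.normal(loc=0.02, scale=0.01, size=30)

# The test operates on the paired differences fbest_a - fbest_b.
stat, p = wilcoxon(fbest_a, fbest_b)
# p < 0.05 suggests a statistically significant difference between the
# paired samples; the sign of the median difference identifies the winner.
print(f"p-value = {p:.4f}")
```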
The benchmark functions and their characteristics.
No | n | Name | Formulation
---|---|---|---
1 | 2 | Branin |
2 | 2 | Bohachevsky 1 |
3 | 2 | Booth |
4 | 30 | Rastrigin |
5 | 30 | Schwefel |
6 | 2 | Michalewicz 2 |
7 | 5 | Michalewicz 5 |
8 | 10 | Michalewicz 10 |
9 | 2 | Schaffer |
10 | 2 | Six hump camel back |
11 | 2 | Bohachevsky 2 |
12 | 2 | Bohachevsky 3 |
13 | 2 | Shubert |
14 | 2 | Goldstein-Price |
15 | 4 | Kowalik |
16 | 4 | Shekel 5 |
17 | 4 | Shekel 7 |
18 | 4 | Shekel 10 |
19 | 4 | Perm |
20 | 4 | PowerSum |
21 | 3 | Hartman 3 |
22 | 6 | Hartman 6 |
23 | 30 | Griewank |
24 | 30 | Ackley |
25 | 30 | Penalized |
26 | 30 | Penalized 2 |
27 | 2 | Langerman 2 |
28 | 5 | Langerman 5 |
29 | 10 | Langerman 10 |
30 | 2 | Fletcher Powell 2 |
31 | 5 | Fletcher Powell 5 |
32 | 10 | Fletcher Powell 10 |
The descriptive statistics: the mean of FBEST over 30 runs, with the rank in parentheses, for each method.

No | Min | HaeDE | aeDE | PSO | SHO | SSA | DA
---|---|---|---|---|---|---|---
1 | 0.398 | 0.3979 (4) | 0.3979 (5) | 0.3979 (1) | 0.3979 (2) | 0.3979 (3) | 0.3979 (6) |
2 | 0 | 0 (1) | 0 (3) | 0.0413 (6) | 0.0138 (5) | 0 (2) | 0.0014 (4) |
3 | 0 | 0 (4) | 0 (5) | 0 (1) | 0 (3) | 0 (2) | 0 (6) |
4 | 0 | 20.1717 (1) | 20.6465 (2) | 183.3748 (6) | 55.2204 (3) | 72.7645 (4) | 149.0701 (5) |
5 | −12569.5 | −12027.9391 (1) | −11886.7727 (2) | −6470.1728 (5) | −10658.3841 (3) | −7816.4553 (4) | −5825.5215 (6) |
6 | −1.8013 | −1.8013 (2) | −1.8013 (4) | −1.8013 (1) | −1.7817 (6) | −1.8013 (3) | −1.8013 (5) |
7 | −4.6877 | −4.6527 (1) | −4.6486 (2) | −4.5941 (3) | −4.1299 (5) | −4.3921 (4) | −3.854 (6) |
8 | −9.6602 | −9.4678 (2) | −9.4813 (1) | −7.6207 (5) | −7.8455 (3) | −7.7399 (4) | −6.1953 (6) |
9 | 0 | 0.0058 (3) | 0.0068 (4) | 0.0103 (6) | 0.01 (5) | 0 (1) | 0.0016 (2) |
10 | −1.03163 | −1.0316 (4) | −1.0316 (5) | −1.0316 (1) | −1.0316 (2) | −1.0316 (3) | −1.0316 (6) |
11 | 0 | 0 (1) | 0 (3) | 0.0218 (6) | 0.0146 (5) | 0 (2) | 0.002 (4) |
12 | 0 | 0 (3) | 0 (4) | 0 (1) | 0.0151 (6) | 0 (2) | 0.0018 (5) |
13 | −186.73 | −186.7309 (2) | −186.7309 (4) | −186.7309 (1) | −184.6258 (6) | −186.7309 (3) | −186.7289 (5) |
14 | 3 | 3 (3) | 3 (4) | 3 (1) | 5.7 (6) | 3 (2) | 3 (5) |
15 | 0.00031 | 0.0013 (4) | 0.0006 (2) | 0.0005 (1) | 0.0042 (5) | 0.0008 (3) | 0.0042 (6) |
16 | −10.15 | −8.3881 (4) | −8.8613 (3) | −7.4454 (5) | −6.6581 (6) | −9.4795 (1) | −9.296 (2) |
17 | −10.4 | −9.5579 (2) | −9.4621 (3) | −7.8864 (4) | −5.8977 (6) | −10.0486 (1) | −7.7742 (5) |
18 | −10.53 | −9.7993 (2) | −9.9541 (1) | −8.5676 (4) | −5.3894 (6) | −9.2793 (3) | −6.8984 (5) |
19 | 0 | 0.1427 (4) | 0.0989 (3) | 0.8412 (6) | 0.0497 (2) | 0.0251 (1) | 0.4327 (5) |
20 | 0 | 0.0056 (3) | 0.0066 (4) | 0.0168 (6) | 0.0004 (2) | 0.0002 (1) | 0.0107 (5) |
21 | −3.86 | −3.8628 (5) | −3.8628 (4) | −3.8628 (1) | −3.8628 (3) | −3.8628 (2) | −3.8623 (6) |
22 | −3.32 | −3.2747 (2) | −3.2866 (1) | −3.238 (6) | −3.2745 (3) | −3.2429 (5) | −3.2719 (4) |
23 | 0 | 0.079 (3) | 0.1389 (4) | 33.4143 (6) | 0.0135 (1) | 0.0146 (2) | 3.5681 (5) |
24 | 0 | 1.1253 (1) | 1.2893 (2) | 12.1837 (6) | 2.888 (4) | 2.1901 (3) | 5.6063 (5) |
25 | 0 | 0.2291 (2) | 0.0915 (1) | 57.75 (6) | 3.1098 (3) | 21.3042 (4) | 41.9515 (5) |
26 | 0 | 0.0492 (3) | 0.0591 (4) | 163.2041 (6) | 0.0003 (1) | 0.0034 (2) | 20.0337 (5) |
27 | −1.08 | −1.0764 (3) | −1.0764 (4) | −1.0809 (1) | −1.0629 (5) | −1.0809 (2) | −1.0272 (6) |
28 | −1.5 | −1.4175 (2) | −1.4212 (1) | −0.9667 (4) | −0.7674 (5) | −1.3188 (3) | −0.5915 (6) |
29 | | −0.2872 (3) | −0.3065 (2) | −0.1523 (5) | −0.26 (4) | −0.3763 (1) | −0.0818 (6) |
30 | 0 | 0 (2) | 0 (4) | 0 (1) | 94.0393 (6) | 0 (3) | 0.0004 (5) |
31 | 0 | 14.7267 (2) | 83.2492 (3) | 1.3738 (1) | 395.239 (5) | 189.9685 (4) | 637.8159 (6) |
32 | 0 | 356.7752 (1) | 1168.6692 (2) | 7531.9408 (5) | 1495.7068 (3) | 3859.5778 (4) | 9598.2252 (6) |
Sum of ranks | | 80 | 96 | 118 | 130 | 84 | 164 |
Although Table
In Table
Wilcoxon signed-rank test results. Each pair of columns reports the p value and the winner of the comparison HaeDE vs aeDE, PSO, SHO, SSA, and DA, respectively ("+": HaeDE wins; "−": HaeDE loses; "=": no significant difference).

Function | p | Winner | p | Winner | p | Winner | p | Winner | p | Winner
---|---|---|---|---|---|---|---|---|---|---
1 | 0.000 | + | 0.000 | − | 0.000 | − | 0.000 | − | 0.000 | + |
2 | 1.000 | = | 0.083 | + | 0.317 | = | 1.000 | = | 0.008 | + |
3 | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = | 0.180 | + |
4 | 0.877 | = | 0.000 | + | 0.000 | + | 0.000 | + | 0.000 | + |
5 | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = |
6 | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = |
7 | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = |
8 | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = |
9 | 0.949 | = | 0.296 | = | 0.246 | = | 0.000 | − | 0.001 | − |
10 | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = |
11 | 1.000 | = | 0.083 | + | 0.180 | = | 1.000 | = | 0.018 | + |
12 | 1.000 | = | 1.000 | = | 0.180 | = | 1.000 | = | 0.012 | + |
13 | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = |
14 | 0.000 | + | 0.000 | − | 0.018 | − | 0.959 | = | 0.000 | + |
15 | 0.102 | = | 0.003 | − | 0.004 | + | 0.478 | = | 0.000 | + |
16 | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = |
17 | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = |
18 | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = |
19 | 0.237 | = | 0.015 | + | 0.006 | − | 0.000 | − | 0.289 | = |
20 | 0.810 | = | 0.040 | + | 0.001 | − | 0.005 | − | 0.047 | + |
21 | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = |
22 | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = |
23 | 0.159 | = | 0.000 | + | 0.000 | − | 0.000 | − | 0.000 | + |
24 | 0.629 | = | 0.000 | + | 0.000 | + | 0.000 | + | 0.000 | + |
25 | 0.318 | = | 0.000 | + | 0.581 | = | 0.000 | + | 0.000 | + |
26 | 0.000 | + | 0.000 | + | 0.073 | + | 0.225 | = | 0.000 | + |
27 | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = |
28 | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = |
29 | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = | 1.000 | = |
30 | 1.000 | = | 1.000 | = | 0.046 | + | 1.000 | = | 0.001 | + |
31 | 0.198 | = | 0.701 | = | 0.001 | + | 0.002 | + | 0.000 | + |
32 | 0.057 | + | 0.000 | + | 0.001 | + | 0.000 | + | 0.000 | + |
+/=/− | 4/28/0 | | 10/19/3 | | 7/20/5 | | 5/22/5 | | 16/15/1 |
Finally, we examine the convergence behaviors of the comparative methods. Figure
Convergence behaviors of comparative methods. (a) Schwefel, (b) Griewank, (c) Penalized, (d) Ackley, (e) Penalized 2.
The number of FES of 30 runs for benchmark functions.
The computational time of 30 runs for benchmark functions.
Clustering is a data mining technique that partitions a large, unlabeled dataset into groups so that the elements in each group have similar properties. It is an important first step for understanding basic information in the data before deeper analysis is implemented [
Mathematically, let
In the optimization formulation, each individual, i.e., each partition of the clustering result, is a chromosome that represents the positions of the cluster centers. Because each center is a
Chromosome representation.
As mentioned before, the clustering quality is usually evaluated via an internal validity measure
Let
After identifying the objective function and the chromosome representation, we can utilize the HaeDE to find the optimal partition for the clustering problem.
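The encoding and objective described above can be sketched as follows. The within-cluster sum of squared errors is used here as a stand-in internal validity measure, which may differ from the paper's exact choice:

```python
import numpy as np

def decode(chrom, K, d):
    """A chromosome concatenates K cluster centers into one flat vector."""
    return np.asarray(chrom).reshape(K, d)

def sse(chrom, X, K):
    """Within-cluster sum of squared errors (an assumed internal validity
    measure; the paper's exact measure may differ)."""
    centers = decode(chrom, K, X.shape[1])
    # Assign each point to its nearest center, then sum squared distances.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.min(axis=1).sum()
```

Minimizing `sse` over the flat chromosome with the HaeDE then yields the cluster centers, from which the final partition is obtained by nearest-center assignment.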
In this subsection, the performance of the HaeDE in clustering is tested on the Iris flower dataset, a well-known benchmark dataset introduced by Fisher [
The clustering results. (a) HaeDE, FES = 641, ARI = 0.8857; (b) aeDE, FES = 880, ARI = 0.4124; (c) PSO, FES = 31420, ARI = 0.4730; (d) SHO, FES = 100000, ARI = 0.5149; (e) SSA, FES = 91220, ARI = 0.3775; (f) DA, FES = 50700, ARI = 0.4347.
Summary of clustering results.
Method | FES | Accuracy | ARI |
---|---|---|---|
HaeDE | 641 | 0.9600 | 0.8847 |
aeDE | 880 | 0.6733 | 0.4124 |
PSO | 31420 | 0.6467 | 0.4730 |
SHO | 100000 | 0.6667 | 0.5149 |
SSA | 91220 | 0.5733 | 0.3775 |
DA | 50700 | 0.6133 | 0.4347 |
As can be seen from Figure
Although the proposed method possesses some advantages in terms of finding the global optimum and reducing the computational cost, it has the following disadvantages.

Using the proposed method to solve unimodal functions is not efficient and incurs a large computational cost in comparison with gradient-based methods.

A parallel version of the proposed method is beyond the scope of this article. Parallelism is a feasible way to reduce the high computational cost by dividing the work among multiple processors. Theoretically, if the network is homogeneous, the speedup can be predicted by the number of processors (efficiency = 1). However, the actual speedup may be less than the number of available processors because a real computer network is often a heterogeneous environment. In addition, for the proposed method, only the first stage, the aeDE, can be parallelized; after the first stage, the SQSD must run sequentially. Enhancing the proposed method with parallel programming is therefore an interesting direction for future research.

Another drawback is that the application of the proposed method to clustering still requires a given number of clusters. In the future, another encoding method could be proposed to apply the HaeDE to clustering problems with an unknown number of clusters.
In this paper, an efficient hybrid optimization approach based on adaptive elitist differential evolution and spherical quadratic steepest descent was proposed and then applied to the clustering problem. The new method benefits from the aeDE's globally explorative behavior in the initial steps and from the SQSD's locally effective exploitative behavior in the later steps, which improves the aeDE's performance and reduces the computational cost significantly. The behavior of the HaeDE is examined on a simple function, and its performance is evaluated on a set of 32 benchmark functions as well as on an application to clustering. In summary, the HaeDE can be considered a competitive optimization algorithm and can be utilized effectively in clustering and other applications in the future. Besides the mentioned advantages, the proposed method has a few disadvantages: it wastes computational resources on unimodal functions, a parallel processing strategy is not considered, and its application to clustering still requires a given number of clusters. These are also interesting directions for future research.
The data used to support the findings of this study are included within the article or are made publicly available to the research community at
The authors declare that there are no conflicts of interest regarding the publication of this paper.