Research Article

Research on Clustering Method of Improved Glowworm Algorithm Based on Good-Point Set

Yaping Li (1, 2, 3), Zhiwei Ni (1, 2), Feifei Jin (1, 2), Jingming Li (1, 2), and Fenggang Li (1, 2)

(1) School of Management, Hefei University of Technology, Hefei, Anhui 230009, China
(2) Key Laboratory of Process Optimization and Intelligent Decision-Making, Ministry of Education, Hefei, Anhui 230009, China
(3) Anhui Economic Management Institute, Hefei, Anhui 230059, China

ORCID: Yaping Li, http://orcid.org/0000-0003-1967-3814; Zhiwei Ni, http://orcid.org/0000-0002-3378-9779; Feifei Jin, http://orcid.org/0000-0002-2616-3647

Mathematical Problems in Engineering (Hindawi), vol. 2018, Article ID 8724084. ISSN 1563-5147, 1024-123X. DOI: 10.1155/2018/8724084

Received 26 October 2017; Accepted 8 January 2018; Published 5 March 2018

Academic Editor: Mohammed Nouari

Copyright © 2018 Yaping Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Clustering analysis is an important data analysis method in data mining and has been studied extensively. Because the K-means clustering algorithm is sensitive to the distribution of its initial clustering centers, the Glowworm Swarm Optimization (GSO) algorithm is introduced to solve clustering problems. First, this paper introduces the basic ideas of the GSO algorithm, the K-means algorithm, and the good-point set and analyzes the feasibility of combining them for clustering optimization. Next, it designs a clustering method based on an improved GSO algorithm with a good-point set: the improved GSO algorithm searches the data object space and provides initial clustering centers for the classical K-means algorithm, which then obtains better clustering results. The main improvement to the GSO algorithm is to optimize the initial distribution of the glowworm swarm by introducing the theory and method of the good-point set. Finally, the new clustering algorithm is tested on UCI data sets with different numbers of categories and samples. Comparison and analysis show the advantages of the improved clustering algorithm in terms of sum of squared errors (SSE), clustering accuracy, and robustness.

1. Introduction

As an unsupervised data analysis method, clustering analysis is widely applied in fields such as data mining, pattern recognition, machine learning, and artificial intelligence [1]. Unlike classification, a clustering algorithm groups data objects by a similarity metric and a clustering criterion without any prior knowledge. As a branch of statistics, clustering analysis has been studied extensively. Clustering methods can mainly be classified into division methods, hierarchy methods, and density-based methods. The K-means algorithm proposed by James MacQueen is a typical division-based clustering algorithm [2]. However, the clustering result of the K-means algorithm is strongly affected by the initial clustering centers and is very sensitive to outliers. Lu and Ju [3] optimize the K-means algorithm by integrating the coding, crossover, and mutation ideas of the genetic algorithm (GA) with the local optimizing ability of K-means and propose a K-means clustering algorithm based on GA. Hierarchy-based clustering methods mainly include the CURE algorithm [4] and the Chameleon algorithm [5]; in CURE, one cluster is represented by multiple points, which improves the processing of nonspherical data sets. Representative density-based clustering methods include the DBSCAN algorithm [6], which can effectively identify clusters of any shape but is very sensitive to manually set parameters (e.g., radius). Rodriguez and Laio put forward a new density-based algorithm, density peaks clustering (DPC) [7], in 2014. In this algorithm, density peaks (i.e., clustering centers) are first selected manually through a "decision diagram," and the remaining data points are then allocated to the clustering centers to obtain the clustering result.
It is noteworthy that, in recent years, some scholars have started introducing heuristic swarm optimization algorithms into clustering analysis in different fields, improving the clustering effect by virtue of the global searching ability of swarm optimization. A clustering analysis method combining PSO and K-means, which exploits the global searching ability of the particle swarm algorithm, is proposed in [8]. In addition, the cuckoo algorithm, the artificial bee colony algorithm, the artificial fish swarm algorithm, and so forth [9-12] have also been introduced into research on clustering algorithms.

The GSO algorithm [13] proposed by Krishnanand and Ghose is a new swarm intelligence optimization algorithm, which is more efficient in solving multimodal problems than traditional swarm intelligence optimization algorithms [14]. Aljarah and Ludwig put forward a new GSO-based clustering algorithm in 2013, in which the GSO algorithm is adjusted to solve the data clustering problem by locating multiple optimal centroids [15]. A new approach for cluster analysis based on the GSO algorithm and K-means has been proposed by Onan and Korukoglu [16]. Because multimedia data are multimodal in nature, Pushpalatha and Ananthanarayana proposed the GSO-based Multimedia Document Clustering (GSOMDC) algorithm in 2015 to group multimedia documents into topics [17]. A fuzzy clustering algorithm based on the GSO algorithm (GSO-KFCM) was proposed by Cheng and Bao in 2017. In this algorithm, the optimal solution found by the GSO algorithm serves as the initial clustering center of the kernelized fuzzy C-means clustering algorithm [18].

This paper introduces the GSO algorithm into clustering analysis. It regards each glowworm as a feasible clustering center in the data object space, searches the data object space through the optimization process of the glowworms, and determines clustering centers by obtaining multiple extreme points. In this way, it combines the GSO algorithm with the K-means algorithm: the GSO algorithm provides initial clustering centers for K-means, which addresses the sensitivity of K-means to its initial clustering centers and yields better clustering results. Meanwhile, considering the effect of the initial distribution of the glowworm swarm on the search for clustering centers, this paper optimizes the initial swarm distribution in the GSO algorithm by introducing the theory and method of the good-point set [19, 20], which improves the global searching performance of the GSO algorithm. The rest of the paper consists of three main parts. Section 2 explains the relevant algorithms and theories and puts forward the optimization idea for a clustering-oriented GSO algorithm. Section 3 introduces the improved GSO algorithm based on the good-point set, combines it with the K-means algorithm, and designs the framework and implementation steps of the new clustering method (the GSOK_GP algorithm). Section 4 applies the GSOK_GP algorithm to UCI data sets with different numbers of categories and samples and analyzes the results.

2. Description of Relevant Algorithms

2.1. K-Means Clustering Algorithm

2.1.1. Basic Ideas of K-Means Clustering Algorithm

The basic idea of the K-means clustering algorithm is as follows. Select k data points at random from the data objects to be clustered as initial clustering centers, and allocate every other data point to a clustering center based on its similarity to these centers. After one round of allocation, recalculate the clustering center of each category from the clustering result of that round, and then reallocate the data points to obtain the clustering result of the new round. Repeat this process for a given number of iterations or until the clustering centers converge.

2.1.2. Steps of K-Means Clustering Algorithm

(1) Problem Description. $X=\{x_1,x_2,\ldots,x_N\}$ represents the given data objects, where $x_i$ is a data vector point. Divide $X$ into $k$ disjoint clusters $C=\{C_1,C_2,\ldots,C_k\}$, where $X=\bigcup_{i=1}^{k}C_i$, $C_i\cap C_j=\emptyset$ for $i\neq j$, and $C_i\neq\emptyset$.

(2) Related Definitions

Definition 1.

Euclidean distance between data points:
$$\operatorname{dis}(x_i,x_j)=\sqrt{(x_i-x_j)(x_i-x_j)^{T}}. \tag{1}$$

Definition 2.

SSE of clustering results:
$$\mathrm{SSE}=\sum_{j=1}^{k}\sum_{x_i\in C_j}\operatorname{dis}(x_i,z_j), \tag{2}$$

where $z_j$ is the center of the cluster $C_j$ to which $x_i$ belongs. SSE is generally taken as an important indicator for evaluating clustering results.

(3) Implementation Steps of K-Means Algorithm

Step 1.

Randomly select $k$ samples $z^{(1)}=\{z_1^{1},z_2^{1},\ldots,z_k^{1}\}$ as initial clustering centers.

Step 2.

Allocate the other data points in data object $X$ to the existing clustering centers $z_i^{1}$ according to a given principle (e.g., shortest Euclidean distance).

Step 3.

Recalculate the clustering centers $z^{(2)}=\{z_1^{2},z_2^{2},\ldots,z_k^{2}\}$ from the clustering result, with $z_i^{2}=\frac{1}{m}\sum_{j=1}^{m}x_j$, where $x_j$ ($j=1,\ldots,m$) are the data points allocated to clustering center $z_i^{1}$.

Step 4.

If $z_i^{1}\neq z_i^{2}$, that is, the new clustering center differs from the original one, return to Step 2 and iterate again until the clustering centers converge or the maximum number of iterations is reached.

The steps above show that the initial clustering centers have a significant effect on both the clustering result and the operating efficiency of the K-means algorithm. A poor choice may cause K-means to converge prematurely to a local optimum, so that different runs produce clustering results that differ widely.
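As an illustration of Steps 1 to 4, the following is a minimal pure-Python sketch rather than the authors' Matlab implementation. The function names (`dis`, `assign`, `sse`, `k_means`) and the optional `init_centers` argument are assumptions introduced here; the latter is included only because supplying initial centers from outside is what the later sections rely on. Note that `sse` follows Definition 2 as written in this paper, that is, a sum of distances rather than of squared distances.

```python
import math
import random


def dis(a, b):
    """Euclidean distance between two points (Definition 1)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def assign(points, centers):
    """Step 2: index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: dis(x, centers[j]))
            for x in points]


def sse(points, centers, labels):
    """Definition 2: sum of distances from each point to its own center."""
    return sum(dis(x, centers[j]) for x, j in zip(points, labels))


def k_means(points, k, init_centers=None, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial centers unless the caller supplies them.
    centers = [list(c) for c in (init_centers or rng.sample(points, k))]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [x for x, lab in zip(points, labels) if lab == j]
            if members:
                # Step 3: the new center is the mean of the member points.
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:  # Step 4: centers no longer change
            break
        centers = new_centers
    return centers, assign(points, centers)


data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = k_means(data, k=2, init_centers=[(0.0, 0.0), (10.0, 10.0)])
print(labels)  # [0, 0, 0, 1, 1, 1]
print(round(sse(data, centers, labels), 4))
```

Calling `k_means(data, k=2)` without `init_centers` falls back to random initial centers, which is the setting whose sensitivity the paragraph above describes.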

2.2. Main Ideas and Steps of GSO Algorithm

In the GSO algorithm, each glowworm is regarded as a feasible solution of the target problem in the solution space. Glowworms gather towards brighter glowworms through mutual attraction and movement, so that multiple extreme points are found in the solution space of the target problem. In this way, the problem is solved. The main steps can be described as follows.

Step 1.

Initialize the glowworm swarm $Z=\{z_1,z_2,\ldots,z_n\}$. The number of glowworms $n$, step $s$, initial fluorescein value $l_0$, fluorescein volatilization rate $\rho$, domain change rate $\beta$, initial decision domain value $\gamma_0$, domain threshold $\gamma_{\max}$, and other related parameters are assigned during initialization.

Step 2.

Calculate glowworm fitness. Calculate the fitness $f(z_i)$ of each glowworm $z_i$ at its location based on the specific objective function $y=\max(f(z))$.

Step 3.

Calculate the moving direction and step of each glowworm. Each glowworm $z_i$ searches within its own decision radius $r_i$ for glowworms $z_j$ with a higher fluorescein value ($l_j>l_i$) and determines its next moving direction and step based on fluorescein value and distance.

Step 4.

Update glowworm locations. Update the location of each glowworm zi based on determined moving direction and step.

Step 5.

Update the decision domain radius of glowworm.

Step 6.

Judge whether the algorithm has converged or reached the maximum number of iterations ($it_{\max}$) and determine whether to enter the next round of iteration.

It can be learnt from the steps above that algorithm execution efficiency can be improved and premature local optimum of algorithm can be avoided by optimizing the initial distribution of glowworm swarm.

2.3. Basic Theory of Good-Point Set

Basic definition and structure of good-point set are as follows:

(1) Assume $G_s$ is the unit cube in $s$-dimensional Euclidean space, which is expressed as
$$x\in G_s,\quad x=(x_1,x_2,\ldots,x_s),\quad\text{where } 0\le x_i\le 1,\ i=1,2,\ldots,s. \tag{3}$$

(2) Assume $P_n(k)$ is a set of $n$ points in $G_s$, which is expressed as
$$P_n(k)=\left\{\left(x_1^{(n)}(k),x_2^{(n)}(k),\ldots,x_s^{(n)}(k)\right)\right\},\quad\text{where } 0<k\le n,\ 0\le x_i^{(n)}(k)\le 1,\ i=1,2,\ldots,s. \tag{4}$$

(3) Assume $r=(r_1,r_2,\ldots,r_s)$ is a given point in $G_s$ and $N_n(r)=N_n(r_1,r_2,\ldots,r_s)$ is the number of points in the point set $P_n(k)$ that satisfy the inequalities
$$0\le x_i^{(n)}(k)<r_i,\quad\text{where } i=1,2,\ldots,s. \tag{5}$$

Then $\varphi_n=\sup_{r\in G_s}\left|\frac{N_n(r)}{n}-|r|\right|$, where $|r|=r_1r_2\cdots r_s$, is known as the deviation of the point set $P_n(k)$.

(4) Assume $\varphi_n$ is the deviation of $P_n(k)=\left\{\left(x_1^{(n)}(k),x_2^{(n)}(k),\ldots,x_s^{(n)}(k)\right)\right\}$, $k=1,2,\ldots,n$, and satisfies

$$\varphi_n=C(r,\varepsilon)\,n^{-1+\varepsilon},$$

where $C(r,\varepsilon)$ is a constant related only to $r$ and $\varepsilon$, with $\varepsilon>0$. Then $P_n(k)$ is known as a good-point set and $r$ as a good point.

It has been proved that, for approximate integration, the order of the deviation $\varphi_n$ depends only on $n$ and not on the dimension of the sample space. Therefore, the good-point set provides better support for calculation in high-dimensional spaces. Meanwhile, for a point set whose distribution is unknown, the deviation $\varphi_n$ of the $n$ points $P_n=\{x_1,x_2,\ldots,x_n\}$ obtained from a good-point set is significantly smaller than that of $n$ points drawn at random. This feature of the good-point set can therefore provide a better initial distribution scheme for the swarm in a swarm intelligence algorithm.

3. Design of GSOK_GP Algorithm

Based on the analysis of the relevant algorithms above and the characteristics of clustering problems, this paper proposes an improved GSO algorithm based on the good-point set to solve clustering problems. Its main ideas are as follows. Firstly, optimize the initial distribution of the glowworm swarm through the good-point set, so as to improve the GSO algorithm. Secondly, search the data objects to be clustered with the improved GSO algorithm, and obtain multiple extreme points that form a set of candidate clustering centers. Thirdly, select k extreme points from this set as the initial clustering centers of the K-means algorithm according to the maximum distance principle. Fourthly, execute the K-means algorithm with these initial clustering centers to obtain the clustering result. The algorithm framework is shown in Figure 1, where $t\le it_{\max}$ means that the number of iterations does not exceed the maximum number of iterations and flag > k means that the number of extreme points is greater than the number k of initial clustering centers required.

Figure 1: Flow of GSOK_GP algorithm.

3.1. Initial Swarm Optimization Based on Good-Point Set

Optimizing the initial distribution of the glowworm swarm essentially means making the swarm represent the characteristics of the solution space more faithfully. A randomly generated glowworm swarm cannot cover the whole solution space in most cases. Therefore, distributing the glowworm swarm uniformly over the solution space is an effective strategy, and the theory and method of the good-point set above make a more uniform distribution possible.

Assume the initial glowworm swarm size is $n$; select $n$ points in $s$-dimensional space as glowworm locations. Using good-point set theory, select the good-point set $P_n(k)=\left\{\left(x_1^{(n)}(k),x_2^{(n)}(k),\ldots,x_s^{(n)}(k)\right)\right\}$, $k=1,2,\ldots,n$, composed of $n$ good points in $s$-dimensional space. There are mainly three methods for choosing the good point $r=(r_1,r_2,\ldots,r_s)$:

(1) Square root sequence method: $r_k=\{\sqrt{p_k},\ 1\le k\le s\}$, where the $p_k$ are distinct primes.

(2) Cyclotomic field method: $r_k=\{2\cos(2\pi k/p),\ 1\le k\le s\}$, where $p$ is the smallest prime satisfying $(p-3)/2\ge s$.

(3) Exponential sequence method: $r_k=\{e^{k},\ 1\le k\le s\}$.

Assuming $s=2$ and $n=300$, construct the good-point set (i.e., the initial glowworm swarm distribution) with the exponential sequence method. Figures 2 and 3 show the distribution of the data points (glowworms) under random generation and under the exponential sequence method, respectively.

Figure 2: Randomly distributed n data points (glowworms).

The comparison between Figures 2 and 3 indicates that the data points produced by the exponential sequence method are distributed more uniformly and cover the solution space better. In addition, the good-point set is deterministic: for a fixed $n$, the same distribution is obtained every time. Therefore, a better initial glowworm distribution can be obtained by applying the good-point set to the initial glowworm swarm distribution.

Figure 3: n data points (glowworms) distributed by the good-point set exponential sequence method.
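The construction for $s=2$ and $n=300$ can be sketched as follows. This section does not spell out how a point is built from $r$, so the sketch assumes the standard good-point construction in which the $k$-th point has coordinates equal to the fractional parts of $r_i\cdot k$. The helper names `good_point_set`, `scale`, and `empty_cells` are hypothetical, and the grid-based coverage count is only a crude stand-in for the deviation $\varphi_n$, not a computation of it.

```python
import math
import random


def good_point_set(n, s):
    """n points in the s-dimensional unit cube, exponential sequence r_i = e^i."""
    r = [math.exp(i) for i in range(1, s + 1)]
    # Assumed construction: point k has coordinates frac(r_i * k), k = 1..n.
    return [[(ri * k) % 1.0 for ri in r] for k in range(1, n + 1)]


def scale(points, lows, highs):
    """Map unit-cube points into the box [lows, highs] that bounds the data."""
    return [[lo + p * (hi - lo) for p, lo, hi in zip(pt, lows, highs)]
            for pt in points]


def empty_cells(points, grid=10):
    """Crude 2-D coverage check: number of empty cells in a grid x grid partition."""
    hit = {(min(int(x * grid), grid - 1), min(int(y * grid), grid - 1))
           for x, y in points}
    return grid * grid - len(hit)


n, s = 300, 2
gp = good_point_set(n, s)
rng = random.Random(0)
rnd = [[rng.random() for _ in range(s)] for _ in range(n)]
print("empty cells, good-point set:", empty_cells(gp))
print("empty cells, random points:", empty_cells(rnd))
```

Because `good_point_set` involves no random numbers, repeated calls with the same `n` and `s` return identical points, which is the stability property noted above. `scale` shows one way to move the unit-cube points into the value range of a data set before they are used as glowworm locations.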

3.2. Flow of GSOK_GP Algorithm

When the improved glowworm algorithm is combined with the K-means algorithm to solve clustering problems, each glowworm is regarded as a feasible clustering center. Because a clustering center is surrounded by data points, the data density at a clustering center is a local maximum of the data point density within its neighborhood. Therefore, the data density around a glowworm is taken as its fitness, and a superior set of initial clustering centers is obtained by letting the glowworms search for density extremes. The main algorithm flow is as follows.

Step 1.

Initialize the glowworm swarm based on the good-point set. For the data set $X=\{x_1,x_2,\ldots,x_N\}$ to be clustered, assign the number of glowworms $n$, the initial glowworm locations, step $s$, initial fluorescein value $l_0$, fluorescein volatilization rate $\rho$, domain change rate $\beta$, initial decision domain value $\gamma_0$, domain threshold $\gamma_{\max}$, and other related parameters, with the glowworms placed in the region of Euclidean space that bounds $X$:
$$\begin{aligned}X&=\{x_1,x_2,\ldots,x_N\},\quad x_i=(x_{i1},x_{i2},\ldots,x_{is}),\\ Z&=\{z_1,z_2,\ldots,z_n\},\quad z_i=(z_{i1},z_{i2},\ldots,z_{is}),\quad z_{ij}\in[\min(x_{\cdot j}),\max(x_{\cdot j})],\end{aligned}\tag{6}$$
where $\min(x_{\cdot j})$ and $\max(x_{\cdot j})$ denote the smallest and largest values of the data in dimension $j$.

Step 2.

Calculate glowworm fitness, namely, the number $f(z_i)$ of data points of $X=\{x_1,x_2,\ldots,x_N\}$ whose distance to glowworm $z_i$ is less than $\gamma_i$:
$$\mathrm{Dset}_X(z_i)=\{x_j\mid\operatorname{dis}(x_j,z_i)<\gamma_i\},\qquad f(z_i)=\operatorname{count}(\mathrm{Dset}_X(z_i)). \tag{7}$$

Step 3.

Update glowworm fluorescein. $l_i(t)$ represents the fluorescein value of glowworm $z_i$ in round $t$ of iteration:
$$l_i(t+1)=(1-\rho)\,l_i(t)+\lambda f(z_i(t)). \tag{8}$$

Step 4.

Determine the moving direction. Glowworm $z_i$ searches its decision domain for glowworms with higher fluorescein and, through the roulette approach, selects one of them, $z_j$, as the moving direction of the next step. $\mathrm{Dset}_Z(z_i)$ represents the set of glowworms in the domain, $\mathrm{Dset}_L(z_i)$ represents the set of glowworms in the domain with higher fluorescein, and $P_{\mathrm{Dset}_L(z_i)}$ represents the probability of each of these glowworms being selected. The glowworm $z_j$ with the maximum probability is chosen as the moving direction of glowworm $z_i$:
$$\begin{aligned}\mathrm{Dset}_Z(z_i)&=\{z_j\mid\operatorname{dis}(z_j,z_i)<\gamma_i,\ j\neq i\},\\ \mathrm{Dset}_L(z_i)&=\{z_j\mid\operatorname{dis}(z_j,z_i)<\gamma_i,\ l_j>l_i,\ j\neq i\},\\ P_{\mathrm{Dset}_L(z_i)}&=\{p_1,p_2,\ldots,p_m\},\\ P(z_j)&=\max\{p_1,p_2,\ldots,p_m\}.\end{aligned}\tag{9}$$

Step 5.

Update location. Glowworm $z_i$ moves by the step $s$ in the direction of glowworm $z_j$; this is done for all glowworms:
$$z_i=z_i+\frac{z_j-z_i}{\lVert z_j-z_i\rVert}\times s. \tag{10}$$

Step 6.

Update the decision domain. $\gamma_i(t)$ represents the decision radius of glowworm $z_i$ in round $t$ of iteration, $n_S$ represents the threshold for the number of glowworms in the domain, and $n_t$ represents the number of glowworms within the decision radius:
$$\gamma_i(t+1)=\gamma_i(t)+\beta\,(n_S-n_t). \tag{11}$$

Step 7.

Judge the termination condition of glowworm search and enter iteration of the next round.

Step 8.

Glowworm algorithm terminates, and k extreme points are output to act as the initial clustering center points for K-means algorithm.

Step 9.

Execute K-means algorithm and output clustering result.
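Steps 2 to 7 can be sketched in pure Python as below; this is an illustrative reading of Eqs. (7) to (11), not the authors' Matlab code. Several points are assumptions made here and flagged in the comments: the selection probabilities in Eq. (9) are not defined in the text, so the sketch moves towards the brightest neighbor, which is the maximum-probability choice if $p_j$ is proportional to $l_j-l_i$ as in the standard GSO formulation; the radius is clamped to $[0,\gamma_{\max}]$; and the default values of `gamma0` and `gamma_max` are placeholders. The function name `gso_search` is hypothetical. The way extreme points are extracted from the final swarm in Step 8 is not specified in this section, so the sketch simply returns the final locations and fluorescein values.

```python
import math


def dis(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def gso_search(data, swarm, step=1.0, rho=0.4, lam=0.6, beta=0.08,
               l0=5.0, gamma0=2.0, gamma_max=4.0, n_s=6, it_max=100):
    z = [list(p) for p in swarm]
    lum = [l0] * len(z)         # fluorescein values l_i
    gamma = [gamma0] * len(z)   # decision radii gamma_i (gamma0 is a placeholder)
    for _ in range(it_max):
        # Steps 2-3: density fitness (Eq. (7)), then fluorescein update (Eq. (8)).
        fit = [sum(1 for x in data if dis(x, zi) < g) for zi, g in zip(z, gamma)]
        lum = [(1 - rho) * l + lam * f for l, f in zip(lum, fit)]
        new_z = [list(zi) for zi in z]
        for i, zi in enumerate(z):
            # Step 4: neighbors inside the decision domain, then the brighter ones.
            near = [j for j in range(len(z)) if j != i and dis(z[j], zi) < gamma[i]]
            brighter = [j for j in near if lum[j] > lum[i]]
            if brighter:
                # Assumption: p_j is proportional to l_j - l_i, so the
                # maximum-probability neighbor is the brightest one.
                j = max(brighter, key=lambda m: lum[m])
                d = dis(z[j], zi)
                if d > 0:
                    # Step 5: move a fixed step towards z_j (Eq. (10)).
                    new_z[i] = [a + step * (b - a) / d for a, b in zip(zi, z[j])]
            # Step 6: radius update (Eq. (11)); the clamp is an assumption.
            gamma[i] = min(gamma_max, max(0.0, gamma[i] + beta * (n_s - len(near))))
        z = new_z
    return z, lum


data = [(0.0, 0.0), (0.0, 0.1), (0.1, 0.0), (-0.5, 0.0)]
z, lum = gso_search(data, [(0.0, 0.0), (0.8, 0.0)], step=0.5, gamma0=1.0, it_max=1)
print(z)    # the second glowworm has moved 0.5 towards the first
print(lum)
```

In the one-iteration demo, the glowworm at the origin sees four data points and the one at (0.8, 0) sees three, so only the second glowworm moves. A full run would pass a good-point-set swarm scaled to the data range as `swarm` and then hand the selected extreme points to K-means.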

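3.3. Key Strategies in GSOK_GP Algorithm

3.3.1. Density-Based Fitness Function

In the GSOK_GP algorithm, a cluster center is a glowworm location surrounded by neighboring points of lower local density; a cluster center can therefore be interpreted as a local optimum of the fitness function:
$$f(z_i)=\operatorname{count}(\mathrm{Dset}(z_i)),\qquad\operatorname{count}(\mathrm{Dset}(z_i))=\sum_{x_j\in X}d_{ij},\qquad d_{ij}=\begin{cases}1,&\operatorname{dis}(x_j,z_i)<\gamma_i,\\0,&\operatorname{dis}(x_j,z_i)\ge\gamma_i.\end{cases}\tag{12}$$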
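Read literally, the fitness in Eq. (12) is a neighbor count, which the short sketch below reproduces; `density_fitness` is a hypothetical name and the four data points are made up for illustration.

```python
import math


def dis(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def density_fitness(z, data, gamma):
    """f(z): number of data points strictly closer to z than gamma (Eq. (12))."""
    return sum(1 for x in data if dis(x, z) < gamma)


data = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0)]
print(density_fitness((0.1, 0.1), data, gamma=0.5))  # 3
print(density_fitness((2.5, 2.5), data, gamma=0.5))  # 0
```

A glowworm sitting inside the dense group scores 3, while one in the empty region between the group and the outlier scores 0, so movement towards higher fitness pulls glowworms into dense regions.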

3.3.2. Weighted Euclidean Distance

The value ranges of a data object can differ widely between dimensions. If the plain Euclidean distance is used, attributes with a large value range dominate the distance between data points and therefore have a disproportionate effect on the clustering result. If, in the absence of prior knowledge, each dimension of the data object is assumed to have the same effect on the clustering result, the Euclidean distance used by the glowworms when searching for initial clustering centers needs to be adjusted by assigning different weights to the dimensions.

Assumption 3.

The value range of data object $X$ in each dimension is expressed as follows:
$$U=\{[a_1,b_1],[a_2,b_2],\ldots,[a_n,b_n]\}. \tag{13}$$
Set $d_i=b_i-a_i$ and $D=(d_1,d_2,\ldots,d_n)$.

$W=(w_1,w_2,\ldots,w_n)$ represents the weights assigned to the different dimensions:
$$w_i=\frac{1/d_i}{\sum_{m=1}^{n}1/d_m}. \tag{14}$$
The improved Euclidean distance is then defined as
$$\operatorname{dis}(x_i,x_j)=\sqrt{(x_i-x_j)\,\operatorname{diag}(W)\,(x_i-x_j)^{T}}, \tag{15}$$
that is, the squared difference in each dimension $m$ is multiplied by $w_m$ before summation.

It should be noted that adjustment for Euclidean distance calculation method is only applied in the process of searching initial clustering center in GSO algorithm, and general Euclidean distance calculation approach needs to be adopted in algorithm evaluation, so as to compare and analyze with other algorithms.
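A minimal sketch of Eqs. (13) to (15) follows, assuming the weight enters as a per-dimension factor on the squared difference. The names `dimension_weights` and `weighted_dis` are hypothetical, and the three data points are made up so that the two dimensions differ in range by a factor of 100.

```python
import math


def dimension_weights(data):
    """w_i = (1/d_i) / sum_m (1/d_m), with d_i the value range of dimension i."""
    ranges = [max(col) - min(col) for col in zip(*data)]
    inv = [1.0 / d for d in ranges]  # assumes every dimension has a non-zero range
    total = sum(inv)
    return [v / total for v in inv]


def weighted_dis(a, b, w):
    """Weighted Euclidean distance: each squared difference is scaled by w_m."""
    return math.sqrt(sum(wm * (p - q) ** 2 for wm, p, q in zip(w, a, b)))


data = [(0.0, 0.0), (1.0, 100.0), (0.5, 50.0)]
w = dimension_weights(data)
print([round(v, 4) for v in w])                       # [0.9901, 0.0099]
print(round(weighted_dis(data[0], data[1], w), 6))    # 10.0
```

With plain Euclidean distance the second dimension would contribute 10000 of the 10001 squared units between the first two points; after weighting, the narrow first dimension receives almost all of the weight, which offsets the difference in scale.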

3.3.3. Selection of Extreme Point

A clustering algorithm needs the cluster centers to be relatively far apart. Therefore, the k extreme points chosen from the extreme point set $J=\{J_1,J_2,\ldots,J_p\}$ as the initial clustering centers of the K-means algorithm are selected according to the distance maximization principle. When $p>k$, the basic steps for selecting extreme points are as follows:

(1) Firstly, select the glowworm with the highest fitness to act as the first clustering center point.

(2) Secondly, calculate the distances from other clustering center points to the first clustering center point, and select the one with the largest distance to act as the second clustering center point.

(3) Repeat step (2) to calculate the sum of the distances from other clustering center points to clustering centers selected, and select the one with the largest distance to act as the next clustering center point until k clustering center points are obtained.
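The three selection steps can be sketched as a greedy loop. The sketch assumes that "largest distance" in step (3) refers to the largest sum of distances to the centers already selected, which for a single selected center reduces to step (2). `select_centers` is a hypothetical name, and the extreme points and fitness values in the demo are made up.

```python
import math


def dis(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def select_centers(extremes, fitness, k):
    """Pick k extreme points: highest fitness first, then greedily the farthest."""
    remaining = list(range(len(extremes)))
    first = max(remaining, key=lambda i: fitness[i])  # step (1)
    chosen = [first]
    remaining.remove(first)
    while len(chosen) < k and remaining:
        # Steps (2)-(3): largest summed distance to the centers chosen so far.
        nxt = max(remaining,
                  key=lambda i: sum(dis(extremes[i], extremes[c]) for c in chosen))
        chosen.append(nxt)
        remaining.remove(nxt)
    return [extremes[i] for i in chosen]


extremes = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (5.0, 8.0)]
fitness = [9, 7, 6, 5]
print(select_centers(extremes, fitness, k=3))
# [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0)]
```

The point (1, 0) has the second-highest fitness but lies next to the first center, so the distance criterion skips it in favor of two well-separated points.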

4. Experiment and Analysis

4.1. Experimental Environment

In this paper, the GSOK_GP algorithm is implemented in Matlab, and the two UCI data sets shown in Table 1 are selected to test its effectiveness. The parameters of the GSO algorithm are set with reference to the relevant literature, and the parameters of the M-GSO algorithm are chosen as follows based on the actual clustering problems: $N=50$, $\rho=0.4$, $\lambda=0.6$, $\beta=0.08$, $s=1$, $l_0=5$, and $n_S=6$, with a maximum of 100 iterations.

Table 1: Selection of experimental data sets.

| Data set | Number of dimensions | Number of categories | Number of samples |
| --- | --- | --- | --- |
| Iris | 4 | 3 | 150 |
| Glass | 9 | 6 | 214 |

SSE, clustering accuracy, and robustness are used in this paper to evaluate the clustering effect of the algorithms. SSE is the sum of the Euclidean distances from all data objects to their cluster centers. It is calculated as follows:

$$\mathrm{SSE}=\sum_{j=1}^{k}\sum_{x_i\in C_j}\sqrt{(x_i-z_j)(x_i-z_j)^{T}},$$

where $z_j$ is the center of the cluster $C_j$ to which $x_i$ belongs.

The clustering accuracy proposed by Gan et al. [21] is taken as one of the clustering effect evaluation standards in this paper. Clustering accuracy refers to the proportion of accurately classified samples among all samples. The clustering accuracy AC is defined as follows:
$$AC=\frac{\sum_{i=1}^{k}a_i}{n}, \tag{16}$$

where $k$ is the number of categories in the data set, $n$ is the total number of samples in the data set, and $a_i$ is the number of samples accurately classified into category $i$.

In addition, the robustness indicator proposed in the literature is used in this paper to assess algorithm stability. The robustness is calculated from the results of multiple experiments according to the formula below:
$$R=\frac{AC^{*}-\overline{AC}}{AC^{*}}\times 100\%, \tag{17}$$

where $AC^{*}$ is the optimal value of clustering accuracy and $\overline{AC}$ is the average value of clustering accuracy obtained by running the algorithm multiple times. The smaller $R$ is, the higher the robustness of the algorithm.
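Both indicators are simple ratios, as the sketch below shows. It assumes that the optimal accuracy in Eq. (17) is the best value among the runs being compared; the function names and all numbers in the demo are made up for illustration and are not experimental results.

```python
def clustering_accuracy(correct_per_class, n):
    """AC = (a_1 + ... + a_k) / n, Eq. (16)."""
    return sum(correct_per_class) / n


def robustness(ac_runs):
    """R = (best AC - mean AC) / best AC * 100, Eq. (17); best = max over runs (assumed)."""
    best = max(ac_runs)
    mean = sum(ac_runs) / len(ac_runs)
    return (best - mean) / best * 100.0


# Hypothetical counts: 48, 40, and 32 correctly grouped samples out of 150.
print(clustering_accuracy([48, 40, 32], 150))   # 0.8

# Hypothetical accuracies from three runs.
print(robustness([0.5, 0.75, 1.0]))             # 25.0
print(robustness([0.75, 0.75, 0.75]))           # 0.0
```

When every run reaches the same accuracy, the best and the mean coincide and $R$ is zero, which is the situation in which the standard deviation of AC is also zero.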

4.2. Experimental Results and Analysis

The results of running the GSOK_GP algorithm 20 times independently on each of the Iris and Glass data sets are shown in Tables 2, 3, and 4. The results of running the K-means algorithm and the PSOK algorithm 20 times are taken from the literature.

Table 2: Experimental results of Iris data set.

| Algorithm | Average value of SSEs | Average value of ACs (%) | Standard deviation value of ACs |
| --- | --- | --- | --- |
| K-means | 102.57 | 83.95 | 0.0451 |
| PSOK | 99.61 | 87.39 | 0.0420 |
| GSOK_GP | 97.32 | 89.33 | 0 |

Table 3: Experimental results of Glass data set.

| Algorithm | Average value of SSEs | Average value of ACs (%) | Standard deviation value of ACs |
| --- | --- | --- | --- |
| K-means | 241.03 | 51.70 | 0.0157 |
| PSOK | 233.23 | 52.20 | 0.0108 |
| GSOK_GP | 225.08 | 53.50 | 0.0090 |

Table 4: Comparison of robustness (R) of each algorithm (%).

| Algorithm | Iris | Glass |
| --- | --- | --- |
| K-means | 10.7 | 5.73 |
| PSOK | 8.07 | 3.65 |
| GSOK_GP | 0 | 2.97 |

The Iris data set contains 150 sample objects, each with 4 attributes, which fall into 3 categories in total. The experimental results for the Iris data set are shown in Table 2.

The Glass data set contains 214 sample objects, each with 9 attributes, which fall into 6 categories in total. The experimental results for the Glass data set are shown in Table 3.

Tables 2 and 3 show that the GSOK_GP algorithm is superior to the traditional K-means algorithm and the PSOK algorithm in both SSE and average accuracy.

The robustness of the traditional K-means algorithm, the PSOK algorithm, and the GSOK_GP algorithm is compared in Table 4.

Table 4 indicates that the results of the 20 independent runs of the GSOK_GP algorithm on the Iris data set are identical, which demonstrates high stability. For the Glass data set, the fluctuation across the 20 independent runs is clearly smaller than that of the K-means algorithm and the PSOK algorithm. Therefore, the GSOK_GP algorithm shows better robustness in these experiments.

5. Conclusion

The traditional K-means clustering algorithm is widely used because of its simple principle and high execution efficiency. However, K-means depends on its initial clustering centers, which leads to large variation in the clustering results, low accuracy, and a lack of stability. In this paper, the initial clustering centers of the K-means algorithm are optimized with an improved glowworm algorithm based on the good-point set, and the clustering effect is improved.

The GSOK_GP algorithm proposed in this paper is mainly intended for clustering data objects under unsupervised learning conditions. It differs from traditional clustering methods in that it combines the GSO algorithm with the K-means algorithm to improve the clustering effect. In particular, to address the effect of the initial clustering centers on the clustering results, this paper uses the theory and method of the good-point set to cover the data object space more uniformly and uses the searching ability of the GSO algorithm to obtain superior initial clustering centers. Comparison and analysis in the experiments show that the GSOK_GP algorithm has a better clustering effect and stability.

In addition, it has been noticed that computing glowworm density becomes expensive when the data set is large, which reduces the computing efficiency of the GSOK_GP algorithm. The convergence of the GSOK_GP algorithm therefore needs to be improved further so that it can be applied more effectively to clustering problems with large data volumes.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The work was supported by National Natural Science Foundation of China (nos. 91546108, 71271071, 71490725, and 71521001), fund of Provincial Excellent Young Talents of Colleges and Universities of Anhui Province (no. 2013SQRW115ZD), fund of Support Program for Young Talents of Colleges and Universities of Anhui Province, fund of Natural Science of Colleges and Universities of Anhui Province (no. KJ2016A162), fund of Social Science Planning Project of Anhui Province (no. AHSKYG2017D136), and fund of Scientific Research Team of Anhui Economic Management Institute (no. YJKT1417T01).

References

[1] J. Guang, M. Liu, and D. Zhang, "Spectral clustering algorithm based on effective distance," Journal of Frontiers of Computer Science and Technology, vol. 11, no. 11, pp. 1365-1372, 2014.
[2] A. K. Jain, "Data clustering: 50 years beyond K-means," Pattern Recognition Letters, vol. 31, no. 8, pp. 651-666, 2010. doi:10.1016/j.patrec.2009.09.011
[3] B. Lu and F. Ju, "An optimized genetic K-means clustering algorithm," in Proceedings of the 2012 International Conference on Computer Science and Information Processing, CSIP 2012, pp. 1296-1299, China, August 2012. doi:10.1109/CSIP.2012.6309099
[4] Y.-J. Zhou, C. Xu, and J.-G. Li, "Unsupervised anomaly detection method based on improved CURE clustering algorithm," Tongxin Xuebao/Journal on Communication, vol. 31, no. 7, pp. 18-32, 2010.
[5] H. Wilcox, R. C. Nichol, G.-B. Zhao, D. Bacon, K. Koyama, and A. K. Romer, "Simulation tests of galaxy cluster constraints on chameleon gravity," Monthly Notices of the Royal Astronomical Society, vol. 462, no. 1, pp. 715-725, 2016. doi:10.1093/mnras/stw1617
[6] Y. Jing, G. Jiawei, and L. Jiye, "An improved DBSCAN clustering algorithm based on data field," Journal of Frontiers of Computer Science and Technology, vol. 6, no. 10, pp. 903-911, 2012.
[7] A. Rodriguez and A. Laio, "Clustering by fast search and find of density peaks," Science, vol. 344, no. 6191, pp. 1492-1496, 2014. doi:10.1126/science.1242072
[8] Z. Pei, X. Hua, and J. Han, "The clustering algorithm based on particle swarm optimization algorithm," in Proceedings of the International Conference on Intelligent Computation Technology and Automation, ICICTA 2008, pp. 148-151, China, October 2008. doi:10.1109/ICICTA.2008.421
[9] T. Hassanzadeh and M. R. Meybodi, "A new hybrid approach for data clustering using firefly algorithm and K-means," in Proceedings of the 16th CSI International Symposium on Artificial Intelligence and Signal Processing, AISP 2012, pp. 7-11, Iran, May 2012. doi:10.1109/AISP.2012.6313708
[10] Y. Huihua, W. Ke, L. Lingqiao, W. Wen, and H. Shengtao, "K-means clustering algorithm based on adaptive cuckoo search and its application," Journal of Computer Applications, vol. 36, no. 8, pp. 2066-2070, 2016. doi:10.11772/j.issn.1001-9081.2016.08.2066
[11] D. Yu-ting, W. Song, and M. Wei, "Artificial colony clustering algorithm based on global information," Microelectronics & Computer, vol. 34, no. 2, pp. 20-24, 2017.
[12] H.-t. Yu, M.-J. Jia, and H.-q. Wang, "Clustering algorithm based on artificial fish swarm," Computer Science, vol. 39, no. 12, pp. 60-64, 2012.
[13] K. N. Krishnanand and D. Ghose, "Glowworm swarm optimization for simultaneous capture of multiple local optima of multimodal functions," Swarm Intelligence, vol. 3, no. 2, pp. 87-124, 2009. doi:10.1007/s11721-008-0021-5
[14] N. Zainal, A. M. Zain, and N. H. M. Radzi, "Glowworm swarm optimization (GSO) for optimization of machining parameters," Journal of Intelligent Manufacturing, vol. 27, no. 4, pp. 797-804, 2016. doi:10.1007/s10845-014-0914-7
[15] I. Aljarah and S. A. Ludwig, "A new clustering approach based on glowworm swarm optimization," in Proceedings of the 2013 IEEE Congress on Evolutionary Computation, CEC 2013, pp. 2642-2649, Mexico, June 2013. doi:10.1109/CEC.2013.6557888
[16] A. Onan and S. Korukoglu, "Improving performance of glowworm swarm optimization algorithm for cluster analysis using K-means," in Proceedings of the International Symposium on Computing in Science Engineering, pp. 291-297, 2013.
[17] K. Pushpalatha and V. S. Ananthanarayana, "A New glowworm swarm optimization based clustering algorithm for multimedia documents," in Proceedings of the 17th IEEE International Symposium on Multimedia, ISM 2015, pp. 262-265, USA, December 2015. doi:10.1109/ISM.2015.94
[18] C. Cheng and C. Bao, "A kernelized fuzzy C-means clustering algorithm based on glowworm swarm optimization algorithm," in Proceedings of the 9th International Conference on Computer and Automation Engineering, ICCAE 2017, pp. 78-82, Australia, February 2017. doi:10.1145/3057039.3057045
[19] L. G. Hua and Y. Wang, Applications of Number Theory to Numerical Analysis, Springer, Berlin, Germany, 1981.
[20] Y. Chen, X. Liang, and Y. Huang, "Improved quantum particle swarm optimization based on good-point set," Zhongnan Daxue Xuebao (Ziran Kexue Ban)/Journal of Central South University (Science and Technology), vol. 44, no. 4, pp. 1409-1414, 2013.
[21] G. Gan, J. Wu, and Z. Yang, "A genetic fuzzy k-Modes algorithm for clustering categorical data," Expert Systems with Applications, vol. 36, no. 2, pp. 1615-1620, 2009. doi:10.1016/j.eswa.2007.11.045
[22] P. Xiaoying, C. Xuejing, L. Angru, and Z. Pu, "Firefly partition clustering algorithm based on self-adaptive step," Computer Applied Research, vol. 34, no. 12, pp. 3576-3579, 2017.