A New Fuzzy Clustering Algorithm Based on Clonal Selection for Land Cover Classification

A new fuzzy clustering algorithm based on clonal selection theory from artificial immune systems (AIS), namely FCSA, is proposed to obtain the optimal clustering result for land cover classification without a priori assumptions on the number of clusters. FCSA adaptively finds the optimal number of clusters and is designed as a two-layer system: the classification layer and the optimization layer. The classification layer of FCSA, inspired by clonal selection theory, generates the optimal classification result for a fixed cluster number by using the clone, mutation, and selection immune operators. The optimization layer of FCSA evaluates the optimal solutions according to performance measures for cluster validity and then adjusts the cluster number to output the final optimal cluster number. Two experiments with different types of images show that FCSA not only finds the optimal number of clusters but also consistently outperforms traditional clustering algorithms such as K-means and Fuzzy C-means. Hence, FCSA provides an effective option for land cover classification.


Introduction
Land cover classification from remotely sensed images is considered to be a cost-effective and reliable method for generating up-to-date land cover information [1]. Clustering algorithms, or unsupervised classification algorithms, are built to solve the site labeling problem without the need for training samples for land cover classification [2]. For example, the familiar K-means [3] and Iterative Self-Organizing Data Analysis Technique (ISODATA) [4] algorithms iteratively assign the pixels of an image to one of the classes. K-means finds an optimal partition of the data distribution into the requested number of subdivisions, while ISODATA is a modified version of the K-means algorithm. Both first assign an arbitrary initial cluster vector. The mean vectors and covariance matrices of the clusters are then calculated from the pixels in the initial clusters; each pixel in the image is assigned to the closest cluster to form new clusters, and the label of each pixel is updated. The mean vectors and covariance matrices are subsequently recalculated from the new clusters. In every iteration of the classical K-means and ISODATA algorithms, each image pixel is assumed to belong to exactly one cluster; an alternative to this crisp membership uses fuzzy sets to describe the relationship between the data points and the cluster centers.
For instance, Fuzzy C-means (FCM) [3] is a clustering approach that partitions an image data set into C fuzzy subsets using fuzzy membership. In addition to the aforementioned algorithms, Bayesian classifiers [5] and Markov Random Fields [6] have also been employed for the unsupervised classification of remote sensing images. Recently, there has been considerable interest in applying unsupervised neural networks [7], such as Kohonen's Self-Organizing Maps (SOM), to multi/hyperspectral remote sensing image classification. SOM was investigated as a possible tool for automated knowledge acquisition. In addition, with the emergence of the genetic algorithm (GA), some GA-based clustering algorithms have been proposed that can converge to the global optimum with high probability [8].
Since geographical information, including remotely sensed data for land cover classification, is imprecise, meaning that the boundaries between different phenomena are fuzzy, fuzzy clustering algorithms such as FCM are better suited to real-world land cover classification problems than classical crisp classification models such as K-means. However, FCM has two major limitations. First, it requires the a priori specification of the number of clusters; when this number is specified incorrectly, serious problems may arise. Second, FCM is sensitive to initialization and easily falls into a local optimum [9]. To overcome these obstacles, this paper proposes a fuzzy clustering algorithm based on clonal selection theory from artificial immune systems (AIS), namely FCSA, to automatically evolve the fuzzy partitions of land cover data such that some measure of goodness of the partitions is optimized. Clonal selection theory [10, 11] is a basic theory of immune systems that explains the basic features of an adaptive immune response to an antigenic stimulus. The clonal selection algorithm (CSA) [12], derived from clonal selection theory, is an important model in artificial immune systems (AISs), which are inspired by the vertebrate immune system and use immunological properties to support a wide range of applications [13-15]. CSA has been successfully applied to pattern recognition, multimodal optimization, feature selection, and classification by exploiting biological properties such as immune evolution and immune memory [12, 14, 16].
To automatically evolve the optimal number of clusters as well as the fuzzy partitioning of the data, the proposed FCSA is designed as a two-layer system: the classification layer and the optimization layer. The classification layer of FCSA can quickly obtain the global optimum and gives better classification results for a fixed cluster number, because FCSA uses several immune operators, such as the clonal, mutation, and selection operators. Through the clonal operation on candidate solutions, these operators combine evolutionary search with random search and global search with local search. The optimization layer of FCSA controls the process of the classification layer, evaluates the optimal solutions according to performance measures for cluster validity, and then adjusts the cluster number to output the final solution. In this paper, the Xie-Beni (XB) [17] cluster validity index is selected as the underlying optimizing criterion, since it has been shown to be better able to indicate the correct number of clusters in several experiments [18, 19]. FCSA is evolution-like and has several interesting features: (1) the cluster number is dynamically adjustable and automatically obtained; (2) it is capable of maintaining locally optimal solutions; (3) it explores the global optimum. The Flightline C1 (FLC1) and TM remote sensing images are used to demonstrate the effectiveness of the developed unsupervised fuzzy artificial immune classifier by automatically segmenting the images into unknown regions. Experimental results demonstrate that the proposed algorithm outperforms traditional methods such as FCM and thus provides an effective option for unsupervised land cover classification.
The remainder of the paper is structured as follows. Section 2 gives an overview of clonal selection theory and the clonal selection algorithm. Section 3 describes the proposed method and algorithm in detail, while Section 4 illustrates the performance of the proposed algorithm compared with the traditional algorithms. Finally, Section 5 concludes the paper.

Clonal Selection Theory
The human immune system, a complex system of cells, molecules, and organs, provides an identification mechanism capable of perceiving and combating dysfunction in our own cells and the action of exogenous infectious microorganisms. It protects the body from infectious agents such as viruses, bacteria, fungi, and other parasites. Any molecule that can be recognized by the adaptive immune system is known as an antigen. The basic components of the immune system are the lymphocytes, or white blood cells. Lymphocytes exist in two forms, B cells and T cells. These two cell types are rather similar but differ in how they recognize antigens and in their functional roles. B cells are capable of recognizing antigens free in solution, while T cells require antigens to be presented by other accessory cells. They have distinct chemical structures and produce many Y-shaped antibodies from their surfaces to kill the antigens. Antibodies are molecules attached primarily to the surface of B cells with the aim of recognizing and coping with antigens [20].
To clarify how an immune response is mounted when a nonself antigenic pattern is recognized by a B cell, clonal selection theory was developed [21, 22]. Its main features are (1) proliferation and differentiation on stimulation of cells with antigens; (2) generation of new random genetic changes, expressed subsequently as diverse antibody patterns, by a form of accelerated somatic mutation; and (3) elimination of newly differentiated lymphocytes carrying low-affinity antigenic receptors. These features are used in this paper.
The principle can be detailed as follows. When a B-cell receptor recognizes a nonself antigen with a certain affinity, it is selected to proliferate and produce antibodies in high volumes. The antibodies are soluble forms of the B-cell receptors that are released from the B-cell surface to cope with the invading nonself antigens. Antibodies bind antigens, leading to their eventual elimination by other immune cells. Proliferation in the case of immune cells is asexual: it is a mitotic process in which the cells divide themselves. During reproduction, the B-cell clones undergo a hypermutation process in which the antigen stimulates the B cell to proliferate and mature into terminal antibody-secreting cells, named plasma cells. The process of cell division generates a clone. In addition to proliferating and differentiating into plasma cells, the activated B cells with high antigenic affinities are selected to become memory cells with long life spans. These memory cells circulate through the blood, lymph, and tissues. When exposed to a second antigenic stimulus, memory cells begin to differentiate into plasma cells capable of producing high-affinity antibodies, preselected for the specific antigen that stimulated the primary response [12]. Figure 1 illustrates the clonal selection, expansion, and affinity maturation processes.

Clonal Selection Algorithm (CSA)
Based on clonal selection theory and the shape space model of the immune system, De Castro and Von Zuben (2002) developed the Clonal Selection Algorithm (CSA) [12]. It has been applied to support pattern recognition and to solve multimodal optimization problems. The algorithm can be described as follows.
Step 1. Randomly initialize a population of individuals, M.
Step 2. For each input pattern P, present it to the population M and determine its affinity with each element of M.
Step 3. Select the n best (highest-affinity) elements of M and clone these individuals proportionally to their affinity with the antigen. The higher the affinity, the higher the number of copies, and vice versa.
Step 4. Mutate all these copies with a rate inversely related to their affinity with the input pattern: the higher the affinity, the smaller the mutation rate.
Step 5. Add these mutated individuals to the population M and reselect m of these matured individuals to be kept as memory cells of the system.
Step 6. Repeat Steps 2-5 until a certain criterion is met.
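The six steps above can be sketched on a toy problem. The following is a minimal illustration under stated assumptions, not the implementation of [12]: antibodies are single real numbers, the affinity is any user-supplied function to be maximised, and the names `clonal_selection`, `clone_factor`, and the rank-based clone count and exponential step decay are hypothetical choices made only for this example.

```python
import math
import random


def clonal_selection(affinity, low, high, pop_size=20, n_select=5,
                     clone_factor=2.0, generations=50, seed=0):
    """Toy CSA-style search that maximises `affinity` on [low, high]."""
    rng = random.Random(seed)
    # Step 1: random initial population.
    population = [rng.uniform(low, high) for _ in range(pop_size)]
    for _ in range(generations):  # Step 6: repeat for a fixed budget.
        # Step 2: evaluate affinity; Step 3: select the n best.
        ranked = sorted(population, key=affinity, reverse=True)
        selected = ranked[:n_select]
        scores = [affinity(x) for x in selected]
        best, worst = max(scores), min(scores)
        span = (best - worst) or 1.0  # avoid division by zero
        matured = []
        for rank, (x, score) in enumerate(zip(selected, scores), start=1):
            # ASSUMPTION: clone count decreases with rank (better = more).
            n_clones = max(1, round(clone_factor * n_select / rank))
            # Step 4: the mutation step shrinks as normalised affinity grows.
            norm = (score - worst) / span
            step = (high - low) * 0.1 * math.exp(-2.0 * norm)
            for _ in range(n_clones):
                clone = x + rng.gauss(0.0, step)
                matured.append(min(high, max(low, clone)))
        # Step 5: merge the matured clones and keep the best individuals.
        population = sorted(population + matured, key=affinity,
                            reverse=True)[:pop_size]
    return population[0]
```

For example, `clonal_selection(lambda x: -(x - 3.0) ** 2, 0.0, 10.0)` searches for the maximiser of a single-peaked affinity on [0, 10]; because the best individual is always retained, the returned value never gets worse from one generation to the next.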
Like CSA, the genetic algorithm (GA) is a heuristic algorithm. However, their underlying mechanisms and methods of evolutionary search differ significantly in terms of inspiration, vocabulary, and fundamentals. While GA uses a vocabulary borrowed from natural genetics and is inspired by Darwinian evolution theory, CSA makes use of the shape space formalism, along with immunological terminology, to describe antigen-antibody interactions and cellular evolution in immune systems. GA performs its search through genetic operators including reproduction, crossover, and mutation, while CSA performs its search through the mechanisms of somatic mutation and receptor editing, balancing the exploitation of the best solutions with the exploration of the search space. CSA maintains a diverse set of locally optimal solutions, while GA tends to polarize the whole population of individuals towards the best one. This mainly occurs because of the selection and reproduction schemes adopted by CSA (described in Step 3). Essentially, their coding schemes and evaluation functions do not differ, but their evolutionary search differs from the viewpoint of inspiration, vocabulary, and fundamentals [23]. In addition, CSA inherits the memory property of the human immune system to build a memory cell population and can recognize the same or similar antigens quickly at different times [14, 24].

Fuzzy Clustering Algorithm Based on Clonal Selection (FCSA)
A fuzzy clustering algorithm based on clonal selection, namely FCSA, is proposed to perform the task of land cover classification by automatically evolving the optimal fuzzy partition matrix. A main objective of the proposed algorithm is to get closer to a more natural classification of land cover.
A remote sensing image dataset $X = \{x_1, x_2, \ldots, x_N\}$ is observed, where each object $x_j$ is an earth surface unit or picture element (pixel), $j = 1, \ldots, N$. $N$ is the total number of pixels in an unlabeled image, $N = N_{row} \times N_{col}$, where $N_{row}$ and $N_{col}$ are the image's row number and column number, respectively. In addition, each pixel $x_j$ is an attribute vector with $p$ bands, and $nc$ denotes the number of clusters. In fuzzy cluster analysis, each pixel in the dataset can be assigned to more than one cluster according to a membership value $u_{ij} = \mu_{CS_i}(x_j)$, which defines the membership of pixel $x_j$ in cluster $CS_i$.
To find the optimal number of clusters adaptively, FCSA is designed as a two-layer system: the classification layer and the optimization layer. The optimization layer of FCSA controls the process of the classification layer and evaluates the optimal number of clusters according to performance measures using the Xie-Beni index. The Xie-Beni index for each candidate number of clusters is calculated after the classification process. The best partition is considered to be the one that corresponds to the minimum value of the Xie-Beni index. The classification layer of FCSA, with a fixed number of clusters, classifies the image dataset by searching for the membership degree matrix and cluster centers that minimize the objective function shown in (3.1), under the constraint shown in (3.2), where $\mu_{ij} \in [0, 1]$ indicates the membership of data vector $x_j$ in cluster $CS_i$, $v_i$ is the ith center of CS, $d_{ij}$ is the Euclidean distance between data vector $x_j$ and center $v_i$, and $m \in (1, \infty)$ is a parameter that controls the fuzziness of the clustering result. To minimize the objective function, it is feasible to encode either the center matrix CS or the membership function matrix U. The relationship between CS and U is given in (3.3). In this paper, we encode CS into the antibodies of the proposed algorithm to calculate their values.
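Equations (3.1)-(3.3) are not reproduced in the text above. Assuming they follow the standard Fuzzy C-means formulation, which is consistent with the symbol definitions just given but remains an assumption about the original equations, they would take the following form:

```latex
% (3.1) objective function: weighted sum of squared distances
J_m(U, CS) = \sum_{i=1}^{nc} \sum_{j=1}^{N} \mu_{ij}^{m} \, d_{ij}^{2},
\qquad d_{ij} = \lVert x_j - v_i \rVert

% (3.2) constraint: the memberships of each pixel sum to one
\sum_{i=1}^{nc} \mu_{ij} = 1, \qquad j = 1, \ldots, N

% (3.3) memberships expressed through the centers (valid when all d_{kj} > 0)
\mu_{ij} = \left[ \sum_{k=1}^{nc}
  \left( \frac{d_{ij}}{d_{kj}} \right)^{2/(m-1)} \right]^{-1}
```

Under this form, a set of centers fully determines the membership matrix, which is why encoding only the centers in an antibody is sufficient.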

To better describe FCSA, the following notation is used. (iii) $mc$ denotes the memory cell. $mc$ is the best antibody, with the highest membership value in each iteration, and is a candidate solution.
The FCSA algorithm consists of the following steps.

The Classification Layer of FCSA
The classification layer of FCSA is used to find the best fuzzy partition of the image dataset with the fixed number of clusters.
Step 1 (initialization and encoding). In FCSA, the antibodies are made up of real numbers; each antibody $ab_i$ represents a group of clustering centers with $nc$ prototypes. An initial antibody population AB containing M antibodies is randomly generated by selecting distinct points from the AG dataset. $AB = \{ab_1, ab_2, \ldots, ab_M\}$, where M is the size of the antibody population.
Step 2 (cycle the generations). After initialization, the simulation of the clonal selection process begins. One generation after another is created, and each must prove its affinity to the criterion function. In each iteration, a number of possible solutions are generated by applying immune operators such as clone, mutation, and selection in a stochastic process guided by an affinity measure. The algorithm seeks to evolve an optimal solution to the clustering problem.

(1) Calculation of Affinity
Starting from the initial antibody population, the affinity of all M abs in the antibody population AB is calculated using the criterion function $F = F(ab_i)$. The higher the criterion function, the better the antibody. However, an optimal fuzzy partition should minimize the objective function J in (3.1), which is the generalized least squares error function. The criterion function F is therefore defined, as in (3.5), so that maximizing F corresponds to minimizing J.
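The affinity calculation can be illustrated with a short sketch. It assumes the standard Fuzzy C-means membership and objective forms, and, because the exact expression for F is not reproduced here, it uses F = 1 / (1 + J) purely as a placeholder that grows as J shrinks. The function names are hypothetical, and an antibody is represented simply as a list of center tuples.

```python
import math


def memberships(pixels, centers, m=2.0):
    """Membership u[i][j] of pixel j in cluster i (standard FCM form)."""
    u = [[0.0] * len(pixels) for _ in centers]
    for j, x in enumerate(pixels):
        dists = [math.dist(x, v) for v in centers]
        if min(dists) == 0.0:
            # Pixel coincides with one or more centers: split full
            # membership among them to avoid dividing by zero.
            hits = [i for i, d in enumerate(dists) if d == 0.0]
            for i in hits:
                u[i][j] = 1.0 / len(hits)
            continue
        for i, d_i in enumerate(dists):
            total = sum((d_i / d_k) ** (2.0 / (m - 1.0)) for d_k in dists)
            u[i][j] = 1.0 / total
    return u


def objective(pixels, centers, m=2.0):
    """J: sum over clusters and pixels of u_ij^m * d_ij^2."""
    u = memberships(pixels, centers, m)
    return sum(u[i][j] ** m * math.dist(x, v) ** 2
               for i, v in enumerate(centers)
               for j, x in enumerate(pixels))


def affinity(pixels, centers, m=2.0):
    # ASSUMPTION: F = 1 / (1 + J) is a stand-in for the paper's
    # criterion function; any decreasing function of J would serve here.
    return 1.0 / (1.0 + objective(pixels, centers, m))
```

With four two-band pixels forming two tight pairs, an antibody whose centers sit inside the pairs receives a higher affinity than one whose centers both lie between them, which is the ordering the selection step relies on.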

(2) Selection
From AB, the n highest-affinity antibodies are selected to compose a new set $AB_{\{n\}}$ of high-affinity antibodies, and the highest-affinity memory cell mc is found.

(3) Clone
After receiving antibody individuals closer to the solution, the next generation should mainly be derived from the better-fitting individuals. Thus, the n selected abs are cloned based on their antigenic affinities, generating the clone set C. In FCSA, the number of clones for each subpopulation is no longer a free parameter but instead a fixed number n 1. This is an interesting feature, since the performance of the CSA algorithm is very sensitive to variations in the number of clones [16, 25].
The total number of clones generated, $N_c$, is defined in (3.6).
This step draws the evolutionary process closer to the goal. It raises the average affinity value and gives the following steps a good chance to move further towards the solution.

(4) Mutation
Each ab in the clone set C is given the opportunity to produce mutated offspring $C^*$. The higher the affinity, the smaller the mutation rate. The mutation rate is determined adaptively from the affinity of each ab as follows.
First, for each $ab_i \in AB$, its affinity $F(ab_i)$ is normalized into the range [0, 1], as in (3.7). Then each $ab_i$ is given the chance to mutate; the mutation rate is adaptively calculated as in (3.8), where $p_m$ is the mutation rate of each ab, 2 is the empirical value that controls the decay, and $F(ab_i)$ is the affinity according to (3.7). In (3.8), the range of the mutation rate is [0, 1].
Finally, the cloned antibodies are mutated with probability $p_m$. The mutation process applied to each ab in the clone set C is shown in Algorithm 1, which defines the function mutation(B) with mutation rate $p_m$. The function random(minimum, maximum) generates a random real value from a uniform distribution over the range from minimum to maximum. The function $\Delta(Ite, u)$ is defined in terms of the following quantities: m is the iteration number, Ite is the maximal iteration number, r is a random value within the range [0, 1], and λ is a parameter that decides the nonconforming degree. This step is crucial in the proposed algorithm. It generates random changes in single features of the individual solutions. The effect of these changes is assessed when the criterion function is calculated in the next generation cycle. This helps avoid local maxima and produces new properties in mutated antibodies that can persist if they are successful, whereas traditional fuzzy clustering algorithms, such as FCM, often get stuck at suboptimal solutions determined by the initial configuration of the system.
To avoid chaotic development and preserve the best abs for each clone during evolution, one original ab per clone is kept without mutation during the maturation process; otherwise, mutation could destroy the positive development of the previous step and prevent any major progress towards the solution.
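The mutation step can be sketched as follows. Since (3.7), (3.8), the definition of Δ, and Algorithm 1 are not reproduced here, every formula in this block is an assumption: min-max normalisation for (3.7), an exponential decay exp(-2 F) for (3.8), and the classical non-uniform mutation operator, Δ(t, u) = u (1 - r^((1 - t/T)^λ)), for Δ. These choices match the quantities named in the text (decay constant 2, iteration number, maximal iteration number, r, λ) but are not confirmed by it. An antibody is treated as a flat list of real values.

```python
import math
import random


def normalise(affinities):
    """ASSUMED form of (3.7): min-max scaling into [0, 1]."""
    lo, hi = min(affinities), max(affinities)
    if hi == lo:
        return [0.0 for _ in affinities]  # arbitrary choice when all equal
    return [(f - lo) / (hi - lo) for f in affinities]


def mutation_rate(norm_affinity, decay=2.0):
    """ASSUMED form of (3.8): higher affinity gives a smaller rate."""
    return math.exp(-decay * norm_affinity)


def delta(t, max_iter, u, lam, rng):
    """ASSUMED non-uniform mutation step; shrinks to 0 as t nears max_iter."""
    r = rng.random()
    return u * (1.0 - r ** ((1.0 - t / max_iter) ** lam))


def mutate(antibody, p_m, t, max_iter, low, high, lam=2.0, rng=random):
    """Mutate each value with probability p_m, staying inside [low, high]."""
    mutated = []
    for value in antibody:
        if rng.random() < p_m:
            if rng.random() < 0.5:
                value += delta(t, max_iter, high - value, lam, rng)
            else:
                value -= delta(t, max_iter, value - low, lam, rng)
        mutated.append(value)
    return mutated
```

Under these assumptions the perturbation is large early in the run and vanishes at the final iteration, so the search moves from exploration to fine adjustment, while the unmutated original kept for each clone guards against losing a good solution.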

(5) Recalculation of Affinity
Calculate the affinity $F^*(ab_i)$ of the matured clones $C^*$.

(6) Reselection
From the mature clone set $C^*$, reselect the n abs with the highest affinity to replace the n abs with the lowest affinity in AB. Select the highest-affinity ab in $C^*$ as a candidate memory cell, $mc_{candidate}$. If the affinity of $mc_{candidate}$ is higher than that of the memory cell mc, then $mc_{candidate}$ replaces mc and becomes the new memory cell.

(7) Displace
To replace the d lowest-affinity abs in AB, d new antibodies are produced by a random process. This step may increase the diversity of the antibody population.
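The reselection, memory update, and displacement steps amount to a few list operations. The sketch below is illustrative only; the function names and the `make_random` callback (a hypothetical generator of fresh random antibodies) are not from the paper, and `affinity` is any callable that scores an antibody.

```python
def reselect(population, matured, affinity, n):
    """Replace the n lowest-affinity antibodies with the n best matured clones."""
    keep = sorted(population, key=affinity, reverse=True)[:len(population) - n]
    best_matured = sorted(matured, key=affinity, reverse=True)[:n]
    return keep + best_matured


def update_memory(memory_cell, matured, affinity):
    """Promote the best matured clone if it beats the current memory cell."""
    candidate = max(matured, key=affinity)
    if memory_cell is None or affinity(candidate) > affinity(memory_cell):
        return candidate
    return memory_cell


def displace(population, affinity, d, make_random):
    """Replace the d lowest-affinity antibodies with fresh random ones."""
    survivors = sorted(population, key=affinity, reverse=True)[:len(population) - d]
    return survivors + [make_random() for _ in range(d)]
```

Note that when n equals the population size, as in the parameter settings used in the experiments, `reselect` replaces the entire population with matured clones, so the memory cell is what preserves the best solution found so far.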
Step 3 (stopping criteria). The stopping criteria for the algorithm are as follows. One option is to set a fixed number of iterations as the stopping condition. The other is that if, after a few iterations, there is no improvement in the value of the criterion function F, as measured in (3.10) against a change threshold ε, then the optimal clustering result is taken to have been found; otherwise, return to Step 2 until the stopping criteria are satisfied. The change threshold ε is a user-defined parameter selected according to the application.
Finally, the proposed algorithm outputs the value of the memory cell and obtains the optimal fuzzy partition with the current number of classes, $nc$, $nc = 1, 2, \ldots, C_{max}$.
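The two stopping criteria can be combined in a small helper. This is a sketch under assumptions: the exact form of (3.10) is not reproduced here, so "no improvement after a few iterations" is interpreted as the absolute change in the best affinity over a window of `patience` iterations falling below ε. The names `should_stop`, `history`, and `patience` are hypothetical.

```python
def should_stop(history, iteration, max_iter, eps=1e-6, patience=5):
    """Return True when the iteration budget is spent or F has stalled.

    `history` holds the best affinity F recorded at each iteration so far.
    """
    if iteration >= max_iter:
        return True  # criterion 1: fixed number of iterations
    if len(history) <= patience:
        return False  # not enough iterations to judge improvement
    # criterion 2 (ASSUMED form): change in F over the window is below eps
    return abs(history[-1] - history[-1 - patience]) < eps
```

In a full run, the classification layer would append the affinity of the memory cell to `history` once per generation and call this helper before returning to Step 2.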

The Optimization Layer of FCSA
Determining the optimal number of clusters is an important issue in FCSA. To evaluate the optimal solutions, FCSA computes the validity measure of the c-partition for a range of nc values up to $C_{max}$ using the Xie-Beni (XB) index [8, 17] and then selects the number of clusters with the minimum XB value. Here, $C_{max}$ is an estimate of the upper bound of the number of clusters.
The XB index is defined as a function of the ratio of the compactness π to the separation s. Here π and s are given by (3.11), where $d_{ij}$ is the distance between the ith center $v_i$ and the jth antigen $x_j$, N is the total number of antigens, and nc is the number of classes.
A smaller XB value indicates a partition in which all the clusters are compact and well separated from each other. Thus, FCSA adaptively finds the optimal number of clusters as the one with the smallest XB value calculated from the corresponding classification result.
The flowchart for FCSA is shown in Figure 2.
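The optimization layer can be illustrated with a short sketch of the XB computation and the final selection. It assumes the usual Xie-Beni definition, in which the compactness is π = (Σ_i Σ_j μ_ij² d_ij²) / N and the separation s is the minimum squared distance between any two centers; the text gives only the ratio π / s and the symbols involved, so these exact forms are an assumption. The function names are hypothetical.

```python
import math


def xie_beni(pixels, centers, u):
    """XB = compactness / separation; requires at least two centers."""
    n = len(pixels)
    # ASSUMED compactness: mean of u_ij^2 * d_ij^2 over all pixels.
    compactness = sum(u[i][j] ** 2 * math.dist(x, v) ** 2
                      for i, v in enumerate(centers)
                      for j, x in enumerate(pixels)) / n
    # ASSUMED separation: smallest squared distance between two centers.
    separation = min(math.dist(a, b) ** 2
                     for k, a in enumerate(centers)
                     for b in centers[k + 1:])
    return compactness / separation


def best_cluster_count(xb_by_nc):
    """Pick the cluster number whose partition has the smallest XB value."""
    return min(xb_by_nc, key=xb_by_nc.get)
```

In use, the classification layer would be run once for each candidate nc, the XB value of each resulting partition stored in a dictionary keyed by nc, and `best_cluster_count` applied to that dictionary. Because the separation term needs two distinct centers, a single-cluster partition cannot be scored this way.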

Experiments and Analysis
The proposed FCSA and the traditional clustering algorithms for land cover classification were all implemented using Visual C++ 6.0 and tested on different types of remote sensing images. Two experiments were conducted to test classification performance. Only FCSA can classify the image without a priori assumptions on the number of clusters and output the optimal number of clusters. To better assess the performance of FCSA, consistent comparisons of classification results with the optimal number of clusters were also performed among FCSA, K-means, ISODATA, and Fuzzy C-means (FCM) using the classification accuracy on the Flightline C1 and Landsat TM images.

Experiment 1: Flightline C1
This experiment was conducted using a data set designated Flightline C1 (FLC1) [26], which is 12-band multispectral data taken over Tippecanoe County, IN, by the M7 scanner in June 1966. Figure 3 shows the experimental FLC1 image (92 × 107 pixels), with spectral ranges from 0.40 to 1.00 μm.
The primary parameters to be provided by users for the classification were the maximum number of classes $C_{max}$, the maximum number of iterations MaxIte, the antibody population size M, the number of selected antibodies n (see also Step 3 in Section 3), and the number of displaced antibodies d (see also Step 7 in Section 3). Generally, to apply FCSA conveniently, n is often set to M. The affinity function was determined by (3.5). The values of these parameters were set as follows: M = 20, MaxIte = 100, n = M = 20, d = 5, and $C_{max}$ = 6. The weighting exponent m, used by FCSA and FCM, was set to 2, which is the optimal value of m within [1.5, 2.5] in practical applications [18].
FCSA automatically provides four clusters for this image dataset. Figure 4 shows the variation of the XB index with the number of clusters when FCSA is used as the underlying clustering technique. As can be seen from the figure, the minimum value of the XB index is obtained for four clusters with the FCSA algorithm. In fact, from our ground knowledge, the survey area is an agricultural area that is expected to fall into four classes: corn, oats, red clover, and wheat. Hence, FCSA correctly finds the optimal number of clusters in this case. The list of classes and the number of labeled samples for each class are given in Table 1. The field map based on ground truth data is shown in Figure 5, and Figure 6 displays the spectral curves of the four land cover classes.
To better evaluate the classification performance of FCSA, three traditional clustering algorithms for land cover classification are used in this experiment, K-means, ISODATA, and FCM, with the number of clusters set to the optimal value of 4. As shown in Figures 7(a) and 7(b), K-means and ISODATA create similar classification results and cannot correctly obtain four clustering partitions. In the classification images of K-means and ISODATA, the oats class disappears and is misclassified as wheat. The reason is that the spectral curves of the oats class (green) and the wheat class (yellow) shown in Figure 6, which differ only slightly in the 11th and 12th bands, are too similar to be differentiated by the K-means and ISODATA algorithms. FCM and FCSA correctly find the oats class through the corresponding fuzzy partition (Figures 7(c) and 7(d)). FCSA and FCM give similar results for the corn, oats, and wheat classes. However, FCM fares worst in the red clover class because many red clover pixels at the bottom of the classification image are misclassified as corn. In contrast, FCSA achieves the best visual accuracy in the red clover class and also performs satisfactorily in the oats and wheat classes. As a result, the use of FCSA gives better results for all four classes.
For a more detailed verification of the results, we compared the ground truth data (Table 1) with the classified images and assessed the accuracy of each clustering algorithm quantitatively using two statistics based on the confusion matrix [2]: Overall Accuracy (OA) and the Kappa coefficient. Columns in a confusion matrix typically represent the reference data and rows represent the classification data. Overall Accuracy is simply the sum of the pixels classified correctly (i.e., the diagonal elements) divided by the total number of samples in the comparison. The Kappa coefficient can be defined in terms of the confusion matrix as follows:
$$\kappa = \frac{N \sum_{k=1}^{r} x_{kk} - \sum_{k=1}^{r} x_{k+} x_{+k}}{N^{2} - \sum_{k=1}^{r} x_{k+} x_{+k}}$$
where r is the number of rows in the matrix, $x_{kk}$ is the number of observations in row k and column k, $x_{k+}$ and $x_{+k}$ are the marginal totals for row k and column k, respectively, and N is the total number of observations.
Tables 2 and 3 list the results of the comparisons between the ground truth data and the classified images obtained by the four clustering algorithms: K-means, ISODATA, FCM, and FCSA. Because FCSA is evolutionary, and therefore nondeterministic, two runs are unlikely to give identical results; the experiment described above was therefore performed 10 times and the results were averaged in the tables. From Tables 2 and 3, it is apparent that the FCSA classifier provides better classification results than the other classifiers. The details are as follows. The four classifiers have similar results for the corn class, for which the difference is within 10 pixels. Consistent with the visual classification results, the K-means and ISODATA algorithms have the lowest classification accuracy since they cannot correctly partition the image. FCSA achieves a better classification result for the wheat and red clover classes than FCM does, while FCM slightly exceeds FCSA in the oats class, by 4 pixels. As a whole, FCSA exhibits the best overall classification accuracy of 92.08%, with a gain of 17.3%, 17.3%, and 4.99% over the K-means, ISODATA, and FCM algorithms, respectively. FCSA improves the Kappa coefficient from 0.6463 to 0.8912, an improvement of 0.2449. One reason for this is that conventional clustering algorithms such as K-means and FCM often become stuck at suboptimal solutions determined by the initial configuration of their systems and have low precision. Unlike traditional clustering algorithms, FCSA, inspired by immune systems and based on the clonal selection algorithm, is a data-driven, self-adaptive method that can adjust itself to the data without any explicit specification of functional or distributional form for the underlying model. FCSA extends the search space through the cloning process and quickly finds the optimal solution through the mutation steps. Therefore, FCSA can generate optimal clustering results and is flexible in modeling real, complex relationships, an important advantage when adapting to the complex distributions in land cover classification.
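The two accuracy statistics used in this experiment, Overall Accuracy and the Kappa coefficient, can be computed directly from a confusion matrix. The sketch below follows the usual definitions (diagonal sum over total, and chance-corrected agreement from the row and column marginals); the function names are hypothetical, and the 2 × 2 matrix in the usage note is made up for illustration, not taken from Tables 2 or 3.

```python
def overall_accuracy(matrix):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    return correct / total


def kappa(matrix):
    """Kappa coefficient of a square confusion matrix.

    Rows are classification results, columns are reference data.
    """
    r = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[k][k] for k in range(r))
    row_totals = [sum(matrix[k]) for k in range(r)]
    col_totals = [sum(matrix[i][k] for i in range(r)) for k in range(r)]
    # Sum over classes of (row marginal * column marginal).
    chance = sum(row_totals[k] * col_totals[k] for k in range(r))
    return (total * diagonal - chance) / (total ** 2 - chance)
```

For the illustrative matrix `[[45, 5], [10, 40]]`, 85 of 100 samples lie on the diagonal, so the Overall Accuracy is 0.85; the marginal products sum to 50 × 55 + 50 × 45 = 5000, giving a Kappa of (8500 - 5000) / (10000 - 5000) = 0.7.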
In addition, conventional clustering algorithms require ideal conditions, for example, a priori assumptions on the number of clusters, and are sensitive to the initial clustering centers. However, because of the complexity of ground substances and the diversity of disturbances, ideal conditions are seldom met in real classification calculations. When the number of clusters is incorrectly defined by users, traditional clustering algorithms find it difficult to obtain satisfactory classification results. Through its two-layer system, FCSA can adaptively find the optimal number of clusters, which makes it appropriate for different real, complex conditions; this is another important advantage when adapting to the complex distributions in land cover classification. Therefore, FCSA has the capacity for self-learning and is robust. Based on the above, we can conclude that FCSA is a better clustering algorithm for land cover classification.

Experiment 2: Wuhan TM
The image data used in this experiment cover the city area of Wuhan in central China. The image (400 × 400 pixels), with a spatial resolution of 30 meters, was acquired by Landsat-5 on October 26, 1998. The dataset is composed of six spectral bands with spectral ranges from 0.45 to 2.35 μm. Figure 9 shows the variation of the XB index with the number of clusters when FCSA is used. As can be seen, the minimum value of the XB index is obtained for five clusters, so FCSA automatically yields five clusters as the optimal number. The result is consistent with the real class distribution of land cover based on the ground information available to us. As shown in Figure 8, characteristic regions in the image include the well-known Yangtze River cutting across the middle of the image and the city of Wuhan on both sides of the river. Two parallel lines observed in the middle of the image are the First and Second Bridges over the Yangtze River in Wuhan. The red pixels depict the vegetation classes according to the principles of the standard false color composite. The lakes are found on the right side of the image. The white pixels are known to be roads or open spaces according to visual interpretation experience. Apart from these, there are several water bodies, bare soils, and so forth in the image. Based on the above information, the image is expected to fall into five classes: river, vegetation, lake, building, and road.
It can be noted ... three traditional clustering algorithms. However, FCSA achieves the best visual accuracy in the vegetation and building classes and also performs satisfactorily for the other classes. Table 5 lists the results of the comparisons between the ground truth data and the classified images obtained by the four clustering algorithms: K-means, ISODATA, FCM, and FCSA. From Table 5, it is apparent that FCSA produces better classification results than the other clustering algorithms. The details are as follows. K-means and ISODATA have the lowest accuracy because the lake class disappears in their classification images. This is further evidence that they are sensitive to initialization. FCSA exhibits the best overall classification accuracy of 80.43%, that is, the best percentage of correctly classified pixels among all the tested pixels, with a gain of 39.3%, 39.3%, and 6.16% over K-means, ISODATA, and FCM, respectively. FCSA improves the Kappa coefficient from 0.2811 to 0.7346, an improvement of 0.4535. These results show that FCSA is a very competent clustering algorithm, which makes it promising for land cover classification.

Conclusions
A fuzzy clustering algorithm based on clonal selection for land cover classification, namely, FCSA, was proposed in this paper. Traditional clustering algorithms, such as fuzzy c-means (FCM), require the a priori specification of the number of clusters and easily fall into a local optimum. The proposed algorithm tackles these problems of FCM by using the clonal selection algorithm to provide near-optimal solutions without a priori assumptions about the number of clusters. For this purpose, FCSA is designed as a two-layer system: the classification layer and the optimization layer. In the classification layer, FCSA finds the optimal fuzzy partition for a fixed number of classes by means of the immune operators of the clonal selection algorithm (clone, selection, mutation, and so forth), while in the optimization layer FCSA uses the Xie-Beni index as a measure of the validity of the corresponding partition to find the optimal number of classes.
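The role of the optimization layer can be illustrated with a minimal sketch. It assumes the common form of the Xie-Beni index (fuzzy compactness divided by N times the minimum squared distance between cluster centers) with fuzzifier m = 2, and it substitutes hand-picked cluster centers with the standard FCM membership formula for the clonal selection search of the classification layer. All function names and the toy data are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuzzy_memberships(pixels, centers, m=2.0):
    """u[i][k]: membership of pixel k in cluster i (standard FCM formula)."""
    u = [[0.0] * len(pixels) for _ in centers]
    for k, x in enumerate(pixels):
        d = [sq_dist(x, v) for v in centers]
        if 0.0 in d:  # pixel coincides with a center: crisp membership
            u[d.index(0.0)][k] = 1.0
            continue
        for i in range(len(centers)):
            # d holds squared distances, so the exponent is 1 / (m - 1).
            u[i][k] = 1.0 / sum((d[i] / dj) ** (1.0 / (m - 1.0)) for dj in d)
    return u


def xie_beni(pixels, centers, u, m=2.0):
    """Xie-Beni validity index: lower values indicate a better partition."""
    compactness = sum(u[i][k] ** m * sq_dist(x, v)
                      for i, v in enumerate(centers)
                      for k, x in enumerate(pixels))
    separation = min(sq_dist(centers[i], centers[j])
                     for i in range(len(centers))
                     for j in range(i + 1, len(centers)))
    return compactness / (len(pixels) * separation)


def best_cluster_number(pixels, candidates, m=2.0):
    """Score one candidate set of centers per cluster number; return the XB minimizer."""
    scores = {}
    for nc, centers in candidates.items():
        u = fuzzy_memberships(pixels, centers, m)
        scores[nc] = xie_beni(pixels, centers, u, m)
    return min(scores, key=scores.get), scores


# Hypothetical toy data: two well-separated groups of two-band "pixels".
pixels = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
# Hand-picked centers stand in for the output of the classification layer.
candidates = {
    2: [(1 / 3, 1 / 3), (31 / 3, 31 / 3)],
    3: [(0, 0.5), (1, 0), (31 / 3, 31 / 3)],
}
best_nc, scores = best_cluster_number(pixels, candidates)
print(best_nc, scores)
```

In this toy case the three-cluster candidate splits one compact group, which shrinks the minimum center separation and inflates the index, so the two-cluster partition is preferred.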
Two experiments were carried out to test the performance of FCSA using Flightline C1 and TM remote sensing images. Compared with three traditional clustering algorithms, K-means, ISODATA, and FCM, only FCSA can adaptively find the optimal number of clusters, and FCSA consistently performed better at that optimal number of clusters. Because K-means and ISODATA cannot correctly partition the image (one class often disappears and their classification results contain a significant amount of confusion), their average Overall Accuracy (OA) and Kappa coefficient are the worst, 57.96% and 0.4637, respectively. FCM improves the average OA and Kappa coefficient to 80.68% and 0.7466, respectively. The best classification result is provided by FCSA, with an average OA and Kappa coefficient of 86.26% and 0.8129, respectively. These results show that FCSA is applicable to land cover classification and achieves high classification accuracy. In future work, we will analyze the sensitivity of the proposed algorithm to its parameters, for example, population size, in order to improve the classification performance, and may test FCSA on high-dimensional datasets, such as hyperspectral remote sensing imagery.

Figure 2: The process of the fuzzy clustering algorithm based on clonal selection (FCSA).

Figure 4: Variation of the XB index with the number of classes for the FLC1 image using FCSA.
Figures 7(a), 7(b), 7(c), and 7(d) illustrate the classification results using K-means, ISODATA, FCM, and FCSA, respectively. The visual comparisons of the four clustering results in Figure 7 show varying degrees of accuracy in pixel assignment. It can be seen from the classification images that the four classifiers have similar classification results in the corn class. For the other classes, K-means

Figure 7: The classification results with four clusters for the Flightline C1 image: (a) K-means, (b) ISODATA, (c) FCM, (d) FCSA. Legend: background, corn, oats, red clover, wheat.
Figure 8 shows the standard false color composite image of Wuhan TM using bands 4, 3, and 2. The parameter values are set as M = 20, MaxIte = 100, n M = 20, and d = 5. Unlike experiment 1, C_max is set to 7 according to the distribution of land cover classes in the image. The parameters in the other traditional algorithms are the same as in experiment 1.

Figure 10: The spectral curves of five land cover classes.

(i) AB denotes the set of antibodies and ab represents a single antibody, where $AB = \{ab_1, ab_2, \ldots, ab_M\}$ and $M$ is the size of the antibody population. Each antibody $ab_i$ ($i = 1, 2, \ldots, M$) represents a possible solution of the cluster result; the number of features of an antibody is $p \times nc$, so $AB \subset \mathbb{R}^{p \times nc}$.
(ii) AG denotes the set of antigens, which represent unlabeled data or image pixels. $AG = \{ag_1, ag_2, \ldots, ag_N\}$, where $N$ is the size of the antigen population, $ag_i = (ag_i^1, ag_i^2, \ldots, ag_i^p)$, and $p$ is the dimension of features. For land cover classification, $N$ represents the total number of unlabeled remote sensing image pixels and $p$ the number of bands for each pixel of the image. $AG \subset \mathbb{R}^p$.

Table 1: List of classes and number of labeled samples in each class for experiment 1.

Table 2: Comparison of four clustering algorithms in experiment 1.

Table 3: Comparison of four clustering algorithm performances with four classes in experiment 1.

Table 4: List of classes and number of labeled samples in each class for experiment 2.

Table 5: Comparison of four clustering algorithm performances with five classes in experiment 2.