For unsupervised color image segmentation, we propose a two-stage algorithm, KmsGC, that combines
Unsupervised color image segmentation plays a key role in various image processing and computer vision applications, such as medical image analysis [
Apart from these approaches, graph-cut-based techniques have emerged as increasingly useful tools for energy minimization, producing provably high-quality solutions in practice, and in recent years they have been applied successfully to a wide range of computer vision problems. In the image segmentation context, since Boykov and Jolly first demonstrated how to use binary graph cuts to build efficient object extraction tools for image segmentation in [
In this work, we combine the advantages of clustering and graph cut optimization to propose a new unsupervised color image segmentation solution. We solve the color image segmentation problem via multiway graph cuts and establish a compactness criterion to obtain an unsupervised, fast, and effective segmentation.
The rest of the paper is organized as follows. In Section
An image can be defined by a pair
We distinguish between two special vertices of
In [
For the energy minimization problem, some methods offer optimality guarantees in certain cases. For example, dynamic programming can minimize a few one-dimensional energy functions, such as snakes. Mean field annealing can deduce the minimum of the energy by estimating the partition function, but computing the partition function is computationally intractable. Graduated nonconvexity is a kind of continuation method, but the quality of the output may not be known except for certain cases (see [
(a) A graph
Clustering is a process of classifying data items into similar groupings.
Determining the optimal number of clusters is one of the most difficult problems for clustering-based image segmentation methods. Most methods cast this problem into the model selection [
In this part, we propose a new unsupervised color image segmentation solution. As the number of clusters is necessary for the
We know that “minimum intracluster variance and maximum intercluster distance” lead to compact clusters [
First, we want to find compact clustering with “minimum intracluster variance.” An intracluster distance measure is defined as the average distance between the data items and their cluster centers within clusters, and we want it to be as small as possible. It can be defined as
Next, we use an intercluster distance measure, the distance between clusters, to describe compact clustering with “maximum intercluster distance,” and we want it to be as large as possible. We calculate it as the distance between cluster centers and take the minimum of this value, defined as
Since the minimum value of intrameasure and the maximum value of intermeasure lead to compact clustering, we combine them and define a compactness criterion as
To obtain compact clusters, we obviously want to minimize the
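The intra measure, the inter measure, and their combination can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: distances are taken as squared Euclidean distances, and the compactness criterion is taken to be the ratio of the intra measure to the inter measure, so that smaller values indicate more compact clustering. The function name and the array layout are hypothetical and are not taken from the original implementation.

```python
import numpy as np

def compactness_terms(points, labels, centers):
    """Return (intra, inter, intra / inter) for a given clustering.

    points  : (N, D) array of data items (e.g. pixel colors)
    labels  : (N,) integer array, cluster index of each item
    centers : (K, D) array of cluster centers, K >= 2
    Assumption: squared Euclidean distance and a ratio-type criterion.
    """
    points = np.asarray(points, dtype=float)
    centers = np.asarray(centers, dtype=float)
    labels = np.asarray(labels)

    # Intra measure: average squared distance of each item to its own center.
    intra = np.sum((points - centers[labels]) ** 2) / len(points)

    # Inter measure: smallest squared distance between two distinct centers.
    gaps = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    inter = gaps[np.triu_indices(len(centers), k=1)].min()

    return intra, inter, intra / inter
```

For two tight clusters that lie far apart, the intra measure is small and the inter measure is large, so the ratio is close to zero; splitting a natural cluster shrinks the inter measure and pushes the ratio up.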
Based on the work above, we present a new unsupervised color image segmentation approach (referred to as KmsGC) which consists of two stages. In the first stage, we give the image pixels initial partitional clustering by
In the first stage, we do clustering with
The inner cluster variance is defined as
In the second stage, we address the construction of energy function in our application as follows. We firstly deal with the data penalty term, the first term in (
Secondly, we describe the discontinuity penalty term, the second term in (
We set the spatially varying smoothness cost as the filtering of the image according to each RGB channel and take the maximum of the three values as the final value.
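To make the two penalty terms concrete, the following Python sketch evaluates an energy of this form for a given labeling. Several choices here are assumptions made only for illustration and are not stated in the text: the per-channel filter is a simple absolute finite difference, the filter response g is mapped to a neighbor weight by exp(-g), the data penalty is the squared color distance to the cluster center, and the discontinuity penalty is a Potts-type term on 4-connected neighbors. Only the step of filtering each RGB channel and taking the maximum of the three responses follows the description above. All function names are hypothetical.

```python
import numpy as np

def smoothness_weights(image):
    """Spatially varying weights for horizontal and vertical neighbor pairs.

    image : (H, W, 3) float RGB array.
    Each channel is filtered separately (assumed: absolute finite
    difference) and the maximum over the three channels is kept.
    """
    gx = np.abs(np.diff(image, axis=1)).max(axis=2)  # (H, W-1)
    gy = np.abs(np.diff(image, axis=0)).max(axis=2)  # (H-1, W)
    # Assumed mapping: strong color edges make a label change cheap.
    return np.exp(-gx), np.exp(-gy)

def energy(image, labels, centers):
    """Data penalty plus weighted Potts discontinuity penalty.

    labels  : (H, W) integer label map
    centers : (K, 3) cluster centers in RGB
    """
    # Data term (assumed): squared distance of each pixel to its center.
    data = ((image - centers[labels]) ** 2).sum()

    # Discontinuity term: pay the local weight wherever neighbors disagree.
    wx, wy = smoothness_weights(image)
    cut_x = labels[:, 1:] != labels[:, :-1]
    cut_y = labels[1:, :] != labels[:-1, :]
    smooth = (wx * cut_x).sum() + (wy * cut_y).sum()

    return data + smooth
```

On an image with two flat color regions, a labeling that follows the color edge pays only the small edge weights, whereas assigning every pixel to one cluster pays a large data penalty.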
For energy function minimization, global minimization incurs enormous computational cost, so a local minimum becomes a desirable option. The standard moves technique, which changes one pixel's label at a time, is often used to compute a local minimum. Many methods use standard moves, such as iterated conditional modes (ICM) [
In summary, the proposed KmsGC algorithm can be described as in Algorithm
In order to evaluate the performance of the proposed algorithm, we design a number of experiments. First, the accuracy of determining the cluster number is tested and compared. Second, the qualitative performance is assessed by comparing the segmented results with human segmentations and other approaches. Third, the segmented results are compared against other approaches by means of effective indices to judge the quantitative performance of the proposal.
Experiments are performed with a MATLAB implementation of our algorithm on a typical 2 GHz Intel PC with 2 GB RAM, using spatial point sets, standard images, and real color images.
The computation time depends on the size of the image, the value of
We first test the accuracy of the proposal on cluster number determination. The proposal was tested on various spatial point sets, because their ideal cluster numbers are known beforehand, and on standard images, because these images are simple and their segmentation results are straightforward. In the experiments, we found that the minimum value of the compactness criterion occurred at the ideal cluster number for each spatial point set, which verifies that the optimal cluster number can be determined from the minimum value of the compactness criterion. Here, we give the results for a point set and a standard image to show how the algorithm works. Figure
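The point-set experiment can be mimicked with a small Python sketch that clusters a toy point set for each candidate cluster number and keeps the number with the smallest criterion value. The sketch rests on assumptions that are not confirmed by the text: the clustering step is a plain K-means-style loop with a deterministic farthest-point initialization, the criterion is the ratio of the average squared intra-cluster distance to the minimum squared distance between centers, and the toy data (three small blobs) is invented for illustration. All names are hypothetical.

```python
import numpy as np

def kmeans(points, k, iters=50):
    """Plain Lloyd iterations with farthest-point initialization (assumed)."""
    centers = [points[0]]
    for _ in range(1, k):
        d2 = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d2)])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new = np.array([points[labels == j].mean(axis=0)
                        if np.any(labels == j) else centers[j]
                        for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1), centers

def criterion(points, labels, centers):
    """Assumed compactness criterion: intra measure divided by inter measure."""
    intra = ((points - centers[labels]) ** 2).sum() / len(points)
    gaps = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    inter = gaps[np.triu_indices(len(centers), k=1)].min()
    return intra / inter

def select_cluster_number(points, k_max):
    """Return the cluster number in 2..k_max with the smallest criterion."""
    scores = {}
    for k in range(2, k_max + 1):
        labels, centers = kmeans(points, k)
        scores[k] = criterion(points, labels, centers)
    return min(scores, key=scores.get), scores

# Toy point set: three small, well-separated blobs (illustrative data only).
offsets = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5), (0.0, 0.0)]
blobs = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
toy_points = np.array([[cx + ox, cy + oy] for cx, cy in blobs for ox, oy in offsets])
best_k, scores = select_cluster_number(toy_points, k_max=5)
```

With too few clusters the intra measure grows because distinct blobs are merged, and with too many clusters the inter measure collapses because one blob is split, so the criterion is smallest at the natural number of blobs in this toy case.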
(a) A point set and (b) its optimal cluster number determination. In (b),
More than ten standard color images are also used to test the performance of cluster number determination. In these tests, we set
(a) An original image, (b) segmented image, and (c) the cluster number determination. In (c),
Below we compare our unsupervised solution with the popular regularization criterion MDL. There are many MDL criterion based segmentation algorithms, such as [
For the proposed KmsGC, we use a compactness criterion and the energy function to reach compact clustering, which is consistent with the basis of human segmentation. The MDL-based algorithm has been used in many unsupervised segmentation solutions and it is objective, but it is experimentally difficult to determine the optimal stopping time of the merging, so it may produce oversegmentation or undersegmentation. For example, in the right column of the second row of Figure
MDL-based algorithms usually merge the two adjacent segments whose merging decreases the coding length the most, repeating until no decrease happens, like [
Segmentation results compared with human segmentations. Each row, from left to right: original image, two human segmentations, and our segmentation.
Regarding detailed preservation, MDL based segmentation sometimes cannot preserve details. For example, in the left picture of the second row in Figure
For the MDL algorithm, if the boundary constraints are not incorporated into the MDL criterion, it may be hard to reach correct and smooth boundaries. Compare our boundary with the results in Figure
In this section, we visually evaluate the qualitative performance of the segmentation results on the Berkeley image database. This database provides several human manually segmented results for each image and here we employ two of them as the perceptual evaluation references per image. We set the evaluation criterion as: the more similar to these human segmentations, the better the segmented results. We test the proposal on all the 300 images (image size
(a), (b), (c), and (d) are the cluster number determination of the images (a), (b), (c), and (d) of Figure
From Figure
In the bear image, the contour of the segmented bear is highly consistent with the first human segment; the segmented boundary of the bear and the bands is smooth. Important details are detected, such as the ears and nose of the bear.
In the 2nd image, the church is accurately segmented and the boundary is smooth.
In the third image, the segmented region contours of the stone and the tree are close to those of the human’s, and the boundary is smooth. The segmented regions are of large size while the details are preserved.
In the flower image, note that in the second human segmentation the leaves are segmented while they are not in the first human’s. In our result, the petals and the leaves all have been segmented, and the contour of the flower core is correct. The contours of the petals are generally correct, and the leaves are in large segments.
In the woman image, although the illumination variation exists, the woman’s face, neck, and hair are accurately segmented and also the forehead is well segmented, which indicates that the number of clusters is accurate.
In the 6th image, objects such as the man, the fishes, and the sea bottom are well segmented, and the number and size of the segmented regions are almost equal to those of the human segments, which makes them easy to recognize automatically.
Observing the result of the 7th image, the distant objects like the tree and hills are accurately segmented where the contour of the tree highly matches those of the human segments, and the boundaries are smooth, which can be a substitute for the human segmentation. In the near area, the cow and the haystack are well segmented in general.
Figure
We set the upper limit cluster number
The segmented results are very close to the human segmentations, which suggests that the proposed cluster number determination is effective.
We now visually compare the proposal with two clustering-based unsupervised segmentation algorithms: MS [
We first compare with the MS algorithm. Observing the experimental results of MS in [
Compare
Examples of the segmentation results. 1st and 3rd columns are the original images; 2nd and 4th columns are the segmented images.
Second, we compare with the CTM algorithm. Apart from the analysis in the introduction, the CTM algorithm solves the unsupervised problem by minimizing the lossy description length of the feature vectors, so it inherits the behavior of MDL. Beyond the discussion of MDL-based segmentation algorithms above, we notice that the experiments show the CTM results to be sensitive to the parameters. Based on the segmentation results provided in [
We now compare the quantitative performance of the proposed algorithm against MS and CTM. The comparison is based on four quantitative performance measures which have been used in many works as follows [
In the experiments, we set all parameters as in [
Table
Quantitative evaluation result of algorithms on the four indices (the best values for each index are in bold letters). PRI ranges between
Approach  PRI  VoI  GCE  BDE
Humans  0.8754  1.1040  0.0797  4.994
KmsGC  [...]  2.5616  0.2932  [...]
CTM  0.7561  2.4640  0.1767  9.4211
CTM  0.7627  2.2035  0.1846  9.4902
CTM  0.7617  [...]  [...]  9.8962
MS  0.7550  2.4770  0.2598  9.7001
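Two of the four indices can be illustrated with a short Python sketch. PRI is built on the Rand index, the fraction of pixel pairs on which two segmentations agree, averaged over the available human segmentations; VoI is the sum of the two conditional entropies between the segmentations. The sketch below computes the plain Rand index and the VoI for a single pair of label maps. It is a minimal illustration with hypothetical function names, not the evaluation code behind the table; the natural logarithm is an assumption, and the averaging over several human segmentations is left out.

```python
import numpy as np

def contingency(a, b):
    """Count how many pixels carry label i in a and label j in b."""
    _, ai = np.unique(np.asarray(a).ravel(), return_inverse=True)
    _, bi = np.unique(np.asarray(b).ravel(), return_inverse=True)
    table = np.zeros((ai.max() + 1, bi.max() + 1), dtype=np.int64)
    np.add.at(table, (ai, bi), 1)
    return table

def _pairs(counts):
    """Number of unordered pixel pairs inside each group, summed."""
    return int((counts * (counts - 1) // 2).sum())

def rand_index(a, b):
    """Fraction of pixel pairs treated consistently by both segmentations."""
    t = contingency(a, b)
    n = int(t.sum())
    total = n * (n - 1) // 2
    same_both = _pairs(t)
    same_a = _pairs(t.sum(axis=1))
    same_b = _pairs(t.sum(axis=0))
    # Agreements = pairs joined in both + pairs separated in both.
    return (total + 2 * same_both - same_a - same_b) / total

def _entropy(p):
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def variation_of_information(a, b):
    """VoI = H(A|B) + H(B|A) = 2 H(A,B) - H(A) - H(B), in nats."""
    p = contingency(a, b) / np.asarray(a).size
    return 2 * _entropy(p.ravel()) - _entropy(p.sum(axis=1)) - _entropy(p.sum(axis=0))
```

Identical segmentations give a Rand index of 1 and a VoI of 0, which matches the reading that higher PRI and lower VoI indicate closer agreement with the human segmentations.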
It is important to notice that the quantitative evaluation result also indicates that none of the algorithms is a clear winner in terms of all four indices. Notice that CTM algorithm shows different performance while the parameter
We now examine the quantitative evaluation results and relate them to the visual evaluation. According to the definitions of the indices, the PRI index measures the correctness of the pixel labeling, so a higher PRI value means more accurate segmentation results. The VoI index measures the difference between two segmentations based on the conditional entropy of the information. Benefiting from optimization of an information-theoretic criterion, the CTM algorithm achieves the best performance on the VoI index. The BDE index measures the average displacement error of the boundary pixels. The proposal optimizes the boundary through an energy function, so the segmented boundaries match those of the human segmentations closely, which yields the best BDE value. As noticed in [
We first discuss the conceptual advantages of the proposed method. The proposed KmsGC is a two-stage solution which combines
For the unsupervised solution, a compactness criterion is constructed based on clustering principle to determine the optimal number of clusters, which is straightforward and ensures compact clustering. And
In general, the proposed model successfully incorporates
Comparing our segmentation results with the segmentation based on
A variety of experimental results demonstrate that the proposed algorithm KmsGC is effective and able to produce desired image segmentation results. It is important to notice that we do not have any complicated parameter to tune, and the algorithm is straightforward and unsupervised.
However, some disadvantages exist in the proposal:
Oversegmentation is an intrinsic problem of clustering-based segmentation. The technique of region merging has been proposed to solve this problem. In [
In this work, the human segmentations are assumed as the best perceptual references. To reach better performance, the segmentation result of the proposal has to be tuned to be closer to the human segmentation. So it is necessary to learn the patterns of human segmentation’s behavior and then improve the proposal to get results in a more similar way to the humans. These are some of the challenging problems to be solved in the future.
The authors declare that there is no conflict of interests regarding the publication of this paper.
The authors would like to thank the editors and the reviewers for their valuable reviews and comments. They would also like to deeply thank Dr. Dongquan Liu and Paul Liu for their great contributions to the improvement of this paper.