
Within the maximum weighted likelihood (MWL) learning framework proposed by Cheung (2004, 2005), this paper develops a batch Rival Penalized Expectation-Maximization (RPEM) algorithm for density mixture clustering, assuming that all observations are available before learning begins. Unlike the adaptive RPEM algorithm of Cheung (2004, 2005), the batch RPEM requires no learning rate, as is also the case for the Expectation-Maximization (EM) algorithm (Dempster et al., 1977), yet it preserves the capability of automatic model selection. Furthermore, the batch RPEM generally converges faster than both the EM and the adaptive RPEM. Experiments on synthetic data and color image segmentation show the superior performance of the proposed algorithm.

As a typical statistical technique, clustering analysis has been widely applied to a variety of scientific areas such as data mining [

In general, the Expectation-Maximization (EM) algorithm [

In the papers [

In this paper, we further study the MWL learning framework and develop a batch RPEM algorithm accordingly, assuming that all observations are available before learning begins. Unlike the adaptive RPEM, the batch version requires no learning rate, as is also the case for the EM, yet it preserves the capability of automatic model selection. Furthermore, the batch RPEM generally converges faster than both the EM and the adaptive RPEM. Experiments on synthetic data and color image segmentation show the superior performance of the proposed algorithm.

The remainder of this paper is organized as follows. Section

Suppose that an input

In the MWL learning framework [

Suppose that a set of

Subsequently, under a specific weight design, the papers [

initialize the parameter

as

and (

with

where

is converged.

To estimate the parameter set within the MWL framework, we have to maximize the empirical WL function

Through optimizing (

initialize the parameter

all

where

is converged.

In the above batch RPEM, its capability of automatic model selection is controlled by the weight functions

To deal with how to assign an appropriate value of

Furthermore, our empirical studies have found that a smaller

(a) Synthetic data set 1 with the well-separated clusters, and (b) synthetic data set 2 with the clusters overlapped considerably.

For each data set, we conducted three experiments by setting

Performance of the Batch RPEM over the Parameter

Parameter value | Data set 1 | Data set 2 | Data set 1 | Data set 2 | Data set 1 | Data set 2 |
−0.9 | G | G | G | G | G | G |
−0.8 | G | G | G | G | G | G |
−0.7 | G | G | G | G | G | G |
−0.6 | G | G | G | G | G | G |
−0.5 | G | P | G | P | G | G |
−0.4 | G | P | G | P | P | P |
−0.3 | G | P | G | P | P | P |
−0.2 | G | P | G | P | P | P |
−0.1 | G | P | G | P | P | P |

The performance of the batch RPEM as

Nevertheless, when

The converged positions of the seed points as

In addition, we also investigated the assignment of

The converged positions of the seed points learned via the batch RPEM as

To evaluate the performance of the batch RPEM algorithm, we have conducted the following three experiments.

This experiment was to evaluate the convergence speed of the batch RPEM. We utilized

(a) The initial positions of the three seed points and their converged positions learned by (b) EM, (c) adaptive RPEM, and (d) batch RPEM, respectively.
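For readers who want a concrete reference point for the EM baseline in this comparison, the following is a minimal sketch of the standard batch EM algorithm for a Gaussian mixture with full covariance matrices. It is not the batch RPEM and not the authors' implementation. The function name `em_gmm`, the identity initial covariances, the uniform initial mixing proportions, the stopping tolerance, and the small ridge added to each covariance are all assumptions made for illustration.

```python
import numpy as np


def em_gmm(x, init_means, n_iter=200, tol=1e-6):
    """Standard batch EM for a Gaussian mixture (illustrative baseline only).

    x          : (n, d) array of observations
    init_means : (k, d) array of initial seed points (assumed given)
    Returns mixing proportions, means, and covariance matrices.
    """
    n, d = x.shape
    k = init_means.shape[0]
    means = np.array(init_means, dtype=float)
    covs = np.stack([np.eye(d) for _ in range(k)])  # assumed initialisation
    alphas = np.full(k, 1.0 / k)                    # assumed initialisation
    prev_ll = -np.inf

    for _ in range(n_iter):
        # E-step: log of alpha_j * N(x_t | m_j, Sigma_j) for every t and j.
        log_p = np.empty((n, k))
        for j in range(k):
            diff = x - means[j]
            _, logdet = np.linalg.slogdet(covs[j])
            maha = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(covs[j]), diff)
            log_p[:, j] = np.log(alphas[j]) - 0.5 * (
                d * np.log(2.0 * np.pi) + logdet + maha
            )
        log_norm = np.logaddexp.reduce(log_p, axis=1)
        post = np.exp(log_p - log_norm[:, None])  # posterior probabilities

        # M-step: closed-form updates, so no learning rate is involved.
        nk = post.sum(axis=0)
        alphas = nk / n
        means = (post.T @ x) / nk[:, None]
        for j in range(k):
            diff = x - means[j]
            covs[j] = (post[:, j, None] * diff).T @ diff / nk[j]
            covs[j] += 1e-6 * np.eye(d)  # small ridge for numerical stability

        # Stop when the log-likelihood no longer changes appreciably.
        ll = log_norm.sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll

    return alphas, means, covs
```

Calling `em_gmm(x, init_means)` with an `(n, d)` data array and a `(k, d)` array of seed points returns the estimated mixing proportions, means, and covariances. Note that this plain EM keeps all `k` components active, so `k` must be chosen in advance.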

Nevertheless, as shown in Figures

Learning curves of

The value of the cost function

This experiment investigated the performance of the batch RPEM as

The converged positions of the seed points learned by the batch RPEM.

This experiment further investigated the batch RPEM algorithm on color image segmentation in comparison to the EM algorithm. We implemented the image segmentation in the red-green-blue (RGB) color space model that represents each pixel in an image by a three-color vector. We conducted color image segmentation on a

Segmentation of the hand image: (a) original image, (b) the result given by the EM, and (c) the result given by the batch RPEM.
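The RGB representation used in this experiment, in which each pixel is a three-color vector, can be sketched as below. This is an illustrative data-preparation step only, assuming the image is already loaded as a height-by-width-by-3 NumPy array. The helper names `image_to_rgb_vectors` and `labels_to_segmentation` are hypothetical and do not come from the paper.

```python
import numpy as np


def image_to_rgb_vectors(image):
    """Flatten an (H, W, 3) RGB image into an (H*W, 3) array of pixel vectors."""
    image = np.asarray(image, dtype=float)
    if image.ndim != 3 or image.shape[2] != 3:
        raise ValueError("expected an array of shape (height, width, 3)")
    return image.reshape(-1, 3)


def labels_to_segmentation(labels, means, height_width):
    """Paint each pixel with the mean color of the component it is assigned to.

    labels       : (H*W,) integer component index per pixel
    means        : (k, 3) component mean colors
    height_width : (H, W) of the original image
    """
    height, width = height_width
    return np.asarray(means)[np.asarray(labels)].reshape(height, width, 3)
```

The flattened pixel vectors can be passed to any mixture clustering routine. Assigning each pixel to its most probable component and calling `labels_to_segmentation` then gives a segmented image for visual comparison.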

The original house image.

For the house image, we initially set the number of seed points to 80. A snapshot of the converged segmentation results of the EM and the batch RPEM is shown in Figure

Segmentation of the house image by (a) EM; (b) batch RPEM.

In this paper, we have developed a batch RPEM algorithm based on the MWL learning framework for Gaussian mixture clustering. Unlike the adaptive RPEM, the new algorithm does not require a learning rate to be selected. As a result, it generally learns faster while preserving the same capability of automatic model selection as the adaptive version. We have evaluated the proposed batch RPEM algorithm on both synthetic data and color image segmentation, and the numerical results have shown its efficacy.

This work was jointly supported by grants from the Research Grant Council of the Hong Kong SAR (Project Code: HKBU 210309), the Natural Science Foundation of China (60974077), the Natural Science Foundation of Guangdong Province (s2011010005075), Guangzhou Technology Projects (11c42110781), Grants 60403011 and 60973154 from the NSFC, and NCET-07-0338 from the Ministry of Education, China. This work was also partially supported by the Fundamental Research Funds for the Central Universities (HUST:2010ZD025) and the Hubei Provincial Science Foundation under Grant 2010CDA006, China.