Automatic Segmentation of Medical Images Using Fuzzy c-Means and the Genetic Algorithm

Magnetic resonance imaging (MRI) segmentation is a complex problem. This paper proposes a new method for estimating the correct number of segments and automatically segmenting normal and abnormal human MR brain images. The purpose of determining the segments automatically is to find the number of regions into which an image should be divided according to its entropy; correctly determining the number of segments also increases the precision of segmentation. Because guessing the number of image segments and the segment centers automatically requires the algorithm to test many states, we use a combination of the genetic algorithm and the fuzzy c-means (FCM) method to solve this problem with high accuracy. In this method, the FCM method is adapted to serve as the fitness function of the genetic algorithm so that image segmentation is performed more accurately. Our experiments show that the proposed method significantly improves the accuracy of image segmentation in comparison to similar methods.


Introduction
Image segmentation is one of the difficult problems in the field of image processing. It is the process of assigning a label to every pixel in an image so that pixels with the same label share certain visual characteristics. Many applications, such as object identification, feature extraction, object position identification, and classification, require accurate image segmentation. Several methods of medical image segmentation have been proposed, such as edge-based methods, region-based methods, or a combination of both. The purpose of medical image segmentation is to provide a more meaningful image that can be more easily understood and analyzed.
The edge-based methods use edge information in an image to determine the boundaries of objects and, hence, to form closed regions that determine different objects in an image. Some image segmentation methods have consistently used this region-edge information for segmenting magnetic resonance (MR) images.
Chun and Yang performed image segmentation according to the edge information [1]. In addition to edge information, they made use of a similarity measure obtained from the median pixel variance parameters. The method used a fuzzy validity function as well as genetic algorithms and tried to find a limited and suitable search space for image segmentation. Finding the main edges and removing redundant edges were the main issues in this method. Moreover, atlas-based segmentation methods have been successfully employed for different applications. For instance, Heckemann segmented 67 brain images using 29 marked images [2].
On the other hand, k-means and fuzzy c-means (FCM) are two successful region-based approaches. FCM can be obtained by a small modification of the k-means algorithm, and it has been successfully used for MR image segmentation.
Neural network methods need a large amount of training data and a long training time. For many experts, manual image segmentation is difficult and time consuming, which motivates the use of automatic methods.
Among the image segmentation methods, FCM is one of the more popular. Its wide use is due to its simplicity and accuracy. However, FCM is weak in the presence of noise, and many attempts have been made to cover this weakness. In [3], the FCM objective function is used with neighboring pixels in addition to the pixel itself, together with pixel division. In [4][5][6], the FCM method is used, and the membership function is changed to improve the accuracy of image segmentation. In [7], the objective function defined in FCM is used, segmentation is done with local and nonlocal data, and the effect of each neighbor on the objective function is tested. In [8], the FCM objective function is changed according to the total biases and the gradient of a sparse matrix. In [9][10][11], the weight of local points in the objective function is calculated, and a fuzzy clustering algorithm and the continuity of local areas are used to obtain a better classification.
In [12], the FCM method is changed to improve image segmentation. To reduce the impact of noise, local and nonlocal information is used for image segmentation, and a new method is used to determine the dissimilarity with respect to the local and nonlocal distances of each pixel. An algorithm based on the histogram and using predefined window class centers was introduced in [13]. A modified FCM method that also uses neighboring areas is then applied, and finally a new technique called neighborhood-based membership ambiguity correction is used to eliminate small noise [13].
In [14], a hierarchical genetic method is used. In this method, the chromosome is divided into two parts: one section is randomly filled with center points and the other with random values of zero, so that the centers are determined by enabling or disabling. In [15], the genetic method and an LVQ network are used for hierarchical classification of MR images. In this method, each chromosome is divided into two parts: one part holds the LVQ network weights, and the other enables or disables the neurons of the LVQ network. In [16], a combination of statistical expectation maximization (EM) and a pulse-coupled neural network (PCNN) has been used for segmentation of MR images.
Most of the articles in the area of FCM segmentation regard it as an effective method.

The Presented Method
Noise is inevitable in medical images. Therefore, it is essential to reduce the effect of noise prior to image segmentation. Since the method presented in this paper is based on FCM, we can perform noise reduction after noise detection and then perform image segmentation. Alternatively, we can estimate the probability that noise is present in each pixel and change the effect of that pixel in the FCM method accordingly. In this paper, we apply noise reduction to improve the image and, hence, to obtain a more reliable image segmentation.

Method to Reduce Noise on the Image.
In this paper, we identify noise with respect to the local neighborhood of each pixel; that is, the neighboring pixels are considered when detecting noise. Because we aim at a fast image segmentation method, we need a fast noise reduction method.

Detection of the Noisy Pixels.
To detect a noisy pixel with the help of neighboring pixels, we followed the method proposed by [17] and defined the neighborhood of a pixel as shown in Figure 1.
The distance between the central pixel and each of its 8 first-order neighbors is calculated, giving the distance of the pixel from all its neighbors. These distances are sorted in ascending order, and the 5 neighbors with the longest distances are selected.
If the average of these 5 values is greater than a certain threshold, at least five of the neighboring pixels are very different from the pixel, so the pixel is not considered an edge position and is probably a noisy pixel. As we will discuss later, even if a pixel is wrongly detected as noisy, this does not affect the algorithm much, as the noise correction method we use can usually handle it.
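The detection rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the distance between two pixels is taken to be the absolute difference of their gray levels, border pixels simply use the neighbors that exist, and the name `is_noisy` and the threshold value are hypothetical.

```python
def is_noisy(image, row, col, threshold):
    """Flag a pixel as probably noisy from its 8 first-order neighbors.

    Assumption: the pixel distance is the absolute gray-level difference.
    """
    center = image[row][col]
    distances = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the central pixel itself
            r, c = row + dr, col + dc
            # Border pixels have fewer than 8 neighbors; use those that exist.
            if 0 <= r < len(image) and 0 <= c < len(image[0]):
                distances.append(abs(image[r][c] - center))
    distances.sort()           # ascending order
    largest = distances[-5:]   # the 5 longest distances (fewer at corners)
    return sum(largest) / len(largest) > threshold


# An isolated spike differs from all neighbors; a pixel on a straight
# vertical edge differs from only three of them.
spike = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]
edge = [[10, 10, 200], [10, 10, 200], [10, 10, 200]]
print(is_noisy(spike, 1, 1, threshold=120), is_noisy(edge, 1, 1, threshold=120))
```

Averaging only the five largest distances is what separates an edge pixel, which differs from about three neighbors, from an isolated noisy pixel, which differs from nearly all of them.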

Correction of the Noisy Pixels.
When a pixel is identified as noisy, we use its neighboring pixels to correct it, because a pixel is expected to be similar to its surrounding pixels.
To do this, we select the pixel whose neighbors have the smallest total distance to the neighbors of the noisy pixel; that is, the Euclidean distance is calculated between each neighbor of the selected pixel and the neighbor in the same position around the noisy pixel. The pixel that produces the lowest value is the one whose neighbors are most similar to those of the noisy pixel. Two pixels with similar neighbor values are expected to be close to each other, and, hence, the noisy pixel can be replaced by the selected pixel.
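A minimal sketch of this correction step follows. The source does not say which pixels are candidates for replacement, so the sketch assumes the candidates are the 8 first-order neighbors of the noisy pixel; the function names and the handling of image borders are also assumptions made for illustration.

```python
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def neighborhood_distance(image, p, q):
    """Euclidean distance between the neighborhoods of pixels p and q,
    comparing neighbors that sit at the same relative position."""
    rows, cols = len(image), len(image[0])
    total, count = 0.0, 0
    for dr, dc in OFFSETS:
        pr, pc = p[0] + dr, p[1] + dc
        qr, qc = q[0] + dr, q[1] + dc
        if 0 <= pr < rows and 0 <= pc < cols and 0 <= qr < rows and 0 <= qc < cols:
            total += (image[pr][pc] - image[qr][qc]) ** 2
            count += 1
    # No comparable positions: treat the candidate as infinitely far away.
    return total ** 0.5 if count else float("inf")


def correct_noisy_pixel(image, row, col):
    """Return the replacement value for a pixel already flagged as noisy."""
    rows, cols = len(image), len(image[0])
    best_value, best_dist = image[row][col], float("inf")
    for dr, dc in OFFSETS:  # assumed candidate set: the 8 neighbors
        r, c = row + dr, col + dc
        if 0 <= r < rows and 0 <= c < cols:
            dist = neighborhood_distance(image, (row, col), (r, c))
            if dist < best_dist:
                best_value, best_dist = image[r][c], dist
    return best_value


noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 200  # isolated noisy pixel in a flat region
print(correct_noisy_pixel(noisy, 2, 2))
```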

Image Segmentation.
Because the FCM method is simple and accurate, and because image segmentation is fuzzy by nature, many works on segmentation have used this method. Different works have tried to tackle some of the weaknesses associated with FCM. An important problem in image segmentation is guessing the number of parts of an image. Some of the methods in the literature have ignored this factor and performed image segmentation with a predefined number of segments. In some specific cases, such as MR images, the number of segments of an image can be guessed. But even in medical images, when image quality is low or variable or when there are tumors in the image, it is sometimes not possible to estimate the number of segments.
Having an accurate estimate of the number of segments in an image is crucial, as parameters that determine accuracy, such as boundary overlaps, depend highly on it.
In this work, we try to increase the accuracy of image segmentation automatically and without expert aid. In [18], the objective function is based on the distance of each point from the determined centers, and the extent of membership of each point to these centers is used to determine the centers in the next steps.
By considering the predefined parameters and number of segments and by using the principle of maximum entropy and the genetic algorithm, we try to improve the existing parameters. In this method, the membership value of each pixel is clearly observable. Differences in the number of segments of an image directly affect the membership of each pixel. The method proposed in this paper utilizes the genetic algorithm and the FCM objective function and automatically estimates the number of segments for each given image.

Chromosome Representation.
The values corresponding to segment centers are placed in each chromosome. Hence, the length of a chromosome shows the number of different groups of data or, in other words, the number of segments in an image. We can control the maximum number of segments through the length of each chromosome. Each cell of a chromosome can be initialized according to pixel values in the range of 0 to 255. For the first initialization, these numbers are randomly generated. Table 1 is an example of a chromosome whose values are randomly generated.
If all cells of a chromosome are initialized in the range 0 to 255, the number of segments is equal to the length of the chromosome. But we want to set the number of image segments automatically, so some cells of a chromosome should have values that show that they are not valid centers.
To represent this, we use negative values. A negative value in a cell means that this cell should not be considered as a data center. Hence, the number of nonnegative values in a chromosome determines the number of data centers. Some examples of chromosomes with length 6 are given in Table 2.
The first chromosome in Table 2 shows three centers with the values of 150.7, 121.1, and 94.3. The second chromosome shows six valid centers, whereas the third one contains four valid centers.
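As an illustration of this encoding, the short Python sketch below decodes a chromosome into its valid centers. The three nonnegative values are the ones quoted for the first chromosome of Table 2; their positions and the value -1.0 used for the invalid cells are assumptions, since the table itself is not reproduced here.

```python
def valid_centers(chromosome):
    """Keep only nonnegative cells; negative cells are not data centers."""
    return [gene for gene in chromosome if gene >= 0]


# Hypothetical layout of a length-6 chromosome with three valid centers.
chromosome = [150.7, -1.0, 121.1, -1.0, -1.0, 94.3]
centers = valid_centers(chromosome)
print(centers, len(centers))  # the number of segments is len(centers)
```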

Initial Population.
We construct the initial population using the method described earlier. To do this, we create an arbitrary number of randomly initialized chromosomes. We also assign negative values to some alleles of the chromosomes. As mentioned, negative values are not considered as centers of data, so the number of valid data can differ between chromosomes. The values of chromosome alleles are changed to negative according to a fixed and predetermined probability. The number of valid data in each chromosome is at least 2 and at most equal to its total length.
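The initialization described above could be sketched as follows. The function names, the way a cell is made negative, and the repair step that enforces at least two valid centers are assumptions chosen for illustration; the source fixes only the 0 to 255 range, the fixed invalidation probability, and the bounds on the number of valid data.

```python
import random


def random_chromosome(length, invalid_prob, rng):
    """One chromosome: gray-level centers, some marked invalid by a negative value."""
    genes = [rng.uniform(0, 255) for _ in range(length)]
    for i in range(length):
        if rng.random() < invalid_prob:
            genes[i] = -genes[i] - 1.0  # strictly negative marks "not a center"
    # Repair step (assumption): redraw invalid cells until 2 centers are valid.
    invalid = [i for i, g in enumerate(genes) if g < 0]
    rng.shuffle(invalid)
    while length - len(invalid) < 2 and invalid:
        genes[invalid.pop()] = rng.uniform(0, 255)
    return genes


def initial_population(size, length, invalid_prob, seed=None):
    rng = random.Random(seed)
    return [random_chromosome(length, invalid_prob, rng) for _ in range(size)]


population = initial_population(size=20, length=6, invalid_prob=0.3, seed=0)
print(len(population), min(sum(g >= 0 for g in c) for c in population))
```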

Fitness Computation.
Given the accuracy of the FCM method, we use it to calculate the fitness. For each chromosome, using the estimated data centers and the pixel values of the image, we construct a matrix $U$ that consists of the membership values of each pixel. The elements of this matrix are calculated according to

$$u_{ij} = \frac{1}{\sum_{k=1}^{C} \left( \dfrac{\| x_i - c_j \|}{\| x_i - c_k \|} \right)^{2/(m-1)}}. \tag{2}$$

The objective function of Formula (2) is defined as

$$J_m = \sum_{i=1}^{N} \sum_{j=1}^{C} u_{ij}^{m} \, \| x_i - c_j \|^{2},$$

where $m$ is any real number greater than 1, $u_{ij}$ is the degree of membership of $x_i$ in data center $j$, $x_i$ is the $i$th of the $d$-dimensional measured data, $c_j$ is the $d$-dimensional center of the data center, $\| \cdot \|$ is any norm expressing the similarity between any measured data and the center, $N$ is the number of data points, and $C$ is the number of valid centers, while the $c_j$ are the set of centers stored in the chromosome.
The value $J_m$ is used as the fitness. The best value of $J_m$ occurs when each data center is placed at the center of a data set, which makes $J_m$ small. Therefore, the smaller $J_m$ is, the better the solution, and we try to obtain the smallest value of $J_m$.
If all of the values in a chromosome are valid data centers, that is, if the number of valid data equals the length of the chromosome, calculating $J_m$ as described above is acceptable. However, because the number of valid data serving as class centers differs between chromosomes, the value of $J_m$ is not suitable for comparing the fitness of chromosomes: it depends on the number of centers, and the fitness should be corrected so that chromosomes can be compared.
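For concreteness, the membership values and the objective value of one chromosome can be computed as in the sketch below. It uses the standard FCM membership and objective formulas and assumes scalar gray-level pixels with the absolute difference as the norm; the function names and the treatment of a pixel that coincides with a center are illustrative choices, not taken from the source.

```python
def fcm_memberships(pixels, centers, m=2.0):
    """Membership matrix: one row per pixel, one column per valid center."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for x in pixels:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # Pixel lies exactly on a center: share full membership among
            # the coinciding centers to avoid dividing by zero.
            zeros = dists.count(0.0)
            row = [1.0 / zeros if d == 0.0 else 0.0 for d in dists]
        else:
            row = [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]
        memberships.append(row)
    return memberships


def fcm_objective(pixels, centers, memberships, m=2.0):
    """Sum over pixels and centers of u^m times the squared distance."""
    return sum(
        (u ** m) * (x - c) ** 2
        for x, row in zip(pixels, memberships)
        for u, c in zip(row, centers)
    )


pixels = [5.0]
centers = [0.0, 10.0]
U = fcm_memberships(pixels, centers)
print(U, fcm_objective(pixels, centers, U))
```

A pixel midway between two centers receives membership 0.5 in each, and each row of the matrix sums to 1.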
It should be noted that as the number of centers increases, the distance of each $x_i$ from its nearest center decreases, and the value of the FCM objective $J_m$ is therefore expected to decrease. Because $J_m$ is generally smaller in chromosomes with more valid values, chromosomes with a high number of centers are usually selected. Our preferred state, however, is one in which the data centers are placed at the centers of the data sets and the minimum required number of data centers is determined automatically with respect to the data entropy. Given the importance of the $J_m$ value, we need another parameter that corrects the fitness value according to changes in the number of centers. To improve the presented method, we assign a penalty factor, written here as $P$, that penalizes an increase in the number of data centers. Parameter $P$ is obtained from the total [...]. Then the fitness of each chromosome is calculated by multiplying $J_m$ by $P$.

Selection.
We use the roulette wheel technique to produce the mating pool of chromosomes. The main idea of the roulette wheel technique is to give better chromosomes a higher chance of being selected.
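Roulette wheel selection can be sketched as below. Because a smaller fitness value is better in this method, the sketch assumes that the selection weight of a chromosome is the reciprocal of its fitness; the source does not state how fitness values are mapped to selection chances, and the function name is hypothetical.

```python
import random


def roulette_select(population, fitnesses, n, rng=None):
    """Draw n chromosomes with replacement; a lower fitness gets a larger slice."""
    rng = rng or random.Random()
    # Assumption: weight = 1 / fitness, with a small constant against division by zero.
    weights = [1.0 / (f + 1e-12) for f in fitnesses]
    return rng.choices(population, weights=weights, k=n)


pool = roulette_select(["good", "poor"], [1.0, 1e9], n=100, rng=random.Random(0))
print(pool.count("good"), pool.count("poor"))
```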

Crossover.
Crossover is the next step after the selection of parent chromosomes. In this step, a new offspring is generated as a result of combining two parents.
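The source does not specify the crossover operator. As one common possibility, the sketch below uses single-point crossover and returns both complementary offspring; the operator choice and the function name are assumptions.

```python
import random


def single_point_crossover(parent_a, parent_b, rng=None):
    """Combine two equal-length parents (length >= 2) at a random cut point."""
    rng = rng or random.Random()
    point = rng.randint(1, len(parent_a) - 1)  # cut strictly inside the chromosome
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
b = [-1.0, -2.0, -3.0, -4.0, -5.0, -6.0]
child_a, child_b = single_point_crossover(a, b, random.Random(0))
print(child_a, child_b)
```

Negative (invalid) cells are copied like any other cell, so the offspring can have a different number of valid centers than either parent.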
Mutation.
Each allele of the chromosome changes according to the mutation probability. Mutation is used to perform a search over the entire range of answers. Figure 2 shows an overview of our approach to segmentation.
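A minimal sketch of the mutation step is given below. The source states only that each allele changes with a given probability; replacing a mutated allele with a fresh random gray level in the range 0 to 255 is an assumption, as is the function name.

```python
import random


def mutate(chromosome, mutation_prob, rng=None):
    """Return a copy in which each allele is redrawn with probability mutation_prob."""
    rng = rng or random.Random()
    mutated = list(chromosome)
    for i in range(len(mutated)):
        if rng.random() < mutation_prob:
            mutated[i] = rng.uniform(0, 255)  # assumed replacement rule
    return mutated


original = [150.7, -1.0, 121.1, -1.0, -1.0, 94.3]
print(mutate(original, mutation_prob=0.1, rng=random.Random(0)))
```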

Results
To evaluate the proposed method, we use an MR image of the brain of size 217 × 181 pixels, whose pixel values range from 0 to 255 (256 gray levels). The size of the initial population is set to 20 and the maximum number of generations to 100.
The number of individuals to be replaced in each generation [...]. In the resulting segments, in order to verify the correctness of segments that are in the same group, we display each group in a randomly chosen color.
Figure 3 shows the output of the proposed method for automatic segmentation of the original MR image.
To verify the performance of our algorithm, we use the sensitivity, specificity, Jaccard, and k index measures. If $A$ and $B$ are the automatic and manual segmentations of an image, respectively, then $A \cap B$ is the true positive (TP), and $A - B$ and $B - A$ are the false positive (FP) and false negative (FN), respectively.
According to Table 3, sensitivity is defined as [19]

$$\text{Sensitivity} = \frac{TP}{TP + FN},$$

specificity as

$$\text{Specificity} = \frac{TN}{TN + FP},$$

where TN is the true negative. Similarity is measured by the Jaccard and k indices, which in their standard forms are [20]

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}, \qquad k(A, B) = \frac{2\,|A \cap B|}{|A| + |B|}.$$

The parameter settings for the genetic operations in this experiment are shown in Table 4. Table 5 shows the performance of the proposed method when the initial number of chromosomes is set to 20 and the number of generations to 100. As shown in Table 5, every time we increase the length of the chromosomes, the results become more accurate. The answers differ between runs of the algorithm because the chromosomes are initialized randomly in each run. The maximum number of iterations of the genetic algorithm is 100.
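The four evaluation measures named in this section can be computed from two binary masks as in the sketch below, which uses their standard definitions. The function name and the flat-list mask representation are assumptions, and the sketch assumes that neither denominator is zero.

```python
def overlap_metrics(auto_mask, manual_mask):
    """Compare an automatic and a manual binary segmentation, pixel by pixel."""
    tp = fp = fn = tn = 0
    for a, b in zip(auto_mask, manual_mask):
        if a and b:
            tp += 1      # in both segmentations
        elif a:
            fp += 1      # automatic only
        elif b:
            fn += 1      # manual only
        else:
            tn += 1      # in neither
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "jaccard": tp / (tp + fp + fn),
        "k_index": 2 * tp / (2 * tp + fp + fn),
    }


auto = [1, 1, 0, 0, 1, 0]
manual = [1, 0, 0, 0, 1, 1]
metrics = overlap_metrics(auto, manual)
print(metrics)
```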
In addition, in Figure 4, we show the performance of our algorithm on a noisy image.
The overall results of our method on a noisy image are given in Table 6.

Suggestions
Given that the proposed method is based on the inner-class distance, it acts according to the distance to the nearest center and the correction of centers. However, one can think of using the outer-class distance to increase classification accuracy. In this case, in addition to each pixel belonging to the nearest center, its distance from the other points is also taken into account. With these two parameters, the precision of medical image segmentation methods can be expected to increase.

Conclusion
FCM is a popular clustering method and has been widely applied for medical image segmentation. However, traditional FCM always suffers from noise in the images. Although many researchers have developed various extended algorithms based on FCM, none of them are flawless. A method based on genetic algorithm with use of FCM is proposed in

Figure 4: Performance of our algorithm on an image with 3% noise. Image (a) is the original image, (b) is the brain for segmentation, (c) is the background of the image removed from brain, and (d) is MR brain image which is divided into 3 segments (shown with 3 different colors).

Table 1: A sample chromosome with length of 6.

Table 2: Three chromosomes of the initial population generated randomly.

Table 3: Different modes of pixels.

Table 4: Parameters for genetic operations.

Table 5: Results obtained from the proposed method.

Table 6: Results obtained from the proposed method.