Automatic Cell Segmentation in Cyto- and Histometry Using Dominant Contour Feature Points

Automatic cell segmentation has various potential applications in cytometry and histometry. In this paper, an automatic segmentation approach for clusters of touching cells using dominant contour feature points is presented. Dominant feature points are the locations of indentations on the contour of the cluster. First, dominant feature points on the contour of the cluster are detected by distance profiles. Next, using shape features of the cells, these feature points are selected for segmentation. We compared the results of the proposed method with manual segmentation and observed that the method has an overall accuracy of about 82%.


Introduction
Cytometry and histometry are fields of quantitative pathology and cell biology. Automatic image-based histo- and cytometry is used to quantify regions or isolated cells of specimens. For regions that have to be analyzed, the segmentation can be performed without problems [4]. Difficulties arise when features are to be extracted from single cells or nuclei, e.g., histometry of nuclear chromatin structure [10], counting of silver-stained nucleolar organizer regions (AgNOR) [1,2], FISH signal counting [12], histometry of architecture [2,5,8], etc. In these cases, many specimens contain clusters of cells or nuclei that are not detected as individual entities. In most specimens, particularly clinical specimens, these cells form touching or overlapping clusters in the images. Large intensity differences no longer exist between the cells of such a cluster. As a result, methods [6,13] that segment clusters by exploiting significant intensity differences between the cells of a cluster may not work properly, nor may methods that combine border and region conditions [3,11,14]. In histometry, depending on section thickness, tissue type (organ) and cell growth type, the number of isolated cells is very small because of this cluster problem; overlapping or occluding cells are frequently found without any visible inner cell border lines. In this paper, we have developed a simple method based on dominant contour feature points and shape criteria of the cells to segment such clusters into individual cells.

Material
We considered 553 images of nuclei of colon tumours from four different patients for the evaluation of the method. The images were digitized by a microscope with a mounted TV camera using a 100× objective (NA 1.3) and a narrow bandpass filter (537 nm central wavelength). The slides were stained according to Feulgen. The pixel size was 0.25 µm and the size of each image was 128 × 128 pixels. The image acquisition procedure avoids influences of varying section thickness and focal adjustment, which can alter textural features considerably. The objective for these images was to detect the central nucleus of each image.

Preprocessing: generation of mask input image
From the grey scale image, the pixel value frequency distribution (histogram) is fitted by a 2nd-order polynomial (P) and a Gaussian (G), where typically P reflects the portion of nucleus pixels and G the background. A threshold T is calculated near the flank of the Gaussian [7]. The fitting is performed with the routine gaussfit from IDL (Research Systems Inc., Boulder, CO, USA). Applying threshold T to the image results in a mask image (a binary image), which is first filled and then smoothed by an opening with a '+'-shaped structuring element [13]. A grey scale image with its binary image and the smoothed binary image is shown in Fig. 1. The central nucleus of this image is not in isolated form.
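The mask-generation step can be sketched as follows. This is a minimal illustration, not the original IDL implementation: the threshold T is assumed to be already computed (the polynomial/Gaussian histogram fit is omitted), and SciPy's binary morphology routines stand in for the filling and opening operations.

```python
import numpy as np
from scipy import ndimage

def make_mask(gray, T):
    """Binary mask: threshold, fill holes, then open with a '+' element."""
    mask = gray < T                      # nuclei assumed darker than background
    mask = ndimage.binary_fill_holes(mask)
    plus = np.array([[0, 1, 0],
                     [1, 1, 1],
                     [0, 1, 0]], dtype=bool)   # '+'-shaped structuring element
    return ndimage.binary_opening(mask, structure=plus)
```

The opening with the small '+' element removes one-pixel protrusions and smooths the contour, which reduces spurious feature points later on.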

Dominant feature point detection
Dominant feature points are the locations of indentations on the contour of the cells. To find such points, we trace all points on the contour of the component and compute eight distance profiles of the contour pixels, one for each of the eight directions shown in Fig. 2. The distance profile of the contour pixels of a component for a direction, say direction no. 5 (to the right), is the distance of each contour pixel from the bounding box of the component in that direction. While tracing the contour, we also compute eight logical functions corresponding to the eight distance profiles. The value of a logical function at a traced pixel for a direction is true if the pixel's next neighbor in that direction is an object pixel. The local maximum points of the distance profile functions at which the corresponding logical function is true are considered dominant feature points.
To illustrate this, consider the cluster of Fig. 3 and the distance profile of its contour for direction no. 5; the intervals of the profile where the value of its logical function is false are marked by rectangles. Although there are three local maxima in this distance profile (see Fig. 3, arrows ←), only the one labeled C is considered a dominant feature point, as the value of the logical function of this profile at C is true. The local maximum at C in the profile corresponds to the feature point C on the cluster. For the five change points of the logical function, the correspondence between contour and profile is indicated by ↓ in Fig. 3 (left and right).
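The detection step can be sketched for a single direction as follows. This is only an illustration of our reading of the construction: the contour tracing itself (e.g., Moore neighbor tracing) is assumed to be available, the profile is taken as the distance travelled in the given direction from the bounding box, and the plateau tie-break (strict maximum against the predecessor, non-strict against the successor) is our own choice.

```python
import numpy as np

def feature_points_rightward(contour, mask):
    """Dominant feature points for the rightward direction (sketch).

    contour: ordered list of (row, col) contour pixels (closed trace)
    mask:    2-D boolean object mask
    A contour pixel is kept if it is a local maximum of the distance
    profile and its right neighbor is an object pixel (logical
    function true), i.e., the pixel pokes rightward into the object.
    """
    cols = np.array([c for _, c in contour])
    profile = cols - cols.min()          # distance travelled rightward from the bbox
    points = []
    n = len(contour)
    for i in range(n):
        d, prev_d, next_d = profile[i], profile[i - 1], profile[(i + 1) % n]
        if d > prev_d and d >= next_d:                       # local maximum
            r, c = contour[i]
            if c + 1 < mask.shape[1] and mask[r, c + 1]:     # logical function
                points.append((r, c))
    return points
```

The logical-function test is what rejects convex bumps (background in the tested direction) while keeping true indentations, whose neighbor in the direction lies inside the object.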
We group the dominant feature points into four classes a, b, c and d, labeled as follows: feature points obtained from the distance profiles for directions 0 and 1, 2 and 3, 4 and 5, and 6 and 7 (see Fig. 2) are labeled a, b, c and d, respectively. In some cases we may get two feature points very near each other. Feature points within a distance of less than three pixels along the contour are merged into one, and the point whose label is minimum is taken (see Fig. 4, feature point B).
The feature points of the cluster component of Fig. 3 are shown in Fig. 4, where the feature points of the different classes are marked by different symbols. Feature points A and B belong to class d, E and F belong to class b, and C and D belong to classes c and a, respectively.
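The merging of nearby feature points can be sketched as follows; the representation of a feature point as a (contour index, class label) pair is our assumption, and the wrap-around at the start of the trace is ignored for brevity.

```python
def merge_close_points(points, min_gap=3):
    """Merge feature points closer than min_gap pixels along the contour.

    points: list of (contour_index, label), labels 'a' < 'b' < 'c' < 'd',
            sorted by contour index.
    Of each group of nearby points, the one with the minimum label is kept.
    """
    merged = []
    for idx, label in points:
        if merged and idx - merged[-1][0] < min_gap:
            prev_idx, prev_label = merged[-1]
            merged[-1] = (prev_idx, min(prev_label, label))  # keep smaller label
        else:
            merged.append((idx, label))
    return merged
```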
If a component has two or more feature points, we consider it a cluster. A component with only one feature point is treated as a confusion component, which needs manual help for processing. Components without feature points are isolated components.

Segmentation
Segmentation of clusters is done in a two-stage approach. In the first stage, among the dominant feature points we find candidate point pairs that satisfy a set of rules; for a pair RS these require, among other conditions, that the distance dist(RS) is not greater than a threshold T, and that p/180 > 0.7 or dist(RS) < T1, where p is the relative indentation angle for the two feature points R and S as described below and T1 is a threshold with value 5. The latter condition increases the probability of tangential cuts. In the second stage, the accepted pairs are used to draw the segmentation lines. For a point pair, say RS (see Fig. 6), we compute the angle between the internal bisectors of the angles XSW and YRZ. We call this angle the relative indentation angle (p). Here, X and W are points T1 pixels apart along the boundary of the cell from S in clockwise and counter-clockwise direction, respectively. Similarly, Z and Y are points T1 pixels apart along the boundary of the cell from R in clockwise and counter-clockwise direction. The bisectors of the angles XSW and YRZ, marked by arrows in Fig. 6, show the direction of indentation. The condition p/180 > 0.7 forces a certain alignment of the directions of indentation for point pairs to be accepted for segmentation.
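The relative indentation angle can be sketched as follows. This is our reading of the construction in Fig. 6: the bisector at each feature point is taken as the normalized sum of the unit vectors toward the two flank points T1 pixels away along the contour, and p is the angle between the two bisectors.

```python
import numpy as np

def relative_indentation_angle(contour, i_r, i_s, t1=5):
    """Angle p (degrees) between the indentation bisectors at R and S.

    contour: ordered (row, col) contour pixels; i_r, i_s: indices of R, S.
    The flank points X, W (resp. Z, Y) lie t1 pixels from S (resp. R)
    along the contour in the two directions.
    """
    pts = np.asarray(contour, dtype=float)
    n = len(pts)

    def bisector(i):
        a = pts[(i + t1) % n] - pts[i]       # toward one flank point
        b = pts[(i - t1) % n] - pts[i]       # toward the other flank point
        v = a / np.linalg.norm(a) + b / np.linalg.norm(b)
        return v / np.linalg.norm(v)         # internal bisector direction

    cos_p = np.clip(np.dot(bisector(i_r), bisector(i_s)), -1.0, 1.0)
    return np.degrees(np.arccos(cos_p))
```

For two indentations facing each other across a neck, the bisectors point in nearly opposite directions, so p approaches 180 and the condition p/180 > 0.7 is satisfied.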
The feature point pairs of the cluster component of Fig. 3 are shown in Fig. 5. Here, AE, AF, BC, BE and CD are the selected point pairs that satisfy the above rules. Let the set of such selected point pairs be F_SET.
From F_SET we choose the pairs for segmentation according to their weight. The weight of a point pair is w1 + w2 + w3, where:
1. Relative distance weight: w1 = 1 − d/d_max;
2. Relative indentation weight: w2 = p/180;
3. Relative shape continuity weight: w3 = (∠ZRS + ∠WSR + ∠YRS + ∠XSR)/720 (see Fig. 6).
Here, d_max is the maximum distance among the point pairs of F_SET, and d and p are the distance and the relative indentation angle of the point pair, respectively.
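The three weight terms combine directly; a minimal sketch (the function name and the angle-list representation are our own):

```python
def pair_weight(d, d_max, p, angles):
    """Weight of a feature point pair.

    d:      distance of the pair
    d_max:  maximum pair distance in F_SET
    p:      relative indentation angle in degrees
    angles: the four angles ZRS, WSR, YRS, XSR of Fig. 6, in degrees
    """
    w1 = 1.0 - d / d_max          # relative distance weight
    w2 = p / 180.0                # relative indentation weight
    w3 = sum(angles) / 720.0      # relative shape continuity weight
    return w1 + w2 + w3
```

Short cuts between strongly opposed indentations that continue the local contour smoothly thus receive the highest weights.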
A point pair is considered for segmentation only if the line segment between its points does not cross the line segments of point pairs already accepted for segmentation. When a point pair is selected for segmentation, a line is drawn between its points and all point pairs formed by either of these two points are deleted from F_SET. If more than one point pair has the same weight, we take the point pair with the highest distance weight for segmentation.
Thus, our cluster segmentation algorithm CLUST-SEGM is as follows:
Step 1: Form the set F_SET from the feature points.
Step 2: Calculate the weight of the pairs of F_SET.
Step 3: Choose the point pair from F_SET with maximum weight. If the line segment obtained by this point pair does not cross any line segment already used for segmentation, then a cut is made along this line. We call this a valid segmentation. If a valid segmentation is obtained, delete all point pairs from F_SET which are formed by either of these two points. Repeat this step until no point pair remains in F_SET for valid segmentation.
The segmentation result for the cluster containing the central nucleus of Fig. 1 is shown in Fig. 7.
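The greedy selection of Step 3 can be sketched as follows, with a standard orientation test for segment crossing; deleting pairs that share an endpoint with an accepted cut implements the F_SET pruning, while the equal-weight tie-break by distance weight is omitted for brevity.

```python
def _ccw(a, b, c):
    """Signed area orientation of the triple (a, b, c)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_cross(p1, p2, q1, q2):
    """True if segments p1p2 and q1q2 properly intersect."""
    d1, d2 = _ccw(q1, q2, p1), _ccw(q1, q2, p2)
    d3, d4 = _ccw(p1, p2, q1), _ccw(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0

def clust_segm(pairs):
    """Greedy cut selection (sketch of Step 3 of CLUST-SEGM).

    pairs: list of (weight, point_r, point_s), points as (row, col).
    Pairs are taken by decreasing weight; a pair is skipped if its cut
    line crosses an accepted cut or reuses an already-used endpoint.
    """
    accepted, used = [], set()
    for w, r, s in sorted(pairs, key=lambda t: -t[0]):
        if r in used or s in used:
            continue                      # pair pruned from F_SET
        if any(segments_cross(r, s, a, b) for _, a, b in accepted):
            continue                      # not a valid segmentation
        accepted.append((w, r, s))
        used.update((r, s))
    return [(r, s) for _, r, s in accepted]
```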

Results and discussion
We analysed the 553 images obtained from the four specimens. In 351 of these images, the central nucleus was in isolated form. In the remaining 202 images the central nucleus was in cluster (touching) form, and we used these images for the performance evaluation of our method. An image was considered to be correctly segmented if its central nucleus was correctly segmented.
The performance of the method was measured by calculating the ratio of correctly segmented images to the total number of images. Correctly and incorrectly segmented images were determined by visually comparing the acquired images and the segmentation results.
We noted that our method works well when the binary image of a cluster is reasonably good. For example, in Fig. 8 the binary masks of the central nucleus cluster obtained by the automatic thresholding method (described earlier) and by manual thresholding are shown in Fig. 8(b, c); the segmentation results after smoothing of the clusters are shown in Fig. 8(d, e), respectively. Here, segmentation from the binary mask obtained by automatic thresholding gives a wrong result, whereas segmentation from the binary mask obtained with a manually improved threshold gives the correct result.
We obtained an overall performance of the proposed method of about 66% with automatic thresholding. The performance increased to about 82% when the binary mask image was obtained by manual thresholding.
The specimen-wise performance of our method on the images obtained by automatic thresholding as well as manual thresholding is outlined in Table 1.
The execution time for an image of 128 × 128 pixels containing a cluster component with 6 dominant feature points is 0.79 seconds on a Silicon Graphics machine (INDY, 200 MHz, IP22 processor).
To validate the present algorithm, the factors influencing the result have to be isolated:
1. The mask input image (binary image). The primary thresholding operation is of topmost importance: if a human expert cannot segment the image properly, the algorithm will not either.
2. Wrong feature points. If wrong feature points arise from roughness of the boundary, our method may give a wrong segmentation result. For example, the wrong result in Fig. 8(d) is due to wrong feature points.
Quality considerations of this algorithm for binary image generation are beyond the scope of this paper. The proposed segmentation method can be applied to binary images from any other method.
A comparable method for the segmentation of touching or occluding objects is the skeleton approach, where the dominant feature points are obtained from the exoskeleton and the selection of the cutting lines is done by watersheds. This was first published by Meyer [9].
The proposed method does not contain an internal quality control. However, it results in a considerable reduction of manual interaction time. The strategy in high-resolution microscope image analysis is to avoid any artefact data from mis-selection and mis-segmentation, which necessitates a high amount of manual inspection time.
Automatic cluster cell segmentation has various potential applications in cytometry and histometry. Many cytometric and histometric specimens contain a large proportion of clustered cells or nuclei, and our method can segment these clustered cells or nuclei into individual cells.