Deformable Models for Segmentation of CLSM Tissue Images and Its Application in FISH Signal Analysis

In this paper we present an application of deformable models for the segmentation of volumetric tissue images. The three‐dimensional images are obtained using confocal microscope. The segmented images have been used for the quantitative analysis of the Fluorescence In Situ Hybridization (FISH) signals. An ellipsoidal surface initialized around the cell of interest acts as a deformable model. The deformable model surface voxels are subjected to various internal and external forces derived from underlying image features as well as externally imposed constraints. The deformable model converges to the optimum cell shape when the vector sum of all the forces acting on the model is zero. The result of segmentation is used to confirm the cell membership of the FISH signals and to reject all the signals that lie outside the cell nuclei. Three‐dimensional region isolation and labeling technique is used to label and count the FISH signals per cell nucleus. A simple study on the effect of different segmentation methods over a quantitative analysis of FISH signals is also presented.


Introduction
Quantitative evaluation of the features in a thick tissue specimen image is helpful in the diagnosis and prognosis of the patient. The accurate and detailed analysis of the morphological, cytological as well as histological features of the tissue specimen needs a precise and complete segmentation of the tissue image into isolated cells. Use of low level techniques such as thresholding, edge-detection, region growing, relaxation labeling, etc., to segment the cells may not give accurate results and is numerically expensive in the case of complex three-dimensional (3D) images [1,2]. Methods based on mathematical morphology such as watershed segmentation appear to be less practical due to over segmentation of the tissue images. Dependency of watershed methods on low level segmentation for finding the zone of influence for each regional minimum limits the accuracy of segmentation [3,4]. For an accurate and reliable analysis of the cyto-and histological features in complex, multispectral, volumetric images, a high level approach with some degree of interactiveness is necessary.
In this article we used a deformable model or active volume approach for segmenting the cells and assigning the cell membership to Fluorescence in Situ Hybridization (FISH) signals in volumetric prostate tissue images. The deformable model presented here is the 3D extension of the active contour models or Snakes proposed by Kass et al. [5]. An approximate geometric model of a cell such as ellipsoid surface is placed around the cell either interactively or by other means. The voxels belonging to this model surface act as a reference point for searching the actual cell surface. The reference point is moved towards the actual surface of the cell by applying image derived forces while maintaining the smoothness of the surface. The model surface converges to the cell surface when the vector sum of all the forces acting on the model surface reference points becomes minimum. Most of the 3D deformable models are edge based in the sense that they consider only the edge information during optimization process. We have included the region based control over the deformable model surface by using the prior information from low level segmented, labeled image volume. In the following sections we show how deformable model based segmentation can be used in case of 3D tissue specimen images and its usefulness in evaluating shape features and other malignancy related features of the tissue specimen. We demonstrate the utility of segmentation by deformable models in quantitative evaluation of the FISH signals.
In molecular pathology, morphological grading, staging, tumour size and the number of FISH signals per cell nucleus are mainly used to evaluate the malignant potential of prostate cancer. The number of FISH components indicate the gain (trisomy) or loss (monosomy) of certain base-sequences in deoxyribonucleic acid (DNA). Several studies [6,7], have shown that an aberration related to chromosome 7 is found in malignant and non-malignant tumours of lung, kidney, brain, as well as in prostate. Therefore, for improved diagnostic and prognostic values, quantitative evaluation of FISH signal is useful. To guarantee a reproducible, unbiased signal count per nucleus and realistic morphological feature values, a whole 3D nucleus has to be studied [8]. Visual counting of the FISH signals in a 3D stack of images is a tedious, time consuming and fatiguing task. Until recent times, an area around the cell of interest was displayed as a gallery of image slices and the FISH signals were manually counted. This process is neither reliable nor reproducible and nor feasible when testing a large volume of data. There is little work done on automation of such processes as it involves difficult tasks like precise segmentation of volumetric tissue images, segmentation of FISH signals from a noisy signal channel and assigning the FISH signal membership. This work is an attempt to reduce this highly interactive process by designing the deformable models for segmenting the cells of interest and automatically measuring FISH signals per cell nucleus.

Specimen preparation
Routinely processed, formalin-fixed and paraffinembedded tissue specimens from radical prostatectomies of 19 patients with prostatic adenocarcinoma were used in this study. Consecutive sections were made consisting of 15 µm sections for FISH with subsequent CLSM scoring. Sections were mounted on slides and baked overnight at 37 • C for better adherence. α-specific DNA probes specific for the centromeric region of chromosome 7 were generated as described in [8]. The probes were biotin labeled by nick translation according to standard procedures. The hybridization mixture consisted of carrier DNA (0.5 µg/µl herring sperm DNA), mastermix 2.1 (final concentration: 55% formamide/1 × SSC, 10% dextrin sulfate) and the labeled DNA probes. This mixture was denatured at 72 • C for 5 minutes and chilled on ice until use. For FISH of thick sections, a modified Zitzelberger protocol was been used [7]. The biotin labeled centromere 7 probe was detected by streptavidin-fluorescein isothiocyanate (FITC) and biotinylated anti-streptavidin conjugates. After washing, the nuclei were counterstained with propidium iodide (PI, Sigma) and subsequently mounted with an antifade solution [8,9].

Image acquisition
The specific set-up features for acquiring volumetric images of thick prostate specimen are as follows: Fluorescence images are scanned using a Confocal Laser Scanning Microscope Zeiss LSM 410, Lens Zeiss PNF 100×, NA 1.3, zoom = 2, obtained by the scanning unit. The scanned field of 62.5 µm × 62.5 µm is sampled to 256 × 256 pixels giving a lateral resolution of 0.25 µm. Excitation laser lines are selected according to the fluorochromes used. Both propidium iodide (PI) which is used as DNA counter stain and FITC labeled signals both are excited by the argon line 488 nm. If only the sampling theorem need be met, an approximate axial resolution of 0.5 µm can be achieved, keeping in mind that the confocal axial resolution of a lens with numerical aperture NA = 1.3 is about the wavelength 0.6 µm of the excitation laser. This results in a voxel size 0.12 µm × 0.12 µm × 0.3 µm. The fluorescein emission containing information of the fluorescence in situ hybridization signals is associated with the green channel and the counter stain emission showing the morphology of the tissue is associated with the red channel of a RGB color image.

Image enhancement
Several quantitative distortions can be introduced into 3D stacks of image slices (optical sections) during data collection. Once the sources of these errors are identified, some can be prevented by modifying the microscope system while others can be corrected either at the time of data collection or afterwards. Here we discuss the elimination of later type of distortions. Some of the major artifacts in confocal tissue images are the non uniform gradient magnitude along the cell boundary, uneven illumination of the specimen, blurring, and the low gradient magnitude where the cells are touching or overlap. The fine textural nature of the cell chromatin and the presence of dense intra-cellular and intercellular matters adversely affect the automatic analysis of the image. Thus, to highlight the desired features and minimize the noise artifacts, the image stack is subjected to enhancement steps such as deblurring by Weiner filtering [17], photobleaching, contrast stretching, window slicing, morphological opening and closing, surface crispening, directional Gaussian smoothing, etc. [10,11]. Some of these methods are briefly explained below with respect to confocal microscopy images.
The effect of non-cellular matter and uneven background can be reduced by window slicing. A local threshold is chosen by examining the local histograms in the image. Since most of the data sets show a unimodal histogram, we used (µ + kσ) as the threshold value where µ is the mean of the grey level histogram and σ is the standard deviation of the grey values in the image, where k is an experimentally determined constant. All the voxels having grey value below the threshold are assigned lowest grey value, i.e., dark, while other voxels considered as belonging to the object, are kept undisturbed. The resulting image will have the lowest grey value, i.e., zero for spatial background and hence the effect of dense non-cellular matter is reduced. Fig. 1(b) shows the result of window slicing on a confocal image slice. However, windowslicing may result in the appearance of small artifacts in the background as well as small holes in the cells. This may be due to the presence of dense intra and intercellular matter in the tissue specimen image. These artifacts are removed using size and shape heuristics. The size of the isolated objects is calculated. The objects having a size below the pre-defined threshold are considered as artifacts and removed. The size threshold is chosen on a trial basis. Small hole-like structures within the cells are restored to their original grey values. Figure 1(c) shows the result of applying a size and shape heuristic filter.
Due to the absorption of light by the specimen as well as photo bleaching of the fluorescent dyes, there is a gradual reduction in the intensity of light when we see through the depth of the image stack. We applied a heuristic enhancement method to reduce the effect of photobleaching.
LetĪ 1 ,Ī 2 , . . . ,Ī n be the average image intensity of the approximate foreground voxels in the corresponding image slices of the image stack.
. . , n then the correction for photobleaching is applied as I k (i, j) = I k (i, j) + cβ k where c is an experimentally determined constant. For all practical purposes, the effect of photobleaching can be considerably reduced by using this method. The intensity graph shown in Fig. 2 plots the average intensity of the foreground of each image slice against depth of the image stack. Figure 2(a) is the intensity plot before restoration and Fig. 2(b) is after intensity restoration by the above method.
Many of the cells are connected to each other by thin and frail connections that increases the interactiveness of the segmentation process. These thin and frail connections can be removed by morphological opening on the two tone version of the image, using an appropriate structuring element [12]. We used 3 × 3 × 3 structuring element (6 connected) of unit magnitude. Figure 1(d) shows the result of morphological opening over a single image slice. The boundary enhancement is obtained by adding the discrete Laplacian magnitude to the image function. Figure 1(e) shows the result of boundary enhancement over a single CLSM image slice. Boundary enhancement by adding a high pass signal also enhances the spurious noisy peaks. Simple average filtering to reduce the noisy intensity peaks blur the object boundary. In order to protect the surface sharpness during the smoothing process and to avoid the influence of distance pixels, we applied a two-dimensional directional smoothing technique on each image slice. The process of directional smoothing using a kernel of size 5 × 5 is shown in Fig. 3. The average grey values for all the 4 directional sub-windows were calculated. The maximum of these 4 values is the smoothed grey value of the corresponding central pixel. This reduces the blurring of the edges while smoothing [13]. Figure 1(f) shows the result of simple spatial averaging for smoothing while Fig. 1(g) shows the result of directional Gaussian smoothing of the CLSM image slice.
The cell clusters in the volumetric image are then labeled using a 3D component labeling algorithm [14] and stored separately in a temporary file. Figure 1(h) shows the result of 3D component labeling over a single image slice. This labeled image is used to provide the region based information for proper convergence of deformable models.

Building potential surfaces
When the deformable surface is subject to image forces, it moves in the image space to minimize the total potential energy associated with the model. For the easy and smooth movement of the deformable model, proper potential surfaces have to be developed. The control points of the deformable model moves or slides along these potential surfaces and converges to the desired features. The potential surface consists of an image intensity map, gradient map and distance map. The control points of the deformable model are viewed as moving in a respective potential surface(s) when the external force(s) acting on the control points are derived from intensity, gradient magnitude and distance feature of the image, respectively.
The cell surface is marked by a local gradient peak. Hence the gradient potential surface gives a good estimate of the location of the cell surface. The threedimensional gradient of the image intensity is calculated to build the corresponding potential surface. For this purpose, an enhanced and window-sliced image is used. If f e (i, j, k) is the enhanced image slice then the gradient magnitude map can be written as Conventional distance transform methods [13] replace each voxel in the volumetric image by its distance from the nearest feature voxel. For our purpose we do not need the distances of each voxel in the image but only the voxels that fall within the pre-defined search region around the overall cell surface. We used a 3D dilation with a unit magnitude 3 × 3 × 3 structuring element for calculating the distance transform. For simplicity, we considered all the twenty-six neighborhoods of the origin of a 3 × 3 × 3 structuring element to be of distance 1 from the origin that approximately gives a 3D chess-board distance. The overall surface of the cells in the image stack is traced using a low level edge detector such as Robert's edge detector. Let X 1 be the set of voxels in the image before dilation. Let X 2 be the set of object voxels after dilation. Then X 3 = X 2 − X 1 will be the set of newly appeared object voxels due to dilation. The voxels in X 3 are given a distance value of one. The process of dilation is re- peated and each time the newly appeared object voxels X 3 , due to dilation, is given one higher magnitude than the distance value given in the previous iteration. The number of repetitions is proportional to the size of the search region Θ. Figure 4 shows potential surfaces based on gradient and distance maps. Only a single image slice is shown for the lack of space. Figure 4(a) shows the gradient potential surface of the image slice after noise cleaning and enhancing. Figure 4(b) shows the distance based potential map over a single image slice.

Initialization and optimization
A deformable model is mathematically defined as an ordered set of n voxels or surface patches, V = {v 1 , v 2 , . . . , v n } where each surface patch v i is defined over the finite spatial grid of size M . In simple words a deformable model is an approximate initial model of the object, consisting of n surface voxels. In the present case we have considered an ellipsoid surface as an approximate model of the cell. After initializing the ellipsoid surface around the cell, the voxels belonging to this surface are subject to different image forces until they move along the potential surfaces and converge to the actual cell surface.
The objective of the initialization is to place the deformable shape in realistic positions and in proper spatial orientation to facilitate quick and proper convergence. Initialization of the model distant from the proper location results either in wrong segmentation or creates too much of a computational burden. In the case of 3D histo-pathological images, initializing the geometric models is a difficult task. This is because the cells can have irregular shapes and irregular spatial orientations. Initialization can be done by using standard algorithms such as Hough transforms, etc. [15]. In the present case Hough-transforms do not provide proper initialization due to the compact arrangement of the cells in a tissue, amount of noise artifacts and highly irregular cell shapes. When a small degree of interactiveness is allowed for initialization, it is computationally inexpensive to use the surface of a simple geometric model as the initial reference.
If (x, y, z) is the center of the ellipsoid then the surface coordinates of the ellipsoid are given by where x, y and z are the Cartesian co-ordinates of the center of the ellipsoid and a, b and c are respective radius in x, y and z directions. The spatial angle is represented by where θ and φ. θ is the angle between XY plane and the line joining the surface coordinate and the center of the ellipsoid. Similarly, φ is the angle between the XZ plane and the line joining the same surface coordinate and the center of the ellipsoid. Since the image is displayed in XY , Y Z and XZ planes simultaneously, it is easy to locate the approximate center of the cell and the number of image slices the cell occupies. If N s are the total number of image slices the cell is occupies then N s /2 can be considered as an approximate axial radius 'c' of the deformable ellipsoid. For providing the lateral radii 'a' and 'b' we designed an elliptic cursor as shown in Fig. 5 whose lateral radii and the angle of orientation can be varied with the help of mouse buttons. It is extremely difficult to calculate the spatial orientation of the cell nuclei before segmenting them. This is because two touching cell nuclei can have totally different spatial orientations when compared to the orientation of the individual cell nucleus. Hence the spatial orientation of the ellipsoid model is interactively adjusted using the GUI and the mouse buttons, where the angle of orientation is read in with the help of elliptic cursor. If i, j, k and i , j , k are the orthogonal unit vectors in old coordinates x, y, z and new coordinates x , y , z directions respectively, then the relation of the old and new coordinate system is given by where (i , i) is the angle between the vectors i and i, i.e., between new and old axis of abscissas, (j , i) is the angle between new axis of ordinates and old axis of abscissas and so forth. OXY Z is the old system of coordinates and O X Y Z is the new system of coordinates. Figure 6(a) shows the initialized ellipsoidal model in XY , Y Z and XZ planes.
Let v(s) be the initialized deformable model surface. The potential energy associated with a deformable model E DM can be defined as where E internal , E external , E constraints are the internal energy, external energy and the energy due to imposed constraints respectively. The internal energy of the deformable model depends only on the co-ordinates of the control points and is given as The internal energy includes the elastic energy (first derivative of the control point position) and the bending energy (second derivative of the control point po-sition), where ω 1 (s) and ω 2 (s) are set to be small constants to control the magnitude of the internal energy. The internal energy of the deformable model forces the model to shrink. The external forces consist of energies due to the image gradient, distance map, image intensity and other externally imposed hard constraints. Use of only gradient energy as the external force requires initialization to be very close to the global optima. This is a difficult proposition as we initialize the ellipsoidal surface approximately around the irregular shaped cells. To avoid the effect of noisy gradient peaks, we used distance transforms to calculate the external energy in the initial stages, and when the deformable surface moves close to the global optimum; the grey level gradient energy calculated from the enhanced image is used as a dominant external force. The use of distance transforms alone to calculate the external energy is not feasible since it depends on the accuracy of the initial segmentation done for calculating the distance transform. We have not used the image intensity features as the image stack is not uniformly illuminated and the variation in the average intensity of the foreground voxels persists even after enhancement and restoration.
To avoid the extreme variation in gradient magnitudes and distance values, we searched a small region Θ = (1 × 3) along the normal vectors of the control points for calculating the energy. The minimum gradient magnitude and the distance value within the search region are used as external forces derived from respective features. Thus, where d ij and g ij are the distance and gradient values for the ith control point in the corresponding search region Θ i . Discretization of the energy equations can be found in [16].
If the cells are considerably shape deformed then the deformable surfaces may fail to converge to distant voxels representing the cell surface. In such cases one can interactively mark distant surface voxels. These marks are used as externally imposed constraints. The control points of the deformable model that are nearest to the marked constraints, are forced to converge to these marked constraints neglecting the energy optimization of the model surface. Use of more external constraints slows down the convergence and increases the interactiveness in the optimization process.
The properly scaled external force is added to the total potential energy of the deformable model, i.e., if x t , y t , z t are the coordinates of a control point at time 't' then after one iteration of the optimization, the new coordinates of the control points will be , The control point of the deformable model moves in the direction to minimize the total energy of the deformable model. When the external forces derived from the image features balances the internal energy of the deformable model, the optimization process is stopped. The position of the control points or the deformable surface is taken as the loaction of the cell surface.
As the external forces acting on the control points of the deformable model surface are derived from the underlying image features, there is every chance that the part of the deformable model surface may become attracted towards the strong surface features of the neighbouring cells. This results in wrong segmentation. To reduce this effect we made the deformable model surface to be a function of the particular cell. This can be done by considering the low level segmented and labeled image volume. The deformable surface has been given the same label as that of the corresponding cell of interest. Only forces due to the cell with the same label are made to influence the optimization process.
Since the optimization process is an iterative process, one of the important aspects is to formulate a terminating criterion. If E area (v(s)) is the surface area of the deformable model at any instant t, then we stop the iterations when where λ is an experimentally determined small threshold value and E area (v(s, t)) = (v(s, t)) ds. Figure 6 shows the application of the deformable model technique to mark the surface of a cell belonging to a fairly close cluster. Figure 6(a) shows the initialization in of the ellipse in one image slice. Figure 6(b) shows the sequence of the part of the image showing the marked cell surface after optimization. Figure 7 shows the result of segmentation by active surfaces on a different data set.

Segmentation and counting of FISH signals
We applied the results of segmentation by deformable models for counting the number of FISH signals in each cell nucleus. In molecular pathology, FISH signals play a significant role in deciding the malignancy potential of the prostate tumour [7]. The goal of quantitative FISH evaluation is to measure the pathological alterations, i.e., gain (trisomy) or loss (monosomy) of a particular chromosome (chromosome 7 in the present case) in the cell nuclei. In the signal channel of a multispectral image volume, FISH signals appear as a tiny 3D object consisting of high intensity voxels extending to several image slices and are located entirely within the cell nucleus. Though signals located outside the cell nuclei are not of pathological importance since only the trisomy or monosomy of the chromosomes within the cell nuclei is related to malignancy of the tumour. For a normal cell, the cell nucleus contains two FISH signals due to aberrations of chromosome 7 and any variation in FISH count can be attributed to an abnormal pathological status [6,7]. But counting the FISH signals depends on distinguishing them from noise and ascertaining their membership to a particular nucleus. Visual counting of the FISH signal is a difficult, cumbersome and fatiguing process. The only possible way for the visual inspection of signals inside the cell nuclei is by displaying the gallery of the images containing the cell and then counting the signals keeping in mind the three-dimensional nature and hence the continuity of FISH signal to several image slices. We used object isolation by the volume growing technique to detect the signal.

Segmentation of FISH
Segmentation of the FISH concerns the proper selection of the features to identify the FISH signals and to reduce the various types of errors that may occur while counting the signals automatically. Some of the major features we used to identify the FISH signals are: (i) size; (ii) shape (three-dimensional nature); (iii) average intensity; (iv) relative intensity; (v) location of the signal; (vi) number of intensity peaks within the signal, etc. An appropriate maximum and minimum size threshold for the signal are determined experimentally. Only those signals that fall within the size are considered for further processing. The shape of the sig-nal is due to its three-dimensional nature. Considering the excellent axial resolution of the image stack, only those signals that occupy a minimum of two image slices in a continuum are considered as FISH signals subject to other conditions. A threshold for aver- age and peak intensity of the signals is set experimentally. Relative intensity is the ratio of total intensity of the signal to average of the total intensity of all the signals within the cell nucleus. Signals that lie completely within the cell are automaticaly chosen for counting. A problem arises when the signals fall close to the boundary or midway through the common boundary between touching cells. In such cases, the signal is given a membership of a cell in which more than 50% of the signal is located. Still an error may occur due to missed and split FISH signals, overlapping signals, improper segmentation of the cells, etc. We tried to reduce these errors by signal enhancement and noise removal steps explained below.
The region of interest in the FISH signal channel (green channel in the multispectral image volume) is defined by superimposing the surface of the cell as ob-tained by optimum surface of the deformable model in the red channel of the multispectral image volume over the FISH signal channel. Once the region of interest is defined, the task shifts to noise removal, segmentation and counting of FISH signals in the defined region. The isolated group of voxels are removed by size filtering where a minimum size of a FISH signal is used as a priori information. All the connected structures that are lower than this size threshold are considered as noise artifacts and removed. The signals that exceed the maximum size limit are checked for overlapping. The number of intensity peaks in such a signal is counted. If there are more than one intensity peaks and the distance between the intensity peaks is above the minimum threshold, the object is considered as authentic signal. The number of signals counted in such cases is equivalent to the number of intensity peaks in the signal. There can be a cluster of noisy voxels that are linked to each other by a thin and frail connection. These clusters escape the size filter as the total volume exceeds the threshold. We have applied morphological opening to remove the thin and frail links as well as to remove the isolated voxels. The FISH signals are enhanced using a 3D top-hat filter [20]. A top-hat filter consists of a central core and a surrounded by a rim. If the difference between the brightest values in the outer rim region and the brightest value in the inner core region is above a pre-defined threshold, with core region brightness being greater than the rim region, then the voxels of the core region are enhanced further while that of the rim region are reduced, and the result is written into a separate image file. Figure 8 shows the FISH signal channel as a sequence of images with a superposed boundary of the cell, with reduced noise and the enhanced FISH signal.
The resulting image volume contains FISH signals with virtually superimposed cell surface. A threedimensional component labeling is applied separately for labeling and counting the FISH signals within each cell of interest. If U m is the image data set of a tissue specimen containing N cell nuclei of interest, and if there are j FISH signals in the ith nucleus then they are documented as U m (i) = j where i = 1, 2, . . . , N . Figure 8 shows the result of processing the FISH signal channel. Figure 8(a) is the unprocessed signal channel. Figure 8(b) is the noise reduced signal channel. Figure 8(c) shows the result of 3D top-hat transformation and Fig. 8(d) shows the overlaid cell boundary for counting the FISH signals in each cell nucleus.

Experimental results and discussion
The development and application of image analysis procedures are performed under the Interactive Data Language (IDL) and C on IRIX 5.2. As hardware development platform a Silicon Graphics work station INDY is used. Figure 6 shows results of application of deformable models for segmenting a cell of interest in tumour tissue. Normally, all the cells in the tissue image are not of interest to the pathologists. Use of deformable models has an advantage of selecting the cells of interest before automatic analysis of the image. The precision with which we can trace the boundary using deformable models helps in accounting of FISH signals that are very near to the surface and may not fall within a low level detector segmented boundary. Moreover, the precise shape features of the cell nucleus that can be obtained by using deformable models for segmenting the cells can also be used as one of the factors in deciding the pathological status of the tissue specimen. Table 1 gives the comparative scoring of the FISH signals by different methods, over several different specimens. It also includes the count over 2D images obtained on the same specimen. The comparison of 2D and 3D results, considering the count by a pathologist as the gold standard, proves the need for 3D analysis.
Comparison among various analyses tools shows how the use of deformable models for segmenting the tissue give precise and reliable results rather than other methods. The percentage error while counting the FISH signals by automatic methods is also shown in Table 1.
The high level segmentation and analysis techniques such as the deformable models are computationally expensive and need a fair degree of interactiveness in the process. In cases where the accuracy of boundary decimation is not so important as compared to the automation, deformable models are not suitable for use. In complicated image structures, the need for interactive initialization and placing of hard constraints can take more time than anticipated.