Automated Image Analysis of Lung Branching Morphogenesis from Microscopic Images of Fetal Rat Explants

Background. Regulating mechanisms of branching morphogenesis of fetal lung rat explants have been an essential tool for molecular research. This work presents a new methodology to accurately quantify the epithelial, outer contour, and peripheral airway buds of lung explants during cellular development from microscopic images. Methods. The outer contour was defined using an adaptive and multiscale threshold algorithm whose level was automatically calculated based on an entropy maximization criterion. The inner lung epithelium was defined by a clustering procedure that groups small image regions according to the minimum description length principle and local statistical properties. Finally, the number of peripheral buds was counted as the skeleton branched ends from a skeletonized image of the lung inner epithelia. Results. The time for lung branching morphometric analysis was reduced in 98% in contrast to the manual method. Best results were obtained in the first two days of cellular development, with lesser standard deviations. Nonsignificant differences were found between the automatic and manual results in all culture days. Conclusions. The proposed method introduces a series of advantages related to its intuitive use and accuracy, making the technique suitable to images with different lighting characteristics and allowing a reliable comparison between different researchers.


Introduction
Branching morphogenesis results on the creation of branched structures in the body and is a key and fundamental feature of several organs development and growth, such as lungs, pancreas, salivary glands, mammary glands, kidney, and prostate [1][2][3]. The lung branching morphogenesis (LBM) of fetal rat explants, grown in vitro, has been an essential tool in research of molecular and cellular development mechanisms. This methodology has been widely studied, at different gestational ages in vivo and in vitro, in many research centers due to its stability and versatility [4][5][6][7].
Usually, LBM analysis involves a morphometric analysis of lung explants differentiation and growth, during a 5day period, using stereo microscope images acquired at 24hour intervals. For each day, a comprehensive study of the branching pattern of embryonic lungs by quantifying the branches perimeter, area, outer contour, and number of peripheral airway buds is performed.
Although, over the past decade, there have been significant advances in understanding of the genetic control of lung development, to the best of our knowledge, all LBM studies are still performed by manual image quantification using generic 2D curves software. LBM analysis remains a timeconsuming procedure, dependent on researcher expertise and error-prone. Often, it prevents the biological result comparison among different researchers, to deduce biological validations and theorems, due to ambiguous and inaccurate measurements [8]. Therefore, besides the different image processing approaches proposed in the biological research domain [9][10][11][12][13][14][15][16][17][18], none of these works are applied to LBM analysis preventing us from discussing the state-of-the-art technology to deal with the same problem. Considering the literature pitfalls, we propose a new methodology capable of automatically performing the LBM morphometric analysis in order to reduce or even eliminate the researcher dependence, providing fast, robust, userindependent, and accurate results.

Methods
All methods were developed using C++, ITK (Segmentation & Registration Toolkit), and VTK (Visualization Toolkit). RGB images were acquired at the Life and Health Sciences Research Institute (ICVS) of School of Health Sciences, University of Minho, Portugal, using a stereo microscope (Olympus SZX16). Each CT slice has 768 × 576 pixels, with a pixel resolution of 40 m. All images were acquired at the ×20 magnification.
Depending on the biological laboratory trials, LBM images can be acquired with different color conditions (examples in Figures 1(a) and 1(b)). The computer application presented in this work was tested and validated for both kinds of images. The segmentation process assumes that the region of interest to be segmented is the union of one or more small primitive regions previously calculated from the input image. To this extent, different seeds were automatically calculated and placed within the lung epithelia allowing a multithreading cluster growth.

Preprocessing
Filtering. The RGB input images were first converted to grayscale values by averaging and normalizing the 3-color components. Then, the grayscale image was input to an anisotropic diffusion algorithm [19] which reduces the noise spots corrupting the image.
This algorithm depends on three parameters (empirically defined): the number of iterations, edge parameter ( ), and an edge-stopping diffusivity function ( , ), according to Tukey's function: This outcome was used to automatically segment (1) the outer contour of the lung explant object and also (2) the inner epithelia by merging different clusters according to an image partitioning.

Lung Explant Outer
Contour. An adaptive and multiscale thresholding technique was used to accomplish the segmentation of the whole lung explant object from the background.
Initially, a global threshold that maximizes the image entropy between a segmented object (the lung explant) and its background was automatically calculated. For that, consider an image with pixels, ( ) as the image intensity at position ( = 1, 2, . . . , ), and IMin and IMax equal to 0 and 255. Moreover, consider ( ) as the result of a threshold algorithm with a threshold level at ( ), ( ) as the entropy of the objects of ( ), ( ) as the entropy of the background of ( ), and as the total entropy ( ( )+ ( )). Ranging from IMin to IMax, the value of was determined by the intensity ( ) that maximized . Although this global threshold produces suitable contours in all lung objects with image properties as Figure 1(a), it fails for Figure 1(b) due to the irregularities, small contrast variability, and outer contour ambiguities.
Consequently, for this kind of images, the initial outer contour was redefined with the following steps: (a) calculation of an initial binary object and its centroid ( in Figure 3 (a) (d) determine the distance average = ∑ = =1 / ; it defines circle with radius /2 and center at ; (e) each point corresponds to the center of a new circle with radius /4 that was used to automatically determine a local threshold level using the same entropy maximization criterion; (f) the threshold value, for all pixels that were not inside of any circle with center at , was smoothly interpolated using a B-Spline approximation method described in [20]; (g) the resulting binary image allowed the determination of an iso-contour for the whole lung object; this contour was later smoothed with a Gaussian distribution producing a shrinking effect (Figure 3(c)).

Epithelial Segmentation
The image gradient magnitude of the filtered image was input to an algorithm that divides the input image into small regions. Figure 4(a) shows how the different regions were labelled by a starting point (red, Figure 4(a)) and follow the flow line, whose direction was the gradient of minimum local intensity. The minimum gradient (green, Figure 4(a)) path of each pixel of the input image was tracked by recursively selecting a pixel in the 8connected neighborhood (yellow, Figure 4(a)). If more than one pixel exist, the last pixel found was taken, considering as a reference pixel. Every pixel along the path is marked as a local minimum of the gradient magnitude and assigned a distinct label.
In the end, the whole image was divided into primitive regions. Each region shares the same statistical properties and the boundaries coincide with the ridges of the gradient magnitude surface (Figures 4(a) and 4(b)).

Clustering Regions
Creating Seeds. The image partitioning contains a set of nonoverlapping regions. Although the probability of having region boundaries corresponding to boundaries of important objects increases with oversegmentation, it can also create many insignificant boundaries. This stage describes how one dealt with this problem and the inner lung epithelium was automatically determined. Briefly, the procedure consists of the identification and clustering of similar primitive regions. Each cluster starts growing from different seeds, initialized within the lung epithelia, along different lines (with = 1, 2, . . . , 8 (total number of lines), Figure 5(a)).
First, a neighborhood (with = 1, 2, . . . , 8, white circle in Figure 5) defines a kernel with center at centroid , circle shape, and radius of 8 pixels. An iterative process transverses each line with (white arrow in Figure 5), while it calculates the average distribution of the kernel neighborhood originating different candidate seeds. The final seed for clustering grown will be the one where the average distribution within the kernel was minimum (black circles in Figure 5 defining a seed ).
The regions belonging to each seed were used to calculate initial statistical properties of the epithelia (centroid, average distribution, minimum and maximum values, region edges, region neighbors, and entropy) that were used for clustering growth.
Clustering Growth. The merging procedure was based on the similarity between regions formulated mathematically as a local optimization problem using the minimum description length principle [21]. If any primitive region is neighbor of a cluster, initialized at a seed , a decision rule will state if it should be included or not.
Let ( , ) be a two-dimensional function that denotes the 2D input image with constant regions and ( = 1, 2, . . . , ) the original image intensity in the th region. Let ( , ) be a region of ( , ) with circular shape that includes a maximum number of regions ( = 100, experimentally calculated), located within the outer contour and center at the new query region (tested whether its inclusion in the cluster is adequate) (Figure 6(a)). Using a region ( , ) with a limited number of primitive regions reduces the probability to merge statistically outlier regions, providing a truer picture of the inner epithelia.
According to the minimum description length, the image data was coded in order to determine the total number of bits necessary to encode the region ( , ) given by  Figure 5: Overview of the clustering procedure. The white circle represents a kernel at center ; the black circles represent the initial seeds for clustering growth. partitioning, 1 the number of bits required to code the starting point, and 2 the number of bits required to code each element of the boundary chain code).
As a local optimization problem, the inclusion of new region in the cluster aims at providing the largest positive description length gain in ( , ) at each step with local optimization.
If a new region is merged in the cluster, the cluster will have more pixels. Hence, more bits are needed to encode the cluster image intensities. However, the common boundary segment between and disappears and the total description length might decrease, since the number of bits needed to encode the new region boundary information decreases.
With ( , ) being the number of common boundary elements of and and new the new cluster resulting from the merging procedure, the value of description length gain associated with this merging is given by If > 0, the image intensities between and are quite similar, whereby these regions belong to the same object. Thus the entropy increase is compensated by the elimination of the common boundary, and then these two regions should be merged in order to minimize ( ).

Buds
Counting. The number of peripheral buds was counted based on the lung inner epithelial skeletonizing [22]. The skeletonizing is performed by removing the boundary and corner points of the epithelial object, until only the skeleton remains as white pixels.
The number of peripheral buds was determined as the number of parents (circles in blue, Figure 7(c)) of the skeleton branched ends (circles in red, Figure 7(c)). Only the branched ends located near the outer contour (distancing less than 25% of the axis that transverses the lung object vertically) were considered.

Results
The performance of this new computer algorithm was tested and validated in a total of 210 images: 90 (18 images for each day of culture) with image conditions as Figure 1(a) and 120 (24 images for each day of culture) with image conditions as Figure 1(b). All images were previously segmented by three experienced researchers: each user manually segmented the inner epithelia and outer contour and counted the number of peripheral buds. The performance of the manual quantification was accessed by comparing the results among different researchers. The inner epithelia and outer counter were evaluated using the dice similarity coefficient (DSC) that quantifies the spatial overlap between different segmentations as follows: where TP is the number of true positives, FP the false positives, and FN the false negatives. The DSC score ranges from 0%, indicating no spatial overlap between sets of binary segmentation results, to 100%, indicating complete overlap. Table 1 shows the mean DSC score and error difference for the LBM morphometric analysis among three researchers. Table 2 shows the mean differences when the automatic method was compared to each manual result. The peripheral buds error is the difference average error between users in Table 1 and difference average error between manual and automatic methods in Table 2.
Best results were obtained in the first two days for both types of images with less standard deviations. High similarities between the manual and automatic procedure were not easily obtained for the last two days of culture, due to the increased number of peripheral airway buds and lung architecture complexity.
Statistical analysis using ANOVA test (under SPSS Windows version 17.0 software where values lower than 0.05 were accepted as significant) shows nonsignificant differences between the automatic and manual results (inner epithelia and outer counter) in all culture days. On the other hand, the error of peripheral buds was nonsignificant only on the first three days of culture. Nonsignificant differences were found between researchers. The time for LBM morphometric analysis was reduced to 1-2 seconds/image in contrast to the manual one (2-3 minutes/image).
Results from different images regarding images with different lighting conditions are shown in Figure 8.
A user interface was also developed that allows the user to spatially understand the microscope image and rapidly produce an automatic segmentation. If necessary, manual editing may be used to correct automatic segmentation results. Although this manual editing (used to eliminate false merges) improves DSC scores (>98%) and decreases the standard deviations, it also adds user dependence and slows the morphometric analysis (depending on the editing degree). This strategy has been used at the ICVS research lab to study and analyze the effect of different concentrations of a specific inhibitor in lung branching morphogenesis.

Discussion
A computer application was developed, providing an automatic procedure to enable a fast LBM analysis. The morphometric analysis efficiency and robustness were increased while time consumption, user dependence, and subjectivity were decreased or even eliminated. The observer variability was eliminated since all regions were computed and merged automatically.
The segmentation rate depends on the number of regions needed to be merged to select the entire lung epithelial region. However, the processing time was always about 98% less than the manual one.
The merging procedure was essential to achieve a good segmentation, since a significant amount of regions were initially created by the partitioning algorithm. The automatic seed selection was suitable to segment the inner epithelia using the minimum description length criterion that  selectively cluster regions based on their image intensity distribution similarity. It was seldom observed that clusters merged dissimilar regions due to ambiguity and lack of definition of the inner lung explants contours. The worst results were obtained in the last two days of culture (Table 2), whose images present low details and perceptibility, and the branched ramifications increase drastically. However, the usage of a local image region, centered at the query region, tested whether its inclusion is suitable using the minimum description length principle. This allowed a reliable cluster to grow along epithelial object that changes gradually over space.
The local threshold was efficient to automatically delineate the outer contour with high DSC scores in images with lighting characteristics presented in Figure 1(b).
The main difficulties found in the segmentation procedure were the contrast ambiguity and variances of inner epithelia, complexity of the branched shape, and size, and the presence of neighborhood regions with the same density.
Moreover, response to drugs and biological markers and different culture medium can induce image intensity variations and shadows near the outer contour.
These difficulties were more evident at the last two days of culture where DSCs are lower, even when comparing the results from different researchers. However, the inner epithelial segmentation was always segmented with scores around 90%, even at the last two culture days, indicating successful segmentation.
However one has to recognize some limitations within this work. The proposed algorithm overpredicts the number of peripheral buds with statistical significance with manual counting. However, peripheral buds counting is a controversial issue, without any stated method. As presented in Table 1, the coefficient of variation (defined as the ratio between the standard deviation and mean) among researchers is around 1, showing that there is no coherency between them. With this methodology, one aimed to broaden and generalize this procedure among different researchers. However, further