This paper presents an optimized level set evolution (LSE) without reinitialization (LSEWR) model and a shape prior embedded level set model (LSM) for robust image segmentation. Firstly, by performing probability weighting and coefficient adaptive processing on the original LSEWR model, the optimized image energy term required by the proposed model is constructed. The purpose of the probability weighting is to introduce region information into the edge stop function to enhance the model’s ability to capture weak edges. The introduction of the adaptive coefficient enables the evolution process to automatically adjust its amplitude and direction according to the current image coordinate and local region information, thus completely solving the initialization sensitivity problem of the original LSEWR model. Secondly, a shape prior term driven by kernel density estimation (KDE) is additionally introduced into the optimized LSEWR model. The role of the KDE-driven shape prior term is to overcome the problem of image segmentation in the presence of geometric transformation and pattern interference. Even if there is obvious affine transformation in the shape prior and the target to be segmented, the target contour can still be reconstructed correctly. The extensive experiments on a large variety of synthetic and real images show that the proposed algorithm achieves excellent performance. In addition, several key factors affecting the performance of the proposed algorithm are analyzed in detail.
Funding: Fundamental Research Funds for the Central Universities of China (ZYGX2018J079); China Scholarship Council (201706075068).
1. Introduction
Image segmentation is an important intermediate step in the field of computer vision, which aims to partition a given image into a set of nonoverlapping regions where the internal pixels are homogeneous with respect to intensity, color, texture, motion, semantics, etc. Its output quality directly determines the success or failure of higher level visual tasks such as 3D reconstruction, motion analysis, pattern classification, and object recognition. Among various types of image segmentation theories, the LSMs are widely used because they are capable of outputting closed and smooth target contours and can naturally handle topology changes. The core idea of this type of method is constructed by Osher and Sethian [1], and its key is to implicitly represent contour as the zero level set of a higher dimensional level set function (LSF) and then compute a time-dependent equation to obtain a deforming surface.
According to the types and properties of image features (local features or global features) used by these models in building their energy functions (usually composed of external energy terms and internal energy terms), we can roughly divide the existing popular LSMs for image segmentation into the following three categories: edge-based methods [2–8], region-based methods [9–16], and hybrid fitting energy-based methods [17–21].
Edge-based methods usually rely on gradient information to construct their core energy items. When the edge features of the image are clear and the background noise level of the image is not high, they can achieve good segmentation results. However, when there is a certain amount of noise in the image or the edge of the target is very blurred (the corresponding gradient amplitude is very small), the evolution process of such models generally exhibits the following problems: falling into local minima (easy to be pulled by background noise information to the wrong location), edge leakage (unable to locate the blurred target edges), and sensitive to initialization (the final segmentation result is directly related to the position and shape of the initial contour).
For example, the image gradient-driven geodesic active contour (GAC) model proposed by Caselles et al. [2] does achieve good segmentation results on image data with significant gradient strength. However, when there is noise in the image or the gradient strength of the target is not obvious, its segmentation accuracy will drop sharply, and it will synchronously present the series of problems mentioned above. Li et al. [3] proposed a segmentation framework called level set evolution (LSE) without reinitialization (LSEWR), which is also based on the gradient information of the image. The framework does output good segmentation results on some high-quality images. However, it has two obvious drawbacks: one is that the speed of curve evolution cannot be adaptively changed with the change of local image characteristics, and the other is that its segmentation result is highly sensitive to initialization.
Region-based methods usually rely on the statistical information of the region (local region or global region) inside and outside the active contour to construct their dominant energy terms. The common regional information elements have intensity, color, texture, motion characteristics, etc. Compared with edge-based methods, such methods have significant performance improvements in terms of weak edge capture, background noise suppression, and insensitivity to initialization. However, this type of method also has obvious shortcomings, i.e., they sometimes cannot accurately locate the true target edge. For example, Chan and Vese (C-V) [9] constructed an energy functional based on global region information, which can effectively measure the difference between the current pixel value and the statistical mean of the region. Applying it to images without obvious edges or even without edges can usually achieve good segmentation results. However, in many cases, it cannot give high-precision edge location. Li et al. [14] constructed a region scalable fitting (RSF) energy model for inhomogeneous image segmentation based on the local statistical information (the local region mean in the Gaussian-weighted window) of the image. By using a Gaussian distribution with different local means and variances to describe the local image intensity, Wang et al. [15] constructed a LSM called local Gaussian distribution fitting (LGDF) energy model for image segmentation. Zhang et al. [16] presented an active contour model driven by local image fitting (LIF) energy model whose energy functional is defined by minimizing the difference between the fitted image and the original image. 
The three local region information-driven LSMs above achieve good segmentation on most images, but they share an obvious drawback: their segmentation results are highly sensitive to the position and shape of the initial curve, so different initializations may produce different segmentation results. We study this problem in Section 4.1.
In order to overcome the shortcomings of the above two types of methods and reasonably inherit the advantages of each type of method, the hybrid fitting models driven by the energy terms of hybrid nature were born. By integrating the energy elements of the edge-based and region-based models in different forms and supplemented by different forms of coupling coefficients, they show different superiority in different applications. However, such models sometimes have the following two problems: (1) the complexity of the numerical discretization process, i.e., the time complexity of the evolution process, is often very high and (2) it is easy to produce a situation where the regional terms and the edge terms are difficult to balance. At this time, the evolution process will lead to erroneous and strange driving forces, resulting in unexpected fluctuation in the evolution process. For example, Wang Li et al. [21] constructed a hybrid LSM called local and global intensity fitting (LGIF) energy model for image segmentation. It couples the energy functional components of the C-V model [9] and the LIF model [16] together through a constant coefficient. In some image data segmentation tasks, it does achieve good segmentation results. However, it just sets its coupling coefficient based on experience. Obviously, such an approach is unscientific and buries some unpredictable dangers to the turbulence of the evolution process.
The LSMs can output good segmentation results in most cases. However, because of the complexity and diversity of input data and image segmentation problems, it is sometimes not enough to use the image data alone for segmentation, especially when there are unfavorable factors such as occlusion, clutter, and low contrast in the images. Under such circumstances, adding known shape information as a constraint to the energy functional is an effective solution.
In view of this, this paper proposes a shape prior embedded LSM and applies it to the practice of segmentation of different kinds of images. Experimental results show that our method achieves very ideal segmentation performance. Firstly, we made a deep optimization of the edge stop function and the coefficient of weighted area term in the original LSEWR model constructed by Li et al. [3]. When optimizing the edge stop function, we introduce the thought of probability weighting. The existence of the statistical probability term makes the edge stop function contain region information. Therefore, our model has a stronger weak edge capture capability; when optimizing the coefficient of the weighted area term, we abandon the original constant coefficient because the constant-type coefficient makes the model only evolve in a single direction, which is obviously problematic in practical applications, and the most obvious phenomenon is that the segmentation result is highly related with the initialization. To overcome this limitation, we modify the constant weighting coefficient to a variable which is directly related to an image coordinate and local region information. The modified weighting coefficient can adaptively adjust the amplitude and speed of the evolution according to the position of the active contour; i.e., the final segmentation result is completely independent of initialization. Secondly, we embed the shape priors into the optimized LSEWR model. When constructing the shape prior-driven energy term, we adopt the KDE thought proposed by Cremers et al. [22], and the two cases of single prior and multiple priors are considered separately. Even if there is obvious affine transformation in the shape prior and the target to be segmented, the target contour can still be reconstructed correctly.
The remainder of this paper is organized as follows: In Section 2, we shall discuss the energy functional construction process of the proposed method in detail. Then, the numerical implementation strategies for the proposed model are presented in Section 3. Section 4 validates the proposed model by extensive experiments and discussions on a variety of images. Finally, some conclusive remarks are provided in Section 5.
2. The Proposed Model
In order to improve the integrity of the segmentation results of the LSM in the presence of partial occlusion, low contrast, and strong background clutter, we embed the energy term with shape priors into the LSM based solely on image data. The expression of the proposed model is as follows:

$$E = a \times E_{\mathrm{image}} + b \times E_{\mathrm{prior}}, \tag{1}$$

where $a$ and $b$ are the weight coefficients, both of which lie in the interval $[0,1]$ with $a+b=1$, and $E_{\mathrm{image}}$ and $E_{\mathrm{prior}}$ are the energy terms driven by the image data and the shape prior, respectively. The combination of these two energy terms forms a new energy functional with shape prior information. $E_{\mathrm{image}}$ comprehensively considers the edge and region information of the image. It is similar to the model of Li et al. [3] in terms of formula structure, but we use a different edge stop function and a different coefficient for the weighted area term. It is defined as follows:

$$E_{\mathrm{image}}(\phi) = \mu \int_{\Omega} \frac{1}{2}\left(|\nabla\phi| - 1\right)^{2} dx\,dy + \lambda \int_{\Omega} g\,\delta(\phi)\,|\nabla\phi|\,dx\,dy + \int_{\Omega} v\,g\,H(\phi)\,dx\,dy, \tag{2}$$

where $\phi$ is the LSF, $\mu$ and $\lambda$ are two control parameters, the functions $H(\cdot)$ and $\delta(\cdot)$ are the one-dimensional Heaviside and Dirac functions, respectively, $v$ is a variable coefficient related to the pixel coordinates of the image, which will be described in detail later, and $g$ is a probability-weighted edge stop function, which is defined as follows:

$$g = \frac{1}{1 + P(\Omega_O \mid I) \times P(\Omega_B \mid I) \times |\nabla G_{\sigma} * I|^{c}}, \tag{3}$$

where $I$ is the image to be segmented, $G_{\sigma}$ is the Gaussian filter kernel, "$*$" is the convolution operator, and $P(\Omega_O \mid I)$ and $P(\Omega_B \mid I)$ are the posterior probabilities that the sample pixel belongs to the target and background, respectively. By using Bayes' theorem, we can infer their expressions:

$$P(\Omega_O \mid I) = \frac{P(I \mid \Omega_O)\,P(\Omega_O)}{P(I \mid \Omega_O)\,P(\Omega_O) + P(I \mid \Omega_B)\,P(\Omega_B)}, \qquad P(\Omega_B \mid I) = \frac{P(I \mid \Omega_B)\,P(\Omega_B)}{P(I \mid \Omega_O)\,P(\Omega_O) + P(I \mid \Omega_B)\,P(\Omega_B)}. \tag{4}$$
The prior probability $P(\Omega_{O/B})$ in equation (4) can be assumed to be equal to 0.5 when it is unknown in advance. We further assume that the conditional probability $P(I \mid \Omega_{O/B})$ obeys the Gaussian distribution:

$$P(I \mid \Omega_{O/B}) \propto \frac{1}{\sqrt{2\pi}\,\sigma_{O/B}} \exp\left(-\frac{(x - \mu_{O/B})^{2}}{2\sigma_{O/B}^{2}}\right), \tag{5}$$

where $\mu_{O/B}$ and $\sigma_{O/B}^{2}$ are the mean and variance of the Gaussian distribution, respectively. Let $p = P(\Omega_O \mid I)$. Since $P(\Omega_O \mid I) + P(\Omega_B \mid I) = 1$, we have $P(\Omega_B \mid I) = 1 - p$, and the product $y = P(\Omega_O \mid I) \times P(\Omega_B \mid I) = -p^{2} + p$ is a quadratic function of $p$. It attains its maximum value of $1/4$ at $p = 1/2$, i.e., when $P(\Omega_O \mid I) = P(\Omega_B \mid I)$; accordingly, for a given gradient magnitude, the edge stop function $g$ in equation (3) reaches its minimum value. The physical meaning of $P(\Omega_O \mid I) = P(\Omega_B \mid I)$ is that the current pixel is equally likely to belong to the target and to the background; i.e., the current pixel is located at the intersection of the target and the background, namely the contour of the target. The evolution process therefore stops at the contour of the target, which is exactly the termination behavior we need. Besides, the existence of the statistical probability term makes the edge stop function contain region information. Therefore, our model has a stronger weak edge capture capability.
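As an illustration of equations (3) to (5), the following minimal sketch evaluates the probability-weighted edge stop function at a single pixel. It is not the authors' implementation: the function names, the default exponent `c=2.0`, and the fallback used when both likelihoods underflow are assumptions made only for this example, and the gradient magnitude of the smoothed image is assumed to be computed elsewhere.

```python
import math

def gaussian_likelihood(x, mean, std):
    """Gaussian density of intensity x, following equation (5)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * std ** 2)) / (math.sqrt(2.0 * math.pi) * std)

def posteriors(x, mean_o, std_o, mean_b, std_b, prior_o=0.5):
    """Posterior probabilities of target and background, following equation (4)."""
    like_o = gaussian_likelihood(x, mean_o, std_o) * prior_o
    like_b = gaussian_likelihood(x, mean_b, std_b) * (1.0 - prior_o)
    total = like_o + like_b
    if total == 0.0:
        # Assumption: if both likelihoods underflow, treat the pixel as undecided.
        return 0.5, 0.5
    return like_o / total, like_b / total

def edge_stop(x, grad_mag, mean_o, std_o, mean_b, std_b, c=2.0):
    """Probability-weighted edge stop function g, following equation (3).

    grad_mag stands for |grad(G_sigma * I)| at the pixel (computed elsewhere);
    c=2.0 is an assumed value for the exponent.
    """
    p_o, p_b = posteriors(x, mean_o, std_o, mean_b, std_b)
    return 1.0 / (1.0 + p_o * p_b * grad_mag ** c)
```

For a pixel whose intensity lies midway between two class means with equal spread, both posteriors are 0.5 and the product term takes its maximum of 0.25, so `g` is as small as the gradient allows; deep inside either region the product is close to zero and `g` stays near 1 regardless of the gradient.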
In the above description, by introducing the region information into the edge stop function g of the original LSEWR model, we have optimized and improved the model’s ability to capture weak target edges. Next, we will optimize the value of the coefficient v in equation (2). According to the description of Li et al. [3], the value of v is constant. After specifying its value, the curve will only evolve in one direction, which makes it not only lack sign and amplitude adaptability but also difficult to meet the variety of initialization forms. For example, when the initial curve only intersects a part of the target region, such a unidirectional evolution mode will not output the correct segmentation result. To overcome this defect, we introduce a variable weight coefficient; that is to say, its sign and amplitude are directly related to the pixel value of the current position. The variable weight coefficient has the following two general features: (1)it can automatically change its sign according to the current pixel value, and the resulting effect is that the active contour can choose its traveling direction adaptively, thus weakening its dependence on the initial position of the curve, and (2) it can adaptively change its amplitude according to the image gradient, and the resulting effect is that the active contour has a powerful ability of capturing multilayer contour, thus eliminating the phenomenon of edge leakage.
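In order to match the above features, we propose the following weight coefficient:

$$\nu(I) = k \cdot \mathrm{sign}\left(\Delta G_{\sigma} * I\right) \cdot \left|\nabla G_{\sigma} * I\right|, \tag{6}$$

where $\nu(I)$ is the variable coefficient denoted $v$ in equation (2), "$*$," "$\nabla$," and "$\Delta$" are the convolution operator, gradient operator, and Laplacian operator, respectively, $\mathrm{sign}(\cdot)$ is the sign function, and $k$ is a control constant.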
Below, we give some further analysis:
(1) The second derivative of the image changes its sign after crossing the boundary; that is to say, the sign of the second derivative on both sides of the target boundary is opposite. In addition, we also know that the active contour is divided into two parts by the real target boundary, one is located in the target area and the other is outside the target area. Before the active contour eventually stops at the real boundary of the target, it needs some kind of driving force to continuously pull it; therefore, it is important to determine the direction of the driving force. For the active contour fragment within the target area, its second derivative is positive, that is, ΔGσ∗I>0; we further have signΔGσ∗I>0 and νI>0; thus, the direction of the driving force at this time points to the outside; and the consequent evolutionary effect is that the active contour evolves toward the target contour in the form of expansion. For the active contour fragment outside the target area, its second derivative is negative, that is, ΔGσ∗I<0; we further have signΔGσ∗I<0 and νI<0; thus, the direction of the driving force at this time points to the interior; and the consequent evolutionary effect is that the active contour evolves toward the target contour in the form of shrinkage. Based on the above analysis, we can draw the following conclusions: The active contour (the zero level set corresponding to the LSF) can adaptively determine its evolutionary direction based on the pixel properties, thus getting rid of its dependence on the initial position of the curve completely. As a result, we can place the initial curve anywhere in the image.
(2) The amplitude of the weight coefficient νI depends on the gradient amplitude of the image, so it can adaptively adjust its value according to the image information: when the active contour moves to the vicinity of the target contour, the gradient intensity is larger at this moment, resulting in a larger amplitude of νI. This coefficient adaptive phenomenon greatly improves the multilevel targets extraction ability of the active contour.
(3) In order to deal with different levels of segmentation needs, we specifically introduce a control constant k. When the task flow needs to extract the contour of the target from multiple levels (from the weak contour to the strong contour), the value of k can be appropriately increased. Conversely, we can appropriately lower the value of k when the task flow only needs to extract the main target contours in the image plane.
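The behavior analyzed in points (1) to (3) can be sketched for a single interior pixel as follows. This is only an illustrative sketch under stated assumptions: the image is assumed to be already smoothed by the Gaussian kernel, the derivatives are approximated by central differences and a five-point Laplacian stencil (a choice made here, not taken from the paper), and the function name is hypothetical. The default `k=3.0` follows the value reported in Section 4.

```python
import math

def adaptive_coefficient(smoothed, row, col, k=3.0):
    """nu(I) = k * sign(Laplacian) * |gradient| at an interior pixel, equation (6).

    `smoothed` is a list of rows holding G_sigma * I; the smoothing itself is
    assumed to have been done beforehand.
    """
    # Central differences for the two gradient components.
    d_row = (smoothed[row + 1][col] - smoothed[row - 1][col]) / 2.0
    d_col = (smoothed[row][col + 1] - smoothed[row][col - 1]) / 2.0
    grad_mag = math.hypot(d_row, d_col)
    # Five-point stencil for the Laplacian.
    laplacian = (smoothed[row + 1][col] + smoothed[row - 1][col]
                 + smoothed[row][col + 1] + smoothed[row][col - 1]
                 - 4.0 * smoothed[row][col])
    # Sign function: +1, 0, or -1.
    sign = (laplacian > 0) - (laplacian < 0)
    return k * sign * grad_mag
```

On a convex intensity patch the Laplacian is positive and the coefficient is positive; on a concave patch it is negative; on a flat patch it vanishes. Its magnitude scales with both the local gradient and `k`.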
In addition, to achieve accurate segmentation of image targets with slight or severe occlusion, we here construct the shape priori term Eprior in equation (1) based on Cremers et al.’s [22] thought. We need to consider the following two cases separately:
Single shape prior. In the case of single shape prior (only one target in the test image is similar to the object in the shape prior library), the shape prior energy is defined as
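$$E_{\mathrm{prior}}^{s} = \int_{\Omega} \left(H(\phi) - H(\bar{\psi})\right)^{2} dx\,dy, \tag{7}$$

where $\phi$ is the current LSF, $\bar{\psi}$ is the result of the affine transformation of the shape prior $\psi$ according to the moment [23] of $\phi$, and $H(\cdot)$ is the Heaviside function.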
Multiple shape priors. For the case of multiple shape priors (multiple targets in the test image are similar to the objects in the shape prior library), we first use the KDE method to estimate the probability density and then calculate its negative logarithm to form the required shape prior energy term:
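$$E_{\mathrm{prior}}^{m}(\phi) = -\ln\left(\frac{1}{N} \sum_{i=1}^{N} \exp\left(-\frac{d^{2}\left(H(\phi), H(\bar{\psi}_{i})\right)}{2\sigma_{\mathrm{prior}}^{2}}\right)\right), \tag{8}$$

where $\bar{\psi}_{i}$ is the result of the affine transformation of the $i$-th shape prior $\psi_{i}$ according to the moment of $\phi$, $d^{2}(\cdot,\cdot)$ is the squared Euclidean distance between two LSFs, and $\sigma_{\mathrm{prior}}$ is the width of the kernel function under the KDE framework, whose value can be calculated with the following formula:

$$\sigma_{\mathrm{prior}}^{2} = \frac{1}{N} \sum_{i=1}^{N} \min_{j \neq i} d^{2}\left(H(\bar{\psi}_{i}), H(\bar{\psi}_{j})\right). \tag{9}$$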
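To make equations (8) and (9) concrete, here is a minimal sketch that evaluates the multiple-prior energy on flattened binary masks standing for $H(\phi)$ and $H(\bar{\psi}_i)$. It uses the Gaussian kernel with the negative exponent that appears in equation (12). The function names are hypothetical, the priors are assumed to be already affinely aligned and mutually distinct (otherwise the kernel width could be zero), and the discrete sum of squared differences is used as a stand-in for $d^2$.

```python
import math

def squared_distance(mask_a, mask_b):
    """Discrete d^2 between two flattened binary masks of equal length."""
    return sum((a - b) ** 2 for a, b in zip(mask_a, mask_b))

def kernel_width_sq(priors):
    """sigma_prior^2: mean squared distance to the nearest other prior, equation (9)."""
    n = len(priors)
    if n < 2:
        raise ValueError("equation (9) needs at least two shape priors")
    nearest = [min(squared_distance(priors[i], priors[j]) for j in range(n) if j != i)
               for i in range(n)]
    return sum(nearest) / n

def multi_prior_energy(mask, priors):
    """E = -ln((1/N) * sum_i exp(-d_i^2 / (2 * sigma^2))), equation (8)."""
    sigma_sq = kernel_width_sq(priors)  # assumed > 0, i.e. priors are distinct
    kernels = [math.exp(-squared_distance(mask, p) / (2.0 * sigma_sq)) for p in priors]
    return -math.log(sum(kernels) / len(priors))
```

A mask that coincides with one of the priors yields a lower energy than a mask far from all of them, which is the property that drives the evolving contour toward the shapes in the database.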
By minimizing equation (7) or (8), the current LSF will evolve toward a particular shape prior; that is, the constraint force exerted by the shape prior in the evolution process makes the current LSF more and more similar to the shape prior until it converges to the desired target form. In this way, even if the target to be segmented is slightly or severely occluded, the evolution process can reconstruct the desired target contour correctly according to the shape prior.
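Minimizing the energy functional $E$ with respect to $\phi$ by using the calculus of variations and the steepest descent method, we can deduce the corresponding gradient descent flow as

$$\frac{\partial\phi}{\partial t} = -\frac{\partial E}{\partial\phi} = a\left[\mu\left(\Delta\phi - \mathrm{div}\left(\frac{\nabla\phi}{|\nabla\phi|}\right)\right) + \lambda\,\delta(\phi)\,\mathrm{div}\left(g\,\frac{\nabla\phi}{|\nabla\phi|}\right) - v(I)\,g\,\delta(\phi)\right] - (1-a)\,\frac{\partial E_{\mathrm{prior}}}{\partial\phi}, \tag{10}$$

where $\delta(\cdot)$ is the Dirac function, and the expression of $\partial E_{\mathrm{prior}}/\partial\phi$ takes one of the following two forms:

$$\frac{\partial E_{\mathrm{prior}}^{s}}{\partial\phi} = 2\,\delta(\phi)\left(H(\phi) - H(\bar{\psi})\right), \qquad \frac{\partial E_{\mathrm{prior}}^{m}}{\partial\phi} = \frac{\sum_{i=1}^{N} \beta_{i}\left(H(\phi) - H(\bar{\psi}_{i})\right)\delta(\phi)}{2\sigma_{\mathrm{prior}}^{2} \sum_{i=1}^{N} \beta_{i}}, \tag{11}$$

where

$$\beta_{i} = \exp\left(-\frac{d^{2}\left(H(\phi), H(\bar{\psi}_{i})\right)}{2\sigma_{\mathrm{prior}}^{2}}\right). \tag{12}$$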
3. Numerical Implementation
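In the numerical implementation, reference [3] uses a regularized Dirac function defined as follows:

$$\delta_{\varepsilon}^{\mathrm{old}}(x) = \begin{cases} 0, & |x| > \varepsilon, \\ \dfrac{1}{2\varepsilon}\left[1 + \cos\left(\dfrac{\pi x}{\varepsilon}\right)\right], & |x| \le \varepsilon. \end{cases} \tag{13}$$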
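The support domain of equation (13) is $[-\varepsilon, \varepsilon]$, which means that its control over the evolution process is local. In order to expand its scope of action, this paper uses the following regularized Dirac function to replace $\delta(\phi)$ in equation (2):

$$\delta_{\varepsilon}^{\mathrm{new}}(x) = \frac{1}{\pi} \cdot \frac{\varepsilon}{\varepsilon^{2} + x^{2}}, \quad x \in \mathbb{R}, \qquad \lim_{\varepsilon \to 0} \delta_{\varepsilon}^{\mathrm{new}}(x) = \delta(x). \tag{14}$$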
Since the support domain of the function $\delta_{\varepsilon}^{\mathrm{new}}(x)$ is $(-\infty, +\infty)$, equation (10) acts on the entire LSF, so the evolution tends toward a global minimum of the energy functional. This further improves the ability of the zero level set to detect multilayer contours and to capture deeply concave regions and multiple target boundaries. We use the regularized Dirac function $\delta_{\varepsilon}^{\mathrm{new}}(x)$ with $\varepsilon = 1.5$ by default; task-specific values are given in the figure captions.
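The difference between the two regularized Dirac functions in equations (13) and (14) can be checked numerically with a short sketch such as the one below. The function names are chosen for this example only, and the default `eps=1.5` follows the value quoted above.

```python
import math

def dirac_old(x, eps=1.5):
    """Compactly supported regularized Dirac function, equation (13)."""
    if abs(x) > eps:
        return 0.0
    return (1.0 + math.cos(math.pi * x / eps)) / (2.0 * eps)

def dirac_new(x, eps=1.5):
    """Regularized Dirac function with unbounded support, equation (14)."""
    return eps / (math.pi * (eps ** 2 + x ** 2))
```

Outside the band $|x| \le \varepsilon$ the first function is exactly zero, so pixels far from the zero level set receive no update from the terms multiplied by it, whereas the second function remains small but strictly positive everywhere.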
In addition, the existence of the diffusion term $\int_{\Omega} \frac{1}{2}\left(|\nabla\phi| - 1\right)^{2} dx\,dy$ in the proposed model makes it possible to use a simple finite difference scheme to discretize equation (10), which is defined in the continuous data domain, instead of adopting a complex upwind difference scheme [24] as in the traditional LSMs. All the spatial partial derivatives $\partial\phi/\partial x$ and $\partial\phi/\partial y$ are approximated by the central difference, and the temporal partial derivative $\partial\phi/\partial t$ is approximated by the forward difference. The approximation of equation (10) by the above difference scheme can be written as

$$\frac{\phi_{i,j}^{t+1} - \phi_{i,j}^{t}}{\Delta t} = R\left(\phi_{i,j}^{t}\right), \tag{15}$$

where $R(\phi_{i,j}^{t})$ is the approximation of the right-hand side of equation (10) by the above spatial difference scheme. The difference equation (15) can be expressed as the following iteration:

$$\phi_{i,j}^{t+1} = \phi_{i,j}^{t} + \Delta t \cdot R\left(\phi_{i,j}^{t}\right). \tag{16}$$
The computer programming implementation of the proposed algorithm is based on equation (16), and the procedures of the proposed algorithm are summarized in Algorithm 1.
Step I: Initialize the LSF $\phi$ as a simple numerical function according to equation (18)
Step II: Set the related control and attribute parameters, including $a$, $b$, $\lambda$, $k$, $N$, $\mu$, $\Delta t$, $\sigma$, and $\varepsilon$
Step III: Update the value of $\delta(\phi)$ according to equation (14)
Step IV: Update the values of $g(\cdot)$ and $v(I)$ according to equations (3) and (6), respectively
Step V: Generate the value of $\partial E_{\mathrm{prior}}/\partial\phi$ according to equation (11), where the single shape prior and multiple shape priors need to be considered separately
Step VI: Update the LSF $\phi$ according to equation (10)
Step VII: Repeat steps III to VI until the evolution process converges
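The outer loop of Algorithm 1 amounts to the explicit update of equation (16). The sketch below shows only that loop and leaves the right-hand side as a user-supplied callable. The name `rhs` is a hypothetical placeholder for the discretized right-hand side of equation (10), and the stopping rule (largest pointwise change below `tol`) is an assumption, since the text does not state the convergence test that was used.

```python
def evolve(phi, rhs, dt, max_iter=200, tol=1e-6):
    """Iterate phi <- phi + dt * R(phi), equation (16), until the update is small.

    phi : list of rows holding the LSF on the pixel grid
    rhs : callable returning R(phi) as a grid of the same shape; a hypothetical
          stand-in for the discretized right-hand side of equation (10)
    Returns the final grid and the number of iterations performed.
    """
    for iteration in range(1, max_iter + 1):
        r = rhs(phi)
        new_phi = [[p + dt * q for p, q in zip(phi_row, r_row)]
                   for phi_row, r_row in zip(phi, r)]
        # Assumed convergence measure: largest absolute pointwise change.
        change = max(abs(n - p)
                     for new_row, old_row in zip(new_phi, phi)
                     for n, p in zip(new_row, old_row))
        phi = new_phi
        if change < tol:
            return phi, iteration
    return phi, max_iter
```

In a full implementation, `rhs` would recompute the regularized Dirac function, the edge stop function, the adaptive coefficient, and the shape prior gradient at every call, which corresponds to steps III to V of the algorithm.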
4. Experimental Results and Discussion
In this section, we validate the performance of the proposed method on a variety of test images (Section 4.1) and give a detailed analysis of several key factors that influence the experimental results (Section 4.2). All experiments are implemented in Matlab R2012a on a computer with an Intel Core i7 2.3 GHz CPU, 8 GB RAM, and the Windows 7 operating system. For the following parameters, we use the same values in all experiments: a=0.2, b=0.8, λ=1, and k=3. The standard deviation σ of the Gaussian filter kernel in equation (6) changes from image to image; i.e., its value depends to some degree on the image and the task. In addition, the parameters N in equation (8), μ in equation (10), Δt in equation (16), and ε in equation (14) also vary with the image data and the segmentation task. For these task-related parameters, we give their specific values in the figure captions.
In Section 4.1, we will carry out detailed verification and testing from the following four aspects: application on synthetic images with affine transformation and artificial interference, application on real infrared images with a similar target region pattern, insensitivity to contour initialization, and adaptability to weak target edges. Each of them corresponds exactly to an application direction or dimension of the proposed algorithm. What needs to be emphasized here is that in order to test the wide-range adaptability of the proposed algorithm, we choose the experimental data from different channels. Part of the data comes from the technical literature of the direction of image segmentation, and part of the data comes from the Internet. In a word, these data are weakly correlated. In addition, when conducting comparison experiments, we choose the following comparison methods: LSEWR model, RSF model, LGDF model, LIF model, and Cremers et al.’s model (abbreviated as CREMERS [25]). The previous four segmentation methods are all representative and classical methods in the field of LSM. They are all data-driven type and do not contain any prior constraints, and the LSEWR model is the basis and starting point of the model framework of the proposed method. The CREMERS model is a typical representation of the shape prior class segmentation method, and the logic of its model building process is very clear. In conclusion, it is reasonable and scientific to use these methods as the family of comparison methods in this paper.
In Section 4.2, we analyze several key factors affecting segmentation performance in detail from the following aspects: the objective measure of segmentation results, the principles for choosing the parameter μ and the time step Δt, an easy way to initialize the LSF ϕ, the effect of the parameter ε on segmentation performance, and the relationship between evolution speed and image resolution.
4.1. Experimental Results
4.1.1. Application on Synthetic Images with Affine Transformation and Artificial Interference
In this section, we will validate the adaptability of the proposed model to affine transformation and artificial interference in the case of a single shape prior through a set of comparison experiments. The comparison methods involved are LSEWR model, RSF model, LGDF model, LIF model, and CREMERS model. The left image of Figure 1 is a binary image, which is the shape prior on which this group of comparison experiments relies. The middle image and the right image of Figure 1 are two clean images without noise interference. They are generated by using the shape prior of the left image of Figure 1 as the starting point. The middle image of Figure 1 has the same posture and target region coverage as the shape prior, but we have made artificial modifications to the gray value of the target and background, while the right image of Figure 1 has a significant affine transformation (including translation, scaling, and rotation) in the effective target region compared to the left image of Figure 1.
Figure 1: Shape prior and synthetic clean images. The left image: shape prior. The middle image: clean image without affine transformation. The right image: clean image with affine transformation.
The first row of Figure 2 is the actual test images, in which the green curve represents the initial contour of the evolution process. Compared with the original clean images shown in Figure 1, these two images have several more interference bars, which are exactly the same as the gray value of the effective target region. They are the interference information that we intentionally superimpose to test the robustness of the algorithms. The second to fifth rows of Figure 2 are the segmentation results by using the LSEWR, RSF, LGDF, and LIF models, respectively. Visually, we can clearly see that the four models LSEWR, RSF, LGDF, and LIF all output incorrect segmentation results. This is mainly because these models are all pure data-driven LSMs; thus, they treat the horizontal bar interference as a foreground target. The sixth row of Figure 2 is the segmentation results of the CREMERS model. On the image without affine transformation, the model does output correct result (as shown in the left image of the sixth row of Figure 2). However, when there is an affine transformation between the image target and shape prior, the model outputs wrong segmentation result (as shown in the right image of the sixth row of Figure 2). This is mainly because Cremers et al. did not take into account the objective factor of affine transformation when constructing their shape prior segmentation model. The seventh row of Figure 2 shows the segmentation result of our method. Under the influence of KDE thought, the input shape prior can change dynamically according to the current target contour. Therefore, even in the case where there is a significant affine transformation between the target and the shape prior, our method can also correctly reconstruct the target contours (as shown in the image at the bottom right of Figure 2), and the horizontal bar interference with the same intensity value as the target is also eliminated synchronously.
Figure 2: A set of comparison segmentation experiments on synthetic images with a single prior. The first row is the input images along with artificial interference and initial contours. The second to seventh rows are the segmentation results by using the LSEWR, RSF, LGDF, LIF, CREMERS, and our SP-LEM models, respectively. The green, red, and blue contours are the initial contour, the final target contour, and the input shape prior, respectively. The relevant task parameters are N=1, μ=0.04, Δt=5.5, σ=3, and ε=3.
4.1.2. Application on Real Infrared Images with Similar Target Region Patterns
When the background of the image contains an uncertain number of region patterns similar to the target, purely data-driven LSMs have difficulty suppressing the background interference. In this case, an LSM with shape priors shows unique advantages. Under the guidance of the prior energy term, the model moves quickly toward the target pattern of interest. In this section, we test the performance of our KDE-embedded shape prior term in the case of multiple shape priors by using two sets of real infrared images.
Figure 3 presents the first set of verification experiments. The background of the infrared tank images shown in the first row of Figure 3 contains an interfering gray pattern; i.e., the gray distribution of the interference region is similar to that of the target region to be segmented. Figure 4 shows the shape prior database of tank targets, which is the basis for the experimental process shown in Figure 3. Here, we use the same set of comparison methods as in Figure 2, and the second to fifth rows of Figure 3 are the segmentation results of the LSEWR, RSF, LGDF, and LIF models, respectively. Since these four models rely only on the data information of the image itself, the evolution curve is easily pulled by the background interference to wrong nontarget locations. The sixth row of Figure 3 shows the segmentation results of the CREMERS model. Since it does not contain a model unit that can effectively process the affine transformation, the shape prior cannot be successfully transformed to the real target contour. Note that, within the multiple shape prior framework, the shape prior selection strategy we designed for the CREMERS model is as follows: we traverse all the images in the database and take the image with the highest Intersection over Union (IoU) value [26] with the ground truth of the target to be segmented as the shape prior for the segmentation process. Thanks to the reconstruction capacity of the KDE approach and the support of the multiple shape prior database (as shown in Figure 4), the proposed method outputs correct segmentation results (as shown in the seventh row of Figure 3).
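The prior selection strategy described above for the CREMERS baseline can be sketched as follows. This is an illustrative reading of the description, not the code used in the experiments: the function names are hypothetical, and the masks are assumed to be flattened binary images of identical size that are already expressed in a common coordinate frame.

```python
def iou(mask_a, mask_b):
    """Intersection over Union of two flattened binary masks."""
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    union = sum(1 for a, b in zip(mask_a, mask_b) if a or b)
    # Two empty masks have no overlap to measure; return 0.0 by convention here.
    return intersection / union if union else 0.0

def select_prior(ground_truth, database):
    """Return the index of the database shape with the highest IoU."""
    scores = [iou(ground_truth, shape) for shape in database]
    # On ties, max() keeps the first index with the best score.
    return max(range(len(database)), key=scores.__getitem__)
```

Because the selection uses the ground truth of the target, it gives the baseline the most favorable single prior available in the database.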
Figure 3: A set of comparison segmentation experiments on real infrared tank images in the case of multiple shape priors. The first row is the input images along with initial contours. The second to seventh rows are the segmentation results by using the LSEWR, RSF, LGDF, LIF, CREMERS, and our SP-LEM models, respectively. The relevant task parameters are N=1, μ=0.03, Δt=6, σ=2.5, and ε=1.5.
Figure 4: Shape prior database of tank targets.
Figure 5 shows the second set of comparison experiments for this section. The first row of Figure 5 shows the real infrared ship images used in this set of experiments. Their backgrounds contain a number of interfering shape patterns; i.e., the shape of the interference region is similar to that of the object to be segmented. Another feature of this set of images is that a single image may contain one or multiple targets of interest. Figure 6 shows the shape prior database of ship targets, which provides the data for the shape prior energy term used in the experiments of Figure 5. In the first row of Figure 5, we directly superimpose the initial contours on three infrared images with an ocean background. The family of comparison methods here is the same as that of Figure 3, and the second to fifth rows of Figure 5 are the segmentation results of the LSEWR, RSF, LGDF, and LIF models, respectively. From these rows, we can clearly see that the evolution of all four comparison methods is deflected by the background interference region, which directly leads to their segmentation errors. The sixth row shows the segmentation results of the CREMERS model. Since it lacks an internal mechanism that can effectively handle the affine transformation, it outputs wrong segmentation results on all three images. As expected, our method accurately segments the real ship targets, and its converged contours are shown in the seventh row of Figure 5.
A set of comparison segmentation experiments on real infrared ship images in the case of multiple shape priors. The first row shows the input images along with the initial contours. The second to seventh rows show the segmentation results of the LSEWR, RSF, LGDF, LIF, CREMERS, and our SP-LSM models, respectively. The relevant task parameters are N=2, μ=0.05, Δt=4.5, σ=3.5, and ε=3.
Shape prior database of ship targets.
4.1.3. Insensitivity to Contour Initialization
When using LSMs to perform image segmentation tasks, we need to set a spatial starting point, i.e., the initial curve, for the evolution process of the LSF. We can either set it manually (with a person in the task loop) or use the rough segmentation result of another algorithm as the initial curve. The degree to which the initial curve affects the final segmentation result is an important indicator of the degree of automation of an algorithm. When the initial curve has no effect on the segmentation results, we can call the algorithm fully automated. Otherwise, we need to carefully design the position and pose of the initial curve because different initial curves may evolve into different segmentation results. In this section, we test the sensitivity of the proposed algorithm to initialization through a set of comparison experiments. The methods involved in the comparison are the LSEWR, RSF, LGDF, and LIF models. In this set of comparison experiments, we intentionally turned off the shape prior term of the proposed method, so that the simplified model is also purely data-driven. To keep the comparison fair, we removed the CREMERS model, which contains shape prior logic, from the family of comparison methods. All the comparison methods other than the proposed algorithm are representative purely data-driven methods in the field of LSM. Figure 7 shows the results of this set of comparison experiments, where the first to sixth rows are the first subgroup (denoted (I)) and the seventh to twelfth rows are the second subgroup (denoted (II)).
The first row of (I) shows the input image and the different initial contours required by the LSMs; the common feature of these initializations is that the intersection of the initial curve and the target region is empty. The second to sixth rows of (I) show the final contours (i.e., the region boundaries of the final segmentation results) of the LSEWR, RSF, LGDF, LIF, and our SP-LSM models, respectively. The data layout of (II) is the same as that of (I). In this subgroup, we use a cluster of circles for initialization; from left to right, the radius of the circle gradually decreases. We can clearly see that, although different initializations are used, our method always outputs the same correct segmentation results (sixth and twelfth rows of Figure 7), while the other four comparison methods show different forms of segmentation errors (the remaining result rows of Figure 7); i.e., their segmentation results depend strongly on the contour initialization. From this group of experiments, we draw the following conclusion: our method is robust to contour initialization and converges stably. This is attributed to the adaptive weight coefficient used by our model, which allows us to set the initial position of the curve arbitrarily. This property greatly enhances the practical applicability of our method.
Demonstration experiment for testing the insensitivity of the comparison models to initializations. Subgroup (I): the first row of (I) is the input image along with the initial contours, while the second to sixth rows of (I) correspond to the final contours of LSEWR, RSF, LGDF, LIF, and our SP-LSM models, respectively. Subgroup (II): the data layout of (II) is the same as that of (I). The relevant task parameters are μ=0.03, Δt=7, σ=1.5, and ε=2.5.
4.1.4. Adaptability to Weak Target Edges
In this section, we test the ability of the proposed algorithm to capture weak target edges. We select several representative methods in the field of LSM for the comparison: the LSEWR, RSF, LGDF, and LIF models. In this set of comparison experiments, we again turn off the shape prior term in order to provide a fair and consistent framework for the family of methods involved in the comparison; i.e., they are all data-driven segmentation methods and do not contain any form of prior information. The CREMERS method is removed from the family of comparison methods for the same reason as in the experiments of Figure 7. We present this set of comparison experiments in Figure 8, where the first column shows the input images (typical images with weak target edges) with the initial contours superimposed, and the second to sixth columns show the segmentation results of the LSEWR, RSF, LGDF, LIF, and our SP-LSM models, respectively. Our method uses a probability-weighted edge stop function, which embeds regional statistical information into the edge stop function. Therefore, it has a strong ability to extract weak edges (as shown in the rightmost column of Figure 8). In contrast, the four comparison methods are all dragged to wrong locations by spurious information in the image background; i.e., they all exhibit segmentation errors of varying degree (as shown in the second to fifth columns of Figure 8).
Comparison experiment for weak target edge capture applications. The first column shows the input images along with the initial contours. The second to sixth columns show the segmentation results of the LSEWR, RSF, LGDF, LIF, and our SP-LSM models, respectively. The relevant task parameters are μ=0.045, Δt=5.5, σ=0.8, and ε=2.
4.2. Discussion
4.2.1. Objective Measure of Segmentation Results
We use the Dice similarity coefficient (DSC) [27] to objectively measure the accuracy of the segmentation results:
$$\mathrm{DSC} = \frac{2N(S_{\mathrm{reference}} \cap S_{\mathrm{test}})}{N(S_{\mathrm{reference}}) + N(S_{\mathrm{test}})}, \tag{17}$$
where "∩" denotes the intersection of two regions, S_test and S_reference are the output region of the segmentation algorithm and the ground truth, respectively, and N(·) denotes the number of pixels in the enclosed set. The closer the DSC value is to 1, the better the segmentation result. Here, we take the comparison experiments shown in Figure 8 as an example to illustrate the DSC metric of equation (17). The data in Table 1 quantify, based on equation (17), the differences between the segmentation results and the gold standard (ground truth). These data further confirm, from an objective point of view, that the proposed method has a strong ability to capture weak target edges.
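The DSC of equation (17) can be computed directly from two binary masks. The sketch below is illustrative only; the function name `dice_similarity` and the convention of returning 1 when both masks are empty are assumptions, not part of the paper.

```python
import numpy as np

def dice_similarity(reference, test):
    """DSC = 2 N(S_ref ∩ S_test) / (N(S_ref) + N(S_test)) for binary masks."""
    ref = np.asarray(reference, dtype=bool)
    tst = np.asarray(test, dtype=bool)
    total = ref.sum() + tst.sum()
    if total == 0:
        # Assumption: two empty regions are treated as a perfect match.
        return 1.0
    return float(2.0 * np.logical_and(ref, tst).sum() / total)

# Toy usage: two 8-pixel regions that overlap in 4 pixels.
reference = np.zeros((4, 4), dtype=bool)
reference[0:2, :] = True
result = np.zeros((4, 4), dtype=bool)
result[1:3, :] = True
print(dice_similarity(reference, result))
```

A value near 0 therefore means that the segmented region barely overlaps the ground truth, while a value near 1 means that the two regions almost coincide.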
Segmentation metric (DSC) of each algorithm for the five images in the first column of Figure 8.

| Image number | LSEWR | RSF | LGDF | LIF | SP-LSM |
| --- | --- | --- | --- | --- | --- |
| #1 | 0.0712 | 0.0213 | 0.0072 | 0.0011 | 0.9855 |
| #2 | 0.1188 | 0.0131 | 0.0024 | 0.0257 | 0.9817 |
| #3 | 0.0060 | 0.0205 | 0.0031 | 0.0022 | 0.9806 |
| #4 | 0.0460 | 0.1560 | 0.2302 | 0.0674 | 0.9846 |
| #5 | 0.0039 | 0.0044 | 0.0029 | 0.0017 | 0.9882 |
4.2.2. Principles of Choosing μ and Time Step Δt
When numerically implementing the proposed algorithm, the time step Δt can be much larger than the one used in traditional LSMs. We tried time steps in a relatively wide range, for example, from 0.15 to 120. A natural question arises: for what range of Δt does the LSE process remain free of oscillation? Through the experiments in this paper, we have found that as long as the product of μ and Δt satisfies μΔt<0.25, the evolution of the level set remains stable. A large time step does accelerate the evolution, but when it is too large, serious edge-locating errors may occur. Therefore, we need to choose a reasonable compromise between a larger time step and the locating accuracy of the contour. For most test images, we use Δt≤12.5.
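As a quick illustration of the empirical condition μΔt<0.25, the following sketch checks the (μ, Δt) pairs quoted in the figure captions of this section. The helper names `is_stable` and `max_stable_time_step` are hypothetical and are introduced only for this example.

```python
def is_stable(mu, dt, bound=0.25):
    """Empirical stability condition from the text: mu * dt < 0.25."""
    return mu * dt < bound

def max_stable_time_step(mu, bound=0.25):
    """Supremum of the time steps that satisfy mu * dt < bound for a given mu."""
    if mu <= 0:
        raise ValueError("mu must be positive")
    return bound / mu

# (mu, dt) pairs as listed in the captions of Figures 3, 5, 7, 8, 10, and 11.
caption_settings = {
    "Figure 3": (0.03, 6.0),
    "Figure 5": (0.05, 4.5),
    "Figure 7": (0.03, 7.0),
    "Figure 8": (0.045, 5.5),
    "Figure 10": (0.03, 8.0),
    "Figure 11": (0.015, 6.5),
}
for name, (mu, dt) in caption_settings.items():
    print(name, round(mu * dt, 4), is_stable(mu, dt))
```

All six parameter sets satisfy the condition; for μ=0.03, for instance, the time step must stay below 0.25/0.03 ≈ 8.33.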
4.2.3. An Easy Way to Initialize an LSF ϕ
Under the proposed regularization mode, not only is the reinitialization operation required by traditional LSMs completely removed, but the LSF also no longer needs to be initialized as a signed distance function. Our simple initialization method is as follows:
$$\phi_{\mathrm{initial}}(x,y) = \begin{cases} -c, & (x,y) \in \Omega_{\mathrm{initial}} - \partial\Omega_{\mathrm{initial}}, \\ 0, & (x,y) \in \partial\Omega_{\mathrm{initial}}, \\ c, & (x,y) \in \Omega - \Omega_{\mathrm{initial}}, \end{cases} \tag{18}$$
where ∂Ω_initial is the set of points on the boundary of the region Ω_initial enclosed by the initial contour (manually set or automatically generated by other segmentation algorithms) and c is a positive constant; we recommend setting the value of c according to the rule c>2ε, where ε is the parameter from equation (13).
Obviously, the proposed initialization is very different from a signed distance function. During the evolution, although the LSF may not remain an approximate signed distance function over the whole image plane, the penalizing diffusion scheme given by the first term of equation (2) guarantees that it remains an approximate signed distance function near the zero level set. We have found that this condition is sufficient to ensure the stability of the evolution process.
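The piecewise-constant initialization of equation (18) can be sketched on a pixel grid as follows. This is only one possible discretization: the function name `init_lsf`, the default value of `c`, and the definition of the discrete boundary (region pixels with at least one 4-neighbour outside the region) are assumptions made for this example.

```python
import numpy as np

def init_lsf(region_mask, c=4.0):
    """Binary-step LSF: -c inside, 0 on the boundary, +c outside the region.

    region_mask is True inside the closed initial region. The default c is an
    illustrative choice that satisfies c > 2 * eps for eps = 1.5.
    """
    mask = np.asarray(region_mask, dtype=bool)
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    # A pixel is interior if it and its four direct neighbours are in the region.
    interior = (
        padded[1:-1, 1:-1]
        & padded[:-2, 1:-1] & padded[2:, 1:-1]
        & padded[1:-1, :-2] & padded[1:-1, 2:]
    )
    boundary = mask & ~interior
    phi = np.full(mask.shape, c, dtype=float)
    phi[interior] = -c
    phi[boundary] = 0.0
    return phi

# Toy usage: a 5 x 5 square region inside a 7 x 7 image.
mask = np.zeros((7, 7), dtype=bool)
mask[1:6, 1:6] = True
phi0 = init_lsf(mask, c=4.0)
print(phi0)
```

No distance transform is needed, which is what makes this initialization cheaper than building a signed distance function.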
4.2.4. Effect of the Parameter ε on Segmentation Performance
The parameter ε in equation (14) has a direct impact on the capture range and accuracy of the proposed segmentation model. Specifically, ε controls the profile of δ_ε(ϕ): a larger ε gives a broader profile, which expands the capture range but decreases the accuracy of the final contour. The curves of the function δ_ε(·) against the independent variable x in Figure 9 reflect this conclusion visually; i.e., the span of the curve (which determines the capture range of the segmentation model) increases with ε, while the sharpness and peak value of the curve (which determine the location accuracy of the segmentation model) decrease as ε increases.
The curves of the function δ_ε(·) under different values of ε.
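The trade-off between span and peak can be reproduced numerically. The definition of δ_ε is given earlier in the paper and is not repeated in this section, so the sketch below assumes the raised-cosine form commonly used in the LSEWR literature, δ_ε(x) = (1/(2ε))(1 + cos(πx/ε)) for |x| ≤ ε and 0 otherwise; the function name `dirac_eps` is hypothetical.

```python
import numpy as np

def dirac_eps(x, eps):
    """Assumed smoothed Dirac function: raised cosine supported on [-eps, eps]."""
    x = np.asarray(x, dtype=float)
    profile = (1.0 + np.cos(np.pi * x / eps)) / (2.0 * eps)
    return np.where(np.abs(x) <= eps, profile, 0.0)

def unit_area(eps, samples=2001):
    """Trapezoidal estimate of the integral of dirac_eps over its support."""
    xs = np.linspace(-eps, eps, samples)
    ys = dirac_eps(xs, eps)
    return float(np.sum((ys[1:] + ys[:-1]) * np.diff(xs)) / 2.0)

for eps in (0.5, 1.0, 3.0):
    peak = float(dirac_eps(0.0, eps))
    print(f"eps={eps}: support width={2 * eps}, peak={peak:.4f}, area={unit_area(eps):.4f}")
```

Under this assumed form, the support width is 2ε and the peak is 1/ε, so enlarging ε widens the band of pixels that the evolution can update while flattening the response around the zero level set.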
Figure 10 shows the verification experiment on the influence of parameter ε on segmentation performance. The test image contains a small amount of Gaussian noise and several deeply concave regions (as shown in Figure 10(a)). Here, we segment it with a set of gradually increasing values of ε, namely {0.05, 0.5, 1, 3, 5, 10, 15}. When the value of ε is 5 or greater (as shown in Figures 10(f)–10(h)), the segmentation behavior is indeed extended over a large image range, and the deeply concave regions are successfully captured. However, the spurious information in the background strongly interferes with the segmentation process, and the degree of interference increases with ε. Conversely, when ε is very small (as shown in Figure 10(b)), the curve is completely unable to move, and the evolution cannot advance into the deeply concave regions. That is to say, under such a parameter configuration, the evolving curve cannot expand effectively, which corresponds to the extreme of location accuracy. Note that Figure 9 does not show the curve for the extremely small ε of Figure 10(b). The reason is that, for such a small ε, the peak of the curve becomes extremely steep and the other curves appear very low by comparison, which would make the plot hard to read, so we omitted it.
Verification experiment of the influence of parameter ε on segmentation performance. The relevant task parameters are μ=0.03, Δt=8, and σ=2.5. (a) Input image, (b) ε = 0.05, (c) ε = 0.5, (d) ε = 1, (e) ε = 3, (f) ε = 5, (g) ε = 10, and (h) ε = 15.
According to our statistical analysis, when parameter ε is taken in the range [0.5, 3], a good compromise between capture range and location accuracy can be achieved.
4.2.5. The Relationship between Evolution Speed and Image Resolution
In Section 4.1, we did not quantitatively analyze the relationship between image resolution (measured by the total number of pixels in the image) and evolution speed (measured by CPU time). In this section, we analyze it quantitatively through a group of experiments. The image data required for this set of experiments are generated by successively downsampling the input image by interpolation. Multiresolution analysis tools such as wavelet decomposition can also be used to generate images with gradually reduced resolutions. Figure 11 presents a set of validation experiments used to study the relationship between evolution speed and image resolution, and the base image it uses is shown in Figure 12; the target to be segmented is an infrared aircraft with an approximately homogeneous gray distribution in the middle of the image. Figure 11(a) shows the segmentation result at the original resolution, and Figures 11(b)–11(e) show the segmentation results when the resolution of each input image is half that of the previous one. As the initial curve, we adopt a rectangular contour covering almost the whole image plane: its distance from both the upper and lower image boundaries is five percent of the image height, and its distance from both the left and right image boundaries is five percent of the image width. Table 2 lists the CPU time for each image resolution. For the trend in Table 2, we give the following intuitive explanation: when the resolution of the image is high, there are more pixels to update during the evolution of the LSF, so the CPU time naturally increases accordingly.
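The experimental setup above (repeated halving of the resolution and a rectangular initial contour with five percent margins) can be sketched as follows. The helper names `pyramid_shapes` and `initial_rectangle` are hypothetical, and rounding each halved side length up is an assumption, chosen because it reproduces the image sizes listed in Table 2; the interpolation itself is not shown.

```python
import math

def pyramid_shapes(rows, cols, levels):
    """Image sizes obtained by halving each side `levels - 1` times (rounding up)."""
    shapes = [(rows, cols)]
    for _ in range(levels - 1):
        rows = math.ceil(rows / 2)
        cols = math.ceil(cols / 2)
        shapes.append((rows, cols))
    return shapes

def initial_rectangle(height, width, margin_percent=5):
    """Return (top, bottom, left, right) of a rectangle inset by margin_percent."""
    dy = round(height * margin_percent / 100)
    dx = round(width * margin_percent / 100)
    return dy, height - dy, dx, width - dx

shapes = pyramid_shapes(646, 800, levels=5)
print(shapes)
print(initial_rectangle(100, 200))
```

With five levels, the total number of pixels drops by roughly a factor of four per level, which is the independent variable used in the timing analysis.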
A set of validation experiments used to study the relationship between evolution speed and image resolution. The relevant task parameters are μ=0.015, Δt=6.5, σ=1, and ε=2.5.
The base image required for the verification experiment of Figure 11.
The CPU time corresponding to each segmentation result in Figure 11.
| Image resolution (pixels) | CPU time (seconds) |
| --- | --- |
| 646 × 800 (Figure 11(a)) | 63.711 |
| 323 × 400 (Figure 11(b)) | 24.504 |
| 162 × 200 (Figure 11(c)) | 14.702 |
| 81 × 100 (Figure 11(d)) | 7.056 |
| 41 × 50 (Figure 11(e)) | 4.411 |
In order to obtain a quantitative functional relationship between CPU time and image resolution, we perform polynomial fitting on the discrete data shown in Table 2. The fitted function curve is shown in Figure 13. From Figure 13, we find that the discrete data can be well fitted by a first-order polynomial (a linear function). The mathematical expression of the fitted curve is as follows:
$$f(x) = 0.00010986x + 7.7473, \tag{19}$$
where x is the total number of pixels in the image and f(x) is the CPU time in seconds.
The fitted function curve generated by taking the data in Table 2 as input.
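The first-order fit can be reproduced from the Table 2 data with an ordinary least-squares line. The sketch below uses NumPy's `polyfit`; the variable names are illustrative, and the fitting tool actually used by the authors is not stated in the paper.

```python
import numpy as np

# Total number of pixels and CPU time (seconds) for the five rows of Table 2.
pixels = np.array([646 * 800, 323 * 400, 162 * 200, 81 * 100, 41 * 50], dtype=float)
cpu_seconds = np.array([63.711, 24.504, 14.702, 7.056, 4.411])

# Degree-1 least-squares fit; polyfit returns the highest-order coefficient first.
slope, intercept = np.polyfit(pixels, cpu_seconds, deg=1)
predicted = slope * pixels + intercept

print(f"slope = {slope:.5e} seconds per pixel, intercept = {intercept:.4f} seconds")
print("residuals:", np.round(cpu_seconds - predicted, 3))
```

On these five points the slope is approximately 1.0986e-4 seconds per pixel and the intercept approximately 7.75 seconds. The residuals show how closely each measurement follows the line.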
5. Conclusions
In this paper, the original LSEWR model is optimized in two respects. Firstly, the two concepts of probability weighting and adaptive coefficient are introduced. The purpose of probability weighting is to embed regional statistical information into the edge stop function to enhance the model's ability to capture weak edges, while the purpose of the adaptive coefficient is to let the LSE determine its evolution direction and intensity adaptively according to the current pixel coordinate and local region information, and thus overcome the initialization problem of the original LSEWR model, i.e., the high sensitivity of the segmentation results to contour initialization. Secondly, shape priors are embedded into the optimized LSEWR model. With the KDE-driven shape prior term, the target contour can still be reconstructed correctly even if there is an obvious affine transformation between the shape prior and the target to be segmented. The experimental results on numerous synthetic and real images show that our method improves image segmentation performance in four respects: application to synthetic images with affine transformation and artificial interference, application to real infrared images containing patterns similar to the target region, insensitivity to contour initialization, and adaptability to weak target edges. In addition, we analyzed several key factors that have a significant impact on the performance of the proposed algorithm.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest regarding this work.
Acknowledgments
This work was supported by the Fundamental Research Funds for the Central Universities of China under Grant no. ZYGX2018J079. The authors gratefully acknowledge the financial support from China Scholarship Council (CSC) under Grant no. 201706075068.
References
[1] S. Osher and J. A. Sethian, "Fronts propagating with curvature-dependent speed: algorithms based on Hamilton-Jacobi formulations."
[2] V. Caselles, R. Kimmel, and G. Sapiro, "Geodesic active contours."
[3] C. Li, C. Xu, C. Gui, and M. D. Fox, "Level set evolution without re-initialization: a new variational formulation," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA, 2005.
[4] X. Gao, B. Wang, D. Tao, and X. Li, "A relay level set method for automatic image segmentation."
[5] S. Zhu and R. Gao, "A novel generalized gradient vector flow snake model using minimal surface and component-normalized method for medical image segmentation."
[6] Y. Wu, Y. Wang, and Y. Jia, "Adaptive diffusion flow active contours for image segmentation."
[7] Y. Wu, Y. Wang, and Y. Jia, "Segmentation of the left ventricle in cardiac cine MRI using a shape-constrained snake model."
[8] S. Zhou, B. Li, Y. Wang, C. Wang, T. Wen, and N. Li, "The line- and block-like structures extraction via ingenious snake."
[9] T. F. Chan and L. A. Vese, "Active contours without edges."
[10] S. Niu, Q. Chen, L. de Sisternes, Z. Ji, Z. Zhou, and D. L. Rubin, "Robust noise region-based active contour model via local similarity factor for image segmentation."
[11] L. Liu, D. Cheng, F. Tian, D. Shi, and R. Wu, "Active contour driven by multi-scale local binary fitting and Kullback-Leibler divergence for image segmentation."
[12] K. Ding, L. Xiao, and G. Weng, "Active contours driven by local pre-fitting energy for fast image segmentation."
[13] K. Ding, L. Xiao, and G. Weng, "Active contours driven by region-scalable fitting and optimized Laplacian of Gaussian energy for image segmentation."
[14] C. Li, C.-Y. Kao, J. C. Gore, and Z. Ding, "Minimization of region-scalable fitting energy for image segmentation."
[15] L. Wang, L. He, A. Mishra, and C. Li, "Active contours driven by local Gaussian distribution fitting energy."
[16] K. Zhang, H. Song, and L. Zhang, "Active contours driven by local image fitting energy."
[17] D. Wang, "A fast hybrid level set model for image segmentation using lattice Boltzmann method and sparse field constraint."
[18] D. Wang, "Hybrid fitting energy-based fast level set model for image segmentation solving by algebraic multigrid and sparse field method."
[19] L. Fang, T. Qiu, H. Zhao, and F. Lv, "A hybrid active contour model based on global and local information for medical image segmentation."
[20] D. Wang, T. Zhang, and L. Yan, "Fast hybrid fitting energy-based active contour model for target detection."
[21] L. Wang, C. Li, Q. Sun, D. Xia, and C.-Y. Kao, "Active contours driven by local and global intensity fitting energy with application to brain MR image segmentation."
[22] D. Cremers, S. J. Osher, and S. Soatto, "Kernel density estimation and intrinsic alignment for shape priors in level set segmentation."
[23] C.-H. Teh and R. T. Chin, "On image analysis by the methods of moments."
[24] J. A. Sethian.
[25] D. Cremers, N. Sochen, and C. Schnörr, "Towards recognition-based variational segmentation using shape priors and dynamic labeling."
[26] Z. Xie, Y. Huang, and L. Jin, "Weakly supervised precise segmentation for historical document images."
[27] U. Vovk, F. Pernus, and B. Likar, "A review of methods for correction of intensity inhomogeneity in MRI."