Vascular Extraction Using MRA Statistics and Gradient Information

Brain vessel segmentation is a fundamental component of cerebral disease screening systems. However, detecting vessels is still a challenging task owing to their complex appearance and thinning geometry as well as the contrast decrease from the root of the vessel to its thin branches. We present a method for segmentation of the vasculature in Magnetic Resonance Angiography (MRA) images. First, we apply volume projection, 2D segmentation, and back-projection procedures for first stage of background subtraction and vessel reservation. Those labeled as background or vessel voxels are excluded from consideration in later computation. Second, stochastic expectation maximization algorithm (SEM) is used to estimate the probability density function (PDF) of the remaining voxels, which are assumed to be mixture of one Rayleigh and two Gaussian distributions. These voxels can then be classified into background, middle region, or vascular structure. Third, we adapt the K-means method which is based on the gradient of remaining voxels to effectively detect true positives around boundaries of vessels. Experimental results on clinical cerebral data demonstrate that using gradient information as a further step improves the mixture model based segmentation of cerebral vasculature, in particular segmentation of the low contrast vasculature.


Introduction
TOF-MRA was the dominant noncontrast bright-blood method for imaging the human vascular system, using a short echo time and flow compensation to make flowing blood much brighter than stationary tissue.
Changes in cerebral vessel features are precursors of serious diseases such as cardiovascular disease and stroke.The early detection of these changes is extremely important in order to perform early intervention.Among different vascular analysis tasks, cerebral blood vessel extraction plays an extremely important role as it is the first essential step before any measurement can be made [1].
However, vessel segmentation is challenging due to complex background and noisy images.There are two aspects for such difficulties: one is low image quality for low contrast within angiogram and inhomogeneous background.The other is the complex vessel structure for its appearing in varied shape and scale at different locations.Existing segmentation algorithms may fail in low contrast and thin vessels and better results usually require sophisticated approaches.However, it is highly important to be able to completely remove other brain tissues while keeping the vessels for visual inspection.
This paper is organized as follows.We state the the related segmentation algorithms in Section 2. The experimental framework is described in Section 3. In Section 4 we perform experimental analysis.Finally, conclusions are presented in Section 5.

Related Work
As reviewed in [2], various segmentation techniques have been introduced in the literature.A complete review of existing methods for cerebral vessel segmentation can be referenced at [3].For completeness, we briefly summarize state-of-the-art methods in this section.
The first proposed methods are using active contour to perform this segmentation problem.Methods belonging to this category often follow the same framework where each image having an initial boundary evolves to minimize a surface energy depending on the image intensity values and the local smoothness properties of the desired boundary.The level set technique [5,6] provides as a flexible tool in handling morphological variations.Approaches have been proposed to improve the performance of the original active contour models.A direction-dependent level set approach combining direction information of the eigenvectors computed by vesselness filters and gradient to adjust the weights of the internal energy is proposed by Forkert et al. [7].A local region-based active contour model with morphology fitting energy is presented by Sun et al. [8].A line-shaped profiling method using fuzzy morphology maximum and minimum opening for vessel and background is put forward by Babin et al. [9].An active contour model combining the statistical information of four Gaussian distributions and the vessel shape information of vascular vector field is proposed by Tian et al. [4].
However, these methods often require the parameter adjustment testing process when performing on the new images to achieve optimal performance.In other words, the performance of these methods is highly dependent on the appropriateness of the energy function.
The use of probability statistics information [10,11] for blood vessel segmentation is also popular.This approach is based on the assumption that the pixels are assumed as independent variables and the profile of those pixels has the shape of a probability distribution such as Gaussian distribution.Parametric estimation is often applied in the procedure.For each pixel, the probability is greater than others that will be retained.
Different distributions have been proposed to improve the performance of the accuracy, ranging between the use of one Gaussian distribution to fit the vessels, two Gaussian distribution and one Raleigh distribution to fit the brain tissues [12], a linear combination of discrete Gaussians (LCDG) [13], one Gaussian distribution to model the vascular structure, one Gaussian and Rayleigh distribution functions used to model the low intensity region [14], or just a Bayesian framework with the Maximum a posteriori (MAP) probability criterion [15], to provide a better thresholding value for vessel segmentation and reduce false classification at nonvessel structures.
Besides the three main approaches mentioned above, a number of hybrid methods have been proposed.The method proposed by Gao et al. [16] used the result of Gaussian mixture model as the initialization of level set approach.Shang et al. [17] applied Gaussian mixture model for thick vessels and a vector active contour model for others.Liao et al. [18] used a parametric intensity model for thick vessels and most parts of thin vessels, and the remaining gaps are filled by a Markov random field approach.
Despite relative success from those methods, segmentation of long thin structures is still considered as a delicate task.The variation in shapes and diameters leads to less flexibility and poor accuracy in experimental results.Those methods often fail to capture elongated low contrast structures well.In this study, we focus on statistical methods for segmentation of blood vessel images.Traditional statistical models show the following limitations of the existing approaches.
(1) Most of them presume only on intensity of the datasets for probability distribution.Actually, one characteristic is not enough to deal with the complex structure.
(2) Almost all of them classify each pixel into two classes: vascular or nonvascular.However, some pixels of low intensity belong to background and some with high intensity belong to blood vessels.
(3) Almost all the voxels are involved in the computation procedure.Actually, the real background and real blood vessels are obviously not to be involved in the whole computation.
Thus, developing a fast method that attains high accuracy remains challenging.In this present work, we introduce an algorithm of blood vessel segmentation for MRA data.The method we propose is motivated by the observation that, in the vicinity of the boundaries of the vessels in middle intensity regions, the gradient is a more suitable choice for classification.The intensity value of each voxel in our datasets is a 16-bit positive integer without any rescaling, ensuring accurate intensity information.And we model the middle intensity regions by Rayleigh Gaussian mixture model.The stochastic expectation-maximization (SEM) algorithm is employed to estimate from the observed blood vessel the unknown parameters needed for the parameter estimation.Then the -means approach based on gradient of the remaining voxels reserved from previous step is applied to effectively detect true positives around boundaries of vessels.

Methods
This work focuses on determination of blood vessel voxels in middle intensity regions in 3D-MRA volume.Figure 1 illustrates our vessel segmentation framework, which consists of two main components: subtraction of the background and reservation of the real blood vessels by volume projection, 2D segmentation, and back-projection procedures and the SEM-based statistical method with intensity distribution and -means method with gradient information for final segmentation.In the following, we outline details specific to each step of the algorithm.

Background Subtraction and Vessel Reservation.
In a brain 3D-MRA, vascular regions usually occupy a very small proportion inside the entire image volume.This serves imbalance between the intensity distributions of vessels and background.Such imbalance could cause inaccuracy in vessel segmentation methods and affect the segmentation quality for 3D-MRA.As compared with the original 3D-MRA slices, there is a larger proportion of vessels in its corresponding maximum intensity projection (MIP) image.The proposed method exploits this property to increase the accuracy of statistical segmentation method.
MIP can clearly show the overall shapes and paths of the blood vessels and is computationally fast.So we first project the volume onto the 2D plane along the -axis.According to our previous work [19], the volume projected back from the the -axis direction is the largest.The segmentation on MIP images builds on our previous work [20].The back-projection is used for the subtraction of background and reservation of vessels.Those voxels would not be involved in the following computation.

Statistical Classification of Voxel.
Model selection is an important issue in this kind of statistical segmentation techniques.In traditional methods, histogram of the intensity distribution of the brain 3D-MRA image could be classified into three intensity regions.To the best of our knowledge, Gaussian and Rayleigh or uniform distributions have been used by those statistical approaches for extracting blood vessels from the TOF-MRA data.
According to the histogram of the data volume, a mixture modal of one Rayleigh and two Gaussian distributions is appropriate.Thus the total probability density function for the intensity in TOF MRA images is expressed as follows: where Θ denotes the parameters of the mixture model.  ,   ,   are the prior probability of the three parts, and   +   +   = 1. is the Rayleigh distribution: where   ,  = 1, 2 denotes Gaussian distribution functions with mean   and deviation   is given in the following: In traditional methods, when obtaining the parameters for a statistical model of volumetric MRA data, the probability distribution for cerebral vessels and background can be derived, respectively.According to the maximum a posteriori (MAP) classification, a voxel with intensity   will be classified as cerebral vessels if its posterior probability is greater than the background posterior probability The class labels of the background mixture components are denoted by  and  1 , while  2 is the vessels class.However, one global threshold is not suitable for covering the wide distribution of vessel intensity.Figure 2 shows one example of gradient of remaining middle part.We notice that some of the boundaries of vessels could not be extracted just from a global   threshold.That is one limitation of traditional statistical methods, which classify the whole volume into vascular or nonvascular category using only intensity in one step.
Our method is different, we classify the volume into three parts: the background, the middle region, and the vessel area.When the first segmentation stage is finished, we have obtained most of the vessels.But, some pixels especially at the border of vessels could not be chosen as the intensity is too close to around tissues.To eliminate tissues, the remaining region should be processed with another proper characteristic.
Fortunately, gradient is a salient feature that describes the boundary of the segmented object.As can be seen from Figure 2, the gradients in the vicinity of vessels are higher than other regions.Then we applied -means method which is based on the gradient of remaining voxels for further classification.
Following is the parameter estimation procedure.As traditional expectation maximization (EM) algorithm is sensitive to the initialization.In our model, the mixture of distributions which characterizes the statistic of the middle intensity region is estimated by a stochastic expectation maximization (SEM) algorithm.The algorithm takes an iterative approach to estimating the parameters by maximizing the log likelihood of the mixture model.At each iteration, the SEM algorithm performs the following three steps: E step, S step, and M step.Let  be the set of parameters in (1),  be the set of intensity values in the volume, and  be the set of labels for all the distributions in the mixture model.The three steps of the SEM algorithm can be mathematically presented as follows, given the initial estimation  0 at the th iteration.} of all the voxels randomly, according to the distribution obtained in E step.
The algorithm terminates if the changes in the log likelihood and parameters are sufficiently small.

𝐾-Means
Clustering. -means clustering is a well known method commonly used to characterize the data.Let   be the mean of cluster of   .The squared error between   and the points in cluster   is defined as where {x 1 , . . ., x  } is the data to be clustered, m  = ∑ x  ∈  x  /  is the centroid of cluster   ,   is the number of points in   , and  is the previously defined number of clusters.The goal of -means is to minimize the sum of the squared error over all  clusters: In the -means approach an initial clustering is selected which can be either at random or obtained from the hierarchical cluster approach [21].In this work, we use -means in two steps, in the first stage we initialize the clusters using instances chosen at random from the dataset.In the second stage, we initialize the clusters with the results obtained from the first step.
Considering how to choose the group , we focus on the largest cluster centroid value through observing the change of centroid values curve.Firstly, we calculate the centroids of each gradient image.Then we compare the difference between the adjacent centroids when the cluster number  is growing.Because the gradient value is higher around the vessels than other spots, there will be a larger jump between the adjacent centroid values if the cluster number  is appropriate.Figure 3 shows the sketch of -means method for extraction of boundary vessels.Suppose the input data is clustered into six groups.And the the group with greatest value is the boundary vessels.Observing that the gradient of boundary vessels are higher than those of other parts, differences between boundary and interparts are higher than those of differences between other interparts of vessels.Therefore, if the group includes interparts of vessels, the centroid value of the vessel group will have an obvious change.In other words, if we describe the changes with a line connecting all the centroid values, then the centroid point will be chosen where the largest slope emerges.Then we calculate the final centroid value of the vessel where the optimal value of  is already obtained.The steps of the algorithm is given as follows.
Step 1. Cluster the gradient data into 2 groups, 3 groups, and 20 groups and record the sequences of the largest value of cluster centroid.
Step 2. Calculate the difference between adjacent values of the centroid sequences and record the largest value.
Step 3. In Step 2, the difference  = center  − center  , where center  and center  belong to the set of largest centroid value and  =  + 1.When  get the maximize value, center  will be the object value.

Experimental Results
Datasets were collected using a GE Signa HDx scanner.Each dataset consists of 512 × 512 × 136 axial slices with slice thickness 0.7 mm, TR = 21 ms, TE = 2.3 ms.The proposed segmentation approach is tested on 13 datasets.
Example parameters of the first intensity stage of Rayleigh Gaussian distributions are listed in Table 1.The distribution components are shown in Figure 4(a).Figures 4(b) and 4(c) show the results of first segmentation step using intensity and second segmentation step using gradient information.Clearly, the statistical approach could capture the bright parts of the volume when the first stage is finished.The remaining middle part includes both the tissues and the blood vessels.In order to distinguish between the two objects we start the second step of the algorithm on the remaining region.As for the brain, it is important to track vessels with small caliber in order to determine if there are pathological lesions around them.In this case, the gradient information is applied in order to detect the edges of these fine vessels.When comparing the segmented result of the intensity stage (first step) alone to the segmentation results using gradient information (second step), we see that these fine vessels could be detected when the gradient information is employed.This -means approach with gradient of remaining part is indeed helpful in finding the edges of vascular with low contrast.
Other three segmentation examples are shown in Figure 5.And we also compare the results with one active contour model method.Figure 5(a) shows the MIP images.Note that the first stage of statistical method using intensity and active contour model may also fail to detect some fractions of the vascular that are detected by -means approach with the remaining gradient middle voxels.This elongation of segmented structures is achieved better in (d) compared to (c) and (b).Comparisons of these results From Table 2, it is clear that the probabilistic model with respect to SEM algorithm could complete the first stage segmentation within 5 minutes.Here the error amount is set to 0.05.As we all know, the convergence time takes much longer if the error amount is set smaller.Besides, from visual inspection, there is little difference between the segmentation results when the error amount is below 0.05.So in order to keep balance between the computation efficiency and the segmentation results and when the error amount is smaller than 0.05, the Gauss Rayleigh probability mixing function can obtain the convergence.The average time of the first stage segmentation is 4 minutes and 13 seconds.
For the active contour model, the iteration numbers are all set to 10.As we know, active contour models are time consuming methods; thus the computing time is much longer than other methods.The average segmentation time is more than 20 minutes.
For the second stage segmentation of the proposed method, the experiments show that the number of clustering centers of -means algorithm can be obtained between 8 and 12, and the corresponding value can then be obtained.Thus, considering the computing efficiency, we apply the means method running from 5 to 15 clustering centers.Here we take one clustering center as a period; the computing time is about 1 to 2 minutes.Combining the computing time of first segmentation step, the total average computing time is more than 6 minutes.
In the blood vessel segmentation process, the final result is a classification of voxels.Each voxel is categorized either as vessel or as background tissue.To evaluate the outcome of the segmentation, we applied Dice similarity coefficient to quantify the comparison.The ground truth of the data is manual segmentation by the radiologist.Let  and V be two different contours and   and  V be the set of cells inside the contours, that is, the set of cells such that () and V().The Dice similarity between   and  V is DSC (, V) = 2       ∩  V               +      V     .
It has value of 1 when   and  V are equal and 0 when they do not share any cell.Figure 6 shows the similarities for the pairs of datasets obtained in each experiment.In this experiment, the similarity is lower (0.81 ± 0.023) for statistical model with intensity alone.The similarity for the active contour model is 0.84 ± 0.025 and is 0.92 ± 0.016 for -means method with gradient clustering based on statistical model.

Conclusions
This study describes a vessel segmentation algorithm which can achieve highly accurate segmentation especially in regions of low contrast and at vessel boundaries with disturbance induced by adjacent nonvessel pixels.Subtraction of background approach reduces the computing burden for the parameter estimation procedure, while -means approach using gradient information yields better quality result.However, there is still some disadvantages of this approach, especially the computing time of the -means method for choosing the proper gradient value.Besides, we mainly focus on the segmentation of MRA dataset.Future work includes investigating vessel shape for further segmentation and employs other angiographic data.

Figure 2 :
Figure 2: An example of gradient image of the remaining middle intensity part.The red squares indicate challenging vessel pixels on the gradient image.

Figure 3 :
Figure 3: The sketch of -means for extracting boundary vessels.

Figure 4 :
Figure 4: Intensity distribution fitting plots.(a) Rayleigh Gaussian distributions (dot blue lines), the normalized intensity frequency histogram (solid red line) and the fitting result by using the proposed statistical model (solid cyan line).(b) Traditional statistical method.(c) Gradient clustering with -means approach based on traditional statistical method.

Table 1 :
Example parameters of Rayleigh Gaussian distributions.