Automatic Extraction of the Centerline of Corpus Callosum from Segmented Mid-Sagittal MR Images

The centerline, as a simple and compact representation of object shape, has been used to analyze variations of the human callosal shape. However, automatic extraction of the callosal centerline remains a sophisticated problem. In this paper, we propose a method of automatic extraction of the callosal centerline from segmented mid-sagittal magnetic resonance (MR) images. A model-based point matching method is introduced to localize the anterior and posterior endpoints of the centerline. The model of the endpoint is constructed with a statistical descriptor of the shape context. Active contour modeling is adopted to drive the curve with the fixed endpoints to approximate the centerline using the gradient of the distance map of the segmented corpus callosum. Experiments with 80 segmented mid-sagittal MR images were performed. The proposed method is compared with a skeletonization method and an interactive method in terms of recovery error and reproducibility. Results indicate that the proposed method outperforms skeletonization and is comparable with and sometimes better than the interactive method.


Introduction
The corpus callosum (CC) is the main commissural bundle of fibers interconnecting the left and right cerebral hemispheres [1]. It facilitates interhemispheric communication in the human brain. Its special role has motivated imagingbased study of its size and shape to investigate the morphological correlation with various disorders, such as spastic cerebral palsy [2], fetal alcohol syndrome and fetal alcohol spectrum disorders [3], autism [4,5], Turner syndrome [6,7], HIV/AIDS [8], frontonasal dysplasia [9], dyslexia [10], attention-deficit hyperactivity disorder (ADHD) [11], and Alzheimer's disease [12,13]. Most of these studies are based on the measurement of the CC's simplex geometric properties, such as the area [2,6,7,12,13] and circumference [11] of CC region, the angle between the CC and anterior-posterior commissure line [9]. However, these studies can only reveal the growth or atrophy of the entire CC but not exactly where the change occurs.
Recently, some researchers have focused on centerlinebased analysis [3][4][5]10], which is more powerful and comprehensive: centerline-based analysis can detect the exact position where the variation of thickness and angular change of the CC along the centerline occurs, which is more sensitive and discriminative in comparison with size-or area-based analysis. Nevertheless, the centerline is an implicit representation of the shape, and it is impossible to delineate the callosal centerline from magnetic resonance (MR) images manually. Many researchers have utilized skeletonization techniques to extract the skeleton as the main body of the centerline. The skeleton is a thin version of a shape, which is an important feature for shape description in image processing and computer vision. It is defined as the locus of centers of maximal inscribed disks in two dimensions (2D) [14]. In the technical literature, the concepts of skeleton and centerline are used interchangeably by some researchers, while others regard them as related, but not the same. In the view of anatomists, the centerline of the anatomy is not consistent with the skeleton, because the topology of the skeleton is uncertain, whereas the topology of the anatomical centerline is known. In general, the centerline starts and ends at boundary points. Taking the CC as an example, its centerline should be a curve that starts at the anterior pole of the rostrum and ends at the posterior pole of the splenium. Therefore, centerline extraction cannot completely depend on the techniques of skeletonization. However, the idea can be applied to extraction of the callosal centerline. Most previous centerline-based studies have adopted skeletonization techniques to extract the main part of the centerline and apply curve fitting after labeling the endpoints to obtain the centerline.
To date, diversified approaches have been proposed to extract the skeleton from an image. These approaches can be mainly classified into three categories: distance transformation based [15][16][17][18][19], Voronoi diagram based [20], and thinning based [21]. The distance transform computes the minimum distance of each pixel to the shape boundary. However, the distance transform is very sensitive to small perturbations of the boundary, as each value of the shape is assigned according to a single boundary point (the nearest point). The skeletons obtained by the distance transform require a pruning stage if the boundary is noisy [22]. To overcome the limitations of the distance transform, several smooth medial functions have been introduced based on Newton's law [23], electrostatic field [24], and Poisson's equation [25]. These methods consider several boundary points and therefore better reflect the global properties of the shape than does the distance transform. Thinning based methods involve a morphological operation that is used to remove object boundary pixels from binary images iteratively with a set of conditions, somewhat like erosion or gradual opening. Complex conditions are required to terminate this process and to preserve the topology and connectivity of the skeleton. In the Voronoi diagram-based approaches, the skeleton is extracted from a Voronoi diagram derived from the object boundaries. Existing skeletonization techniques suffer from at least one of the following shortcomings: dependence on the accuracy of determining the medial axis, computational complexity, lack of robustness, connectivity, spurious branches, or sensitivity to boundary noise. Therefore, the centerline cannot be precisely obtained through skeletonization alone.
Localization of the anterior and posterior endpoints is another issue. The anterior and posterior endpoints are at the anterior and posterior poles of the CC, respectively, as shown in Figure 1. In general, the skeleton extraction methods cannot localize these two points, because they are not part of the skeleton according to the definition of a skeleton. The endpoint at the anterior pole of the rostrum (see Figure 1) is usually associated with a local maximum curvature of the callosal boundary [4,5,7]. Owing to the existence of noise on the boundary, it is not easy to locate it uniquely using stateof-the-art corner detection methods. As for the posterior endpoint, the problem is even more complicated, because the geometric features around it are not obvious, and there is no sharp tip in the splenium as exists in the rostrum. In addition, anatomical variability makes it more complicated to locate the posterior endpoint. Thompson    lowest points of the genu and splenium as the endpoints. In [11], the endpoints of the CC were determined by extending the centerline to the boundary. Owing to the inconsistent criteria for locating the endpoints, the results of these studies may also be inconsistent. To the best of our knowledge, there is no effective method that localizes these two endpoints. Centerline-based shape analysis has been widely used in CC. Most research has adopted the method of skeletonization to extract the centerline. However, there are few works on the evaluation and validation of these centerline extraction techniques, which poses a rather serious challenge when interpreting their results. In this paper, we propose a method for the callosal centerline extraction from segmented midsagittal MR images. The main contributions of this paper are as follows. First, a method of model-based point detection is proposed to localize the callosal endpoints. A model for each endpoint is generated using statistical shape context as the descriptor under a local coordinate system, in which point detection is robust to boundary noise and is rotation invariant (to a certain extent). Then, active contour model (ACM) based curve evolution with two fixed endpoints is applied to approximate the centerline, which guarantees the topology of the obtained centerline and tolerates the influence of boundary noise. Experiments with 80 segmented mid-sagittal MR images were performed to evaluate the effectiveness of the proposed method.

Automatic Localization of the Endpoints.
To automatically localize the two endpoints, a statistical model-based point detection method is proposed, which consists of two steps: model construction and point localization.

Model Construction.
The statistical models of the endpoints are generated using shape context [26], which is a robust, compact, and highly discriminative descriptor widely used in shape matching. The shape context of a point of interest is a measure of the distribution of other points in the shape relative to it under the log-polar coordinate system. Given a point , the shape context of is defined as a coarse histogram of the relative polar coordinates of the other points, written as Computational and Mathematical Methods in Medicine where denotes the other points of the shape and ℎ( , ) is a normalized × bins histogram in log-polar space at . Each bin indicates the proportion of the points in this region with respect to the total adjacent points of . Figure 2 illustrates the process of computing the shape context.
It is easy to make the shape context scale invariant, but we cannot guarantee rotation invariance by referring to the image coordinate system owing to different scanning directions and the existence of individual variability. Some methods obtain the shape context with respect to the tangent direction at the point, which may lose orientation information of the point and cause the shape context to be less sensitive when distinguishing similar boundary points. Alternatively, two local Cartesian coordinate systems with respect to the rostrum and splenium are defined for the computation of the two endpoints' shape contexts. Given a segmented CC, its bounding rectangle and major and minor axes are extracted using the method proposed by Chaudhuri and Samal [27]. Then, the CC is automatically divided into five subregions according to a modification of the Witelson partitioning scheme [28,29]. Four radial dividers emanate from the midpoint of the inferior side of the bounding rectangle with equal angular interval and divide the CC into five subregions, i.e., the rostrum and genu (denoted as CC1), the rostral body (denoted as CC2), the mid-body (denoted as CC3), the isthmus (denoted as CC4), and the splenium (denoted as CC5) (see Figure 3). The log-polar coordinate system is defined on CC1 (and CC5) with its origin at the mass center of CC1 (and CC5) and radial axis parallel to the major axis of CC1 (and CC5) from anterior to posterior (see Figure 2(b)). The radial coordinate is divided by the height of the bounding rectangle of the CC for normalization, which guarantees scale invariance of the shape context.
Results may be biased if the endpoint with its shape context is derived from only one individual's CC. Therefore, we create a statistical model using the mean shape context as the descriptor. Suppose there are samples (i.e., segmented CC images) in the training sets. Two raters are asked to label the endpoints by mutual agreement. Then, the shape context of each sample (denoted as ℎ ) is calculated according to (1) and all shapes are aligned with respect to a local log-polar coordinate system. The mean shape context (denoted as ℎ) is then written as Here, a statistical model is created and used to detect the endpoint by matching the model with the shape contexts of the candidate points.

Point Localization.
Given a segmented CC, denotes a candidate point on the boundary. The process of locating the endpoints is to find a boundary point whose shape context is most similar to the model. As the shape context is represented as a histogram, the similarity is measured as the sum of the difference of two histograms according to * = arg min∑∑ ℎ ( , ) − ℎ ( , ) Here, ℎ ( , ) is the shape context of and ℎ( , ) is the statistical model obtained from (2).

Active Contour Based Centerline Extraction.
The invisible CC centerline is approximated using the ACM proposed by Kass et al. [30]. The advantage of ACM is that the topology of the curve can be preserved during its evolution, which means that spurious branches can be avoided. To allow the curve to approximate the invisible centerline of the CC, the representation of the centerline should be introduced in advance.

Representation of the Centerline.
The centerline is depicted implicitly using a distance map [31], which labels each pixel with the distance to the nearest boundary pixel. If a pixel in the CC is labeled with a maximum distance, it means that this pixel is far from the boundary and in the center of the CC. Therefore, this pixel may be on the centerline. Let denote the segmented CC region, and denote the boundary of . We refer to ( , ) as the Euclidean distance between two pixels and . The distance map in the CC is defined as

Evolution of the Curve.
To fit the centerline using ACM, a curve x( ) = [ ( ), ( )], ∈ [0, 1] moves within the spatial domain of the distance map by minimizing the following energy function: where and are weighting parameters that control the curve's tension and rigidity, respectively, and x ( ) and x"( ) denote the first and second derivatives of x( ) with respect to . The external energy is a function derived from the distance map of the CC and is responsible for driving the curve to the maximum distance region where the centerline is located. The formulation of is where is the normalized distance map. The gradient flow ∇ (x( )) moves the curve toward to the centerline. The coefficient 2/(1 − | (x( ))| ) is used to modulate the force of the gradient flow of the distance map. When x( ) is near the boundary (or center) of the CC, the coefficient is close to 1.0 (or 0.0), and the external energy is increased (or decreased). This guarantees that the curve approximates the centerline more stably. The model is initialized with a spline curve starting at the anterior endpoint, ending at the posterior endpoint, and passing through four control points in the CC. These four control points are on the four radial lines shown in Figure 3 (the maximum distance points). Then, a spline curve is interpolated with nearly equal distance intervals for initialization of the model. During the evolution of the curve, the endpoints of the curve are fixed at the detected endpoints.

Experiments and Results
The proposed method was implemented using the C++ language. The experiments were performed on an HP workstation with Intel Xeon CPU (E5540@ dual-core, 2.53 GHz) and 8 GB RAM. In the experiments, the weighted coefficients and in (5) were empirically set to 0.1 and 0.5, respectively.
To compare our method with existing methods, a skeletonization method and an interactive method were implemented. The skeletonization method (pfSkel-1.2.1.1) proposed by Chuang et al. [32] is publicly available (http:// coewww.rutgers.edu/www2/vizlab/NicuCornea/Skeletanization/skeletanization.html).pfSkel mainly consists of four steps. First, a 2D vector field in the segmented CC is calculated with respect to the boundary pixels. Second, the critical points of the vector field are detected as the core skeleton. Third, the first level skeleton is generated from the divergence of the vector field. Last and fourth, the second level skeleton is derived by connecting the boundary pixel with a certain percentage of curvature value to the core and first level skeleton. The interactive method is based on the pfSkel method and consists of three steps. First, the skeleton is extracted from the segmented CC. Then, the anterior and posterior endpoints are labeled manually. Finally, the centerline is obtained with a cubic spline connecting the endpoints and fitting the skeleton. For clarity, pfSkel is denoted as SKEL1 (which only generates the first level skeleton) and SKEL2 (which generates the first level plus second level skeletons). The interactive method and our method are denoted as CLM and CLA, respectively.

Data and Preprocessing.
The data sets for evaluating the presented method contain high-resolution T1-weighted MR brain volumes of 80 subjects, including 50 healthy controls and 30 patients with various pathologies (infarctions); subject ages range from 12 to 60 years. The volume size varies from 192×256×256 to 256×181×256 voxels. The voxel size ranges from 0.897 mm to 1 mm in the sagittal plane, from 0.879 mm to 1.25 mm in the coronal plane, and from 0.67 mm to 1.5 mm in the axial directions.
The CC in the mid-sagittal plane was segmented with our self-developed software applying the following steps: (1) resampling each volume to make it isotropic; (2) extracting the mid-sagittal MR image using the method proposed by Hu and Nowinski [33]; (3) Binarizing the mid-sagittal MR image with upper and lower thresholds determined using Gaussian mixture modeling [34]; (4) extracting the bounding rectangle of each region and calculating geometric parameters (such as length and width) using the method proposed by Chaudhrui and Samal [27]; (5) selecting the CC region according to its anatomic characteristics: (a) length (from the anterior point to posterior point) of 7 to 9 cm, (b) width (from the superior point to the inferior point) of 2 to 4 cm, (c) orientation (angle of the major axis with respect to the horizontal axis) from 5 ∘ to 40 ∘ , and (d) area > 2 cm 2 ; and (6) manually rectifying any mis-segmentation or oversegmentation by two raters in mutual agreement. After the segmentation, the centerline endpoints were manually identified on the boundary of the CC by two experts according to their anatomical knowledge after mutual agreement.

Accuracy of Endpoint Localization.
To validate the accuracy of the presented endpoint localization method, statistical models were generated with 15 samples in the datasets using the method described in Section 2.1.1. Then, the statistical models were used to localize the endpoints in the other 65 samples in the datasets. The endpoint localization error is measured as the distance between the detected point and the manually labeled point. The endpoint localization error was 0.85±0.12 mm in this case.

Qualitative (Visual) Evaluation.
We present the results of SKEL1, SKEL2, CLM, and CLA to illustrate the difference in the centerline extraction in Figure 4. The top row illustrates the mid-sagittal MR images. Owing to the existence of intersubject variability, the shape of the CC varies significantly among the eight subjects. The results of SKEL1 are shown in the second row. The skeletons are not continuous and do not start and end at the anterior and posterior poles of the CC (see Figures 4(c) and 4(g)). The third row illustrates the results of SKEL2, in which the percentage was set to 0.001 experimentally. It generates fewer branches and more skeletons near the centerline. Even though the parameter has been adjusted to reduce the number of branches, there are still some spurious branches present (see Figures 4(b), 4(d), 4(e), 4(g), and 4(h)). The fourth row demonstrates the centerline extracted by an experienced and well-trained rater with the interactive method. The bottom row exhibits the results of the proposed method. The extracted centerlines are continuous curves connecting the anterior pole to the posterior pole and are centered in the region of the CC.

Quantitative Evaluation.
The centerline is a geometric feature of a shape and is essentially invisible to the naked eye. Given a segmented CC, no radiologist or anatomist can manually delineate a centerline as the ground truth. Hence, it is difficult to validate the accuracy of the proposed method straightforwardly. In this paper, we adopted a technique used in assessing skeletonization results as proposed by Direkoglu et al. [35].
Suppose a point set represents the extracted centerline. According to the definition of the centerline, a point on the centerline must be the center of a maximum disk inscribed in the CC's shape. Let max ( ) denote the radius of the maximal disk ( , ( )) centered at the point . The reconstruction of the CC region is given by where ( ) is the reconstructed CC region. The quality of the extracted centerline is evaluated using a reconstruction error rate (RER) between the reconstructed and the segmented regions of the CC, which is calculated as follows: where ( * ) is a function used to calculate the area measured in pixels. and represent the images that contain the segmented and reconstructed regions of the CC, respectively.
We reconstructed the regions of the CC using the centerlines obtained by SKEL1, SKEL2, CLM, and CLA, respectively. RER is calculated using (8). Figure 5 shows the RERs of SKEL1, SKEL2, CLM, and CLA. The mean and standard deviation of the RERs of CLA, SKEL1, SKEL2, and CLM are 0.12±0.01, 0.24±0.08, 0.15±0.04, and 0.14±0.02, respectively. The presented method outperforms the skeletonization methods (SKEL1 and SKEL2) and is comparable with or even better than CLM in terms of the RER.

Reproducibility Evaluation.
To compare the presented method with CLM in terms of reproducibility, we randomly selected 10 samples in the dataset. For each sample, we extracted the centerline 10 times with the presented method and CLM, respectively. Five knowledgeable raters were asked to extract the centerline of each case with CLM once per day to guarantee that the raters were not influenced by previous results. Then, we obtained two groups of centerlines, i.e., one group with our method and another one with CLM. In each group, the distance between any two centerlines was calculated, and the mean distance was denoted as the reproducibility error.
Owing to the discretization of the centerline, the distance between two centerlines is measured as the distance between two point sets representing the centerlines. Given two centerline point sets = { 1 , 2 , . . . , } and = { 1 , 2 , . . . , }, There was no reproducibility error with our method and an average reproducibility error of 0.047±0.003 mm with CLM. Table 1 shows the results of CLM in detail.

Discussion
The corpus callosum plays an important role in the communication between the left and right cerebrums. Due to its essential role, its dysfunction may cause various neuropsychological or neuropathological diseases, while the progression of these diseases may also cause it to physically change shape and/or thickness. MRI-based morphology analysis of the CC has become an effective technique to investigate the variation of the CC in relation to these diseases in vivo. Most studies are based on area measurement [2,13]. These methods can only detect the whole body change of the CC, which cannot describe the shape change in detail, and their findings are not sensitive or discriminative to specific diseases. To reduce this limitation, some researchers proposed measuring the area of the CC's subregions, which are obtained according to its geometric features [1,12]. There are several rules to divide the CC [29,[36][37][38]. More recent studies based on these schemes have generated controversial results concerning the assumed topography of the callosal fiber tracts. Recently, several groups have focused on centerline-based analysis of the CC [3][4][5]10], which is a promising way to investigate the variation of the CC in relation to specific diseases. The centerline, as a compact representation of the CC's shape, can be used to measure the thickness of the CC and the curvature at any centerline point; these descriptors provide comprehensive information regarding the CC's shape. In addition, the correspondence information among samples can be achieved easily, which facilitates population-based analysis, also known as centerline-based morphological analysis.
The presented method consists of two steps: automatic localization of the endpoints and ACM-based centerline extraction. There are three advantages to the presented method. First, the endpoint localization method is robust to boundary noise in comparison with methods based on curvature because the statistical shape context as a descriptor of local shape features can avoid the disturbances caused by noise. Second, the endpoints localization method is scale invariant due to normalization of the shape context. In addition, the endpoints localization method is rotation invariant to some extent. This is owing to the adoption of the local coordinate system, which makes our method robust to the rotation derived from not only the scanning direction but also individual variability in the CCs of different people. Despite the CC inclining forward and backward with respect to the horizontal line in Figures 4(b) and 4(f), our method can localize the endpoints accurately. The shape context is calculated with 15 bins in (1) (see Figure 2). In theory, the presented method can accommodate an angle between the model and the sample if less than 24 ∘ (= 360 ∘ /15). Figures  4(d) and 4(e) show the case for different angles between the rostrum and the body owing to the existence of intersubject variability. Finally, our method preserves topology by utilizing ACM to fit the centerline: a smooth centerline with no branches is obtained. In contrast, it is difficult to control the topology of the centerline using skeletonization. There is also no gap in the extracted centerline using our method, while gaps may exist in the skeletonization centerline (see Figures 4(c) and 4(g)).
The presented method has a lower RER in comparison with SKEL1, SKEL2, and CLM. This contributes to an automatic, accurate, and robust method for locating the endpoints and reproducing the centerline using ACM-based curve evolution. However, the recovery error cannot reach zero due to the irregular shape of any CC. In terms of reproducibility, 8 Computational and Mathematical Methods in Medicine the presented method has a higher accuracy in comparison with CLM because the endpoints localization method is more consistent in contrast to manual labeling by raters' subjective analysis.
The accuracy of the endpoint localization will affect that of the centerline extraction. However, the influence is limited owing to the movement of the curve in ACM mainly driven by the gradient flow of the distance map. The endpoint localization error merely interferes in the curve's behavior near the endpoints. The movement of the curve's main body is still under the supervision of the gradient flow. Moreover, the proposed method presents high accuracy (0.85±0.12mm) and robustness (see Figure 4) in the endpoint localization.
The prerequisite of the presented method is that the CC should be segmented from a mid-sagittal MR image in advance. To date, there are several techniques available to extract the mid-sagittal MR image automatically [33,[39][40][41] and various methods to delineate the corpus callosum, such as mathematical morphology-based methods [42], clusterbased methods [43,44], deformable mode-based methods [45], tractography-based methods [46,47], and templatebased methods [48]. Any of these methods can be integrated into the presented method for convenience.

Conclusions
The centerline of the CC can depict the CC's shape variation in more detail when compared to size or area measurements. In this paper, we proposed a method of automatic extraction of the callosal centerline. The anterior and posterior endpoints are localized using statistical model-based point matching, which is robust to boundary noise and is rotation invariant to a certain extent. The centerline is fitted using the active contour model driven by a gradient of the distance map to produce an implicit representation of the centerline. Experiments with segmented MR images were performed to validate this method and the results indicate that our method outperforms skeletonization and is comparable with and sometimes better than the interactive method.
In the future, neurological or neuropathological diseases related to changes in the corpus callosum can be analyzed with centerline-based measurements, such as variation of thickness and curvature.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.