Robust Diffeomorphic Mapping via Geodesically Controlled Active Shapes

This paper presents recent advances in the use of diffeomorphic active shapes which incorporate the conservation laws of large deformation diffeomorphic metric mapping. The equations of evolution satisfying the conservation law are geodesics under the diffeomorphism metric and therefore termed geodesically controlled diffeomorphic active shapes (GDAS). Our principal application in this paper is on robust diffeomorphic mapping methods based on parameterized surface representations of subcortical template structures. Our parametrization of the GDAS evolution is via the initial momentum representation in the tangent space of the template surface. The dimension of this representation is constrained using principal component analysis generated from training samples. In this work, we seek to use template surfaces to generate segmentations of the hippocampus with three data attachment terms: surface matching, landmark matching, and inside-outside modeling from grayscale T1 MR imaging data. This is formulated as an energy minimization problem, where energy describes shape variability and data attachment accuracy, and we derive a variational solution. A gradient descent strategy is employed in the numerical optimization. For the landmark matching case, we demonstrate the robustness of this algorithm as applied to the workflow of a large neuroanatomical study by comparing to an existing diffeomorphic landmark matching algorithm.


Introduction
There have been many approaches to segmentation in medical imaging, including both the active shape methods pioneered by Kass et al. [1] and template based approaches pioneered by Dann et al. [2]. For studying images made up of simple homogeneous structures such as anatomical structures, local active evolution methods [1,[3][4][5] which are encoded through their boundary representations are natural. In such methods, the complexity of the representation is reduced from an encoding based on the dimension of the extrinsic background space containing the object, to the dimension of the boundary.
Given the line of work in template based computational anatomy which has emphasized the important role of diffeomorphisms for defining bijective correspondence between coordinate systems, it is natural to constrain the iterative methods of active shapes so that shape evolution preserves the original topology of the template. This is the intention of the diffeomorphic active contour (DAC) approaches taken by Younes et al. [6][7][8], including in the local evolution equations the diffeomorphism constraint. DAC methods, in a form similar to the original methods of Christensen et al. and Trouvé [9,10], only optimize for the final position of the deformable template and not for the evolution process that leads to it. The approach adopted herein results in an entire trajectory through shape space, allowing basic prior knowledge, that is, proximity to a template, to be incorporated in the estimate of a shape.
The trajectories considered are geodesic flows, which are deduced from the Riemannian structure associated to large deformation diffeomorphic metric mapping (LDDMM) [10][11][12]. Geodesics are characterized by a conservation law [7,[13][14][15][16] on the "momentum" associated to the evolution, where 2 International Journal of Biomedical Imaging we describe in Section 2 what is meant by momentum in this context. This allows a further reduction in complexity from a time varying flow to a single initial condition: an initial momentum vector. In other words, a target shape is represented as the endpoint of a geodesic flow from a template and can be encoded by one such vector. In this setting, knowledge of shape variability is straightforwardly incorporated via prior distributions on initial momentum. We call these connections geodesically controlled diffeomorphic active shapes (GDAS).
In this paper, we examine robust LDDMM via geodesically controlled diffeomorphic active shape models. The GDAS method allows us to introduce prior distributions so as to support the diffeomorphic large deformations of unconstrained LDDMM (taking advantage of the reduction in complexity from a time varying flow to an initial condition), while at the same time constraining the mapping, so it is indexed to neuroanatomical shapes such as the subcortical structures (taking advantage of the reduction in complexity from background space to structure boundaries). We demonstrate that these mappings are robust to small variations associated with the MRI measures of the structures. This is accomplished by constraining the initial momentum of the GDAS solutions to be in the span of a finite-dimensional basis constructed from PCA associated with large-scale surfacebased [17] anatomical studies and by penalizing our initial momentum estimates in basis directions of low variability as recently also derived in Qiu et al. 's work [18].
As with the classical active shape methods (described for example in [19][20][21][22][23][24][25], we pose the GDAS problem in the variational setting with the data term used for matching derived from various representations of a partition of the "scenes" including (i) collection of structures defined via triangulated meshes, (ii) a collection of structures defined through feature points, and (iii) a collection of homogeneous structures defined via inside-outside appearance model [26][27][28][29][30]. In case (iii), the process is iteratively driven by a voxel's likelihood of being interior or exterior of the region of interest (ROI), with the shape controlled by the conservation law geodesic dynamics. In our model, the appearance likelihood at each voxel only depends on whether the voxel is inside or outside the surface. It is estimated from the MRI training samples and modelled as Gaussian mixtures [31,32] learned with the expectation maximization algorithm.
Vailliant et al. [33] first proposed this framework in a discussion of statistics on diffeomorphisms, and more recently Qiu et al. [18] addressed this problem and derived an algorithm for the case of surface-to-surface matching. Here, we consider a general data attachment term and provide a variational solution. We develop and implement a gradient descent algorithm for the case of surface matching, landmark matching, and grayscale image segmentation. A lack of robustness is an important challenge to high dimensional registration and a barrier to its automation and use in high throughput studies. We show that GDAS provides an efficient method for constraining LDDMM to incorporate the finitedimensionality of typical shape variation and emphasize the robust performance of our methods on challenging biological datasets

Geodesic Diffeomorphic Evolution for Active Surfaces
In our region-of-interest (ROI) approaches to subcortical structure analysis in the human brain, our goal is to robustly segment anatomical structures (in particular neuroanatomical structures such as the hippocampus, caudate nucleus, etc.) from the surrounding environment using a given set of data (such as manually placed landmarks within an ROI, coarse segmentations, or MR images). Typically, anatomical structures have their own characteristic shapes and appearance which must be learned from training data to successfully perform segmentation.

Conservation Law Controlled Diffeomorphic Evolution.
The methodology of tangent space representation has been a powerful tool in computational anatomy, since it was proposed in [16]. In this context, evolution of visual structures, like points, curves, surfaces, and images is governed by geodesic equations. By the law of momentum conservation, the initial state of the equations determines the entire trajectory of evolution and can be used as a representation of the trajectory endpoint. We refer to [16,34] for more details and context in shape spaces modeled as homogeneous spaces under diffeomorphic action and describe here a special form of the associated equations that will be adapted to our needs. For a triangulated surface 0 in R 3 with vertices 1 , . . . , , the initial momentum (0) : 0 → R 3 can be represented by a vector at each vertex through (0) = ∑ =1 ( ). One can derive the geodesic equation for the evolution of 0 , which is equivalent to the geodesic equation for point sets [16]. We define a radially symmetric smoothing kernel on R 3 × R 3 . A typical choice, used here, is With ( , ) = (‖ − ‖ 2 ), we denote = (‖ − ‖ 2 ), and = (‖ − ‖ 2 ). The geodesic evolution satisfies where the notation ⋅ refers to the usual dot product between vectors in R 3 . Once the initial position of the vertices, (0) = ( 1 (0), . . . , (0)), and the initial momentum vector, (0) = ( 1 (0), . . . , (0)), are provided, the evolution of the point set is uniquely determined. The endpoints of the evolution (1) = ( 1 (1), 2 (1), . . . , (1)) correspond to the deformed surface (1). It can be shown that (2) has solutions over arbitrary time intervals [34]. Equation (2) induces International Journal of Biomedical Imaging 3 a diffeomorphism, , that interpolates the evolution of the vertices via the equation The conservation law associated to (2) then takes the form ( ) = ( , ) − (0). Hence, we are able to represent a shape by an initial momentum vector (as opposed to a function or distribution) and a template. Furthermore, the variability of shapes can be described from probabilistic properties of initial momentum. This is the basic representation for the tangent space PCA of shapes.

Finite Dimensional Representation via PCA in the Tangent
Space. Clearly, as the discretization of the surface gets fine, the dimension of the control point representation of the momentum goes to infinity. We want to characterize the main modes of statistical variations in our data, of which there is a small number up to some acceptable error. This enables us to write the momentum vector in a finite dimensional span regardless of the discretization.
It is natural to do this empirically based on training samples. PCA depends on the definition of an inner product in the considered space. In our context, the inner product derives from the Riemannian metric that led to the geodesic equation. Details of corresponding PCA have been described previously [33].
We calculate a population template using the approach in [35], but several other algorithms are available [36][37][38]. We denote the inner product between two initial momentum vectors and̃, as ⟨ ,̃⟩ . Assume we have recovered mean initial momentum vector , orthonormal prinicipal components (with ⟨ , ⟩ = ), and an estimate of variance along each component . We note that any initial momentum vector can be projected onto the principal space as By parameterizing our deformations with respect to the , the variability of the surface shape is constrained by the principal space generated from the training set.

Variational Formulation of Volumetric Segmentation.
We formulate segmentation problems within an energy minimizing scheme. The energy includes two terms: one term is to constrain the shape in the principal space, with appropriate weights derived from PCA, and the other term regulates the error of mismatch, which we will define in a general sense here and show specific examples in Sections 3.2 to 3.4. We introduce some notation. Let Ω ⊂ R 3 be the background space. Let denote the template, which will be assumed to be a closed surface. After learning, the selected principal space of initial momentum is spanned by orthonormal vectors 1 , 2 , . . . , corresponding to decreasing eigenvalues 1 , 2 , . . . , .
Our goal is to segment the region-of-interest (ROI) from Ω using a deformable model; that is, we want to deform the template , so that it overlaps with the boundary of the ROI. We assume that is a triangulated surface, with vertex set 0 = ( 0 1 , . . . , 0 ). For an initial momentum , we let denote the solution at time = 1 of (2) initialized with (0) = 0 and (0) = . We let denote the triangulated surface that inherits the topology of with displaced vertices .
We now describe the two terms in our energy that balance the prior knowledge we have of the diffeomorphic deformation with the accuracy of the segmentation. The initial momentum is constrained to the principal space and takes the form (0) = + ∑ =1 . This prior knowledge is regulated by the coefficients of principal components, scaled by eigenvalues, resulting in the first term of our cost function: We define the accuracy of segmentation based on some function 2 of only available data (e.g., an image, a set of landmarks, or a surface) and the configuration of our deformed template . For the time being, we assume that the derivative of this function with respect to each of the is known, and we denote the vector of this derivative by 2 / . Our goal is to find the optimal coefficients 1 , 2 , . . . , that minimize the energy: The variational problem is solved by calculating the derivative of with respect to each coefficient . The derivative of 1 is trivial: The derivative of 2 can be calculated by the chain rule While the term / is unknown, its product with 2 / (which we denote as 2 / ) can be calculated numerically by solving a system of linear ordinary differential equations. This adjoint method has become common in this context and is derived, for example, in [35]. It is described for completeness in Appendix B. The derivative of is given by We now discuss three applications and explain examples of 2 and their gradients. International Journal of Biomedical Imaging

Robust Surface Matching.
In challenging surface mapping applications, it can be necessary to regularize the mappings to avoid undesirable results, and GDAS provides a powerful method for doing so.
In particular, volumetric segmentations of neuroanatomy are often readily available. Converting them to an isosurface for analysis and display is standard, and GDAS provides a method to convert such an isosurface to one reflecting the typicality and variability of a population, rather than features of the volumetric data with an unnatural voxelized structure. Our goal here is to provide a tool to correct for such erroneous segmentations.
This application is essentially equivalent to that presented in [18]. We retain it here as it is the most natural application of GDAS (priors learned from surface matching are used to regularize surface matching), to develop a notation consistent with that for our other applications and to demonstrate robust performance on poorly behaved datasets.

Notation. The deformed template surface
is triangulated with vertices 1 , . . . , . Suppose the template surface has faces denoted as = ( 1 , . . . , ). Each face is represented by an ordered triple of vertices: = ( ,1 , ,2 , ,3 ). We define oriented edges on face by the area-weighted normal to face by and the face center by Similarly, suppose the target surface has vertices 1 , . . . , and faces denoted as = ( 1 , . . . , ), with oriented edges ,1 , ,2 , ,3 , area-weighted normals given by , and face center . Similarly to the case for velocity fields, we define a smoothing kernel ( for "surface") of the form in (1) to be used for comparing two surfaces.

Energy.
Following [17], we embed surface matching in a more general "current matching" problem. This results in an energy to be minimized taking into account closeness between two surfaces, as well as orientation of normals: The constant 2 in (6) is determined heuristically.

Energy Gradient.
We refer the reader to [17] for the derivation of this energy gradient and present the result only here: where the first sum is over faces for which is a vertex, and the symbol is reserved for the index of that vertex on each such face ( ∈ {1, 2, 3}).

Robust Landmark Matching.
A further application of our framework involves ROI analysis methods based on diffeomorphic landmark matching. Given a template surface containing landmarks located on vertices, a trained technician places corresponding landmarks in T1 MR images. Diffeomorphic landmark matching provides a segmentation of the structure of interest in each T1 image by applying the landmark-based transformation to the entire template surface. This procedure is advantageous, because it provides a compromise between the speed of automatic segmentation and the accuracy of hand segmentation.
However, variability in landmark placement and sparsity of landmarks can occasionally lead to unsatisfactory segmentation results and to a time-consuming quality control stage where such segmentations are fixed manually. We propose to regularize the problem, taking into account landmark placement variability based on voxel size, as well as shape variability learned from PCA.
International Journal of Biomedical Imaging 5 3.3.1. Notation. For simplicity, we assume that landmarks on the template are chosen among the vertices of its representation as a triangulated surface. We denote by ( ) = ( ) ( ) the th landmark, which is placed on a template vertex ( ) and flowed according to (2) up to time . Therefore, (1) = ( ) . We define the corresponding landmark placed on the target image as .

Energy.
We assume that the only available information on the targets is landmarks at positions homologous to those on the template. The accuracy of segmentation is then based on the squared distance between deformed template and target landmarks, leading to the term The weighting parameter is chosen on the order of voxel size 2 = ((Δ /2) 2 +(Δ /2) 2 +(Δ /2) 2 )/3 (equal to 0.4475 for our application) where Δ , Δ , and Δ are the target image's voxel dimension.

Energy Gradient.
The variation of 2 with respect to the th deformed template vertex is 0 if this vertex does not correspond to a landmark, and is the th landmark.
3.4. Robust Image Segmentation. We seek to automatically segment subcortical structures from MR images. For simplicity, we assume that such structures are relatively homogenous throughout and therefore chose an appearance model for voxel intensities that depend on location only through whether they are inside the structure or not. To perform image segmentation, we seek to partition the space into high integrated voxel-likelihood under such an inside-outside model. The approach can be generalized to more complex appearance models (involving higher order image features, e.g.) in a straightforward way.

Appearance Model.
Gaussian mixtures are widely employed to model voxel intensity of medical imaging. That is, a p.d.f of intensity at a certain location in the tissue is in the form of where , , and 2 denote the weight, mean and variance of th (out of ) Gaussian component, respectively.
In our work, we assume that the intensities at all points of the interior region (resp., exterior region) of the surfaces share the same mixed Gaussian distribution and the p.d.f's are denoted as int and ext , respectively.
Given the number of mixture components, the maximum likelihood estimator for the parameters can be computed using the EM algorithm [39]. Our estimation of int and ext (using mixtures of Gaussians) is performed on the basis of training images with manual segmentation, in which the collection of all intensity values of voxels inside (resp., outside) the ROI are used for int (resp., ext ).

Energy.
We define the accuracy of segmentation using integrals of likelihood of being misclassified, and we define the mismatch: where we denote the interior and exterior of a closed surface by int( ) and ext( ), respectively. The constant 2 in (6) is determined heuristically.

Energy Gradient. The energy gradient is derived in Theorem
and , the midpoint of the th edge of face , we have 3.5. Numerical Implementation. We use gradient descent to optimize the PCA coefficients and iteratively update with − ( / ) until convergence. The discretization of the surface integrals in (19) is simply performed by replacing ( )( , − ) × , by its value at = , the center of face , with This yields the approximation that has been used in our implementation. We summarize these steps in Algorithm 1. The complete procedure, including training, is summarized in Algorithm 2.

Experimental Methods
To demonstrate the proposed algorithm, for each of the three data attachment terms, we use data being processed as part of many of our region-of-interest (ROI) biological studies in schizophrenia, depression, Alzheimer's disease, ADHD, and autism [40][41][42][43][44][45] (for landmark matching and image segmentation) and Alzheimer's Disease Neuroimaging Initiative (ADNI) study (for surface matching and PCA In order to find an initial momentum minimizing (6) (1) Initialize with (0) being vertices of template and = .
(3) Compute the gradient 2 / , for example using (14), (16), or (21). , and solve (2) initialized at , to obtain the updated surface . Iterate steps 2 to 6 until numerical convergence or reaching the maximum iteration limit. For the case of landmark matching, we have integrated it into the workflow of several large neuroanatomical studies (e.g., [46]). We therefore include this application as a case study, quantifying performance in detail and demonstrating improvement we expect to gain. Our hypothesis when beginning this work was that the GDAS algorithm would exhibit increased robustness compared to more standard methods, without significantly sacrificing accuracy.
Data used in the preparation of this paper were obtained from the ADNI database (http://adni.loni.ucla.edu/). The ADNI was launched in 2003 by the National Institute on Aging (NIA), the National Institute of Biomedical Imaging and Bioengineering (NIBIB), the Food and Drug Administration (FDA), private pharmaceutical companies, and nonprofit organizations, as a $60-million, 5-year public-private partnership. The primary goal of ADNI has been to test whether serial magnetic resonance imaging (MRI), positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment (MCI) and early Alzheimer's disease (AD). Determination of sensitive and specific markers of very early AD progression is intended to aid researchers and clinicians to develop new treatments and monitor their effectiveness, as well as lessen the time and cost of clinical trials.
The Principal Investigator of this initiative is Michael W. Weiner, MD, VA Medical Center and University of California San Francisco, CA, USA. ADNI is the result of efforts of many coinvestigators from a broad range of academic institutions and private corporations, and subjects have been recruited from over 50 sites across the US and Canada. The initial goal of ADNI was to recruit 800 adults, ages 55 to 90, to participate in the research, approximately 200 cognitively normal older individuals to be followed for 3 years, 400 people with MCI to be followed for 3 years, and 200 people with early AD to be followed for 2 years. For up-to-date information, see http://www.adni-info.org/.

Principal Component Analysis.
Given a template and a set of target surfaces (650 for the ADNI study), we perform principal component analysis on initial momentum data as described in Section 2.2. The same ADNI template and PCA model will be used in each of our applications, even for data taken from other datasets. We plot the variance as a function of number of dimensions and identify the number of dimensions required to account for 95% of the variance.

Surface Matching Study.
As part of the ADNI study, volumetric parcellations (performed using Freesurfer, described, e.g., in [47]) of whole brains are available at a series of time points. The = 0 data has been studied and a template (see Figure 1(a)) as well as a population of initial momenta data has been calculated [48]. To study their changing shapes over time, we wish to convert such binary International Journal of Biomedical Imaging 7 (A) Training.
(2) Map to the surfaces and obtain the initial momenta , = 1, . . . , . Compute the. mean momentum and perform PCA in the tangent space as described in Section 2.2. (3) Estimate the constant 2 , and other terms required for 2 (e.g., estimate the density function int and ext for interior and exterior of ROI from training images as described in Section 3.4.1). (B) Segmentation.
(1) Initialize the template surface , and register the target data to it (e.g., through a similitude matching, rigid motion × scale). (2) Run Algorithm 1 to optimize the initial momentum . segmentations to surfaces. However, the voxelized nature of the segmentations makes simple isosurfaces unacceptable (as shown in Figure 1(b)). We therefore employ the technique of matching our template to such an isosurface, using the constraints of a smooth deformation regularized by PCA to avoid the unnatural appearance of the isosurface. We show example results from our GDAS surface matching algorithm, and compare them to typical results from the traditional surface matching LDDMM algorithm.

Landmark Matching Study.
Our segmentation pipeline for our ROI methods is described in [40][41][42][43][44]46]. The relevant portion (the landmark matching phase) for one such study is summarized here. Thirty-eight landmarks are placed along the left and right hippocampi in 441 0.93 × 0.93 × 2.0 mm T1 images of the brain. The first was placed at the tip of the head of the hippocampus (the center of the most anterior slice containing the hippocampus in a T1 image), and the second was placed at the tip of the tail of the hippocampus (the center of the most posterior slice containing the hippocampus). The distance between these two was then divided into 9 slices from anterior to posterior, and on each slice 4 landmarks were placed at the superior, inferior, medial, and lateral margins of the hippocampus. This manual procedure takes approximately 10 minutes for a trained technician to complete, as compared to over 2 hours for a full hand segmentation of images of this size.
In the existing segmentation and analysis pipeline, a template surface was chosen as the left hippocampus for a single subject, and a manual segmentation and resulting isosurface were generated for this case. After a similitude alignment (including reflecting right hippocampi to match left) landmark LDDMM [49] was used to map this template to each target, defining a segmentation surface and binary image for each patient.
However, this procedure was found to suffer from lack of robustness, and roughly 30 out of 441 cases were unacceptable. A laborious phase of quality control was necessary involving identifying problematic or distorted segmentations, manually editing their binary images, and regenerating isosurfaces.
To measure whether our prior model provides enough robustness to avoid such issues, we chose 5 challenging cases of left hippocampi (as identified during quality control inspection), where manual intervention was required The target isosurface is shown at the right (red). and 11 typical cases, and we examined the performance improvement using the proposed algorithm rather than that outlined above. These cases were manually segmented by a trained technician to provide a gold standard, and associated isosurfaces were generated for further evaluation. Furthermore, for 3 cases requiring intervention and 2 typical cases (those illustrated in Figure 4), a second manual segmentation was obtained to give a sense of interrater variability.
Note that the segmentations that are shown here do not constitute the final output of the ROI pipeline described in [40], in which they would be further processed. That is, the results of standard landmark mapping shown here are not reflective of the final segmentations. However, we expect improvement at this stage to contribute to overall improvement.
The template with its associated landmarks is shown in Figure 1(a). In Figure 1(c), an example isosurface generated from a manual segmentation is shown together with its associated landmarks. Note that landmarks were placed on template surface vertices, but the target landmarks were placed independently (by a different technician) from the gold standard manual segmentation. Figure 1 shows the uncertainty of landmark placement, particularly in the region of the hippocampus' head and demonstrates the need to include landmark placement variability in the segmentation algorithm. In these figures and throughout this paper, the color cyan will be used for the template, red for the target, blue for our new results, and green for results using existing algorithms.

Image Segmentation Study.
To demonstrate the capabilities of the GDAS image segmentation algorithm, 5 examples for the same dataset as the landmark matching study are shown. We anticipate that good initial alignment will be important for high quality segmentations, and so the same landmark based similitude registration as above will be used to initialize the target data in this study.
We use 4 outside and 3 inside components for our Gaussian mixture model. The mixture model is trained based on gold standard segmentations from the remaining cases in a "leave-one-out" fashion. A histogram equalization intensity transformation is applied to each T1 image to match the first training sample, based on data from a neighborhood (±5 voxels) around the landmarks, before estimating mixture model International Journal of Biomedical Imaging coefficients. A similar histogram equalization is applied to the target image (to match the first training sample) before beginning the segmentation process.

Analysis Methods.
In analyzing results, we seek to demonstrate two main ideas. First, the accuracy of GDAS is comparable to existing methods, and second its robustness is improved. Accuracy is demonstrated using three techniques. First, we use visual inspection. Second, scores [50] are used to compare our results to gold standard segmentations. This score is defined by where agree is the fraction of voxels in which the given segmentation agrees with the manual segmentation, and random is the fraction you would expect by random chance (based only on the volumes of foreground and background). We calculate scores using a Monte-Carlo method, generating uniformly distributed points and checking if they are inside one or both segmentations. For applications involving subcortical structures, a value of = 0.8 is generally considered quite good. Third, surface-to-surface distance cumulative distribution functions (c.d.f.s) are used to quantify average proximity of surfaces. At each vertex on each surface, we calculate the distance to the nearest vertex on the other surface and analyze the distribution of these distances. We examine entire c.d.f.s and also examine thresholds: distance at which 50% of vertices are closer than and distance at which 80% of vertices are closer than. Robustness is also demonstrated using visual inspection, with care taken to highlight challenging regions. Surfaceto-surface distance histograms are also restricted to such regions, highlighting challenging areas rather than averaging over the entire surface. Additionally, we quantify surface smoothness by measuring integrated sum of squares of principal curvatures over the deformed template surface (this quantity is scale invariant); the assumption being that smoother surfaces more accurately reflect natural anatomy in these applications.
Further, to overcome limitations surrounding the accuracy of manual segmentation and to emphasize the robustness of our algorithm, we evaluate its performance on simulated data. Five hippocampal shapes are generated according to the PCA model in Section 4.1, and landmarks are placed on the vertices shown in Figure 1(a) with additive Gaussian noise of variance 0.01, 0.1, 1, 10, and 100 times that of the weighting parameter discussed in Section 3.3.2. Traditional LDDMM landmark matching as well as GDAS is run on this dataset using the same parameters as for our real data, emulating a scenario where errors in landmark placement are unknown beforehand. We demonstrate robustness by reporting scores as a function of landmark noise for both algorithms. Lastly, we show the accuracy at which PCA coefficients are recovered using Mahalanobis distance, and values corresponding to such distances are shown for real and simulated data.

Principal Component Analysis.
After performing tangent space PCA with the left hippocampus for the ADNI dataset, we found 31 dimensions characterized 95% of the variability for this population (to be contrasted with 1184 × 3 dimensions associated with a momentum vector at each vertex of the discretized surface). This is illustrated in Figure 2.  GDAS surface matching are shown in Figure 3, with target isosurfaces shown on the right-hand side. Qualitatively speaking, the traditional LDDMM result tends to produce squared off hippocampal heads (left side in figure) due to outlier voxels, as well as an overestimation of the medial margin (bottom of figure) due to overfitting an outlier "ribbon" of voxels. The constraints imposed in GDAS surface matching result in a useful and accurate segmentation reflective of the population being analyzed. The "fingerlike" and "ribbonlike" projections reflecting the voxelized structure of the target isosurface, as well as the set of constraints used in Freesurfer that are designed for an unrelated application, do not significantly influence the resulting surface.   (blue). In the right-hand column, the surfaces are shown overlaid on a T1 image, with the gold standard segmentation shown in red.

Examples: Robust Landmark
Qualitatively, the improvement of the GDAS algorithm over traditional landmark matching is evident. Large distortions at the head of the hippocampus are common where landmark placement can be quite variable. Along the length of the hippocampus, deformations with scale characteristic of the distance between landmarking planes are easily seen. These issues are still common in those surfaces not requiring quality control. The GDAS algorithm avoids each of these pitfalls, avoiding overfitting landmarks while maintaining shape variability characteristic of the population.
Because poor performance of traditional landmark LDDMM motivated this analysis, we also display (in less detail) results for an additional 8 patients selected randomly from among those identified as performing poorly. These are shown in Figure 5 with traditional LDDMM results shown in the left column and GDAS results shown in the right column. It is evident that the distortions seen in the top three rows of Figure 4 are typical for this dataset, and that the GDAS results avoid these distortions in a similar manner.

Examples: Image
Segmentation. An example of the results of Gaussian mixture modeling is shown as probability density functions in Figure 7. Measured data (after histogram equalization) is shown as a solid curve, and the results of mixture modeling as dashed curve. The Gaussian mixture parameters are quite similar in all cases examined. The "inside" region (narrow curve, blue and red) is a unimodal distribution describing subcortical gray matter. The challenge of this application can be seen from the "outside" region (broad curve, green and magenta), which is a more heterogenous mixture. It describes cerebrospinal fluid and white matter, as well as cortical gray matter and partial-volume voxels whose intensities are quite similar to the "inside".
Five example segmentation results are shown in Figure 6. The performance appears satisfactory, an achievement considering the large overlap between inside and outside histograms seen in Figure 7. The PCA prior can prevent the template surface from deforming to erroneously include cortical gray matter in many cases, even though it is similar or identical in intensity to subcortical gray matter. This simple inside/outside model could likely be improved, for example, by including a heterogenous appearance model, or combining landmarks and intensity information in cost functions. However, this will be the subject of future research. The purpose of this section was to demonstrate the extensibility of the GDAS framework to a varied range of applications.

Evaluation: ROI Method Case Study.
For the landmark matching application, we describe in detail the performance of the GDAS algorithm as compared to our existing method.
The overlap on a large scale is quantified by scores, as shown in Figure 8 for each of the 16 test cases. The GDAS results tend to be similar, but better on average than those for landmark LDDMM. For typical landmark matching, the mean and standard deviation of is 0.7131 ± 0.0457, and for GDAS landmark matching, it is 0.7268 ± 0.0531. The difference is statistically significant ( < 0.05 in Student's paired -test).
For those cases with two raters, we examine the second score, which differed from the first by 2.66% on average, to understand interrater variability. We present scores, averaged over the two raters in Table 1. In each case, the GDAS method performs superiorly for both raters, and this is reflected in the increased average scores from 0.732 to 0.751. Despite this improvement, it is interesting to note that the overlap between the two manual segmentations is comparable to that between the results of the two segmentation methods.
Examining overlap voxel by voxel, as in Figure 8, shows our algorithm making a small improvement in accuracy. However, the relatively larger improvement in robustness International Journal of Biomedical Imaging  can be seen when examining surface shapes globally such as in Figures 4 and 5 and contrasting with expectations from knowledge of neuroanatomy. The region around the head was seen to be particularly challenging to segment in the traditional landmark case, and distortions occurring at the scale of landmark spacing give the impression that certain regions are "left behind. " To quantify accuracy globally, while acknowledging these specifically challenging areas, we use surface-to-surface-distance histograms and associated c.d.f.s. These c.d.f.s are shown for all 16 patients (unsaturated colors, green: standard landmark matching, blue: GDAS) in Figure 9(a). Combining all vertices gives a single c.d.f indicative of the whole population (saturated colors). A CDF closer to the top left reflects a better segmentation. In Figure 9(b), we show the same analysis, but restricted to vertices within 10 mm of the head landmark. This analysis was repeated (not shown) with vertices restricted to those within 2.5 mm of any landmark, and those not within 2.5 mm of any landmark.
For each patient, the 50% and 80% crossings were measured and are plotted in Figure 9(c). In each set of four bars, the left two show 50% crossings, and the right two show 80% crossings. A smaller value indicates a better segmentation, but the 50% crossing indicates a "typical" region, while the 80% crossing indicates an "outlier" region. Our hypothesis was that the GDAS algorithm would show improvement in outlier regions, at the cost of poorer performance in typical regions. However, the data shows better performance from GDAS in all regions examined. This is likely due to traditional LDDMM overfitting landmark placement inaccuracies, while GDAS finds an appropriate balance between landmark matching accuracy and shape variability. Differences show statistical significance ( < 0.05 in a Student's pairedtest) with the exception of vertices close to the head (50%: = 0.4073, 80%: = 0.0895).
To further quantify the more natural shapes produced by GDAS, we examine the curvature of the resulting segmentations. For each patient examined, the integrated sum of squares of principal curvatures is shown in Figure 10. In all but one case, the GDAS algorithm results in surfaces with less curvature. The differences are statistically significant ( < 0.0001 in a Student's paired -test).
14 International Journal of Biomedical Imaging

Evaluation: Simulated Data.
To further quantify the performance and robustness of our landmark matching algorithm, we evaluate it using simulated data such that the gold standard segmentation can be precisely known. Figure 11 shows example results of our landmark matching algorithms as described in Section 4.5, with traditional landmark matching shown on the left side (green) and GDAS landmark matching shown on the right side (blue). From top to bottom, the additive noise in landmark placement increases from 1/10 to 10 times that expected from voxel size in our case study (variance 0.004475, 0.04475, 0.4475, 4.475, and 44.75). At low levels of landmark uncertainty, the two algorithms give very similar results. However, as landmark uncertainty increases, the performance of GDAS exhibits a graceful decline, while that of traditional LDDMM demonstrates a precipitous drop. Note that third row gives a level of landmark uncertainty comparable to that in our case study. Figure 12 shows scores as a function of landmark noise for each of the five simulated cases (desaturated colors), as well as for the average performance (saturated colors). Consistent with our expectations of improved robustness, we see a much smaller variability in scores for GDAS. Furthermore, consistent with our earlier discussion of accuracy, we see poor performance of traditional LDDMM due to overfitting untrustworthy data. Note that the third data point (close to the left-hand side of the figure) corresponds to a level of landmark uncertainty comparable to that in our case study.
For the GDAS results, we can also express accuracy by measuring the error in PCA coefficients recovered by the algorithm. A natural way to do this is through the Mahalanobis distance (treating the inverse of the covariance matrix as a bilinear symmetric operator defining an inner product). Loosely, this distance is the square root of the sum of squares of differences in PCA coefficients; each being first divided by its respective standard deviation. At the five levels of landmark noise examined, the distance between the true coefficients and those recovered by GDAS (summed over the 31 coefficients) is given by 0.8100, 2.0484, 4.0477, 5.5894, and 5.7708 standard deviations. However, the lower order coefficients, which contribute more to overal shape, are recovered with more accuracy than the higher ones. The first coefficient is recovered with an error of 0.0154, 0.0259, 0.1422, 0.4338, and 0.6215 standard deviations, and the first 5 with an error of 0.0813, 0.2791, 0.5213, 1.6052, 2.0663 standard deviations.
This highlights a potential future direction for the GDAS framework. We calculate the Mahalanobis distance from the origin for each of the 650 patients in our ADNI training set and use the empirical distribution to calculate values. A sample of these patients is shown in Figure 13 a distance is descriptive of the naturalness of anatomical shapes, with low values corresponding to unnatural shapes. Using such a tool to identify outliers for targeted quality control is the subject of future research. An extension of this idea is the ability to generate random anatomical shapes and quantify their typicality with values. Some examples are shown in Figure 13(b), with format paralleling what was discussed above. This tool demonstrates the generative nature of the GDAS framework and may prove to be useful for didactic or other purposes.

Conclusion
Volumetric segmentation has played an essential role in computer-based interpretation of medical images. There have been many approaches published to address this challenge [1,[51][52][53][54][55][56][57][58]. Most segmentation methods use a combination of shape constraint and data attachment to achieve their goals. Data attachment can be based on geometric data, gray levels, edge detection [59], or unstructured segmentation like K-Means [60] or Gaussian Mixtures [61].
In this paper, we have demonstrated applications of the GDAS framework with data attachment based on landmarks, surfaces, and likelihood ratios from grayscale values. For the case of landmark matching, we demonstrated how it could be used to remove or hasten a laborious quality control phase of large-scale neuroanatomical studies. We quantified its improvement over an existing method based on accuracy, as well as robustness. As is typical of Bayesian analysis, we originally expected to see a tradeoff between accuracy and robustness. However, our results showed improvements in both, likely due to overfitting to noisy data in the standard method, reducing accuracy.
The GDAS algorithm shows improvement over traditional landmark matching in scores and surface-to-surface distances, as compared to the gold standard segmentation. Qualitatively, improvements are particularly noticeable in the region around the hippocampus' head (where landmark placement is uncertain). The segmentations resulting from GDAS appear natural, reflecting the typicality and variability of the population from which the PCA basis was determined. This naturalness was quantified in terms of reduced curvature as compared to the traditional method and can be understood in terms of Mahalanobis distance values.
We have found in large sample studies that robustness is accommodated by our GDAS methods controlled by the PCA dimensions empirically trained from samples of subcortical anatomy. In a study with over 400 hand placements of landmarks in hippocampus and amygdala, we have found that robust GDAS detects our failed landmark based mappings using -values, supporting the notion that it provides direct method for quality control of large deformation mappings. Exploring this possibility will be the subject of future research.
Our method is based on the geodesically controlled diffeomorphism constraints associated with the momentum conservation law. Encoding structure via prior distributions which are empirically trained has a longstanding tradition in active shape and appearance modeling [4,62], defined on landmark structures as well as on higher dimensional structures as proposed in [3,63,64]. Our principal contribution here has been to encode the diffeomorphism constraint into the standard active shape models. By incorporating the conservation law controls, we not only inherit the power of diffeomorphic transfer of the submanifold surface in the background 3D space, as has been described in [7,8], but also obtain the metric structure property. Along the geodesic path connecting templates and targets; the metric structure of the large space is maintained.
International Journal of Biomedical Imaging These properties have been explicitly modelled in our own methods previously using deformable templates acted upon by diffeomorphisms and embedding them into the associated metric space structures [12,65]. These formulations have tended to explicitly model the transformations on the entire dense background space Ω, working to minimize a cost function accumulated over the entire space. In the setting where the template consists of a collection of homogeneous substructures, we would expect to obtain similar formulations as described herein. In fact, Qiu and Miller [48] have used dense deformations for statistical modeling in such settings via their support on the boundaries of the cortical substructures.

A. Image Segmentation Energy Gradient
Theorem A.1. The gradient of (18) with respect to the th deformed template vertex is given by (19).
In the proof of the theorem, we will use the following lemma (see [34] for a proof). where is the positively oriented tangent to the boundary of , and , respectively, denote the unit normal and the area form on , and ∇ ⋅ ℎ is the divergence of ℎ.
Proof of Theorem 1. First notice that by adding and subtracting ∫ ∈int( ) log [ int ( ( ))] , we can rewrite the energy as A variation → + for = 1, . . . , implies for such a the transformation + V( ) with V ( ) = 1 ,1 + 2 ,2 + 3 ,3 . (A.5) The mapping V which is defined this way is a continuous vector field on and smooth (linear) over each face. Introduce a function ℎ such that ∇ ⋅ ℎ = (such an ℎ always exists). By Stokes theorem, we have 2 = ∫ (ℎ ) + cst. (A.6) We can decompose this integral over all faces and apply (A.1) in Lemma A.2 to each face. Summing the result over the faces and noting that the boundary terms cancel (because the tangents of two neighbor faces are in opposite directions), we find Replacing V by its expression and reordering the sum over the vertices yield

B. Adjoint Method
We need the following definitions, associated to the variation of (2) and its transpose. Applying an infinitesimal variation → + in the initial condition induces infinitesimal variations + and + , and the pair ( , ) obeys the following differential system, that can be obtained from a formal variation of (2): We let ( 0 , , ) denote the solution at = 1 of (B.1) initialized at (0) = 0 and (0) = , solved along with (2) initialized with ( 0 , ). We define the dual variation * ( 0 , , ) as follows. First, solve (2)  We state without proof the following lemma, which is a consequence of the theory of linear differential systems (cf. [34,35] for a proof). which is used in calculating the energy gradient.