Slice-to-Volume Nonrigid Registration of Histological Sections to MR Images of the Human Brain

Registration of histological images to three-dimensional imaging modalities is an important step in quantitative analysis of brain structure, in architectonic mapping of the brain, and in investigation of the pathology of a brain disease. Reconstruction of histology volume from serial sections is a well-established procedure, but it does not address registration of individual slices from sparse sections, which is the aim of the slice-to-volume approach. This study presents a flexible framework for intensity-based slice-to-volume nonrigid registration algorithms with a geometric transformation deformation field parametrized by various classes of spline functions: thin-plate splines (TPS), Gaussian elastic body splines (GEBS), or cubic B-splines. Algorithms are applied to cross-modality registration of histological and magnetic resonance images of the human brain. Registration performance is evaluated across a range of optimization algorithms and intensity-based cost functions. For a particular case of histological data, best results are obtained with a TPS three-dimensional (3D) warp, a new unconstrained optimization algorithm (NEWUOA), and a correlation-coefficient-based cost function.


Introduction
Registration of histological sections to magnetic resonance (MR) images is an important step in quantitative analysis and cross-modality comparison of brain histology and MR data. Magnetic resonance imaging (MRI) and histological examination are two complementary modalities that considerably differ in their scope of application and scale of analysis. MRI of the brain became an indispensable method in clinical evaluation of central nervous system (CNS) diseases and in brain research. MRI is extensively used for in vivo characterization of intracranial lesions and tumors, for longitudinal studies of neurodegenerative effects, and so forth. Postmortem MRI of isolated formalin-fixed brain specimens is also becoming a promising method for the investigation of the pathology of a brain disease [1], where it can serve as a high-throughput screening tool detecting regions with abnormal signal contrast. Unfortunately, in the near future, the spatial resolution of MRI is not likely to improve beyond an intermediate, mesoscopic scale of 0.1 mm, contrary to histology that has long been operating on a microscopic, cellular scale. In addition, MRI contrast mechanisms have complex origins that make exact relationships between MRIdetected abnormal regions and pathological characteristics unclear; correlates of MRI findings have to be studied and validated. Histological examination remains a gold standard for precise characterization of neuropathology. However, histology has its own share of shortcomings: it is a low-throughput, inherently two-dimensional (2D), timeconsuming process, which is invasive, irreversible to the specimen, and destructive to gross morphological features. Consequently, there is a great value in bringing these two modalities together, to complement each other and bridge the scales of mesoscopic and microscopic examination. In doing so, it is important to establish the best possible spatial correspondence between the two, which can be achieved by the design of a specialized brain slicer [2] or, more generally, by the registration approach (e.g., see [3]). Registration may become a key technology bridging these two modalities. Specifically, the registration of individual 2 Anatomy Research International histological slices from sparse section sets directly to threedimensional (3D) MR images (slice-to-volume registration) constitutes an important unsolved problem.
Apart from neuropathological investigations, histologyto-MRI registration plays an important role in quantitative analysis of brain structure and, more specifically, in architectonic mapping of the brain. Cyto-or myeloarchitectonic mapping of the brain cortex has long since relied on examination of histological specimens, but the spatial location of delineated areas was traditionally sketched only roughly, as exemplified by renowned Brodmann maps. In modern development, architectonic mapping features observerindependent quantitative analysis of histology and is brought into a standard anatomical spatial reference system by reconstruction and stereotaxic alignment of a histological volume. Schormann and Zilles [4] describe reconstruction of histology volume from serial sections, which at one point involves registration to MRI. It should be noted that the methodology involved in reconstruction of a histology volume, which has become a well-established procedure, relies on coregistration of serial histological sections, and is not directly applicable to registration of individual slices from sparse sections.
Advances in high-resolution MRI make an alternative approach to architectonic parcellation of the cortex feasible: homogeneous areas can be delineated directly from MR images, potentially in vivo. However, as in the case of neuropathological investigation, correlation of MRI findings to the gold standard of histological examination must be studied and validated. Our present work in histology-to-MRI registration is primarily motivated by the search for a ground truth reference to validate neocortical parcellation results obtained by Kruggel et al. [5,6]. Previous studies have demonstrated a striking qualitative similarity of optical profiles in myelin-stained tissue with intensity profiles of high-resolution MRI in a fixated brain [5,6]. Profiles of histological specimens stained for myelin and cell bodies were quantitatively compared to in vivo T1 and postmortem T2 MRI by Eickhoff et al. in [7], focusing on two architectonic regions only: a striate and an extrastriate cortex. It was found that MR intensity profiles were best fitted by a weighted sum of both myeloarchitectonic and cytoarchitectonic profiles, with a significantly larger contribution from the former. There, an approximate correspondence of histological and MRI slices was established manually, and only two image modalities were originating from the same brain specimen. Our study is seeking to establish a reference for quantitative comparison of histological and MRI intensity profiles, where correspondence of section images is optimized by a nonrigid registration procedure.
The registration of histological images is challenging because of three main issues: (1) a "deformation problem": a brain sample is subject to complex deformations in the process of preparation and histological sectioning; in case of sparse histological slices (2) a "search problem": an inherent one-to-many correspondence of the underlying 2D-to-3D mapping; (3) a "statistical consistency" problem: the reliance of a similarity measure on small sampling size of an input 2D image.
Nonlinear and nonuniform global and local distortions pertinent to different stages of histological sectioning procedure are described in [8]. A brain sample undergoes global shrinking during fixation in formalin, and each slice is experiencing local uncorrelated nonlinear distortions when sectioned. In addition, nonelastic deformations such as folds and rips may be introduced when a section is transferred to the object slide for scanning. Schormann et al. [9] list nine sources of deformations induced by histological processing and present a theoretical analysis of the statistics of deformations, emphasizing that local deformations impede exact registration of MRI with histological sections.
Only a limited number of papers have been published on the particular topic of histology-to-MRI registration (see [8] for a review), most of them dealing with reconstruction of histology volume from serial sections. When contiguous data for serial sections are available, the issue of a 2Dto-3D mapping may be bypassed, and the registration can proceed in two stages: first, sequential slices are 2D aligned into a 3D histology volume and second, the reconstructed histology volume is 3D registered with an MRI or a positron emission tomography (PET) image. This way, each stage can benefit from a vast body of research and algorithms developed for 2D and 3D, intra-and intermodal, rigid and nonrigid registration. For example, Ourselin et al. [8] applied a block matching algorithm to 2D align histological sections into a reconstructed volume, which was further registered to MRI data using a 3D version of the same algorithm. If available, optical images of a block face, for example, photos of the cryomacrotome cut view with external landmarks, can be used as intermediate data, facilitating the stage of 2D alignment in histology volume reconstruction. Mega et al. [10] reconstructed a cryovolume from optical data, adapted a 3D elastic warping/surface matching approach to a single 2D slice for warping of each stained section image to a corresponding optical image, and applied a 3D rigid registration to align the cryovolume with PET. Similarly, Schormann and Zilles [4] reconstructed a histology volume by 2D alignment with optical data, 3D aligned it to MRI, and applied full multigrid elastic registration with MRI to refine the histology volume reconstruction. Intermediate optical images may also serve as a reference in case histological sections are not contiguous, for example, see [11].
When histology data are sparse, and no intermediate optical data are available, coregistration with MRI is posed as a 2D-to-3D registration problem. The 2D-to-3D medical image registration has received considerable attention in the context of projective registration in intraoperative imaging modalities, for example, X-ray fluoroscopy with CT images. Projective registration is seeking to find a correspondence between reference 2D image and a projection of a volume to a plane. On the contrary, there are relatively few publications on the topic of slice-to-volume registration, which is seeking to find correspondence between reference 2D image and a cross-section of a volume by a plane or a warped surface. Different approaches to rigid slice-to-volume registration are described in [12][13][14]. In prostate imaging, Fei et al.
Anatomy Research International 3 [12] registered interventional MRI (iMRI) slices to preoperative high-resolution MRI volume, using multiresolution approach with two cost functions and automatic restarting to avoid local minima. Chandler et al. [13] evaluated sliceto-volume registration in connection to the planning of MR scans, registering single thick brain MR slice, and applying information-based criterion on vector-valued probabilistic images of tissue classification. Birkfellner et al. [14] studied registration of single slices from FluoroCT, CineMR, or iMRI to 3D volumes, and used cross-correlation with repeated application of local optimization algorithms. All authors in [12][13][14] note that slice-to-volume registration raises several challenges in cost function selection, convergence behavior, and optimization strategy, which require special considerations. We will return to a more detailed discussion in Section 2.
Slice-to-volume registration of histology to MRI was studied by few research groups. Jacobs et al. [15] registered rodent brain histology to MRI in two steps: first by rigid alignment of a histological slice with an MR image based on a surface matching algorithm, followed by a 2D warping of a planar section of the MRI volume onto a histological image. Here, the 2D warping step used thinplate splines with matching control points automatically selected along both sets of contours. Similarly, Gefen et al. [16] registered sections of a mouse brain to a 3D atlas by first matching them to a corresponding atlas plane with a global affine transformation, and then by warping in 2D. However, modeling the sectioning surface as a plane may not correctly reflect global and local deformations of a brain sample; generally, such a cross-section is better modeled as a warped surface. In a study of the pathology of disease, Kim et al. [3] registered postmortem human brain histology images to premortem MRI reference volume. Warped cross-sections of MRI volume were modeled by polynomial functions of second or third order, recursively derived and matched to postmortem slices. A polynomial model allows to compensate for global distortions between premortem and postmortem images, but does not account for local deformations.
We present a flexible framework for intensity-based sliceto-volume nonrigid registration algorithms with geometric transformation that combines a rigid alignment with a 3D deformation field parametrized by various classes of spline functions: thin-plate splines (TPS) [17], Gaussian elastic body splines (GEBS) [18], or cubic B-splines [19]. We apply algorithms to cross-modality registration of histological and MR images of the human brain and evaluate registration performance across a range of optimization algorithms and intensity-based cost functions.
The remainder of the paper is organized as follows. In Section 2, we present the building blocks of the algorithmic framework: geometric transformations, cost functions with similarity measures, and optimization strategies. In Section 3, we describe our computational experiments and discuss their results, comparing registration performance across spline models, similarity measures, and optimization algorithms. The last section discusses the key features, various algorithmic details, and future developments.

Algorithms
In our framework, an intensity-based registration algorithm consists of three main building blocks: (1) a geometric transformation with a deformation model, (2) a cost function with a similarity measure, and (3) an optimization scheme. In this section, we describe three implementations of (1) with a 3D deformation field parametrized by TPS, GEBS, or B-splines, several implementations of (2) with various similarity measures, and several optimizers and optimization schemes implementing (3).

Problem Formulation and General
Notation. Slice-tovolume nonrigid registration takes as an input a 2D image (called target or floating image) and finds a corresponding warped cross-section slice (called registered image, 2D) in a 3D volume (called source or reference image). Let I d (i, j) and V d (i, j, k) denote a 2D histological image and a 3D MR discrete grey-scale image, respectively. Let I((x, y) ∈ Ω) ∈ R and V ((x, y, z) ∈ Φ) ∈ R denote the real-valued intensity function of a continuous version of a 2D and a 3D image with domain Ω ⊂ R 2 and Φ ⊂ R 3 , respectively. Continuous images are obtained from discrete images I d and V d by a chosen interpolation model, for example by trilinear or spline interpolation. A geometric transformation T is defined as a mapping T : Ω → Ψ, Ψ ⊂ Φ; the mapping defines the domain of a warped slice Ψ ⊂ R 3 . Then a registered image I R is defined as the following intensity function: denote an intensity-based similarity measure between two images I 1 , I 2 , such that closer similarity of images results in higher value. A cost function is defined as: C(I, V , T ) = −S sim (I, I R )+βE(T (Ω)), where E(T (Ω)) is the deformation energy of the geometric transformation, and β is the weight parameter at the energy penalty term. Image registration seeks the optimal transformation T = arg min(C(I, V , T )) that minimizes the cost function.

Geometric Transformation.
The slice-to-volume geometric transformation T is composed of two parts: alignment and deformation, T = T aln • T def . The flat 2D slice is deformed in 3D, and the resulting warped surface is rotated, translated, and scaled within bounds of a 3D volume; the registered image is obtained at the cross-section of the volume by the warped surface.

Alignment.
Alignment T aln is modeled by a 3D linear operator T aln [ x] = A( α, κ) x + τ, where x denotes 3D coordinates of a transformed point, A is a 3 × 3 rotation/scaling matrix, α is a vector with three rotation angles, κ is a vector with three scaling factors, and τ is a translation 3D vector. With scaling factors, T aln is called Procrustes transformation and has nine degrees of freedom (DOF). If scaling is omitted, T aln corresponds to rigid transformation with six DOFs.

Thin-Plate Spline Deformation
Field. Thin-plate spline (TPS) is a solution of an interpolation problem that minimizes a certain smoothness energy function based on local 4 Anatomy Research International curvature [17]. TPS interpolation function is defined as T x, y, θ = a 0 + a 1 x + a 2 y where U(r) = r 2 log r is a radial basis function (RBF), θ = {(x i , y i )} are control points (where RBFs are centered), {c i } are RBF weights, and {a 0 , a 1 , a 2 } are affine coefficients. Given a set of interpolation conditions T( there exists a closed form solution for TPS coefficients [20]. TPS interpolation minimizes the following functional, called TPS bending energy: TPSs are typically used in a 2D nonrigid registration based on point-landmark matching [15,21]: point-landmarks in a target image serve as control points, and the displacement (u, v) of a 2D warp is modeled by two independent TPS functions. Matched landmarks in a source image define displacements at the control points, from which TPS weights can be computed by solving a system of linear equations [17].
Our slice-to-volume registration does not use landmarks. Instead, TPSs are used in a novel way to parametrize a smooth 3D deformation of a 2D surface: control points are placed in a regular grid on the 2D image domain Ω, and a 3D warp is modeled by three independent TPS functions u(x, y, θ), v(x, y, θ), and w(x, y, θ): at the control points serve as DOFs and thus "steer" the deformation; TPS coefficients are computed from these displacements. The smoothness energy of such a deformation field is the sum of three individual TPS bending energies E T = E u + E v + E w , which can easily be computed from TPS weights [20].

Gaussian Elastic Body Spline Deformation Field.
Gaussian elastic body splines (GEBS) [18] model 3D deformations of elastic homogeneous isotropic media under the influence of forces that are applied at control points and are decreasing with distance as Gaussian functions. The deformation is defined as: where G( x) is a 3 × 3 matrix of basis functions, and θ = { p i } are control points in 3D. For given interpolation/approximation conditions, the coefficients c i are determined from a linear system of equations (see [18,22]). As with TPS, we define control points θ on a regular 2D grid. The 3D displacements of control points serve as DOFs and specify an interpolating/approximating GEBS deformation. We characterize the smoothness of GEBS deformation by the same energy functional E T (TPS bending energy), which is commonly used in registration [19,23]. An analytic expression for E T is not readily available; therefore, we compute it numerically, approximating partial derivatives by finite difference schemes.

B-Spline Free-Form Deformation Field.
Free-form deformation (FFD) using 3D tensor product of cubic Bsplines is described in [19] and is generalized in [24]. We define a 3D FFD of a 2D domain as where θ = { c i, j } variables are 3D displacements of control points in a 2D L × M grid that has regular spacing h and β 3 denotes the cubic B-spline basis function [25]. Four main distinctions of a B-spline FFD are noted below: (1) in contrast to TPS or GEBS, where each control point contributes globally, FFD is locally controlled, that is, each control point affects only its local neighborhood of 16 grid tiles; (2) FFD does not require solving a linear system for computing weights from displacements; (3) for B-splines, control points must be placed in a regular grid, and must enclose the domain boundaries ("closed grid"), whereas for TPS or GEBS control points can, in principle, be arbitrarily placed inside the domain, and a grid can have open boundaries ("open grid"); consequently, for the same domain size and grid spacing, FFD will have a larger number of control points and DOFs; (4) unlike TPS and interpolating GEBS, a surface warped by FFD does not necessarily pass through the displaced control points (see properties of β 3 in [25]). We regularize B-spline FFD with the same TPS bending energy functional, which can be efficiently computed from analytical expression, because B-spline derivatives have local support.

Cost Function Similarity Measures.
Our registration framework provides a flexible selection of cost function similarity measure: , H(I 1 , I 2 ) , LC : S LC (I 1 , Anatomy Research International 5 RC : S RC (I 1 , I 2 ) = S CC (Rank(I 1 ), Rank(I 2 )), CR: S CR (I 1 | I 2 ) = var(E(I 1 | I 2 )) var(I 1 ) , ( 5 ) where N is the number of pixels in Ω, H(X) and H(X, Y ) correspond to the entropy and joint entropy, E(X|Y ) is the conditional expectation of X in terms of Y , n( j) is the local neighborhood of jth voxel, and brackets denote averaging. The listed similarity measures are: sum of squared differences (SSD) [26], Pearson's correlation coefficient (CC) [27], mutual information (MI) and normalized MI (NMI) [28], overlap-invariant NMI (NMI 1 ) [29], local correlation (LC) [30], Spearman's rank correlation coefficient (RC) [31], and correlation ratio (CR) [32]. The similarity measures can be ordered with respect to assumed optimal relationship between image intensities, from more restrictive to less restrictive assumption: SSD (additive Gaussian noise), CC (globally linear relationship), LC (linear transfer functions for small patches), RC (some monotonic relationship), CR (some functional dependence), and MI-NMI (statistical dependence). The choice of a similarity measure for a particular registration application depends not only on relationship between imaging modalities, but also on other factors, like robustness with respect to outliers/image defects, computational cost, sensitivity, statistical consistency (adequate sampling size), presence of local maxima, and availability of analytical derivatives for derivative-based optimization.

Optimization.
The optimal transformation is sought by numerical multidimensional nonlinear optimization. For our geometric transformations, the total number of optimization variables is N opt = N aln + 3N cp , where N aln is equal to nine or six for Procrustes or rigid transformation, respectively, and N cp is the number of the control points in a 2D grid that defines a spline deformation model.
The registration framework also allows for an evaluation of different optimization strategies: multiresolution and/or multistage optimization schemes. A multiresolution scheme [36] proceeds through several stages, from coarser to finer resolution, for example, on a Gaussian pyramid of input images. A multistage scheme partitions the optimization into several sequential stages, each stage being focused on a particular aspect of the problem such as initialization, a subset of optimization variables, a particular cost function, or a certain optimizer. For example, a two-stage scheme can combine global optimization of rigid alignment by DE and local optimization of deformation DOFs by PDS or NEWUOA.

Data and Preprocessing.
High-resolution MR images of an isolated left brain hemisphere fixated in 4% paraformaldehyde were acquired postmortem on a Bruker 3T Medspec 100 system equipped with a bird cage quadrature coil, using T 1 -weighted 3D MDEFT protocol (FOV 179.2 × 89.6 × 179.2 mm, matrix 512 × 256 × 256, voxel size 0.35 × 0.35 × 0.7 mm 3 , scanning time 12 h). The whitegrey matter contrast of the MR images is inverted due to fixation in formalin, which makes it similar to the contrast in histological images (see Figure 1). After MRI scanning, the brain sample was cut on a macrotome into ten coronal blocks, each approx. 1.5 cm thick in sagittal direction. Blocks were frozen to −80 • C, and up to five sections were cut from the face of each block using cryomicrotome. The 50 μmthick slices were washed in 30% sucrose in sodium phosphate buffer (PBS), glass-mounted, stained for myelin with Sudan Black B, and scanned on a flat-bed scanner at 2000dpi resolution (12.7 μm/pix). The MR image was preprocessed to correct for intensity inhomogeneities, and converted to the isotropic resolution of 0.35 mm 3 . Histological images were converted to an 8-bit grey-scale intensity range (0 to 255), and resampled to match MRI spatial resolution; the bright background was suppressed by setting intensities above 240 to zero. A continuous model of the discrete MR image was built with the fourth degree B-spline interpolation [25]. Registration experiments were applied to histological slices from eight distinct blocks of the brain sample. Because both histology and MR images originated from a formalin-fixated brain, there was no global shrinkage/swelling to account for, therefore a rigid transformation was used in the alignment part T aln (scaling was disabled). The software was developed in C++ in a Linux environment; running times are reported on a 2.4 GHz Athlon-64 CPU with 2 GB RAM.

Initialization and Choice of Parameters.
This subsection explains how initial conditions and parameter values were chosen. Initial conditions specify the starting vector of optimization variables. Initial values for translation τ and rotation α were first determined approximately by a human observer, and were further refined by a sequence of fast rigid registrations sampling the diagonals of the τ, α DOF space; they were also confirmed by global optimization with DE algorithm. The deformation field was always initialized to zero displacements of the control points, that is, no deformation.
The right choice of deformation energy penalty weight β (Section 2.1) is very important for high-quality nonrigid registration results. If β is too low, the deformation is not penalized, and a higher value of intensity similarity measure can be reached at the expense of unrealistic deformations, producing clearly visible artifacts and distortions in the registered image. If β is too large, then deformation is suppressed, and registration approaches the rigid alignment. A suitable value of β has been established experimentally: the similarity measure and bending energy of registered images were plotted as a function of β, and the value of β was picked from an interval that was preceding a sharp rise in bending energy, and where the similarity measure was flat. The value of β = 0.1 has been chosen for our experiments reported below. Control point grid spacing influences both the spatial scale of control over deformation, and the dimensionality N opt of the optimization problem. A coarser grid does not model local deformations at a smaller scale, but is easier for the optimizer. Finer grids allow fine-tuning of deformation, but may be very difficult for optimization. As an empirical trade-off between an increase in registration time and an improvement of similarity measure, the grid spacing of 15 and 20 pixels has been chosen for histological image area below and above 40,000 square pixels, respectively.

Metaoptimization.
Choosing the right similarity measure for intensity-based registration can be challenging. Although some arguments in favor of a certain similarity measure can be based solely on image modalities (e.g., SSD or NMI is a typical choice in intra-or intermodal registration, resp.), the best cost function has to be determined experimentally. We test different cost functions and evaluate registration results by computing (post registration) the cumulative similarity measure SM as a sum of seven coefficients: CC, LC, RC, CR, NMI, NMI 1 , and Dice's coefficient (DC) on segmented cortical ribbon (intensity thresholds of 185 and 150 were chosen to segment the grey matter ribbon in histological and registered MR slices, resp.). A comparison on any single similarity measure may be biased towards the underlying assumption of functional/statistical relationship. We argue that the cumulative measure represents a "consensus vote" if, on average, all individual similarity measures are consistent, that is, they may disagree on quantity but do not contradict each other qualitatively. Such cumulative similarity measure is aimed at "unbiased" and "smoothed" comparison of cost function performances, when the ground truth evaluation, either based on reliable landmarks or on prior knowledge, is not feasible.
For a particular slice-to-volume nonrigid registration task, it is not clear a priori what is the best: (1) optimization strategy, (2) cost function, (3) deformation model, and more practically, (4) what is the best combination of the above three? These questions are left open to experimental confirmation. Therefore, the goal of "metaoptimization" is to find the best combination of "metadimensions" (a cost function, a deformation model, and an optimization strategy) within our framework. Our framework has open building blocks that may be freely combined and tested. Since it is not practical to exhaustively test all possible combinations of implemented models and algorithms, the combination is optimized in a series of "line searches" along one metadimension at a time. For example, first, performance of optimization algorithms is compared using TPS-and CC-based cost function, next performance of cost functions with various similarity measures is compared using the best optimizer from the first step and keeping TPS, then different deformation models are compared using the best cost function and optimizer, and finally, multiresolution/multistage schemes are evaluated for the best performing combinations.

Comparison of Optimizers.
The first series of experiments addresses the following question: which optimization algorithms are robust and perform well in nonrigid and/or rigid registration? The goal is to identify candidates for further use in multistage/multiresolution optimization schemes and for sampling along the remaining metadimensions (cost functions, deformation models). Rigid registration is the easiest test for an optimizer, while nonrigid registration (full resolution, single-stage) is the hardest test. Rigid and nonrigid registrations were performed on our data using six optimization algorithms: NEWUOA, PDS, Simplex, Genetic, DE, and CG. All cases used CC cost function and TPS (N/A to rigid case), except for CG, which is implemented only for SSD and B-Splines. The results of rigid and nonrigid registrations are compared across optimization algorithms in Table 1.
The registration quality is evaluated by three criteria listed in order of priority: (1) the cumulative similarity SM, (2) the bending energy E T (N/A to rigid case), and (3) the running time. Best results should have maximal SM with constrained bending energy and a reasonably low running time. In the easiest optimization task, rigid registration, two optimizers stand out as winners: NEWUOA (total SM 39.099) and DE (total SM 39.245), but DE runs much longer (total time 125.9 min versus 1.1 min). In the hardest test, single-stage nonrigid registration, some optimizers fail to find a plausible solution: Simplex optimization does not finish in more than two days, while Genetic and DE evolutionary algorithms fail to converge to an optimum (note large bending energy at low SM values). Two algorithms, NEWUOA and PDS, manage to find a solution that is better than rigid registration, and NEWUOA results are clearly the best in terms of all three criteria. Based on these findings, NEWUOA is identified as the top performer for our slice-tovolume nonrigid registration task.
As evident from the table, different optimizers do not find the same solution. The slice-to-volume registration problem is underdetermined; despite a deformation model regularization and a smoothness constraint, there are multiple locally optimal solutions that are close to each other. Also, the results are not uniform across input slices (see columns with block # 1-8): registration of some slices takes considerably longer time. For this reason, the table includes detailed results for each block, instead of just the summary column. The results are so different not only because of differences in the size of images and control point grids, but also because of different image content. A cost function multidimensional landscape is unique in each case of a histological image; some cases are more difficult to optimize because of many local minima due to noise and local matches.
In order to simplify the nonrigid optimization task, it was divided into stages optimizing alignment and deformation DOFs separately. Each stage in a two-stage scheme employs the NEWUOA. Better results are obtained if the second stage is also allowed to optimize the translation τ variables (it should be intuitively expected because the position of the deformed surface in 3D is fine-tuned in a second stage). A two-stage scheme works better (total SM 46.109 in Table 2 versus 45.233 in Table 1) and faster (total time 69.2 versus 139.4 min).
We conclude that the NEWUOA optimizer provides a clear advantage in our problem for both rigid and nonrigid cases; two-stage scheme provides a further improvement. SSD. The results of cost function evaluation are presented in Table 2. Two cost functions, CC and SSD, stand out as top performers, with SSD being slightly better in cumulative similarity but running slower on average. Even though SSD is computationally less expensive than CC, the number of cost function evaluations is typically larger with SSD compared to CC (total sum 48940 versus 38351), which indicates that SSD is a more difficult task for the optimizer. In our data, both imaging modalities have similar trimodal histograms that are closely matched by linear histogram expansion (see Figure 2); this could explain in part why CC performs well. While in theory SSD is not optimal for different imaging modalities, in practice, SSD produces good results; this justifies using analytically simple SSD expression in derivative-based optimizers, as implemented in our CG and LM.

Evaluation of Cost
NMI does not yield the best results, which may seem surprising in the light of NMI's popularity in cross-modality registration. This may be in part due to the "complexity" of its optimization landscape, which exhibits many "noisy" local minima and a narrower optimum (see [14]). Traditionally, these problems are rectified by a multiresolution approach, therefore the NMI cost function is revisited in Section 3.7. Another plausible explanation refers to problems with statistical consistency of NMI (see [41]), given small sample size of the 2D image. NMI becomes less reliable and loses statistical consistency if used with images of small size, because it estimates the joint distribution of intensity values from a 2D histogram, and for small images, the number of samples may be insufficient for a robust estimate. Consequently, the number of bins in the histogram becomes an important parameter. Our implementation of NMI can have 256, 128, or 64 bins in the histogram; coarser binning has shown better performance, which agrees with the statistical consistency issue. CR, LC, and RC cost functions return good results that, however, are not superior to CC nor SSD. In case of LC and RC, the running time is significantly longer, as expected due to their computational cost. Based on these findings, we choose the "speedy" CC cost function for further tests with NEWUOA.
3.6. Comparison of Deformation Models. The next series of experiments addresses the question: which model of the deformation field ensures the optimal performance? The three deformation models in our framework are all built with smooth splines on a regular grid of control points, but they differ in underlying "physical" properties, computational costs, and (for B-splines) optimization dimensionalities. All three deformation models have been tested on our data in the context of CC cost function and NEWUOA two-stage optimization scheme. The results are shown in Table 3, where rigid registration results are also included for baseline reference.
The SM and E T results show a close tie between all three models, with B-Spline FFD quality being slightly better (higher similarity and lower energy) on all eight data blocks. As for the running speed, registration with TPS model is clearly the fastest (see Section 4 for discussion).

Multiresolution Optimization.
The final round of experiments addresses the following questions: can a multiresolution scheme further improve the so far identified best performing combination NEWUOA/TPS/CC? Are there any other "overlooked" combinations whose performances peak in multiresolution setting? Multiresolution schemes are evaluated for four specific combinations: (1) NEWUOA Table 2: Two-stage NEWUOA registration results for six cost functions (correlation coefficient, correlation ratio, local correlation, rank correlation, Maes' normalized mutual information with 64 bins, and sum of squared differences). Each registration result is presented by the cumulative similarity measure SM, the bending energy E T , and the running time T in minutes. Columns 1-8 are for eight slices from eight distinct blocks of the brain sample. The last column is the sum of 1-8. Values referenced in the text are in boldface.  Table 3: Two-stage NEWUOA registration results for three deformation models (thin plate splines, Gaussian elastic body splines, and cubic B-splines). Rigid registration is also given for the reference. Each registration result is presented by the cumulative similarity measure SM, the bending energy E T (nonrigid only), and the running time T in minutes. Columns 1-8 are for eight slices from eight distinct blocks of the brain sample. The last column is the sum of 1-8. Values referenced in the text are in boldface. with TPS and CC, (2) NEWUOA with TPS and NMI, (3) NEWUOA with B-splines and SSD, and (4) LM with B-splines and SSD. Each multiresolution scheme consists of four stages: rigid alignment at coarse resolution using NEWUOA (even in LM case), followed by three stages of nonrigid registration at progressive levels of a Gaussian pyramid (progressive smoothing with the 2D or 3D Gaussian kernel, size = 7 and σ = 2.0), from coarse to full resolution, with rotational DOFs being disabled (tests showed that performance was worse if rotational DOFs were enabled).
The results are presented in Table 4. The scheme with NEWUOA, TPS, and CC returns slightly better similarity value compared to the two-stage case (refer to Table 2), but on average runs twice as slow. The NEWUOA/TPS/NMI scheme, which was tested with the expectation that it would rectify the NMI optimization landscape, does improve the result, yet not quite reaching the level attained by CC cost function, but on average runs almost three times slower. The third scheme NEWUOA/B-spline/SSD, which was included as a reference for the LM-based entry, reaches highest values of SM, but does it very slowly in most cases. The number of cost function evaluations was almost ten times higher than in the first case (total sum 623772 versus 68772). This is explained by larger number of control points and DOFs required for B-spline FFD, as was already mentioned in Section 2.2.4. The last entry in the table, the LM/B-spline/SSD scheme, performs certainly better than the single-stage full resolution LM, which failed to converge in our tests, and better than the two-stage LM (not listed in the tables; total SM 44.649, total running time 67.5 min), but its running time is much longer. While it runs faster than NEWUOA/B-spline/SSD (total time 365.6 versus 1987.1 min), it is slower than the first scheme, NEWUOA/TPS/CC (total time 135.2 min). In terms of cost function and gradient/Hessian evaluations (total sums 3446 and 898, resp.), LM scheme seems to converge in fewer iterations than NEWUOA/TPS/CC (total sum 68772), but evaluations of gradient and Hessian are much more costly.
To summarize the results for our data, the best combination of image similarity (subject to deformation constraint) and registration speed is obtained with two-stage NEWUOA optimization scheme, TPS deformation model, and CC cost function. In some cases, for example, for slices in blocks 1, 6, and 8, a minor improvement can be achieved by a multiresolution scheme, without incurring a significant extra running time.

Registered Images.
Examples of eight histological slices and registered MR images are given in Figures 3 and 4. Images in the middle column correspond to rigid registration with NEWUOA and CC cost function. Images in the right column are from nonrigid registration with two-stage NEWUOA, TPS, and CC cost function. Rigid registration results look rather similar to histological images on the left-clearly a pay-off from refining the initial conditions (see Section 3.2) and choosing a good optimizer (see Table 1), since our earlier results were not quite as good. A visual inspection shows that in all slices, nonrigidly registered images have some small regions that more closely resemble histological sections. While rigid registration, which runs much faster, may seem sufficient at the level of gross morphology or white matter regions, nonrigid registration may become essential at the small detail level, for example when matching of cortical ribbons is desired.
We evaluate where and how nonrigid registration was important with the help of difference images. Although difference images are not strictly applicable to cross-modality registration, a large difference of intensities is very likely to pin-point a registration error, if intensity distributions are reasonably close. The cross-modality difference images in Figures 5 and 6 show the absolute difference between histological and registered MR images, where intensities of the latter are scaled by the factor of 1.22 to match the dynamic range of the former. The same-modality difference images in the right column show the absolute difference between rigidly and nonrigidly registered MR images, without any intensity scaling. Please recall from Figure 2 that intensity peaks of grey matter (GM) and white matter (WM) on histological images stand apart by more than approximately 100 units. Therefore, if a difference image scores a pixel with value more than 100, such a pixel is very likely to correspond to GM/WM mismatch. We computed the percentage of mismatch area on difference images thresholded above 100. Table 5 compares rigid and nonrigid registrations in terms of this measure. For all slices, the mismatch area is smaller in the case of nonrigid registration.

Discussion
The proposed algorithmic framework has the following advantages. (a) It allows registration of individual, sparsely While the framework appears to have a component structure that is pretty general for image registration, it is specialized for the slice-tovolume registration because (1) it offers deformation models particularly suitable for a 2D-to-3D transformation and (2) it has an emphasis on flexible selection of cost functions and optimization strategies, for which slice-to-volume registration problems are known to be particularly demanding [12][13][14]. TPS and GEBS are well-known techniques in point-landmark registration, but here they are used in a novel way for parametrization of a free form (smooth)  deformation in a completely different context of intensitybased registration. While the reconstruction of a histology volume from serial sections is a well-established procedure, it requires expertise and specialized equipment and is costly and time-consuming. Individual, sparsely spaced histological sections may be more readily available from a nonspecialized histology lab; therefore, the registration method enables a wider range of histological-to-MRI evaluation and validation experiments, that otherwise could not afford a full-scale serial sectioning procedure.   for example, when matching gross-morphology or regions of white matter. In most cases, however, it is desirable to recover deformations of complex local and global nature, so the modeling of nonplanar cross-sections is sought. Kim et al. [3] applied a polynomial deformation model, which captures global deformation in 3D. In their study, MRI of the brain was performed before autopsy and fixation. Global deformations occur when the brain sample is extracted, and fixation in formalin causes global shrinkage of the specimen, therefore a global deformation model is well justified. In our framework, we also implemented a second-and a third-order polynomial models similar to those in [3], but they did not yield any improvement over rigid registration, because in our data the postmortem MRI was performed after autopsy and fixation. Our deformation model goes a step further to account for local displacements, but it may also be easily combined with a polynomial model of global deformation, if needed. In general, registration approaches must be tailored to a specific problem at hand: there is no single registration algorithm that suits all needs. Likewise, we cannot declare our registration problem as "solved" and "closed," and future applications are likely to require modifications of the algorithms. In this light, flexible optimization schemes constitute an advantage of our framework, since different strategies and combinations can be tested and compared. This seems to be particularly important for slice-to-volume registration, whose performance shows greater dependence on the input images, compared to typical 3D-to-3D MRI registration.
Our finding that correlation coefficient cost function works better than other similarity measures may seem surprising and contradictory to the mainstream. After all, CC use is traditionally restricted to intramodal registration, and NMI is usually preferred in intermodal registration (although there is no proof that NMI is in any way optimal [42]). Here, we reiterate possible explanations of why CC works better in our case: (1) although registered images originate from different modalities, the correlation coefficient-based cost function may be efficiently used in cases where histograms are similar (see Figure 2); (2) the use of a NMI-based cost function with small input images suffers from statistical inconsistency; (3) CC cost function has a "smoother" optimization landscape, which is less demanding on the optimizer. The SSD cost function works well for similar reasons.
The fact that registration with TPS model is the fastest, may also appear counterintuitive. The computational cost of TPS is usually considered to be high and prohibitive for practical applications with large number of landmarks or control points [19]. However, some aspects of the computational cost may be efficiently handled by trading memory space for computation time: by iteratively reusing the decomposition of the linear system, and by tabulating RBFs at all pixel positions. This way, a speedup of factor 24 is achieved. In our experiments, TPSs are computed on grids of 100-200 control points. The computational cost may be further reduced by using approximating TPS [20], but in that case there is no closed form expression for the bending energy. An efficient computation of the TPS bending energy is one computational advantage of TPS versus GEBS, where the bending energy integral has to be computed numerically. In the case of GEBS, the basis functions can also be precomputed, therefore more complex expressions do not translate to extra running time, but we found that the TPS linear system has better numerical stability properties compared to GEBS. LU-decomposition of the linear system was used for TPS, whereas we chose to use singular value decomposition (SVD) for solving GEBS linear system (see [31]). LU-decomposition is typically faster than SVD; this also explains why registration with TPS model is faster than GEBS. These arguments do not apply to the explanation of why B-spline model works slower, since it does not require solving any linear system, and the bending energy integral can be efficiently computed with locally supported derivatives of B-spline functions. In the case of B-spline deformation model, slower performance is mainly due to the larger dimensionality of the optimization task. It may be argued that, instead of control point displacements, the TPS or GEBS weight coefficients themselves could be used as DOFs/optimization variables: this would eliminate the need to solve a linear system for weight coefficients, and thus reduce the computational cost. Although it is equivalent to a linear transformation of the search space, nonlinear optimization problems are not invariant under such transformations. We verified experimentally with both TPS and GEBS that it would result in a much higher optimization cost and convergence issues.
Multiresolution LM optimization strategy proved to be very successful in cross-modality affine registration of 3D volumes [37], where a cost function naturally does not have any bending energy term. The combination of multiresolution approach with an LM-type optimization and cosine basis function deformation field is also known to work well in SPM toolkit nonrigid registration of smoothed MR images [40]. There, SSD-based cost function does not include bending energy penalty; instead, the deformation is regularized by Maximum a Posteriory (MAP) estimation with an ad hoc model assuming multinormal distribution of parameters. In our case, explicit bending energy term not only makes the programming of gradient and Hessian routines a tedious and time-consuming task, but also adds to the computational cost of each update iteration in LM. We recognize the fact that performance and running times of an optimization algorithm may be heavily dependent on implementation; therefore, proven state-of-the-art implementations were used in our framework whenever possible. In-house implementation of some algorithms, for example, LM and CG, might be less efficient, which could affect the running times. At the same time, the necessity of a custom implementation of these algorithms for a particular cost function and deformation model argues against them and makes the comparison fair.
The validation of registration results is problematic in the absence of a ground truth reference and reliable landmarks. Manual selection of landmarks is difficult and imprecise, especially in the case of mixed 2D and 3D images. Complementary to evaluation by a range of intensity-based similarity measures, we attempted to evaluate registration accuracy by 3D matching of automatically detected sulcal landmarks. Sulcal lines of maximal depth (sulcal fundi or sulcal bottom lines) were automatically extracted in 3D by the procedure described in [43]. A similar procedure was developed in 2D, extracting sulcal bottom points. The position of each bottom point was transformed in 3D according to the geometric transformation (alignment and deformation) recovered by our registration algorithm, and a 3D distance to the bottom lines was computed. For each registered slice, the maximal, the minimal, and the average distances with the standard deviation were computed. We compared these average distances for rigid and nonrigid registrations but did not find any statistically significant difference. Since there is a clear visual distinction between rigid and nonrigid registration results, confirmed by similarity measures, one possibility is that the bottom line/bottom point landmark detection and matching method is not sensitive enough to detect it. Another possibility is that main differences are outside of sulcal fundi, which are well matched, and gyral contour or edge landmarks should be tried instead. Validation of our registration results by automatic landmark detection remains an open topic.
The described method has useful potential for various applications in mapping of brain histology to 3D imaging. Applications include (a) compiling computerized 2D-3D atlases of rodent and human brains, (b) comparing histology with MRI for a better characterization of MRI-detectable (and potentially, MRI-invisible) pathological signs, and (c) seeking for a ground truth reference to validate our neocortical parcellation results [5,6]. This approach narrows the gap between the macroscopic scale in MRI and the microscopic scale of histological examination.
With further improvement of performance, the slice-tovolume registration framework may also find useful applications beyond histology: in areas where rigid registration is used for speed, up to this point, but where a nonrigid registration may be beneficial, for example, in planning of MR scans, or in intraoperative registration of single slices from FluoroCT, CineMR, or iMRI to 3D preoperative scans. In current implementation, nonrigid registration of a single slice takes on the order of several minutes and is likely to be too slow for intraoperative applications, but the rigid registration scheme with NEWUOA is robust and sufficiently fast, and we believe that it deserves the attention of research in the medical field.