An Optimized Spline-Based Registration of a 3D CT to a Set of C-Arm Images

We have developed an algorithm for the rigid-body registration of a CT volume to a set of C-arm images. The algorithm uses a gradient-based iterative minimization of a least-squares measure of dissimilarity between the C-arm images and projections of the CT volume. To compute projections, we use a novel method for fast integration of the volume along rays. To improve robustness and speed, we take advantage of a coarse-to-fine processing of the volume/image pyramids. To compute the projections of the volume, the gradient of the dissimilarity measure, and the multiresolution data pyramids, we use a continuous image/volume model based on cubic B-splines, which ensures a high interpolation accuracy and a gradient of the dissimilarity measure that is well defined everywhere. We show the performance of our algorithm on a human spine phantom, where the true alignment is determined using a set of fiducial markers.


INTRODUCTION
Image-guided navigation procedures are critical components of currently available systems for computer-assisted therapy and surgery [1][2][3][4]. Orthopedic procedures offer a real-time display of tools superimposed on the three-dimensional (3D) computed tomography (CT) data that have been acquired preoperatively and that are used for planning of the surgery. CT-based navigation has been successful for spinal pedicle screw insertion, total hip arthroplasty, total knee arthroplasty, pelvic osteotomy, and reconstruction of knee cruciate ligaments [4,5]; however, it is difficult to implement because it requires that the CT volume be registered to the patient at the time of the surgical intervention. This is the problem that we address in this paper.
A partial solution is to use fluoroscopy-based navigation, which is a recent technique that shows promising results in early clinical trials of spinal pedicle screw insertion, distal locking of femoral nails, and femoral fracture reduction [5,6]. This technique displays the tools in a set of intraoperative images of the patient, but is limited to twodimensional (2D) data. Images at different viewing angles are acquired by a C-arm device which has become part of the standard equipment for orthopedic surgery thanks to its mobility and ease of manipulation. To maintain the patient and the C-arm image intensifier in common registration, a position sensor tracks them simultaneously [6,7]. The C-arm images with overlaid projections of the tools are displayed on the computer screen. An update rate of 10 Hz enables realtime navigation in up to four C-arm images simultaneously [7]. The lack of three-dimensionality can be compensated by an augmented number of image shots, which has the drawback of excessive irradiation of the surgical staff and of the patient.
There are two groups of algorithms that can register a 3D CT to a set of X-ray images [8][9][10][11][12][13][14][15][16]. One group matches specific anatomical features that appear in the two imaging modalities [9-12, 14, 15, 17]. These methods rely on an accurate data segmentation or edge detection. They are therefore less robust to the partial-data problem (presence of a feature in one imaging modality and its absence in the other) than projection-based (or intensity-based) methods that form the other group of registration algorithms [8,13,16,[18][19][20]. In general, these methods require little or no segmentation. 2 International Journal of Biomedical Imaging They achieve the registration by matching a set of X-ray images to their simulations. The simulations are traditionally computed as projections of the CT volume obtained by summing the volume intensities along simulated X-rays through the volume. Since the projection-based methods perform the registration of the CT volume to the X-ray images using a large amount of data, their accuracy is potentially higher than that of anatomy-based methods. In this paper, we follow the projection-based approach to combine good properties of the CT-based navigation and of the fluoroscopy-based navigation. Our goal is to achieve an improved, automatic 3D CT-based navigation which would display the tools in the preoperative 3D CT based on its registration to only a few (say, five or less) intraoperative C-arm images.
The projection-based registration is commonly achieved by optimizing a measure of similarity between the X-ray images and their simulations. To perform optimization, some practitioners use Powell's multidimensional direction set method [19,20], while others use gradient-descent-type search technique [8]. In this paper, we perform optimization by using the Levenberg-Marquardt (LM) optimizer, which is tuned to least-squares dissimilarity measures and is very efficient. Moreover, the LM approach makes possible the simultaneous optimization of the dissimilarity measure in all unknown parameters.
Penney et al. compare six similarity measures, based on fiducial markers, for registering a 3D CT to a fluoroscopy image of a spine phantom [21]. They report pattern intensity and gradient difference as being able to register accurately and robustly, even when soft tissues and interventional instruments are present in the fluoroscopy image. They find mutual information to be the least accurate of the six similarity measures. Lemieux et al. propose to register a 3D CT to two radiographic images by optimizing a cross-correlationbased cost function at the initial stage of the registration, and a gradient-based cost function at its final stage [20]. With the same goal, Brown and Boult maximize the gradient correlation between the original and the simulated radiographs [19].
The reference image and the corresponding projection of the volume have generally different intensity ranges. Brown and Boult explore the physical relationship between CT and radiograph measurements [19]. They use it to correct the radiographs such that their registration to the 3D CT is more accurate. Penney et al. design a similarity measure with a suitable scaling of the fluoroscopy image or the volume projection which normalizes their intensity ranges [21]. In this paper, we use a least-squares dissimilarity measure, where we deal with the intensity ranges by normalizing them.
The computational effort of projection-based methods increases with the data size. Several methods for a fast computation of the volume projections, based on ideas from computer graphics, have been proposed [13,22]. The method called shear-warp factorization consists of two steps [23]. In the first step, an intermediate image is computed. The volume, resampled at a higher resolution, is projected using nearest-neighbor interpolation in-slice only. The second step relies on a linear interpolation of the intermediate image to obtain the final projection. Light fields (a technique to parameterize the set of all rays that emanate from a static scene) allow most of the computation to be performed in a preprocessing step [13,22]. Using a light field generated from a number of volume projections, new projections can be computed by interpolating the four-dimensional space of prestored rays. To speed up the registration, Sarrut and Clippe have proposed to precompute a set of projections of the 3D CT that are evenly distributed in space and to match them to X-ray images [18]. In a simulation study, Birkfellner et al. have shown that a 2D/2D 1-DOF (degrees of freedom) registration applied in alternation with a 3D/2D 5-DOF method accelerates the registration three times without loss in accuracy [8]. Another strategy used to improve speed performs the registration iteratively using data at multiple resolutions (multiresolution pyramid) obtained usually by data blurring, using an averaging filter, followed by downsampling [20]. Sometimes, the multiresolution pyramids are built on a region of interest (ROI) extracted from the volume and the images [21]. The multiresolution strategies have been shown to improve the robustness of the registration [24,25].
In this paper, we propose to benefit from a continuous image/volume model based on cubic B-splines for computing the projections of the volume, the gradient of the similarity measure, and the multiresolution data pyramids [26][27][28][29]. Also, we describe a novel shearing method to compute fast projections. This is a one-step approach that does not complicate the optimization procedure, contrary to the intermediate-image approach [23]. Here, we applied our algorithm on the registration of a cadaver spine CT volume with respect to the corresponding C-arm images. The ground-truth registration parameters were estimated based on the position of fiducial markers implanted on the specimen before the data acquisition. We performed two series of experiments, the first on an ROI containing a single vertebra and the second on an ROI containing three vertebrae. The latter was performed to provide an upper bound to the registration performance, although this configuration would not be possible in clinical conditions. We also study the influence of the number of C-arm images and their relative orientation to the registration accuracy. Our algorithm requires at least two input images and its accuracy can be improved either by increasing the number of input C-arm images, or by imposing a minimal angle between their planes, or by combining both strategies.
The paper is organized as follows: we describe the geometry of the problem and the fast computation of the projections in Section 2, along with the model used for the data interpolation. The cost function and its optimizer are described in Section 3. In Section 4, we show the performance of our algorithm on the available data using an objective measure of the registration accuracy. The conclusions are given in Section 5.

Geometric parameters
Given a 3D CT of a patient and a set of 2D C-arm images with known positions and orientations in the reference coordinate system (R-COS), our algorithm computes a rigid-body transformation of the volume in the R-COS with respect to these images. Since we need a meaningful interpretation of the transformation in terms of clinical (anatomical) angles, we use three Euler angles (two out-of-plane rotations and one in-plane rotation) together with three translations to represent this rigid-body motion. Let therefore a = (a 1 , a 2 , . . . , a 6 ) be a vector containing three Euler rotations and three translations of the volume in the R-COS. To compute these parameters, our algorithm refines their initial values by minimizing the mean-square dissimilarity between the C-arm images and the projections of the volume, which we refer to as the cost function. This corresponds to minimizing a real-valued function of the six variables a i , i = 1, 2, . . . , 6.
To improve the robustness and the speed of the algorithm, we register the volume to the images at different resolutions, from the coarsest resolution to the resolution that provides a good tradeoff between accuracy and time. We use the estimate of the position and of the orientation obtained at some resolution to resume the registration at the next finer one ( Figure 1). To obtain data at different resolutions, we compute their B-spline least-squares L 2 pyramids, where L 2 denotes the space of measurable, square-integrable functions [30,31].
The registration at a single resolution of the data is schematically shown in Figure 2. The input data for this registration are (1) the CT volume, (2) the C-arm images, (3) the extrinsic geometrical parameters (the voxel and pixel sizes, and, for each C-arm image, the R-COS position and orientation of the image plane and the R-COS position of the illumination source), and (4) the initial position and orientation of the volume.

Model
The B-spline model f reconstructed from the samples f [n] of a function, where n = (n 1 , . . . , n N ) ∈ Z N , is given by . . . where the coefficients c[n] are obtained by recursive digital filtering [32] of the samples f [n], and where β m (x) is the separable N-dimensional B-spline of degree m given by Here, the function β m on the right-hand side denotes the centered one-dimensional B-spline of degree m [29]. The separability makes possible an operation on N-dimensional data to be performed by successive processing of one-dimensional data along each of the N dimensions. In return, the data processing is simple and fast. The B-splines have many interesting properties [28]. For instance, the approximation order (and the support) of a Bspline of degree m is equal to (m + 1). They are shown to be maximally continuous basis functions, with the minimal support for a given order of approximation, and with the maximal order of approximation for a given support [33].

Projection
Let us refer to the system of 3D voxel indexes, n = (n 1 , n 2 , n 3 , 1), as the volume coordinate system (V-COS) and to the system of 2D pixel indexes, k = (k 1 , k 2 , 0, 1), as the image coordinate system (I-COS). All the vectors and matrices used here are given in homogeneous coordinates. Moreover, let us define the position and the orientation of the image plane in the R-COS by the R-COS coordinate of an in-plane point (e.g., p) and by two in-plane orthonormal vectors Although the X-rays are produced by an illumination source that has a definite physical extent, for the sake of simplicity we prefer to model it as a punctual source situated at s. This source emits rays that hit the image plane at every pixel index k each ray follows the unit vector u k = (A k−s)/ A k− s , where A is some (4×4) homogenous matrix that depends on the position and the orientation of the image plane in the R-COS and on the pixel size. The projection of the volume points with the R-COS coordinate r k (t) = s + t u k (i.e., along the ray determined by the unit vector u k ) to the image plane point with the I-COS coordinate k is where t 1 = 0, t 2 = A k − s , and where the (4 × 4) matrix B(a) is the transformation from the V-COS to the R-COS. We express the transformation B(a) in Appendix A.

Fast projection
A direct implementation of (3) would require a 3D interpolation of the volume at the V-COS coordinate (B(a)) −1 r k (t).
We perform a fast computation of the projection by replacing the 3D interpolation by a 2D interpolation. The principle of this approach is related to [34,35]; we refer to this method as shearing. Let us rewrite (3) as where n 0 (a) = (B(a)) −1 s and n k (a) = (B(a)) −1 u k . The shearing method is based on a change of the integration variable t in (4). The new integration variable is selected from one of three possibilities that correspond to j ∈ {1, 2, 3}, given by where [x] j is the jth component of the vector x. This yields where n 0; j;k (a)=n 0 (a)−([n 0 (a)] j /[n k (a)] j ) n k (a) and n j;k (a)= n k (a)/[n k (a)] j . The scaling factor λ j;k (a) = |1/[n k (a)] j | determines which one of the three expressions for p k (a) will be used to compute the projection. We choose j for which λ j;k (a) is minimal, which ensures the highest sampling rate for f . Since λ j;k (a) depends on k, the decision about j is taken independently for each pixel of the projected image. The integral in (6) can then be approximated by a discrete sum where the size and the phase of the sampling step have been chosen so that only samples of the volume with one integer and two real coordinates take part in the sum, that is, Note that, by construction, the evaluation of f requires the interpolation of the volume at a point with one integer and only two real coordinates, which saves on computations. An example of applying the shearing method for computing a 1D cone-beam projection of a 2D image is shown in Figure 4. As we can see from this figure, there are two image regions.
In one (hollow circles), we sample the image along the horizontal axis, while in the other (black-filled disks), the sampling is performed along the vertical axis. (The gray samples of Figure 4 can be attributed to either region.) Likewise, we can find regions of image pixels k with same j in the 3D-to-2D case. This can speed up the implementation of the projection equation. By combining (1) with (7), we get the final expression that allows us to estimate the value of the projection of a volume with position and orientation a along a simulated X-ray of direction u k , onto the C-arm pixel indexed by k, as

Cost function
Given the volume with position and orientation a, the cost function measures the dissimilarity between the qth C-arm image p q and the qth volume projection p q (a) at Q viewing angles (i.e., q = 1, 2, . . . , Q). We first subtract the mean intensity from a C-arm image, and then divide by its standard deviation to obtain an intensity range that depends only weakly on the power of the illumination source. We do the same for the corresponding projection. Thus normalized, the C-arm  image and its corresponding projection can then be compared by the mean-square cost function given by where P q (a) and σ q (a) are the mean value and the standard deviation of the qth projection, respectively, and where P q and σ q are the mean value and the standard deviation of the qth C-arm image, respectively. The arbitrary binary mask D q contains card(D q ) pixels. It is suited to the qth view and prevents the cost function to be evaluated over irrelevant data such as the surgical instruments, the dynamic reference base, the markers if any, or nonanatomical data. For the experiments performed here, we have created this mask manually. However, for practical applications, it could also have been created using a semiautomated or perhaps automatic method.

Optimizer
Suited to least-squares cost functions, the LM algorithm achieves a good tradeoff between the robust but generally inefficient method of steepest descent and the efficient but nonrobust Newton method [36]. It uses the gradient and the Hessian Let H be a modified Hessian such that the diagonal elements of the true Hessian are multiplied by a factor (1 + λ) while its off-diagonal elements are not changed, where 6}. Then, the LM optimization algorithm can be described by Equation (13) approximates the gradient algorithm for λ → +∞, albeit with vanishing steps. Similarly, it approximates the Newton algorithm for λ → 0. The parameter λ is adaptively tuned to provide a smooth transition from the steepestdescent algorithm (used in the beginning) to the Newton algorithm (used as approaching to the solution) [36]. When using a specific approximation of the Hessian, the LM algorithm has even better convergence properties. We describe this approximation below.

Gauss-Newton approximation of the Hessian
The exact components of the Hessian matrix depend on the first derivatives of the volume projections with respect to the parameter a, as well as on their second derivatives. Since the influence of the second-derivative terms can be destabilizing when the current guess is far from the solution, we use an approximation of the elements of the Hessian matrix which ignores them [36]. This approximation makes the optimization work better because it provides a Hessian matrix that is positive-definite when λ is sufficiently large. This is known as the Gauss-Newton approximation of the Hessian for leastsquares cost functions [37,38]. We give a detailed description of this approximation in Appendix C.

Gradient
The gradient of the cost function depends on the first derivatives of the volume projections with respect to the parameter a. In turn, these derivatives depend on the spatial gradient of the volume. The spatial gradient of our B-spline data model (1) is given by For the B-spline of degree m ≥ 2, we have the guarantee that its first derivative dβ m (x)/dx is continuous, which is not the case if m = 0 (nearest-neighbor interpolation) or m = 1 (linear interpolation). We choose to model the data using cubic B-splines (m = 3), because this model provides a gradient of the cost function which is well defined everywhere. At the same time, this choice offers a good tradeoff between computational cost and interpolation quality. Detailed expressions for calculating the first derivatives of the volume projections and elements of the gradient are given in Appendix B.

Methodology
We show an application of our algorithm on the registration of a 3D CT of a frozen human spine with respect to a set of its C-arm images. We compare the computed alignment with the alignment that we estimate based on a set of markers implanted on the specimen before the data acquisition. The true alignment is therefore known a priori, albeit within some nonzero tolerance. To simulate clinical conditions, we register one vertebra at a time, which severely limits the amount of available data. However, to get an upper bound to the accuracy of our algorithm when registering this data set, we also perform a joint registration of several vertebrae. The data set is described in Section 4.2. The measure of the registration accuracy is given in Section 4.3. One of the elements that determines the working range of our algorithm is given in terms of the angle between the image planes. Given the CT volume and a pair of C-arm images, our preliminary experiments have shown that this angle should be at least 15 • . Our algorithm therefore requires at least two input C-arm images.
We performed two sets of experiments. In the first set, which is presented in Section 4.4, we investigate the role of multiresolution. In the second set, which is described in Section 4.5, we show the benefit of increasing the number of views, and explore the influence of the minimal angle between views. We discuss the residual misalignment in Section 4.6.

Spine data
A human cadaver spine specimen was frozen so that it can be treated as a truly rigid body. This gives us the opportunity to jointly consider more data (here, three vertebrae) than is available in clinical practice where each vertebra can exhibit independent motion, thus providing an upper bound to the accuracy of the registration of this data set. However, to be closer to a clinical configuration, we restrict the focus of our experiments on a single vertebra by masking out large areas from the C-arm images so that only one vertebra shows through at any one time. Five fiducial markers (custom-made, titanium) were implanted on the spine. One was placed in the L5, two in the L4, and two in the L3 vertebra.
The specimen, placed in a plastic bag that is apparent in the transversal and sagittal views of Figure 5, was CT-scanned (GE LightSpeed Ultra CT scanner) with seventy two slices of size (512 × 512), in pixel units. The intraslice pixel size was (0.36 × 0.36), in mm units, and the interslice thickness was 2.5 mm. The tilt angle was zero. The images were acquired using a Siemens ISO-C Carm instrumented with LED markers. Their positions were tracked using an optoelectronic position sensor (Northern Digital Optotrak 3020). The geometric parameters of the Carm X-ray projection (the position and the orientation of the image plane and the position of the illumination source s) were determined according to the two-step procedure proposed by Hofstetter et al. [7]. The captured images were un-distorted before further use [7]. We used an optical tracking system and calibration procedure that resulted in the navigation error being 0.5 ± 0.5 mm [7].
Typical clinical settings involve a device called a dynamic reference base (DRB) to define the patient coordinate system (P-COS) in which the tools are tracked. In spinal surgery, the DRB is commonly clamped to one of the vertebrae; both DRB and clamp are therefore visible in the C-arm images. Their presence challenges the registration algorithm because neither DRB nor clamp is present in the CT data. Thanks to the mask D q in (9), we have hidden them, which further reduces the amount of data available for registration; in addition, we have hidden the fiducial markers from the C-arm images to forfeit any illegitimate help from them. Two out of the seven available images are shown in Figure 6. The images were of size (768 × 576), in pixel units, and the pixel size was (0.36 × 0.36), in mm units. We show in Figure 7 the projections of the CT that correspond to the C-arm images of Figure 6, as found by our registration algorithm.
To determine the ground-truth registration, the coordinates of the fiducial markers were digitized in the P-COS using an optoelectronically tracked pointer. Then, they were transformed to the R-COS. The coordinates of the fiducial markers in the V-COS were estimated by fitting a sphere to the marker heads.  Figure 7: Two projections of the CT volume. According to the outcome of our registration procedure, these projections correspond to the C-arm images given in Figure 6. As the DRB was not present at the time the CT was acquired, it is not present here. The fiducial markers, however, are present at all times.

Measure of the registration accuracy
We transform the V-COS coordinate n of each CT voxel into the R-COS by using two transformations: the ground-truth transformation B and the transformation B(a) estimated by our algorithm. We define the misregistration as the average of the norm of the difference between the two R-COS coordinates over all the CT voxels, where F is the support of f . In the case of perfect registration, which is achieved when B(a) = B, we have that = 0. To measure the registration accuracy, we evaluate (15) for B that is estimated using a given list of coordinates of the fiducial markers in the R-COS (v i , i = 1, . . . , 5) and in the V-COS (n i , i = 1, . . . , 5). The estimation of B is done by minimizing We compute a single ground truth for both singlevertebra and several-vertebrae registration. More precisely, we use the collection of all five markers to determine it, even though some are not placed on the vertebra of interest in the single-vertebra case.

Multiresolution
We use four-level CT-volume and C-arm-image pyramids ( Table 1). These pyramids are dyadic in only two directions; we do not change the number of CT slices while performing the data reduction since the registration fails otherwise, Table 2: Accuracy of the registration of the CT volume to eighteen pairs of C-arm images, at each level K of a multiresolution pyramid (K = ∞: initial condition). The angle between image planes is larger than 15 • . The results for a single vertebra can be compared to those for joint vertebrae (L3/L4/L5). the initial number of 72 CT slices becoming too small when reduced. For the experiment performed in this section, we created a set of eighteen pairs of C-arm images such that the angle between their planes was larger than 15 • . We run the algorithm for an imposed deliberate initial misregistration such that = 9.0. For each vertebra, we report in Table 2 how the registration accuracy evolve across the levels of the multiresolution pyramid. In the same table, we present the results of the registration based on all three vertebrae. In the single-vertebra case, one can note that pyramid levels 0 and 1 (finest and next-to-finest levels, resp.) contribute to an accuracy gain lesser than 0.2 mm. Since this does not justify the huge additional registration time, we stopped the registration process in the several-vertebrae case after having completed the two coarsest pyramid levels.
As expected, the data reduction from three vertebrae to a single one results in a drop of performance. This effect is particularly strong for the vertebra L4 (the fourth vertebra in the lumbar part of the vertebral column) which suffers from a very small ROI obtained after having masked out the DRB and the clamps. We will see in Section 4.5 that this drop can be compensated by increasing the number of imaging views at the input and the minimal angle between the image planes.
We report in the first column of Table 3 the time spent at the Kth level of this multiresolution pyramid. We observe that the progression is not a geometric one: while the noncumulative registration time spent until convergence at level K = 2 is about four times longer than that at level K = 3, this convergence time is nearly six times longer at K = 1 than at K = 2. The largest computational payoff, due to the 8 International Journal of Biomedical Imaging Table 4: Accuracy of the registration of the CT volume to pairs, triplets, and quadruplets of single-vertebra C-arm images for different minimal angles between image planes. The pooled performance is reported under the heading L . The results for single vertebrae can be compared to those for joint vertebrae (L3/L4/L5).

Angle
Views Cases Accuracy (mm) multiresolution strategy, comes between K = 0 and K = 1 since at level K = 0 the algorithm spent only twice the time spent at level K = 1, despite the fact that there is four times more data to process at K = 0. This suggests that the solution reached at the next-to-finest resolution is already very close to the solution found at the finest resolution, an interpretation that is consistent with the results of Table 2.

Several-views experiments and angle between views
We repeated the single-vertebra experiment described in Section 4.4 on two additional image sets: a set of twenty two image triplets and a set of thirteen image quadruplets, the angle between any two image planes in each set being larger than 15 • . Also, we performed the same experiment using three more sets: fifteen pairs, thirteen triplets, and four quadruplets of C-arm images for the minimal angle between image planes of 25 • . We show the results in Table 4, where the influence of the number of views is combined with that of the minimal angle between views. In the same table, we present the results of the joint registration of all three vertebrae based on two, three, and four image views, and the minimal angle between image planes of 15 • . We observe a significant gain in performance when using three views instead of two, while the gain of an additional fourth view, although less impressive, is present too.
We also observe that it is beneficial to enforce that the angle between views is closer to being perpendicular. For two views, a spectacular effect of increasing the minimal admissible angle from 15 • to 25 • is a strong reduction in the standard deviation of the registration accuracy. We interpret this reduction as indicative of the disappearance of outliers. Finally, we observe that the overall geometrical performance is similar when using four views that meet at an angle that is allowed to become rather flat, or when using only three views that meet at a less acute angle. The latter case is more favorable because of a smaller radiation dosage and because of the registration speed ( Table 3). Note that if we stop the registration at level K = 2, as suggested in Section 4.4, the registration of the CT to three C-arm images takes just about two minutes.

Residual misalignment
As we can see from Table 4, our algorithm registers our CT/C-arm data set with an accuracy of about 1.7 ± 0.5 mm when using pairs of images that encompass all three vertebrae and that meet at an angle larger than 15 • . But three views are needed to reach the same accuracy when only a single vertebra is available; moreover, the imaging planes should meet at an angle larger than 25 • in this case. Then, our algorithm achieves an accuracy of about 1.7 ± 0.3 mm ( Table 4).
The residual misalignment can be explained by the fact that some errors were made when digitizing the fiducial markers (the mean navigation error was 0.5 mm), and that some more errors were made when determining the CT indexes of the markers. The maximum error committed when determining the CT indexes of the markers was about 0.6 mm (25% slice off). This means that in the experiments carried out, our algorithm is perhaps responsible only for a fraction of the reported misregistration. Note that the mean misregistration of 1.7 mm is clearly subvoxel with respect to the interslice CT thickness of 2.5 mm.

CONCLUSION
We have developed a projection-based algorithm for the registration of a CT volume to a set of C-arm images, which takes advantage of a continuous model of the CT volume based on cubic B-splines. This model allows for a welldefined gradient of the dissimilarity measure, which is a necessary condition for efficient and accurate registration. We compute the projections of the volume using a novel method for integrating the volume along simulated rays. The advantage of this method is the replacement of a 3D interpolation by a faster 2D interpolation. Our algorithm improves robustness and speed using a coarse-to-fine registration strategy based on spline multiresolution data pyramids.
We have shown the performance of the algorithm when registering a set of human cadaver spine CT/C-arm data. Their true alignment was estimated using a set of fiducial markers implanted on the specimen before the data acquisition. We have provided an objective geometric measure of the quality of the registration.
We have found that our algorithm requires at least two input C-arm images whose planes meet at an angle larger than 15 • . To register a single vertebra at a time, we performed two sets of experiments. In each set, we used image pairs, image triplets, and image quadruplets as input. In the first set, we have selected them in such a way that the minimal angle between any two image planes was 15 • . In the second set, the minimal angle was chosen to be 25 • . We showed that the registration accuracy could be improved by increasing the number of C-arm images and/or the minimal angle between them. Moreover, we showed that a good tradeoff between the radiation dosage, the registration accuracy, and the registration time could be achieved with three different C-arm orientations. Given the triplets of C-arm images that meet at an angle larger than 25 • , our algorithm achieves registration accuracy of 1.7 ± 0.3 mm for this CT/C-arm data set. Note that this accuracy is subvoxel with respect to the CT which has an interslice thickness of 2.5 mm. However, our algorithm was able to achieve an accuracy of 1.4 ± 0.2 mm when jointly registering all three vertebrae using triplets of images that meet at an angle larger than 15 • . This shows a potential of our algorithm to register a single vertebra with smaller misalignments than 1.7 ± 0.3 mm. To achieve this, one should perhaps let the minimal angle between the image planes be closer to perpendicular. Unfortunately, the configuration of our data did not allow us to presently investigate this issue.
To be applied in a real clinical procedure, our algorithm should be validated using more realistic data such as data with soft tissues and patient data.

A. MATRIX B(a)
Let us write the vector of orientational and translatonal parameters a = (ϕ, θ, ψ, Δx 1 , Δx 2 , Δx 3 ), where ϕ, θ, and ψ are three Euler angles, and Δx 1 , Δx 2 , and Δx 3 are three translations. Given a, the voxel size λ 1 × λ 2 × λ 3 , and the 3D index of the volume origin c, we write the transformation B(a) as follows: The matrix makes the origin of the volume be the center of rotation. The matrix translates the volume by Δx 1 , Δx 2 , and Δx 3 along its new x-, y-, and z-axes, respectively. The sense of rotation for ϕ, θ, and ψ has been chosen so as to follow the conventions of a right-handed coordinate system.

B. GRADIENT OF THE COST FUNCTION
The gradient of the cost function (9) with respect to some parameter a i is given by Because variations tend to average themselves over the whole image, we do not expect the mean value P q (a) and the standard deviation σ q (a) to exhibit a strong dependence on a small variation of a i . By contrast, we consider as relevant the local dependence of the projection p q;k (a). Thus, we simplify (B. where, for the sake of brevity, we have dropped the dependence on q and k, and where the spatial gradient of the image ∇ f (x n (a)), evaluated at x n (a) = n 0 (a)+nn(a), is determined by (14). Here, we leave as an exercise to the reader the task to develop further the term ∂λ(a)/∂a i , which depends on geometry alone; the same goes for the term ∂x n (a)/∂a i . Combining (B.4) with (1), (2), and (14), we finally get where, as before, we have ignored the terms ∂σ/∂a and ∂P/∂a. As explained in Section 3.3, it is actually beneficial to also ignore the second-derivative terms. Finally, we express a component of the Hessian as where the remaining differential terms are available from (B.5).