Comparison of Quadratic- and Median-Based Roughness Penalties for Penalized-Likelihood Sinogram Restoration in Computed Tomography

We have compared the performance of two different penalty choices for a penalized-likelihood sinogram-restoration strategy we have been developing. One is a quadratic penalty we have employed previously and the other is a new median-based penalty. We compared the approaches to a noniterative adaptive filter that loosely but not explicitly models data statistics. We found that the two approaches produced similar resolution-variance tradeoffs to each other and that they outperformed the adaptive filter in the low-dose regime, which suggests that the particular choice of penalty in our approach may be less important than the fact that we are explicitly modeling data statistics at all. Since the quadratic penalty allows for derivation of an algorithm that is guaranteed to monotonically increase the penalized-likelihood objective function, we find it to be preferable to the median-based penalty.


INTRODUCTION
We have recently developed penalized-likelihood approaches to the problems of sinogram smoothing and sinogram restoration in computed tomography [1][2][3][4], with a particular eye to the low-dose regime being considered for screening studies for lung and colon cancer [5][6][7][8][9]. In both cases, we assume that the statistics of each detector measurement are given by the sum of a compound Poisson term, representing the photon counting statistics for polychromatic photons, and a Gaussian term, representing electronic noise. In the case of sinogram restoration, we generalize the measurement model to include blurring coefficients representing sinogram degradations such as off-focal radiation, detector afterglow, and detector crosstalk [2,4]. From the noisy, degraded measurements, we then seek to estimate a set of "ideal," undegraded line integrals by iteratively maximizing an objective function comprising a sum of a simple Poisson likelihood (an approximation to the measurement statistics assumed above) and a roughness penalty. The estimated line integrals can then be fed into an existing analytic reconstruction algorithm, such as those typically implemented in hardware on commercial CT scanners. The hope is that this iterative sinogram-domain approach would provide some of the statistical advantages of fully iterative approaches to image reconstruction at a lower computational cost.
In our previous studies of the smoothing and restoration approaches, we have made use of a quadratic roughness penalty applied in the line integral (log) domain to the difference between a given sample in sinogram space and its four adjoining neighbors, that is, between a given detector channel and its two neighboring channels (we assumed a single-row detector), as well as to its own reading at the preceding and following view angles. While the approaches performed better in resolution-variance studies at low doses than did Hsieh's noniterative adaptive trimmed mean (ATM) filter [10], the ATM filter was surprisingly effective at reducing the influence of a small number of very noisy measurements without unduly compromising resolution. The ATM filter is only applied to measurements whose signal strength falls below a certain threshold, and it entails replacing the value in question with the trimmed mean of the values in a neighborhood of the sinogram around the measurement in question. The trimmed mean filter is a median-like filter based on order statistics, and it varies adaptively between applying a true median filter and a simple boxcar filter, depending on the signal level. The relatively strong performance of this filter suggested that it would be worthwhile to explore the use of a median-based roughness penalty in the context of the sinogram smoothing and restoration methods we have developed.
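Hsieh's ATM filter, as described above, switches between median-like and boxcar behavior depending on the signal level. A minimal 1-D sketch of this switching logic follows; it is not Hsieh's exact implementation, and the threshold handling, trimming schedule, and window length are our own illustrative assumptions:

```python
import numpy as np

def trimmed_mean(values, trim_fraction):
    """Mean of the sorted values after discarding trim_fraction from each end.

    trim_fraction = 0.0 gives a boxcar (plain mean); trim_fraction -> 0.5
    approaches a true median, mirroring the ATM filter's two extremes.
    """
    v = np.sort(values)
    k = int(trim_fraction * len(v))
    if 2 * k >= len(v):          # fully trimmed: fall back to the median
        return float(np.median(v))
    return float(v[k:len(v) - k].mean())

def atm_filter_1d(signal, threshold, window=5):
    """Simplified adaptive trimmed-mean filter (illustrative only).

    Samples at or above `threshold` are left untouched; samples below it
    are replaced by a trimmed mean of their neighborhood, with the trimming
    fraction increasing (toward a median) as the signal level drops.
    """
    out = np.array(signal, dtype=float)
    half = window // 2
    n = len(out)
    for i in range(n):
        if signal[i] >= threshold:
            continue
        lo, hi = max(0, i - half), min(n, i + half + 1)
        # weaker signal -> heavier trimming -> more median-like
        frac = 0.5 * (1.0 - max(signal[i], 0.0) / threshold)
        out[i] = trimmed_mean(np.asarray(signal[lo:hi], dtype=float), frac)
    return out
```

A weak sample is thus replaced by an increasingly trimmed, median-like neighborhood mean, while samples above the threshold pass through untouched.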

International Journal of Biomedical Imaging

The use of median-based roughness penalties in fully iterative penalized-likelihood image reconstruction was pioneered by Alenius et al., who referred to them as median root priors (MRPs) [11,12]. Unfortunately, it does not appear to be possible to derive an iterative algorithm that is guaranteed to monotonically increase an objective function based on the MRP of Alenius et al. However, they derive a heuristic update equation, based on Green's one-step-late (OSL) strategy [13], that does not necessarily correspond to the maximization of a predefined objective function, but that does yield good results in practice.
In this work, we explore the use of a standard MRP penalty like those of Alenius et al. in the context of our sinogram-restoration approach, and we make use of the heuristic OSL strategy to derive the iterative update. We compare the images qualitatively and quantitatively to those obtained by use of our method with quadratic roughness penalties, as well as to those obtained by use of Hsieh's ATM filter.

Measurement model
We assume that the CT scan produces a set of measurements that are represented as a one-dimensional (1D) vector y^meas, with elements y_i^meas, i = 1, ..., N_y, where N_y is the total number of measurements in the scan, and the index i denotes a particular attenuation line through the patient (i.e., a specific combination of detector channel, detector row, and projection angle).
To review the model we have been employing [2][3][4]: we assume, when we simulate data, that each y_i^meas is a realization of a random variable Y_i^meas whose statistics are described by

$$ Y_i^{\text{meas}} = G_i \sum_{m=1}^{N_E} E_m \,\text{Poisson}\Bigl\{ I_i b_{im} e^{-\int_{L_i} \mu(x, E_m)\,dx} + s_{im} \Bigr\} + \text{Normal}(0, \sigma_i^2), \qquad (1) $$

with the various terms in this equation defined in Table 1.
The compound Poisson distribution in the first term has been derived and validated by Whiting [14] and rederived by Elbakri and Fessler [15,16]. We assume that the gains G_i, the intensities I_i and spectrum b_{im} of the incident beam, and an energy-averaged and normalized estimated scatter term \hat{s}_i are all known. Our goal is to estimate a set of ideal, "monochromatic" attenuation line integrals

$$ l_i^{(\text{mono})} = \int_{L_i} \mu(x, E_r)\,dx, \qquad i = 1, \ldots, N_y, \qquad (2) $$

at some reference energy E_r (usually the mean energy \bar{E}_i), from the set of measurements y_i^meas, i = 1, ..., N_y. These estimated line integrals can then be input to a standard analytic reconstruction algorithm as mentioned above. Our strategy for estimating the line integrals entails maximizing a penalized-likelihood objective function. Because the model of (1) does not yield a tractable likelihood, we approximate it by defining a vector y of new adjusted measurements with elements

$$ y_i = \Bigl[ \frac{y_i^{\text{meas}}}{G_i \bar{E}_i} - \hat{s}_i \Bigr]_+, \qquad (3) $$

where [x]_+ is x for positive x and zero otherwise, that are realizations of random variables Y_i which we assume are approximately Poisson-distributed:

$$ Y_i \sim \text{Poisson}\{ \bar{y}_i(l^{(\text{poly})}) \}, \qquad (4) $$

where

$$ \bar{y}_i(l^{(\text{poly})}) = \sum_{j=1}^{N_y} a_{ij} I_j e^{-l_j^{(\text{poly})}}, $$

with a_{ij} being the blurring coefficients discussed above and with

$$ l_j^{(\text{poly})} = f_j\bigl( l_j^{(\text{mono})} \bigr), \qquad (5) $$

f_j(l) being an empirically determined function, typically polynomial, that adequately captures the effect of beam hardening in slices that do not contain substantial amounts of bone [17]. Our strategy is then to estimate the vector l^(poly), with elements l_j^(poly), from the vector of adjusted measurements y, since the needed l_i^(mono) can then be obtained by inverting (5). For simplicity, we drop the (poly) superscripts from l^(poly) in what follows.
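The statistical model just described can be sketched in simulation. The following minimal illustration draws one detector reading as an energy-weighted sum of per-bin Poisson counts plus Gaussian electronic noise; the argument names and the exact placement of the scatter term inside the Poisson mean are our assumptions, not the paper's code:

```python
import numpy as np

def simulate_measurement(I, b, E, mu_line, s, G, sigma, rng):
    """Draw one detector reading: an energy-weighted sum of independent
    Poisson counts per spectral bin plus Gaussian electronic noise.
    All inputs are illustrative placeholders.

    I         : incident photon count for this ray
    b[m]      : probability of an incident photon lying in spectral bin m
    E[m]      : energy (keV) of bin m
    mu_line[m]: line integral of mu(x, E_m) along the ray
    s[m]      : expected scattered photons of energy E_m
    G         : detector gain (pA/keV)
    sigma     : std. dev. of electronic noise (pA)
    """
    lam = I * b * np.exp(-mu_line) + s          # Poisson mean per spectral bin
    counts = rng.poisson(lam)                   # photon counts per bin
    return G * np.sum(E * counts) + rng.normal(0.0, sigma)
```

The adjusted, approximately Poisson measurement is then formed by rescaling this reading by the gain and mean energy, subtracting the scatter estimate, and clipping negative values to zero.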

Quadratic penalty approach
Our general strategy thus far [2][3][4] has been to maximize a penalized-likelihood objective function

$$ \Phi(l) = L(l) - \beta R(l), \qquad (6) $$

where

$$ L(l) = \sum_{i=1}^{N_y} \bigl( y_i \log \bar{y}_i(l) - \bar{y}_i(l) \bigr) \qquad (7) $$

is the Poisson log-likelihood for the random variables of (4), with \bar{y}_i(l) = \sum_{j=1}^{N_y} a_{ij} I_j e^{-l_j}. The roughness penalty R(l) can be expressed in a general form as given by Fessler [18]:

$$ R(l) = \sum_{k=1}^{K} \psi_k \Bigl( \sum_{j=1}^{N_y} t_{kj} l_j \Bigr), \qquad (8) $$

where ψ_k is a potential function that assigns a cost to the K combinations of attenuation line-integral values represented by the linear combinations \sum_{j=1}^{N_y} t_{kj} l_j. Our quadratic penalty approach entails choosing ψ_k(t) = ω_k t^2/2 and constructing the t_{kj} to create differences of a sinogram sample with its horizontal and vertical neighbors, with ω_k = 1/2 for those neighbors. This is equivalent to

$$ R(l) = \frac{1}{4} \sum_{j=1}^{N_y} \sum_{j' \in N_j} (l_j - l_{j'})^2, \qquad (9) $$

where N_j denotes the neighborhood comprising the 4 nearest horizontal and vertical neighbors of measurement j.

Table 1: Terms appearing in the measurement model of (1).

I_i : number of photons incident on the ith attenuation line
b_im : probability of a photon incident on the ith attenuation line belonging to the mth spectral bin
E_m : energy of the mth spectral bin
s_im : number of scattered photons of energy E_m contributing to measurement i
μ(x, E) : energy-dependent attenuation map, with x being the spatial coordinate in the patient
G_i : gain of the ith detector measurement (pA/keV)
σ_i^2 : variance of the electronic noise in the ith measurement
We have derived an algorithm that generates a sequence of estimates of l guaranteed to increase the objective function of (6) by making use of the optimization transfer principle [18], in which at each iteration one defines a surrogate to the likelihood function such that the vector of line integrals maximizing this surrogate is guaranteed to have a higher penalized likelihood than the previous vector estimate. The resulting update is

$$ l_j^{(n+1)} = \Bigl[ l_j^{(n)} + \frac{ \dot{L}_j(l^{(n)}) - \beta \dot{R}_j(l^{(n)}) }{ c_j^{(n)} + 2\beta |N_j| } \Bigr]_+, \qquad (10) $$

where the notation l_j^{(n)} denotes the estimate of l_j after the nth iteration. Here,

$$ \dot{L}_j(l^{(n)}) = -\sum_{i=1}^{N_y} \dot{g}_i\bigl( \bar{y}_i(l^{(n)}) \bigr)\, a_{ij} I_j e^{-l_j^{(n)}}, \qquad (11) $$

$$ \dot{R}_j(l^{(n)}) = \sum_{j' \in N_j} \bigl( l_j^{(n)} - l_{j'}^{(n)} \bigr), \qquad (12) $$

where \dot{g}_i(x) = y_i/x - 1 is the derivative of g_i(x) ≡ y_i log x − x, and the c_j^{(n)} are the curvatures of paraboloidal surrogates constructed to give rise to an overall, easy-to-maximize quadratic surrogate to the objective function. One choice for the curvatures that guarantees monotonicity is

$$ c_j^{(n)} = \sum_{i=1}^{N_y} a_{ij} I_j, \qquad (13) $$

although we make use of a different set of curvatures that do not guarantee monotonicity but that lead to faster convergence in practice [19].
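For a sinogram stored as a 2-D array (view angles × channels), the 4-nearest-neighbor quadratic roughness penalty and its gradient can be sketched as follows; the convention that each horizontal and vertical neighboring pair contributes half its squared difference is one common weighting and is our assumption here:

```python
import numpy as np

def quad_penalty(l):
    """Quadratic roughness penalty over a 2-D sinogram: each horizontal and
    vertical neighboring pair (j, j') contributes (l_j - l_j')^2 / 2,
    counting each unordered pair once."""
    dv = np.diff(l, axis=0)   # differences between adjacent view angles
    dc = np.diff(l, axis=1)   # differences between adjacent channels
    return 0.5 * (np.sum(dv ** 2) + np.sum(dc ** 2))

def quad_penalty_grad(l):
    """Gradient of the penalty: dR/dl_j = sum over the 4-neighborhood N_j
    of (l_j - l_j'), with smaller neighborhoods at the array edges."""
    g = np.zeros_like(l)
    g[:-1, :] += l[:-1, :] - l[1:, :]   # neighbor at the next view angle
    g[1:, :]  += l[1:, :] - l[:-1, :]   # neighbor at the previous view angle
    g[:, :-1] += l[:, :-1] - l[:, 1:]   # neighbor at the next channel
    g[:, 1:]  += l[:, 1:] - l[:, :-1]   # neighbor at the previous channel
    return g
```

The gradient routine is the \dot{R}_j term needed by the iterative update, evaluated for every sinogram sample at once.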

Median penalty
For the median penalty approach, we employ a more heuristic approach based on producing a sequence of estimates that, in the absence of a penalty, would yield a maximum-likelihood estimate, but that incorporates a penalty term in each iteration that discourages deviations from the local median of the last iteration. This is consistent with the approach of Alenius et al. in fully iterative reconstruction [11,12]. Specifically, the update is defined as

$$ l^{(n+1)} = \arg\max_{l} \bigl[ S(l, l^{(n)}) - \beta M(l, l^{(n)}) \bigr], \qquad (14) $$

where S(l, l^{(n)}) is a surrogate to the log-likelihood to be described below and

$$ M(l, l^{(n)}) = \sum_{k=1}^{N_y} \frac{ \bigl( l_k - m_k^{(n)} \bigr)^2 }{ 2 m_k^{(n)} } \qquad (15) $$

is the median penalty, with

$$ m_k^{(n)} \equiv \text{Med}\bigl\{ l_{k'}^{(n)} : k' \in N_k \bigr\} \qquad (16) $$

denoting the median of the values of l^{(n)} in some neighborhood N_k around value k. The surrogate S(l, l^{(n)}) is defined as

$$ S(l, l^{(n)}) = \sum_{i=1}^{N_y} \sum_{j=1}^{N_y} \alpha_{ij}^{(n)} g_i\Bigl( \bar{y}_i(l^{(n)})\, e^{l_j^{(n)} - l_j} \Bigr), \qquad (17) $$

where

$$ \alpha_{ij}^{(n)} = \frac{ a_{ij} I_j e^{-l_j^{(n)}} }{ \bar{y}_i(l^{(n)}) }, \qquad (18) $$

with \sum_j \alpha_{ij}^{(n)} = 1, and g_i(x) ≡ y_i log x − x. This surrogate satisfies both S(l, l^{(n)}) ≤ L(l), ∀l_j ≥ 0, and S(l^{(n)}, l^{(n)}) = L(l^{(n)}), and thus in the absence of the penalty term, finding the l maximizing this surrogate necessarily increases the likelihood [4].
Substituting (17) and (15) into (14), our update is given by

$$ l^{(n+1)} = \arg\max_{l} \sum_{j=1}^{N_y} \Bigl[ \sum_{i=1}^{N_y} \alpha_{ij}^{(n)} g_i\Bigl( \bar{y}_i(l^{(n)})\, e^{l_j^{(n)} - l_j} \Bigr) - \beta \frac{ \bigl( l_j - m_j^{(n)} \bigr)^2 }{ 2 m_j^{(n)} } \Bigr]. \qquad (19) $$

We can solve for the maximum by setting the derivative with respect to l_j equal to zero, evaluating the penalty-gradient term at the current estimate l^{(n)} in keeping with the OSL strategy. Doing so yields

$$ \sum_{i=1}^{N_y} \alpha_{ij}^{(n)} \Bigl( \bar{y}_i(l^{(n)})\, e^{l_j^{(n)} - l_j} - y_i \Bigr) - \beta \frac{ l_j^{(n)} - m_j^{(n)} }{ m_j^{(n)} } = 0. \qquad (20) $$

Solving for l_j yields the update

$$ l_j^{(n+1)} = l_j^{(n)} + \log\!\left[ \frac{ \sum_{i} \alpha_{ij}^{(n)} \bar{y}_i(l^{(n)}) }{ \sum_{i} \alpha_{ij}^{(n)} y_i + \beta \bigl( l_j^{(n)} - m_j^{(n)} \bigr) / m_j^{(n)} } \right]. \qquad (21) $$
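The one-step-late use of the median penalty can be sketched as follows, assuming the Alenius-style form (l_k − m_k)^2 / (2 m_k) for the penalty; the 3 × 3 neighborhood matches the one used in the experiments below, while the edge handling and the guard against division by zero are our own choices:

```python
import numpy as np

def local_median(l, size=3):
    """Local medians of a 2-D sinogram estimate over a size x size
    neighborhood, with edge replication at the array boundaries."""
    padded = np.pad(l, size // 2, mode="edge")
    out = np.empty_like(l, dtype=float)
    rows, cols = l.shape
    for r in range(rows):
        for c in range(cols):
            out[r, c] = np.median(padded[r:r + size, c:c + size])
    return out

def mrp_osl_gradient(l_prev, beta):
    """One-step-late gradient of a median-root-style penalty, evaluated
    entirely at the previous iterate: beta * (l - med(l)) / med(l)."""
    med = local_median(l_prev)
    return beta * (l_prev - med) / np.maximum(med, 1e-12)
```

Because the medians and the penalty gradient are both computed from the previous iterate, each update has a closed form, at the cost of no longer maximizing a single predefined objective function.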

Qualitative results
To compare the two penalties to each other as well as to the ATM filter, we simulated projections of a numerical ellipse phantom shown in Figure 1 that we have used previously and that is modeled on the physical phantom employed by Hsieh [10]. We computed a sinogram of 1024 angles × 1024 bins of extent 0.5 mm at the isocenter with a source-to-isocenter distance of 540.0 mm. We simulated the data according to the forward model of (1) with discretization of a realistic CT spectrum into 1 keV bins. We assumed that the phantom was water-equivalent except for the three circular structures, which we assumed to be bone.
We simulated two exposure levels: a clinically typical I_i = 2.5 × 10^6 and a low-dose level I_i = 2.5 × 10^5. We chose G_i = 3.57 × 10^-3 pA/keV, such that G_i E_i = 0.25 pA/quantum for E_i = 70 keV, and σ_i^2 = 10.0 pA^2 for all i. We included the effects of off-focal radiation in the simulation by convolving with kernels having 13 nonzero values arrayed diagonally with slope −2 in discrete sinogram space. The central value (corresponding to the zero point of the kernel) had relative value 1.0 and the six values on either side had relative value 0.02 (these thirteen values were normalized so that their sum was 1.0). We found through simulations with water-equivalent phantoms that the beam-hardening effect of the simulated tube spectrum was well represented by the second-order polynomial f(l) = l − 0.007 l^2.
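The off-focal kernel just described can be constructed and applied as in the sketch below; the convention that "slope −2" means two angle steps per detector-bin step, and the boundary handling, are our assumptions:

```python
import numpy as np

def off_focal_kernel():
    """13-tap off-focal kernel: center weight 1.0 and six taps of 0.02 on
    each side, laid out along a line of slope -2 in (angle, bin) index
    space, then normalized to unit sum.  Returns (d_angle, d_bin, w) taps."""
    taps = []
    for k in range(-6, 7):
        w = 1.0 if k == 0 else 0.02
        taps.append((-2 * k, k, w))     # slope -2: two angle steps per bin step
    total = sum(w for _, _, w in taps)  # 1.0 + 12 * 0.02 = 1.24
    return [(da, db, w / total) for da, db, w in taps]

def apply_off_focal(sino):
    """Convolve an (angles x bins) sinogram with the kernel, wrapping in
    angle and clipping at the bin edges (boundary handling is our choice)."""
    out = np.zeros_like(sino, dtype=float)
    n_ang, n_bin = sino.shape
    for da, db, w in off_focal_kernel():
        shifted = np.roll(sino, da, axis=0)   # periodic in the angle index
        if db > 0:
            out[:, db:] += w * shifted[:, :-db]
        elif db < 0:
            out[:, :db] += w * shifted[:, -db:]
        else:
            out += w * shifted
    return out
```

Because the taps are normalized to unit sum, a uniform sinogram is left unchanged away from the bin edges, so the kernel redistributes signal without changing its total.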
We reconstructed images by means of four different approaches.
(1) Hanning. Simple discrete deconvolution of the effect of off-focal radiation, followed by beam-hardening correction and fan-beam filtered backprojection (FFBP) reconstruction with a Hanning filter of varying cutoffs.
(2) ATM. Presmoothing of data by means of the ATM filter, followed by simple discrete deconvolution of the effect of off-focal radiation, beam-hardening correction, and reconstruction by FFBP with an unapodized ramp filter. The ATM filter was implemented with cutoff parameter λ = 75.0 pA and baseline parameter δ = 0.05 pA, as in [10] by Hsieh, and with the filter length varying from 3 to 19.

(3) Quadratic. Penalized-likelihood sinogram restoration using the quadratic neighborhood penalty, followed by beam-hardening correction and reconstruction by FFBP with an unapodized ramp filter. The smoothing parameter β was varied from 0.01 to 50.

(4) Median. Penalized-likelihood sinogram restoration using the median penalty, followed by beam-hardening correction and reconstruction by FFBP with an unapodized ramp filter. The smoothing parameter β was varied from 0.01 to 50. The size of the neighborhood used in the median calculation was 3 × 3.

Figure 2 shows typical results of these reconstructions, where we have selected values of the smoothing parameters for the ATM and penalized-likelihood images that give approximately matched resolution at the center insert and approximately matched noise levels at the right insert, based on the resolution-variance results to be described below. It can be seen that the noise level in the low-dose data leads to severe streaking artifacts in the Hanning filter reconstruction and that these are suppressed by the approaches under consideration, more so for the penalized-likelihood approaches than for the ATM approach. All of the approaches appear to perform similarly at the higher dose level. These qualitative impressions were explored quantitatively by use of resolution-variance studies.

Resolution-variance tradeoffs
To characterize resolution, we determined the local edge-spread function at the central and right high-attenuation inserts. The vertical profiles through these structures are well fit by error functions parametrized by a width σ_b, which implies that the effective blurring kernel is Gaussian with standard deviation σ_b. We employ the FWHM of the Gaussian, 2.35 σ_b, as our measure of resolution. To obtain an accurate fit, we performed targeted reconstructions of the central and right high-attenuation inserts with 0.25 mm pixel size from 10 different noise realizations. We then averaged the reconstructions together to obtain relatively low-noise profiles on which to perform the fitting. We characterized noise by calculating the average standard deviation of the pixel values in circular regions of interest (ROIs) of diameter 16.0 mm placed adjacent to, but not overlapping, the central and right high-attenuation inserts. We then plotted the resulting noise measure versus the resolution measure for the same location. Images reconstructed with different values of the smoothing parameter β (for the penalized-likelihood approaches) or of the filter length (for the ATM filter) provide different combinations of such values and allowed us to sweep out a resolution-noise curve.
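The edge-spread fitting procedure can be sketched as follows: an ideal edge blurred by a Gaussian of standard deviation σ_b is an error-function profile, and a simple grid search over σ_b recovers the FWHM 2.35 σ_b. The grid range and the assumption that the edge location and levels are known are simplifications for illustration:

```python
import numpy as np
from math import erf

def edge_model(x, sigma, lo, hi, x0):
    """Ideal edge blurred by a Gaussian of std. dev. sigma: an error-
    function transition from level `lo` to level `hi` centered at x0."""
    return np.array([lo + (hi - lo) * 0.5 * (1.0 + erf((xi - x0) / (sigma * np.sqrt(2.0))))
                     for xi in x])

def fit_fwhm(x, profile, lo, hi, x0, sigmas=np.linspace(0.05, 5.0, 200)):
    """Grid-search least-squares fit of sigma to a measured edge profile,
    returning the FWHM 2.35 * sigma (2.3548 is the exact Gaussian factor;
    the rounded value 2.35 is used here)."""
    errs = [np.sum((edge_model(x, s, lo, hi, x0) - profile) ** 2) for s in sigmas]
    return 2.35 * sigmas[int(np.argmin(errs))]
```

In practice one would also fit the edge location and the two plateau levels; holding them fixed keeps the sketch to a one-parameter search.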
The results are given in Figure 3, where it can be seen that at the low-dose level, the two penalized-likelihood-based approaches both outperform the ATM filter in terms of resolution-noise performance. They perform very similarly to each other, with perhaps a slight advantage to the quadratic penalty. At the high-dose level, the approaches all perform relatively similarly.

CONCLUSIONS
We have compared two different penalty choices for a penalized-likelihood sinogram-restoration strategy we have been developing: one is a quadratic penalty we have employed previously and the other is a median-based penalty. We compared the approaches to a noniterative adaptive filter that loosely but not explicitly models data statistics. We found that the two approaches produced very similar resolution-variance tradeoffs to each other and that they outperformed the ATM filter in the low-dose regime, which suggests that the particular choice of penalty in our approach may be less important than the fact that we are explicitly modeling data statistics at all.
It is not possible to conclude, of course, that the penalized-likelihood approaches would outperform any noniterative adaptive filter. In generating the resolution-variance tradeoffs for the ATM filter, we made use of the parameters λ and δ given by Hsieh in describing the filter in [10] and varied the filter length. The filter length is the most natural parameter to vary in sweeping out resolution-variance curves, but it is possible that further adjusting the parameters λ and δ could further improve the achievable tradeoffs. In particular, it is possible that the version of the ATM filter implemented on GE scanners has been optimized beyond what was presented in [10].
Since the quadratic penalty allows for derivation of a monotonic algorithm guaranteed to increase the penalized-likelihood objective function while the median-penalty approach offers no such guarantee, we find it to be preferable to the median-based penalty. However, it might be worthwhile to explore the new class of median-like priors proposed by Hsiao et al., which make use of an auxiliary field derived from the local medians of the image [20]. Another possibility for future work would be to explore a penalty that uses Hsieh's ATM filter as an OSL prior, much as the median prior was used here.