A Weighted Two-Level Bregman Method with Dictionary Updating for Nonconvex MR Image Reconstruction

Nonconvex optimization has been shown to need substantially fewer measurements than $\ell_1$ minimization for exact recovery under a fixed transform or overcomplete dictionary. In this work, two efficient numerical algorithms, unified under the name weighted two-level Bregman method with dictionary updating (WTBMDU), are proposed for solving $\ell_p$ optimization under the dictionary learning model, subject to fidelity to the partial measurements. By incorporating the iteratively reweighted norm into the two-level Bregman iteration method with dictionary updating (TBMDU), the modified alternating direction method (ADM) efficiently solves the model with the approximated $\ell_p$-norm penalty. Specifically, the algorithms converge after a relatively small number of iterations under the formulation of iteratively reweighted $\ell_1$ and $\ell_2$ minimization. Experimental results on MR image simulations and real MR data, under a variety of sampling trajectories and acceleration factors, consistently demonstrate that the proposed method can efficiently reconstruct MR images from highly undersampled k-space data and has advantages over current state-of-the-art reconstruction approaches in terms of higher PSNR and lower HFEN values.


Introduction
Fast acquisition is an important issue in magnetic resonance imaging (MRI) for avoiding physiological effects and reducing the scanning time for patients. Acquisition time is proportional to the number of acquired k-space samples [1]. Unfortunately, accelerating acquisition by reducing k-space samples leads to noise amplification, blurred object edges, and aliasing artifacts in MR reconstructions. Improving reconstruction accuracy from highly undersampled k-space data therefore becomes a complementary tool for alleviating these side effects of reduced acquisition.
To cope with the loss of image quality, some prior information should be used in the reconstruction procedure. One way of introducing prior information in the reconstruction is known as "regularization" [2-11]. Tikhonov regularization, a commonly used method that pursues reconstructions by $\ell_2$-norm minimization, leads to a closed-form solution that can be implemented numerically in an efficient way [6-8, 11]. Moreover, with the advent of compressed sensing (CS) theory, sparsity-promoting regularization (e.g., $\ell_1$-based regularization) has gained popularity in MRI [1-3, 5, 10, 12]. CS theory states that sparse or, more generally, compressible signals that are incoherently acquired in an appropriate sense can be recovered from a reduced set of measurements far below the Nyquist sampling rate. Exact reconstruction can be achieved by nonlinear algorithms, using $\ell_1$ minimization or orthogonal matching pursuit (OMP) [1-3, 5, 10, 12, 13]. Nonconvex optimization such as $\ell_p$ ($0 < p < 1$) minimization usually gives better recovery by attacking the $\ell_0$ minimization problem more directly [14-16]. Chartrand et al. investigated nonconvex $\ell_p$ ($0 < p < 1$) norm optimization as a relaxation of the $\ell_0$ problem. The authors of [15] developed a homotopic $\ell_0$-minimization strategy to reconstruct MR images and generalized the nonconvex $\ell_p$ ($0 < p < 1$) norm to a family of nonconvex functions as alternatives. Candès et al. presented the iteratively reweighted $\ell_1$ minimization (IRL1) algorithm [19], which amounts to minimizing a linearization of the log penalty of the total variation. Wong et al. [16] further incorporated a seminonlocal prior into homotopic $\ell_0$-minimization for breast MRI.
Besides the prior-inducing penalty function, another important issue in CS-MRI reconstruction, is the choice of a sparsifying transform, since one of the important requirements in CS-MRI is that the image to be reconstructed has a sparse representation in a given transform domain. At the start, sparse representation was usually realized by total variation (TV) regularization and/or wavelet transform [1,12,13,15,20,21]. As it is known, TV prior assumes the target images consist of piecewise constant areas which may not be valid in many practical MRI applications. Other analytically designed bases, such as wavelets and shearlets, also involve their intrinsic deficiencies [20]. Some advanced approaches were developed to address these issues. Knoll et al. introduced the total generalized variation (TGV) for MR image reconstruction problems [22]. Liang et al. [23] applied the nonlocal total variation (NLTV) regularization to improve the signal-to-noise ratio (SNR) in parallel MR imaging, by replacing the gradient functional in conventional TV using a weighted nonlocal gradient function to reduce the blocky effect of TV regularization. Since fixed bases might not be universally optimal for all images, some methods utilizing sparsity prior under adaptive transform/dictionary were developed recently [24][25][26][27][28][29]. The sparsity in this framework is enforced on overlapping image patches to emphasize local structures. Additionally, the global dictionary consisting of local atoms (prototype) is adapted to the particular image instance, which thereby provides better sparsity. Our work also adopts the similar concept as a penalty term for MRI reconstruction.
After reviewing these two important components of CS, an interesting question is whether a better sparsity-inducing function (e.g., the $\ell_p$ ($0 < p < 1$) norm) combined with a better transform (e.g., an adaptive dictionary) will lead to better reconstruction from the same set of measurements. To the best of our knowledge, only one paper, by Shi et al., has preliminarily studied $\ell_0$-approximation sparsity under a learned dictionary [14], where the $\ell_0$ penalty is relaxed by the nonconvex minimax concave (MC) penalty. The MC penalty approximates the $\ell_0$ penalty by gradually increasing a scale parameter. However, their work focused on image processing, and the homotopic strategy they used is computationally demanding [14].
In this paper, we propose a novel method to solve the nonconvex minimization problem in CS-MRI by incorporating the reweighting scheme into our recently developed two-level Bregman method with dictionary updating (TBMDU). A key advantage of the proposed method is that, by incorporating the iteratively reweighted scheme into the two-level Bregman method/augmented Lagrangian (AL) with dictionary updating, the $\ell_p$-based updating step is conducted in each inner iteration of the AL scheme, and both the minimization of the sparsity of image patches and the data-fidelity constraint can be solved in a unified formalism. The modified alternating direction method (ADM) efficiently solves the model with the approximated $\ell_p$ ($0 < p < 1$) norm penalty. Specifically, by applying an iteratively reweighted $\ell_2$ or $\ell_1$ minimization scheme at each AL inner iteration, the proposed algorithm uses adaptive weights to encourage the large coefficients to be nonzero and the small ones to be zero under the learned dictionary, resulting in a more compact and discriminative representation of the reconstructed image patches. Numerical experiments show that the derived method outperforms other existing methods under a variety of sampling trajectories and k-space acceleration factors. In particular, it achieves better reconstruction results than TBMDU without adding any computational complexity.
The rest of this paper is organized as follows. We start with a brief review of TBMDU and the iteratively reweighted scheme for $\ell_p$ minimization in Section 2. The proposed method, WTBMDU, is then presented in Section 3. Several numerical simulation results are given in Section 4 to show the superiority of the proposed method. Finally, conclusions and discussions are given in Section 5.

Theory
Many papers on compressed sensing (CS) [2, 3] have demonstrated that MR images can be reconstructed from very few linear measurements [1-3, 5, 10, 12]. These approaches take advantage of the sparsity inherent in MR images. A well-known example is the success of TV regularization, which implies that MR images can be approximated by images with sparse gradients. Generally, if a sparsifying transform $\Psi$ can be readily defined, ideally one would choose the estimate with the sparsest representation in $\Psi$ that still matches the limited measurements. That is, minimizing the transform-domain sparsity-promoting energy $J(u) = \|\Psi u\|_0$ subject to the data consistency $f = F_p u + \xi$, where $\xi$ is noise, yields the constrained problem (1), namely $\min_u J(u)$ under this data-consistency constraint. Here $u \in \mathbb{C}^{N}$ is a vector representing the $N$-pixel 2D complex image, $F_p \in \mathbb{C}^{Q \times N}$ denotes the partially sampled Fourier encoding matrix, and $f \in \mathbb{C}^{Q}$ represents the acquired data in k-space. The standard deviation $\sigma$ of the zero-mean complex Gaussian noise is $\sqrt{2}$ times the standard deviation of each of its real and imaginary parts. In practice, the $\ell_0$ quasi-norm $\|\cdot\|_0$ in the regularization term of (1) is usually relaxed. In this paper, we propose efficient algorithms for nonconvex relaxations such as the $\ell_p$ quasi-norm $\|\cdot\|_p$, $0 < p < 1$ [17, 18].
The choice of the sparsifying transform is an important consideration in minimizing functional (1). Besides fixed transforms such as finite differences and wavelets, modeling image patches (signals) by sparse and redundant representations has drawn considerable attention in recent years [24]. Consider the image patch set $Ru = [R_1 u, R_2 u, \ldots, R_L u]$ consisting of $L$ signals, where $R_l u \in \mathbb{C}^{M}$ denotes the vectorized $\sqrt{M} \times \sqrt{M}$ patch extracted from the image $u$ of size $\sqrt{N} \times \sqrt{N}$. The sparseland model for image patches suggests that each image patch $R_l u$ can be sparsely represented over a learned dictionary $D^*$ [27], which is obtained as the minimizer of the dictionary learning problem (2), where $\alpha_l$ stands for the sparse coefficient of the $l$th patch. The combination of sparse and redundant representation modeling of signals with a dictionary learned from signal examples has shown its promise in a series of image processing applications [24-29]. For example, in [25, 26], the authors used reference MRI image slices to train the dictionary and achieved limited improvement. Considering that a dictionary learned from a reference image would not be able to effectively sparsify new features in the current scan, Ravishankar and Bresler [29] suggested learning the dictionary from the target image itself and developed a two-step alternating method that removes aliasing artifacts and noise in one step and fills in the k-space data in the other.
In the following, to make the paper self-contained, we review in turn the TBMDU method for MRI reconstruction and the reweighted scheme for solving nonconvex relaxations of the $\ell_0$ quasi-norm minimization.

The TBMDU Framework for MRI Reconstruction.
Recently, we proposed a series of dictionary learning methods [30-32] based on the augmented Lagrangian/Bregman iterative scheme [33]. In particular, by using a two-level Bregman iterative method and the relaxed version of the sparseland model, that is, $J(u) = \sum_{l} \left( \|\alpha_l\|_1 + (\lambda/2) \|D\alpha_l - R_l u\|_2^2 \right)$, as the regularization term for MRI reconstruction [31], the TBMDU method solves the objective function in (1) by iterating the outer-level update (3), where $D = [d_1, d_2, \ldots, d_J] \in \mathbb{C}^{M \times J}$ and $\Gamma = [\alpha_1, \alpha_2, \ldots, \alpha_L] \in \mathbb{C}^{J \times L}$. The parameter $\lambda$ stands for the sparse level of the image patches in the "optimal" dictionary. For many natural or medical images, the value of $\lambda$ can be determined empirically with robust performance. The number of atoms is $J = K \cdot M$, in which $K$ measures the degree of overcompleteness of the dictionary. The TBMDU method consists of a two-level Bregman iterative procedure, where (3) is the outer level. The inner-level Bregman iteration is employed to solve the sparse representation subproblem in (3): auxiliary variables are added to convert the unconstrained problem (2) into the constrained formulation (4), and the split Bregman method/augmented Lagrangian is then used to solve problem (4), which gives the AL function (5). We minimize (5) alternately with respect to one variable at a time. This scheme decouples the minimization process and simplifies the optimization task.
At the dictionary updating step, updating (5) with respect to the dictionary $D$ along the gradient descent direction and constraining each atom to unit norm [24] gives the update rule (7). At the sparse coding stage, the optimization problem (8) is derived for each coefficient vector $\alpha_l$. By applying the iterative shrinkage/thresholding algorithm (ISTA) [33, 34] (i.e., performing a gradient descent step on the first functional term in (8) and then solving a proximal operator problem) and using the intermediate variable defined through (6) to represent the updating scheme compactly, the solution for $\alpha_l$ in (9) is obtained, where the ISTA step constant must exceed the eigenvalues of the Gram matrix of the dictionary and $y = \mathrm{Shrink}(x, \tau) = \dfrac{x \cdot \max(|x| - \tau, 0)}{\max(|x| - \tau, 0) + \tau}$ is the solution of the associated $\ell_1$-regularized least-squares (proximal) problem.
Algorithm 1: TBMDU. (Outer loop while the stopping criterion is not satisfied; step 6 updates the auxiliary variable; the image $u^{k+1}$ is then updated by frequency interpolation.)
The corresponding procedure of TBMDU is summarized in Algorithm 1. Line 10 is the frequency interpolation step for updating the image $u$. Applying the AL scheme and ADM to the constrained problem results in an efficient two-step iterative mechanism, which alternately updates the solution $u$ and the image-patch related variables (the coefficient matrix $\Gamma$ and the other patch-level variables). For a more detailed derivation, interested readers can refer to [31].
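As a hedged illustration of the shrinkage step used in the sparse coding update, the sketch below implements the operator in the quotient form quoted in the text, Shrink(x, tau) = x * max(|x| - tau, 0) / (max(|x| - tau, 0) + tau), for real or complex scalars. The names `shrink`, `shrink_all`, `x`, and `tau` are illustrative choices rather than the authors' code, and the sketch assumes tau > 0 so that the denominator never vanishes.

```python
def shrink(x, tau):
    """Soft-threshold a real or complex scalar x with threshold tau > 0.

    Uses the quotient form x * m / (m + tau) with m = max(|x| - tau, 0),
    which equals x * (|x| - tau) / |x| when |x| > tau and 0 otherwise.
    """
    if tau <= 0:
        raise ValueError("tau must be positive in this sketch")
    m = max(abs(x) - tau, 0.0)
    return x * m / (m + tau)


def shrink_all(values, tau):
    """Apply the shrinkage componentwise to a list of coefficients."""
    return [shrink(v, tau) for v in values]
```

For a complex input the phase is preserved and only the magnitude is reduced by tau, which is why the same expression can serve both the real-valued and the complex-valued dictionaries mentioned in the experiments.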

Review of Reweighted Schemes for Solving Nonconvex Relaxation of the $\ell_0$ Quasi-Norm.
In (1), the $\ell_0$ penalty can be relaxed into several tractable alternatives. For example, nonconvex relaxations such as the $\ell_p$ quasi-norm $\|\cdot\|_p$, $0 < p < 1$ [17, 18], the log penalty [19], and the smoothly clipped absolute deviation (SCAD) penalty [35] are commonly used. Compared to the $\ell_1$ norm, the $\ell_p$ function with $0 < p < 1$ is a smooth relaxation of $\ell_0$, and the geometry of the $\ell_p$ ball gives a better approximation of sparsity. Recent numerical results showed that adopting nonconvex optimization techniques can reduce the number of measurements required for reconstruction [14-16].
Most current approaches for solving nonconvex $\ell_p$ quasi-norm optimization can be classified into two categories: reweighted $\ell_2$ and reweighted $\ell_1$ [36-41]. The common idea of these methods is to use an optimization transfer technique that iteratively replaces the nonconvex potential function with a convex surrogate [37, 39, 41]. Usually, the convex function is chosen as a local quadratic approximation or a local linear approximation. A technique that has drawn much interest is the iteratively reweighted least squares (IRLS) method, which is widely used in compressed sensing and image processing [36, 40]. Another technique closely related to IRLS is the iteratively reweighted norm (IRN) approach introduced by Rodríguez and Wohlberg for the generalized total variation functional [42]. The IRN method minimizes the $\ell_p$ norm $F(u) = (1/p)\|u\|_p^p$ for $p \le 2$ by using a weighted $\ell_2$ norm in an iterative manner. At iteration $k + 1$, the solution $u^{k+1}$ is the minimizer of the weighted $\ell_2$ surrogate (10). The sequence of solutions $\{u^k\}$ converges to the minimizer of $F(u)$ as $k \to +\infty$. It has been proven that IRN is a majorization-minimization (MM) method [42], which has the useful property $F(u^{k+1}) \le F(u^k)$. To extend the iterative reweighting approach IRLS to a general nonconvex function $\varphi$, Mourad and Reilly developed the local quadratic approximation (11) of the nonconvex function [39], which involves a suitably chosen nonnegative parameter and the first derivative $\varphi'$ of $\varphi$, evaluated entry by entry on the vector $u$. A similar work by Candès et al. is the iteratively reweighted $\ell_1$ minimization (IRL1) algorithm [19], which iteratively minimizes the linearization of a quasi-logarithmic potential function. Zou and Li [41] presented the local linear approximation of the general nonconvex function, which gives the upper bound in (12). Equation (12) approximates the penalty function by its first-order Taylor expansion at the current iterate. Its majorization property follows easily from the concavity of the function $\varphi$, which always lies below its tangent. The numerical tests by Chartrand and Yin in [17] demonstrated that IRLS is comparable with IRL1. In [38], the authors presented the weighted nonlinear filter for compressed sensing (WNFCS) for general nonconvex minimization by integrating the IRL1 algorithm in (12) with their previously developed NFCS framework [43].
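To make the reweighted $\ell_1$ idea concrete, the following minimal sketch applies it to a toy scalar problem, minimizing 0.5*(x - y)^2 + lam*|x|^p. This toy objective, the smoothing constant `eps`, the starting point, and all function names are assumptions chosen for illustration; they are not the WNFCS or IRL1 formulations of the cited works. Each pass linearizes |x|^p at the current iterate (a first-order Taylor expansion) and solves the resulting weighted $\ell_1$ problem in closed form.

```python
def soft(y, t):
    """Real soft-thresholding: the exact minimizer of 0.5*(x - y)**2 + t*|x|."""
    if y > t:
        return y - t
    if y < -t:
        return y + t
    return 0.0


def irl1_scalar(y, lam, p, eps=1e-3, iters=50):
    """Approximately minimize 0.5*(x - y)**2 + lam*|x|**p for 0 < p <= 1.

    The weight w = p / (|x| + eps)**(1 - p) is the (smoothed) derivative of
    |x|**p at the current iterate, so lam*w*|x| is the linearized penalty.
    """
    x = y  # start from the unpenalized solution
    for _ in range(iters):
        w = p / (abs(x) + eps) ** (1.0 - p)
        x = soft(y, lam * w)
    return x
```

With p = 1 the weight is constant and the routine reduces to ordinary soft-thresholding. With p < 1 a large input is shrunk less than under $\ell_1$, while a small input is driven to zero, which is the qualitative behavior the reweighting is meant to produce.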

Materials and Methods
As discussed in the previous section, dictionary updating and sparse coding for (5) are performed sequentially. An interesting question is therefore whether a better sparsity-inducing function on the coefficient matrix leads to better image reconstruction under the learned dictionary. In the following, we investigate two variations of the sparse coding step. By employing $\ell_p$ seminorms ($0 < p \le 1$), it is reasonable to expect that approximations closer to the $\ell_0$ quasi-norm than $\ell_1$ will correspondingly reduce the number of measurements required for accurate reconstruction.

Proposed Method: WTBMDU.
By replacing the $\ell_1$ norm with the nonconvex $\ell_p$ norm, (3) can be rewritten in a corresponding $\ell_p$ form. After some manipulations similar to those in Section 2.1, the coefficient update at the sparse coding stage is deduced. With the other variables fixed, the numerical strategies proposed in the following solve the corresponding nonconvex minimization problems with respect to the sparse coefficients by iteratively minimizing a convex majorization of the nonconvex objective function.

Algorithm 2: Reweighed 2 .
Compared to the previous reweighted $\ell_1$ strategy, the only difference is that the reweighted $\ell_2$ approximation in (10) is applied to the second term. The resulting subproblem for the sparse coefficients is a least-squares problem whose minimizer is available in closed form, with weights of the form $w = 1/(|\alpha|^{2-p} + \varepsilon)$ evaluated for each coefficient entry $\alpha$. As in (14), $\varepsilon > 0$ is a small parameter that prevents numerical instabilities. We now summarize our proposed method for MRI reconstruction, which we call weighted TBMDU (WTBMDU). The detailed description of the proposed method is listed in Algorithm 2. Similar to TBMDU, WTBMDU alternately updates the target solution $u$, the image-patch related variables, and the auxiliary variables. The difference between plain TBMDU and weighted TBMDU lies mainly in the weights used in updating the sparse coefficients. In WTBMDU, the weights are obtained by evaluating a function of the variables at the solution of the previous step. Specifically, in both weighting schemes (see (19)), the weight decreases as the absolute value of the coefficient increases, so coefficients of small magnitude are penalized more heavily. This operation therefore strongly encourages the large coefficients to be nonzero and the small ones to be zero. In line 8 for reweighted $\ell_1$ and line 10 for reweighted $\ell_2$, the formulations operate on every element of the matrix componentwise. In the remainder of this paper, we denote the reweighted $\ell_1$ and reweighted $\ell_2$ variants of WTBMDU as WTBMDU-L1 and WTBMDU-L2, respectively.
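The weighting rule described in this paragraph can be sketched in a few lines. The function below evaluates w = 1/(|a|^(2-p) + eps) componentwise; the function name, the argument names, and the input checks are illustrative assumptions, and the sketch covers only the weight computation, not the full sparse coding update.

```python
def l2_weights(coeffs, p, eps):
    """Reweighted-l2 weights w = 1 / (|a|**(2 - p) + eps) for each coefficient a.

    Works for real or complex coefficients because only |a| is used.
    """
    if not 0.0 < p <= 1.0 or eps <= 0.0:
        raise ValueError("this sketch expects 0 < p <= 1 and eps > 0")
    return [1.0 / (abs(a) ** (2.0 - p) + eps) for a in coeffs]


# Weights fall as the coefficient magnitude grows, so small coefficients
# receive the heaviest penalty in the next weighted least-squares step.
example = l2_weights([0.0, 0.1, 1.0, 10.0], p=0.7, eps=0.01)
```

The stabilizing constant also caps the largest possible weight at 1/eps, which is reached only for a coefficient that is exactly zero.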
The strategy used in WTBMDU is similar to that used in WNFCS [38], namely, combining the penalized proximal splitting strategy with the reweighting strategy. The difference is that WNFCS focuses on updates in the simple CS domain and applies the proximal splitting strategy to the Fourier-related data-fidelity term, whereas WTBMDU is devoted to updating the coefficients in the adaptive dictionary and applies the proximal splitting strategy to the dictionary-related sparse representation error term. Additionally, we derive an updating scheme based on reweighted $\ell_2$.

Parameter Values, Continuation Strategy, and Algorithm Convergence.
The proposed method involves four parameters: the two Bregman (or augmented Lagrangian) positive parameters, the sparse level, and the dictionary-update step size. Their setting is similar to that in TBMDU. Firstly, the two Bregman parameters are associated with the overlapping small image patches and with the whole image solution, respectively. It has been mathematically proven that the choice of the Bregman parameter has little effect on the final reconstruction quality as long as it is sufficiently small [31, 44, 45]. In our work, the smaller the value of the Bregman parameter, the more iterations the Bregman method needs to reach the stopping condition. Moreover, since the two Bregman parameters differ by orders of magnitude, we tie them through a fixed ratio in the AL formalism to balance the various "penalty" terms. Secondly, the sparse-level parameter $\lambda$ of the image patches can be determined empirically. Finally, the step size in the dictionary updating stage can be set to a small positive number, for example, 0.01.
In summary, there is only the parameter that should be carefully chosen in order to enable the efficiency of the algorithm. In our method, by taking advantage of the typical structure of the problem in this paper, we propose a similar rule to that used in [21]. Specifically, we set and so as to achieve condition numbers (1+ , / 0 ) and ( + ( / ) ∑ ) that result in fast convergence of the algorithm. Since 0 = /( + ) and ≫ , 0 ≈ . Consequently (1 + , / 0 ) is a decreasing function of .
As for convergence, because of the nonconvexity and nonlinearity of the problem when the dictionary is updated, the global solution may not be found easily, as is also the case in TBMDU. Nevertheless, our dictionary is updated by a gradient descent step on the AL scheme, which leads to a monotonic decrease in the cost function. At the sparse coding stage, since WTBMDU merges the reweighting and penalization strategies, it retains the descent property of the latter, and a local minimum is attained, as demonstrated in [38]. Therefore, either the value of the objective function or the norm of the reconstruction difference between successive iterations can be chosen as the stopping criterion. The convergence behavior of the algorithm is presented in the numerical section.
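The two stopping criteria mentioned above can be sketched as follows. The relative normalization, the tolerance `tol`, and the function names are assumptions made for this example; the text only states that the objective value or the norm of the difference between successive reconstructions may be used.

```python
import math


def relative_change(u_new, u_old):
    """l2 norm of (u_new - u_old), divided by the l2 norm of u_old when nonzero."""
    num = math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(u_new, u_old)))
    den = math.sqrt(sum(abs(b) ** 2 for b in u_old))
    return num / den if den > 0.0 else num


def should_stop(u_new, u_old, obj_new, obj_old, tol=1e-4):
    """Stop when the iterate or the objective value has essentially stalled."""
    small_step = relative_change(u_new, u_old) < tol
    small_decrease = abs(obj_old - obj_new) <= tol * max(abs(obj_old), 1.0)
    return small_step or small_decrease
```

Checking both quantities is a conservative choice for a nonconvex problem, where the objective can flatten out while the iterate is still moving, or the reverse.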

Experiment Results
In this section, we evaluate the performance of the proposed method using a variety of sampling schemes with different undersampling factors. The sampling schemes used in our experiments include 2D random sampling [15], Cartesian approximation of multishot variable-density spiral sampling [15], Cartesian sampling with random phase encodings (1D random) [1, 15], and pseudo radial sampling [15, 29]. Reconstruction results on simulated MRI data, a complex phantom, and real MRI data are presented. The MR images tested in the synthetic experiments are from in vivo MR scans of size 512 × 512 (many of which are courtesy of American Radiology Services, 2009 [Online], available at http://www3.americanradiology.com/pls/web1/wwimggal.vmg/, and were used in [29]), and the real MRI data examples reported here are of size 256 × 256 (except in Figures 6 and 7, where a phantom of size 512 × 512 was used). Following prior work on CS-MRI [1, 13, 29], the CS data acquisition was simulated by subsampling the 2D discrete Fourier transform of the MR images (except in the second subsection, where real acquired data were used). Our proposed method WTBMDU was compared with the leading DLMRI method [29] (code available at https://netfiles.uiuc.edu/ravisha3/www/DLMRICODE.zip) and the TBMDU method [31], which have been shown to substantially outperform other CS-MRI methods such as LDP [1] (Matlab codes available at http://www.stanford.edu/~mlustig/) and zero-filling reconstruction. DLMRI directly solves the $\ell_0$ minimization by OMP, while TBMDU is devoted to $\ell_1$-induced sparse minimization. In each example, the parameters of the DLMRI method were set to their default values.
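The simulated acquisition described above (subsampling the discrete Fourier transform of a fully sampled image) and the zero-filling baseline can be illustrated on a toy 1-D signal. This is a sketch under simplifying assumptions: the experiments use 2-D images and specific sampling trajectories, whereas the example uses a direct 1-D DFT and an arbitrary boolean mask, and all function names are hypothetical.

```python
import cmath


def dft(x):
    """Direct O(n^2) discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft, including the 1/n normalization."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def undersample(kspace, mask):
    """Keep the k-space samples where mask is True and zero the others."""
    return [v if keep else 0j for v, keep in zip(kspace, mask)]


def zero_filled_recon(signal, mask):
    """Simulate undersampled acquisition, then reconstruct by zero-filling."""
    return idft(undersample(dft(signal), mask))
```

With a full mask the signal is recovered up to rounding error. Dropping samples produces the aliasing and blurring that the dictionary-based methods are designed to remove.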
In the experiments, the nominal parameter values were as follows: patch size $\sqrt{M} = 6$, dictionary overcompleteness $K = 1$ (correspondingly $J = 36$), and patch overlap $r = 1$, so that the number of data samples was $L = 65536$ for $\sqrt{N} = 256$ and $L = 262144$ for $\sqrt{N} = 512$. The remaining parameters took the values $(\sqrt{M}/r)^2 = 36$, $0.0056$, $0.01$, $12$, and $3$, with one further parameter defined as a ratio of two others. To avoid division by zero, the parameter $\varepsilon$ used in the weight matrix was decreased by 2% after each inner iteration, starting from an initial value of 5. The setting of the remaining two parameters is very similar to that in [31, 32] and is not discussed here for reasons of space. Real-valued dictionaries were used for the simulated experiments with real-valued images, where the overcomplete discrete cosine transform (DCT) was chosen as the initial dictionary. Complex-valued dictionaries were used for real MR data, where both the real and imaginary parts were the same DCT matrix. The quality of the reconstruction was quantified using the peak signal-to-noise ratio (PSNR), defined as $\mathrm{PSNR} = 20 \log_{10}(255/\mathrm{RMSE})$, where RMSE is the root-mean-square error between the ground truth and the reconstructed image, and the high-frequency error norm (HFEN) [29]. All algorithms were implemented in MATLAB 7.1 on a PC equipped with an AMD 2.31 GHz CPU and 3 GB of RAM.
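As a small worked illustration of two quantities from this paragraph, the sketch below computes the PSNR from its stated definition, 20*log10(255/RMSE), and lists the values of the weight-stabilizing parameter under the stated 2% decay from an initial value of 5. Function and argument names are illustrative, and images are passed as flat lists of pixel values for simplicity.

```python
import math


def psnr(reference, reconstruction, peak=255.0):
    """PSNR in dB, computed as 20*log10(peak / RMSE) over paired pixel values."""
    if len(reference) != len(reconstruction) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum(abs(a - b) ** 2 for a, b in zip(reference, reconstruction)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images: the error is zero
    return 20.0 * math.log10(peak / math.sqrt(mse))


def epsilon_schedule(n_iters, start=5.0, decay=0.98):
    """Stabilizing parameter per inner iteration: start, then 2% smaller each time."""
    return [start * decay ** k for k in range(n_iters)]
```

For example, a uniform error of 25.5 gray levels gives an RMSE of 25.5 and therefore a PSNR of 20 dB.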

Reconstruction of Simulated Data (Real-Valued).
We first investigate the performance of WTBMDU with noiseless measurements. Figure 1 involves an axial T2-weighted reference image of the brain with pseudo radial sampling under 85% and 95% undersampling, respectively (i.e., acquiring only 15% and 5% of the k-space data, with corresponding acceleration factors of 6.67 and 20). The PSNR and HFEN values as functions of iteration number at 85% undersampling are plotted in Figures 1(d) and 1(e). The quantitative measures of both TBMDU and WTBMDU change quickly during the first few iterations. In other words, these measures need fewer iterations to reach the convergence zone, so the iterative convergence of our method is better than that of DLMRI. The higher PSNR values and lower HFEN values after convergence also confirm the superiority of our method over DLMRI and TBMDU. The reconstructions shown in Figures 1(f), 1(g), 1(h), and 1(i) reveal that WTBMDU-L1 with $p = 0.5$ provides more accurate image contrast and sharper anatomical depiction. Compared to DLMRI, the magnitude images of the reconstruction error in Figures 1(h) and 1(i) indicate that our method gives a crisper reconstruction of object edges (the large anatomical structure in the middle region) and preserves finer texture information (the gray matter regions in the bottom right of the reconstruction). In general, our proposed method provides better intensity fidelity to the fully sampled image. More obvious differences in visual quality can be observed at 95% undersampling, as shown in Figures 1(j) and 1(k). The PSNRs obtained by DLMRI and WTBMDU-L1 with $p = 0.5$ are 26.30 dB and 28.13 dB, respectively.
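The relation between undersampling percentage and acceleration factor quoted above is simple arithmetic: the acceleration factor is the full number of k-space samples divided by the number actually acquired. A minimal sketch, with an illustrative function name:

```python
def acceleration_factor(undersampling_percent):
    """Full sample count divided by acquired sample count.

    undersampling_percent is the percentage of k-space samples discarded.
    """
    if not 0 <= undersampling_percent < 100:
        raise ValueError("undersampling percentage must lie in [0, 100)")
    return 100.0 / (100.0 - undersampling_percent)


for pct in (85, 95):
    print(f"{pct}% undersampling -> {acceleration_factor(pct):.2f}-fold")
```

This reproduces the two values in the text: 85% undersampling corresponds to 100/15, about 6.67-fold, and 95% undersampling to 100/5 = 20-fold.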
The DLMRI reconstruction in Figure 1(j), based on k-SVD dictionary updating and greedy pursuit of coefficients, shows a large number of spurious oscillations, although it improves considerably over zero-filling and LDP (not shown in this paper). In contrast, the WTBMDU reconstruction in Figure 1(k) has far fewer spurious oscillations and better preservation of edges, owing to the iterative weighting of the coefficients under the data-adaptive dictionary. This is especially noticeable for the brain's fissures, which show sharper anatomical edges. Figure 2 compares the results generated by DLMRI and WTBMDU using four sampling trajectories at roughly the same undersampling percentage: variable density random with 87% undersampling [15], Cartesian approximation of multishot spiral with 86% undersampling [15], 1D Cartesian trajectory [29], and pseudo radial sampling with 86% undersampling (i.e., 7.11-fold acceleration). The test image is the axial T2-weighted brain image shown in Figure 1(a). As can be observed from the error images, WTBMDU-L1 performs better than DLMRI in reducing aliasing artifacts and maintaining fine details for all tested sampling trajectories. Table 1 lists the PSNR values of the axial T2-weighted brain image for the different sampling trajectories at the same undersampling percentage using DLMRI, TBMDU, and WTBMDU. Generally, the improvements gained by WTBMDU over the other methods differ among the four kinds of trajectories, even at the same undersampling rate. The largest and smallest improvements were achieved with 2D random and radial sampling, respectively, where roughly 5 dB and 2 dB were obtained. This indicates that the efficiency of dictionary learning methods may depend on the incoherence of the data acquisition. Within the family of WTBMDU algorithms, the optimal $p$ value also differs among trajectories for both WTBMDU-L1 and WTBMDU-L2. To balance the numerical calculation and the choice of trajectory, $p = 0.5$ or $p = 0.7$ is a good option.
Figure 3 illustrates the performance of DLMRI, TBMDU, and WTBMDU at a range of acceleration factors (2.5, 4, 6, 8, 10, and 20), where zero-mean complex white Gaussian noise with standard deviation $\sigma = 10.2$ was added to the 2D random sampled k-space [29]. Since the stopping rule for the outer loop of both TBMDU and WTBMDU is determined by the data-fidelity residual $\|F_p u - f\|_2$ falling below a threshold, the numbers of outer iterations of TBMDU and of WTBMDU-L2 with $p = 0.7$ are 3, 3, 6, 7, 7, 11 and 4, 8, 8, 10, 11, 11 for the above six acceleration factors, respectively. For quantitative comparison, the PSNR and HFEN values as functions of acceleration factor are plotted in Figure 3. In the error images, a brighter appearance indicates a larger deviation between the reconstruction and the reference image. WTBMDU (Figure 3(i)) shows fewer pixel errors and less structure loss than DLMRI (Figure 3(g)) and TBMDU (Figure 3(h)), especially in the regions indicated by red arrows. In general, both WTBMDU-L2 and TBMDU preserve details more strongly than DLMRI, while WTBMDU-L2 renders fine structures more crisply than TBMDU. Figure 4 depicts the reconstruction results for a transverse slice of a noncontrast MR angiography (MRA) of the circle of Willis (COW) under the same experimental setting as in Figure 3. The MRA of the circle of Willis contains much textural information, such as the vessels in the middle region and fine-scale details in the bottom region. As can be seen from Figure 4(b), as the acceleration factor increases up to 10-fold, the PSNR gap between TBMDU and WTBMDU increases accordingly. The PSNR improvement is also reflected in Figures 4(e) and 4(f), where far fewer errors appear in the WTBMDU error image, indicating that WTBMDU performs better in maintaining fine details.
On the other hand, as also observed in Figure 3, when the acceleration factor increases to as much as 20-fold, the advantage of the nonconvex optimization degrades, and none of these methods can faithfully reconstruct the original image from so few k-space samples with additional noise.
To investigate the sensitivity of the various methods to different levels of complex white Gaussian noise, DLMRI, TBMDU, and WTBMDU were applied to reconstruct a T2-weighted sagittal view of the lumbar spine under pseudo radial sampling at 6.09-fold acceleration (Figure 5). It can be observed that the skeletons in the top half of the TBMDU reconstruction appear less obscured than those in the DLMRI results. Meanwhile, the reconstruction by WTBMDU is clearer and sharper than those by DLMRI and TBMDU and is relatively free of aliasing artifacts. This reveals that our method provides more accurate image contrast and sharper anatomical depiction in the noisy case. Figure 6 compares the different methods on a physical phantom, which is often used to assess the resolution of MR reconstruction. Figure 6(a) displays the fully sampled reconstruction of the physical phantom. Figures 6(c), 6(d), 6(e), and 6(f) show the results of DLMRI, TBMDU, WTBMDU-L1, and WTBMDU-L2 with $p = 0.7$ at an 80% undersampling ratio. The corresponding PSNR values are 18.66 dB, 26.82 dB, 28.89 dB, and 29.12 dB, respectively. The WTBMDU reconstructions exhibit higher resolution than those of DLMRI and TBMDU and are almost free of aliasing artifacts, especially in the zoomed-in regions.

Reconstruction of Experimental MRI Data (Complex-Valued).
To investigate the noise sensitivity of the proposed method on complex k-space data, complex Gaussian noise of σ = 30 was added to the k-space with 5-fold acceleration. In this case, the PSNR values of DLMRI, TBMDU, and WTBMDU- are shown in Figure 7. The enlargements of two regions of interest (ROIs) are presented in Figures 7(d) and 7(e). As can be observed, the DLMRI reconstruction exhibits more oscillating artifacts than those of the other two methods. Besides, the spot and circle in the bottom right of the WTBMDU reconstruction appear less obscured than those in the DLMRI and TBMDU results.
In Figures 8 and 9, two real brain data sets containing more fine-detailed structures [44,46] were used for comparison. The images in Figures 8(a) and 9(a) are the fully sampled reconstructions of size 256 × 256, used as references. A variable density Cartesian sampling trajectory with 80% undersampling was employed, as shown in Figures 8 and 9. Some regions of error in these subplots are marked with red arrows, indicating that the reconstruction using WTBMDU shows better fidelity to the reference image than those of DLMRI and TBMDU. The enlargements of the reconstruction results with these methods are shown in Figures 8(f) and 9(e). It can be observed that the visible aliasing artifacts along the phase encoding direction (horizontal in the image plane) are more pronounced in the DLMRI reconstruction than in the TBMDU reconstruction, while there are almost no artifacts in that of WTBMDU with p = 0.7. Close inspection of the zoomed-in images indicates that the weighted sparse regularization enhances the reconstruction quality.
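For readers who want to reproduce a comparable sampling pattern, the following sketch builds a variable density Cartesian mask. It is only an illustration under several assumptions that the text does not specify: "80% undersampling" is read as retaining 20% of the phase-encoding lines, phase encoding is taken along the horizontal axis so that whole columns of a centered k-space are kept, a small block of central columns is always sampled, and the remaining columns are drawn from a polynomially decaying density. The function name `variable_density_cartesian_mask` and the parameters `center_fraction` and `power` are hypothetical.

```python
import numpy as np


def variable_density_cartesian_mask(shape, keep_fraction,
                                    center_fraction=0.04, power=3.0, rng=None):
    """Boolean mask over a centered k-space that keeps whole columns.

    Columns near the k-space center are sampled more densely than those
    near the edges; a small central block is always kept.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_rows, n_cols = shape
    n_keep = int(round(keep_fraction * n_cols))
    n_center = min(n_keep, int(round(center_fraction * n_cols)))

    # Always-sampled block around the k-space center.
    center = n_cols // 2
    start = center - n_center // 2
    center_cols = np.arange(start, start + n_center)

    # Polynomially decaying sampling density away from the center.
    distance = np.abs(np.arange(n_cols) - center) / (n_cols / 2)
    density = (1.0 - distance) ** power + 1e-3
    density[center_cols] = 0.0  # central columns are already selected
    density /= density.sum()

    extra_cols = rng.choice(n_cols, size=n_keep - n_center,
                            replace=False, p=density)

    mask = np.zeros(shape, dtype=bool)
    mask[:, center_cols] = True
    mask[:, extra_cols] = True
    return mask


mask = variable_density_cartesian_mask((256, 256), keep_fraction=0.2,
                                       rng=np.random.default_rng(0))
print(f"retained fraction: {mask.mean():.3f}")
print(f"acceleration factor: {mask.size / mask.sum():.2f}")
```

Because only the phase-encoding direction is undersampled in this design, aliasing from such a mask appears along that direction, which matches where the artifacts are reported in the text.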
In summary, the results from our work clearly demonstrate the advantages of introducing the iterative reweighting scheme and the alternating direction method in nonconvex optimization over conventional methods for constrained image reconstruction from undersampled k-space data. Not only for the brain image containing large piecewise constant regions, as shown in Figures 1, 2, and 3, but also for the circle of Willis and lumbar spine, which are composed of many texture features, as shown in Figures 4 and 5, the proposed method provides visually more pleasing results than existing methods. Notably, by means of the Bregman iteration/AL methodology, the proposed method consistently yields superior results without any manually tuned parameters, regardless of the sampling trajectory, the sampling factor, and the presence or absence of noise. We note that, in some circumstances, the PSNR value started to decrease slightly after reaching its highest value. The reason for this phenomenon is that the solution of problem (1) does not necessarily have a higher PSNR than the intermediate iterates.
In real-time imaging applications, where acquisition and reconstruction speed is crucial, a numerical algorithm that converges quickly in its first few iterations is highly desirable.

Conclusions and Discussions
This work presents a new regularizer combining the nonconvex ℓp pseudonorm and an adaptive dictionary for MRI reconstruction from undersampled k-space data. Building on the success of TBMDU in the AL framework for CS-MRI, the proposed WTBMDU method extends the plain TBMDU to handle nonconvex regularization by weighting the sparse coding of the coefficients. Employing the iteratively reweighted ℓ1 or ℓ2 technique results in the WTBMDU-L1 and WTBMDU-L2 algorithms, respectively. High-accuracy reconstructions are obtained from significantly undersampled k-space data by minimizing the adaptive sparsity-promoting regularization criterion subject to the data-consistency constraint. In particular, compared to the DLMRI method, which minimizes the ℓ0 regularization by the greedy matching pursuit algorithm, the iteratively reweighted strategy combined with the AL framework yields much higher PSNR values. On the other hand, compared to the plain TBMDU for ℓ1-norm minimization, the adaptively weighted method achieves a 0.3-2.5 dB PSNR improvement and provides visually more pleasing results. Various experimental results demonstrate the superior performance of the method under a variety of sampling trajectories and k-space acceleration factors.
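The reweighting idea behind the two variants can be illustrated with a short sketch. The weight formulas below are the forms commonly used in the iteratively reweighted ℓ1 and ℓ2 literature for approximating an ℓp penalty with 0 < p < 1; the exact weights, smoothing parameter, and thresholding step used in WTBMDU-L1 and WTBMDU-L2 may differ, and all function names (`irl1_weights`, `irl2_weights`, `weighted_soft_threshold`) as well as the values of `eps` and `lam` are assumptions made for illustration.

```python
import numpy as np


def irl1_weights(coeffs, p, eps):
    """Reweighted-l1 weights: sum(w * |a|) approximates sum(|a|**p)."""
    return (np.abs(coeffs) + eps) ** (p - 1.0)


def irl2_weights(coeffs, p, eps):
    """Reweighted-l2 weights: sum(w * |a|**2) approximates sum(|a|**p)."""
    return (np.abs(coeffs) ** 2 + eps) ** (p / 2.0 - 1.0)


def weighted_soft_threshold(coeffs, weights, lam):
    """Soft thresholding with a separate threshold lam * w_i per coefficient."""
    magnitude = np.abs(coeffs)
    scale = np.maximum(magnitude - lam * weights, 0.0) / np.maximum(magnitude, 1e-12)
    return coeffs * scale


coeffs = np.array([3.0, 0.5, 0.05])
p, eps, lam = 0.7, 1e-3, 0.1

w1 = irl1_weights(coeffs, p, eps)
w2 = irl2_weights(coeffs, p, eps)
shrunk = weighted_soft_threshold(coeffs, w1, lam)

print("reweighted-l1 weights:", w1)
print("reweighted-l2 weights:", w2)
print("weighted shrinkage:", shrunk)
```

Small coefficients receive large weights and are pushed toward zero, while large coefficients receive small weights and are shrunk less than under a uniform ℓ1 threshold. With p = 1 the reweighted-ℓ1 weights all equal one and the step reduces to ordinary soft thresholding.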
WTBMDU is very general since it represents an algorithmic framework that can be easily adapted to the different reweighting strategies and nonconvex sparsity-inducing functions proposed in the literature. Some extensions will be studied in the future: (i) for the reweighted ℓ1 strategy, using different nonconvex objective functions as in [38,41]; in addition, the shrinkage can be generalized as in [47,48]; (ii) for the reweighted ℓ2 strategy, trying different nonconvex objective functions as in [39]. Additionally, the proposed scheme can be easily extended to incorporate other priors (e.g., the spike and slab prior used in [49], where the regularization includes the ℓ2-norm) and can be combined with the homotopic technique described in [14][15][16], where the value of p slowly decreases from 1 to 0.
Another extension we will consider in the future is to apply our proposed model to parallel MRI reconstruction. We note that a very recently published paper by Ramani and Fessler [21] addressed regularized SENSE reconstruction using AL methods. To solve the unconstrained SENSE reconstruction problem, consisting of a TV/wavelet regularization term and a data-fidelity term, efficiently within the AL formalism, they split not only the regularization term but also the Fourier encoding and spatial components in the data-fidelity term, introducing auxiliary variables to decouple the data-domain components and the regularization component, respectively. We believe that the regularizer based on sparse representations of overlapping image patches in our work can be naturally incorporated into their work in a unified AL framework.