Recent Development of Dual-Dictionary Learning Approach in Medical Image Analysis and Reconstruction

As an implementation of compressive sensing (CS), the dual-dictionary learning (DDL) method restores signals using two related dictionaries and a sparse representation. It has been shown to perform well in medical image reconstruction from highly undersampled data, especially for multimodality imaging such as CT-MRI hybrid reconstruction. Because of its strong performance, short signal acquisition time, and low radiation dose, DDL has attracted broad interest in both academia and industry. In this review, we summarize the development history of DDL, survey the latest advances, and discuss future directions and potential applications in medical imaging. We also point out that DDL is still at an early stage and that further study is needed to improve the method, especially in dictionary training.


Introduction
Compressive sensing (CS) is a novel theory of information acquisition and processing [1]. Since general signals are broadband, traditional signal reconstruction methods usually adopt Nyquist sampling, which requires a high sampling rate and a long processing time. CS theory, however, offers a way to restore a signal accurately from fewer measurements by solving an optimization problem in which the signal is sparsely represented in a basis matrix and the high-dimensional signal is projected onto a lower-dimensional subspace. As a result, CS theory has been widely recognized and applied in various fields.
Some groups focus on CS applications and have developed various branches such as Bayesian CS and 1-bit CS [2][3][4]. In medical image reconstruction, CS has proven effective at retaining high image quality from undersampled measurement data in different imaging modalities, including computed tomography (CT) and magnetic resonance imaging (MRI) [5][6][7]. In addition, CS theory shows great potential in multimodality image reconstruction, one of the future directions of medical imaging.
Dictionary learning (DL) is a typical method of CS image reconstruction. In this method, the sampled data are compressible in a specific transform domain, and the transform coefficients are projected onto a lower-dimensional vector that retains the essential image information. As a result, a complex reconstruction problem is simplified to an optimization problem. Solving image reconstruction problems with DL methods usually involves three tasks. First, design an overcomplete dictionary that can represent a signal sparsely. Second, obtain a measurement matrix that satisfies the restricted isometry property. Third, develop a fast and robust signal reconstruction algorithm. The designed dictionary is important to the accuracy of CS image reconstruction. In the DL method, the dictionary is self-adaptive and flexible; it is trained on particular image samples or groups of images. The image sparseness obtained can differ considerably between training methods [8].
Although the DL-based approach is well recognized in medical image reconstruction, applying a single dictionary throughout the whole imaging process limits image quality; a single dictionary is far from sufficient as prior information. To improve image quality, researchers have extended the DL method to dual-dictionary learning (DDL), which carries more diverse prior information in imaging modalities such as CT and MRI. The DDL method was initially developed for image super-resolution. Lu et al. [9,10] applied it to CT reconstruction, and Song et al. [11] used it in 3D MRI reconstruction. DDL thus shows great potential in medical image reconstruction.
In this paper, we discuss the DL method in Section 2. Building on the DL method, we review the history and recent development of DDL in Section 3, including its theory, feasibility demonstration, and applications in different fields. In Section 4, we discuss the use of DDL in medical image analysis. In the Discussion and Conclusion section, we summarize the algorithms and explore future directions in medical image reconstruction.

DL Method and Theory.
According to CS theory, an undersampled image reconstruction problem amounts to solving an underdetermined system of linear equations $y = \Phi x$ by minimizing the $\ell_0$ quasi-norm (i.e., the number of nonzeros) of the sparsifying transform $\Psi x$; that is, the image is assumed to be sparse under a complete sparsifying transform $\Psi \in \mathbb{R}^{P \times P}$, where $P$ is the number of image pixels. The corresponding optimization problem is

$$\min_{x} \|\Psi x\|_0 \quad \text{s.t.} \quad y = \Phi x. \tag{1}$$

In (1), $x$ is the image to be reconstructed and $\Phi$ is the codebook for the given measurements $y$. Equation (1) is also known as a sparse coding problem, which is NP-hard (nondeterministic polynomial-time hard). It can be solved approximately by greedy algorithms, for example, orthogonal matching pursuit (OMP) [12]. Notably, if the $\ell_0$ norm is replaced with the $\ell_1$ norm, the problem can be solved by linear programming in the real domain or by second-order cone programming in the complex domain.
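To make the greedy strategy concrete, the following is a minimal Python sketch of orthogonal matching pursuit for a generic dictionary and signal. It is an illustration under stated assumptions rather than the implementation of [12]: the function name `omp`, its parameters `max_atoms` and `tol`, and the random test dictionary are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def omp(D, x, max_atoms, tol=1e-10):
    """Greedy sparse coding: approximate x with at most max_atoms columns of D."""
    D = np.asarray(D, dtype=float)
    x = np.asarray(x, dtype=float)
    norms = np.linalg.norm(D, axis=0)
    norms[norms == 0] = 1.0                  # guard against all-zero atoms
    residual = x.copy()
    support = []                             # indices of the selected atoms
    coef = np.zeros(0)
    for _ in range(max_atoms):
        if np.linalg.norm(residual) <= tol:
            break
        # select the atom most correlated with the current residual
        corr = np.abs(D.T @ residual) / norms
        if support:
            corr[support] = -np.inf          # never select the same atom twice
        support.append(int(np.argmax(corr)))
        # re-fit all selected coefficients by least squares
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    alpha = np.zeros(D.shape[1])
    if support:
        alpha[support] = coef
    return alpha

# Hypothetical example: a signal built from a single atom of a random dictionary.
rng = np.random.default_rng(0)
D = rng.standard_normal((32, 64))            # 64 atoms of length 32
x = 2.5 * D[:, 7]
alpha = omp(D, x, max_atoms=3)
print(np.flatnonzero(alpha))                 # indices of the selected atoms
```

The least-squares re-fit after each selection is what distinguishes OMP from plain matching pursuit; the stopping tolerance plays the role of the error bound in the constrained formulation.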
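Given an image of size $N \times N$, it can be decomposed into small patches of size $p \times p$, $p \ll N$. Each patch can be expressed as an $n = p^2$ dimensional vector $x \in \mathbb{R}^{n}$. All patches are extracted from the object image according to the patch size and the sliding distance. A dictionary $D \in \mathbb{R}^{n \times K}$ is a matrix whose columns are the atoms $d_k \in \mathbb{R}^{n}$, $n = p \times p$. Because the atoms are patch vectors taken from sample images, the initial dictionary constructed from the extracted patches is usually redundant or overcomplete; that is, $n \ll K$. Using specific atoms of the initial dictionary $D$, each patch vector $x_j$ in the image can be approximately represented by a sparse coefficient vector [13]. Consider

$$\|x_j - D\alpha_j\|_2^2 \le \varepsilon, \tag{2}$$

where $\varepsilon > 0$ is the error bound and $\alpha_j \in \mathbb{R}^{K}$ is the sparse representation vector, which has few nonzero elements: $\|\alpha_j\|_0 \ll n \ll K$, $j = 1, 2, \ldots, J$. To get the sparse representation of the vector $x_j$, one can minimize the $\ell_0$ norm as

$$\min_{\alpha_j} \|\alpha_j\|_0 \quad \text{s.t.} \quad \|x_j - D\alpha_j\|_2^2 \le \varepsilon. \tag{3}$$

If an image contains $J$ patches, DL is to find a dictionary $\tilde{D}$ in which all the patches are sparsely represented, as follows:

$$\min_{D, \alpha} \sum_{j=1}^{J} \|\alpha_j\|_0 \quad \text{s.t.} \quad \|x_j - D\alpha_j\|_2^2 \le \varepsilon, \quad j = 1, 2, \ldots, J. \tag{4}$$

Usually, if $\nu_j$ is fixed to a specific value, this constrained problem is equivalent to solving the following problem:

$$\min_{D, \alpha} \sum_{j=1}^{J} \left( \|x_j - D\alpha_j\|_2^2 + \nu_j \|\alpha_j\|_0 \right). \tag{5}$$

2.2. Dictionary Construction. The DL problem is NP-hard because it reduces to a sparse coding problem when $D$ and $x$ are fixed. To date, four main adaptive dictionary training algorithms have been proposed to solve such a dictionary learning problem.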
(1) Direct method (DM): DM is the original method. Because patches are extracted directly, it preserves all the details in the sample images, and a target image can be fully recovered when the patches are well chosen. This method is usually effective in super-resolution image reconstruction.
(2) Method of optimal directions (MOD): MOD fixes the coefficients corresponding to the dictionary vectors and then updates the atoms by minimizing the residuals between the training vectors and their representations. The main advantage of MOD is that it gives the optimal adjustment of the dictionary vectors in each iteration. It usually provides better convergence properties for ECG (electrocardiogram) signals [14].
(3) Generalized principal component analysis (GPCA): GPCA is a general method for modeling and segmenting mixed data using a collection of subspaces. By introducing certain algebraic models and techniques into data clustering, traditionally a statistical problem, GPCA offers a new spectrum of algorithms for data modeling and clustering [15].
(4) K-means singular value decomposition (K-SVD): K-SVD is an iterative method that updates the dictionary atoms to fit the data better. It performs an SVD on the representation error and updates the current dictionary atom and its coefficients simultaneously, using the term with the minimum error. As the most widely used method for dictionary training, K-SVD has excellent convergence and sparsity [16].
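The dictionary update steps of MOD and K-SVD described in the list above can be sketched as follows. This is a minimal illustration assuming NumPy and a given coefficient matrix; the function names `mod_update` and `ksvd_atom_update` and the convention that patches are stored as columns are hypothetical, and the sparse coding stage that alternates with these updates is omitted.

```python
import numpy as np

def mod_update(X, A):
    """MOD: with the codes A fixed, the least-squares optimal dictionary is X A^+.

    Atoms are rescaled to unit norm and the codes are rescaled accordingly,
    so the product D @ A is unchanged by the normalization.
    """
    D = X @ np.linalg.pinv(A)
    norms = np.linalg.norm(D, axis=0)
    norms[norms == 0] = 1.0                   # unused atoms stay at zero
    return D / norms, A * norms[:, None]

def ksvd_atom_update(X, D, A, k):
    """K-SVD: refit atom k and its nonzero coefficients by a rank-1 SVD."""
    D, A = D.copy(), A.copy()
    users = np.flatnonzero(A[k, :])           # patches that currently use atom k
    if users.size == 0:
        return D, A                           # unused atom: leave it unchanged
    # representation error of those patches with atom k's contribution removed
    E = X[:, users] - D @ A[:, users] + np.outer(D[:, k], A[k, users])
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    D[:, k] = U[:, 0]                         # new unit-norm atom
    A[k, users] = s[0] * Vt[0, :]             # new coefficients on the same support
    return D, A
```

Both updates are designed not to increase the Frobenius-norm residual between the training patches and their representations: MOD solves a least-squares problem for the whole dictionary at once, whereas K-SVD replaces one atom at a time by the best rank-1 fit while keeping the sparsity pattern fixed.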
Dictionary learning can be used to reconstruct images; a classic algorithm is summarized in Figure 1. Given an initial value (the initial dictionary), dictionary learning is performed with an appropriate training method to obtain the sparse representation; the image is then updated under a specific transform (e.g., wavelet or Fourier), and the result is output after several iterations.

From Single to Dual-Dictionary. The DL method is widely used in image restoration [17][18][19], super-resolution reconstruction [20][21][22][23], image deblurring [24][25][26], denoising [27][28][29][30][31][32], medical image reconstruction [13,33], image prediction [34], and image inpainting [35]. However, both the dynamic atoms in each iteration step and the noise in the measurement data increase the iteration time, making the DL method slow in most cases. To address this inefficiency, some researchers have proposed introducing two or more dictionaries so that image quality can be further improved in less time.
One of the improved methods is dual-dictionary learning (DDL). DDL theory was first introduced by Curzion et al. as PADDL, which aims to train a linear mapping in the case of a single dictionary. Note that this method does not use two different dictionaries; it trains one dictionary together with its "dual" dictionary. The essential concept of PADDL is to update the dictionary $D = [d_1, \ldots, d_K] \in \mathbb{R}^{n \times K}$ by means of its "dual" dictionary $C = [c_1, \ldots, c_K]^{T} \in \mathbb{R}^{K \times n}$, which serves as an auxiliary term. It aims to find an optimal pair of linear operators by minimizing the following:

$$E(D, C, U) = \frac{1}{n} \|X - DU\|_F^2 + \frac{\eta}{K} \|U - CX\|_F^2 + \frac{2\tau}{K} \sum_{j=1}^{J} \|u_j\|_1, \tag{6}$$

where $X \in \mathbb{R}^{n \times J}$ is the matrix to be trained and $U = [u_1, \ldots, u_J] \in \mathbb{R}^{K \times J}$ is the representation. The rows of $C$ can be treated as filters that approximate the optimal $U$. $\eta$ and $\tau$ are the weight parameters.
The results show that this dual-dictionary training method works well for computing sparse representations [36].

DDL in Super-Resolution Reconstruction. Zhang et al. proposed an efficient sparse representation method for image super-resolution reconstruction via DDL [37]. They assume that image patches at different resolutions share the same underlying sparse representation. Thus, given a dictionary pair $\{D_h, D_l\}$, where $h$ stands for high resolution and $l$ stands for low resolution, the sparse representation of a patch $x_l$ from the low-resolution image is obtained in a way similar to (3). Consider

$$\min_{\alpha} \|\alpha\|_0 \quad \text{s.t.} \quad \|x_l - D_l \alpha\|_2^2 \le \varepsilon. \tag{7}$$

With the sparse representation vector $\alpha$, the high-resolution patch can be approximately expressed as $x_h = D_h \alpha$. All the high-resolution patches are then put back into their corresponding positions and normalized, which finally gives the estimate of the high-resolution image.
The optimization model for learning coupled dictionaries with "dual" is as follows: in which and are the dimensions of the high- and low-resolution patches. = as mentioned in Section 2.1. After multiplying = by ℎ , we acquire the high-resolution patch . In this method, and ℎ are treated as one dictionary and trained simultaneously with their dual, which refers to and ℎ .
With the approximate sparse coding procedure via model (8), the results show that their method speeds up the overall super-resolution process significantly.

Wang et al. reconstructed the high-frequency (HF) details from low-resolution images using prior models. The HF content is decomposed into two components, the main high-frequency (MHF) and the residual high-frequency (RHF). Wang et al. restored the MHF and RHF separately with a dual-dictionary and then added them together. For dictionary construction, K-SVD was used to train the two dictionaries. The experimental results reveal that the PSNR values are better than those of bicubic interpolation and of the sparse representation algorithm.
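Since the comparison above is reported in terms of PSNR, a minimal sketch of how this metric is commonly computed may be helpful. The function name `psnr` and the default peak value of 255 (8-bit images) are assumptions for illustration; the cited work may use a different intensity range.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")                   # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

A higher PSNR indicates a smaller mean squared error relative to the reference image, so a reconstruction method is considered better when it raises this value.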

DDL in Human Pose Estimation.
Ji and Su proposed a new method for robust 3D human pose estimation using DDL [39]. In their study, they simultaneously constructed two dictionaries, a visual observation dictionary and a body configuration dictionary. The two dictionaries share the same sparse representation for every visual observation and its corresponding 3D body pose.
Since outline features are usually corrupted, the optimization model for robust human pose estimation is as follows: min , , , where ∈ R × for observation data matrix, ∈ R × for observation dictionary, ∈ R × for 3D pose data matrix, and ∈ R × for body configuration dictionary. ∈ R × for common sparse representation of and , and is the corruption term to be minimized. To solve problem (9), Ji and Su used an inexact Augmented Lagrange Multiplier (IALM) method to update the two dictionaries. More details on the IALM method can be found in [29].
The experimental results show that their approach performs well in recovering outlines from corrupted data compared with other methods.

DDL Algorithm in Medical Image Reconstruction
Recently, DDL has gained attention in medical image reconstruction, where it can improve image quality and accelerate the reconstruction process.

Method and Theory.
Let $X_l$ be a low-quality image with $X_l = [x_l^1, x_l^2, \ldots, x_l^J]$, and let $D_l \in \mathbb{R}^{n \times K}$ be a low-quality dictionary constructed from $X_l$. Similarly, let $X_h$ be the high-quality counterpart of $X_l$ with $X_h = [x_h^1, x_h^2, \ldots, x_h^J]$, and let $D_h \in \mathbb{R}^{n \times K}$ be constructed from $X_h$. The correspondence between $X_l$ and $X_h$ can be described by the following general model:

$$X_l = L X_h + v, \tag{10}$$

where $v$ is the noise and $L$ is the transform operator. For a specific $X_h$, we can assume that each patch $x_h^j$ in $X_h$ can be expressed as a linear combination of the atoms in the dictionary $D_h$:

$$x_h^j = D_h \alpha_j + e_j, \tag{11}$$

where $e_j$ is the error, $\|e_j\|_2^2 < \varepsilon$, and $\alpha_j$ is the sparse coefficient vector, $\|\alpha_j\|_0 \ll K$. Combining (11) and (10) gives

$$x_l^j = L x_h^j + v_j = L D_h \alpha_j + L e_j + v_j = L D_h \alpha_j + \tilde{e}_j, \tag{12}$$

where $\tilde{e}_j = L e_j + v_j$. According to the above derivation, which is referred to as the Sparse-Land model, the low-quality patch $x_l^j$ can be sparsely coded by the same vector $\alpha_j$ under the dictionary $D_l = L D_h$. Thus, given the dictionaries $D_l$ and $D_h$ with accurately one-to-one mapped atoms, we can approximately recover $x_h^j$ simply by multiplying $D_h$ and the sparse representation $\alpha_j$ obtained from $x_l^j$ as follows:

$$x_h^j \approx D_h \alpha_j. \tag{13}$$

The general workflow of the DDL method in medical image reconstruction is summarized in Figure 2. Given two sets of measured data (high-resolution sample images and low-resolution sample images), we can obtain two dictionaries $D_l$ and $D_h$ using an appropriate training method (DM, MOD, GPCA, or K-SVD). When measured data are input, we can obtain the sparse representation $\alpha$ with $D_l$ and then update the image using $D_h$.
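The recovery step described above, coding a low-quality patch against the low-quality dictionary and then synthesizing with the high-quality dictionary, can be sketched in Python as follows. Everything specific here is a hypothetical illustration: the function names `sparse_code` and `ddl_recover`, the random high-quality dictionary, and the 3-tap circular blur used as the transform operator are assumptions, and a simple OMP routine stands in for whichever sparse coder a given study uses.

```python
import numpy as np

def sparse_code(D, x, max_atoms, tol=1e-10):
    """Simple OMP: greedy atom selection followed by a least-squares re-fit."""
    norms = np.linalg.norm(D, axis=0)
    norms[norms == 0] = 1.0
    residual, support, coef = x.copy(), [], np.zeros(0)
    for _ in range(max_atoms):
        if np.linalg.norm(residual) <= tol:
            break
        corr = np.abs(D.T @ residual) / norms
        if support:
            corr[support] = -np.inf           # do not reselect an atom
        support.append(int(np.argmax(corr)))
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    alpha = np.zeros(D.shape[1])
    if support:
        alpha[support] = coef
    return alpha

def ddl_recover(x_low, D_low, D_high, max_atoms):
    """Code the low-quality patch with D_low, then synthesize with D_high."""
    alpha = sparse_code(D_low, x_low, max_atoms)
    return D_high @ alpha, alpha

# Hypothetical setup: paired dictionaries linked by a known degradation operator.
rng = np.random.default_rng(0)
n, K = 64, 128
D_high = rng.standard_normal((n, K))
shift = np.roll(np.eye(n), 1, axis=1)
L = 0.5 * np.eye(n) + 0.25 * shift + 0.25 * shift.T   # assumed circular blur
D_low = L @ D_high                                     # atoms stay one-to-one

alpha_true = np.zeros(K)
alpha_true[[5, 40, 99]] = [1.5, -2.0, 0.8]
x_high = D_high @ alpha_true
x_low = L @ x_high
x_rec, alpha = ddl_recover(x_low, D_low, D_high, max_atoms=3)
print(np.linalg.norm(x_rec - x_high) / np.linalg.norm(x_high))  # relative error
```

The key design point is that the two dictionaries are indexed identically, so a coefficient found for atom $k$ of the low-quality dictionary is applied to atom $k$ of the high-quality dictionary; any mismatch in this pairing propagates directly into the recovered patch.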

DDL in CT Reconstruction. Computed tomography (CT) reconstruction is the process of obtaining tomographic images of the human body from X-ray projection data. Reconstruction methods can be divided into two types, analytic and iterative. In recent years, CS-based iterative methods have been applied to 3D X-ray image reconstruction, where they are more flexible and accurate than analytic methods in most cases. Typical topics include the interior CT problem, low-dose imaging, and incomplete data reconstruction [40][41][42][43][44]. Lu et al. made progress in few-view CT image reconstruction (SART-TV-DL) [9, 10] using DDL. Since each pair of corresponding sample images is reconstructed from the same object and differs only in the number of projection views, a high-quality image and its low-quality counterpart have the relationship described in (10).
In their work, a set of high-quality images reconstructed with the SART algorithm from adequate projections was used to construct a high-quality dictionary $D_h$; correspondingly, following the pixel-to-pixel mapping rule, a low-quality dictionary $D_l$ was generated from a set of blurry images reconstructed from undersampled projection data. To solve the dictionary training problem, they used the DM described in Section 2.2 because it preserves most details of the sample images. Moreover, this method generates dictionaries most easily and quickly.
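The construction of a paired dictionary by the direct method can be sketched as below: patches are extracted at identical positions from a high-quality image and its low-quality counterpart, so that column $k$ of one dictionary corresponds to column $k$ of the other. The function names, the patch size, and the stride are hypothetical choices for illustration and are not taken from [9, 10].

```python
import numpy as np

def extract_patches(image, patch, stride):
    """Return all patch-by-patch blocks as columns, scanned row by row."""
    rows, cols = image.shape
    blocks = [image[r:r + patch, c:c + patch].reshape(-1)
              for r in range(0, rows - patch + 1, stride)
              for c in range(0, cols - patch + 1, stride)]
    return np.stack(blocks, axis=1)

def direct_dictionary_pair(high_img, low_img, patch=8, stride=4):
    """Direct method: co-located patches become one-to-one matched atoms."""
    if high_img.shape != low_img.shape:
        raise ValueError("the two sample images must be pixel-to-pixel aligned")
    D_high = extract_patches(high_img, patch, stride)
    D_low = extract_patches(low_img, patch, stride)
    return D_high, D_low
```

Because no training is involved, the cost of this construction is essentially the cost of copying patches, which is consistent with the observation that the direct method is the easiest and fastest way to generate dictionaries.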
However, in a CT image, pixel values alone cannot reflect the relationship between two adjacent pixels. Therefore, in addition to DM, they combined the pixel values with the first-order gradient vectors along the $x$ and $y$ directions to provide more information about each image patch. That is, if an image patch is of size $\sqrt{n} \times \sqrt{n}$, each atom in the dictionary has $3n$ features because of the gradients. As the dictionaries were redundant or overcomplete, they reduced the redundancy by setting a minimum Euclidean distance threshold. The real-data results demonstrate the potential of the SART-TV-DL algorithm in CT image reconstruction with 30-50 views. It contributes to some preclinical and clinical applications such as C-arm, breast CT, and tomosynthesis.

Different from Lu's work, Cao and Xing applied DDL to limited-angle CT reconstruction [45]. In their work, a two-dictionary learning (ART-TV-TDL) algorithm was proposed to remove the limited-angle artifacts. The two dictionaries were an object dictionary learned from a high-quality training image and an artifact dictionary learned from an artifact image. A limited-angle reconstruction, which could be divided into the object part and the artifact part , had the different sparse representation coefficients with and as follows: Here and are the sparse coefficient with 1 and 2 sparsity; the training method was K-SVD in this work. To get a better image with restrained artifacts, they combined these two representations for iterative reconstruction. Consider where , , and are parameters to balance the effect.
Their results show that the ART-TV-TDL method has smaller RMSE values than the ART-TV method at different limited angles (90° and 120°).
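The feature construction and redundancy reduction described above for SART-TV-DL can be illustrated with a short sketch. The details are assumptions: `np.gradient` (finite differences) stands in for whatever gradient operator the authors used, the greedy pruning order is a hypothetical choice, and the names `patch_features` and `prune_atoms` are illustrative only.

```python
import numpy as np

def patch_features(patch):
    """Stack pixel values with first-order x and y gradients (3n features)."""
    patch = np.asarray(patch, dtype=float)
    grad_y, grad_x = np.gradient(patch)       # axis 0 is y (rows), axis 1 is x
    return np.concatenate([patch.ravel(), grad_x.ravel(), grad_y.ravel()])

def prune_atoms(D, min_dist):
    """Greedily keep atoms at least min_dist (Euclidean) from all kept atoms."""
    kept = []
    for j in range(D.shape[1]):
        if all(np.linalg.norm(D[:, j] - D[:, i]) >= min_dist for i in kept):
            kept.append(j)
    return np.array(kept, dtype=int)
```

For a dictionary pair, the indices returned by `prune_atoms` would be applied to both dictionaries so that the one-to-one correspondence between their atoms is preserved.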

DDL in 3D MRI Reconstruction. Song et al. proposed a novel method for multislice (3D) MRI reconstruction from undersampled k-space data using dual-dictionary learning (Dual-DL-MRI) [11]. A high-resolution $N_1 \times N_2 \times N_3$ MRI image series $X_h$ can be represented as one vector $x_{\text{high}} \in \mathbb{R}^{P \times 1}$ of length $P$, and its undersampled k-space measurements are obtained by the Fourier transform $y = \tilde{F} x_{\text{high}}$, where $\tilde{F}$ is a three-dimensional undersampling Fourier matrix. Therefore, the corresponding series $x_{\text{low}} \in \mathbb{R}^{P \times 1}$ can be reconstructed from the undersampled k-space data by the inverse Fourier transform as follows:

$$x_{\text{low}} = \tilde{F}^{H} y = \tilde{F}^{H} \tilde{F} x_{\text{high}}. \tag{16}$$

As we can see, (16) is one form of (10), which demonstrates the feasibility of the dual-dictionary approach in MRI reconstruction.
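The relation between the high-resolution series and its low-quality counterpart can be illustrated with a zero-filled reconstruction. The sketch below assumes a binary sampling mask and an orthonormal FFT as the model of the undersampling Fourier operator; the function names and the mask are hypothetical and are not taken from [11].

```python
import numpy as np

def undersample_kspace(volume, mask):
    """Forward model: orthonormal 3D FFT followed by a binary sampling mask."""
    return mask * np.fft.fftn(volume, norm="ortho")

def zero_filled_recon(kspace, mask):
    """Adjoint of the forward model: unsampled entries are set to zero
    and the orthonormal inverse FFT is applied."""
    return np.fft.ifftn(mask * kspace, norm="ortho")
```

With a fully sampled mask the volume is recovered exactly, while with an undersampled mask the result is the aliased low-quality series that the low-quality dictionary is meant to represent.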
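To construct the dual-dictionary, they used the K-SVD method to train the two dictionaries simultaneously to ensure the matching accuracy (one-to-one correspondence); $D_l$ and $D_h$ can be obtained by

$$\min_{D, \alpha} \|X - D\alpha\|_F^2 \quad \text{s.t.} \quad \|\alpha_j\|_0 \le T_0, \quad j = 1, 2, \ldots, J, \tag{17}$$

where $X = \begin{bmatrix} x_{\text{low}} \\ x_{\text{high}} \end{bmatrix} = [x_1, x_2, \ldots, x_J]$ stands for the two sample sets, which are one-to-one matched, $D = \begin{bmatrix} D_l \\ D_h \end{bmatrix}$, and $T_0$ is the sparsity level. It is worth noting that each dictionary atom contains only pixel values, with no additional feature vectors.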

Computational and Mathematical Methods in Medicine
After updating the reconstruction result for each slice in the Fourier domain (restoring the measured data), their method successfully improved the PSNR relative to the low-resolution MRI reconstruction images.
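The Fourier-domain update mentioned above, in which the measured data are restored in each slice, is commonly implemented as a data-consistency step. The following sketch shows one such step for a single 2D slice; the function name and the boolean sampling mask are assumptions, and the cited work may weight measured and estimated samples differently.

```python
import numpy as np

def enforce_data_consistency(image, measured_kspace, mask):
    """Replace the estimated k-space samples by the measured ones where sampled."""
    kspace = np.fft.fft2(image, norm="ortho")
    kspace = np.where(mask, measured_kspace, kspace)   # keep estimates elsewhere
    return np.fft.ifft2(kspace, norm="ortho")
```

Applying this step after each dictionary-based update keeps the reconstruction faithful to the acquired samples while the unsampled k-space locations are filled in by the learned prior.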

DDL in Multimodality Image Reconstruction.
Multimodality biomedical imaging has found increasing application during the last decade and is becoming routine in clinical practice. Multimodality imaging integrates multiple imaging techniques into one instrument or fuses two or more imaging modalities such as CT, MRI, PET, and SPECT. This integration of structural, functional, and molecular information provides more accurate diagnoses. For example, MRI offers soft tissue information with excellent clarity, whereas CT depicts hard tissue such as bone. Both CT and MRI reveal important functional information. If these two modalities could be combined in one device, some small lesions such as caducous blood clots could be diagnosed precisely. However, the imaging principles of MRI and CT are totally different, and how to build an accurate connection between the two modalities is an urgent problem.
To utilize the synergy between CT and MRI data sets acquired from an object at the same time, Lu et al. investigated the possibility of CT-MRI unified imaging via dual-dictionary learning [46]. Figures 3(a) and 3(b) show a CT image and an MRI image, respectively; the two images were obtained from one slice of a patient's brain and are well registered. With an MRI image as a priori information, Lu et al. attempted to recover the corresponding CT image.
Since a CT scan is totally different from an MRI scan in physical principle, they used the direct method to preserve as much information as possible and to establish a knowledge-based connection between the two data sets. The two dictionaries are $D_{MR}$ and $D_{CT}$; the former is derived from high-resolution MRI images and the latter from high-resolution CT images. The key point is that the patches in the two dictionaries are restricted to a one-to-one correspondence.
In the reconstruction step, $D_{MR}$ and $D_{CT}$ are treated as $D_l$ and $D_h$ in (12), respectively. With dual-dictionary learning, a base CT image is first obtained from a high-quality MRI image alone, without the corresponding CT data. Second, the base CT image is combined with highly undersampled CT data to reconstruct a CT image of better resolution using an iterative method. The base CT image provides better resolution and outline information, while the highly undersampled CT data provide the detailed information.

Discussion and Conclusion
In this paper, we discussed the recent advances of DDL methods in medical imaging. Based on highly undersampled measured data, the DDL algorithm has shown great potential in reconstructing high-resolution images [47,48].
Nowadays, MRI has become an indispensable imaging modality for medical diagnosis. However, an MRI scan usually takes up to fifteen minutes or even longer. Patients may find it uncomfortable to remain motionless for a long time in the large MRI gantry. Moreover, motion artifacts, which reduce image quality, are practically inevitable because of organ movements such as heartbeat, pulse, and spasm. Studies have demonstrated that the average displacement is over 0.35 mm within 100 seconds for one person lying on the cradle, while this number is up to 2.5 mm for a patient [42,43]. Therefore, shortening the MRI scan time is of important clinical significance for better image quality and healthcare.
The DDL method may be the future direction of fast MRI reconstruction. As mentioned in Section 4.4, the same slice of CT and MRI images from one object is structurally correlated. The advantage of CT is that the scanning time is short for some typical parts of the body. Besides, the spatial resolution of CT is better than that of MRI. In fast MRI, the measurement data are incomplete. Therefore, if the CT image data can be utilized as prior information in the MRI reconstruction process, less measurement data (k-space) is required for high-resolution MRI image reconstruction. The essence of the reviewed DDL methods is to establish an appropriate relation between two spatial domains (e.g., different resolutions and different frequencies). One domain is used for atom matching and the other for image updating. Similarly, we may establish a quantitative relation between the two modalities using DDL. The relation can be a one-to-one mapping between the image boundaries, which reflect the correlation between CT and MRI. In this way, DDL could enable fast MRI.
Overall, the DDL method has proven effective in medical image reconstruction. With the DDL method, we can reconstruct a high-resolution image from highly undersampled data. Inspired by its performance in a single medical modality, DDL can be applied to structurally correlated image reconstruction problems, for example, multimodality image reconstruction (CT-MRI).
However, research on DDL remains at a preliminary stage. For example, as discussed in this paper, reconstruction results may be relatively sensitive to the matching accuracy between the two dictionaries. Thus, how to establish the closest connections between images with different resolutions or even different modalities will be an important issue to be solved in the future. Also, the redundancy of the dictionaries should be eliminated more reasonably to ensure better sparse representation.