A Preconditioning Technique for First-Order Primal-Dual Splitting Method in Convex Optimization

We introduce a preconditioning technique for the first-order primal-dual splitting method. The primal-dual splitting method offers a very general framework for solving a large class of optimization problems arising in image processing. The key idea of the preconditioning technique is that the constant iterative parameters are updated self-adaptively during the iteration process. We also give a simple and easy way to choose diagonal preconditioners while maintaining the convergence of the iterative algorithm. The efficiency of the proposed method is demonstrated on an image denoising problem. Numerical results show that the preconditioned iterative algorithm performs better than the original one.


Introduction
Many real-world application problems arising in signal and image processing [1][2][3], machine learning [4][5][6], and medical image reconstruction [7, 8] can be modeled as (possibly nonsmooth) convex optimization problems. In recent years, the minimization of the sum of two convex functions has received much attention. It takes the form

min_x f(Kx) + h(x), (1)

where f ∈ Γ0(ℝ^M), h ∈ Γ0(ℝ^N), and K : ℝ^N → ℝ^M is a linear transformation matrix. Here Γ0(ℝ^N) denotes the set of proper, lower semicontinuous convex functions from ℝ^N to (−∞, +∞]. The functions f and h in (1) usually denote the data error term and the regularization term, respectively. Chambolle and Pock [9] proposed a general primal-dual method to solve problem (1). Under the assumption that the proximity operators of f and h have closed-form solutions, they proved the convergence of the proposed iterative algorithms. He and Yuan [10] have shown that the primal-dual method of Chambolle and Pock [9] is equivalent to the proximal point algorithm (PPA); thus, the original convergence analysis follows easily from the well-known PPA theory. In order to accelerate the primal-dual method [9], Pock and Chambolle introduced in [11] a preconditioning technique for the primal-dual iterative algorithm in which the constant iterative parameters are replaced by preconditioning matrices. The convergence of the preconditioned iterative algorithm then follows directly from the PPA framework. They also gave a practical way to choose the preconditioning matrices. As an application, Sidky et al. [12] applied the primal-dual method of Chambolle and Pock [9] to a variety of problems arising in medical image reconstruction, where both the primal-dual method and the corresponding preconditioned method performed very well. Some related works can also be found in [13, 14].
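To make the Chambolle-Pock scheme for problem (1) concrete, the following sketch applies it to a small quadratic instance of our own choosing, f(z) = (1/2)‖z − b‖² and h(x) = (μ/2)‖x‖², so that the result can be checked against the normal equations. The function name, step-size choice, and test data are illustrative assumptions, not taken from [9].

```python
import numpy as np

# Chambolle-Pock primal-dual iteration for min_x f(Kx) + h(x).
# Illustrative instance (our choice, so the answer is checkable):
#   f(z) = 0.5*||z - b||^2  ->  prox_{sigma f*}(v) = (v - sigma*b)/(1 + sigma)
#   h(x) = 0.5*mu*||x||^2   ->  prox_{tau h}(v)    = v/(1 + tau*mu)
# The minimizer then solves (K^T K + mu*I) x = K^T b.

def chambolle_pock(K, b, mu, n_iter=20000):
    m, n = K.shape
    L = np.linalg.norm(K, 2)            # operator norm ||K||
    tau = sigma = 0.99 / L              # ensures tau*sigma*||K||^2 < 1
    x = np.zeros(n); y = np.zeros(m); x_bar = x.copy()
    for _ in range(n_iter):
        # dual ascent step: prox of sigma*f* applied to y + sigma*K*x_bar
        y = (y + sigma * (K @ x_bar) - sigma * b) / (1 + sigma)
        # primal descent step: prox of tau*h applied to x - tau*K^T*y
        x_new = (x - tau * (K.T @ y)) / (1 + tau * mu)
        x_bar = 2 * x_new - x           # over-relaxation (extrapolation) step
        x = x_new
    return x

rng = np.random.default_rng(0)
K = rng.standard_normal((8, 5)); b = rng.standard_normal(8); mu = 0.1
x_pd = chambolle_pock(K, b, mu)
x_ref = np.linalg.solve(K.T @ K + mu * np.eye(5), K.T @ b)
print(np.allclose(x_pd, x_ref, atol=1e-4))
```

The extrapolation step x̄ = 2x_new − x is what distinguishes the scheme from a plain Arrow-Hurwicz iteration and is essential for the convergence guarantee.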
If the objective function in (1) contains a differentiable term with Lipschitz continuous gradient, such as the least squares loss, the primal-dual method introduced by Chambolle and Pock [9] does not exploit the gradient of that term. In order to solve such more general problems, Combettes and Pesquet [15], Condat [1], and Vũ [16] introduced primal-dual methods. For instance, Condat [1] considered an optimization problem involving the sum of three convex functions: a smooth function with Lipschitz continuous gradient, a nonsmooth proximable function, and a proximable function composed with a linear operator. The problem reads

min_x f(x) + g(x) + h(Kx), (2)

where g ∈ Γ0(ℝ^N), h ∈ Γ0(ℝ^M), f : ℝ^N → ℝ is convex and differentiable with ∇f Lipschitz continuous with Lipschitz constant β > 0, and K : ℝ^N → ℝ^M is a linear transformation matrix. To solve problem (2), he proposed a primal-dual splitting method and proved the convergence of the new iterative algorithm in an infinite-dimensional Hilbert space, based on the fixed point theory of nonexpansive mappings.
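A minimal sketch of one realization of Condat's splitting for problem (2), again on a quadratic instance of our own choosing so the fixed point is checkable against a linear solve; the instance (f, g, h below) and step sizes are illustrative assumptions, not the paper's experiments.

```python
import numpy as np

# Primal-dual splitting for min_x f(x) + g(x) + h(Kx), with an explicit
# gradient step on the smooth term f and prox steps on g and h*.
# Illustrative quadratic instance (our choice):
#   f(x) = 0.5*||x - b||^2 (beta = 1),  g(x) = 0.5*mu*||x||^2,  h(z) = 0.5*||z||^2,
# whose minimizer solves ((1 + mu)*I + K^T K) x = b.

def condat_pd(K, b, mu, n_iter=20000):
    m, n = K.shape
    beta = 1.0                                  # Lipschitz constant of grad f
    L2 = np.linalg.norm(K, 2) ** 2
    tau = 1.0                                   # step sizes chosen so that
    sigma = 0.9 * (1.0 / tau - beta / 2.0) / L2 # 1/tau - sigma*||K||^2 >= beta/2
    x = np.zeros(n); y = np.zeros(m)
    for _ in range(n_iter):
        grad_f = x - b                                            # explicit gradient of f
        x_new = (x - tau * (grad_f + K.T @ y)) / (1 + tau * mu)   # prox of tau*g
        y = (y + sigma * (K @ (2 * x_new - x))) / (1 + sigma)     # prox of sigma*h*
        x = x_new
    return x

rng = np.random.default_rng(1)
K = rng.standard_normal((6, 4)); b = rng.standard_normal(4); mu = 0.2
x_pd = condat_pd(K, b, mu)
x_ref = np.linalg.solve((1 + mu) * np.eye(4) + K.T @ K, b)
print(np.allclose(x_pd, x_ref, atol=1e-5))
```

Note that f enters only through its gradient and g, h only through their proximity operators, which is exactly what makes the splitting applicable when problem (2) mixes smooth and nonsmooth terms.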
The purpose of this paper is to introduce a preconditioned primal-dual splitting method to solve problem (2). The advantage of the preconditioning technique is that the iterative parameters are updated self-adaptively. Furthermore, we give a family of preconditioners which are restricted to diagonal matrices and guarantee the convergence of the algorithm. To illustrate the efficiency of the proposed method, we compare it with the original method on an image denoising problem. Recently, Combettes et al. [17] proposed a variable metric primal-dual method which also involves a preconditioning technique, but our algorithm differs from theirs. In our algorithm, we use two different metrics: one for the primal variable and one for the dual variable(s). Another difference is that [17] includes a smooth term in the dual problem; when this term is zero, the convergence conditions of [17] are stronger than the ones presented here. Moreover, we explain how to choose the variable metric.
The rest of the paper is organized as follows. In Section 2, we briefly review the primal-dual splitting method proposed by Condat [1]. In Section 3, we present the preconditioning technique and provide a practical way to choose the iterative parameter matrices. In Section 4, we carry out several experiments on image denoising problems. Finally, we give a brief conclusion.

A Primal-Dual Splitting Method of Condat [1]
In this section, we briefly review the primal-dual splitting method introduced by Condat [1]. First, we introduce some definitions and notations. Let H be a real Hilbert space with inner product ⟨·, ·⟩ and norm ‖·‖. The saddle point formulation corresponding to primal problem (4) and dual problem (5) is

min_x max_y f(x) + g(x) + ⟨Kx, y⟩ − h*(y).

The pair (x̂, ŷ) can be found via the following monotone variational inclusion:

0 ∈ ∇f(x̂) + ∂g(x̂) + K*ŷ,
0 ∈ ∂h*(ŷ) − Kx̂,

where ∂g and ∂h* are the subdifferentials of g and h*, respectively.
If some stopping criterion is reached, the algorithm stops.
The convergence of Algorithm 1 is ensured by the following theorem.

Theorem 2 (see [1]). Let τ > 0 and σ > 0, and let the relaxation and error sequences be the parameters of Algorithm 1. Let β be the Lipschitz constant of ∇f. Suppose that β > 0 and that the following condition holds:

1/τ − σ‖K‖² ≥ β/2.

Then the sequences {x_n} and {y_n} generated by Algorithm 1 converge weakly to a solution x̂ of (4) and ŷ of (5), respectively. (2) The roles of the primal variable x and the dual variable y in Algorithm 1 can be exchanged. The convergence of the resulting iterative algorithm is likewise ensured by Theorem 2. In practice, the performance of the two iterative algorithms is nearly the same.
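The step-size condition of Theorem 2, which in Condat [1] takes the form 1/τ − σ‖K‖² ≥ β/2, can be checked in one line. The helper below is a sketch of our own, evaluated with β = 1 and ‖K‖² = 8, the values used in Section 4.

```python
# Admissibility check for the step-size condition 1/tau - sigma*||K||^2 >= beta/2.
def stepsizes_ok(tau, sigma, K_norm_sq, beta):
    return 1.0 / tau - sigma * K_norm_sq >= beta / 2.0

# With beta = 1 and ||K||^2 = 8 (discrete gradient, as in Section 4):
print(stepsizes_ok(tau=0.5, sigma=0.1, K_norm_sq=8.0, beta=1.0))  # 2.0 - 0.8 >= 0.5 -> True
print(stepsizes_ok(tau=2.5, sigma=0.1, K_norm_sq=8.0, beta=1.0))  # 0.4 - 0.8 <  0.5 -> False
```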

A Preconditioned Primal-Dual Splitting Method for Solving (4)
In this section, we give a preconditioned version of Algorithm 1.
The main idea is motivated by the work of Pock and Chambolle [11]: the scalar iterative parameters in Algorithm 1 are replaced by symmetric positive definite matrices. First, we give the detailed iterative algorithm below; then we prove its convergence.
As a matter of fact, T and Σ in Theorem 5 could be any symmetric positive definite maps. In order to ensure that the proximity operators of g and h* retain closed-form solutions, it suffices to choose diagonal matrices for both of them. In the following, we give a practical way to choose these symmetric positive definite matrices while ensuring the convergence guaranteed by Theorem 5. To facilitate our proof, we need the following lemmas, which were obtained in [11].
We are now in a position to give our way of choosing the matrices T and Σ.
We will prove (17). It is easy to see that (17) is equivalent to the positive definiteness of the block matrix

( T⁻¹ − (β/2)I   −K*
  −K              Σ⁻¹ ).

By the definition of the operator norm and the choice of the diagonal entries τ_j and σ_i, we obtain the required bound. Finally, the strict inequality in (20) is obtained from the above argument once one of the inequalities above becomes strict.
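For concreteness, the diagonal rule of Pock and Chambolle [11], which the choice in Lemma 7 builds on, can be sketched as follows: τ_j = 1/Σ_i |K_ij|^(2−α) and σ_i = 1/Σ_j |K_ij|^α, which guarantee ‖Σ^(1/2) K T^(1/2)‖ ≤ 1 for α ∈ [0, 2]. The function name and random test matrix below are our own.

```python
import numpy as np

# Diagonal preconditioners in the style of Pock and Chambolle [11]:
#   tau_j   = 1 / sum_i |K_ij|^(2-alpha)   (one entry per column of K)
#   sigma_i = 1 / sum_j |K_ij|^alpha       (one entry per row of K)
# These satisfy || Sigma^{1/2} K T^{1/2} || <= 1 for alpha in [0, 2].

def diag_preconditioners(K, alpha=1.0):
    tau = 1.0 / np.sum(np.abs(K) ** (2 - alpha), axis=0)
    sigma = 1.0 / np.sum(np.abs(K) ** alpha, axis=1)
    return tau, sigma

rng = np.random.default_rng(2)
K = rng.standard_normal((7, 5))
tau, sigma = diag_preconditioners(K, alpha=1.0)
M = np.sqrt(sigma)[:, None] * K * np.sqrt(tau)[None, :]  # Sigma^{1/2} K T^{1/2}
print(np.linalg.norm(M, 2) <= 1.0)                       # convergence certificate
```

Because both T and Σ are diagonal, the proximity operators of separable functions keep their closed-form solutions, which is precisely why the diagonal restriction is adopted in this section.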

Applications
In this section, we present an application of our proposed iterative algorithm. We aim at solving the following constrained total variation (TV) denoising problem:

min_{x ∈ C} (1/2)‖x − b‖² + λ‖∇x‖₁, (23)

where b ∈ ℝ^N is a noisy image contaminated by Gaussian noise, λ > 0 is the regularization parameter, and C is a closed convex set representing prior information about the denoised image. By using the indicator function, constrained TV denoising problem (23) can be reformulated as the unconstrained optimization problem

min_x (1/2)‖x − b‖² + λ‖∇x‖₁ + ι_C(x), (24)

where the indicator function ι_C ∈ Γ0(ℝ^N) maps x ∈ ℝ^N to 0 if x ∈ C and to +∞ otherwise. Since the total variation term is the composition of a proximable function with the linear gradient operator ∇, problem (24) fits the general form (2).
If the constraint set C = ℝ^N, then the constrained TV denoising problem reduces to the unconstrained TV denoising problem

min_x (1/2)‖x − b‖² + λ‖∇x‖₁.

This TV denoising problem is often referred to as the ROF model, which was first introduced in computer vision by Rudin et al. [19].
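The constant ‖∇‖² = 8 quoted in the experiments below comes from the standard bound on the discrete gradient operator. The sketch below builds a small forward-difference gradient matrix for an n × n image (our own construction, with Neumann boundary conditions) and verifies the bound numerically.

```python
import numpy as np

# Discrete gradient D for an n x n image: forward differences with Neumann
# boundary, stacked as a (2*n*n) x (n*n) matrix. The standard bound for this
# operator is ||D||^2 <= 8, the constant used in the experiments.

def gradient_matrix(n):
    I = np.eye(n)
    D1 = -np.eye(n) + np.diag(np.ones(n - 1), 1)
    D1[-1, -1] = 0.0                    # Neumann boundary: last difference is zero
    Dh = np.kron(I, D1)                 # differences along one image axis
    Dv = np.kron(D1, I)                 # differences along the other axis
    return np.vstack([Dh, Dv])

D = gradient_matrix(8)
norm2 = np.linalg.norm(D, 2) ** 2
print(norm2 <= 8.0)                     # the bound holds (and is strict for finite n)
```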
where x_ori is the original clean image and x_rec is the reconstructed image. The reconstruction time in seconds is denoted by "(s)", and the iteration number "Iter" is recorded when the stopping criterion is satisfied. The test images are the well-known "Barbara," "Boat," and "Lena"; all have size 512 × 512 and are displayed in Figure 1.
Experiment 1. We show how the iterative parameters are chosen. First, we choose different combinations of the parameters τ and σ and apply Algorithm 1 to solve image denoising problem (23). We choose "Barbara" as the test image and corrupt it with random Gaussian noise with zero mean and standard deviation 0.05. For convenience, we define τ_max = 2/β and σ_max = (1/‖∇‖²)(1/τ − β/2). Here, the Lipschitz constant β = 1 and ‖∇‖² = 8. The numerical results are reported in Table 1. Meanwhile, we plot the SNR versus the number of iterations in Figure 2. We can see from Figure 2 that, when τ = 0.9τ_max or τ = 0.5τ_max, the larger σ is, the faster the convergence; for small τ, the choice of σ makes no apparent difference. So we select τ = 0.1τ_max and σ = 0.1σ_max for Algorithm 1 in the comparative experiments.
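The step-size bounds used in Experiment 1 are a direct transcription of the formulas τ_max = 2/β and σ_max = (1/‖∇‖²)(1/τ − β/2) with β = 1 and ‖∇‖² = 8:

```python
# Reproducing the step-size bounds of Experiment 1:
#   tau_max = 2/beta,  sigma_max = (1/||D||^2) * (1/tau - beta/2).
beta, normD2 = 1.0, 8.0
tau_max = 2.0 / beta
tau = 0.1 * tau_max                 # the value selected for the comparisons
sigma_max = (1.0 / normD2) * (1.0 / tau - beta / 2.0)
print(tau_max, tau, sigma_max)      # 2.0 0.2 0.5625
```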
For our proposed Algorithm 4, the corresponding preconditioned iterative matrices are chosen according to Lemma 7. Only one parameter α needs to be set, and the experimental results are reported in Table 2. Figure 3 shows the SNR value versus the iteration number. We can see that the performance of the three choices of α is nearly the same, so we choose α = 1 for Algorithm 4 in the following tests.
Experiment 2. We compare the performance of Algorithm 1 and our proposed Algorithm 4. The constraint set C is the nonnegative orthant, that is, C = {x | x ≥ 0}. To ensure a fair comparison, we corrupt each test image with random Gaussian noise with zero mean and various levels of standard deviation.
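The nonnegativity constraint enters the algorithm only through the proximity operator of its indicator function, which is the Euclidean projection onto C, i.e., componentwise clipping at zero. A one-line sketch (the function name is our own):

```python
import numpy as np

# prox of the indicator of C = {x : x >= 0} is the projection onto C;
# the step size tau is irrelevant for an indicator function.
def prox_indicator_nonneg(x, tau=1.0):
    return np.maximum(x, 0.0)

x = np.array([-1.5, 0.0, 2.3])
print(prox_indicator_nonneg(x))   # negative entries are clipped to zero
```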
From Table 3, we can see that our proposed Algorithm 4 converges faster than Algorithm 1 in terms of both the number of iterations and the CPU time. For high noise levels, Algorithm 4 reaches a higher SNR value than Algorithm 1 with fewer iterations. Both algorithms recover a clean image and eventually attain nearly the same SNR value.

Conclusion
In this paper, we have studied the general optimization problem of minimizing the sum of three convex functions: a differentiable function with Lipschitz continuous gradient, a proximable function, and a proximable function composed with a linear operator. Many interesting problems arising in image restoration and image reconstruction are special cases of this problem. Inspired by the preconditioning technique proposed by Pock and Chambolle, we have introduced a primal-dual splitting algorithm with self-adaptive step sizes to solve such problems. We also proposed a practical way to choose these step sizes, together with a proof of convergence. Numerical results on an image denoising problem showed that the preconditioned iterative algorithm performs better than the original one with constant step sizes.

Figure 3 :
Figure 3: (a) SNR versus iteration number with the nonnegative constraint for different choices of α. (b) SNR versus iteration number with the bounded constraint for different choices of α.
4.1. Numerical Experiments. In the following, we present some preliminary numerical results to show the efficiency of our proposed method. All experiments are run on a personal Lenovo computer with a Pentium(R) Dual-Core CPU @ 2.8 GHz and 4 GB of RAM. For all tested iterative algorithms, the stopping criterion is that the relative change ‖x^{k+1} − x^k‖/‖x^k‖ falls below a prescribed tolerance.

Table 1 :
Numerical results obtained by Algorithm 1.

Table 2 :
Numerical results obtained by Algorithm 4.

Table 3 :
Performance comparison between Algorithms 1 and 4.

To visualize the results, we present the noisy images and the final denoised images in Figures 4, 5, and 6, respectively.