A Proximal Alternating Direction Method of Multipliers with a Substitution Procedure

In this paper, we consider the separable convex programming problem with linear constraints, whose objective function is the sum of m individual blocks with nonoverlapping variables; each block consists of two convex functions, one of which is smooth. For the general case m ≥ 3, we present a gradient-based alternating direction method of multipliers with a substitution procedure. We prove the convergence of the proposed algorithm via the analytic framework of contractive-type methods and derive a worst-case O(1/t) convergence rate in a nonergodic sense. Finally, some preliminary numerical results are reported to support the efficiency of the proposed algorithm.


Introduction
In this paper, we consider the following convex minimization model with linear constraints and a separable objective function:

$$\min \Big\{ \sum_{i=1}^{m} \big[ f_i(x_i) + g_i(x_i) \big] \;\Big|\; \sum_{i=1}^{m} A_i x_i = b, \; x_i \in X_i, \; i = 1, \ldots, m \Big\}, \tag{1}$$

where $f_i : \mathbb{R}^{n_i} \to \mathbb{R} \cup \{+\infty\}$ $(i = 1, \ldots, m)$ are closed proper convex functions, $g_i : \mathbb{R}^{n_i} \to \mathbb{R}$ $(i = 1, \ldots, m)$ are smooth convex functions, $X_i \subseteq \mathbb{R}^{n_i}$ $(i = 1, \ldots, m)$ are closed convex sets, $A_i \in \mathbb{R}^{l \times n_i}$ $(i = 1, \ldots, m)$ are given matrices, and $b \in \mathbb{R}^l$ is a given vector. Furthermore, we assume that each $g_i$ has a Lipschitz-continuous gradient; i.e., there exists $L_i > 0$ such that

$$\|\nabla g_i(x) - \nabla g_i(y)\| \le L_i \|x - y\|, \quad \forall x, y \in X_i. \tag{2}$$

Throughout the paper, the solution set of (1) is assumed to be nonempty.
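As a concrete instance of (1) with m = 3, consider the stable version of the robust principal component analysis model cited below, in which a data matrix $M$ is split into low-rank, sparse, and noise components; this is one standard formulation with weights $\rho, \sigma > 0$, stated here only for illustration:

```latex
\min_{L, S, Z} \; \|L\|_* + \rho \|S\|_1 + \frac{1}{2\sigma} \|Z\|_F^2
\quad \text{s.t.} \quad L + S + Z = M.
% Here f_1 = \|\cdot\|_*, f_2 = \rho\|\cdot\|_1, and g_3 = \frac{1}{2\sigma}\|\cdot\|_F^2
% (all other f_i, g_i vanish); each A_i is the identity after vectorization,
% and \nabla g_3 is Lipschitz continuous with constant L_3 = 1/\sigma.
```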
A fundamental method for solving (1) in the case m = 2 is the alternating direction method of multipliers (ADMM), which was presented originally in [1,2]; we refer to [3,4] for some review papers on ADMM. There are many problems of form (1) with m ≥ 3 in contemporary applications, such as the robust principal component analysis model [5], the total variation-based image restoration problem [6], the superresolution image reconstruction problem [7,8], the multistage stochastic programming problem [9], the deblurring Poissonian images problem [10], the latent variable Gaussian graphical model selection [11], the quadratic discriminant analysis model [12], and problems in electrical engineering [13,14]. Hence, our discussion focuses on (1) in the case m ≥ 3. A natural idea for solving (1) is to extend the ADMM from the special case m = 2 to the general case m ≥ 3. Writing $\theta_i := f_i + g_i$ and letting $H$ be a symmetric positive definite matrix (e.g., $H = \beta I$ with a penalty parameter $\beta > 0$), this straightforward extension reads

$$\begin{cases} x_i^{k+1} = \arg\min\limits_{x_i \in X_i} \Big\{ \theta_i(x_i) + \frac{1}{2} \Big\| \sum\limits_{j<i} A_j x_j^{k+1} + A_i x_i + \sum\limits_{j>i} A_j x_j^k - b - H^{-1} \lambda^k \Big\|_H^2 \Big\}, \quad i = 1, \ldots, m, \\[2mm] \lambda^{k+1} = \lambda^k - H \Big( \sum\limits_{j=1}^{m} A_j x_j^{k+1} - b \Big). \end{cases} \tag{3}$$

The convergence of (3) has been proved in some special cases (see [15-17]). Unfortunately, without further conditions, the direct extension of ADMM (3) for the general case m ≥ 3 may fail to converge (see [18]). In [19,20], the authors present two convergent semiproximal ADMM variants for two types of 3-block problems. Recently, He et al. [21] showed that if a new iterate is generated by correcting the output of (3) with a substitution procedure, then the sequence of iterates converges to a solution of (1). Since then, several variants of the ADMM have been proposed for solving (1) (see [21-26]).
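To fix ideas, here is a minimal NumPy sketch of one sweep of the direct extension (3) with $H = \beta I$; the block solver `argmin_block` is a hypothetical placeholder for whatever routine handles subproblem (4) discussed below.

```python
import numpy as np

def direct_admm_sweep(x, lam, A, b, beta, argmin_block):
    """One Gauss-Seidel sweep of the direct extension of ADMM, scheme (3).

    x            : list of current block variables x_i (1-D arrays)
    lam          : current multiplier estimate (1-D array of length l)
    A            : list of constraint matrices A_i
    b            : right-hand side of the linear constraint
    beta         : penalty parameter (H = beta * I)
    argmin_block : hypothetical solver for subproblem (4), i.e.
                   argmin_{x_i in X_i} f_i(x_i) + g_i(x_i)
                                       + (beta/2) * ||A_i x_i - a_i||^2
    """
    m = len(x)
    x_new = list(x)  # blocks j > i still hold their old values x_j^k
    for i in range(m):
        # constraint residual with block i left out (Gauss-Seidel order)
        partial = sum(A[j] @ x_new[j] for j in range(m) if j != i) - b
        a_i = -partial + lam / beta  # the known vector a_i of subproblem (4)
        x_new[i] = argmin_block(i, a_i, beta)
    lam_new = lam - beta * (sum(A[j] @ x_new[j] for j in range(m)) - b)
    return x_new, lam_new
```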
In (3), all the $x_i$-related subproblems are of the form

$$\min_{x_i \in X_i} \Big\{ f_i(x_i) + g_i(x_i) + \frac{1}{2} \|A_i x_i - a_i\|_H^2 \Big\}, \tag{4}$$

with a certain known $a_i \in \mathbb{R}^l$ and a symmetric positive definite matrix $H$. When $A_i$ is not the identity matrix, problem (4) can be difficult to solve. A popular technique is to linearize the quadratic term of (4) (see [27,28]); that is, one can solve the following problem instead of (4):

$$\min_{x_i \in X_i} \Big\{ f_i(x_i) + g_i(x_i) + \frac{\tau_i}{2} \|x_i - c_i\|^2 \Big\}, \tag{5}$$

with a certain known $c_i \in \mathbb{R}^{n_i}$ and $\tau_i > 0$. More generally, one can solve the following proximal version instead of (4):

$$\min_{x_i \in X_i} \Big\{ f_i(x_i) + g_i(x_i) + \frac{1}{2} \|A_i x_i - a_i\|_H^2 + \frac{1}{2} \|x_i - x_i^k\|_{G_i}^2 \Big\}, \tag{6}$$

where $x_i^k$ is the current iterate and $G_i$ is symmetric positive definite. If $G_i = \tau_i I_{n_i} - A_i^T H A_i \succ 0$, then (6) reduces to the form of (5).
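To see why (6) reduces to (5) under this choice of $G_i$, expand the two quadratic terms; the following completion-of-squares computation is a routine verification (not reproduced from the source):

```latex
% With G_i = \tau_i I - A_i^T H A_i, the quadratic part of (6) satisfies
\begin{aligned}
\tfrac{1}{2}\|A_i x_i - a_i\|_H^2 + \tfrac{1}{2}\|x_i - x_i^k\|_{G_i}^2
&= \tfrac{1}{2} x_i^T \big( A_i^T H A_i + G_i \big) x_i
   - x_i^T \big( A_i^T H a_i + G_i x_i^k \big) + \mathrm{const} \\
&= \tfrac{\tau_i}{2} \|x_i\|^2 - \tau_i\, x_i^T c_i + \mathrm{const}
 = \tfrac{\tau_i}{2} \|x_i - c_i\|^2 + \mathrm{const},
\end{aligned}
% where c_i := x_i^k + \tau_i^{-1} A_i^T H ( a_i - A_i x_i^k ),
% which is exactly the known vector appearing in (5).
```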
Since $g_i$ is smooth, the following problem, in which $g_i$ is replaced by its linearization at $x_i^k$, is easier to solve than (6):

$$\min_{x_i \in X_i} \Big\{ f_i(x_i) + \langle \nabla g_i(x_i^k), x_i \rangle + \frac{1}{2} \|A_i x_i - a_i\|_H^2 + \frac{1}{2} \|x_i - x_i^k\|_{G_i}^2 \Big\}. \tag{7}$$

Replacing each subproblem of (3) by its counterpart of form (7), we obtain the gradient-based ADMM (G-ADMM) iterative scheme

$$\begin{cases} \tilde{x}_i^k = \arg\min\limits_{x_i \in X_i} \Big\{ f_i(x_i) + \langle \nabla g_i(x_i^k), x_i \rangle + \frac{1}{2} \Big\| \sum\limits_{j<i} A_j \tilde{x}_j^k + A_i x_i + \sum\limits_{j>i} A_j x_j^k - b - H^{-1} \lambda^k \Big\|_H^2 + \frac{1}{2} \|x_i - x_i^k\|_{G_i}^2 \Big\}, \quad i = 1, \ldots, m, \\[2mm] \tilde{\lambda}^k = \lambda^k - H \Big( \sum\limits_{j=1}^{m} A_j \tilde{x}_j^k - b \Big). \end{cases} \tag{8}$$

In this paper, we propose a proximal ADMM with a substitution procedure based on (8). The rest of the paper is organized as follows. In Section 2, we provide some preliminaries for further analysis. Then, we present the gradient-based alternating direction method of multipliers with a substitution (G-ADMM-S) for solving (1) and prove its convergence in Section 3. In Section 4, we estimate the worst-case iteration complexity of the proposed algorithm in a nonergodic sense. In Section 5, some preliminary numerical results are reported to support the efficiency of the proposed algorithm. Finally, some conclusions are given in Section 6.
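The following NumPy sketch implements one G-ADMM prediction sweep, assuming scheme (8) takes the Gauss-Seidel form displayed above with $H = \beta I$; the routine `prox_step` is a hypothetical solver for the proximal subproblem of $f_i$ that results after the smooth terms are collected into one quadratic model.

```python
import numpy as np

def g_admm_sweep(x, lam, A, b, beta, G, grad_g, prox_step):
    """One prediction sweep of G-ADMM, a sketch of scheme (8) with H = beta*I.

    grad_g   : list of callables, grad_g[i](x_i) = the gradient of g_i
    G        : list of symmetric positive definite matrices G_i
    prox_step: hypothetical solver for
               argmin_{x_i in X_i} f_i(x_i) + (1/2) (x_i - v)^T W (x_i - v)
    """
    m = len(x)
    xt = list(x)
    for i in range(m):
        partial = sum(A[j] @ xt[j] for j in range(i)) \
                + sum(A[j] @ x[j] for j in range(i + 1, m)) - b
        # gradient at x_i^k of the smooth terms of subproblem (7)
        grad = grad_g[i](x[i]) \
             + beta * A[i].T @ (A[i] @ x[i] + partial - lam / beta)
        W = G[i] + beta * A[i].T @ A[i]      # curvature of the quadratic model
        v = x[i] - np.linalg.solve(W, grad)  # center of the proximal subproblem
        xt[i] = prox_step(i, v, W)
    lam_t = lam - beta * (sum(A[j] @ xt[j] for j in range(m)) - b)
    return xt, lam_t
```

Note that for the choice $G_i = r_i I - \beta A_i^T A_i$ used in the experiments of Section 5, the matrix `W` above reduces to $r_i I$, so `prox_step` becomes the standard proximal mapping of $f_i$ over $X_i$.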

Preliminaries
In this section, we provide some preliminaries. Let $\langle x, y \rangle = x^T y$ and $\|x\| = \sqrt{\langle x, x \rangle}$. $G \succ 0$ ($\succeq 0$) denotes that $G$ is a positive definite (semidefinite) matrix. For any positive definite matrix $G$, we denote by $\|x\|_G = \sqrt{x^T G x}$ the $G$-norm. If $G$ is the product of a positive parameter $\beta$ and the identity matrix $I$, i.e., $G = \beta I$, we use the simpler notation $\|x\|_\beta$. The domain of $f$ is denoted by $\mathrm{dom} f := \{ x \in \mathbb{R}^n \mid f(x) < +\infty \}$. We say that $f$ is convex if

$$f(\alpha x + (1 - \alpha) y) \le \alpha f(x) + (1 - \alpha) f(y), \quad \forall x, y \in \mathrm{dom} f, \; \alpha \in [0, 1].$$

For a convex function $f$, the subdifferential of $f$ is the set-valued operator defined by

$$\partial f(x) := \{ s \in \mathbb{R}^n \mid f(y) \ge f(x) + \langle s, y - x \rangle, \; \forall y \in \mathbb{R}^n \}.$$

2.1. Variational Characterizations of (1).
Let $\Theta_i(x_i) := f_i(x_i) + g_i(x_i)$, $i = 1, 2, \ldots, m$, and $\mathcal{W} := X_1 \times X_2 \times \cdots \times X_m \times \mathbb{R}^l$. Since all $\Theta_i(x_i)$ are convex functions, by invoking the first-order necessary and sufficient optimality condition for convex programming, one can easily find that problem (1) is characterized by the following variational inequality: find $w^* = (x_1^*, \ldots, x_m^*, \lambda^*) \in \mathcal{W}$ such that

$$\sum_{i=1}^{m} \Big[ \Theta_i(x_i) - \Theta_i(x_i^*) + (x_i - x_i^*)^T \big( -A_i^T \lambda^* \big) \Big] + (\lambda - \lambda^*)^T \Big( \sum_{i=1}^{m} A_i x_i^* - b \Big) \ge 0,$$

for all $(x_1, x_2, \ldots, x_m, \lambda) \in \mathcal{W}$. The Lagrange function of (1) is given by

$$L(x_1, \ldots, x_m, \lambda) = \sum_{i=1}^{m} \Theta_i(x_i) - \lambda^T \Big( \sum_{i=1}^{m} A_i x_i - b \Big).$$

A point $w^* = (x_1^*, \ldots, x_m^*, \lambda^*)$ is a saddle point of $L$ if

$$L(x_1^*, \ldots, x_m^*, \lambda) \le L(x_1^*, \ldots, x_m^*, \lambda^*) \le L(x_1, \ldots, x_m, \lambda^*) \tag{14}$$

for all $\lambda \in \mathbb{R}^l$ and $x_i \in X_i$, $i = 1, \ldots, m$; finding a saddle point of $L$ is thus equivalent to solving (1). With the notation

$$w = \begin{pmatrix} x_1 \\ \vdots \\ x_m \\ \lambda \end{pmatrix}, \qquad \Theta(x) := \sum_{i=1}^{m} \Theta_i(x_i), \qquad G(w) := \begin{pmatrix} -A_1^T \lambda \\ \vdots \\ -A_m^T \lambda \\ \sum_{i=1}^{m} A_i x_i - b \end{pmatrix}, \tag{15}$$

(14) can be rewritten as the following variational inequality (VI): find $w^* \in \mathcal{W}$ such that

$$\Theta(x) - \Theta(x^*) + (w - w^*)^T G(w^*) \ge 0, \quad \forall w \in \mathcal{W}.$$

Let $\mathcal{W}^*$ be the solution set of VI$(\mathcal{W}, G, \Theta)$. Since we have assumed that the solution set of (1) is nonempty, $\mathcal{W}^*$ is also nonempty. It follows from the definition of $G(w)$ that $G$ is an affine mapping: $G(w) = Qw + q$. $\lambda_{\max}(\cdot)$ denotes the maximum eigenvalue of a matrix, and $\lambda_{\min}(\cdot)$ denotes the minimum eigenvalue of a matrix. It is easy to see that $Q = -Q^T$; hence

$$(w_1 - w_2)^T \big( G(w_1) - G(w_2) \big) = 0, \quad \forall\, w_1, w_2.$$
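Writing out this affine representation explicitly, under the ordering $w = (x_1, \ldots, x_m, \lambda)$, gives the following (a routine expansion of the definition of $G(w)$):

```latex
G(w) = Qw + q, \qquad
Q = \begin{pmatrix}
      0      & \cdots & 0      & -A_1^T \\
      \vdots & \ddots & \vdots & \vdots \\
      0      & \cdots & 0      & -A_m^T \\
      A_1    & \cdots & A_m    & 0
    \end{pmatrix}, \qquad
q = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ -b \end{pmatrix}.
% Since Q^T = -Q, we have u^T Q u = 0 for every u, and hence
% (w_1 - w_2)^T (G(w_1) - G(w_2)) = (w_1 - w_2)^T Q (w_1 - w_2) = 0.
```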

Algorithm and Convergence Analysis
In this section, we first describe G-ADMM-S and then prove its convergence via the analytic framework of contractive-type methods [29]. Throughout this section, we assume that $\lambda_{\min}(G_i) > L_i$ $(i = 1, 2, \ldots, m)$. The iterative scheme of G-ADMM-S for solving (1) is given in Algorithm G-ADMM-S.

Algorithm G-ADMM-S. Let $\gamma \in (0, 2)$, and let $D^k$ and $b_k$ be defined in (18) and (19), respectively. Start with $w^0$. Given the iterate $w^k$, the new iterate $w^{k+1}$ is generated as follows:

Step 1 (G-ADMM procedure). Execute scheme (8) to generate $\tilde{w}^k$.
Step 2 (substitution procedure). Generate the new iterate $w^{k+1}$ via

$$w^{k+1} = w^k - \gamma \alpha_k D^k, \tag{20}$$

where

$$\alpha_k = \frac{b_k}{\|D^k\|^2}. \tag{21}$$

Next, we establish the global convergence of Algorithm G-ADMM-S following the analytic framework of contractive-type methods. We outline the proof sketch as follows:

(1) Prove that $-D^k$ is a descent direction of the function $(1/2)\|w - w^*\|^2$ at the point $w = w^k$ whenever $w^k \ne \tilde{w}^k$, where $\tilde{w}^k$ is generated by G-ADMM scheme (8) and $w^* \in \mathcal{W}^*$.

(2) Prove that the sequence generated by Algorithm G-ADMM-S is contractive with respect to $\mathcal{W}^*$.

(3) Establish the convergence.

Accordingly, we divide the convergence analysis into three parts addressing the claims listed above.
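In code, the correction step is a single line; the sketch below assumes the step length $\alpha_k = b_k / \|D^k\|^2$ of (21), with $D^k$ and $b_k$ supplied by the formulas (18) and (19), which are not restated here.

```python
import numpy as np

def substitution_step(w, D_k, b_k, gamma=1.8):
    """Substitution (correction) step of G-ADMM-S, cf. (20)-(21).

    w    : current iterate w^k (1-D array stacking x_1, ..., x_m, lambda)
    D_k  : the direction D^k of (18), computed from w^k and tilde-w^k
    b_k  : the quantity of (19); positive whenever w^k != tilde-w^k
    gamma: relaxation parameter in (0, 2)
    """
    alpha_k = b_k / np.dot(D_k, D_k)  # descent step length along -D^k
    return w - gamma * alpha_k * D_k
```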

3.1. Verification of the Descent Direction.
In this subsection, we show that $-D^k$ is a descent direction of the function $(1/2)\|w - w^*\|^2$ at the point $w = w^k$ whenever $w^k \ne \tilde{w}^k$ and $w^* \in \mathcal{W}^*$. For this purpose, we first prove an important inequality for the output of the G-ADMM procedure (8), which will be used often in our further discussion.

Lemma 1. Let $\tilde{w}^k$ be generated by the G-ADMM procedure (8) from the given iterate $w^k$. Then, assertion (23) holds.

Proof. By the optimality condition of the $x_i$-related subproblem in (8), for $i = 1, 2, \ldots, m$, we have $\tilde{x}_i^k \in X_i$ and

$$0 \in \partial f_i\big( \tilde{x}_i^k \big) + \nabla g_i(x_i^k) + A_i^T H \Big( \sum_{j \le i} A_j \tilde{x}_j^k + \sum_{j > i} A_j x_j^k - b - H^{-1} \lambda^k \Big) + G_i \big( \tilde{x}_i^k - x_i^k \big) + \partial \delta(X_i)\big( \tilde{x}_i^k \big),$$

where $\delta(X_i)$ is the indicator function of the set $X_i$. Thus, $\tilde{w}^k \in \mathcal{W}$, and there exists $\eta_i \in \partial \delta(X_i)(\tilde{x}_i^k)$ realizing this inclusion. From the subgradient inequality and the definition of $\partial \delta(X_i)$, we obtain, for all $x_i \in X_i$,

$$f_i(x_i) - f_i\big( \tilde{x}_i^k \big) + \big( x_i - \tilde{x}_i^k \big)^T \Big[ \nabla g_i(x_i^k) + A_i^T H \Big( \sum_{j \le i} A_j \tilde{x}_j^k + \sum_{j > i} A_j x_j^k - b - H^{-1} \lambda^k \Big) + G_i \big( \tilde{x}_i^k - x_i^k \big) \Big] \ge 0.$$

Summing this inequality over $i = 1, 2, \ldots, m$, we obtain (30). Then, by adding an appropriate term to both sides of (30) and combining the two resulting formulas, we arrive at the desired inequality.
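The passage from the optimality inclusion to the inequality above is the standard subgradient argument sketched below, where $v_i$ is our shorthand (not the paper's notation) for the sum of the gradient and matrix terms in the inclusion:

```latex
% Pick s_i \in \partial f_i(\tilde{x}_i^k) and \eta_i \in \partial\delta(X_i)(\tilde{x}_i^k)
% with s_i + v_i + \eta_i = 0. For any x_i \in X_i,
\begin{aligned}
f_i(x_i) - f_i(\tilde{x}_i^k) &\ge (x_i - \tilde{x}_i^k)^T s_i
  && \text{(subgradient inequality),} \\
0 = \delta(X_i)(x_i) - \delta(X_i)(\tilde{x}_i^k) &\ge (x_i - \tilde{x}_i^k)^T \eta_i
  && \text{(definition of } \partial\delta(X_i)\text{).}
\end{aligned}
% Adding the two relations and using s_i + \eta_i = -v_i yields
% f_i(x_i) - f_i(\tilde{x}_i^k) + (x_i - \tilde{x}_i^k)^T v_i \ge 0.
```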

Using the notations of $G(\tilde{w}^k)$ (see (15)) and $D^k$ (see (18)), assertion (23) is proved. □

Based on assertion (23), we can obtain the following result.
Proof. It follows from (23), together with (17) and the optimality of $w^*$, that (40) holds, which proves the assertion. □

The next theorem implies that $-D^k$ is a descent direction of the function $(1/2)\|w - w^*\|^2$ at the point $w = w^k$ whenever $w^k \ne \tilde{w}^k$.

Theorem 2. For all $w^* \in \mathcal{W}^*$, $(w^k - w^*)^T D^k \ge b_k$.

Proof. It follows from (37) that $(w^k - w^*)^T D^k \ge b_k$ once the two terms on the right-hand side of (43) are bounded. For the first term of the right-hand side of (43), the key estimate follows from the Lipschitz continuity of $\nabla g_i$; the second term of the right-hand side of (43) is then handled analogously.

Numerical Experiments

Now, we specify the choices of parameters used to implement the algorithms. We set $H = \beta I$ with $\beta = 0.01$, the relaxation parameter $\gamma = 1.8$, $r_1 = n_1 + \beta \|A_1^T A_1\|$, $r_2 = \|M\|_F + \beta \|A_2^T A_2\|$, $r_3 = n_3 + \beta \|A_3^T A_3\|$, and $G_i = r_i I_{n_i} - \mu A_i^T A_i$ $(i = 1, 2, 3)$. We consider two cases of the parameter $\mu$: Case 1: $\mu = \beta$; Case 2: $\mu = 0.15$.
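The setup above can be reproduced with the following sketch; the matrix norm is assumed to be the spectral norm, and the test data $A_1, A_2, A_3$ and $M$ are assumed to be generated elsewhere.

```python
import numpy as np

def setup_parameters(A1, A2, A3, M, mu=None, beta=0.01, gamma=1.8):
    """Build the proximal matrices G_i for the experiments (a sketch)."""
    n1, n2, n3 = A1.shape[1], A2.shape[1], A3.shape[1]
    r1 = n1 + beta * np.linalg.norm(A1.T @ A1, 2)
    r2 = np.linalg.norm(M, 'fro') + beta * np.linalg.norm(A2.T @ A2, 2)
    r3 = n3 + beta * np.linalg.norm(A3.T @ A3, 2)
    if mu is None:
        mu = beta  # Case 1: mu = beta; Case 2 uses mu = 0.15 instead
    G = [r * np.eye(n) - mu * A.T @ A
         for r, n, A in zip((r1, r2, r3), (n1, n2, n3), (A1, A2, A3))]
    return G, beta, gamma
```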
We test 7 groups of problems with random data. Numerical results are reported in Table 2. For each scenario, we run the test 5 times and report the average performance. Specifically, we report the number of iterations ("Iter."), the computing time in seconds ("Time"), and the absolute error of the objective value ("f-error"). The numerical results show that Algorithm G-ADMM-S is effective.

Conclusion
In this paper, for the linearly constrained separable convex programming problem whose objective function is the sum of m individual blocks with nonoverlapping variables, each block being convex, we have presented a gradient-based ADMM with a substitution procedure for the case m ≥ 3. We have analysed its convergence and iteration complexity. The preliminary numerical results show the efficiency of the proposed algorithm.

Data Availability
No data were used to support this study.

Conflicts of Interest
The authors declare that they have no conflicts of interest.