Abstract and Applied Analysis, ISSN 1687-0409 (online), 1085-3375 (print). Hindawi Publishing Corporation. Article ID 183961, doi:10.1155/2013/183961.

Research Article: On the Convergence Analysis of the Alternating Direction Method of Multipliers with Three Blocks

Caihua Chen (1), Yuan Shen (2), and Yanfei You (3). Academic Editor: Minghua Xu.

1 International Center of Management Science and Engineering, School of Management and Engineering, Nanjing University, Nanjing 210093, China
2 School of Applied Mathematics, Nanjing University of Finance & Economics, Nanjing 210023, China
3 Department of Mathematics, Nanjing University, Nanjing 210093, China

Received 4 July 2013; Accepted 5 September 2013; Published 26 October 2013.

Copyright © 2013 Caihua Chen et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

We consider a class of linearly constrained separable convex programming problems whose objective functions are the sum of three convex functions without coupled variables. For those problems, Han and Yuan (2012) showed that the sequence generated by the alternating direction method of multipliers (ADMM) with three blocks converges globally to their KKT points under some technical conditions. In this paper, we provide a new proof of this result under conditions that are much weaker than Han and Yuan's assumptions. Moreover, in order to accelerate the ADMM with three blocks, we also propose a relaxed ADMM involving an additional computation of an optimal step size and establish its global convergence under mild conditions.

1. Introduction

In various fields of applied mathematics and engineering, many problems can be equivalently formulated as a separable convex optimization problem with two blocks; that is, given two closed convex functions $f_i:\mathbb{R}^{n_i}\to\mathbb{R}\cup\{+\infty\}$, $i=1,2$, find a solution pair $(x_1^*,x_2^*)$ of the following problem:

(1) $\min\ f_1(x_1)+f_2(x_2)\quad\text{s.t.}\quad A_1x_1+A_2x_2=b$,

where $A_i$ is a matrix in $\mathbb{R}^{p\times n_i}$, $i=1,2$, and $b$ is a vector in $\mathbb{R}^p$. The classical alternating direction method of multipliers (ADMM) [1, 2] applied to problem (1) yields the following scheme:

(2) $x_1^{k+1}=\arg\min_{x_1\in\mathbb{R}^{n_1}}\ f_1(x_1)-\langle A_1^T\lambda^k,x_1\rangle+\frac{\beta}{2}\|A_1x_1+A_2x_2^k-b\|^2$,
$x_2^{k+1}=\arg\min_{x_2\in\mathbb{R}^{n_2}}\ f_2(x_2)-\langle A_2^T\lambda^k,x_2\rangle+\frac{\beta}{2}\|A_1x_1^{k+1}+A_2x_2-b\|^2$,
$\lambda^{k+1}=\lambda^k-\beta(A_1x_1^{k+1}+A_2x_2^{k+1}-b)$,

where $\lambda^k$ is a Lagrangian multiplier and $\beta>0$ is a penalty parameter. Possibly due to its simplicity and effectiveness, the ADMM with two blocks has received continuous attention in both theoretical and application domains. We refer to  for theoretical results on the ADMM with two blocks and to  for its efficient applications in high-dimensional statistics, compressive sensing, finance, image processing, and engineering, to name just a few.
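To make scheme (2) concrete, here is a minimal sketch of the two-block ADMM on a toy instance with $f_i(x_i)=\frac{1}{2}\|x_i-c_i\|^2$, so that each subproblem reduces to a small linear solve. All the data below ($A_1$, $A_2$, $b$, $c_1$, $c_2$, $\beta$) are invented for illustration and are not from the paper:

```python
import numpy as np

# Two-block ADMM, scheme (2), with f_i(x_i) = 0.5*||x_i - c_i||^2.
# Each argmin has a closed form: the x1-step solves
# (I + beta*A1'A1) x1 = c1 + A1'lam + beta*A1'(b - A2 x2).
rng = np.random.default_rng(0)
p, n1, n2 = 4, 3, 3
A1, A2 = rng.standard_normal((p, n1)), rng.standard_normal((p, n2))
c1, c2 = rng.standard_normal(n1), rng.standard_normal(n2)
b = rng.standard_normal(p)
beta = 1.0  # penalty parameter, chosen arbitrarily for this demo

x1, x2, lam = np.zeros(n1), np.zeros(n2), np.zeros(p)
for _ in range(2000):
    # x1-step: minimize f1(x1) - <A1'lam, x1> + (beta/2)||A1 x1 + A2 x2^k - b||^2
    x1 = np.linalg.solve(np.eye(n1) + beta * A1.T @ A1,
                         c1 + A1.T @ lam + beta * A1.T @ (b - A2 @ x2))
    # x2-step uses the *updated* x1 (Gauss-Seidel order)
    x2 = np.linalg.solve(np.eye(n2) + beta * A2.T @ A2,
                         c2 + A2.T @ lam + beta * A2.T @ (b - A1 @ x1))
    # multiplier update
    lam = lam - beta * (A1 @ x1 + A2 @ x2 - b)

residual = np.linalg.norm(A1 @ x1 + A2 @ x2 - b)
```

The Gauss-Seidel ordering (the $x_2$-step uses $x_1^{k+1}$, not $x_1^k$) is what distinguishes the ADMM from a Jacobi-type splitting of the augmented Lagrangian.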

In this paper, we concentrate on the linearly constrained convex programming problem with three blocks:

(3) $\min\ f_1(x_1)+f_2(x_2)+f_3(x_3)\quad\text{s.t.}\quad A_1x_1+A_2x_2+A_3x_3=b$,

where $f_3:\mathbb{R}^{n_3}\to\mathbb{R}\cup\{+\infty\}$ is a closed convex function and $A_3$ is a matrix in $\mathbb{R}^{p\times n_3}$. For solving (3), a natural idea is to extend the ADMM with two blocks to the ADMM with three blocks, in which the next iterate $(x_2^{k+1},x_3^{k+1},\lambda^{k+1})$ is updated by

(4) $(x_2^{k+1},x_3^{k+1},\lambda^{k+1}):=(\tilde x_2^k,\tilde x_3^k,\tilde\lambda^k)$,

where

(5) $\tilde x_1^k=\arg\min_{x_1\in\mathbb{R}^{n_1}}\ f_1(x_1)-\langle A_1^T\lambda^k,x_1\rangle+\frac{\beta}{2}\|A_1x_1+A_2x_2^k+A_3x_3^k-b\|^2$,
$\tilde x_2^k=\arg\min_{x_2\in\mathbb{R}^{n_2}}\ f_2(x_2)-\langle A_2^T\lambda^k,x_2\rangle+\frac{\beta}{2}\|A_1\tilde x_1^k+A_2x_2+A_3x_3^k-b\|^2$,
$\tilde x_3^k=\arg\min_{x_3\in\mathbb{R}^{n_3}}\ f_3(x_3)-\langle A_3^T\lambda^k,x_3\rangle+\frac{\beta}{2}\|A_1\tilde x_1^k+A_2\tilde x_2^k+A_3x_3-b\|^2$,
$\tilde\lambda^k=\lambda^k-\beta(A_1\tilde x_1^k+A_2\tilde x_2^k+A_3\tilde x_3^k-b)$.

Similar to the ADMM with two blocks, the ADMM with three blocks has found numerous applications in a broad spectrum of areas, such as doubly nonnegative cone programming , high-dimensional statistics [15, 16], imaging science , and engineering . Even though its numerical efficiency is clearly seen from those applications, the theoretical treatment of the ADMM with three blocks is challenging, and the convergence of the ADMM is still open under convexity assumptions on the objective function alone. To alleviate this difficulty, the authors of [19, 20] proposed prediction-correction type methods to solve general separable convex programs; however, numerical results show that the direct ADMM outperforms its variants substantially. Therefore, it is of great significance to investigate the theoretical performance of the ADMM with three blocks, even if only to provide sufficient conditions that guarantee convergence. To the best of our knowledge, there exist only two works aiming to attack the convergence problem of the direct ADMM with three blocks. By using an error bound analysis method, Hong and Luo  proved the linear convergence of the ADMM with m blocks for sufficiently small β subject to some technical conditions. However, the requirement that β be sufficiently small makes the algorithm difficult to implement.
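The direct three-block extension (4)-(5) can be sketched in the same way as the two-block scheme. The following illustrative code again uses $f_i(x_i)=\frac{1}{2}\|x_i-c_i\|^2$ (strongly convex with moduli $\mu_i=1$) and picks $\beta$ strictly inside the range allowed by condition (i) of Theorem 3 below; all data are invented for the demo:

```python
import numpy as np

# Direct three-block ADMM, scheme (5), on a toy strongly convex instance.
# With f_i(x_i) = 0.5*||x_i - c_i||^2 each subproblem is a linear solve.
rng = np.random.default_rng(0)
p, n = 4, 2
A = [rng.standard_normal((p, n)) for _ in range(3)]
A = [Ai / np.linalg.norm(Ai, 2) for Ai in A]   # normalize so ||A_i||_2 = 1
c = [rng.standard_normal(n) for _ in range(3)]
b = rng.standard_normal(p)
# mu_2 = mu_3 = 1 here; take beta = 0.9 * min{mu2/||A2||^2, mu3/||A3||^2},
# which satisfies condition (i) of Theorem 3 in the paper.
beta = 0.9 * min(1.0 / np.linalg.norm(A[1], 2) ** 2,
                 1.0 / np.linalg.norm(A[2], 2) ** 2)

x = [np.zeros(n) for _ in range(3)]
lam = np.zeros(p)

def block_step(i, others):
    # solve (I + beta*Ai'Ai) xi = ci + Ai'lam + beta*Ai'(b - others)
    Ai = A[i]
    return np.linalg.solve(np.eye(n) + beta * Ai.T @ Ai,
                           c[i] + Ai.T @ lam + beta * Ai.T @ (b - others))

for _ in range(5000):
    x[0] = block_step(0, A[1] @ x[1] + A[2] @ x[2])   # old x2, x3
    x[1] = block_step(1, A[0] @ x[0] + A[2] @ x[2])   # new x1, old x3
    x[2] = block_step(2, A[0] @ x[0] + A[1] @ x[1])   # new x1, x2
    lam = lam - beta * (A[0] @ x[0] + A[1] @ x[1] + A[2] @ x[2] - b)

residual = np.linalg.norm(A[0] @ x[0] + A[1] @ x[1] + A[2] @ x[2] - b)
```

This is only a sketch under the stated strong-convexity assumptions; for merely convex $f_i$, the paper notes that convergence of this direct scheme is an open question.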
In , Han and Yuan employed a contractive analysis method to establish the convergence of the ADMM under strong convexity assumptions on the $f_i$ and a parameter β less than a threshold depending on all the strong convexity moduli. In this paper, we first prove the convergence of the ADMM with three blocks under two conditions weaker than those of . In our conditions, the threshold on the parameter β relies only on the strong convexity moduli of $f_2$ and $f_3$; furthermore, $f_1$ is not necessarily strongly convex in one of our conditions. Also, the admissible range of β in this paper is shown to be at least three times as large as that of .

In order to accelerate the ADMM with three blocks, we also propose a relaxed ADMM with three blocks which involves an additional computation of an optimal step size. Specifically, with the triple $(x_2^k,x_3^k,\lambda^k)$, we first generate a predictor $(\tilde x_2^k,\tilde x_3^k,\tilde\lambda^k)$ according to (5) and then obtain the next iterate $(x_2^{k+1},x_3^{k+1},\lambda^{k+1})$ by

(6) $x_2^{k+1}=x_2^k-\gamma\alpha_k^*(x_2^k-\tilde x_2^k)$, $x_3^{k+1}=x_3^k-\gamma\alpha_k^*(x_3^k-\tilde x_3^k)$, $\lambda^{k+1}=\lambda^k-\gamma\alpha_k^*(\lambda^k-\tilde\lambda^k)$,

where $\gamma\in(0,2)$ and $\alpha_k^*$ is a special step size defined in (43). The convergence of the relaxed ADMM is also established under mild conditions. We should mention that it is possible to extend the analysis given in this paper to problems with more than three blocks of separability, but this is not the focus of this paper.

The remaining parts of this paper are organized as follows. In Section 2, we list some preliminaries on the strongly convex function, subdifferential, and the ADMM with three blocks. In Section 3, we first show the contractive property of the distance between the sequence generated by ADMM with three blocks and the solution set and then prove the convergence of ADMM under certain conditions. In Section 4, we extend the direct ADMM with three blocks to the relaxed ADMM with an optimal step size and establish its convergence under suitable conditions. We conclude our paper in Section 5.

Notation. For any positive integer $m$, let $I_m$ be the $m\times m$ identity matrix. We use $\|\cdot\|$ and $\|\cdot\|_2$ to denote the vector Euclidean norm and the spectral norm of matrices (defined as the maximum singular value). For any symmetric matrix $S\in\mathbb{R}^{n\times n}$, we write $\|x\|_S^2=x^TSx$ for any $x\in\mathbb{R}^n$. $G$ and $M$ are two $(n_2+n_3+p)\times(n_2+n_3+p)$ matrices defined by

(7) $G:=\begin{pmatrix}\beta A_2^TA_2&0&0\\0&\beta A_3^TA_3&0\\0&0&\frac{1}{\beta}I_p\end{pmatrix},\qquad M:=\begin{pmatrix}2\beta A_2^TA_2&0&0\\0&\beta A_3^TA_3&0\\0&0&\frac{1}{\beta}I_p\end{pmatrix}$,

respectively. For given $x_1\in\mathbb{R}^{n_1}$, $x_2\in\mathbb{R}^{n_2}$, $x_3\in\mathbb{R}^{n_3}$, and $\lambda\in\mathbb{R}^p$, we frequently use $u$ and $v$ to denote the joint vectors of $(x_2,x_3,\lambda)$ and $(x_1,x_2,x_3,\lambda)$, respectively; that is,

(8) $u=[x_2^T,x_3^T,\lambda^T]^T,\qquad v=[x_1^T,x_2^T,x_3^T,\lambda^T]^T$,

while $\tilde u$ and $\tilde v$ are the joint vectors corresponding to $(\tilde x_2,\tilde x_3,\tilde\lambda)$ and $(\tilde x_1,\tilde x_2,\tilde x_3,\tilde\lambda)$.
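The block-diagonal matrices in (7) and the $S$-norm are easy to materialize numerically. The following sketch (with made-up shapes and data) builds $G$ and $M$ and checks that $M$ differs from $G$ only in the $(1,1)$ block:

```python
import numpy as np

# Build G and M from (7) and evaluate ||x||_S^2 = x'Sx.
rng = np.random.default_rng(1)
n2, n3, p = 2, 3, 4
A2, A3 = rng.standard_normal((p, n2)), rng.standard_normal((p, n3))
beta = 0.5

def block_diag(*blocks):
    """Stack square blocks along the diagonal."""
    n = sum(B.shape[0] for B in blocks)
    out = np.zeros((n, n))
    i = 0
    for B in blocks:
        out[i:i + B.shape[0], i:i + B.shape[1]] = B
        i += B.shape[0]
    return out

G = block_diag(beta * A2.T @ A2, beta * A3.T @ A3, np.eye(p) / beta)
M = block_diag(2 * beta * A2.T @ A2, beta * A3.T @ A3, np.eye(p) / beta)

def s_norm_sq(x, S):
    """||x||_S^2 for symmetric S."""
    return float(x @ S @ x)

# M = G + diag(beta*A2'A2, 0, 0), the identity used in the proof of Lemma 2
extra = block_diag(beta * A2.T @ A2, np.zeros((n3, n3)), np.zeros((p, p)))
u = rng.standard_normal(n2 + n3 + p)
```

Since $G$ is built from Gram matrices and a positive multiple of the identity, it is positive semidefinite, so $\|u\|_G^2\ge 0$ for every $u$.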

2. Preliminaries

Throughout this paper, we assume $f_i$, $i=1,2,3$, are strongly convex functions with moduli $\mu_i\ge 0$; that is,

(9) $f_i((1-\alpha)z+\alpha z')\le(1-\alpha)f_i(z)+\alpha f_i(z')-\frac{1}{2}\mu_i\alpha(1-\alpha)\|z-z'\|^2,\quad\forall z,z'\in\mathbb{R}^{n_i}$,

for each $i$. Note that $f_i$ being strongly convex with modulus $0$ is equivalent to $f_i$ being convex. Let $x$ be a point of $\mathrm{dom}(f_i)$; the subdifferential of $f_i$ at $x$ is defined by

(10) $\partial f_i(x):=\{x^*\mid f_i(z)\ge f_i(x)+\langle x^*,z-x\rangle,\ \forall z\}$.

From Proposition 6 in , we know that, for each $i$, $\partial f_i$ is strongly monotone with modulus $\mu_i$, which means

(11) $\langle z_1-z_2,x_1-x_2\rangle\ge\mu_i\|x_1-x_2\|^2\ge 0,\quad\forall x_1,x_2,\ z_1\in\partial f_i(x_1),\ z_2\in\partial f_i(x_2)$.
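A quick numerical sanity check of the strong monotonicity property (11): the one-dimensional function $f(x)=\frac{\mu}{2}x^2+|x|$ is strongly convex with modulus $\mu$, and its subgradients are $\mu x+s$ with $s\in\partial|x|$. This toy check is ours, not from the paper:

```python
import numpy as np

# Verify <z1 - z2, x1 - x2> >= mu*||x1 - x2||^2 on random samples for
# f(x) = (mu/2)x^2 + |x|, whose subgradients are mu*x + s, s in [-1, 1]
# (s = sign(x) when x != 0).
mu = 0.7
rng = np.random.default_rng(2)
for _ in range(1000):
    x1, x2 = rng.uniform(-5, 5, size=2)
    # pick a valid subgradient at each point
    s1 = np.sign(x1) if x1 != 0 else rng.uniform(-1, 1)
    s2 = np.sign(x2) if x2 != 0 else rng.uniform(-1, 1)
    z1, z2 = mu * x1 + s1, mu * x2 + s2
    # (z1-z2)(x1-x2) = mu*(x1-x2)^2 + (s1-s2)(x1-x2), and the second
    # term is nonnegative because sign(.) is monotone
    assert (z1 - z2) * (x1 - x2) >= mu * (x1 - x2) ** 2 - 1e-12
```

The decomposition in the comment is exactly why the modulus of the quadratic part survives the addition of any convex (merely monotone) term.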

The next lemma introduced in  plays a key role in the convergence analysis of the ADMM and the relaxed ADMM with three blocks.

Lemma 1.

Let $(x_1^*,x_2^*,x_3^*,\lambda^*)$ be any KKT point of problem (3), and let $\tilde v^k$ be generated by (5) from a given $u^k$. Then one has

(12) $\langle\tilde u^k-u^*,G(u^k-\tilde u^k)\rangle\ge\sum_{i=1}^{3}\mu_i\|\tilde x_i^k-x_i^*\|^2+\Big\langle\lambda^k-\tilde\lambda^k,\sum_{i=2}^{3}A_i(x_i^k-\tilde x_i^k)\Big\rangle+\beta\langle A_3(\tilde x_3^k-x_3^*),A_2(\tilde x_2^k-x_2^k)\rangle$.

3. The ADMM with Three Blocks

In this section, we first investigate the contractive property of the distance between the sequence generated by the ADMM with three blocks and the solution set under the condition that $0<\beta\le\min\{\mu_2/\|A_2\|_2^2,\ \mu_3/\|A_3\|_2^2\}$.

Lemma 2.

Let $v^*=(x_1^*,x_2^*,x_3^*,\lambda^*)$ be a KKT point of problem (3), and let the sequence $\{v^k=(x_1^k,x_2^k,x_3^k,\lambda^k)\}$ be generated by the ADMM with three blocks. Then it holds that

(13) $\|u^{k+1}-u^*\|_M^2\le\|u^k-u^*\|_M^2-\beta\|A_3(x_3^{k+1}-x_3^k)\|^2-\beta\|A_1x_1^{k+1}+A_2x_2^k+A_3x_3^{k+1}-b\|^2-2\mu_1\|x_1^{k+1}-x_1^*\|^2-2\|x_2^{k+1}-x_2^*\|_{\mu_2I_{n_2}-\beta A_2^TA_2}^2-2\|x_3^{k+1}-x_3^*\|_{\mu_3I_{n_3}-\beta A_3^TA_3}^2$.

Proof.

Since $x_3^j$ minimizes $f_3(\cdot)-\langle A_3^T\lambda^j,\cdot\rangle$, we deduce from the first-order optimality condition that

(14) $A_3^T\lambda^j\in\partial f_3(x_3^j),\quad j=0,1,\ldots,k$.

By (14) and the monotonicity (11) of $\partial f_3(\cdot)$, it is easily seen that

(15) $\langle x_3^k-x_3^{k+1},A_3^T\lambda^k-A_3^T\lambda^{k+1}\rangle\ge 0$.

Then, for each $k$,

(16) $\langle u^{k+1}-u^*,G(u^k-u^{k+1})\rangle\ge\sum_{i=1}^{3}\mu_i\|x_i^{k+1}-x_i^*\|^2+\Big\langle\lambda^k-\lambda^{k+1},\sum_{i=2}^{3}A_i(x_i^k-x_i^{k+1})\Big\rangle+\beta\langle A_3(x_3^{k+1}-x_3^*),A_2(x_2^{k+1}-x_2^k)\rangle\ge\sum_{i=1}^{2}\mu_i\|x_i^{k+1}-x_i^*\|^2+\|x_3^{k+1}-x_3^*\|_{\mu_3I_{n_3}-\beta A_3^TA_3}^2+\langle\lambda^k-\lambda^{k+1},A_2(x_2^k-x_2^{k+1})\rangle-\frac{\beta}{4}\|A_2(x_2^{k+1}-x_2^k)\|^2$,

where the last "$\ge$" follows from (15) and the elementary inequality

(17) $\langle x,y\rangle\ge-\|x\|^2-\frac{1}{4}\|y\|^2$.

Since

(18) $\|A_3(x_3^{k+1}-x_3^k)\|^2\le 2\|A_3(x_3^{k+1}-x_3^*)\|^2+2\|A_3(x_3^k-x_3^*)\|^2$,

by direct computations we further obtain

(19) $\|u^k-u^*\|_G^2\ge\|u^{k+1}-u^*\|_G^2+\|u^{k+1}-u^k\|_G^2+2\mu_1\|x_1^{k+1}-x_1^*\|^2+2\|x_2^{k+1}-x_2^*\|_{\mu_2I_{n_2}-(\beta/2)A_2^TA_2}^2+2\|x_3^{k+1}-x_3^*\|_{\mu_3I_{n_3}-\beta A_3^TA_3}^2+2\langle\lambda^k-\lambda^{k+1},A_2(x_2^k-x_2^{k+1})\rangle-\beta\|A_2(x_2^k-x_2^*)\|^2$,

which, together with $G=M-\mathrm{diag}(\beta A_2^TA_2,0,0)$, implies

(20) $\|u^k-u^*\|_M^2\ge\|u^{k+1}-u^k\|_G^2+\|u^{k+1}-u^*\|_M^2+2\mu_1\|x_1^{k+1}-x_1^*\|^2+2\|x_2^{k+1}-x_2^*\|_{\mu_2I_{n_2}-\beta A_2^TA_2}^2+2\|x_3^{k+1}-x_3^*\|_{\mu_3I_{n_3}-\beta A_3^TA_3}^2+2\langle\lambda^k-\lambda^{k+1},A_2(x_2^k-x_2^{k+1})\rangle$.

Note that

(21) $\|x_2^k-x_2^{k+1}\|_{\beta A_2^TA_2}^2+2\langle\lambda^k-\lambda^{k+1},A_2(x_2^k-x_2^{k+1})\rangle+\frac{1}{\beta}\|\lambda^k-\lambda^{k+1}\|^2=\beta\|A_1x_1^{k+1}+A_2x_2^k+A_3x_3^{k+1}-b\|^2$,

which follows from $\lambda^k-\lambda^{k+1}=\beta(A_1x_1^{k+1}+A_2x_2^{k+1}+A_3x_3^{k+1}-b)$. Substituting (21) into (20) completes the proof of this lemma.

With the above preparation, we are ready to prove the convergence of the ADMM with three blocks for solving (3) under the following conditions.

Theorem 3.

Let $\{v^k=(x_1^k,x_2^k,x_3^k,\lambda^k)\}$ be the sequence generated by the ADMM with three blocks. Then $\{v^k\}$ converges to a KKT point of problem (3) if either of the following conditions holds:

(i) $\mu_1>0$ and $0<\beta\le\min\{\mu_2/\|A_2\|_2^2,\ \mu_3/\|A_3\|_2^2\}$;

(ii) $A_1$ is of full column rank, $0<\beta<\mu_2/\|A_2\|_2^2$, and $\beta\le\mu_3/\|A_3\|_2^2$.

Proof.

By the inequality (13), the sequence $\{(A_2x_2^k,A_3x_3^k,\lambda^k)\}$ is bounded. Recall that

(22) $A_1x_1^{k+1}=\frac{\lambda^k-\lambda^{k+1}}{\beta}-A_2x_2^{k+1}-A_3x_3^{k+1}+b$.

Hence $\{A_1x_1^k\}$ is also bounded. Moreover, from (13) we see immediately that

(23) $+\infty>\sum_{k=1}^{\infty}\big[\beta\|A_3(x_3^{k+1}-x_3^k)\|^2+\beta\|A_1x_1^{k+1}+A_2x_2^k+A_3x_3^{k+1}-b\|^2\big]+\sum_{k=1}^{\infty}\big[2\mu_1\|x_1^{k+1}-x_1^*\|^2+2\|x_2^{k+1}-x_2^*\|_{\mu_2I_{n_2}-\beta A_2^TA_2}^2+2\|x_3^{k+1}-x_3^*\|_{\mu_3I_{n_3}-\beta A_3^TA_3}^2\big]$.

According to the condition $0<\beta\le\min\{\mu_2/\|A_2\|_2^2,\mu_3/\|A_3\|_2^2\}$, we know

(24) $\sum_{k=1}^{\infty}\|A_3(x_3^{k+1}-x_3^k)\|^2<+\infty$, $\sum_{k=1}^{\infty}\|A_1x_1^{k+1}+A_2x_2^k+A_3x_3^{k+1}-b\|^2<+\infty$, $\sum_{k=1}^{\infty}\mu_1\|x_1^{k+1}-x_1^*\|^2<+\infty$, $\sum_{k=1}^{\infty}\|x_2^{k+1}-x_2^*\|_{\mu_2I_{n_2}-\beta A_2^TA_2}^2<+\infty$, $\sum_{k=1}^{\infty}\|x_3^{k+1}-x_3^*\|_{\mu_3I_{n_3}-\beta A_3^TA_3}^2<+\infty$.

It therefore holds that

(25) $\lim_{k\to\infty}\|A_3(x_3^{k+1}-x_3^k)\|^2=0$, $\lim_{k\to\infty}\|A_1x_1^{k+1}+A_2x_2^k+A_3x_3^{k+1}-b\|^2=0$,
(26) $\lim_{k\to\infty}\mu_1\|x_1^{k+1}-x_1^*\|^2=0$, $\lim_{k\to\infty}\|x_2^{k+1}-x_2^*\|_{\mu_2I_{n_2}-\beta A_2^TA_2}^2=0$, $\lim_{k\to\infty}\|x_3^{k+1}-x_3^*\|_{\mu_3I_{n_3}-\beta A_3^TA_3}^2=0$.

Therefore, the sequence $\{\mu_1\|x_1^k\|^2,\ \|x_2^k\|_{\mu_2I_{n_2}-\beta A_2^TA_2}^2,\ \|x_3^k\|_{\mu_3I_{n_3}-\beta A_3^TA_3}^2\}$ is bounded, which, together with the boundedness of $\{(A_1x_1^k,A_2x_2^k,A_3x_3^k,\lambda^k)\}$, implies that $\{(x_2^k,x_3^k,\lambda^k)\}$ is bounded and that $\{x_1^k\}$ is bounded provided $\mu_1>0$ or $A_1$ is of full column rank. Moreover, since

(27) $\mu_3\|x_3^{k+1}-x_3^k\|^2=\beta\|A_3(x_3^{k+1}-x_3^k)\|^2+\|x_3^{k+1}-x_3^k\|_{\mu_3I_{n_3}-\beta A_3^TA_3}^2$,

by the first equality in (25) and the third equality in (26), it holds that

(28) $\lim_{k\to\infty}\|x_3^{k+1}-x_3^k\|=0$.

We proceed to prove the convergence of the ADMM by considering the following two cases.

Case 1 ($\mu_1>0$ and $0<\beta\le\min\{\mu_2/\|A_2\|_2^2,\mu_3/\|A_3\|_2^2\}$). In this case, the sequence $\{x_1^k\}$ converges to $x_1^*$, and then

(29) $\lim_{k\to\infty}\|A_2x_2^{k+1}-A_2x_2^k\|=0,\qquad\lim_{k\to\infty}\|\lambda^{k+1}-\lambda^k\|=0$.

By the second equality in (26), we deduce from (29) that

(30) $\lim_{k\to\infty}\|x_2^{k+1}-x_2^k\|=0$.

Since $\{(x_2^k,x_3^k,\lambda^k)\}$ is bounded, there exist a triple $(x_2^\infty,x_3^\infty,\lambda^\infty)$ and a subsequence $\{n_k\}$ such that

(31) $\lim_{k\to\infty}x_2^{n_k}=x_2^\infty,\qquad\lim_{k\to\infty}x_3^{n_k}=x_3^\infty,\qquad\lim_{k\to\infty}\lambda^{n_k}=\lambda^\infty$,

which, by combining (25) and (29) with the given conditions, implies

(32) $\lim_{k\to\infty}x_2^{n_k+1}=x_2^\infty,\qquad\lim_{k\to\infty}x_3^{n_k+1}=x_3^\infty,\qquad\lim_{k\to\infty}\lambda^{n_k+1}=\lambda^\infty$.

Note that

(33) $0\in\partial f_1(x_1^{k+1})-A_1^T\lambda^{k+1}+\beta A_1^TA_2(x_2^k-x_2^{k+1})+\beta A_1^TA_3(x_3^k-x_3^{k+1})$,
$0\in\partial f_2(x_2^{k+1})-A_2^T\lambda^{k+1}+\beta A_2^TA_3(x_3^k-x_3^{k+1})$,
$0\in\partial f_3(x_3^{k+1})-A_3^T\lambda^{k+1}$,
$\lambda^{k+1}=\lambda^k-\beta(A_1x_1^{k+1}+A_2x_2^{k+1}+A_3x_3^{k+1}-b)$.

Then, by taking limits on both sides of (33), using (25) and (29), and invoking the upper semicontinuity of $\partial f_1(\cdot)$, $\partial f_2(\cdot)$, and $\partial f_3(\cdot)$ , one can immediately write

(34) $0\in\partial f_1(x_1^*)-A_1^T\lambda^\infty$, $0\in\partial f_2(x_2^\infty)-A_2^T\lambda^\infty$, $0\in\partial f_3(x_3^\infty)-A_3^T\lambda^\infty$, $A_1x_1^*+A_2x_2^\infty+A_3x_3^\infty=b$,

which indicates that $(x_1^*,x_2^\infty,x_3^\infty,\lambda^\infty)$ is a KKT point of problem (3). Hence, the inequality (13) remains valid if $(x_1^*,x_2^*,x_3^*,\lambda^*)$ is replaced by $(x_1^*,x_2^\infty,x_3^\infty,\lambda^\infty)$. Then it holds that

(35) $2\beta\|A_2x_2^{k+1}-A_2x_2^\infty\|^2+\beta\|A_3x_3^{k+1}-A_3x_3^\infty\|^2+\frac{1}{\beta}\|\lambda^{k+1}-\lambda^\infty\|^2\le 2\beta\|A_2x_2^k-A_2x_2^\infty\|^2+\beta\|A_3x_3^k-A_3x_3^\infty\|^2+\frac{1}{\beta}\|\lambda^k-\lambda^\infty\|^2$,

which yields

(36) $\lim_{k\to\infty}\|x_2^k-x_2^\infty\|_{A_2^TA_2}^2=0,\qquad\lim_{k\to\infty}\|x_3^k-x_3^\infty\|_{A_3^TA_3}^2=0$,
(37) $\lim_{k\to\infty}\lambda^k=\lambda^\infty$.

By adding the last two equalities in (26) to (36), we know

(38) $\lim_{k\to\infty}x_2^k=x_2^\infty,\qquad\lim_{k\to\infty}x_3^k=x_3^\infty$.

Therefore, we have shown that the whole sequence $\{(x_1^k,x_2^k,x_3^k,\lambda^k)\}$ converges to $(x_1^*,x_2^\infty,x_3^\infty,\lambda^\infty)$ under condition (i) of Theorem 3.

Case 2 ($A_1$ is of full column rank, $0<\beta<\mu_2/\|A_2\|_2^2$, and $\beta\le\mu_3/\|A_3\|_2^2$). In this case, the sequence $\{x_2^k\}$ converges to $x_2^*$ and $\{x_1^k\}$ is bounded. From the second equality in (25) and (28), we have

(39) $\lim_{k\to\infty}\|A_1x_1^{k+1}-A_1x_1^k\|=0,\qquad\lim_{k\to\infty}\|\lambda^k-\lambda^{k+1}\|=0$.

Since $A_1$ is of full column rank, it therefore holds that

(40) $\lim_{k\to\infty}\|x_1^{k+1}-x_1^k\|=0$.

Let $(x_1^\infty,x_3^\infty,\lambda^\infty)$ be a cluster point of the sequence $\{(x_1^k,x_3^k,\lambda^k)\}$. Following a proof similar to that of Case 1, we can show that $(x_1^\infty,x_2^*,x_3^\infty,\lambda^\infty)$ is a KKT point of problem (3) and that the whole sequence $\{(x_1^k,x_2^k,x_3^k,\lambda^k)\}$ converges to this point.

Remark 4 (see [22]).

In [22], the authors proved the convergence of the ADMM under the conditions that $f_1$, $f_2$, and $f_3$ are strongly convex and $0<\beta<\min_{1\le i\le 3}\{\mu_i/(3\|A_i\|_2^2)\}$. Our result improves the upper bound $\min_{1\le i\le 3}\{\mu_i/(3\|A_i\|_2^2)\}$ to $\min\{\mu_2/\|A_2\|_2^2,\ \mu_3/\|A_3\|_2^2\}$. Moreover, in our condition (ii), the strong convexity assumption is imposed only on $f_2$ and $f_3$, while $f_1$ need not be strongly convex with positive modulus.
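The admissible penalty threshold in condition (i) is straightforward to compute in practice, since the spectral norm is just the largest singular value. A small sketch, with made-up moduli and matrices:

```python
import numpy as np

# Compute the threshold of Theorem 3(i):
# beta <= min{mu2/||A2||_2^2, mu3/||A3||_2^2}.
# mu2, mu3 and the matrices are arbitrary sample data.
rng = np.random.default_rng(3)
A2, A3 = rng.standard_normal((5, 3)), rng.standard_normal((5, 2))
mu2, mu3 = 2.0, 1.5

def spec(A):
    # spectral norm = largest singular value
    return np.linalg.norm(A, 2)

beta_max = min(mu2 / spec(A2) ** 2, mu3 / spec(A3) ** 2)
# Any beta in (0, beta_max] satisfies condition (i) when mu1 > 0.
```

By contrast, the earlier bound of [22] takes the minimum over all three moduli with an extra factor of 3, so with equal moduli and equal norms the range above is three times as large.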

4. The Relaxed ADMM with Three Blocks

For the ADMM with two blocks, Ye and Yuan  developed a variant of the alternating direction method with an optimal step size. Numerical results demonstrated that the additional computation of the optimal step size improves the efficiency of this variant of the ADMM. In this section, by adopting the essential idea of Ye and Yuan , we propose a relaxed ADMM with three blocks to accelerate the ADMM via an optimal step size. For notational simplicity, we write

(41) $\Phi(u^k,\tilde u^k):=\frac{3\beta}{4}\|A_2(x_2^k-\tilde x_2^k)\|^2+\beta\|A_3(x_3^k-\tilde x_3^k)\|^2+\frac{1}{\beta}\|\lambda^k-\tilde\lambda^k\|^2+\langle\lambda^k-\tilde\lambda^k,A_2(x_2^k-\tilde x_2^k)+A_3(x_3^k-\tilde x_3^k)\rangle$.

With $u^k=(x_2^k,x_3^k,\lambda^k)$, the new iterate of the relaxed ADMM is produced by

(42) $u^{k+1}=u^k-\gamma\alpha_k^*(u^k-\tilde u^k),\quad\gamma\in(0,2)$,

where $\tilde u^k$ is the solution of (5) and $\alpha_k^*$ is defined by

(43) $\alpha_k^*:=\frac{\Phi(u^k,\tilde u^k)}{\|u^k-\tilde u^k\|_G^2}$.
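The step-size computation (41)-(43) and the relaxed update (42) can be sketched as follows. The iterates and predictor here are random placeholders standing in for the output of (5); the shapes and data are made up:

```python
import numpy as np

# Compute Phi(u, u~) from (41), alpha* from (43), and apply the
# relaxed update (42).  (x2t, x3t, lamt) plays the role of the
# predictor produced by scheme (5).
rng = np.random.default_rng(4)
p, n2, n3 = 4, 3, 2
A2, A3 = rng.standard_normal((p, n2)), rng.standard_normal((p, n3))
beta, gamma = 0.8, 1.5

x2, x3, lam = rng.standard_normal(n2), rng.standard_normal(n3), rng.standard_normal(p)
x2t, x3t, lamt = rng.standard_normal(n2), rng.standard_normal(n3), rng.standard_normal(p)

d2, d3, dl = A2 @ (x2 - x2t), A3 @ (x3 - x3t), lam - lamt
phi = (0.75 * beta * d2 @ d2 + beta * d3 @ d3 + dl @ dl / beta
       + dl @ (d2 + d3))
g_norm_sq = beta * d2 @ d2 + beta * d3 @ d3 + dl @ dl / beta  # ||u - u~||_G^2
alpha = phi / g_norm_sq

# Lemma 5 guarantees alpha >= 1/6; the bound (44)-(45) is purely algebraic
x2_new = x2 - gamma * alpha * (x2 - x2t)
x3_new = x3 - gamma * alpha * (x3 - x3t)
lam_new = lam - gamma * alpha * (lam - lamt)
```

Note that the extra cost per iteration is only a handful of inner products, which is the practical appeal of this relaxation.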

Lemma 5.

Let the sequence $\{u^k\}$ be generated by the relaxed ADMM with three blocks. Then, if $0<\beta\le\mu_3/\|A_3\|_2^2$, the following statements are valid:

$\Phi(u^k,\tilde u^k)\ge\frac{1}{6}\|u^k-\tilde u^k\|_G^2$ and thus $\alpha_k^*\ge\frac{1}{6}$;

$\|u^{k+1}-u^*\|_G^2\le\|u^k-u^*\|_G^2-\frac{1}{36}\gamma(2-\gamma)\|u^k-\tilde u^k\|_G^2-\frac{1}{3}\gamma\mu_1\|\tilde x_1^k-x_1^*\|^2-\frac{1}{3}\gamma\mu_2\|\tilde x_2^k-x_2^*\|^2-\frac{1}{3}\gamma\|\tilde x_3^k-x_3^*\|_{\mu_3I_{n_3}-\beta A_3^TA_3}^2$.

Proof.

By direct computation on $\Phi(u^k,\tilde u^k)$, we know that

(44) $\Phi(u^k,\tilde u^k)=\frac{3\beta}{4}\|A_2(x_2^k-\tilde x_2^k)\|^2+\beta\|A_3(x_3^k-\tilde x_3^k)\|^2+\frac{1}{\beta}\|\lambda^k-\tilde\lambda^k\|^2+\langle\lambda^k-\tilde\lambda^k,A_2(x_2^k-\tilde x_2^k)+A_3(x_3^k-\tilde x_3^k)\rangle\ge\frac{3\beta}{4}\|A_2(x_2^k-\tilde x_2^k)\|^2+\beta\|A_3(x_3^k-\tilde x_3^k)\|^2+\frac{1}{\beta}\|\lambda^k-\tilde\lambda^k\|^2-\frac{\beta}{2}\|A_2(x_2^k-\tilde x_2^k)\|^2-\frac{1}{2\beta}\|\lambda^k-\tilde\lambda^k\|^2-\frac{3\beta}{4}\|A_3(x_3^k-\tilde x_3^k)\|^2-\frac{1}{3\beta}\|\lambda^k-\tilde\lambda^k\|^2=\frac{\beta}{4}\|A_2(x_2^k-\tilde x_2^k)\|^2+\frac{\beta}{4}\|A_3(x_3^k-\tilde x_3^k)\|^2+\frac{1}{6\beta}\|\lambda^k-\tilde\lambda^k\|^2$,

where the inequality follows from the Cauchy-Schwarz inequality. It therefore holds that

(45) $\Phi(u^k,\tilde u^k)\ge\frac{1}{6}\|u^k-\tilde u^k\|_G^2$,

which completes the proof of the first part. By Lemma 1 and the elementary inequality (17), it can be easily verified that

(46) $\langle u^k-u^*,G(u^k-\tilde u^k)\rangle\ge\Phi(u^k,\tilde u^k)+\mu_1\|\tilde x_1^k-x_1^*\|^2+\mu_2\|\tilde x_2^k-x_2^*\|^2+\|\tilde x_3^k-x_3^*\|_{\mu_3I_{n_3}-\beta A_3^TA_3}^2$,

and then

(47) $\|u^{k+1}-u^*\|_G^2=\|u^k-u^*-\gamma\alpha_k^*(u^k-\tilde u^k)\|_G^2\le\|u^k-u^*\|_G^2-\gamma(2-\gamma)(\alpha_k^*)^2\|u^k-\tilde u^k\|_G^2-2\gamma\alpha_k^*\mu_1\|\tilde x_1^k-x_1^*\|^2-2\gamma\alpha_k^*\mu_2\|\tilde x_2^k-x_2^*\|^2-2\gamma\alpha_k^*\|\tilde x_3^k-x_3^*\|_{\mu_3I_{n_3}-\beta A_3^TA_3}^2$.

This, together with the fact that $\alpha_k^*\ge 1/6$, completes the proof.

Based on the above inequality, we are able to prove the following convergence result of the relaxed ADMM with three blocks. Since the proof is in line with that of Theorem 3, we omit it.

Theorem 6.

Let $\{v^k=(x_1^k,x_2^k,x_3^k,\lambda^k)\}$ be the sequence generated by the relaxed ADMM. Then $\{v^k\}$ converges to a KKT point of problem (3) under the conditions that $0<\beta\le\mu_3/\|A_3\|_2^2$ and $A_1$, $A_2$, and $A_3$ are of full column rank.

5. Concluding Remarks

In this paper, we have taken a step toward understanding the ADMM for separable convex programming problems with three blocks. Based on a contractive analysis of the distance between the iterates and the solution set, we established theoretical results that guarantee the global convergence of the ADMM with three blocks under conditions weaker than those employed in . By adopting the essential idea of , we also presented a relaxed ADMM with an optimal step size to accelerate the ADMM and proved its convergence under mild assumptions.

Acknowledgment

The first author is supported by the Natural Science Foundation of Jiangsu Province and the National Natural Science Foundation of China under Project no. 71271112. The second author is supported by the University Natural Science Research Fund of Jiangsu Province under Grant no. 13KJD110002.

References

1. D. Gabay and B. Mercier, "A dual algorithm for the solution of nonlinear variational problems via finite element approximations," Computers & Mathematics with Applications, vol. 2, pp. 17-40, 1976.
2. R. Glowinski and A. Marrocco, "Sur l'approximation, par éléments finis d'ordre un, et la résolution, par pénalisation-dualité, d'une classe de problèmes de Dirichlet non linéaires," vol. 9, no. R-2, pp. 41-76, 1975.
3. J. Eckstein and D. Bertsekas, "On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators," Mathematical Programming, vol. 55, no. 3, pp. 293-318, 1992.
4. D. Gabay, "Chapter IX: Applications of the method of multipliers to variational inequalities," Studies in Mathematics and Its Applications, vol. 15, pp. 299-331, 1983.
5. R. Glowinski, Numerical Methods for Nonlinear Variational Problems, Springer, New York, NY, USA, 1984.
6. R. Glowinski and P. Le Tallec, Augmented Lagrangian and Operator-Splitting Methods in Nonlinear Mechanics, vol. 9, SIAM, Philadelphia, PA, USA, 1989.
7. B. He, L. Liao, D. Han, and H. Yang, "A new inexact alternating directions method for monotone variational inequalities," Mathematical Programming, vol. 92, no. 1, pp. 103-118, 2002.
8. B. He and X. Yuan, "On the O(1/n) convergence rate of the Douglas-Rachford alternating direction method," SIAM Journal on Numerical Analysis, vol. 50, no. 2, pp. 700-709, 2012.
9. C. Chen, B. He, and X. Yuan, "Matrix completion via an alternating direction method," IMA Journal of Numerical Analysis, vol. 32, no. 1, pp. 227-245, 2012.
10. M. Fazel, T. K. Pong, D. F. Sun, and P. Tseng, "Hankel matrix rank minimization with applications to system identification and realization," SIAM Journal on Matrix Analysis and Applications, vol. 34, no. 3, pp. 946-977, 2012.
11. B. He, M. Xu, and X. Yuan, "Solving large-scale least squares semidefinite programming by alternating direction methods," SIAM Journal on Matrix Analysis and Applications, vol. 32, no. 1, pp. 136-152, 2011.
12. J. Yang, Y. Zhang, and W. Yin, "A fast alternating direction method for TVL1-L2 signal reconstruction from partial Fourier data," IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 2, pp. 288-297, 2010.
13. J. Yang and Y. Zhang, "Alternating direction method algorithms for l1-problems in compressive sensing," SIAM Journal on Scientific Computing, vol. 33, no. 1, pp. 250-278, 2011.
14. Z. Wen, D. Goldfarb, and W. Yin, "Alternating direction augmented Lagrangian methods for semidefinite programming," Mathematical Programming Computation, vol. 2, no. 3-4, pp. 203-230, 2010.
15. M. Tao and X. Yuan, "Recovering low-rank and sparse components of matrices from incomplete and noisy observations," SIAM Journal on Optimization, vol. 21, no. 1, pp. 57-81, 2011.
16. J. Yang, D. Sun, and K. Toh, "A proximal point algorithm for log-determinant optimization with group Lasso regularization," SIAM Journal on Optimization, vol. 23, no. 2, pp. 857-893, 2013.
17. Y. Peng, A. Ganesh, J. Wright, W. Xu, and Y. Ma, "RASL: robust alignment by sparse and low-rank decomposition for linearly correlated images," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '10), pp. 763-770, 2010.
18. K. Mohan, P. London, M. Fazel, D. Witten, and S.-I. Lee, "Node-based learning of multiple Gaussian graphical models," http://arxiv.org/abs/1303.5145.
19. B. He, M. Tao, M. H. Xu, and X. M. Yuan, "Alternating directions based contraction method for generally separable linearly constrained convex programming problems," Optimization, vol. 62, no. 4, pp. 573-596, 2013.
20. B. He, M. Tao, and X. Yuan, "Alternating direction method with Gaussian back substitution for separable convex programming," SIAM Journal on Optimization, vol. 22, no. 2, pp. 313-340, 2012.
21. M. Hong and Z. Luo, "On the linear convergence of the alternating direction method of multipliers," http://arxiv.org/abs/1208.3922.
22. D. Han and X. Yuan, "A note on the alternating direction method of multipliers," Journal of Optimization Theory and Applications, vol. 155, no. 1, pp. 227-238, 2012.
23. R. Rockafellar, "Monotone operators and the proximal point algorithm," SIAM Journal on Control and Optimization, vol. 14, no. 5, pp. 877-898, 1976.
24. R. Rockafellar, Convex Analysis, Princeton Mathematical Series no. 28, Princeton University Press, Princeton, NJ, USA, 1970.
25. C. Ye and X.-M. Yuan, "A descent method for structured monotone variational inequalities," Optimization Methods & Software, vol. 22, no. 2, pp. 329-338, 2007.