We present a projection algorithm that modifies the method proposed by Censor and Elfving (1994) and introduce a self-adaptive algorithm for the multiple-sets split feasibility problem (MSFP). We first investigate the global rates of convergence of the two algorithms and then prove that the sequences they generate converge to a solution of the MSFP. The efficiency of the proposed algorithms is illustrated by some numerical tests.
1. Introduction
The multiple-sets split feasibility problem (MSFP) is to find x* satisfying
$$x^* \in C := \bigcap_{i=1}^{t} C_i \quad \text{such that} \quad Ax^* \in Q := \bigcap_{j=1}^{r} Q_j, \tag{1}$$
where $A$ is an $M \times N$ real matrix and $C_i \subseteq \mathbb{R}^N$, $i = 1, \ldots, t$, and $Q_j \subseteq \mathbb{R}^M$, $j = 1, \ldots, r$, are nonempty closed convex sets. This problem was first proposed by Censor et al. in [1] and can serve as a model for many inverse problems in which constraints are imposed on the solutions in the domain of a linear operator as well as in the operator's range. Many researchers have studied the MSFP and introduced various algorithms to solve it (see [1–7] and the references therein). If $t = r = 1$, then this problem reduces to the feasible case of the split feasibility problem (see, e.g., [8–11]), which is to find $x^* \in C$ with $Ax^* \in Q$.
Assume that the MSFP (1) is consistent; that is, its solution set, denoted by Γ, is nonempty. For convenience reasons, Censor et al. [1] considered the following constrained MSFP:
$$\text{find } x^* \in \Omega \text{ such that } x^* \text{ solves the MSFP}, \tag{2}$$
where $\Omega \subseteq \mathbb{R}^N$ is an auxiliary simple nonempty closed convex set containing at least one solution of the MSFP. For solving the constrained MSFP, Censor et al. [1] defined a proximity function $p(x)$ that measures the distance of a point to all the sets:
$$p(x) = \frac{1}{2} \sum_{i=1}^{t} \alpha_i \|x - P_{C_i}(x)\|^2 + \frac{1}{2} \sum_{j=1}^{r} \beta_j \|Ax - P_{Q_j}(Ax)\|^2, \tag{3}$$
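where $\alpha_i > 0$ and $\beta_j > 0$ for all $i$ and $j$, respectively, $\sum_{i=1}^{t} \alpha_i + \sum_{j=1}^{r} \beta_j = 1$, and $P_S$ denotes the orthogonal projection onto a closed convex set $S$. We see that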
$$\nabla p(x) = \sum_{i=1}^{t} \alpha_i \bigl(x - P_{C_i}(x)\bigr) + \sum_{j=1}^{r} \beta_j A^T (I - P_{Q_j}) Ax. \tag{4}$$
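As an illustration of (3) and (4), the following Python sketch evaluates the proximity function and its gradient when every set $C_i$ is a ball and every set $Q_j$ is a box, two cases in which the projections have closed forms. The helper names (`proj_ball`, `proj_box`, `proximity`, `grad_proximity`) and the small data set are hypothetical and chosen only for this example; NumPy is assumed to be available.

```python
import numpy as np

def proj_ball(center, radius):
    """Return the projection map onto the ball {z : ||z - center|| <= radius}."""
    def P(x):
        d = x - center
        dist = np.linalg.norm(d)
        return x if dist <= radius else center + radius * d / dist
    return P

def proj_box(lower, upper):
    """Return the projection map onto the box {y : lower <= y <= upper}."""
    return lambda y: np.clip(y, lower, upper)

def proximity(x, A, projs_C, projs_Q, alpha, beta):
    """Proximity function p(x) as in (3)."""
    Ax = A @ x
    val_C = sum(a * np.sum((x - P(x)) ** 2) for a, P in zip(alpha, projs_C))
    val_Q = sum(b * np.sum((Ax - P(Ax)) ** 2) for b, P in zip(beta, projs_Q))
    return 0.5 * (val_C + val_Q)

def grad_proximity(x, A, projs_C, projs_Q, alpha, beta):
    """Gradient of p as in (4)."""
    Ax = A @ x
    g_C = sum(a * (x - P(x)) for a, P in zip(alpha, projs_C))
    g_Q = sum(b * (Ax - P(Ax)) for b, P in zip(beta, projs_Q))
    return g_C + A.T @ g_Q

# Illustrative data (assumed, not from the paper): two balls and one box in R^2.
A = np.array([[1.0, 2.0], [0.0, 1.0]])
projs_C = [proj_ball(np.zeros(2), 1.0), proj_ball(np.array([0.5, 0.0]), 1.0)]
projs_Q = [proj_box(0.0, 1.0)]
alpha, beta = [0.3, 0.3], [0.4]          # positive weights summing to 1

x = np.array([3.0, -2.0])
print(proximity(x, A, projs_C, projs_Q, alpha, beta))
print(grad_proximity(x, A, projs_C, projs_Q, alpha, beta))
```

Passing the projections as callables keeps the two functions independent of the particular shape of the sets, so other closed convex sets with computable projections could be substituted.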
Censor et al. [1] proposed a projection algorithm as follows:
$$x_{n+1} = P_\Omega\bigl(x_n - s \nabla p(x_n)\bigr), \tag{5}$$
where $s$ is a positive number such that $0 < s_L \le s \le s_U < 2/L(p)$ and $L(p)$ is the Lipschitz constant of $\nabla p$.
Observe that in algorithm (5) the determination of the stepsize $s$ depends on the operator (matrix) norm $\|A\|$ (or the largest eigenvalue of $A^TA$). This means that, in order to implement algorithm (5), one first has to compute (or at least estimate) the operator norm of $A$, which is in general not an easy task in practice. To overcome this difficulty, Zhang et al. [4] and Zhao and Yang [5, 7] proposed self-adaptive methods in which the stepsize does not depend on matrix norms; instead, these methods compute the stepsize by different self-adaptive strategies.
Note that the algorithms proposed by Censor et al. [1], Zhang et al. [4], and Zhao and Yang [5, 7] involve the projection onto an auxiliary set $\Omega$. In fact, the set $\Omega$ is introduced just for the convenience of the convergence proof, and it may be difficult to determine $\Omega$ in some cases. Considering this, Zhao and Yang [6] presented simple projection algorithms which do not need projection onto an auxiliary set $\Omega$.
In this paper, inspired by Beck and Teboulle's iterative shrinkage-thresholding algorithm for linear inverse problems [12], we introduce two projection algorithms for solving the MSFP. The first algorithm modifies Censor et al.'s method so that it does not need projection onto an auxiliary set $\Omega$. The second algorithm is self-adaptive and adopts a backtracking rule to determine the stepsize. We first study the global rate of convergence of the two algorithms and then prove that the sequences generated by the proposed algorithms converge to a solution of the MSFP. Some numerical results are presented, which illustrate the efficiency of the proposed algorithms.
2. Preliminaries
In this section, we review some definitions and lemmas which will be used in the main results.
The following lemma is not hard to prove (see [1, 13]).
Lemma 1.
Let p be given as in (3). Then
(i) $p$ is convex and continuously differentiable;
(ii) $\nabla p(x)$ is Lipschitz continuous with Lipschitz constant $L(p) = \sum_{i=1}^{t} \alpha_i + \rho(A^TA) \sum_{j=1}^{r} \beta_j$, where $\rho(A^TA)$ is the spectral radius of the matrix $A^TA$.
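For a concrete matrix, the constant $L(p)$ of Lemma 1 can be evaluated directly. The sketch below is only an illustration: the function name `lipschitz_constant` and the diagonal test matrix are assumptions made here, and the spectral radius is obtained from NumPy's symmetric eigenvalue routine.

```python
import numpy as np

def lipschitz_constant(A, alpha, beta):
    """L(p) = sum(alpha) + rho(A^T A) * sum(beta), following Lemma 1."""
    # A^T A is symmetric positive semidefinite, so its spectral radius is its
    # largest eigenvalue; eigvalsh returns the eigenvalues in ascending order.
    rho = np.linalg.eigvalsh(A.T @ A)[-1]
    return sum(alpha) + rho * sum(beta)

# Illustrative data (assumed): a diagonal matrix with rho(A^T A) = 4^2 = 16,
# so that in exact arithmetic L(p) = 0.5 + 16 * 0.5 = 8.5.
A = np.diag([3.0, 4.0])
L = lipschitz_constant(A, alpha=[0.5], beta=[0.5])
print(L)
```

For large matrices a full eigendecomposition may be expensive, which is the practical difficulty that motivates the self-adaptive stepsize rules discussed in the Introduction.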
For any τ>0, consider the following quadratic approximation of p(x) at a given point y:
$$R_\tau(x, y) := p(y) + \langle x - y, \nabla p(y) \rangle + \frac{\tau}{2} \|x - y\|^2, \tag{6}$$
which admits a unique minimizer
$$F_\tau(y) := \operatorname{argmin} \bigl\{ R_\tau(x, y) : x \in \mathbb{R}^N \bigr\}. \tag{7}$$
Simple algebra shows that (ignoring constant terms in y)
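$$F_\tau(y) = \operatorname*{argmin}_{x} \left\{ \frac{\tau}{2} \left\| x - \left( y - \frac{1}{\tau} \nabla p(y) \right) \right\|^2 \right\} = y - \frac{1}{\tau} \nabla p(y). \tag{8}$$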
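For completeness, the algebra behind (8) amounts to completing the square in (6). Writing $g = \nabla p(y)$ as shorthand (a symbol introduced only for this display), one can check that

```latex
\begin{aligned}
R_\tau(x, y)
  &= p(y) + \langle x - y, g \rangle + \frac{\tau}{2}\|x - y\|^2 \\
  &= \frac{\tau}{2}\Bigl\| x - \Bigl(y - \frac{1}{\tau} g\Bigr) \Bigr\|^2
     + p(y) - \frac{1}{2\tau}\|g\|^2 .
\end{aligned}
```

The last two terms do not depend on $x$, so the unique minimizer over $x$ is $y - g/\tau$, in agreement with (8).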
The following lemma is well known and states a fundamental property of smooth functions in the class $C^{1,1}$; see, for example, [14, 15].
Lemma 2.
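Let $f : \mathbb{R}^n \to \mathbb{R}$ be a continuously differentiable function with Lipschitz continuous gradient and Lipschitz constant $L(f)$. Then, for any $L \ge L(f)$,
$$f(x) \le f(y) + \langle x - y, \nabla f(y) \rangle + \frac{L}{2} \|x - y\|^2 \quad \text{for every } x, y \in \mathbb{R}^n. \tag{9}$$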
We are now ready to state and prove the promised key result.
Lemma 3 (see [12]).
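Let $y \in \mathbb{R}^n$ and $\tau > 0$ be such that
$$p(F_\tau(y)) \le R_\tau(F_\tau(y), y). \tag{10}$$
Then for any $x \in \mathbb{R}^n$,
$$p(x) - p(F_\tau(y)) \ge \frac{\tau}{2} \|F_\tau(y) - y\|^2 + \tau \langle y - x, F_\tau(y) - y \rangle. \tag{11}$$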
Proof.
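From (10), we have
$$p(x) - p(F_\tau(y)) \ge p(x) - R_\tau(F_\tau(y), y). \tag{12}$$
Now, since $p$ is convex, it follows that
$$p(x) \ge p(y) + \langle x - y, \nabla p(y) \rangle. \tag{13}$$
On the other hand, by the definition of $R_\tau(x, y)$, one has
$$R_\tau(F_\tau(y), y) = p(y) + \langle F_\tau(y) - y, \nabla p(y) \rangle + \frac{\tau}{2} \|F_\tau(y) - y\|^2. \tag{14}$$
Therefore, using (12)–(14), it follows that
$$\begin{aligned} p(x) - p(F_\tau(y)) &\ge -\frac{\tau}{2} \|F_\tau(y) - y\|^2 + \langle x - F_\tau(y), \nabla p(y) \rangle \\ &= -\frac{\tau}{2} \|F_\tau(y) - y\|^2 + \tau \langle x - F_\tau(y), y - F_\tau(y) \rangle \\ &= \frac{\tau}{2} \|F_\tau(y) - y\|^2 + \tau \langle y - x, F_\tau(y) - y \rangle, \end{aligned} \tag{15}$$
where in the first equality above we used (8).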
Remark 4.
Note that, from Lemmas 1 and 2, it follows that if τ≥L(p), then the condition (10) is always satisfied for Fτ(y).
3. Two Projection Algorithms
In this section, we propose two projection algorithms which do not need an auxiliary set $\Omega$: one modifies the algorithm introduced by Censor et al. [1], and the other is a self-adaptive algorithm which solves the MSFP without prior knowledge of the spectral radius of the matrix $A^TA$.
Algorithm 5.
Let $L_1 > L(p)$ be a fixed constant and take $\tau_n \in (L(p), L_1)$. Let $x_0$ be arbitrary. For $n = 0, 1, 2, \ldots$, compute
$$x_{n+1} = x_n - \frac{1}{\tau_n} \nabla p(x_n). \tag{16}$$
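The iteration (16) can be sketched in a few lines of Python. The toy instance below (one unit ball $C$, one box $Q$, and a $2 \times 2$ matrix) is hypothetical and serves only to make the sketch runnable; the function name `algorithm5`, the stopping tolerance, and the iteration cap are likewise assumptions of this example rather than part of the algorithm's statement.

```python
import numpy as np

# Hypothetical toy instance (not from the paper): t = r = 1 in R^2.
A = np.array([[1.0, 0.5], [0.0, 1.0]])
alpha, beta = 0.5, 0.5                  # weights with alpha + beta = 1

def proj_C(x):                          # unit ball centred at the origin
    nrm = np.linalg.norm(x)
    return x if nrm <= 1.0 else x / nrm

def proj_Q(y):                          # box [0.2, 0.8]^2
    return np.clip(y, 0.2, 0.8)

def p(x):                               # proximity function (3)
    Ax = A @ x
    return 0.5 * (alpha * np.sum((x - proj_C(x)) ** 2)
                  + beta * np.sum((Ax - proj_Q(Ax)) ** 2))

def grad_p(x):                          # gradient (4)
    Ax = A @ x
    return alpha * (x - proj_C(x)) + beta * (A.T @ (Ax - proj_Q(Ax)))

# Lipschitz constant of grad_p from Lemma 1
L = alpha + beta * np.linalg.eigvalsh(A.T @ A)[-1]

def algorithm5(x0, tau, eps=1e-9, max_iter=20000):
    """Iterate x <- x - grad_p(x) / tau with a constant tau > L(p)."""
    x = np.asarray(x0, dtype=float)
    history = [p(x)]                    # values p(x_0), p(x_1), ...
    while history[-1] >= eps and len(history) <= max_iter:
        x = x - grad_p(x) / tau
        history.append(p(x))
    return x, history

x, history = algorithm5([5.0, -3.0], tau=1.01 * L)
print(len(history) - 1, history[-1])    # iterations used, final value of p
```

A constant $\tau_n$ is used here for simplicity; any sequence with $\tau_n \in (L(p), L_1)$ would fit the same loop.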
Remark 6.
Algorithm 5 differs from Censor et al.'s algorithm (5) in [1] in that it does not need the projection onto an auxiliary simple nonempty closed convex set $\Omega$. In Algorithm 5 we take $\tau_n > L(p)$ instead of $\tau_n > L(p)/2$ (which corresponds to the condition $s < 2/L(p)$ in Censor et al.'s algorithm), so the stepsize $1/\tau_n$ is restricted to a smaller range.
Algorithm 7.
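Given $\gamma > 0$ and $\eta > 1$, let $x_0$ be arbitrary. For $n = 0, 1, 2, \ldots$, find the smallest nonnegative integer $m_n$ such that, with $\tau_n = \gamma \eta^{m_n}$, the point
$$x_{n+1} = x_n - \frac{1}{\tau_n} \nabla p(x_n) \tag{17}$$
satisfies
$$p(x_{n+1}) - p(x_n) + \langle \nabla p(x_n), x_n - x_{n+1} \rangle \le \frac{\tau_n}{2} \|x_n - x_{n+1}\|^2. \tag{18}$$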
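A minimal Python sketch of this backtracking scheme is given below. It takes the proximity function and its gradient as callables; the function name `algorithm7`, the stopping tolerance, the iteration cap, and the toy instance used to exercise it are all assumptions of this example. The counter `inner_total` records how many times $\tau$ is multiplied by $\eta$, which is one possible way of counting inner iterations and may differ from the convention used in the reported experiments.

```python
import numpy as np

def algorithm7(p, grad_p, x0, gamma=1.0, eta=1.1, eps=1e-9, max_iter=5000):
    """Gradient steps x <- x - grad_p(x)/tau with tau chosen by backtracking."""
    x = np.asarray(x0, dtype=float)
    n, inner_total = 0, 0
    while p(x) >= eps and n < max_iter:
        g = grad_p(x)
        tau = gamma                       # the search restarts from gamma each time
        while True:
            x_new = x - g / tau           # candidate point, cf. (17)
            d = x - x_new
            # acceptance test, cf. (18)
            if p(x_new) - p(x) + g @ d <= 0.5 * tau * (d @ d):
                break
            tau *= eta                    # tau = gamma * eta**m for m = 1, 2, ...
            inner_total += 1
        x, n = x_new, n + 1
    return x, n, inner_total

# Hypothetical toy instance (not from the paper): unit ball C, box Q = [0.5, 1]^2.
A = np.array([[2.0, 0.0], [0.0, 1.0]])

def res_C(x):                             # x - P_C(x)
    nrm = np.linalg.norm(x)
    return x - (x if nrm <= 1.0 else x / nrm)

def res_Q(x):                             # Ax - P_Q(Ax)
    Ax = A @ x
    return Ax - np.clip(Ax, 0.5, 1.0)

p = lambda x: 0.25 * (np.sum(res_C(x) ** 2) + np.sum(res_Q(x) ** 2))
grad_p = lambda x: 0.5 * (res_C(x) + A.T @ res_Q(x))

x0 = np.array([4.0, -4.0])
x, n, inner_total = algorithm7(p, grad_p, x0)
print(n, inner_total, p(x))
```

No spectral information about $A$ enters the routine; only evaluations of $p$ and $\nabla p$ are needed, which is the point of the self-adaptive rule.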
Remark 8.
Note that the sequence of function values $\{p(x_n)\}$ produced by Algorithms 5 and 7 is nonincreasing. Indeed, for every $n \ge 1$,
$$p(x_{n+1}) \le R_{\tau_n}(x_{n+1}, x_n) \le R_{\tau_n}(x_n, x_n) = p(x_n), \tag{19}$$
where the first inequality comes from Lemma 2 for Algorithm 5 and from (18) for Algorithm 7, and the second inequality follows from (7), since $x_{n+1} = F_{\tau_n}(x_n)$ minimizes $R_{\tau_n}(\cdot, x_n)$. Here $\tau_n$ is either chosen by the backtracking rule (18) or taken in $(L(p), L_1)$, where $L(p)$ is the Lipschitz constant of $\nabla p$.
Lemma 9.
There holds
$$\beta L(p) \le \tau_n \le \alpha L(p), \tag{20}$$
where $\alpha = L_1/L(p)$, $\beta = 1$ in Algorithm 5 and $\alpha = \eta$, $\beta = \gamma/L(p)$ in Algorithm 7.
Proof.
It is easy to verify (20) for Algorithm 5. By η>1 and the choice of τn, we get τn≥γ. From Lemma 2, it follows that inequality (18) is satisfied for τn≥L(p), where L(p) is the Lipschitz constant of ∇p. So, for Algorithm 7 one has τn≤ηL(p) for every n≥1.
Remark 10.
From Lemma 9, it follows that backtracking rule (18) is well defined.
Remark 11.
In the algorithm "ISTA with backtracking" proposed by Beck and Teboulle [12], the authors took $\tau_n = \tau_{n-1} \eta^{m_n}$ with $\tau_0 > 0$ and $\eta > 1$, so that $\tau_n$ is nondecreasing in $n$. Numerical experiments indicate that a small $\tau_n$ is more efficient than a larger one (see Table 1). Therefore, in Algorithm 7 we take $\tau_n = \gamma \eta^{m_n}$ in the backtracking rule, which is smaller than the one in the algorithm of Beck and Teboulle.
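Table 1: Computational results for Example 13 with different algorithms. Columns 2 to 6 give the number of iterations (Iter.) of Algorithm 5 for different values of $\tau$; the last two columns give Iter. and InIt. for Algorithm 7.

| Initial point | Alg. 5, $1.01L(p)$ Iter. | Alg. 5, $1.1L(p)$ Iter. | Alg. 5, $1.2L(p)$ Iter. | Alg. 5, $1.3L(p)$ Iter. | Alg. 5, $1.4L(p)$ Iter. | Alg. 7 Iter. | Alg. 7 InIt. |
|---|---|---|---|---|---|---|---|
| (0, 0, 0, 0, 0) | 96 | 104 | 114 | 123 | 132 | 7 | 22 |
| (20, 10, 20, 10, 20) | 1246 | 1358 | 1482 | 1606 | 1730 | 35 | 77 |
| (100, 0, 0, 0, 0) | 1256 | 1368 | 1493 | 1618 | 1743 | 39 | 90 |
| (1, 1, 1, 1, 1) | 1228 | 1338 | 1460 | 1582 | 1704 | 28 | 54 |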
Theorem 12.
Let $\{x_n\}$ be a sequence generated by Algorithm 5 or Algorithm 7. Then $\{x_n\}$ converges to a solution of the MSFP (1), and furthermore for any $n \ge 1$ it holds that
$$p(x_n) \le \frac{\alpha L(p) \|x_0 - x^*\|^2}{2n}, \quad \forall x^* \in \Gamma. \tag{21}$$
Proof.
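Invoking Lemma 3 with $x = x^*$, $y = x_k$, and $\tau = \tau_k$ (so that $F_{\tau_k}(x_k) = x_{k+1}$), we obtain
$$\frac{2}{\tau_k} \bigl(p(x^*) - p(x_{k+1})\bigr) \ge \|x_{k+1} - x_k\|^2 + 2 \langle x_k - x^*, x_{k+1} - x_k \rangle = \|x_{k+1} - x^*\|^2 - \|x_k - x^*\|^2, \tag{22}$$
which combined with (20) and the fact that $p(x^*) = 0$, $p(x_{k+1}) \ge 0$ yields
$$-\frac{2}{\alpha L(p)} p(x_{k+1}) \ge \|x_{k+1} - x^*\|^2 - \|x_k - x^*\|^2, \tag{23}$$
which implies
$$\|x_{k+1} - x^*\| \le \|x_k - x^*\|. \tag{24}$$
So $\{x_n\}$ is a Fejér monotone sequence. Summing the inequality (23) over $k = 0, 1, \ldots, n-1$ gives
$$-\frac{2}{\alpha L(p)} \sum_{k=0}^{n-1} p(x_{k+1}) \ge \|x_n - x^*\|^2 - \|x_0 - x^*\|^2. \tag{25}$$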
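Invoking Lemma 3 one more time with $x = y = x_k$ and $\tau = \tau_k$ yields
$$\frac{2}{\tau_k} \bigl(p(x_k) - p(x_{k+1})\bigr) \ge \|x_k - x_{k+1}\|^2. \tag{26}$$
Since $\tau_k \ge \beta L(p)$ (see (20)) and $p(x_k) - p(x_{k+1}) \ge 0$ (see (19)), it follows that
$$\frac{2}{\beta L(p)} \bigl(p(x_k) - p(x_{k+1})\bigr) \ge \|x_k - x_{k+1}\|^2. \tag{27}$$
Multiplying the last inequality by $k$ and summing over $k = 0, \ldots, n-1$, we obtain
$$\frac{2}{\beta L(p)} \sum_{k=0}^{n-1} \bigl(k p(x_k) - (k+1) p(x_{k+1}) + p(x_{k+1})\bigr) \ge \sum_{k=0}^{n-1} k \|x_k - x_{k+1}\|^2, \tag{28}$$
which, since the first two terms telescope, simplifies to
$$\frac{2}{\beta L(p)} \Bigl(-n p(x_n) + \sum_{k=0}^{n-1} p(x_{k+1})\Bigr) \ge \sum_{k=0}^{n-1} k \|x_k - x_{k+1}\|^2. \tag{29}$$
Adding (25) and (29) times $\beta/\alpha$, we get
$$-\frac{2n}{\alpha L(p)} p(x_n) \ge \|x_n - x^*\|^2 + \frac{\beta}{\alpha} \sum_{k=0}^{n-1} k \|x_k - x_{k+1}\|^2 - \|x_0 - x^*\|^2, \tag{30}$$
and hence, it follows that
$$p(x_n) \le \frac{\alpha L(p) \|x_0 - x^*\|^2}{2n}, \quad \forall x^* \in \Gamma, \tag{31}$$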
which yields
$$\lim_{n \to \infty} p(x_n) = 0. \tag{32}$$
Since $\{x_n\}$ is Fejér monotone, it is bounded. To prove the convergence of $\{x_n\}$, it only remains to show that all convergent subsequences have the same limit. Suppose, to the contrary, that two subsequences $\{x_{n_k}\}$ and $\{x_{n_l}\}$ converge to different limits $\hat{x}$ and $\tilde{x}$, respectively ($\hat{x} \ne \tilde{x}$). We first show that $\hat{x}$ is a solution of the MSFP. The continuity of $p(x)$ implies that
$$0 \le p(\hat{x}) = \lim_{k \to \infty} p(x_{n_k}) = \lim_{n \to \infty} p(x_n) = 0. \tag{33}$$
Therefore, $p(\hat{x}) = 0$; that is, $\hat{x} \in C = \bigcap_{i=1}^{t} C_i$ and $A\hat{x} \in Q = \bigcap_{j=1}^{r} Q_j$, so $\hat{x}$ is a solution of the MSFP. Similarly, we can show that $\tilde{x}$ is a solution of the MSFP. Now, by the Fejér monotonicity of the sequence $\{x_n\}$ with respect to the solution $\hat{x}$, it follows that the sequence $\{\|x_n - \hat{x}\|\}$ is bounded and nonincreasing and thus has a limit $\lim_{n \to \infty} \|x_n - \hat{x}\| = l_1$. However, we also have $\lim_{n \to \infty} \|x_n - \hat{x}\| = \lim_{k \to \infty} \|x_{n_k} - \hat{x}\| = 0$ and $\lim_{n \to \infty} \|x_n - \hat{x}\| = \lim_{l \to \infty} \|x_{n_l} - \hat{x}\| = \|\tilde{x} - \hat{x}\|$, so that $l_1 = 0 = \|\tilde{x} - \hat{x}\|$, which contradicts $\hat{x} \ne \tilde{x}$. Thus $\{x_n\}$ converges to a solution of the MSFP (1). The proof is completed.
4. Numerical Tests
In order to verify the theoretical assertions, we present some numerical tests in this section. We apply Algorithms 5 and 7 to two test problems from [4] (Examples 13 and 14) and compare the numerical results of the two algorithms.
For convenience, we denote the vector with all elements 0 by $e_0$ and the vector with all elements 1 by $e_1$ in what follows. In the numerical results listed in the following tables, "Iter." and "Sec." denote the number of iterations and the CPU time in seconds, respectively. For Algorithm 7, "InIt." denotes the total number of inner iterations used to find a suitable $\tau_n$ in (18).
Example 13 (see [4]).
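Consider a split feasibility problem of finding $x \in C = \{x \in \mathbb{R}^5 \mid \|x\| \le 0.25\}$ such that $Ax \in Q = \{y = (y_1, y_2, y_3, y_4)^T \in \mathbb{R}^4 \mid 0.6 \le y_j \le 1,\ j = 1, 2, 3, 4\}$, where the matrix
$$A = \begin{pmatrix} 2 & -1 & 3 & 2 & 3 \\ 1 & 2 & 5 & 2 & 1 \\ 2 & 0 & 2 & 1 & -2 \\ 2 & -1 & 0 & -3 & 5 \end{pmatrix}. \tag{34}$$
The weights of $p(x)$ were set to $\alpha = 0.9$ and $\beta = 0.1$. In the implementation, we took $p(x) < \varepsilon = 10^{-9}$ as the stopping criterion, as in [4].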
For Algorithm 5, we tested $\tau_n = 1.01L(p), 1.1L(p), \ldots, 1.9L(p)$, and the numerical results are reported in Table 1 for different initial points $x_0$. (Since the numbers of iterations for $\tau_n = 1.5L(p), 1.6L(p), \ldots, 1.9L(p)$ were larger than those for $\tau_n \le 1.4L(p)$, we only report the results for $\tau_n \le 1.4L(p)$.) We took $\gamma = 1$ and $\eta = 1.1$ for Algorithm 7. Table 1 shows that Algorithm 5 was efficient when a suitable $\tau_n$ was chosen ($\tau_n \in (L(p), 1.1L(p))$ was the best choice for the current example), while the number of iterations of Algorithm 5 was still larger than that of Algorithm 7.
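To make the setup of Example 13 concrete, the following Python sketch encodes the data of the example and runs both iterations from the initial point $x_0 = e_0$. It is not the authors' implementation: the matrix is read row by row from (34) under the assumption that every entry is a single-digit integer, the function names and the iteration cap are hypothetical, and the inner-iteration counter simply counts increases of $\tau$. The printed counts therefore need not coincide with those reported in Table 1.

```python
import numpy as np

# Matrix (34), read row by row; assumes every entry is a single-digit integer.
A = np.array([[2, -1, 3, 2, 3],
              [1, 2, 5, 2, 1],
              [2, 0, 2, 1, -2],
              [2, -1, 0, -3, 5]], dtype=float)
alpha, beta, eps = 0.9, 0.1, 1e-9

def residuals(x):
    """Return x - P_C(x) and Ax - P_Q(Ax) for the ball C and the box Q."""
    nrm = np.linalg.norm(x)
    res_C = x - (x if nrm <= 0.25 else 0.25 * x / nrm)
    Ax = A @ x
    return res_C, Ax - np.clip(Ax, 0.6, 1.0)

def p(x):
    res_C, res_Q = residuals(x)
    return 0.5 * (alpha * (res_C @ res_C) + beta * (res_Q @ res_Q))

def grad_p(x):
    res_C, res_Q = residuals(x)
    return alpha * res_C + beta * (A.T @ res_Q)

L = alpha + beta * np.linalg.eigvalsh(A.T @ A)[-1]      # L(p) from Lemma 1

def algorithm5(x0, tau, max_iter=20000):
    x, n = np.asarray(x0, dtype=float), 0
    while p(x) >= eps and n < max_iter:
        x, n = x - grad_p(x) / tau, n + 1
    return x, n

def algorithm7(x0, gamma=1.0, eta=1.1, max_iter=20000):
    x, n, inner = np.asarray(x0, dtype=float), 0, 0
    while p(x) >= eps and n < max_iter:
        g, tau = grad_p(x), gamma
        while True:
            x_new = x - g / tau
            d = x - x_new
            if p(x_new) - p(x) + g @ d <= 0.5 * tau * (d @ d):   # rule (18)
                break
            tau, inner = tau * eta, inner + 1
        x, n = x_new, n + 1
    return x, n, inner

x0 = np.zeros(5)
x5, n5 = algorithm5(x0, tau=1.01 * L)
x7, n7, inner7 = algorithm7(x0)
print("Algorithm 5:", n5, p(x5))
print("Algorithm 7:", n7, inner7, p(x7))
```

Other initial points or values of $\tau_n$, $\gamma$, and $\eta$ can be tried by changing the arguments of the two calls at the end.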
Example 14 (see [4]).
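Consider the MSFP, where $A = (a_{ij})_{N \times N} \in \mathbb{R}^{N \times N}$ with entries $a_{ij} \in (0, 1)$ generated randomly, and
$$C_i = \{x \in \mathbb{R}^N \mid \|x - d_i\| \le r_i\},\ i = 1, 2, \ldots, t, \qquad Q_j = \{y \in \mathbb{R}^N \mid L_j \le y \le U_j\},\ j = 1, 2, \ldots, r, \tag{35}$$
where $d_i \in \mathbb{R}^N$ is the center of the ball $C_i$, $e_0 \le d_i \le 10e_1$, and $r_i \in (40, 50)$ is the radius; $d_i$ and $r_i$ are both generated randomly. $L_j$ and $U_j$ are the boundaries of the box $Q_j$ and are also generated randomly, satisfying $20e_1 \le L_j \le 30e_1$ and $40e_1 \le U_j \le 80e_1$, respectively. The weights of $p(x)$ were $1/(t+r)$. The stopping criterion was $p(x) < \varepsilon = 10^{-4}$ with the initial point $x_0 = e_0 \in \mathbb{R}^N$.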
We tested Algorithms 5 and 7 with different $t$ and $r$ in Euclidean spaces of different dimensions. In Algorithm 5, since a smaller $\tau_n$ is more efficient than a larger one, we took $\tau_n = 1.01L(p)$ in the experiment. We took $\gamma = 1$ and $\eta = 1.2$ for Algorithm 7. For comparison, the same random values were used in each test. The numerical results are listed in Table 2, from which we can observe the efficiency of the self-adaptive Algorithm 7 in terms of both the number of iterations and the CPU time.
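A random instance of the kind described in Example 14 could be generated as in the Python sketch below. The use of uniform distributions, the seed, and the names `make_example14` and `proximity_and_grad` are assumptions of this illustration, since the example only states the ranges of the random quantities; the sketch also evaluates $p$ and $\nabla p$ for all balls and boxes at once.

```python
import numpy as np

def make_example14(N, t, r, seed=0):
    """Draw one random instance with the ranges stated in Example 14."""
    rng = np.random.default_rng(seed)
    A = rng.uniform(0.0, 1.0, size=(N, N))            # entries a_ij
    centers = rng.uniform(0.0, 10.0, size=(t, N))     # e0 <= d_i <= 10 e1
    radii = rng.uniform(40.0, 50.0, size=t)           # radii r_i
    lower = rng.uniform(20.0, 30.0, size=(r, N))      # 20 e1 <= L_j <= 30 e1
    upper = rng.uniform(40.0, 80.0, size=(r, N))      # 40 e1 <= U_j <= 80 e1
    w = 1.0 / (t + r)                                 # common weight
    return A, centers, radii, lower, upper, w

def proximity_and_grad(x, A, centers, radii, lower, upper, w):
    """Return p(x) and grad p(x) for balls C_i and boxes Q_j, cf. (3) and (4)."""
    diff = x - centers                                # row i is x - d_i
    dist = np.linalg.norm(diff, axis=1)
    # x - P_{C_i}(x) = max(0, 1 - r_i / ||x - d_i||) * (x - d_i)
    scale = np.maximum(0.0, 1.0 - radii / np.maximum(dist, 1e-300))
    res_C = scale[:, None] * diff
    Ax = A @ x
    res_Q = Ax - np.clip(Ax, lower, upper)            # row j is Ax - P_{Q_j}(Ax)
    p = 0.5 * w * (np.sum(res_C ** 2) + np.sum(res_Q ** 2))
    grad = w * (res_C.sum(axis=0) + A.T @ res_Q.sum(axis=0))
    return p, grad

N, t, r = 20, 5, 5
A, centers, radii, lower, upper, w = make_example14(N, t, r)
x0 = np.zeros(N)                                      # initial point e0
p0, g0 = proximity_and_grad(x0, A, centers, radii, lower, upper, w)
L = w * t + w * r * np.linalg.eigvalsh(A.T @ A)[-1]   # L(p) from Lemma 1
print(p0, L)
```

Fixing the seed reproduces the same instance across runs, in the spirit of using the same random values for both algorithms in each test.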
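Table 2: Computational results for Example 14 with different dimensions and different numbers of $C_i$ and $Q_j$.

| Sets | Algorithm | Measure | N = 20 | N = 30 | N = 40 | N = 50 | N = 60 |
|---|---|---|---|---|---|---|---|
| t = 5, r = 5 | Algorithm 5 | Iter. | 515 | 675 | 774 | 875 | 1098 |
| t = 5, r = 5 | Algorithm 5 | Sec. | 0.093 | 0.125 | 0.156 | 0.203 | 0.485 |
| t = 5, r = 5 | Algorithm 7 | Iter. | 11 | 8 | 7 | 7 | 7 |
| t = 5, r = 5 | Algorithm 7 | InIt. | 71 | 94 | 105 | 104 | 133 |
| t = 5, r = 5 | Algorithm 7 | Sec. | 0.016 | 0.031 | 0.047 | 0.062 | 0.078 |
| t = 10, r = 15 | Algorithm 5 | Iter. | 772 | 1412 | 1456 | 1583 | 1614 |
| t = 10, r = 15 | Algorithm 5 | Sec. | 0.328 | 0.625 | 0.782 | 1.047 | 1.297 |
| t = 10, r = 15 | Algorithm 7 | Iter. | 14 | 13 | 9 | 8 | 7 |
| t = 10, r = 15 | Algorithm 7 | InIt. | 76 | 92 | 120 | 122 | 140 |
| t = 10, r = 15 | Algorithm 7 | Sec. | 0.031 | 0.063 | 0.078 | 0.094 | 0.125 |
| t = 30, r = 40 | Algorithm 5 | Iter. | 854 | 1467 | 2100 | 2246 | 2448 |
| t = 30, r = 40 | Algorithm 5 | Sec. | 0.406 | 0.828 | 1.437 | 1.875 | 3.516 |
| t = 30, r = 40 | Algorithm 7 | Iter. | 15 | 13 | 13 | 13 | 9 |
| t = 30, r = 40 | Algorithm 7 | InIt. | 78 | 88 | 113 | 123 | 144 |
| t = 30, r = 40 | Algorithm 7 | Sec. | 0.032 | 0.047 | 0.093 | 0.188 | 0.297 |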
Conflicts of Interests
There is no conflict of interests regarding the publication of this paper.
Acknowledgments
The authors express their thanks to Dr. Wenxing Zhang for his help in numerical tests. This research was supported by the National Natural Science Foundation of China (no. 11201476) and Fundamental Research Funds for the Central Universities (no. 3122013D017).
References
[1] Y. Censor, T. Elfving, N. Kopf, and T. Bortfeld, "The multiple-sets split feasibility problem and its applications for inverse problems," vol. 21, no. 6, pp. 2071–2084, 2005. doi:10.1088/0266-5611/21/6/017. MR2183668. Zbl 1089.65046.
[2] Z. Li, D. Han, and W. Zhang, "A self-adaptive projection-type method for nonlinear multiple-sets split feasibility problem," vol. 21, no. 1, pp. 155–170, 2013. doi:10.1080/17415977.2012.677445. MR3011907. Zbl 1272.65053.
[3] H.-K. Xu, "A variable Krasnosel'skii-Mann algorithm and the multiple-set split feasibility problem," vol. 22, no. 6, pp. 2021–2034, 2006. doi:10.1088/0266-5611/22/6/007. MR2277527.
[4] W. Zhang, D. Han, and Z. Li, "A self-adaptive projection method for solving the multiple-sets split feasibility problem," vol. 25, no. 11, article 115001, 2009. doi:10.1088/0266-5611/25/11/115001. MR2545996. Zbl 1185.65102.
[5] J. Zhao and Q. Yang, "Self-adaptive projection methods for the multiple-sets split feasibility problem," vol. 27, no. 3, article 035009, 2011. doi:10.1088/0266-5611/27/3/035009. MR2772528. Zbl 1215.65115.
[6] J. Zhao and Q. Yang, "A simple projection method for solving the multiple-sets split feasibility problem," vol. 21, no. 3, pp. 537–546, 2013. doi:10.1080/17415977.2012.712521. MR3022436.
[7] J. Zhao and Q. Yang, "Several acceleration schemes for solving the multiple-sets split feasibility problem," vol. 437, no. 7, pp. 1648–1657, 2012. doi:10.1016/j.laa.2012.05.018. MR2946348.
[8] C. Byrne, "A unified treatment of some iterative algorithms in signal processing and image reconstruction," vol. 20, no. 1, pp. 103–120, 2004. doi:10.1088/0266-5611/20/1/006. MR2044608. Zbl 1051.65067.
[9] Y. Censor and T. Elfving, "A multiprojection algorithm using Bregman projections in a product space," vol. 8, no. 2–4, pp. 221–239, 1994. doi:10.1007/BF02142692. MR1309222. Zbl 0828.65065.
[10] Q.-L. Dong, Y. Yao, and S. He, "Weak convergence theorems of the modified relaxed projection algorithms for the split feasibility problem in Hilbert spaces," 2013. doi:10.1007/s11590-013-0619-4.
[11] S. He and W. Zhu, "A note on approximating curve with 1-norm regularization method for the split feasibility problem," vol. 2012, article ID 683890, 10 pages, 2012. doi:10.1155/2012/683890. MR2948162. Zbl 1251.65083.
[12] A. Beck and M. Teboulle, "A fast iterative shrinkage-thresholding algorithm for linear inverse problems," vol. 2, no. 1, pp. 183–202, 2009. doi:10.1137/080716542. MR2486527. Zbl 1175.94009.
[13] J.-P. Aubin, vol. 140 of Graduate Texts in Mathematics, Springer, Berlin, Germany, 1993, 417 pp. MR1217485.
[14] D. P. Bertsekas, 2nd edition, Athena Scientific, Belmont, Mass, USA, 1999.
[15] J. M. Ortega and W. C. Rheinboldt, vol. 30 of Classics in Applied Mathematics, SIAM, Philadelphia, Pa, USA, 2000, 358 pp. doi:10.1137/1.9780898719468. MR1744713.