Tikhonov functionals are known to be well suited for obtaining regularized
solutions of linear operator equations. We analyze two iterative
methods for finding the minimizer of norm-based Tikhonov functionals in
Banach spaces. One is the steepest descent method, whereby the iterations
are directly carried out in the underlying space, and the other one performs
iterations in the dual space. We prove strong convergence of both methods.
1. Introduction
This article is concerned with the stable solution of
operator equations of the first kind in Banach spaces. More precisely, we aim
at computing a solution x ∈ X of
Ax = y + η, (1.1)
for a linear,
continuous mapping A:X→Y, where X and Y
are Banach spaces and y∈Y denotes the
measured data, which are contaminated by some noise η ∈ Y. There exists a large
variety of regularization methods for (1.1) in the case that X and Y are Hilbert
spaces, such as the truncated singular value decomposition, Tikhonov-Phillips regularization, or iterative solvers like the Landweber method and the
method of conjugate gradients. We refer to the monographs of Louis [1], Rieder
[2], Engl et al. [3] for a comprehensive study of solution
methods for inverse problems in Hilbert spaces.
The development of explicit solvers for operator
equations in Banach spaces is an active field of research of great
importance, since the Banach space setting provides a mathematical
framework that is often better adjusted to the
requirements of a specific application. Alber [4] established an iterative
regularization scheme in Banach spaces to solve (1.1) in the particular case
that A : X → X* is a monotone
operator. In the case X = Y, Plato [5] applied a linear Landweber method
together with the discrepancy principle in order to get a solution to (1.1) after
a discretization. Osher et al. [6] developed an iterative algorithm for image
restoration by minimizing the BV norm. Butnariu
and Resmerita [7] used Bregman projections to obtain a weakly convergent
algorithm for solving (1.1) in a Banach space setting. Schöpfer et al. [8] proved strong convergence and stability of a nonlinear Landweber
method for solving (1.1) in connection with the discrepancy principle in a fairly
general setting where X has to be
smooth and uniformly convex.
The idea of this paper is to get a solver for (1.1) by
minimizing a Tikhonov functional where we use Banach space norms in the data
term as well as in the penalty term. Since we only consider the case of exact
data we put η=0 in (1.1). That
means that we investigate the problem
min_{x∈X} Ψ(x), (1.2)
where the
Tikhonov functional Ψ : X → ℝ is given by
Ψ(x) = (1/r)‖Ax − y‖_Y^r + (α/p)‖x‖_X^p, (1.3)
with a continuous linear operator A:X→Y mapping between
two Banach spaces X and Y.
If X and Y are Hilbert
spaces, many results on solution methods for problem (1.2), on their
convergence and stability, and on parameter choice rules for α can be found in
the literature. In the case that only Y is a Hilbert
space, this problem has been thoroughly studied and many solvers have been
established; see [9, 10]. One possibility for obtaining an approximate solution of (1.2)
is to use the steepest descent method. Assume for the moment that both X and Y are Hilbert
spaces and r = p = 2. Then Ψ is Gâteaux
differentiable and the steepest descent method applied to (1.2) yields the well-known Landweber-type iteration
x_{n+1} = x_n − μ_n ∇Ψ(x_n) = x_n − μ_n (A*(Ax_n − y) + α x_n). (1.4)
This iterative
method converges to the unique minimizer of problem (1.2) if the stepsize μ_n is chosen
properly.
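To make the Hilbert space case concrete, the following sketch (illustrative only, with randomly generated A and y as assumed data and a fixed stepsize below 2/(‖A‖² + α)) runs iteration (1.4) and compares the result with the closed-form Tikhonov minimizer:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))   # example forward operator (assumed data)
y = A @ rng.standard_normal(10)     # exact data, i.e., eta = 0
alpha = 0.1
mu = 1.0 / (np.linalg.norm(A, 2)**2 + alpha)  # fixed stepsize below 2/L

x = np.zeros(10)
for n in range(20000):
    # x_{n+1} = x_n - mu * grad Psi(x_n), grad Psi(x) = A^T(Ax - y) + alpha*x
    x = x - mu * (A.T @ (A @ x - y) + alpha * x)

# the iterates approach the unique minimizer (A^T A + alpha I)^{-1} A^T y
x_dagger = np.linalg.solve(A.T @ A + alpha * np.eye(10), A.T @ y)
print(np.linalg.norm(x - x_dagger))  # close to zero
```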
In the present paper, we consider two generalizations
of (1.4). First we notice that the natural extension of the gradient ∇Ψ for convex, but
not necessarily smooth, functionals Ψ is the notion
of the subdifferential ∂Ψ. We will elaborate the details later, but for the
time being we note that ∂Ψ is a set-valued
mapping, that is, ∂Ψ : X ⇉ X*. Here we make use of the usual notation in the
context of convex analysis, where f : X ⇉ Y means a mapping f from X to 2^Y. We then consider the formally defined iterative
scheme
x*_{n+1} = x*_n − μ_n ψ_n with ψ_n ∈ ∂Ψ(x_n), x_{n+1} = J_q^*(x*_{n+1}), (1.5)
where J_q^* : X* ⇉ X is a duality
mapping of X*. In the case of smooth Ψ we also
consider a second generalization of (1.4),
x_{n+1} = x_n − μ_n J_q^*(∇Ψ(x_n)). (1.6)
We will show that both schemes converge strongly to
the unique minimizer of problem (1.2) if μ_n is chosen
properly.
Alber et al. presented in [11] an
algorithm for the minimization of convex and not necessarily smooth functionals
on uniformly smooth and uniformly convex Banach spaces which looks very similar
to our first method in Section 3; there the authors impose summation
conditions on the stepsizes μ_n. However, only weak convergence of the proposed
scheme is shown. Another interesting approach to obtain convergence results of
descent methods in general Banach spaces can be found in the recent papers by
Reich and Zaslavski [12, 13]. We want to emphasize that the most important
novelties of the present paper are the strong convergence results.
In the next section, we give the necessary theoretical
tools and apply them in Sections 3 and 4 to describe the methods and prove their
convergence properties.
2. Preliminaries
Throughout the paper, let X and Y be Banach
spaces with duals X* and Y*. Their norms will be denoted by ∥⋅∥. We omit indices indicating the space since it will
become clear from the context which one is meant. For x ∈ X and x* ∈ X*, we write ⟨x, x*⟩ = ⟨x*, x⟩ = x*(x).
Let p, q ∈ (1, ∞) be conjugate
exponents, that is,
1/p + 1/q = 1. (2.1)
2.1. Convexity and Smoothness of Banach Spaces
We introduce
some definitions and preliminary results about the geometry of Banach spaces,
which can be found in [14, 15].
The functions δ_X : [0, 2] → [0, 1] and ρ_X : [0, ∞) → [0, ∞) defined by
δ_X(ε) = inf{1 − ‖(x + y)/2‖ : ‖x‖ = ‖y‖ = 1, ‖x − y‖ ≥ ε},
ρ_X(τ) = (1/2) sup{‖x + y‖ + ‖x − y‖ − 2 : ‖x‖ = 1, ‖y‖ ≤ τ} (2.2)
are referred to as the modulus of convexity of X and the modulus of smoothness of X, respectively.
Definition 2.1.
A Banach space X is said to be
(i) uniformly convex if δ_X(ε) > 0 for all ε ∈ (0, 2],
(ii) p-convex or convex of power type if δ_X(ε) ≥ C ε^p for some p > 1 and C > 0,
(iii) smooth if for every x ≠ 0 there is a unique x* ∈ X* such that ‖x*‖ = 1 and ⟨x*, x⟩ = ‖x‖,
(iv) uniformly smooth if lim_{τ→0} ρ_X(τ)/τ = 0,
(v) q-smooth or smooth of power type if ρ_X(τ) ≤ C τ^q for some q > 1 and C > 0.
There is a tight connection between the modulus of
convexity and the modulus of smoothness. The Lindenstrauss duality formula
implies that
X is p-convex iff X* is q-smooth, and X is q-smooth iff X* is p-convex
(cf. [16, Chapter II, Theorem 2.12]). From Dvoretzky's theorem [17], it follows that necessarily p ≥ 2 and q ≤ 2. For Hilbert spaces, the polarization identity
‖x − y‖² = ‖x‖² − 2⟨x, y⟩ + ‖y‖² (2.7)
asserts that
every Hilbert space is 2-convex and 2-smooth. For
the sequence spaces ℓ^p, Lebesgue spaces L^p, and Sobolev spaces W^m_p it is also
known [18, 19] that
ℓ^p, L^p, W^m_p with 1 < p ≤ 2 are 2-convex and p-smooth,
ℓ^q, L^q, W^m_q with 2 ≤ q < ∞ are q-convex and 2-smooth.
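As an illustrative aside (not part of the original analysis), the moduli can be explored numerically. The sketch below estimates δ_X for ℓ^p restricted to ℝ² by random sampling; since the infimum is approximated over finitely many pairs, the result is only an upper bound for the true modulus. For p = 2 it can be compared with the exact Hilbert space value δ(ε) = 1 − √(1 − ε²/4):

```python
import numpy as np
rng = np.random.default_rng(0)

def delta_lp_estimate(p, eps, dim=2, samples=200000):
    # sample unit-sphere pairs in (R^dim, ||.||_p) with ||x - y||_p >= eps
    x = rng.standard_normal((samples, dim))
    y = rng.standard_normal((samples, dim))
    x /= np.linalg.norm(x, p, axis=1, keepdims=True)
    y /= np.linalg.norm(y, p, axis=1, keepdims=True)
    ok = np.linalg.norm(x - y, p, axis=1) >= eps
    return np.min(1 - 0.5 * np.linalg.norm((x + y)[ok], p, axis=1))

eps = 1.0
print(delta_lp_estimate(2.0, eps), 1 - np.sqrt(1 - eps**2 / 4))  # ~0.134
print(delta_lp_estimate(1.5, eps))  # l^1.5 is 2-convex: delta(eps) >= C*eps^2
```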
2.2. Duality Mapping
For p > 1, the set-valued
mapping J_p : X ⇉ X* defined by
J_p(x) = {x* ∈ X* : ⟨x*, x⟩ = ‖x‖ ‖x*‖, ‖x*‖ = ‖x‖^{p−1}}
is called the duality mapping of X (with weight
function t ↦ t^{p−1}). By j_p we denote a
single-valued selection of J_p.
One can show [15, Theorem I.4.4] that J_p is monotone,
that is,
⟨x* − y*, x − y⟩ ≥ 0 for all x* ∈ J_p(x), y* ∈ J_p(y).
If X is smooth, the
duality mapping J_p is single
valued, that is, one can identify it with J_p : X → X* [15, Theorem I.4.5].
If X is uniformly
convex or uniformly smooth, then X is reflexive
[15, Theorems II.2.9 and II.2.15]. By Jp*, we then denote
the duality mapping from X* into X**=X.
Let ∂f : X ⇉ X* be the
subdifferential of a convex functional f : X → ℝ. At x ∈ X it is defined
by
x̄ ∈ ∂f(x) ⇔ f(y) ≥ f(x) + ⟨x̄, y − x⟩ for all y ∈ X.
Another important property of J_p is due to the
theorem of Asplund [15, Theorem I.4.4]:
J_p = ∂{(1/p)‖·‖^p}.
This equality is also valid in the case of set-valued
duality mappings.
Example 2.2.
In L^r spaces with 1 < r < ∞, we have
⟨J_p(f), g⟩ = ∫ ‖f‖_r^{p−r} |f(x)|^{r−1} sign(f(x)) · g(x) dx.
In the sequence spaces ℓ^r with 1 < r < ∞, we have
⟨J_p(x), y⟩ = Σ_i ‖x‖_r^{p−r} |x_i|^{r−1} sign(x_i) · y_i.
We also refer the interested reader to [20] where
additional information on duality mappings may be found.
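For later experiments it is convenient to have the sequence space formula of Example 2.2 in executable form. The following sketch (a minimal implementation of that formula, with the convention J_p(0) = 0) also checks the two defining properties of the duality mapping:

```python
import numpy as np

def duality_map_lr(x, r, p):
    # J_p on l^r: (J_p(x))_i = ||x||_r^(p-r) |x_i|^(r-1) sign(x_i)
    nrm = np.linalg.norm(x, r)
    if nrm == 0:
        return np.zeros_like(x)
    return nrm**(p - r) * np.abs(x)**(r - 1) * np.sign(x)

x, r, p = np.array([1.0, -2.0, 0.5]), 1.5, 2.0
jx = duality_map_lr(x, r, p)
# defining properties: <J_p(x), x> = ||x||^p and ||J_p(x)||_{r'} = ||x||^(p-1)
print(np.dot(jx, x), np.linalg.norm(x, r)**p)
print(np.linalg.norm(jx, r / (r - 1)), np.linalg.norm(x, r)**(p - 1))
```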
2.3. Xu-Roach Inequalities
The next theorem (see [19]) provides us with
inequalities which will be of great relevance for proving the convergence of
our methods.
Theorem 2.3.
(1) Let X be a p-smooth Banach space. Then there exists a positive constant Gp
such that
(1/p)‖x − y‖^p ≤ (1/p)‖x‖^p − ⟨J_p(x), y⟩ + (G_p/p)‖y‖^p for all x, y ∈ X.
(2) Let X be a q-convex Banach space. Then there exists a positive constant Cq such
that
(1/q)‖x − y‖^q ≥ (1/q)‖x‖^q − ⟨J_q(x), y⟩ + (C_q/q)‖y‖^q for all x, y ∈ X.
We remark that in a real Hilbert space these inequalities reduce to the well-known
polarization identity (2.7). Further, we refer to [19] for the exact values of the constants
G_p and C_q. For special cases like ℓ^p-spaces these constants have a simple form; see
[8].
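Since G_p is not computed explicitly here, one can probe Theorem 2.3 numerically. The sketch below (illustrative only, for ℓ^p in dimension 5) rearranges the first inequality and records the largest constant that random samples force, giving an empirical lower bound for any admissible G_p:

```python
import numpy as np
rng = np.random.default_rng(0)

def jp(x, p):  # duality mapping J_p on l^p (Example 2.2 with r = p)
    return np.abs(x)**(p - 1) * np.sign(x)

p, worst = 1.5, 0.0   # l^p with 1 < p <= 2 is p-smooth
for _ in range(10000):
    x, y = rng.standard_normal(5), rng.standard_normal(5)
    # inequality rearranged: (1/p)||x-y||^p - (1/p)||x||^p + <J_p(x),y> <= (G_p/p)||y||^p
    lhs = (np.linalg.norm(x - y, p)**p - np.linalg.norm(x, p)**p) / p \
          + np.dot(jp(x, p), y)
    worst = max(worst, p * lhs / np.linalg.norm(y, p)**p)
print(worst)  # any valid G_p must be at least this large
```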
2.4. Bregman Distances
It turns out that, due to the geometrical characteristics of Banach spaces other than
Hilbert spaces, it is often more appropriate to use Bregman distances instead of
conventional norm-based functionals ‖x − y‖ or ‖J_p(x) − J_p(y)‖ for the convergence analysis.
The idea to use such distances to design and analyze optimization algorithms
goes back to Bregman [21] and since then his ideas have been successfully applied in
various ways [4, 8, 22–26].
Definition 2.4.
Let X be smooth and
convex of power type. Then the Bregman distance Δ_p(x, y) is defined as
Δ_p(x, y) := (1/q)‖J_p(x)‖^q − ⟨J_p(x), y⟩ + (1/p)‖y‖^p.
We summarize a few facts concerning Bregman distances
and their relationship to the norm in X (see also [8, Theorem 2.12]).
Theorem 2.5.
Let X
be smooth and
convex of power type. Then for all p > 1, x, y ∈ X, and sequences (x_n)_n in X the following
holds:
(i) Δ_p(x, y) ≥ 0,
(ii) lim_{n→∞} ‖x_n − x‖ = 0 ⇔ lim_{n→∞} Δ_p(x_n, x) = 0,
(iii) Δ_p(·, y) is coercive, that is, the sequence (x_n)_n remains bounded
if the sequence (Δ_p(x_n, y))_n is bounded.
Remark 2.6.
Δ_p(·, ·) is in general
not a metric. In a real Hilbert space, Δ_2(x, y) = (1/2)‖x − y‖².
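Combining Definition 2.4 with Example 2.2 gives a directly computable expression on ℓ^r, where ‖J_p(x)‖^q = ‖x‖^p simplifies the first term. A minimal sketch, assuming the formulas above:

```python
import numpy as np

def bregman_dist(x, y, r=1.5, p=2.0):
    # Delta_p(x, y) = (1/q)||J_p(x)||^q - <J_p(x), y> + (1/p)||y||^p on l^r,
    # using (1/q)||J_p(x)||^q = (1/q)||x||^p; q is the conjugate exponent of p
    q = p / (p - 1)
    nx = np.linalg.norm(x, r)
    jpx = np.zeros_like(x) if nx == 0 else \
        nx**(p - r) * np.abs(x)**(r - 1) * np.sign(x)
    return nx**p / q - np.dot(jpx, y) + np.linalg.norm(y, r)**p / p

x, y = np.array([1.0, -2.0]), np.array([0.5, -1.5])
print(bregman_dist(x, y), bregman_dist(x, x))  # second value is exactly 0
```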
To shorten the proof in Section 3, we formulate and
prove the following.
Lemma 2.7.
Let X be a p-convex Banach
space. Then there exists a positive constant c such that
c · ‖x − y‖^p ≤ Δ_p(x, y).
Proof.
We have (1/q)‖J_p(x)‖^q = (1/q)‖x‖^p and ⟨J_p(x), x⟩ = ‖x‖^p, hence
Δ_p(x, y) = (1/q)‖J_p(x)‖^q − ⟨J_p(x), y⟩ + (1/p)‖y‖^p
= (1 − 1/p)‖x‖^p − ⟨J_p(x), y⟩ + (1/p)‖y‖^p
= (1/p)‖x − (x − y)‖^p − (1/p)‖x‖^p + ⟨J_p(x), x − y⟩.
By Theorem 2.3(2),
we obtain Δ_p(x, y) ≥ (C_p/p)‖x − y‖^p.
This completes the proof.
3. The Dual Method
This section deals with an iterative method for
minimizing functionals of Tikhonov type. In contrast to the algorithm described
in the next section, we iterate directly in the dual space X*.
For simplicity, we restrict ourselves to the
Tikhonov functional
Ψ(x) = (1/r)‖Ax − y‖_Y^r + (α/2)‖x‖_X² with r > 1, (3.1)
where X is a 2-convex and
smooth Banach space, Y is an arbitrary
Banach space, and A : X → Y is a linear,
continuous operator. For minimizing the functional, we choose an arbitrary starting
point x*_0 ∈ X* and consider
the following scheme:
x*_{n+1} = x*_n − μ_n ψ_n with ψ_n ∈ ∂Ψ(x_n), x_{n+1} = J_2^*(x*_{n+1}). (3.2)
We show the convergence of this method in a
constructive way. This will be done via the following steps.
(1) We show the inequality
Δ_2(x_{n+1}, x†) ≤ Δ_2(x_n, x†) − μ_n α · Δ_2(x_n, x†) + (G_2/2) μ_n² ‖ψ_n‖²,
where x† is the unique minimizer of the Tikhonov functional (3.1).
(2) We choose admissible stepsizes μ_n and show that the iterates approach x† in
the Bregman sense if we assume Δ_2(x_n, x†) ≥ ϵ. We suppose ϵ > 0 to be small; it will be
specified later.
(3) We establish an upper estimate for Δ_2(x_{n+1}, x†) in the case
that the condition Δ_2(x_n, x†) ≥ ϵ is violated.
(4) We choose ϵ such that in the case Δ_2(x_n, x†) < ϵ the iterates
stay in a certain Bregman ball, that is, Δ_2(x_{n+1}, x†) < ε_aim, where ε_aim is some a
priori chosen precision we want to achieve.
(5) Finally, we state the iterative minimization scheme.
(i) First, we
calculate the estimate for Δ_{n+1}, where
Δ_n := Δ_2(x_n, x†).
Under our assumptions on X, we know that Ψ has a unique
minimizer x†. Using (3.2), we get
Δ_{n+1} = (1/2)‖x*_{n+1}‖² − ⟨x*_{n+1}, x†⟩ + (1/2)‖x†‖²
= (1/2)‖x*_n − μ_n ψ_n‖² − ⟨x*_n − μ_n ψ_n, x†⟩ + (1/2)‖x†‖².
We recall
that X is 2-convex, hence X* is 2-smooth; see
Section 2.1. By Theorem 2.3 applied to X*, we get
(1/2)‖x*_n − μ_n ψ_n‖² ≤ (1/2)‖x*_n‖² − μ_n⟨x_n, ψ_n⟩ + (G_2/2)·μ_n²‖ψ_n‖².
We have
∂Ψ(x) = A* J_r(Ax − y) + α J_2(x)
(cf. [27, Chapter I, Propositions 5.6 and 5.7]). By definition, x† is the
minimizer of Ψ, hence ψ† := 0 ∈ ∂Ψ(x†). Therefore, with the monotonicity of J_r, we get
⟨ψ_n, x† − x_n⟩ = ⟨ψ_n − ψ†, x† − x_n⟩
= α⟨J_2(x_n) − J_2(x†), x† − x_n⟩ + ⟨A* j_r(Ax_n − y) − A* j_r(Ax† − y), x† − x_n⟩
= −α⟨J_2(x_n) − J_2(x†), x_n − x†⟩ − ⟨j_r(Ax_n − y) − j_r(Ax† − y), (Ax_n − y) − (Ax† − y)⟩
≤ −α⟨J_2(x_n) − J_2(x†), x_n − x†⟩.
Since ⟨J_2(x_n) − J_2(x†), x_n − x†⟩ = Δ_2(x_n, x†) + Δ_2(x†, x_n) ≥ Δ_n, we finally
arrive at the desired inequality
Δ_{n+1} ≤ Δ_n − μ_n α · Δ_n + (G_2/2) μ_n² ‖ψ_n‖². (3.12)
(ii) Next, we
choose admissible stepsizes. Assume that Δ_2(x_0, x†) = Δ_0 ≤ R.
We see that the choice
μ_n = (α/(G_2‖ψ_n‖²)) · Δ_n
minimizes the right-hand side of (3.12). We do
not know the distance Δ_n; therefore, we set
μ_n := (α/(G_2 P)) · ϵ.
We will impose additional conditions on ϵ later. For the
time being, assume that ϵ is small. The
number P is defined by
P = P(R) = sup{‖ψ‖² : ψ ∈ ∂Ψ(x) with Δ_2(x, x†) ≤ R}.
The Tikhonov functional Ψ is bounded on
norm-bounded sets, thus ∂Ψ is also bounded on
norm-bounded sets. By Lemma 2.7, we then know that
‖x_0 − x†‖ ≤ √(R/c). Hence, P is finite for
finite R.
Remark 3.1.
If we assume ‖x†‖ ≤ ρ,
then with the
help of Lemma 2.7, the definition of P, and the properties of the duality mapping J_2, we get an estimate for P. We have
‖x − x†‖ ≤ √(R/c), ‖x‖ ≤ ‖x − x†‖ + ‖x†‖ ≤ √(R/c) + ρ.
We calculate an
estimate for ‖ψ‖:
‖ψ‖ = ‖A* j_r(Ax − y) + α J_2(x)‖ ≤ ‖A*‖ ‖j_r(Ax − y)‖ + α‖J_2(x)‖
≤ ‖A*‖ ‖Ax − y‖^{r−1} + α‖x‖
≤ ‖A‖ (‖A‖[√(R/c) + ρ] + ‖y‖)^{r−1} + α(√(R/c) + ρ).
This
calculation gives us an estimate for P. In practice, we will not determine this estimate
exactly, but choose P sufficiently large.
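The estimate of Remark 3.1 is straightforward to evaluate; the following sketch (with hypothetical values for ‖A‖, ‖y‖, R, c, and ρ) turns it into a computable bound for P:

```python
import numpy as np

def P_bound(norm_A, norm_y, R, c, rho, alpha, r):
    # Remark 3.1: ||psi|| <= ||A||(||A||[sqrt(R/c)+rho] + ||y||)^(r-1)
    #             + alpha*(sqrt(R/c)+rho);  P = sup ||psi||^2 <= (bound)^2
    s = np.sqrt(R / c) + rho
    psi = norm_A * (norm_A * s + norm_y)**(r - 1) + alpha * s
    return psi**2

print(P_bound(norm_A=1.0, norm_y=1.0, R=4.0, c=0.5, rho=1.0, alpha=0.1, r=2.0))
```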
For Δ_n ≥ ϵ we approach the
minimizer x† in the Bregman
sense, that is,
Δ_{n+1} ≤ Δ_n − (α²/(G_2P))ϵ² + (α²/(2G_2P))ϵ² = Δ_n − (α²/(2G_2P))ϵ² =: Δ_n − Dϵ², (3.20)
where D := D(R) = α²/(2G_2P). This ensures
Δ_{n+1} < Δ_n < ⋯ < Δ_0
as long as Δ_n ≥ ϵ is fulfilled.
(iii) We know the
behavior of the Bregman distances if Δ_n ≥ ϵ holds. Next, we
need to know what happens if Δ_n < ϵ. By (3.12), we then have
Δ_{n+1} ≤ Δ_n + Dϵ² < ϵ + Dϵ².
(iv) We choose
ϵ := (−1 + √(1 + 4D·ε_aim))/(2D),
where ε_aim > 0 is the accuracy
we aim at. For the case Δ_n < ϵ this choice of ϵ assures that
Δ_{n+1} < ϵ + Dϵ² = ε_aim.
Note that the choice of ϵ implies ϵ ≤ ε_aim.
Next, we calculate an index N which ensures that the iterates x_n with n ≥ N are located in
a Bregman ball with radius ε_aim around x†. We know that if x_n fulfills Δ_n ≤ ε_aim, then all following
iterates fulfill this condition as well.
Hence, the opposite case is Δ_{n+1} ≥ ε_aim ≥ ϵ. By (3.20), we know that this is only possible if
ε_aim ≤ Δ_{n+1} ≤ R − nDϵ².
By choosing N such that
N > (R − ε_aim)/(Dϵ²) = (R − ε_aim)/((1 + (1 − √(1 + 4Dε_aim))/(2Dε_aim))ε_aim),
we get Δ_N ≤ ε_aim.
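The bookkeeping of steps (ii)–(iv) is summarized in the following sketch, which computes D, the auxiliary number ϵ, the stepsize μ, and a sufficient iteration count N from assumed values of α, G_2, P, R, and ε_aim:

```python
import numpy as np

def dual_method_parameters(alpha, G2, P, R, eps_aim):
    D = alpha**2 / (2 * G2 * P)                          # contraction constant
    eps = (-1 + np.sqrt(1 + 4 * D * eps_aim)) / (2 * D)  # eps + D*eps^2 = eps_aim
    mu = alpha * eps / (G2 * P)                          # stepsize of step (ii)
    N = int(np.ceil((R - eps_aim) / (D * eps**2)))       # sufficient iteration count
    return D, eps, mu, N

print(dual_method_parameters(alpha=0.1, G2=1.0, P=10.0, R=4.0, eps_aim=0.5))
```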
Figure 1 illustrates
the behavior of the iterates.
Figure 1: Geometry of the problem. The iterates x_n approach x† as long as Δ_2(x_n, x†) ≥ ϵ. The auxiliary number ϵ is chosen such that, once the iterates enter the Bregman ball with radius ε_aim around x†, the subsequent iterates stay in that ball.
(v) We are now
in the same situation as described in step (2). If we replace R by ε_aim, x_0 by x_N, and ε_aim by some ε_{aim,2} < ε_aim and repeat the
arguments of steps (2)–(4), we obtain a contracting sequence of Bregman balls.
If the sequence (ε_{aim,k})_k is a null
sequence, then by Lemma 2.7 the iterates x_n converge
strongly to x†. This proves the following.
Theorem 3.2.
The iterative method defined by
(S0) choose an arbitrary x_0 and a decreasing positive sequence (ε_k)_k with
lim_{k→∞} ε_k = 0 and Δ_2(x_0, x†) < ε_1;
set k = 1;
(S1) compute P, D, ϵ, and μ as
P = sup{‖ψ‖² : ψ ∈ ∂Ψ(x) with Δ_2(x, x†) ≤ ε_k},
D = α²/(2G_2P),
ϵ = (−1 + √(1 + 4D·ε_{k+1}))/(2D),
μ = (α/(G_2P))·ϵ;
(S2) perform the iteration (3.2) with stepsize μ_n = μ for at least N iterations,
where
N > (ε_k − ε_{k+1})/((1 + (1 − √(1 + 4Dε_{k+1}))/(2Dε_{k+1}))ε_{k+1});
(S3) let k ← (k + 1), reset P, D, ϵ, μ, N, and go to step (S1),
defines an
iterative minimization method for the Tikhonov functional Ψ defined in (3.1), and the iterates converge strongly
to the unique minimizer x†.
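For illustration, the following sketch runs the dual-space iteration (3.2) for X = ℓ^p with 1 < p ≤ 2 (which is 2-convex and smooth) in finite dimensions, with a Euclidean data term (r = 2). A small fixed stepsize replaces the schedule of Theorem 3.2, so this is a simplified variant of the method, not the certified scheme:

```python
import numpy as np

p, pc, r = 1.5, 3.0, 2.0     # X = l^p, conjugate exponent pc = p/(p-1); Y Euclidean
alpha, mu = 0.1, 0.05        # fixed small stepsize (illustration only)

def J(x, s, w):
    # duality mapping with weight w on l^s: ||x||_s^(w-s) |x_i|^(s-1) sign(x_i)
    n = np.linalg.norm(x, s)
    return np.zeros_like(x) if n == 0 else n**(w - s) * np.abs(x)**(s - 1) * np.sign(x)

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 5))
A /= np.linalg.norm(A, 2)    # normalize so the fixed stepsize is safe
y = rng.standard_normal(8)

x = np.zeros(5)
xstar = np.zeros(5)          # x_0* = J_2(x_0) = 0
for n in range(5000):
    psi = A.T @ J(A @ x - y, r, r) + alpha * J(x, p, 2)  # psi_n in dPsi(x_n)
    xstar = xstar - mu * psi                             # step in the dual space X*
    x = J(xstar, pc, 2)                                  # x_{n+1} = J_2^*(x_{n+1}*)

print(np.linalg.norm(A.T @ J(A @ x - y, r, r) + alpha * J(x, p, 2)))  # small near x†
```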
Remark 3.3.
A similar construction can be carried out
for any p-convex and
smooth Banach space.
4. Steepest Descent Method
Let X be uniformly
convex and uniformly smooth and let Y be uniformly
smooth. Then the Tikhonov functional
Ψ(x) := (1/r)‖Ax − y‖^r + (α/p)‖x‖^p (4.1)
is strictly
convex, weakly lower semicontinuous, coercive, and Gâteaux differentiable with
derivative
∇Ψ(x) = A* J_r(Ax − y) + α J_p(x). (4.2)
Hence, there
exists a unique minimizer x† of Ψ, which is characterized by
Ψ(x†) = min_{x∈X} Ψ(x) ⇔ ∇Ψ(x†) = 0. (4.3)
In this
section, we consider the steepest descent method to find x†. In [28, 29], it has already been proven that for a
general continuously differentiable functional Ψ, every cluster
point of such a steepest descent method is a stationary point. Recently, Canuto
and Urban [30] have shown strong convergence under the additional assumption of
ellipticity, which our Ψ in (4.1) would
fulfill if we required X to be p-convex. Here
we prove strong convergence without this additional assumption. To make the
proof of convergence more transparent, we confine ourselves here to the case of an r-smooth Y and a p-smooth X (where r, p ∈ (1, 2] are the exponents
appearing in the definition of the Tikhonov functional (4.1)) and refer the
interested reader to the appendix, where we prove the general case.
Theorem 4.1.
The sequence (x_n)_n generated by
(S0) choose an arbitrary starting point x_0 ∈ X and set n = 0;
(S1) if ∇Ψ(x_n) = 0, then STOP; else do a line search to find μ_n > 0 such that
Ψ(x_n − μ_n J_q^*(∇Ψ(x_n))) = min_{μ∈ℝ} Ψ(x_n − μ J_q^*(∇Ψ(x_n)));
(S2) set x_{n+1} := x_n − μ_n J_q^*(∇Ψ(x_n)), n ← (n + 1), and go to step (S1),
converges
strongly to the unique minimizer x† of Ψ.
Remark 4.2.
(a) If the stopping
criterion ∇Ψ(x_n) = 0 is fulfilled
for some n ∈ ℕ, then by (4.3) we already have x_n = x† and we can stop
iterating.
(b) Due to the
properties of Ψ, the function f_n : ℝ → [0, ∞) defined by
f_n(μ) := Ψ(x_n − μ J_q^*(∇Ψ(x_n)))
appearing in
the line search of step (S1) is strictly
convex and differentiable with continuous derivative
f_n′(μ) = −⟨∇Ψ(x_n − μ J_q^*(∇Ψ(x_n))), J_q^*(∇Ψ(x_n))⟩.
Since f_n′(0) = −‖∇Ψ(x_n)‖^q < 0 and f_n′ is increasing
by the monotonicity of the duality mappings, we know that μ_n must in fact be
positive.
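A sketch of the method of Theorem 4.1, again for X = ℓ^p in finite dimensions with q = p/(p−1) and a Euclidean data term. The exact line search over μ ∈ ℝ is replaced by a bounded scalar minimization (an assumption of this illustration), which by Remark 4.2 still returns a positive μ_n:

```python
import numpy as np
from scipy.optimize import minimize_scalar

p, q, r, alpha = 1.5, 3.0, 2.0, 0.1   # q = p/(p-1) is the conjugate exponent

def J(x, s, w):   # duality mapping with weight w on l^s
    n = np.linalg.norm(x, s)
    return np.zeros_like(x) if n == 0 else n**(w - s) * np.abs(x)**(s - 1) * np.sign(x)

rng = np.random.default_rng(2)
A, y = rng.standard_normal((8, 5)), rng.standard_normal(8)

def Psi(x):
    return np.linalg.norm(A @ x - y, r)**r / r + alpha * np.linalg.norm(x, p)**p / p

def grad(x):      # (4.2): grad Psi(x) = A* J_r(Ax - y) + alpha J_p(x)
    return A.T @ J(A @ x - y, r, r) + alpha * J(x, p, p)

x = np.zeros(5)
for n in range(200):
    g = grad(x)
    if np.linalg.norm(g, q) < 1e-10:
        break                              # stopping criterion of step (S1)
    d = J(g, q, q)                         # search direction J_q^*(grad Psi(x_n))
    mu = minimize_scalar(lambda m: Psi(x - m * d), bounds=(0.0, 10.0),
                         method="bounded").x
    x = x - mu * d

print(Psi(x), np.linalg.norm(grad(x), q))  # gradient norm should be near zero
```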
Proof of Theorem 4.1.
By the above remark, it suffices to prove convergence in the case ∇Ψ(x_n) ≠ 0 for all n ∈ ℕ. We fix γ ∈ (0, 1) and show that
there exists a positive μ̃_n such that
Ψ(x_{n+1}) ≤ Ψ(x_n) − μ̃_n ‖∇Ψ(x_n)‖^q (1 − γ),
which will
finally assure convergence. To establish this relation, we use the
characteristic inequalities in Theorem 2.3 to estimate, for all μ > 0,
Ψ(x_{n+1}) ≤ Ψ(x_n − μ J_q^*(∇Ψ(x_n)))
= (1/r)‖(Ax_n − y) − μ A J_q^*(∇Ψ(x_n))‖^r + (α/p)‖x_n − μ J_q^*(∇Ψ(x_n))‖^p
≤ (1/r)‖Ax_n − y‖^r − ⟨J_r(Ax_n − y), μ A J_q^*(∇Ψ(x_n))⟩ + (G_r/r)‖μ A J_q^*(∇Ψ(x_n))‖^r
+ (α/p)‖x_n‖^p − α⟨J_p(x_n), μ J_q^*(∇Ψ(x_n))⟩ + (α G_p/p)‖μ J_q^*(∇Ψ(x_n))‖^p.
By (4.1) and (4.2)
for x = x_n and
⟨∇Ψ(x_n), J_q^*(∇Ψ(x_n))⟩ = ‖∇Ψ(x_n)‖^q = ‖J_q^*(∇Ψ(x_n))‖^p,
we can further estimate
Ψ(x_{n+1}) ≤ Ψ(x_n) − μ‖∇Ψ(x_n)‖^q + (G_r/r)‖A J_q^*(∇Ψ(x_n))‖^r μ^r + (α G_p/p)‖∇Ψ(x_n)‖^q μ^p
= Ψ(x_n) − μ‖∇Ψ(x_n)‖^q (1 − ϕ_n(μ)),
whereby we set
ϕ_n(μ) := (G_r/r)(‖A J_q^*(∇Ψ(x_n))‖^r/‖∇Ψ(x_n)‖^q) μ^{r−1} + (α G_p/p) μ^{p−1}. (4.11)
The function ϕ_n : (0, ∞) → (0, ∞) is continuous
and increasing with lim_{μ→0} ϕ_n(μ) = 0 and lim_{μ→∞} ϕ_n(μ) = ∞. Hence, there exists a μ̃_n > 0 such that
ϕ_n(μ̃_n) = γ, (4.12)
and we get
Ψ(x_{n+1}) ≤ Ψ(x_n) − μ̃_n‖∇Ψ(x_n)‖^q(1 − γ). (4.13)
We show that lim_{n→∞}‖∇Ψ(x_n)‖ = 0. From (4.13), we infer that the sequence (Ψ(x_n))_n is decreasing
and in particular bounded, and that
lim_{n→∞} μ̃_n‖∇Ψ(x_n)‖^q = 0. (4.14)
Since Ψ is coercive,
the sequence (x_n)_n remains bounded,
and (4.2) then implies that the sequence (∇Ψ(x_n))_n is bounded as
well. Suppose limsup_{n→∞}‖∇Ψ(x_n)‖ = ϵ > 0 and let ‖∇Ψ(x_{n_k})‖ → ϵ for k → ∞. Then we must have lim_{k→∞} μ̃_{n_k} = 0 by (4.14). But by
the definition of ϕ_n (4.11) and the
choice of μ̃_n (4.12), we get,
for some constant C > 0 with ‖A J_q^*(∇Ψ(x_n))‖^r ≤ C,
0 < γ = ϕ_{n_k}(μ̃_{n_k}) ≤ (G_r/r)(C/‖∇Ψ(x_{n_k})‖^q) μ̃_{n_k}^{r−1} + (α G_p/p) μ̃_{n_k}^{p−1}.
Since the
right-hand side converges to zero for k → ∞, this leads to a contradiction. So we have limsup_{n→∞}‖∇Ψ(x_n)‖ = 0 and thus lim_{n→∞}‖∇Ψ(x_n)‖ = 0. We finally show that (x_n)_n converges
strongly to x†. By (4.3) and the monotonicity of the duality mapping J_r, we get
‖∇Ψ(x_n)‖ ‖x_n − x†‖ ≥ ⟨∇Ψ(x_n), x_n − x†⟩ = ⟨∇Ψ(x_n) − ∇Ψ(x†), x_n − x†⟩
= ⟨J_r(Ax_n − y) − J_r(Ax† − y), (Ax_n − y) − (Ax† − y)⟩ + α⟨J_p(x_n) − J_p(x†), x_n − x†⟩
≥ α⟨J_p(x_n) − J_p(x†), x_n − x†⟩.
Since (x_n)_n is bounded and lim_{n→∞}‖∇Ψ(x_n)‖ = 0, this yields
lim_{n→∞} ⟨J_p(x_n) − J_p(x†), x_n − x†⟩ = 0,
from which we
infer that (x_n)_n converges
strongly to x† in the uniformly
convex space X [15, Theorem
II.2.17].
5. Conclusions
We have analyzed two conceptually quite different
nonlinear iterative methods for finding the minimizer of norm-based Tikhonov
functionals in Banach spaces. One is the steepest descent method, where the
iterations are carried out directly in the space X
by pulling the gradient of the Tikhonov functional back to X via duality
mappings. This method is shown to be strongly convergent in case X is uniformly
convex and uniformly smooth and Y is uniformly smooth. In the other one, the iterations are performed in the
dual space X*. Though this method seems to be inherently slow,
strong convergence can be shown without restrictions on the space Y.
Appendix. Steepest Descent Method in Uniformly Smooth Spaces
As already pointed out in Section 4, we prove here
Theorem 4.1 for the general case of X being uniformly
convex and uniformly smooth and Y being uniformly
smooth, and with r,p≥2 in the
definition of the Tikhonov functional (4.1). To do so, we need some additional
results based on the paper of Xu and Roach [19].
In what follows, C, L > 0 are always
supposed to be (generic) constants, and we write
a ∨ b = max{a, b}, a ∧ b = min{a, b}. (A.1)
Let ρ̄_X : (0, ∞) → (0, 1] be the function
ρ̄_X(τ) := ρ_X(τ)/τ, (A.2)
where ρ_X is the modulus
of smoothness of a Banach space X. The function ρ̄_X is known to be
continuous and nondecreasing [14, 31].
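For a Hilbert space, the modulus of smoothness is known explicitly, ρ(τ) = √(1 + τ²) − 1 (a classical fact, stated here without derivation), which makes the behavior of ρ̄ easy to inspect:

```python
import numpy as np

def rho_bar_hilbert(tau):
    # rho(tau)/tau with rho(tau) = sqrt(1 + tau^2) - 1 (Hilbert space case)
    return (np.sqrt(1 + tau**2) - 1) / tau

taus = np.array([1e-3, 1e-2, 1e-1, 1.0, 10.0])
print(rho_bar_hilbert(taus))  # nondecreasing, in (0, 1], and -> 0 as tau -> 0
```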
The next lemma allows us to estimate ∥Jp(x)−Jp(y)∥ via ρ¯X(∥x−y∥), which in turn will be used to derive a version of
the characteristic inequality that is more convenient for our purpose.
Lemma A.1.
Let X be a uniformly
smooth Banach space with duality mapping J_p with weight p ≥ 2. Then for all x, y ∈ X the following
inequalities are valid:
‖J_p(x) − J_p(y)‖ ≤ C max{1, (‖x‖ ∨ ‖y‖)^{p−1}} ρ̄_X(‖x − y‖) (A.3)
(hence, J_p is uniformly
continuous on bounded sets) and
‖x − y‖^p ≤ ‖x‖^p − p⟨J_p(x), y⟩ + C(1 ∨ (‖x‖ + ‖y‖)^{p−1}) ρ_X(‖y‖). (A.4)
Proof.
We first prove (A.3). By [19, formula (3.1)], we have
‖J_p(x) − J_p(y)‖ ≤ C(‖x‖ ∨ ‖y‖)^{p−1} ρ̄_X(‖x − y‖/(‖x‖ ∨ ‖y‖)).
We estimate
similarly as after inequality (3.5) in the same paper. If 1/(‖x‖ ∨ ‖y‖) ≤ 1, then by the monotonicity of ρ̄_X we get
ρ̄_X(‖x − y‖/(‖x‖ ∨ ‖y‖)) ≤ ρ̄_X(‖x − y‖),
and therefore
(A.3) is valid. In the case 1/(‖x‖ ∨ ‖y‖) ≥ 1 (⇔ ‖x‖ ∨ ‖y‖ ≤ 1), we use the
fact that ρ_X is equivalent
to a decreasing function (i.e., ρ_X(η)/η² ≤ L·ρ_X(τ)/τ² for η ≥ τ > 0 [14]) and get
ρ_X(‖x − y‖/(‖x‖ ∨ ‖y‖)) ≤ (L/(‖x‖ ∨ ‖y‖)²) ρ_X(‖x − y‖)
and therefore
ρ̄_X(‖x − y‖/(‖x‖ ∨ ‖y‖)) ≤ (L/(‖x‖ ∨ ‖y‖)) ρ̄_X(‖x − y‖).
For p ≥ 2, we thus arrive
at
‖J_p(x) − J_p(y)‖ ≤ C L (‖x‖ ∨ ‖y‖)^{p−2} ρ̄_X(‖x − y‖) ≤ C L ρ̄_X(‖x − y‖),
and also in
this case (A.3) is valid.
Let us prove (A.4). As in [19], we consider the
continuously differentiable function f : [0, 1] → ℝ with
f(t) := ‖x − ty‖^p, f′(t) = −p⟨J_p(x − ty), y⟩,
f(0) = ‖x‖^p, f(1) = ‖x − y‖^p, f′(0) = −p⟨J_p(x), y⟩,
and get
‖x − y‖^p − ‖x‖^p + p⟨J_p(x), y⟩ = f(1) − f(0) − f′(0) = ∫₀¹ (f′(t) − f′(0)) dt
= p ∫₀¹ ⟨J_p(x) − J_p(x − ty), y⟩ dt ≤ p ∫₀¹ ‖J_p(x) − J_p(x − ty)‖ ‖y‖ dt.
For t ∈ [0, 1], we set ỹ := x − ty and get x − ỹ = ty, ‖ỹ‖ ≤ ‖x‖ + ‖y‖, and thus ‖x‖ ∨ ‖ỹ‖ ≤ ‖x‖ + ‖y‖. By the monotonicity of ρ̄_X, we have
ρ̄_X(t‖y‖)‖y‖ ≤ ρ̄_X(‖y‖)‖y‖ = ρ_X(‖y‖),
and by (A.3) we
thus obtain
‖x − y‖^p − ‖x‖^p + p⟨J_p(x), y⟩ ≤ p ∫₀¹ C max{1, (‖x‖ + ‖y‖)^{p−1}} ρ̄_X(t‖y‖)‖y‖ dt ≤ C max{1, (‖x‖ + ‖y‖)^{p−1}} ρ_X(‖y‖).
The proof of Theorem 4.1 is now quite similar to the
case of smoothness of power type, though it is more technical, and we only give
the main modifications.
Proof of
Theorem 4.1 (for uniformly smooth spaces).
We fix γ ∈ (0, 1), μ̄ > 0 and for n ∈ ℕ we choose μ̃_n ∈ (0, μ̄] such that
ϕ_n(μ̃_n) = ϕ_n(μ̄) ∧ γ. (A.14)
Here the
function ϕ_n : (0, ∞) → (0, ∞) is defined by
ϕ_n(μ) := (C_Y/r)(1 ∨ (‖Ax_n − y‖ + μ̄‖A J_q^*(∇Ψ(x_n))‖)^{r−1}) · (‖A J_q^*(∇Ψ(x_n))‖/‖∇Ψ(x_n)‖^q) · ρ̄_Y(μ‖A J_q^*(∇Ψ(x_n))‖)
+ (α C_X/p)(1 ∨ (‖x_n‖ + μ̄‖∇Ψ(x_n)‖^{q−1})^{p−1}) · (1/‖∇Ψ(x_n)‖) · ρ̄_X(μ‖∇Ψ(x_n)‖^{q−1}) (A.15)
with the
constants C_X, C_Y being the ones
appearing in the respective characteristic inequalities (A.4). This choice of μ̃_n is possible
since, by the properties of ρ̄_Y and ρ̄_X, the function ϕ_n is continuous,
increasing, and lim_{μ→0} ϕ_n(μ) = 0. We again aim at an inequality of the form
Ψ(x_{n+1}) ≤ Ψ(x_n) − μ̃_n‖∇Ψ(x_n)‖^q(1 − γ),
which will
finally assure convergence. Here we use the characteristic inequalities (A.4) to
estimate
Ψ(x_{n+1}) ≤ Ψ(x_n) − μ̃_n‖∇Ψ(x_n)‖^q
+ (C_Y/r)(1 ∨ (‖Ax_n − y‖ + ‖μ̃_n A J_q^*(∇Ψ(x_n))‖)^{r−1}) ρ_Y(‖μ̃_n A J_q^*(∇Ψ(x_n))‖)
+ (α C_X/p)(1 ∨ (‖x_n‖ + ‖μ̃_n J_q^*(∇Ψ(x_n))‖)^{p−1}) ρ_X(‖μ̃_n J_q^*(∇Ψ(x_n))‖).
Since μ̃_n ≤ μ̄ and by the
definition of ϕ_n (A.15), we can
further estimate
Ψ(x_{n+1}) ≤ Ψ(x_n) − μ̃_n‖∇Ψ(x_n)‖^q
+ (C_Y/r)(1 ∨ (‖Ax_n − y‖ + μ̄‖A J_q^*(∇Ψ(x_n))‖)^{r−1}) ρ_Y(μ̃_n‖A J_q^*(∇Ψ(x_n))‖)
+ (α C_X/p)(1 ∨ (‖x_n‖ + μ̄‖J_q^*(∇Ψ(x_n))‖)^{p−1}) ρ_X(μ̃_n‖J_q^*(∇Ψ(x_n))‖)
= Ψ(x_n) − μ̃_n‖∇Ψ(x_n)‖^q(1 − ϕ_n(μ̃_n)).
The choice of μ̃_n (A.14) finally
yields
Ψ(x_{n+1}) ≤ Ψ(x_n) − μ̃_n‖∇Ψ(x_n)‖^q(1 − γ). (A.19)
It remains to
show that this implies lim_{n→∞}‖∇Ψ(x_n)‖ = 0; the rest then follows analogously to the proof of Theorem
4.1. From (A.19), we infer that
lim_{n→∞} μ̃_n‖∇Ψ(x_n)‖^q = 0 (A.20)
and that the
sequences (x_n)_n and (∇Ψ(x_n))_n are bounded.
Suppose limsup_{n→∞}‖∇Ψ(x_n)‖ = ϵ > 0 and let ‖∇Ψ(x_{n_k})‖ → ϵ for k → ∞. Then we must have lim_{k→∞} μ̃_{n_k} = 0 by (A.20). We
show that this leads to a contradiction. On the one hand, by (A.15), we get
ϕ_{n_k}(μ̃_{n_k}) ≤ (L_1/‖∇Ψ(x_{n_k})‖^q) ρ̄_Y(μ̃_{n_k} L_2) + (C_1/‖∇Ψ(x_{n_k})‖) ρ̄_X(μ̃_{n_k} C_2).
Since the right-hand side converges to zero for k → ∞, so does ϕ_{n_k}(μ̃_{n_k}). On the other hand, by (A.14), we have
ϕ_{n_k}(μ̃_{n_k}) = ϕ_{n_k}(μ̄) ∧ γ, ϕ_{n_k}(μ̄) ≥ 0 + C ρ̄_X(μ̄‖∇Ψ(x_{n_k})‖^{q−1}).
Hence, ϕ_{n_k}(μ̃_{n_k}) ≥ L > 0 for all k big enough,
which contradicts lim_{k→∞} ϕ_{n_k}(μ̃_{n_k}) = 0. So we have limsup_{n→∞}‖∇Ψ(x_n)‖ = 0 and thus lim_{n→∞}‖∇Ψ(x_n)‖ = 0.
Acknowledgment
The first author was supported by Deutsche Forschungsgemeinschaft, Grant no. MA 1657/15-1.
References
[1] A. K. Louis, Inverse und schlecht gestellte Probleme, Teubner Studienbücher Mathematik, B. G. Teubner, Stuttgart, Germany, 1989.
[2] A. Rieder, Keine Probleme mit inversen Problemen, Vieweg & Sohn, Braunschweig, Germany, 2003.
[3] H. W. Engl, M. Hanke, and A. Neubauer, Regularization of Inverse Problems, Kluwer Academic, Dordrecht, The Netherlands, 2000.
[4] Y. I. Alber, "Iterative regularization in Banach spaces," Soviet Mathematics, vol. 30, no. 4, 1986.
[5] R. Plato, "On the discrepancy principle for iterative and parametric methods to solve linear ill-posed equations," Numerische Mathematik, vol. 75, no. 1, pp. 99–120, 1996.
[6] S. Osher, M. Burger, D. Goldfarb, J. Xu, and W. Yin, "An iterative regularization method for total variation-based image restoration," Multiscale Modeling & Simulation, vol. 4, no. 2, pp. 460–489, 2005.
[7] D. Butnariu and E. Resmerita, "Bregman distances, totally convex functions, and a method for solving operator equations in Banach spaces," Abstract and Applied Analysis, vol. 2006, Article ID 84919, 39 pages, 2006.
[8] F. Schöpfer, A. K. Louis, and T. Schuster, "Nonlinear iterative methods for linear ill-posed problems in Banach spaces," Inverse Problems, vol. 22, no. 1, pp. 311–329, 2006.
[9] I. Daubechies, M. Defrise, and C. De Mol, "An iterative thresholding algorithm for linear inverse problems with a sparsity constraint," Communications on Pure and Applied Mathematics, vol. 57, no. 11, pp. 1413–1457, 2004.
[10] K. Bredies, D. Lorenz, and P. Maass, "A generalized conditional gradient method and its connection to an iterative shrinkage method," to appear in Computational Optimization and Applications.
[11] Y. I. Alber, A. N. Iusem, and M. V. Solodov, "Minimization of nonsmooth convex functionals in Banach spaces," Journal of Convex Analysis, vol. 4, no. 2, pp. 235–255, 1997.
[12] S. Reich and A. J. Zaslavski, "Generic convergence of descent methods in Banach spaces," Mathematics of Operations Research, vol. 25, no. 2, pp. 231–242, 2000.
[13] S. Reich and A. J. Zaslavski, "The set of divergent descent methods in a Banach space is σ-porous," SIAM Journal on Optimization, vol. 11, no. 4, pp. 1003–1018, 2001.
[14] J. Lindenstrauss and L. Tzafriri, Classical Banach Spaces II: Function Spaces, vol. 97 of Results in Mathematics and Related Areas, Springer, Berlin, Germany, 1979.
[15] I. Cioranescu, Geometry of Banach Spaces, Duality Mappings and Nonlinear Problems, vol. 62 of Mathematics and Its Applications, Kluwer Academic Publishers, Dordrecht, The Netherlands, 1990.
[16] R. Deville, G. Godefroy, and V. Zizler, Smoothness and Renormings in Banach Spaces, vol. 64 of Pitman Monographs and Surveys in Pure and Applied Mathematics, Longman Scientific & Technical, Harlow, UK, 1993.
[17] A. Dvoretzky, "Some results on convex bodies and Banach spaces," in Proceedings of the International Symposium on Linear Spaces, pp. 123–160, Jerusalem Academic Press, Jerusalem, Israel, 1961.
[18] O. Hanner, "On the uniform convexity of L^p and l^p," Arkiv för Matematik, vol. 3, pp. 239–244, 1956.
[19] Z. B. Xu and G. F. Roach, "Characteristic inequalities of uniformly convex and uniformly smooth Banach spaces," Journal of Mathematical Analysis and Applications, vol. 157, no. 1, pp. 189–210, 1991.
[20] S. Reich, review of I. Cioranescu, "Geometry of Banach spaces, duality mappings and nonlinear problems," Bulletin of the American Mathematical Society, vol. 26, no. 2, pp. 367–370, 1992.
[21] L. M. Bregman, "The relaxation method for finding common points of convex sets and its application to the solution of problems in convex programming," USSR Computational Mathematics and Mathematical Physics, vol. 7, pp. 200–217, 1967.
[22] C. Byrne and Y. Censor, "Proximity function minimization using multiple Bregman projections, with applications to split feasibility and Kullback-Leibler distance minimization," Annals of Operations Research, vol. 105, no. 1–4, pp. 77–98, 2001.
[23] Y. I. Alber and D. Butnariu, "Convergence of Bregman projection methods for solving consistent convex feasibility problems in reflexive Banach spaces," Journal of Optimization Theory and Applications, vol. 92, no. 1, pp. 33–61, 1997.
[24] H. H. Bauschke, J. M. Borwein, and P. L. Combettes, "Bregman monotone optimization algorithms," SIAM Journal on Control and Optimization, vol. 42, no. 2, pp. 596–636, 2003.
[25] H. H. Bauschke and A. S. Lewis, "Dykstra's algorithm with Bregman projections: a convergence proof," Optimization, vol. 48, no. 4, pp. 409–427, 2000.
[26] J. D. Lafferty, S. Della Pietra, and V. Della Pietra, "Statistical learning algorithms based on Bregman distances," in Proceedings of the 5th Canadian Workshop on Information Theory, Toronto, Ontario, Canada, June 1997.
[27] I. Ekeland and R. Temam, Convex Analysis and Variational Problems, North-Holland, Amsterdam, The Netherlands, 1976.
[28] R. R. Phelps, "Metric projections and the gradient projection method in Banach spaces," SIAM Journal on Control and Optimization, vol. 23, no. 6, pp. 973–977, 1985.
[29] R. H. Byrd and R. A. Tapia, "An extension of Curry's theorem to steepest descent in normed linear spaces," Mathematical Programming, vol. 9, no. 1, pp. 247–254, 1975.
[30] C. Canuto and K. Urban, "Adaptive optimization of convex functionals in Banach spaces," SIAM Journal on Numerical Analysis, vol. 42, no. 5, pp. 2043–2075, 2005.
[31] T. Figiel, "On the moduli of convexity and smoothness," Studia Mathematica, vol. 56, no. 2, pp. 121–155, 1976.