Tikhonov functionals are known to be well suited for obtaining regularized
solutions of linear operator equations. We analyze two iterative
methods for finding the minimizer of norm-based Tikhonov functionals in
Banach spaces. One is the steepest descent method, whereby the iterations
are directly carried out in the underlying space, and the other one performs
iterations in the dual space. We prove strong convergence of both methods.
1. Introduction
This article is concerned with the stable solution of
operator equations of the first kind in Banach spaces. More precisely, we aim
at computing a solution x ∈ X of
Ax = y + η, (1.1)
for a linear,
continuous mapping A:X→Y, where X and Y
are Banach spaces and y∈Y denotes the
measured data, which are contaminated by some noise η ∈ Y. There exists a large
variety of regularization methods for (1.1) in the case that X and Y are Hilbert
spaces, such as the truncated singular value decomposition, Tikhonov-Phillips regularization, or iterative solvers like the Landweber method and the
method of conjugate gradients. We refer to the monographs of Louis [1], Rieder
[2], Engl et al. [3] for a comprehensive study of solution
methods for inverse problems in Hilbert spaces.
The development of explicit solvers for operator
equations in Banach spaces is an active field of research of great
importance, since the Banach space setting provides a mathematical
framework that is often better adjusted to the
requirements of a specific application. Alber [4] established an iterative
regularization scheme in Banach spaces to solve (1.1) in the particular case
that A : X → X* is a monotone
operator. In the case X = Y, Plato [5] applied a linear Landweber method
together with the discrepancy principle in order to get a solution to (1.1) after
a discretization. Osher et al. [6] developed an iterative algorithm for image
restoration by minimizing the BV norm. Butnariu
and Resmerita [7] used Bregman projections to obtain a weakly convergent
algorithm for solving (1.1) in a Banach space setting. Schöpfer et al. [8] proved strong convergence and stability of a nonlinear Landweber
method for solving (1.1) in connection with the discrepancy principle in a fairly
general setting where X has to be
smooth and uniformly convex.
The idea of this paper is to get a solver for (1.1) by
minimizing a Tikhonov functional where we use Banach space norms in the data
term as well as in the penalty term. Since we only consider the case of exact
data we put η=0 in (1.1). That
means that we investigate the problem
min_{x∈X} Ψ(x), (1.2)
where the
Tikhonov functional Ψ : X → ℝ is given by
Ψ(x) = (1/r)‖Ax − y‖_Y^r + (α/p)‖x‖_X^p, (1.3)
with a continuous linear operator A:X→Y mapping between
two Banach spaces X and Y.
If X and Y are Hilbert
spaces, many results on solution methods for problem (1.2), on their
convergence and stability, and on parameter choice rules for α can be found in
the literature. In the case that only Y is a Hilbert
space, this problem has been thoroughly studied and many solvers have been
established; see [9, 10]. One possibility for obtaining an approximate solution of (1.2)
is to use the steepest descent method. Assume for the moment that both X and Y are Hilbert
spaces and r = p = 2. Then Ψ is Gâteaux
differentiable and the steepest descent method applied to (1.2) yields the well-known Landweber-type iteration
x_{n+1} = x_n − μ_n ∇Ψ(x_n) = x_n − μ_n (A*(Ax_n − y) + α x_n). (1.4)
This iterative
method converges to the unique minimizer of problem (1.2) if the stepsize μ_n is chosen
properly.
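To make the Hilbert space case concrete, the following sketch (illustrative only, with randomly generated A and y as assumed data and a fixed stepsize below 2/(‖A‖² + α)) runs iteration (1.4) and compares the result with the closed-form Tikhonov minimizer:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))   # example forward operator (assumed data)
y = A @ rng.standard_normal(10)     # exact data, i.e., eta = 0
alpha = 0.1
mu = 1.0 / (np.linalg.norm(A, 2)**2 + alpha)  # fixed stepsize below 2/L

x = np.zeros(10)
for n in range(20000):
    # x_{n+1} = x_n - mu * grad Psi(x_n), grad Psi(x) = A^T(Ax - y) + alpha*x
    x = x - mu * (A.T @ (A @ x - y) + alpha * x)

# the iterates approach the unique minimizer (A^T A + alpha I)^{-1} A^T y
x_dagger = np.linalg.solve(A.T @ A + alpha * np.eye(10), A.T @ y)
print(np.linalg.norm(x - x_dagger))  # close to zero
```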
In the present paper, we consider two generalizations
of (1.4). First we notice that the natural extension of the gradient ∇Ψ for convex, but
not necessarily smooth, functionals Ψ is the notion
of the subdifferential ∂Ψ. We will elaborate the details later, but for the
time being we note that ∂Ψ is a set-valued
mapping, that is, ∂Ψ : X ⇉ X*. Here we make use of the usual notation in the
context of convex analysis, where f : X ⇉ Y means a mapping f from X to 2^Y. We then consider the formally defined iterative
scheme
x*_{n+1} = x*_n − μ_n ψ_n with ψ_n ∈ ∂Ψ(x_n), x_{n+1} = J_q^*(x*_{n+1}), (1.5)
where J_q^* : X* ⇉ X is a duality
mapping of X*. In the case of smooth Ψ we also
consider a second generalization of (1.4),
x_{n+1} = x_n − μ_n J_q^*(∇Ψ(x_n)). (1.6)
We will show that both schemes converge strongly to
the unique minimizer of problem (1.2) if μ_n is chosen
properly.
Alber et al. presented in [11] an
algorithm for the minimization of convex and not necessarily smooth functionals
on uniformly smooth and uniformly convex Banach spaces which looks very similar
to our first method in Section 3; there the authors impose summation
conditions on the stepsizes μ_n. However, only weak convergence of the proposed
scheme is shown. Another interesting approach to obtain convergence results of
descent methods in general Banach spaces can be found in the recent papers by
Reich and Zaslavski [12, 13]. We want to emphasize that the most important
novelties of the present paper are the strong convergence results.
In the next section, we give the necessary theoretical
tools and apply them in Sections 3 and 4 to describe the methods and prove their
convergence properties.
2. Preliminaries
Throughout the paper, let X and Y be Banach
spaces with duals X* and Y*. Their norms will be denoted by ∥⋅∥. We omit indices indicating the space since it will
become clear from the context which one is meant. For x ∈ X and x* ∈ X*, we write ⟨x, x*⟩ = ⟨x*, x⟩ = x*(x).
Let p, q ∈ (1, ∞) be conjugate
exponents, that is,
1/p + 1/q = 1. (2.1)
2.1. Convexity and Smoothness of Banach Spaces
We introduce
some definitions and preliminary results about the geometry of Banach spaces,
which can be found in [14, 15].
The functions δ_X : [0, 2] → [0, 1] and ρ_X : [0, ∞) → [0, ∞) defined by
δ_X(ε) = inf{1 − ‖(x + y)/2‖ : ‖x‖ = ‖y‖ = 1, ‖x − y‖ ≥ ε},
ρ_X(τ) = (1/2) sup{‖x + y‖ + ‖x − y‖ − 2 : ‖x‖ = 1, ‖y‖ ≤ τ} (2.2)
are referred to as the modulus of convexity of X and the modulus of smoothness of X, respectively.
Definition 2.1.
A Banach space X is said to be
(i) uniformly convex if δ_X(ε) > 0 for all ε ∈ (0, 2],
(ii) p-convex or convex of power type if δ_X(ε) ≥ C ε^p for some p > 1 and C > 0,
(iii) smooth if for every x ≠ 0 there is a unique x* ∈ X* such that ‖x*‖ = 1 and ⟨x*, x⟩ = ‖x‖,
(iv) uniformly smooth if lim_{τ→0} ρ_X(τ)/τ = 0,
(v) q-smooth or smooth of power type if ρ_X(τ) ≤ C τ^q for some q > 1 and C > 0.
There is a tight connection between the modulus of
convexity and the modulus of smoothness. The Lindenstrauss duality formula
implies that
X is p-convex iff X* is q-smooth, and X is q-smooth iff X* is p-convex
(cf. [16, Chapter II, Theorem 2.12]). From Dvoretzky's theorem [17], it follows that necessarily p ≥ 2 and q ≤ 2. For Hilbert spaces, the polarization identity
‖x − y‖² = ‖x‖² − 2⟨x, y⟩ + ‖y‖² (2.7)
asserts that
every Hilbert space is 2-convex and 2-smooth. For
the sequence spaces ℓ^p, Lebesgue spaces L^p, and Sobolev spaces W^m_p it is also
known [18, 19] that
ℓ^p, L^p, W^m_p with 1 < p ≤ 2 are 2-convex and p-smooth,
ℓ^q, L^q, W^m_q with 2 ≤ q < ∞ are q-convex and 2-smooth.
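As an illustrative aside (not part of the original analysis), the moduli can be explored numerically. The sketch below estimates δ_X for ℓ^p restricted to ℝ² by random sampling; since the infimum is approximated over finitely many pairs, the result is only an upper bound for the true modulus. For p = 2 it can be compared with the exact Hilbert space value δ(ε) = 1 − √(1 − ε²/4):

```python
import numpy as np
rng = np.random.default_rng(0)

def delta_lp_estimate(p, eps, dim=2, samples=200000):
    # sample unit-sphere pairs in (R^dim, ||.||_p) with ||x - y||_p >= eps
    x = rng.standard_normal((samples, dim))
    y = rng.standard_normal((samples, dim))
    x /= np.linalg.norm(x, p, axis=1, keepdims=True)
    y /= np.linalg.norm(y, p, axis=1, keepdims=True)
    ok = np.linalg.norm(x - y, p, axis=1) >= eps
    return np.min(1 - 0.5 * np.linalg.norm((x + y)[ok], p, axis=1))

eps = 1.0
print(delta_lp_estimate(2.0, eps), 1 - np.sqrt(1 - eps**2 / 4))  # ~0.134
print(delta_lp_estimate(1.5, eps))  # l^1.5 is 2-convex: delta(eps) >= C*eps^2
```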
2.2. Duality Mapping
For p > 1, the set-valued
mapping J_p : X ⇉ X* defined by
J_p(x) = {x* ∈ X* : ⟨x*, x⟩ = ‖x‖ ‖x*‖, ‖x*‖ = ‖x‖^{p−1}}
is called the duality mapping of X (with weight
function t ↦ t^{p−1}). By j_p we denote a
single-valued selection of J_p.
One can show [15, Theorem I.4.4] that J_p is monotone,
that is,
⟨x* − y*, x − y⟩ ≥ 0 for all x* ∈ J_p(x), y* ∈ J_p(y).
If X is smooth, the
duality mapping J_p is single
valued, that is, one can identify it with J_p : X → X* [15, Theorem I.4.5].
If X is uniformly
convex or uniformly smooth, then X is reflexive
[15, Theorems II.2.9 and II.2.15]. By Jp*, we then denote
the duality mapping from X* into X**=X.
Let ∂f : X ⇉ X* be the
subdifferential of a convex functional f : X → ℝ. At x ∈ X it is defined
by
x̄ ∈ ∂f(x) ⇔ f(y) ≥ f(x) + ⟨x̄, y − x⟩ for all y ∈ X.
Another important property of J_p is due to the
theorem of Asplund [15, Theorem I.4.4]:
J_p = ∂{(1/p)‖·‖^p}.
This equality is also valid in the case of set-valued
duality mappings.
Example 2.2.
In L^r spaces with 1 < r < ∞, we have
⟨J_p(f), g⟩ = ∫ ‖f‖_r^{p−r} |f(x)|^{r−1} sign(f(x)) · g(x) dx.
In the sequence spaces ℓ^r with 1 < r < ∞, we have
⟨J_p(x), y⟩ = Σ_i ‖x‖_r^{p−r} |x_i|^{r−1} sign(x_i) · y_i.
We also refer the interested reader to [20] where
additional information on duality mappings may be found.
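For later experiments it is convenient to have the sequence space formula of Example 2.2 in executable form. The following sketch (a minimal implementation of that formula, with the convention J_p(0) = 0) also checks the two defining properties of the duality mapping:

```python
import numpy as np

def duality_map_lr(x, r, p):
    # J_p on l^r: (J_p(x))_i = ||x||_r^(p-r) |x_i|^(r-1) sign(x_i)
    nrm = np.linalg.norm(x, r)
    if nrm == 0:
        return np.zeros_like(x)
    return nrm**(p - r) * np.abs(x)**(r - 1) * np.sign(x)

x, r, p = np.array([1.0, -2.0, 0.5]), 1.5, 2.0
jx = duality_map_lr(x, r, p)
# defining properties: <J_p(x), x> = ||x||^p and ||J_p(x)||_{r'} = ||x||^(p-1)
print(np.dot(jx, x), np.linalg.norm(x, r)**p)
print(np.linalg.norm(jx, r / (r - 1)), np.linalg.norm(x, r)**(p - 1))
```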
2.3. Xu-Roach Inequalities
The next theorem (see [19]) provides us with
inequalities which will be of great relevance for proving the convergence of
our methods.
Theorem 2.3.
(1) Let X be a p-smooth Banach space. Then there exists a positive constant Gp
such that
(1/p)‖x − y‖^p ≤ (1/p)‖x‖^p − ⟨J_p(x), y⟩ + (G_p/p)‖y‖^p for all x, y ∈ X.
(2) Let X be a q-convex Banach space. Then there exists a positive constant Cq such
that
(1/q)‖x − y‖^q ≥ (1/q)‖x‖^q − ⟨J_q(x), y⟩ + (C_q/q)‖y‖^q for all x, y ∈ X.
We remark that in a real Hilbert space these inequalities reduce to the well-known
polarization identity (2.7). Further, we refer to [19] for the exact values of the constants
G_p and C_q. For special cases like ℓ^p-spaces these constants have a simple form; see
[8].
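Since G_p is not computed explicitly here, one can probe Theorem 2.3 numerically. The sketch below (illustrative only, for ℓ^p in dimension 5) rearranges the first inequality and records the largest constant that random samples force, giving an empirical lower bound for any admissible G_p:

```python
import numpy as np
rng = np.random.default_rng(0)

def jp(x, p):  # duality mapping J_p on l^p (Example 2.2 with r = p)
    return np.abs(x)**(p - 1) * np.sign(x)

p, worst = 1.5, 0.0   # l^p with 1 < p <= 2 is p-smooth
for _ in range(10000):
    x, y = rng.standard_normal(5), rng.standard_normal(5)
    # inequality rearranged: (1/p)||x-y||^p - (1/p)||x||^p + <J_p(x),y> <= (G_p/p)||y||^p
    lhs = (np.linalg.norm(x - y, p)**p - np.linalg.norm(x, p)**p) / p \
          + np.dot(jp(x, p), y)
    worst = max(worst, p * lhs / np.linalg.norm(y, p)**p)
print(worst)  # any valid G_p must be at least this large
```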
2.4. Bregman Distances
It turns out that, due to the geometrical characteristics of Banach spaces other than
Hilbert spaces, it is often more appropriate to use Bregman distances instead of
conventional norm-based functionals ‖x − y‖ or ‖J_p(x) − J_p(y)‖ for the convergence analysis.
The idea to use such distances to design and analyze optimization algorithms
goes back to Bregman [21] and since then his ideas have been successfully applied in
various ways [4, 8, 22–26].
Definition 2.4.
Let X be smooth and
convex of power type. Then the Bregman distance Δ_p(x, y) is defined as
Δ_p(x, y) := (1/q)‖J_p(x)‖^q − ⟨J_p(x), y⟩ + (1/p)‖y‖^p.
We summarize a few facts concerning Bregman distances
and their relationship to the norm in X (see also [8, Theorem 2.12]).
Theorem 2.5.
Let X
be smooth and
convex of power type. Then for all p > 1, x, y ∈ X, and sequences (x_n)_n in X the following
holds:
(i) Δ_p(x, y) ≥ 0,
(ii) lim_{n→∞} ‖x_n − x‖ = 0 ⇔ lim_{n→∞} Δ_p(x_n, x) = 0,
(iii) Δ_p(·, y) is coercive, that is, the sequence (x_n)_n remains bounded
if the sequence (Δ_p(x_n, y))_n is bounded.
Remark 2.6.
Δ_p(·, ·) is in general
not a metric. In a real Hilbert space, Δ_2(x, y) = (1/2)‖x − y‖².
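Combining Definition 2.4 with Example 2.2 gives a directly computable expression on ℓ^r, where ‖J_p(x)‖^q = ‖x‖^p simplifies the first term. A minimal sketch, assuming the formulas above:

```python
import numpy as np

def bregman_dist(x, y, r=1.5, p=2.0):
    # Delta_p(x, y) = (1/q)||J_p(x)||^q - <J_p(x), y> + (1/p)||y||^p on l^r,
    # using (1/q)||J_p(x)||^q = (1/q)||x||^p; q is the conjugate exponent of p
    q = p / (p - 1)
    nx = np.linalg.norm(x, r)
    jpx = np.zeros_like(x) if nx == 0 else \
        nx**(p - r) * np.abs(x)**(r - 1) * np.sign(x)
    return nx**p / q - np.dot(jpx, y) + np.linalg.norm(y, r)**p / p

x, y = np.array([1.0, -2.0]), np.array([0.5, -1.5])
print(bregman_dist(x, y), bregman_dist(x, x))  # second value is exactly 0
```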
To shorten the proof in Section 3, we formulate and
prove the following.
Lemma 2.7.
Let X be a p-convex Banach
space. Then there exists a positive constant c such that
c · ‖x − y‖^p ≤ Δ_p(x, y).
Proof.
We have (1/q)‖J_p(x)‖^q = (1/q)‖x‖^p and ⟨J_p(x), x⟩ = ‖x‖^p, hence
Δ_p(x, y) = (1/q)‖J_p(x)‖^q − ⟨J_p(x), y⟩ + (1/p)‖y‖^p
= (1 − 1/p)‖x‖^p − ⟨J_p(x), y⟩ + (1/p)‖y‖^p
= (1/p)‖x − (x − y)‖^p − (1/p)‖x‖^p + ⟨J_p(x), x − y⟩.
By Theorem 2.3(2),
we obtain Δ_p(x, y) ≥ (C_p/p)‖x − y‖^p.
This completes the proof.
3. The Dual Method
This section deals with an iterative method for
minimizing functionals of Tikhonov type. In contrast to the algorithm described
in the next section, we iterate directly in the dual space X*.
For simplicity, we restrict ourselves to the
Tikhonov functional
Ψ(x) = (1/r)‖Ax − y‖_Y^r + (α/2)‖x‖_X² with r > 1, (3.1)
where X is a 2-convex and
smooth Banach space, Y is an arbitrary
Banach space, and A : X → Y is a linear,
continuous operator. For minimizing the functional, we choose an arbitrary starting
point x*_0 ∈ X* and consider
the following scheme:
x*_{n+1} = x*_n − μ_n ψ_n with ψ_n ∈ ∂Ψ(x_n), x_{n+1} = J_2^*(x*_{n+1}). (3.2)
We show the convergence of this method in a
constructive way. This will be done via the following steps.
(1) We show the inequality
Δ_2(x_{n+1}, x†) ≤ Δ_2(x_n, x†) − μ_n α · Δ_2(x_n, x†) + (G_2/2) μ_n² ‖ψ_n‖²,
where x† is the unique minimizer of the Tikhonov functional (3.1).
(2) We choose admissible stepsizes μ_n and show that the iterates approach x† in
the Bregman sense if we assume Δ_2(x_n, x†) ≥ ϵ. We suppose ϵ > 0 to be small; it will be
specified later.
(3) We establish an upper estimate for Δ_2(x_{n+1}, x†) in the case
that the condition Δ_2(x_n, x†) ≥ ϵ is violated.
(4) We choose ϵ such that in the case Δ_2(x_n, x†) < ϵ the iterates
stay in a certain Bregman ball, that is, Δ_2(x_{n+1}, x†) < ε_aim, where ε_aim is some a
priori chosen precision we want to achieve.
(5) Finally, we state the iterative minimization scheme.
(i) First, we
calculate the estimate for Δ_{n+1}, where
Δ_n := Δ_2(x_n, x†).
Under our assumptions on X, we know that Ψ has a unique
minimizer x†. Using (3.2), we get
Δ_{n+1} = (1/2)‖x*_{n+1}‖² − ⟨x*_{n+1}, x†⟩ + (1/2)‖x†‖²
= (1/2)‖x*_n − μ_n ψ_n‖² − ⟨x*_n − μ_n ψ_n, x†⟩ + (1/2)‖x†‖².
We recall
that X is 2-convex, hence X* is 2-smooth; see
Section 2.1. By Theorem 2.3 applied to X*, we get
(1/2)‖x*_n − μ_n ψ_n‖² ≤ (1/2)‖x*_n‖² − μ_n⟨x_n, ψ_n⟩ + (G_2/2)·μ_n²‖ψ_n‖².
We have
∂Ψ(x) = A* J_r(Ax − y) + α J_2(x)
(cf. [27, Chapter I, Propositions 5.6 and 5.7]). By definition, x† is the
minimizer of Ψ, hence ψ† := 0 ∈ ∂Ψ(x†). Therefore, with the monotonicity of J_r, we get
⟨ψ_n, x† − x_n⟩ = ⟨ψ_n − ψ†, x† − x_n⟩
= α⟨J_2(x_n) − J_2(x†), x† − x_n⟩ + ⟨A* j_r(Ax_n − y) − A* j_r(Ax† − y), x† − x_n⟩
= −α⟨J_2(x_n) − J_2(x†), x_n − x†⟩ − ⟨j_r(Ax_n − y) − j_r(Ax† − y), (Ax_n − y) − (Ax† − y)⟩
≤ −α⟨J_2(x_n) − J_2(x†), x_n − x†⟩.
Since ⟨J_2(x_n) − J_2(x†), x_n − x†⟩ = Δ_2(x_n, x†) + Δ_2(x†, x_n) ≥ Δ_n, we finally
arrive at the desired inequality
Δ_{n+1} ≤ Δ_n − μ_n α · Δ_n + (G_2/2) μ_n² ‖ψ_n‖². (3.12)
(ii) Next, we
choose admissible stepsizes. Assume that Δ_2(x_0, x†) = Δ_0 ≤ R.
We see that the choice
μ_n = (α/(G_2‖ψ_n‖²)) · Δ_n
minimizes the right-hand side of (3.12). We do
not know the distance Δ_n; therefore, we set
μ_n := (α/(G_2 P)) · ϵ.
We will impose additional conditions on ϵ later. For the
time being, assume that ϵ is small. The
number P is defined by
P = P(R) = sup{‖ψ‖² : ψ ∈ ∂Ψ(x) with Δ_2(x, x†) ≤ R}.
The Tikhonov functional Ψ is bounded on
norm-bounded sets, thus ∂Ψ is also bounded on
norm-bounded sets. By Lemma 2.7, we then know that
‖x_0 − x†‖ ≤ √(R/c). Hence, P is finite for
finite R.
Remark 3.1.
If we assume ‖x†‖ ≤ ρ,
then with the
help of Lemma 2.7, the definition of P, and the properties of the duality mapping J_2, we get an estimate for P. We have
‖x − x†‖ ≤ √(R/c), ‖x‖ ≤ ‖x − x†‖ + ‖x†‖ ≤ √(R/c) + ρ.
We calculate an
estimate for ‖ψ‖:
‖ψ‖ = ‖A* j_r(Ax − y) + α J_2(x)‖ ≤ ‖A*‖ ‖j_r(Ax − y)‖ + α‖J_2(x)‖
≤ ‖A*‖ ‖Ax − y‖^{r−1} + α‖x‖
≤ ‖A‖ (‖A‖[√(R/c) + ρ] + ‖y‖)^{r−1} + α(√(R/c) + ρ).
This
calculation gives us an estimate for P. In practice, we will not determine this estimate
exactly, but choose P sufficiently large.
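The estimate of Remark 3.1 is straightforward to evaluate; the following sketch (with hypothetical values for ‖A‖, ‖y‖, R, c, and ρ) turns it into a computable bound for P:

```python
import numpy as np

def P_bound(norm_A, norm_y, R, c, rho, alpha, r):
    # Remark 3.1: ||psi|| <= ||A||(||A||[sqrt(R/c)+rho] + ||y||)^(r-1)
    #             + alpha*(sqrt(R/c)+rho);  P = sup ||psi||^2 <= (bound)^2
    s = np.sqrt(R / c) + rho
    psi = norm_A * (norm_A * s + norm_y)**(r - 1) + alpha * s
    return psi**2

print(P_bound(norm_A=1.0, norm_y=1.0, R=4.0, c=0.5, rho=1.0, alpha=0.1, r=2.0))
```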
For Δ_n ≥ ϵ we approach the
minimizer x† in the Bregman
sense, that is,
Δ_{n+1} ≤ Δ_n − (α²/(G_2P))ϵ² + (α²/(2G_2P))ϵ² = Δ_n − (α²/(2G_2P))ϵ² =: Δ_n − Dϵ², (3.20)
where D := D(R) = α²/(2G_2P). This ensures
Δ_{n+1} < Δ_n < ⋯ < Δ_0
as long as Δ_n ≥ ϵ is fulfilled.
(iii) We know the
behavior of the Bregman distances if Δ_n ≥ ϵ holds. Next, we
need to know what happens if Δ_n < ϵ. By (3.12), we then have
Δ_{n+1} ≤ Δ_n + Dϵ² < ϵ + Dϵ².
(iv) We choose
ϵ := (−1 + √(1 + 4D·ε_aim))/(2D),
where ε_aim > 0 is the accuracy
we aim at. For the case Δ_n < ϵ this choice of ϵ assures that
Δ_{n+1} < ϵ + Dϵ² = ε_aim.
Note that the choice of ϵ implies ϵ ≤ ε_aim.
Next, we calculate an index N which ensures that the iterates x_n with n ≥ N are located in
a Bregman ball with radius ε_aim around x†. We know that if x_n fulfills Δ_n ≤ ε_aim, then all following
iterates fulfill this condition as well.
Hence, the opposite case is Δ_{n+1} ≥ ε_aim ≥ ϵ. By (3.20), we know that this is only possible if
ε_aim ≤ Δ_{n+1} ≤ R − nDϵ².
By choosing N such that
N > (R − ε_aim)/(Dϵ²) = (R − ε_aim)/((1 + (1 − √(1 + 4Dε_aim))/(2Dε_aim))ε_aim),
we get Δ_N ≤ ε_aim.
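The bookkeeping of steps (ii)–(iv) is summarized in the following sketch, which computes D, the auxiliary number ϵ, the stepsize μ, and a sufficient iteration count N from assumed values of α, G_2, P, R, and ε_aim:

```python
import numpy as np

def dual_method_parameters(alpha, G2, P, R, eps_aim):
    D = alpha**2 / (2 * G2 * P)                          # contraction constant
    eps = (-1 + np.sqrt(1 + 4 * D * eps_aim)) / (2 * D)  # eps + D*eps^2 = eps_aim
    mu = alpha * eps / (G2 * P)                          # stepsize of step (ii)
    N = int(np.ceil((R - eps_aim) / (D * eps**2)))       # sufficient iteration count
    return D, eps, mu, N

print(dual_method_parameters(alpha=0.1, G2=1.0, P=10.0, R=4.0, eps_aim=0.5))
```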
Figure 1 illustrates
the behavior of the iterates.
Figure 1: Geometry of the problem. The iterates x_n approach x† as long as Δ_2(x_n, x†) ≥ ϵ. The auxiliary number ϵ is chosen such that, once the iterates enter the Bregman ball with radius ε_aim around x†, the subsequent iterates stay in that ball.
(v) We are now
in the same situation as described in step (2). If we replace R by ε_aim, x_0 by x_N, and ε_aim by some ε_{aim,2} < ε_aim and repeat the
arguments of steps (2)–(4), we obtain a contracting sequence of Bregman balls.
If the sequence (ε_{aim,k})_k is a null
sequence, then by Lemma 2.7 the iterates x_n converge
strongly to x†. This proves the following.
Theorem 3.2.
The iterative method defined by
(S0) choose an arbitrary x_0 and a decreasing positive sequence (ε_k)_k with
lim_{k→∞} ε_k = 0 and Δ_2(x_0, x†) < ε_1;
set k = 1;
(S1) compute P, D, ϵ, and μ as
P = sup{‖ψ‖² : ψ ∈ ∂Ψ(x) with Δ_2(x, x†) ≤ ε_k},
D = α²/(2G_2P),
ϵ = (−1 + √(1 + 4D·ε_{k+1}))/(2D),
μ = (α/(G_2P))·ϵ;
(S2) perform the iteration (3.2) with stepsize μ_n = μ for at least N iterations,
where
N > (ε_k − ε_{k+1})/((1 + (1 − √(1 + 4Dε_{k+1}))/(2Dε_{k+1}))ε_{k+1});
(S3) let k ← (k + 1), reset P, D, ϵ, μ, N, and go to step (S1),
defines an
iterative minimization method for the Tikhonov functional Ψ defined in (3.1), and the iterates converge strongly
to the unique minimizer x†.
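For illustration, the following sketch runs the dual-space iteration (3.2) for X = ℓ^p with 1 < p ≤ 2 (which is 2-convex and smooth) in finite dimensions, with a Euclidean data term (r = 2). A small fixed stepsize replaces the schedule of Theorem 3.2, so this is a simplified variant of the method, not the certified scheme:

```python
import numpy as np

p, pc, r = 1.5, 3.0, 2.0     # X = l^p, conjugate exponent pc = p/(p-1); Y Euclidean
alpha, mu = 0.1, 0.05        # fixed small stepsize (illustration only)

def J(x, s, w):
    # duality mapping with weight w on l^s: ||x||_s^(w-s) |x_i|^(s-1) sign(x_i)
    n = np.linalg.norm(x, s)
    return np.zeros_like(x) if n == 0 else n**(w - s) * np.abs(x)**(s - 1) * np.sign(x)

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 5))
A /= np.linalg.norm(A, 2)    # normalize so the fixed stepsize is safe
y = rng.standard_normal(8)

x = np.zeros(5)
xstar = np.zeros(5)          # x_0* = J_2(x_0) = 0
for n in range(5000):
    psi = A.T @ J(A @ x - y, r, r) + alpha * J(x, p, 2)  # psi_n in dPsi(x_n)
    xstar = xstar - mu * psi                             # step in the dual space X*
    x = J(xstar, pc, 2)                                  # x_{n+1} = J_2^*(x_{n+1}*)

print(np.linalg.norm(A.T @ J(A @ x - y, r, r) + alpha * J(x, p, 2)))  # small near x†
```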
Remark 3.3.
A similar construction can be carried out
for any p-convex and
smooth Banach space.
4. Steepest Descent Method
Let X be uniformly
convex and uniformly smooth and let Y be uniformly
smooth. Then the Tikhonov functional
Ψ(x) := (1/r)‖Ax − y‖^r + (α/p)‖x‖^p (4.1)
is strictly
convex, weakly lower semicontinuous, coercive, and Gâteaux differentiable with
derivative
∇Ψ(x) = A* J_r(Ax − y) + α J_p(x). (4.2)
Hence, there
exists a unique minimizer x† of Ψ, which is characterized by
Ψ(x†) = min_{x∈X} Ψ(x) ⇔ ∇Ψ(x†) = 0. (4.3)
In this
section, we consider the steepest descent method to find x†. In [28, 29], it has already been proven that for a
general continuously differentiable functional Ψ, every cluster
point of such a steepest descent method is a stationary point. Recently, Canuto
and Urban [30] have shown strong convergence under the additional assumption of
ellipticity, which our Ψ in (4.1) would
fulfill if we required X to be p-convex. Here
we prove strong convergence without this additional assumption. To make the
proof of convergence more transparent, we confine ourselves here to the case of an r-smooth Y and a p-smooth X (where r, p ∈ (1, 2] are the exponents
appearing in the definition of the Tikhonov functional (4.1)) and refer the
interested reader to the appendix, where we prove the general case.
Theorem 4.1.
The sequence (x_n)_n generated by
(S0) choose an arbitrary starting point x_0 ∈ X and set n = 0;
(S1) if ∇Ψ(x_n) = 0, then STOP; else do a line search to find μ_n > 0 such that
Ψ(x_n − μ_n J_q^*(∇Ψ(x_n))) = min_{μ∈ℝ} Ψ(x_n − μ J_q^*(∇Ψ(x_n)));
(S2) set x_{n+1} := x_n − μ_n J_q^*(∇Ψ(x_n)), n ← (n + 1), and go to step (S1),
converges
strongly to the unique minimizer x† of Ψ.
Remark 4.2.
(a) If the stopping
criterion ∇Ψ(x_n) = 0 is fulfilled
for some n ∈ ℕ, then by (4.3) we already have x_n = x† and we can stop
iterating.
(b) Due to the
properties of Ψ, the function f_n : ℝ → [0, ∞) defined by
f_n(μ) := Ψ(x_n − μ J_q^*(∇Ψ(x_n)))
appearing in
the line search of step (S1) is strictly
convex and differentiable with continuous derivative
f_n′(μ) = −⟨∇Ψ(x_n − μ J_q^*(∇Ψ(x_n))), J_q^*(∇Ψ(x_n))⟩.
Since f_n′(0) = −‖∇Ψ(x_n)‖^q < 0 and f_n′ is increasing
by the monotonicity of the duality mappings, we know that μ_n must in fact be
positive.
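A sketch of the method of Theorem 4.1, again for X = ℓ^p in finite dimensions with q = p/(p−1) and a Euclidean data term. The exact line search over μ ∈ ℝ is replaced by a bounded scalar minimization (an assumption of this illustration), which by Remark 4.2 still returns a positive μ_n:

```python
import numpy as np
from scipy.optimize import minimize_scalar

p, q, r, alpha = 1.5, 3.0, 2.0, 0.1   # q = p/(p-1) is the conjugate exponent

def J(x, s, w):   # duality mapping with weight w on l^s
    n = np.linalg.norm(x, s)
    return np.zeros_like(x) if n == 0 else n**(w - s) * np.abs(x)**(s - 1) * np.sign(x)

rng = np.random.default_rng(2)
A, y = rng.standard_normal((8, 5)), rng.standard_normal(8)

def Psi(x):
    return np.linalg.norm(A @ x - y, r)**r / r + alpha * np.linalg.norm(x, p)**p / p

def grad(x):      # (4.2): grad Psi(x) = A* J_r(Ax - y) + alpha J_p(x)
    return A.T @ J(A @ x - y, r, r) + alpha * J(x, p, p)

x = np.zeros(5)
for n in range(200):
    g = grad(x)
    if np.linalg.norm(g, q) < 1e-10:
        break                              # stopping criterion of step (S1)
    d = J(g, q, q)                         # search direction J_q^*(grad Psi(x_n))
    mu = minimize_scalar(lambda m: Psi(x - m * d), bounds=(0.0, 10.0),
                         method="bounded").x
    x = x - mu * d

print(Psi(x), np.linalg.norm(grad(x), q))  # gradient norm should be near zero
```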
Proof of Theorem 4.1.
By the above remark, it suffices to prove convergence in the case ∇Ψ(x_n) ≠ 0 for all n ∈ ℕ. We fix γ ∈ (0, 1) and show that
there exists a positive μ̃_n such that
Ψ(x_{n+1}) ≤ Ψ(x_n) − μ̃_n ‖∇Ψ(x_n)‖^q (1 − γ),
which will
finally assure convergence. To establish this relation, we use the
characteristic inequalities in Theorem 2.3 to estimate, for all μ > 0,
Ψ(x_{n+1}) ≤ Ψ(x_n − μ J_q^*(∇Ψ(x_n)))
= (1/r)‖(Ax_n − y) − μ A J_q^*(∇Ψ(x_n))‖^r + (α/p)‖x_n − μ J_q^*(∇Ψ(x_n))‖^p
≤ (1/r)‖Ax_n − y‖^r − ⟨J_r(Ax_n − y), μ A J_q^*(∇Ψ(x_n))⟩ + (G_r/r)‖μ A J_q^*(∇Ψ(x_n))‖^r
+ (α/p)‖x_n‖^p − α⟨J_p(x_n), μ J_q^*(∇Ψ(x_n))⟩ + (α G_p/p)‖μ J_q^*(∇Ψ(x_n))‖^p.
By (4.1) and (4.2)
for x = x_n and
⟨∇Ψ(x_n), J_q^*(∇Ψ(x_n))⟩ = ‖∇Ψ(x_n)‖^q = ‖J_q^*(∇Ψ(x_n))‖^p,
we can further estimate
Ψ(x_{n+1}) ≤ Ψ(x_n) − μ‖∇Ψ(x_n)‖^q + (G_r/r)‖A J_q^*(∇Ψ(x_n))‖^r μ^r + (α G_p/p)‖∇Ψ(x_n)‖^q μ^p
= Ψ(x_n) − μ‖∇Ψ(x_n)‖^q (1 − ϕ_n(μ)),
whereby we set
ϕ_n(μ) := (G_r/r)(‖A J_q^*(∇Ψ(x_n))‖^r/‖∇Ψ(x_n)‖^q) μ^{r−1} + (α G_p/p) μ^{p−1}. (4.11)
The function ϕ_n : (0, ∞) → (0, ∞) is continuous
and increasing with lim_{μ→0} ϕ_n(μ) = 0 and lim_{μ→∞} ϕ_n(μ) = ∞. Hence, there exists a μ̃_n > 0 such that
ϕ_n(μ̃_n) = γ, (4.12)
and we get
Ψ(x_{n+1}) ≤ Ψ(x_n) − μ̃_n‖∇Ψ(x_n)‖^q(1 − γ). (4.13)
We show that lim_{n→∞}‖∇Ψ(x_n)‖ = 0. From (4.13), we infer that the sequence (Ψ(x_n))_n is decreasing
and in particular bounded, and that
lim_{n→∞} μ̃_n‖∇Ψ(x_n)‖^q = 0. (4.14)
Since Ψ is coercive,
the sequence (x_n)_n remains bounded,
and (4.2) then implies that the sequence (∇Ψ(x_n))_n is bounded as
well. Suppose limsup_{n→∞}‖∇Ψ(x_n)‖ = ϵ > 0 and let ‖∇Ψ(x_{n_k})‖ → ϵ for k → ∞. Then we must have lim_{k→∞} μ̃_{n_k} = 0 by (4.14). But by
the definition of ϕ_n (4.11) and the
choice of μ̃_n (4.12), we get,
for some constant C > 0 with ‖A J_q^*(∇Ψ(x_n))‖^r ≤ C,
0 < γ = ϕ_{n_k}(μ̃_{n_k}) ≤ (G_r/r)(C/‖∇Ψ(x_{n_k})‖^q) μ̃_{n_k}^{r−1} + (α G_p/p) μ̃_{n_k}^{p−1}.
Since the
right-hand side converges to zero for k → ∞, this leads to a contradiction. So we have limsup_{n→∞}‖∇Ψ(x_n)‖ = 0 and thus lim_{n→∞}‖∇Ψ(x_n)‖ = 0. We finally show that (x_n)_n converges
strongly to x†. By (4.3) and the monotonicity of the duality mapping J_r, we get
‖∇Ψ(x_n)‖ ‖x_n − x†‖ ≥ ⟨∇Ψ(x_n), x_n − x†⟩ = ⟨∇Ψ(x_n) − ∇Ψ(x†), x_n − x†⟩
= ⟨J_r(Ax_n − y) − J_r(Ax† − y), (Ax_n − y) − (Ax† − y)⟩ + α⟨J_p(x_n) − J_p(x†), x_n − x†⟩
≥ α⟨J_p(x_n) − J_p(x†), x_n − x†⟩.
Since (x_n)_n is bounded and lim_{n→∞}‖∇Ψ(x_n)‖ = 0, this yields
lim_{n→∞} ⟨J_p(x_n) − J_p(x†), x_n − x†⟩ = 0,
from which we
infer that (x_n)_n converges
strongly to x† in the uniformly
convex space X [15, Theorem
II.2.17].
5. Conclusions
We have analyzed two conceptually quite different
nonlinear iterative methods for finding the minimizer of norm-based Tikhonov
functionals in Banach spaces. One is the steepest descent method, where the
iterations are carried out directly in the space X
by pulling the gradient of the Tikhonov functional back to X via duality
mappings. This method is shown to be strongly convergent in case X is uniformly
convex and uniformly smooth and Y is uniformly smooth. In the other one, the iterations are performed in the
dual space X*. Though this method seems to be inherently slow,
strong convergence can be shown without restrictions on the space Y.
Appendix. Steepest Descent Method in Uniformly Smooth Spaces
As already pointed out in Section 4, we prove here
Theorem 4.1 for the general case of X being uniformly
convex and uniformly smooth and Y being uniformly
smooth, and with r,p≥2 in the
definition of the Tikhonov functional (4.1). To do so, we need some additional
results based on the paper of Xu and Roach [19].
In what follows, C, L > 0 are always
supposed to be (generic) constants, and we write
a ∨ b = max{a, b}, a ∧ b = min{a, b}. (A.1)
Let ρ̄_X : (0, ∞) → (0, 1] be the function
ρ̄_X(τ) := ρ_X(τ)/τ, (A.2)
where ρ_X is the modulus
of smoothness of a Banach space X. The function ρ̄_X is known to be
continuous and nondecreasing [14, 31].
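For a Hilbert space, the modulus of smoothness is known explicitly, ρ(τ) = √(1 + τ²) − 1 (a classical fact, stated here without derivation), which makes the behavior of ρ̄ easy to inspect:

```python
import numpy as np

def rho_bar_hilbert(tau):
    # rho(tau)/tau with rho(tau) = sqrt(1 + tau^2) - 1 (Hilbert space case)
    return (np.sqrt(1 + tau**2) - 1) / tau

taus = np.array([1e-3, 1e-2, 1e-1, 1.0, 10.0])
print(rho_bar_hilbert(taus))  # nondecreasing, in (0, 1], and -> 0 as tau -> 0
```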
The next lemma allows us to estimate ∥Jp(x)−Jp(y)∥ via ρ¯X(∥x−y∥), which in turn will be used to derive a version of
the characteristic inequality that is more convenient for our purpose.
Lemma A.1.
Let X be a uniformly
smooth Banach space with duality mapping J_p with weight p ≥ 2. Then for all x, y ∈ X the following
inequalities are valid:
‖J_p(x) − J_p(y)‖ ≤ C max{1, (‖x‖ ∨ ‖y‖)^{p−1}} ρ̄_X(‖x − y‖) (A.3)
(hence, J_p is uniformly
continuous on bounded sets) and
‖x − y‖^p ≤ ‖x‖^p − p⟨J_p(x), y⟩ + C(1 ∨ (‖x‖ + ‖y‖)^{p−1}) ρ_X(‖y‖). (A.4)
Proof.
We first prove (A.3). By [19, formula (3.1)], we have
‖J_p(x) − J_p(y)‖ ≤ C(‖x‖ ∨ ‖y‖)^{p−1} ρ̄_X(‖x − y‖/(‖x‖ ∨ ‖y‖)).
We estimate
similarly as after inequality (3.5) in the same paper. If 1/(‖x‖ ∨ ‖y‖) ≤ 1, then by the monotonicity of ρ̄_X we get
ρ̄_X(‖x − y‖/(‖x‖ ∨ ‖y‖)) ≤ ρ̄_X(‖x − y‖),
and therefore
(A.3) is valid. In the case 1/(‖x‖ ∨ ‖y‖) ≥ 1 (⇔ ‖x‖ ∨ ‖y‖ ≤ 1), we use the
fact that ρ_X is equivalent
to a decreasing function (i.e., ρ_X(η)/η² ≤ L·ρ_X(τ)/τ² for η ≥ τ > 0 [14]) and get
ρ_X(‖x − y‖/(‖x‖ ∨ ‖y‖)) ≤ (L/(‖x‖ ∨ ‖y‖)²) ρ_X(‖x − y‖)
and therefore
ρ̄_X(‖x − y‖/(‖x‖ ∨ ‖y‖)) ≤ (L/(‖x‖ ∨ ‖y‖)) ρ̄_X(‖x − y‖).
For p ≥ 2, we thus arrive
at
‖J_p(x) − J_p(y)‖ ≤ C L (‖x‖ ∨ ‖y‖)^{p−2} ρ̄_X(‖x − y‖) ≤ C L ρ̄_X(‖x − y‖),
and also in
this case (A.3) is valid.
Let us prove (A.4). As in [19], we consider the
continuously differentiable function f : [0, 1] → ℝ with
f(t) := ‖x − ty‖^p, f′(t) = −p⟨J_p(x − ty), y⟩,
f(0) = ‖x‖^p, f(1) = ‖x − y‖^p, f′(0) = −p⟨J_p(x), y⟩,
and get
‖x − y‖^p − ‖x‖^p + p⟨J_p(x), y⟩ = f(1) − f(0) − f′(0) = ∫₀¹ (f′(t) − f′(0)) dt
= p ∫₀¹ ⟨J_p(x) − J_p(x − ty), y⟩ dt ≤ p ∫₀¹ ‖J_p(x) − J_p(x − ty)‖ ‖y‖ dt.
For t ∈ [0, 1], we set ỹ := x − ty and get x − ỹ = ty, ‖ỹ‖ ≤ ‖x‖ + ‖y‖, and thus ‖x‖ ∨ ‖ỹ‖ ≤ ‖x‖ + ‖y‖. By the monotonicity of ρ̄_X, we have
ρ̄_X(t‖y‖)‖y‖ ≤ ρ̄_X(‖y‖)‖y‖ = ρ_X(‖y‖),
and by (A.3) we
thus obtain
‖x − y‖^p − ‖x‖^p + p⟨J_p(x), y⟩ ≤ p ∫₀¹ C max{1, (‖x‖ + ‖y‖)^{p−1}} ρ̄_X(t‖y‖)‖y‖ dt ≤ C max{1, (‖x‖ + ‖y‖)^{p−1}} ρ_X(‖y‖).
The proof of Theorem 4.1 is now quite similar to the
case of smoothness of power type, though it is more technical, and we only give
the main modifications.
Proof of
Theorem 4.1 (for uniformly smooth spaces).
We fix γ ∈ (0, 1), μ̄ > 0 and for n ∈ ℕ we choose μ̃_n ∈ (0, μ̄] such that
ϕ_n(μ̃_n) = ϕ_n(μ̄) ∧ γ. (A.14)
Here the
function ϕ_n : (0, ∞) → (0, ∞) is defined by
ϕ_n(μ) := (C_Y/r)(1 ∨ (‖Ax_n − y‖ + μ̄‖A J_q^*(∇Ψ(x_n))‖)^{r−1}) · (‖A J_q^*(∇Ψ(x_n))‖/‖∇Ψ(x_n)‖^q) · ρ̄_Y(μ‖A J_q^*(∇Ψ(x_n))‖)
+ (α C_X/p)(1 ∨ (‖x_n‖ + μ̄‖∇Ψ(x_n)‖^{q−1})^{p−1}) · (1/‖∇Ψ(x_n)‖) · ρ̄_X(μ‖∇Ψ(x_n)‖^{q−1}) (A.15)
with the
constants C_X, C_Y being the ones
appearing in the respective characteristic inequalities (A.4). This choice of μ̃_n is possible
since, by the properties of ρ̄_Y and ρ̄_X, the function ϕ_n is continuous,
increasing, and lim_{μ→0} ϕ_n(μ) = 0. We again aim at an inequality of the form
Ψ(x_{n+1}) ≤ Ψ(x_n) − μ̃_n‖∇Ψ(x_n)‖^q(1 − γ),
which will
finally assure convergence. Here we use the characteristic inequalities (A.4) to
estimate
Ψ(x_{n+1}) ≤ Ψ(x_n) − μ̃_n‖∇Ψ(x_n)‖^q
+ (C_Y/r)(1 ∨ (‖Ax_n − y‖ + ‖μ̃_n A J_q^*(∇Ψ(x_n))‖)^{r−1}) ρ_Y(‖μ̃_n A J_q^*(∇Ψ(x_n))‖)
+ (α C_X/p)(1 ∨ (‖x_n‖ + ‖μ̃_n J_q^*(∇Ψ(x_n))‖)^{p−1}) ρ_X(‖μ̃_n J_q^*(∇Ψ(x_n))‖).
Since μ̃_n ≤ μ̄ and by the
definition of ϕ_n (A.15), we can
further estimate
Ψ(x_{n+1}) ≤ Ψ(x_n) − μ̃_n‖∇Ψ(x_n)‖^q
+ (C_Y/r)(1 ∨ (‖Ax_n − y‖ + μ̄‖A J_q^*(∇Ψ(x_n))‖)^{r−1}) ρ_Y(μ̃_n‖A J_q^*(∇Ψ(x_n))‖)
+ (α C_X/p)(1 ∨ (‖x_n‖ + μ̄‖J_q^*(∇Ψ(x_n))‖)^{p−1}) ρ_X(μ̃_n‖J_q^*(∇Ψ(x_n))‖)
= Ψ(x_n) − μ̃_n‖∇Ψ(x_n)‖^q(1 − ϕ_n(μ̃_n)).
The choice of μ̃_n (A.14) finally
yields
Ψ(x_{n+1}) ≤ Ψ(x_n) − μ̃_n‖∇Ψ(x_n)‖^q(1 − γ). (A.19)
It remains to
show that this implies lim_{n→∞}‖∇Ψ(x_n)‖ = 0; the rest then follows analogously to the proof of Theorem
4.1. From (A.19), we infer that
lim_{n→∞} μ̃_n‖∇Ψ(x_n)‖^q = 0 (A.20)
and that the
sequences (x_n)_n and (∇Ψ(x_n))_n are bounded.
Suppose limsup_{n→∞}‖∇Ψ(x_n)‖ = ϵ > 0 and let ‖∇Ψ(x_{n_k})‖ → ϵ for k → ∞. Then we must have lim_{k→∞} μ̃_{n_k} = 0 by (A.20). We
show that this leads to a contradiction. On the one hand, by (A.15), we get
ϕ_{n_k}(μ̃_{n_k}) ≤ (L_1/‖∇Ψ(x_{n_k})‖^q) ρ̄_Y(μ̃_{n_k} L_2) + (C_1/‖∇Ψ(x_{n_k})‖) ρ̄_X(μ̃_{n_k} C_2).
Since the right-hand side converges to zero for k → ∞, so does ϕ_{n_k}(μ̃_{n_k}). On the other hand, by (A.14), we have
ϕ_{n_k}(μ̃_{n_k}) = ϕ_{n_k}(μ̄) ∧ γ, ϕ_{n_k}(μ̄) ≥ 0 + C ρ̄_X(μ̄‖∇Ψ(x_{n_k})‖^{q−1}).
Hence, ϕ_{n_k}(μ̃_{n_k}) ≥ L > 0 for all k big enough,
which contradicts lim_{k→∞} ϕ_{n_k}(μ̃_{n_k}) = 0. So we have limsup_{n→∞}‖∇Ψ(x_n)‖ = 0 and thus lim_{n→∞}‖∇Ψ(x_n)‖ = 0.
Acknowledgment
The first author was supported by Deutsche Forschungsgemeinschaft, Grant no. MA 1657/15-1.
References
[1] A. K. Louis, Inverse und schlecht gestellte Probleme, Teubner Studienbücher Mathematik, B. G. Teubner, Stuttgart, Germany, 1989.
[2] A. Rieder, Keine Probleme mit inversen Problemen, Vieweg & Sohn, Braunschweig, Germany, 2003.
[3] H. W. Engl, M. Hanke, and A. Neubauer, Regularization of Inverse Problems, Kluwer Academic, Dordrecht, The Netherlands, 2000.
[4] Y. I. Alber, "Iterative regularization in Banach spaces," Soviet Mathematics, vol. 30, no. 4, 1986.
[5] R. Plato, "On the discrepancy principle for iterative and parametric methods to solve linear ill-posed equations," Numerische Mathematik, vol. 75, no. 1, pp. 99–120, 1996.
[6] S. Osher, M. Burger, D. Goldfarb, J. Xu, and W. Yin, "An iterative regularization method for total variation-based image restoration," Multiscale Modeling & Simulation, vol. 4, no. 2, pp. 460–489, 2005.
[7] D. Butnariu and E. Resmerita, "Bregman distances, totally convex functions, and a method for solving operator equations in Banach spaces," Abstract and Applied Analysis, vol. 2006, Article ID 84919, 39 pages, 2006.
[8] F. Schöpfer, A. K. Louis, and T. Schuster, "Nonlinear iterative methods for linear ill-posed problems in Banach spaces," Inverse Problems, vol. 22, no. 1, pp. 311–329, 2006.
[9] I. Daubechies, M. Defrise, and C. De Mol, "An iterative thresholding algorithm for linear inverse problems with a sparsity constraint," Communications on Pure and Applied Mathematics, vol. 57, no. 11, pp. 1413–1457, 2004.
[10] K. Bredies, D. Lorenz, and P. Maass, "A generalized conditional gradient method and its connection to an iterative shrinkage method," to appear in Computational Optimization and Applications.
[11] Y. I. Alber, A. N. Iusem, and M. V. Solodov, "Minimization of nonsmooth convex functionals in Banach spaces," Journal of Convex Analysis, vol. 4, no. 2, pp. 235–255, 1997.
[12] S. Reich and A. J. Zaslavski, "Generic convergence of descent methods in Banach spaces," Mathematics of Operations Research, vol. 25, no. 2, pp. 231–242, 2000.
[13] S. Reich and A. J. Zaslavski, "The set of divergent descent methods in a Banach space is σ-porous," SIAM Journal on Optimization, vol. 11, no. 4, pp. 1003–1018, 2001.
[14] J. Lindenstrauss and L. Tzafriri, Classical Banach Spaces II: Function Spaces, vol. 97 of Results in Mathematics and Related Areas, Springer, Berlin, Germany, 1979.
[15] I. Cioranescu, Geometry of Banach Spaces, Duality Mappings and Nonlinear Problems, vol. 62 of Mathematics and Its Applications, Kluwer Academic Publishers, Dordrecht, The Netherlands, 1990.
[16] R. Deville, G. Godefroy, and V. Zizler, Smoothness and Renormings in Banach Spaces, vol. 64 of Pitman Monographs and Surveys in Pure and Applied Mathematics, Longman Scientific & Technical, Harlow, UK, 1993.
[17] A. Dvoretzky, "Some results on convex bodies and Banach spaces," in Proceedings of the International Symposium on Linear Spaces, pp. 123–160, Jerusalem Academic Press, Jerusalem, Israel, 1961.
[18] O. Hanner, "On the uniform convexity of L^p and l^p," Arkiv för Matematik, vol. 3, pp. 239–244, 1956.
[19] Z. B. Xu and G. F. Roach, "Characteristic inequalities of uniformly convex and uniformly smooth Banach spaces," Journal of Mathematical Analysis and Applications, vol. 157, no. 1, pp. 189–210, 1991.
[20] S. Reich, review of I. Cioranescu, "Geometry of Banach spaces, duality mappings and nonlinear problems," Bulletin of the American Mathematical Society, vol. 26, no. 2, pp. 367–370, 1992.
[21] L. M. Bregman, "The relaxation method for finding common points of convex sets and its application to the solution of problems in convex programming," USSR Computational Mathematics and Mathematical Physics, vol. 7, pp. 200–217, 1967.
[22] C. Byrne and Y. Censor, "Proximity function minimization using multiple Bregman projections, with applications to split feasibility and Kullback-Leibler distance minimization," Annals of Operations Research, vol. 105, no. 1–4, pp. 77–98, 2001.
[23] Y. I. Alber and D. Butnariu, "Convergence of Bregman projection methods for solving consistent convex feasibility problems in reflexive Banach spaces," Journal of Optimization Theory and Applications, vol. 92, no. 1, pp. 33–61, 1997.
[24] H. H. Bauschke, J. M. Borwein, and P. L. Combettes, "Bregman monotone optimization algorithms," SIAM Journal on Control and Optimization, vol. 42, no. 2, pp. 596–636, 2003.
[25] H. H. Bauschke and A. S. Lewis, "Dykstra's algorithm with Bregman projections: a convergence proof," Optimization, vol. 48, no. 4, pp. 409–427, 2000.
[26] J. D. Lafferty, S. Della Pietra, and V. Della Pietra, "Statistical learning algorithms based on Bregman distances," in Proceedings of the 5th Canadian Workshop on Information Theory, Toronto, Ontario, Canada, June 1997.
[27] I. Ekeland and R. Temam, Convex Analysis and Variational Problems, North-Holland, Amsterdam, The Netherlands, 1976.
[28] R. R. Phelps, "Metric projections and the gradient projection method in Banach spaces," SIAM Journal on Control and Optimization, vol. 23, no. 6, pp. 973–977, 1985.
[29] R. H. Byrd and R. A. Tapia, "An extension of Curry's theorem to steepest descent in normed linear spaces," Mathematical Programming, vol. 9, no. 1, pp. 247–254, 1975.
[30] C. Canuto and K. Urban, "Adaptive optimization of convex functionals in Banach spaces," SIAM Journal on Numerical Analysis, vol. 42, no. 5, pp. 2043–2075, 2005.
[31] T. Figiel, "On the moduli of convexity and smoothness," Studia Mathematica, vol. 56, no. 2, pp. 121–155, 1976.