Iterative Algorithms with Seminorm-Induced Oblique Projections

A definition of oblique projections onto closed convex sets is introduced, with respect to seminorms induced by diagonal matrices which may have zeros on the diagonal. Existence and uniqueness of such projections are secured via directional affinity of the sets with respect to the diagonal matrices involved. A block-iterative algorithmic scheme for solving the convex feasibility problem, employing seminorm-induced oblique projections, is constructed and its convergence for the consistent case is established. The fully simultaneous algorithm also converges in the inconsistent case, to the minimum of a certain proximity function.


Introduction
The motivation for this paper comes from the recent work of Censor et al. [14,15], who used generalized oblique projections for the solution of large and sparse systems of linear equations arising in the fully discretized approach to the problem of image reconstruction from projections; see, for example, Herman [22] and Censor [11]. In order to achieve significant acceleration in the algorithm's behavior, they had to use generalized oblique projections (as they called them) which allow zeros on the diagonals of the weighting matrices for the fully simultaneous projections algorithm that they used.
Since the work of [14,15] is limited to linear equations, it is natural to inquire whether it can be extended and, if so, in what directions and to what extent. Is it possible to extend the definition of the generalized oblique projections of [14,15] to cover convex sets which are not linear (i.e., not hyperplanes or halfspaces)? If there is a way to do so, can one employ these projections in iterative algorithmic schemes for solving the convex (not necessarily linear) feasibility problem? Under what conditions would such algorithms converge? How would a fully simultaneous iterative algorithm which employs these projections behave in the inconsistent case?

Oblique projections have appeared in the literature in several contexts. Kayalar and Weinert [24] promote oblique projections for local processing in sensor arrays and credit Murray [26] and Lorch [25] for pioneering work on oblique projections. Behrens and Scharf [6] use oblique projections for signal processing applications. Oblique projections onto hyperplanes in a fully simultaneous (Cimmino) algorithm have been proposed and used by Arioli et al. [2]. They showed that using a Cimmino algorithm, in which all projections are oblique with respect to a given symmetric positive definite matrix G, for a system Ax = b, is equivalent to applying a Cimmino algorithm with all orthogonal projections to the postconditioned system A G^{-1/2} x' = b, where x' = G^{1/2} x and G^{1/2}, the symmetric square root of G, is the unique symmetric positive definite matrix obtained from G.

Our work is related to the block-iterative multiprojection successive generalized projection (BIMSGP) method developed by Byrne in [9]. Specifically, a special case of our algorithmic scheme (Algorithm 3.2), in which the diagonal weight matrices are not allowed to change with the iteration index and for which only fixed blocks of constraints are permissible, is derivable from [9, Algorithm 4.2]. We return to this point at the end of Section 3.
The fully simultaneous projection algorithm with orthogonal projections was first proposed by Cimmino [17] for linear systems of equations and then by Auslender [3] for general convex sets; see the review paper of Bauschke and Borwein [5]. For general convex sets and orthogonal projections, Iusem and De Pierro [23] proved local convergence in the inconsistent case, while Combettes [18] showed global convergence in the inconsistent case by employing a product space formulation which extends the one of Pierra [27] and can handle the inconsistent case. Our work in Section 4 generalizes these previous algorithmic schemes to the case of seminorm-induced oblique projections. Our convergence analysis in Section 4 might also be deduced, with some additional effort, from results in [10], but we prefer to prove it here along the self-contained lines of the proof developed in [15], thereby making it independent of the theory of Bregman functions and other technical tools used in [10].

Seminorm-induced oblique projections
Let G = diag(g_1, g_2, ..., g_n) be a diagonal n × n matrix with g_j ≥ 0, for all j = 1, 2, ..., n, and G ≠ 0, that is, at least one element of G is different from zero. Denote the index set of positive diagonal elements of G by J = J(G) = {j | g_j > 0, 1 ≤ j ≤ n} and its cardinality by |J|. We consider the restriction R^{|J|} of R^n to only those components that belong to J. For convenience and notational simplicity, assume, without loss of generality, that the zero entries on the diagonal of G (if any) all come last. If z = (z_j) ∈ R^n, we denote its restriction to R^{|J|} by z_{[J]}. For any nonempty closed convex set C ⊆ R^n, the restriction of C to R^{|J|} is the set

   C_{[J]} := {z_{[J]} | z ∈ C}.   (2.1)

Further, denote by G_{[J]} the |J| × |J| matrix obtained from G by deleting all zeros from its diagonal. The function ||x||_G is defined, for every x ∈ R^n, by

   ||x||_G := ||x_{[J]}||_{G_{[J]}} = sqrt(<x_{[J]}, G_{[J]} x_{[J]}>),   (2.2)

where || · ||_{G_{[J]}} is the (classical) Hilbert space ellipsoidal norm. Observe that ||x||_G is a vector seminorm (see, e.g., Taylor [29]) because it may be equal to zero for x ≠ 0 if G has at least one g_j = 0. For this reason, a projection onto a convex set with respect to a seminorm needs to be carefully defined. Our next definitions and proposition accommodate this possibility.

Definition 2.1 (seminorm-induced oblique projection). Let C ⊆ R^n be a closed convex set and let G ≠ 0 be an n × n nonnegative diagonal matrix. For any y ∈ R^n, the seminorm-induced oblique projection of y onto C with respect to G, denoted by P_C^G(y), is defined as a point which has the following properties: (i) P_C^G(y) ∈ C, and (ii) P_C^G(y) = y*, where

   y*_j = y_j, for all j ∉ J,   (2.3)
   y*_{[J]} = argmin{ ||y_{[J]} − z||_{G_{[J]}} | z ∈ C_{[J]} }.   (2.4)

The presence of zeros on the diagonal of G whenever j ∉ J implies that ||·||_G ignores the components outside J, so that (2.4) yields the minimization property

   ||y − P_C^G(y)||_G ≤ ||y − x||_G, for all x ∈ C.   (2.5)

A seminorm-induced oblique projection reduces to an ellipsoidal oblique projection of y onto C if all diagonal elements of G are positive. This is a special case of a Bregman projection of y onto C according to the Bregman function f(x) = ||x||_G^2 = <x, Gx>; consult, for example, Censor and Zenios [16, Chapter 2] for background material and further references on Bregman functions, distances, and projections originating from Bregman [8].
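As a concrete numerical illustration (ours, not from the paper), the seminorm ||x||_G = sqrt(<x, Gx>) can vanish on nonzero vectors as soon as G has a zero diagonal entry; a minimal NumPy sketch:

```python
import numpy as np

def seminorm(x, g):
    """Seminorm ||x||_G = sqrt(<x, Gx>) induced by nonnegative diagonal weights g."""
    x = np.asarray(x, dtype=float)
    g = np.asarray(g, dtype=float)
    return float(np.sqrt(np.sum(g * x**2)))

g = np.array([1.0, 2.0, 0.0])        # g_3 = 0, so J(G) = {1, 2}
x = np.array([0.0, 0.0, 5.0])        # supported entirely outside J(G)
print(seminorm(x, g))                # 0.0 although x != 0: only a seminorm
print(seminorm([1.0, 2.0, 0.0], g))  # 3.0 = sqrt(1*1 + 2*4)
```

This degeneracy is exactly why Definition 2.1 restricts the minimization in (2.4) to the components in J.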
On the other hand, a general seminorm-induced oblique projection onto a set C ⊆ R^n can be identified with an appropriately defined ellipsoidal oblique projection onto a certain cylindrical set generated from C, as we show next. Let G be a diagonal n × n nonnegative nonzero matrix with index set of positive diagonal elements J and |J| < n. Define the n × n diagonal positive definite matrix Ĝ by

   ĝ_j = g_j, if j ∈ J, and ĝ_j = 1, if j ∉ J.

Proposition 2.2. Let C, G, J, and Ĝ be as above, and let cyl(C) := {x ∈ R^n | x_{[J]} ∈ C_{[J]}} be the cylindrical set generated by C. Then, for every y ∈ R^n,

   P_C^G(y) = P_{cyl(C)}^{Ĝ}(y).

Proof. Denote the complement of J in {1, 2, ..., n} by J̄, so that any u ∈ R^n can be represented as u = (u_{[J]}, u_{[J̄]}). Then Ĝ is the block diagonal matrix with blocks G_{[J]} and I, where I is the (n − |J|) × (n − |J|) unit matrix. This concludes the proof in view of (2.3) and (2.4).
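For a hyperplane whose normal vanishes exactly off J (so that the hyperplane is its own cylinder), this identification can be checked numerically. In the sketch below (our construction; filling Ĝ with ones off J is one admissible choice), the projection computed per Definition 2.1 on the J-components coincides with the classical oblique projection taken with respect to the positive definite Ĝ:

```python
import numpy as np

def oblique_proj_pd(y, a, b, w):
    """Classical oblique projection onto {x : <a,x> = b} w.r.t. diag(w), w > 0."""
    t = (b - a @ y) / np.sum(a**2 / w)
    return y + t * (a / w)

a = np.array([1.0, -2.0, 0.0])          # a_3 = 0
g = np.array([0.4, 0.6, 0.0])           # zeros exactly where a_j = 0
ghat = np.where(g > 0, g, 1.0)          # G-hat: positive definite completion
y = np.array([2.0, 0.5, -3.0])
b = 5.0

p_hat = oblique_proj_pd(y, a, b, ghat)  # ellipsoidal projection w.r.t. G-hat

# Seminorm-induced projection per Definition 2.1: project the J-components
# with respect to G_[J]; keep the remaining components of y unchanged.
J = g > 0
p = y.copy()
p[J] = oblique_proj_pd(y[J], a[J], b, g[J])
print(np.allclose(p, p_hat))            # True
```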
In order to secure existence and uniqueness of seminorm-induced oblique projections, we need to impose a "cylindricity" relationship between the seminorm-inducing matrix G and the convex set C onto which we project, as we do in the next definitions.

Definition 2.3 (directional affinity of sets). A nonempty closed convex set C ⊆ R^n is said to be affine in the direction d ∈ R^n if (i) for any two distinct points of the set whose difference vector is a scalar multiple of d, the line through these points belongs to the set, and (ii) there is at least one pair of points in the set which fulfills (i).
Definition 2.4 (directional affinity with respect to G). Let C ⊆ R^n be a nonempty closed convex set and let G = diag(g_1, g_2, ..., g_n) be a nonnegative diagonal n × n matrix with g_j ≥ 0, for all j = 1, 2, ..., n, and G ≠ 0. If C is affine in the direction of every standard basis vector e^j of R^n for which j ∉ J, then C is called directionally affine with respect to G.

Proposition 2.5. Let C ⊆ R^n be a nonempty closed convex set and let G = diag(g_1, g_2, ..., g_n) be a nonnegative diagonal n × n matrix with g_j ≥ 0, for all j = 1, 2, ..., n, and G ≠ 0. If C is directionally affine with respect to G, then for any y ∈ R^n, there is a unique seminorm-induced oblique projection P_C^G(y) of y onto C with respect to G.
Proof. By standard results on (classical) oblique projections onto closed convex sets (use, e.g., Censor and Zenios [16], since the underlying function is a Bregman function), it follows that there exists a unique z_{[J]} ∈ C_{[J]} that solves (2.4). The directional affinity of C with respect to G allows us to define y*_j = y_j, for all j ∉ J, as we did in (2.3), and to conclude that y* ∈ C.

The next examples demonstrate these notions.
Example 2.6. Let the convex set be represented as C = {x ∈ R^n | h(x) ≤ 0}, where h : R^n → R is a convex function, and assume that ∂h/∂x_j ≡ 0, for all j ∉ J. Then h(x) does not depend on the values of the jth component of x whenever j ∉ J, and Proposition 2.5, together with (2.3), guarantees that P_C^G(y) ∈ C.

Example 2.7. Let the convex set be a hyperplane H = {x ∈ R^n | <a, x> = b} with a = (a_j) ∈ R^n and b ∈ R given, and assume that in the matrix G it is true that g_j = 0 if and only if a_j = 0. Then there is a closed-form formula for the seminorm-induced oblique projection P_H^G(y) of a point y ∈ R^n onto H with respect to G, given by Censor et al. in [15], which has the following form:

   (P_H^G(y))_j = y_j + ((b − <a, y>) / (Σ_{i∈J} a_i^2 / g_i)) (a_j / g_j), for j ∈ J,
   (P_H^G(y))_j = y_j, for j ∉ J.   (2.9)

It is not difficult to verify that P_H^G(y) ∈ H and that it is the appropriate expression for the seminorm-induced oblique projection. This example is in fact a special case of the previous example. It has been analyzed and used in practice in [14,15].
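A short NumPy sketch of the closed-form expression (2.9) (our transcription; the condition g_j = 0 iff a_j = 0 is assumed throughout):

```python
import numpy as np

def oblique_proj_hyperplane(y, a, b, g):
    """Seminorm-induced oblique projection of y onto H = {x : <a,x> = b}
    with respect to G = diag(g), assuming g_j = 0 iff a_j = 0."""
    y = np.asarray(y, dtype=float)
    J = g > 0
    p = y.copy()
    p[J] = y[J] + (b - a @ y) / np.sum(a[J]**2 / g[J]) * (a[J] / g[J])
    return p

a = np.array([2.0, -1.0, 0.0])
g = np.array([0.5, 0.5, 0.0])   # zero weight exactly where a_j = 0
y = np.array([3.0, 1.0, 7.0])
p = oblique_proj_hyperplane(y, a, 1.0, g)
print(np.isclose(a @ p, 1.0))   # True: p lies on H
print(p[2])                     # 7.0: the component with j not in J is unchanged
```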
Postulating a relationship between the seminorm inducing matrix G and the convex set onto which we project, namely, that the set must be a directionally affine set, limits the scope of our theory of seminorm-induced oblique projections onto (general) convex sets.We do not know if it is possible to relax or lift this condition.In the linear case of Example 2.7, this condition was termed sparsity pattern orientation (SPO) by Censor et al. [15,Definition 3.1], and was taken as an advantage for creating a new accelerated fully simultaneous projection method called the component averaging (CAV) method.
The next proposition generalizes a classical result.We prove it here along the same lines as the proof of a similar result for Bregman generalized distances (see Bregman [8]) given in [16, Theorem 2.4.1].
Proposition 2.8. Let Q ⊆ R^n be a nonempty closed convex set and let G ≠ 0 be a nonnegative diagonal matrix such that Q is directionally affine with respect to G. If z ∈ Q is any given point, then for any y ∈ R^n, the following inequality holds:

   ||z − y||_G^2 ≥ ||z − P_Q^G(y)||_G^2 + ||P_Q^G(y) − y||_G^2.   (2.10)

Proof. By expanding the function

   E(u) := ||u − y||_G^2 − ||u − P_Q^G(y)||_G^2

according to (2.2), we find that E(u) = <u, α> + β for some α ∈ R^n and β ∈ R which are independent of u; thus, E(u) is convex. Denoting u_λ = λz + (1 − λ)P_Q^G(y), for any 0 ≤ λ ≤ 1, we know that u_λ ∈ Q by Proposition 2.5. Due to the convexity of E(u), we then obtain (observe that due to the linearity of E(u), the inequalities in the next two formulae are actually equalities)

   E(u_λ) ≤ λE(z) + (1 − λ)E(P_Q^G(y)).   (2.12)

For λ > 0, this gives

   E(z) − E(P_Q^G(y)) ≥ (1/λ)(||u_λ − y||_G^2 − ||P_Q^G(y) − y||_G^2) − λ||z − P_Q^G(y)||_G^2.   (2.13)

The first term on the right-hand side of (2.13) is nonnegative because of the minimization property (2.5) of P_Q^G(y), and the second term tends to zero as λ → 0^+. Therefore, letting λ → 0^+, we conclude that E(z) − E(P_Q^G(y)) ≥ 0, which is precisely (2.10).
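The inequality (2.10) can be sanity-checked numerically; in the sketch below (ours), Q is a hyperplane, for which the inequality in fact holds with equality, since the projection residual is G-orthogonal to directions inside the hyperplane:

```python
import numpy as np

def snorm2(x, g):
    """Squared seminorm <x, Gx>."""
    return float(np.sum(g * x**2))

def proj(y, a, b, g):
    """Seminorm-induced projection onto {x : <a,x> = b}; g_j = 0 iff a_j = 0 assumed."""
    J = g > 0
    p = y.astype(float).copy()
    p[J] = y[J] + (b - a @ y) / np.sum(a[J]**2 / g[J]) * (a[J] / g[J])
    return p

rng = np.random.default_rng(0)
a = np.array([1.0, 2.0, 0.0]); g = np.array([0.3, 0.7, 0.0]); b = 4.0
y = rng.standard_normal(3)
p = proj(y, a, b, g)
z = proj(rng.standard_normal(3), a, b, g)   # an arbitrary point z in Q
lhs = snorm2(z - y, g)
rhs = snorm2(z - p, g) + snorm2(p - y, g)
print(lhs >= rhs - 1e-12)                   # True, as Proposition 2.8 asserts
```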

The block-iterative algorithmic scheme
Let C_i ⊆ R^n, i = 1, 2, ..., m, be closed convex sets such that C := ∩_{i=1}^m C_i ≠ ∅. For each k = 0, 1, 2, ..., we define a vector {G_i^k}_{i=1}^m, each of whose m components is an n × n diagonal matrix G_i^k = diag(g_{i1}^k, g_{i2}^k, ..., g_{in}^k) with g_{ij}^k ≥ 0, for all i = 1, 2, ..., m, all j = 1, 2, ..., n, and all k ≥ 0. Denote by B(k) the index set B(k) := {i | 1 ≤ i ≤ m, G_i^k ≠ 0} of the sets acted upon at iteration k.

Definition 3.1. Let

   {{G_i^k}_{i=1}^m}_{k≥0}   (3.1)

be an infinite sequence of vectors of nonnegative diagonal n × n matrices.
(i) If there exists an ε > 0 such that, for every i ∈ B(k), the diagonal elements g_{ij}^k are either g_{ij}^k = 0 or g_{ij}^k ≥ ε > 0, for all j = 1, 2, ..., n and for all k ≥ 0; and (ii) if Σ_{i=1}^m G_i^k = I, for all k ≥ 0, where I denotes the unit matrix, then {{G_i^k}_{i=1}^m}_{k≥0} is called a fair sequence of vectors of diagonal matrices.

With regard to this definition, it should be noted that condition (i) is stronger than the condition used by Aharoni and Censor [1] regarding the weights they used in their BIP method. Their scalar weights in BIP can be obtained from our scheme by setting G_i^k = w_i^k I, where (w_1^k, w_2^k, ..., w_m^k) is their vector of weights used in the kth iteration. The (weaker) condition that they use is that "for all i = 1, 2, ..., m, the series Σ_{k=0}^∞ w_i^k = +∞." The purpose of either condition (i) in Definition 3.1 or the condition of [1] is to guarantee that none of the sets C_i is gradually ignored by ever-diminishing weights. We do not know whether our convergence result, presented below, can be strengthened by using a condition similar to that of [1].
The algorithm that we propose and study here for solving the convex feasibility problem of finding a point x* ∈ C = ∩_{i=1}^m C_i is formulated as follows.

Algorithm 3.2.
Initialization: x^0 ∈ R^n is arbitrary.
Iterative step: given x^k, calculate

   x^{k+1} = x^k + λ_k Σ_{i∈B(k)} G_i^k ( P_{C_i}^{G_i^k}(x^k) − x^k ),   (3.3)

where {λ_k}_{k≥0} are relaxation parameters, {{G_i^k}_{i=1}^m}_{k≥0} is a sequence of vectors of diagonal matrices, and P_{C_i}^{G_i^k}(x^k) is the seminorm-induced oblique projection of x^k onto C_i with respect to G_i^k.

This algorithmic scheme includes, as a special case, the sequential seminorm-induced oblique projections algorithm that is obtained from (3.3) by choosing, at the kth iteration, G_{i(k)}^k = I and G_i^k = 0 for all i ≠ i(k), where {i(k)}_{k≥0} is a control sequence which governs the sequential progress, such as the cyclic control in which i(k) = k (mod m) + 1. If all G_i^k are nonzero matrices at every iterative step, then a fully simultaneous algorithm with seminorm-induced oblique projections is obtained from our algorithmic scheme. In addition to the consistent case result for such a fully simultaneous algorithm, we present in Section 4 also an inconsistent case analysis. In between these two extremes, the algorithmic scheme allows variable blocks of constraints to be acted upon as iterations proceed. At the kth iteration, only those sets C_i for which G_i^k ≠ 0 will be included in the block. The blocks may vary in size (i.e., number of sets included) and in composition (which sets are included) as long as the conditions of the convergence theorem, given below, are met.
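For linear constraints, the fully simultaneous member of this family can be sketched as follows (our illustration; the weights g_{ij} = a_{ij}^2 / Σ_l a_{lj}^2, which satisfy Σ_i G_i = I and vanish exactly where the rows of A vanish, are the CAV-type choice of [14,15]):

```python
import numpy as np

def proj(y, a, b, g):
    """Seminorm-induced projection onto the hyperplane {x : <a,x> = b}."""
    J = g > 0
    p = y.astype(float).copy()
    p[J] = y[J] + (b - a @ y) / np.sum(a[J]**2 / g[J]) * (a[J] / g[J])
    return p

def step(x, A, b, G, lam=1.0):
    """One iteration of the fully simultaneous scheme (3.3)."""
    return x + lam * sum(G[i] * (proj(x, A[i], b[i], G[i]) - x)
                         for i in range(A.shape[0]))

A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, -1.0],
              [1.0, 1.0, 0.0]])
b = A @ np.array([1.0, 2.0, 3.0])        # consistent system, solution (1, 2, 3)
G = A**2 / np.sum(A**2, axis=0)          # rows are the diagonals g_i; sum_i G_i = I
x = np.zeros(3)
for _ in range(5000):
    x = step(x, A, b, G)
print(np.allclose(A @ x, b, atol=1e-6))  # True: the iterates solve the system
```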

We consider first the case of unity relaxation, that is, λ_k = 1, for all k ≥ 0, and define the operators

   T_k := Σ_{i∈B(k)} G_i^k P_{C_i}^{G_i^k},   (3.5)

so that (3.3) is now

   x^{k+1} = T_k(x^k).   (3.6)

Our goal is to prove the following convergence result.
Theorem 3.3. If the following assumptions hold:
(i) the sets C_i, i = 1, 2, ..., m, are nonempty closed convex subsets of R^n,
(ii) C = ∩_{i=1}^m C_i ≠ ∅,
(iii) {{G_i^k}_{i=1}^m}_{k≥0} is a fair sequence of vectors of diagonal matrices,
(iv) every index i = 1, 2, ..., m appears in infinitely many sets B(k),
(v) for every i = 1, 2, ..., m, the set C_i is directionally affine with respect to G_i^k, for all k ≥ 0,
then any sequence {x^k}_{k≥0}, generated by Algorithm 3.2 with λ_k = 1 for all k ≥ 0, converges to a point x* ∈ C.
We define, for all k ≥ 0, the functions

   f_k(x) := Σ_{i∈B(k)} ||x − P_{C_i}^{G_i^k}(x^k)||_{G_i^k}^2,   (3.7)

for which the following auxiliary result holds.

Proposition 3.4. Under the assumptions of Theorem 3.3, if {x^k}_{k≥0} is any sequence generated by Algorithm 3.2 with λ_k = 1, for all k ≥ 0, then, for any x ∈ R^n and for every k ≥ 0,

   f_k(x) = ||x − x^{k+1}||^2 + f_k(x^k) − ||x^k − x^{k+1}||^2.   (3.8)

Proof. The proof follows the same lines as the proof of Lemma 4.1 in Censor et al. [15]. By (2.2), (3.6), and (3.7), we have

   f_k(x) = Σ_{i∈B(k)} ||x||_{G_i^k}^2 − 2<x, x^{k+1}> + Σ_{i∈B(k)} ||P_{C_i}^{G_i^k}(x^k)||_{G_i^k}^2.   (3.9)

Using (3.5) and the fact that, due to Σ_{i=1}^m G_i^k = Σ_{i∈B(k)} G_i^k = I, we have, for any u ∈ R^n,

   ||u||^2 = Σ_{i∈B(k)} ||u||_{G_i^k}^2,   (3.10)

the result follows.
Definition 3.5. A sequence {x^k}_{k≥0} is called Fejér-monotone with respect to a nonempty set Ω ⊆ R^n if, for any x ∈ Ω,

   ||x − x^{k+1}|| ≤ ||x − x^k||, for all k ≥ 0.

It is easy to verify that any Fejér-monotone sequence is bounded. We now have all tools in place to prove the convergence theorem.
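Fejér monotonicity is easy to observe experimentally; the sketch below (ours) tracks the Euclidean distance from the iterates of the fully simultaneous scheme to a solution and checks that it never increases:

```python
import numpy as np

def proj(y, a, b, g):
    """Seminorm-induced projection onto the hyperplane {x : <a,x> = b}."""
    J = g > 0
    p = y.astype(float).copy()
    p[J] = y[J] + (b - a @ y) / np.sum(a[J]**2 / g[J]) * (a[J] / g[J])
    return p

A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, -1.0],
              [1.0, 1.0, 0.0]])
sol = np.array([1.0, 2.0, 3.0])
b = A @ sol                              # consistent: sol lies in C
G = A**2 / np.sum(A**2, axis=0)          # fair weights, sum_i G_i = I
x = np.array([10.0, -5.0, 0.0])
dists = []
for _ in range(50):
    dists.append(np.linalg.norm(x - sol))
    x = x + sum(G[i] * (proj(x, A[i], b[i], G[i]) - x) for i in range(3))
monotone = all(d1 <= d0 + 1e-12 for d0, d1 in zip(dists, dists[1:]))
print(monotone)   # True: the distances to sol are nonincreasing
```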
Proof of Theorem 3.3. From (3.8), we have, for every k ≥ 0,

   ||x − x^k||^2 − ||x − x^{k+1}||^2 = ||x − x^k||^2 − f_k(x) + f_k(x^k) − ||x^k − x^{k+1}||^2.   (3.12)

Using (3.7) and (3.10), the right-hand side of (3.12) may be rewritten, for any x = x̄ ∈ C, as (3.13). The first sum on the right-hand side of (3.13) is nonnegative, and each term in the second sum of (3.13) is nonnegative by application of Proposition 2.8 with z = x̄, G = G_i^k, Q = C_i, and y = x^k. Thus, we obtain from (3.12) that, for every x̄ ∈ C,

   ||x̄ − x^{k+1}|| ≤ ||x̄ − x^k||,   (3.14)

that is, Fejér-monotonicity of {x^k}_{k≥0} with respect to C, which implies its boundedness; by monotonicity and nonnegativity, we also obtain that the limit lim_{k→∞} ||x̄ − x^k||^2 exists. Therefore, the left-hand side of (3.12) tends to zero as k → ∞, which implies, by (3.13), that (3.16) holds and that, for any x̄ ∈ C, (3.17) holds. These limits imply the next two limits. By nonnegativity of the seminorms, (3.16) implies that, for all i = 1, 2, ..., m,

   lim_{k→∞} ||P_{C_i}^{G_i^k}(x^k) − x^k||_{G_i^k} = 0,

and the nonnegativity of the summands in (3.17), noted earlier, implies that, for all i = 1, 2, ..., m and any x̄ ∈ C, (3.19) holds. Since {x^k}_{k≥0} is bounded and hence has a cluster point, the following two observations imply the required convergence result: (i) if there exists a cluster point x* of {x^k}_{k≥0} in C, then {x^k}_{k≥0} has only one cluster point, and (ii) every cluster point of {x^k}_{k≥0} must be in C.
To prove (i), let x* ∈ C be a cluster point and assume that x** is another cluster point, that is, lim_{k→∞, k∈K_1} x^k = x* and lim_{k→∞, k∈K_2} x^k = x**. Then, taking the limit of (3.22) for k → ∞, k ∈ K_2, and using (3.20) and (3.21) for the second and first summands on the right-hand side of (3.22), respectively, implies that x* = x**. Next, we prove (ii). Let lim_{l→∞} x^{k_l} = x* and assume that x* ∉ C. Defining the index sets I_in := {i | x* ∈ C_i} and I_out := {i | x* ∉ C_i}, this means that I_out ≠ ∅. Since, by assumption, every index i, 1 ≤ i ≤ m, appears in infinitely many index sets B(k), we may assume, without loss of generality (passing to a subsequence if necessary), that (3.24) holds for every l = 1, 2, .... For every l = 1, 2, ..., let µ_l be the smallest element in the set (3.25) such that (3.26) holds (such an element exists by (3.24) and since I_out ≠ ∅). We want to show that the subsequence {x^{µ_l}}_{l≥0} also converges to x*. By definition, k_l ≤ µ_l, for all l = 1, 2, .... If k_l < µ_l, then (3.27) holds and, since x* ∈ ∩_{i∈I_in} C_i, we have, from an appropriate variant of (3.14), (3.28). Letting l → ∞ in (3.28), we obtain lim_{l→∞} x^{µ_l} = x*. From (3.26), it follows that there exists an index t ∈ I_out such that t ∈ B(µ_l) for infinitely many indices l.
Removing from the sequence {µ_l}_{l≥0} all elements µ_l for which t ∉ B(µ_l), we end up with a new infinite sequence {µ_l}_{l≥0} such that

   t ∈ B(µ_l), for all l = 1, 2, ....   (3.29)

Pick a G_t^k ≠ 0 (there must be infinitely many such nonzero matrices for any t), denote w^{t,k} := P_{C_t}^{G_t^k}(x^k) − x^k, and write (3.30). By (2.2), Definitions 2.1 and 3.1, and (3.30), we obtain (3.31), because when g_{tj}^k = 0 then, by Definition 2.1, w_j^{t,k} = 0, so that, using Definition 3.1, (3.32) holds.

Once convergence of Algorithm 3.2 has been proved for unity relaxation, λ_k = 1 for all k ≥ 0, it is possible to introduce underrelaxation parameters in the following manner.

Theorem 3.6. Under the assumptions of Theorem 3.3, any sequence {x^k}_{k≥0}, generated by Algorithm 3.2 with relaxation parameters λ_k satisfying ε ≤ λ_k ≤ 1, for some arbitrarily small but fixed ε > 0, converges to a point x* ∈ C.
Proof. The idea of this proof was developed during discussions with Charles Byrne. Define the m + 1 diagonal n × n matrices

   Γ_i^k := λ_k G_i^k, for i = 1, 2, ..., m, and Γ_{m+1}^k := (1 − λ_k) I,

and let C_{m+1} = R^n. Then (3.3) takes the form

   x^{k+1} = x^k + Σ_{i=1}^{m+1} Γ_i^k ( P_{C_i}^{Γ_i^k}(x^k) − x^k ).

Also, the convex feasibility problem obviously remains unchanged, and {{Γ_i^k}_{i=1}^{m+1}}_{k≥0} is a fair sequence according to Definition 3.1. Thus, Theorem 3.3 applies and the result follows.
Our Algorithm 3.2 could be derived from Byrne's [9, Algorithm 4.2] only in the special case of fixed blocks and a constant sequence of vectors of diagonal matrices, that is, for G_i^k = G_i, for all k ≥ 0 and all i = 1, 2, ..., m. It is, therefore, natural to ask whether the BIMSGP method of [9] can be extended to accommodate diagonal weighting matrices which vary with the iteration index k, but we do not address this question here. The BIMSGP method is a multiprojection scheme and, as such, it also contains, as a special case, the forerunner [12] of multiprojection methods.

Analysis of the fully simultaneous algorithm in the inconsistent case
The fully simultaneous algorithm with seminorm-induced oblique projections for the convex feasibility problem is obtained from Algorithm 3.2 by imposing the condition that all diagonal matrices of (3.1) are nonzero matrices which do not change as iterations proceed. So, we assume throughout this section that, for all k ≥ 0,

   G_i^k = G_i ≠ 0, for all i = 1, 2, ..., m.   (4.1)

A separate treatment is applied in this section, which allows us to prove that, in this case, Algorithm 3.2 generates sequences {x^k}_{k≥0} which always converge, regardless of the initial point x^0 ∈ R^n and independently of the consistency (C ≠ ∅) or inconsistency (C = ∅) of the underlying system C := ∩_{i=1}^m C_i. Moreover, it will always converge to a minimizer of a certain proximity function, assuming that this function has minimizers; here the proximity function is F(x) := Σ_{i=1}^m ||P_{C_i}^{G_i}(x) − x||_{G_i}^2, and we denote its set of minimizers by Φ. Our analysis follows the same pattern as the proof of the CAV algorithm given in [15]. We treat first the case λ_k = 1, for all k ≥ 0, and then expand the result to underrelaxation.

Proposition 4.2. Under the conditions of Proposition 4.1, if Φ ≠ ∅, then any sequence {x^k}_{k≥0}, generated by Algorithm 3.2, is Fejér-monotone with respect to Φ.
Proof. Take any x̄ ∈ Φ and use (4.5) with x^0 = x̄ and x^1 = T(x̄); then we get (4.7). Since x̄ is a minimizer of F, F(x̄) ≤ F(T(x̄)), and so (4.7) shows that T(x̄) = x̄. Using Proposition 2.8 with y = x^k, G = G_i, Q = C_i, and z = P_{C_i}^{G_i}(x̄), we obtain, for i = 1, 2, ..., m, that the corresponding inequality holds for any sequence {x^k}_{k≥0} generated by Algorithm 3.2 with λ_k = 1 for all k ≥ 0, assuming (4.1). Summing up all these inequalities, using Proposition 3.4 with (4.1) for the resulting left-hand side, and using (4.2), we obtain (4.9), with the additional quantity defined in (4.10). Using (4.2) again and the fact that T(x̄) = x̄, (4.9) can be rewritten as (4.11). We also have, by (4.3), a bound which, along with (4.11) and x^{k+1} = T(x^k), proves the required result.

The inequality in (4.14) follows from convexity considerations: due to Σ_{i=1}^m G_i = I, we have, for every j = 1, 2, ..., n, that Σ_{i=1}^m g_{ij} = 1 while g_{ij} ≥ 0, for all i and all j. Convexity of the real-valued function f(x) = x^2 then implies that, for every j = 1, 2, ..., n and any real numbers u_1, u_2, ..., u_m,

   ( Σ_{i=1}^m g_{ij} u_i )^2 ≤ Σ_{i=1}^m g_{ij} u_i^2.

Summing up all these inequalities over j yields the inequality in (4.14). Now we are ready to prove the convergence theorem. The fact that lim_{k→∞} ||x^{k+1} − x^k||^2 = 0 (see Proposition 4.1) guarantees that T(x*) = x* because x* is a cluster point. Thus, (4.16) and (4.14) show that F(x̄) ≥ F(x*). Finally, if x* is a cluster point of {x^k}_{k≥0}, then x* ∈ Φ, and Proposition 4.2 guarantees that, for all k ≥ 0,

   0 ≤ ||x* − x^{k+1}||^2 ≤ ||x* − x^k||^2.   (4.17)
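For hyperplanes, one can check directly (our derivation, under the weights of the earlier examples, not a formula from the paper) that the fully simultaneous iteration reduces to x ← x + AᵀW(b − Ax) with W_i = 1 / Σ_{j∈J_i} a_ij^2/g_ij, and that the proximity function becomes the weighted least-squares functional F(x) = Σ_i W_i (b_i − <a_i, x>)^2, so its minimizers solve AᵀWAx = AᵀWb. A sketch for an inconsistent system:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0, 3.0])            # inconsistent: C is empty
G = A**2 / np.sum(A**2, axis=0)          # fair weights, sum_i G_i = I
# W_i = 1 / sum_{j in J_i} a_ij^2 / g_ij, where J_i is the support of row i
D = np.array([np.sum(A[i][G[i] > 0]**2 / G[i][G[i] > 0]) for i in range(3)])
W = np.diag(1.0 / D)

x = np.zeros(2)
for _ in range(200):
    x = x + A.T @ W @ (b - A @ x)        # fully simultaneous iteration
x_star = np.linalg.solve(A.T @ W @ A, A.T @ W @ b)  # minimizer of F
print(np.allclose(x, x_star, atol=1e-8))            # True
```

Although no feasible point exists, the iterates still settle at the minimizer of the proximity function, which is the behavior Theorem 4.3 establishes in general.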


Write (3.19) for that t, and use for it Proposition 2.8.

Theorem 4.3. Under assumptions (i), (iii), (iv), and (v) of Theorem 3.3, if the proximity function F has minimizers, that is, Φ ≠ ∅, then any sequence {x^k}_{k≥0}, generated by Algorithm 3.2 with λ_k = 1 for all k ≥ 0, and assuming (4.1), converges to a minimizer of F.

Proof. It follows from the Fejér-monotonicity, established in Proposition 4.2, that {x^k}_{k≥0} is bounded; thus, it has at least one cluster point. We now show that any cluster point of {x^k}_{k≥0} is a minimizer of F. Let x* be a cluster point of {x^k}_{k≥0} and let x̄ ∈ Φ. Using Proposition 2.8 with y = x*, G = G_i, Q = C_i, and z = P_{C_i}^{G_i}(x̄), for i = 1, 2, ..., m, summing up all inequalities, and using Proposition 3.4 with (4.1), (4.2), and (4.10), we obtain

   F(x̄) + ||x̄ − x*||^2 ≥ F(x*) + Γ(x*),   (4.16)

where Γ is the quantity defined in (4.10).