Geometrical and Spectral Properties of the Orthogonal Projections of the Identity

We analyze the best approximation $AN$ (in the Frobenius sense) to the identity matrix in an arbitrary matrix subspace $AS$ ($A \in \mathbb{R}^{n \times n}$ nonsingular, $S$ being any fixed subspace of $\mathbb{R}^{n \times n}$). Some new geometrical and spectral properties of the orthogonal projection $AN$ are derived. In particular, new inequalities for the trace and for the eigenvalues of matrix $AN$ are presented for the special case that $AN$ is symmetric and positive definite.


Introduction
The set of all $n \times n$ real matrices is denoted by $\mathbb{R}^{n \times n}$, and $I$ denotes the identity matrix of order $n$. In the following, $A^T$ and $\operatorname{tr}(A)$ denote, as usual, the transpose and the trace of matrix $A \in \mathbb{R}^{n \times n}$. The notations $\langle \cdot, \cdot \rangle_F$ and $\| \cdot \|_F$ stand for the Frobenius inner product and matrix norm, defined on the matrix space $\mathbb{R}^{n \times n}$. Throughout this paper, the terms orthogonality, angle, and cosine will be used in the sense of the Frobenius inner product.
Our starting point is the linear system
$$Ax = b, \tag{1}$$
where $A$ is a large, nonsingular, and sparse matrix. The resolution of this system is usually performed by iterative methods based on Krylov subspaces (see, e.g., [1,2]). The coefficient matrix $A$ of the system (1) is often extremely ill-conditioned and highly indefinite, so that in this case, Krylov subspace methods are not competitive without a good preconditioner (see, e.g., [2,3]). Then, to improve the convergence of these Krylov methods, the system (1) can be preconditioned with an adequate nonsingular preconditioning matrix $N$, transforming it into any of the equivalent systems
$$NAx = Nb, \qquad ANy = b, \quad x = Ny, \tag{2}$$
the so-called left and right preconditioned systems, respectively. In this paper, we address only the case of the right-hand side preconditioned matrices $AN$, but analogous results can be obtained for the left-hand side preconditioned matrices $NA$.
The preconditioning of the system (1) is often performed in order to get a preconditioned matrix $AN$ as close as possible to the identity in some sense, and the preconditioner $N$ is called an approximate inverse of $A$. The closeness of $AN$ to $I$ may be measured by using a suitable matrix norm like, for instance, the Frobenius norm [4]. In this way, the problem of obtaining the best preconditioner $N$ (with respect to the Frobenius norm) of the system (1) in an arbitrary subspace $S$ of $\mathbb{R}^{n \times n}$ is equivalent to the minimization problem (see, e.g., [5])
$$\min_{M \in S} \|AM - I\|_F = \|AN - I\|_F. \tag{3}$$
The solution $N$ to the problem (3) will be referred to as the "optimal" or the "best" approximate inverse of matrix $A$ in the subspace $S$. Since matrix $AN$ is the best approximation to the identity in subspace $AS$, it will also be referred to as the orthogonal projection of the identity matrix onto the subspace $AS$. Although many of the results presented in this paper are also valid for the case that matrix $N$ is singular, from now on, we assume that the optimal approximate inverse $N$ (and thus also the orthogonal projection $AN$) is a nonsingular matrix. The solution $N$ to the problem (3) has been studied as a natural generalization of the classical Moore-Penrose inverse in [6], where it has been referred to as the $S$-Moore-Penrose inverse of matrix $A$.
The main goal of this paper is to derive new geometrical and spectral properties of the best approximations $AN$ (in the sense of formula (3)) to the identity matrix. Such properties could be used to analyze the quality and theoretical effectiveness of the optimal approximate inverse $N$ as preconditioner of the system (1). However, it is important to highlight that the purpose of this paper is purely theoretical, and we are not looking for immediate numerical or computational approaches (although our theoretical results could potentially be applied to the preconditioning problem). In particular, the term "optimal (or best) approximate inverse" is used in the sense of formula (3) and not in any other sense of this expression.
Among the many different works dealing with practical algorithms that can be used to compute approximate inverses, we refer the reader to, for example, [4,7-9] and to the references therein. In [4], the author presents an exhaustive survey of preconditioning techniques and, in particular, describes several algorithms for computing sparse approximate inverses based on Frobenius norm minimization like, for instance, the well-known SPAI and FSAI algorithms. A different approach (which is also focused on approximate inverses based on minimizing $\|AM - I\|_F$) can be found in [7], where an iterative descent-type method is used to approximate each column of the inverse, and the iteration is done with "sparse matrix by sparse vector" operations. When the system matrix is expressed in block-partitioned form, some preconditioning options are explored in [8]. In [9], the idea of "target" matrix is introduced, in the context of sparse approximate inverse preconditioners, and the generalized Frobenius norms $\|B\|_{F,H}^2 = \operatorname{tr}(BHB^T)$ ($H$ symmetric positive definite) are used, for minimization purposes, as an alternative to the classical Frobenius norm.
The last results of our work are devoted to the special case that matrix $AN$ is symmetric and positive definite. In this sense, let us recall that the cone of symmetric and positive definite matrices has a rich geometrical structure and, in this context, the angle that any symmetric and positive definite matrix forms with the identity plays a very important role [10]. In that paper, the authors extend this geometrical point of view and analyze the geometrical structure of the subspace of symmetric matrices of order $n$, including the location of all orthogonal matrices, not only the identity matrix.
This paper has been organized as follows. In Section 2, we present some preliminary results required to make the paper self-contained. Sections 3 and 4 are devoted to obtaining new geometrical and spectral relations, respectively, for the orthogonal projections $AN$ of the identity matrix. Finally, Section 5 closes the paper with its main conclusions.

Some Preliminaries
Now, we present some preliminary results concerning the orthogonal projection $AN$ of the identity onto the matrix subspace $AS \subset \mathbb{R}^{n \times n}$. For more details about these results and for their proofs, we refer the reader to [5,6,11].
Taking advantage of the prehilbertian character of the matrix Frobenius norm, the solution $N$ to the problem (3) can be obtained using the orthogonal projection theorem. More precisely, the matrix product $AN$ is the orthogonal projection of the identity onto the subspace $AS$, and it satisfies the conditions stated by the following lemmas; see [5,11].
Lemma 1. Let $A \in \mathbb{R}^{n \times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n \times n}$. Let $N$ be the solution to the problem (3). Then,
$$0 \le \|AN\|_F^2 = \operatorname{tr}(AN) \le n, \tag{4}$$
$$0 \le \|AN - I\|_F^2 = n - \operatorname{tr}(AN) \le n. \tag{5}$$
An explicit formula for matrix $N$ can be obtained by expressing the orthogonal projection $AN$ of the identity matrix onto the subspace $AS$ by its expansion with respect to an orthonormal basis of $AS$ [5]. This is the idea of the following lemma.
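Lemma 2. Let $A \in \mathbb{R}^{n \times n}$ be nonsingular. Let $S$ be a linear subspace of $\mathbb{R}^{n \times n}$ of dimension $d$ and $\{M_1, \ldots, M_d\}$ a basis of $S$ such that $\{AM_1, \ldots, AM_d\}$ is an orthogonal basis of $AS$. Then, the solution $N$ to the problem (3) is
$$N = \sum_{i=1}^{d} \frac{\operatorname{tr}(AM_i)}{\|AM_i\|_F^2} M_i, \tag{6}$$
and the minimum (residual) Frobenius norm is
$$\|AN - I\|_F^2 = n - \sum_{i=1}^{d} \frac{[\operatorname{tr}(AM_i)]^2}{\|AM_i\|_F^2}. \tag{7}$$
Let us mention two possible options, both taken from [5], for choosing in practice the subspace $S$ and its corresponding basis $\{M_i\}_{i=1}^{d}$. The first example consists of considering the subspace $S$ of $n \times n$ matrices with a prescribed sparsity pattern, that is,
$$S = \{M \in \mathbb{R}^{n \times n} : m_{ij} = 0 \ \ \forall (i, j) \notin K\}, \qquad K \subset \{1, 2, \ldots, n\} \times \{1, 2, \ldots, n\}. \tag{8}$$
Then, denoting by $M_{i,j}$ the $n \times n$ matrix whose only nonzero entry is $m_{ij} = 1$, a basis of subspace $S$ is clearly $\{M_{i,j} : (i, j) \in K\}$, and then $\{AM_{i,j} : (i, j) \in K\}$ will be a basis of subspace $AS$ (since we have assumed that matrix $A$ is nonsingular). In general, this basis of $AS$ is not orthogonal, so we only need to use the Gram-Schmidt procedure to obtain an orthogonal basis of $AS$, in order to apply the orthogonal expansion (6).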
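The construction described in Lemma 2 and in the sparsity-pattern example can be sketched numerically. The following pure-Python sketch is only an illustration under stated assumptions: the test matrix, the tridiagonal pattern, and all helper names (`matmul`, `frob_inner`, `pattern_basis`, `best_approximate_inverse`) are hypothetical choices of ours and are not taken from [5]. It orthogonalizes the products $AM_i$ by modified Gram-Schmidt in the Frobenius inner product, carries the same linear combinations over to the $M_i$, and then applies the orthogonal expansion of the identity.

```python
def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def frob_inner(X, Y):
    """Frobenius inner product <X, Y>_F (sum of entrywise products)."""
    return sum(x * y for rx, ry in zip(X, Y) for x, y in zip(rx, ry))


def trace(X):
    return sum(X[i][i] for i in range(len(X)))


def axpy(alpha, X, Y):
    """Return alpha * X + Y entrywise."""
    return [[alpha * x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def pattern_basis(n, pattern):
    """Canonical basis of the matrices whose nonzeros lie in `pattern`."""
    basis = []
    for i, j in pattern:
        E = [[0.0] * n for _ in range(n)]
        E[i][j] = 1.0
        basis.append(E)
    return basis


def best_approximate_inverse(A, basis):
    """Orthogonalize the A*M_i (modified Gram-Schmidt), then expand I."""
    n = len(A)
    ortho = []  # pairs (Q, A*Q) with mutually Frobenius-orthogonal A*Q
    for M in basis:
        AM = matmul(A, M)
        for Q, AQ in ortho:
            c = frob_inner(AM, AQ) / frob_inner(AQ, AQ)
            M, AM = axpy(-c, Q, M), axpy(-c, AQ, AM)
        ortho.append((M, AM))
    N = [[0.0] * n for _ in range(n)]
    for Q, AQ in ortho:
        # Coefficient tr(AQ) / ||AQ||_F^2, since tr(AQ) = <I, AQ>_F.
        N = axpy(trace(AQ) / frob_inner(AQ, AQ), Q, N)
    return N


# Hypothetical nonsingular test matrix and a tridiagonal sparsity pattern.
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
n = len(A)
tridiagonal = [(i, j) for i in range(n) for j in range(n) if abs(i - j) <= 1]
N = best_approximate_inverse(A, pattern_basis(n, tridiagonal))
AN = matmul(A, N)
R = axpy(-1.0, identity(n), AN)  # residual AN - I
print(trace(AN), frob_inner(AN, AN), frob_inner(R, R))
```

Up to rounding, the three printed numbers should satisfy $\|AN\|_F^2 = \operatorname{tr}(AN)$ and $\|AN - I\|_F^2 = n - \operatorname{tr}(AN)$, as expected for an orthogonal projection of the identity. Dense nested lists are used only to keep the sketch dependency-free; a practical implementation would exploit sparsity.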
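For the second example, consider a linearly independent set of $n \times n$ real symmetric matrices $\{P_1, \ldots, P_d\}$ and the corresponding subspace
$$S' = \operatorname{span}\{P_1 A^T, \ldots, P_d A^T\}, \tag{9}$$
which clearly satisfies
$$S' \subseteq \{M = PA^T : P \in \mathbb{R}^{n \times n},\ P^T = P\} = \{M \in \mathbb{R}^{n \times n} : (AM)^T = AM\}. \tag{10}$$
Hence, we can explicitly obtain the solution $N$ to the problem (3) for subspace $S'$, from its basis $\{P_1 A^T, \ldots, P_d A^T\}$, as follows. If $\{AP_1 A^T, \ldots, AP_d A^T\}$ is an orthogonal basis of subspace $AS'$, then we just use the orthogonal expansion (6) for obtaining $N$. Otherwise, we use again the Gram-Schmidt procedure to obtain an orthogonal basis of subspace $AS'$, and then we apply formula (6). The interest of this second example lies in the possibility of using the conjugate gradient method for solving the preconditioned linear system, when the symmetric matrix $AN$ is positive definite. For a more detailed exposition of the computational aspects related to these two examples, we refer the reader to [5].
Now, we present some spectral properties of the orthogonal projection $AN$. From now on, we denote by $\{\lambda_i\}_{i=1}^{n}$ and $\{\sigma_i\}_{i=1}^{n}$ the sets of eigenvalues and singular values, respectively, of matrix $AN$ arranged, as usual, in nonincreasing order, that is,
$$|\lambda_1| \ge |\lambda_2| \ge \cdots \ge |\lambda_n| > 0, \qquad \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n > 0. \tag{11}$$
The following lemma [11] provides some inequalities involving the eigenvalues and singular values of the preconditioned matrix $AN$.
Lemma 3. Let $A \in \mathbb{R}^{n \times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n \times n}$. Then,
$$\sum_{i=1}^{n} \lambda_i^2 \le \sum_{i=1}^{n} |\lambda_i|^2 \le \sum_{i=1}^{n} \sigma_i^2 = \|AN\|_F^2 = \operatorname{tr}(AN) = \sum_{i=1}^{n} \lambda_i \le \sum_{i=1}^{n} |\lambda_i| \le \sum_{i=1}^{n} \sigma_i. \tag{12}$$
The following fact [11] is a direct consequence of Lemma 3.
Lemma 4. Let $A \in \mathbb{R}^{n \times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n \times n}$. Then, the smallest singular value and the smallest eigenvalue's modulus of the orthogonal projection $AN$ of the identity onto the subspace $AS$ are never greater than 1. That is,
$$0 < \sigma_n \le |\lambda_n| \le 1. \tag{13}$$
The following theorem [11] establishes a tight connection between the closeness of matrix $AN$ to the identity matrix and the closeness of $\sigma_n$ ($|\lambda_n|$) to the unity.
Theorem 5. Let $A \in \mathbb{R}^{n \times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n \times n}$. Let $N$ be the solution to the problem (3). Then,
$$(1 - |\lambda_n|)^2 \le (1 - \sigma_n)^2 \le \|AN - I\|_F^2 \le n(1 - |\lambda_n|^2) \le n(1 - \sigma_n^2). \tag{14}$$
Remark 6. Theorem 5 states that the closer the smallest singular value $\sigma_n$ of matrix $AN$ is to the unity, the closer matrix $AN$ will be to the identity, that is, the smaller $\|AN - I\|_F$ will be, and conversely. The same happens with the smallest eigenvalue's modulus $|\lambda_n|$ of matrix $AN$. In other words, we get a good approximate inverse $N$ of $A$ when $\sigma_n$ ($|\lambda_n|$) is sufficiently close to 1.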
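As a small worked illustration of Remark 6, consider a hypothetical example of our own (not taken from [11]): $A = \operatorname{diag}(1, 2)$ and $S = \operatorname{span}\{I\}$. We assume here that Theorem 5 is read in the form $(1 - \sigma_n)^2 \le \|AN - I\|_F^2 \le n(1 - \sigma_n^2)$; the numbers below are consistent with that reading.

```latex
% Hypothetical example: A = diag(1, 2), S = span{I}, n = 2.
\begin{align*}
N &= \frac{\operatorname{tr}(A)}{\|A\|_F^2}\, I = \frac{3}{5}\, I,
\qquad
AN = \operatorname{diag}\!\left(\tfrac{3}{5}, \tfrac{6}{5}\right),
\qquad
\sigma_2 = \lambda_2 = \tfrac{3}{5}, \\
\|AN - I\|_F^2 &= \left(\tfrac{2}{5}\right)^2 + \left(\tfrac{1}{5}\right)^2 = \tfrac{1}{5}
 = n - \operatorname{tr}(AN) = 2 - \tfrac{9}{5}, \\
(1 - \sigma_2)^2 &= \tfrac{4}{25}
 \;\le\; \|AN - I\|_F^2 = \tfrac{5}{25}
 \;\le\; n\,(1 - \sigma_2^2) = \tfrac{32}{25}.
\end{align*}
```

In this example the lower bound is close to the residual, which matches the interpretation that $\sigma_n$ near 1 indicates a good approximate inverse.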
To finish this section, let us mention that, recently, lower and upper bounds on the normalized Frobenius condition number of the orthogonal projection $AN$ of the identity onto the subspace $AS$ have been derived in [12]. In addition, this work proposes a natural generalization (related to an arbitrary matrix subspace $S$ of $\mathbb{R}^{n \times n}$) of the normalized Frobenius condition number of the nonsingular matrix $A$.

Geometrical Properties
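In this section, we present some new geometrical properties for matrix $AN$, $N$ being the optimal approximate inverse of matrix $A$, defined by (3). Our first lemma states some basic properties involving the cosine of the angle between matrix $AN$ and the identity, that is,
$$\cos(AN, I) = \frac{\langle AN, I \rangle_F}{\|AN\|_F \|I\|_F} = \frac{\operatorname{tr}(AN)}{\|AN\|_F \sqrt{n}}. \tag{15}$$
Lemma 7. Let $A \in \mathbb{R}^{n \times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n \times n}$. Let $N$ be the solution to the problem (3). Then,
$$\cos(AN, I) = \frac{\operatorname{tr}(AN)}{\|AN\|_F \sqrt{n}} = \frac{\|AN\|_F}{\sqrt{n}} = \frac{\sqrt{\operatorname{tr}(AN)}}{\sqrt{n}}, \tag{16}$$
$$0 \le \cos(AN, I) \le 1, \tag{17}$$
$$\|AN - I\|_F^2 = n\left(1 - \cos^2(AN, I)\right). \tag{18}$$
Proof. First, using (15) and (4) we immediately obtain (16).
As a direct consequence of (16), we derive that $\cos(AN, I)$ is always nonnegative. Finally, using (5) and (16), we get
$$\|AN - I\|_F^2 = n - \operatorname{tr}(AN) = n\left(1 - \cos^2(AN, I)\right),$$
and the proof is concluded.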
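The relations in Lemma 7 can be checked numerically in the simplest setting. The sketch below assumes the one-dimensional subspace $S = \operatorname{span}\{I\}$, for which the orthogonal expansion gives $N = (\operatorname{tr}(A)/\|A\|_F^2)\, I$; the test matrix and the helper names (`projection_onto_scalars`, `cos_with_identity`) are hypothetical, and $\operatorname{tr}(A) \neq 0$ is assumed so that $AN \neq 0$ and the cosine is defined.

```python
import math


def trace(X):
    return sum(X[i][i] for i in range(len(X)))


def frob_norm(X):
    return math.sqrt(sum(x * x for row in X for x in row))


def projection_onto_scalars(A):
    """AN for S = span{I}: N = tr(A) / ||A||_F^2 * I, so AN = c * A."""
    c = trace(A) / frob_norm(A) ** 2
    return [[c * a for a in row] for row in A]


def cos_with_identity(X):
    """Cosine of the Frobenius angle between X and the identity."""
    n = len(X)
    return trace(X) / (frob_norm(X) * math.sqrt(n))


# Hypothetical nonsingular test matrix with nonzero trace.
A = [[2.0, 1.0], [0.0, 1.0]]
n = len(A)
AN = projection_onto_scalars(A)
cosine = cos_with_identity(AN)
residual_sq = sum((AN[i][j] - (1.0 if i == j else 0.0)) ** 2
                  for i in range(n) for j in range(n))

print(cosine, frob_norm(AN) / math.sqrt(n))      # the two forms of the cosine
print(residual_sq, n * (1.0 - cosine ** 2))      # residual vs. n(1 - cos^2)
```

For this matrix both printed pairs should agree up to rounding, which illustrates that the residual norm of the optimal approximate inverse is determined by the cosine alone.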
Remark 8. In [13], the authors consider an arbitrary approximate inverse $Q$ of matrix $A$ and derive the following equality:
$$\|AQ - I\|_F^2 = \left(\|AQ\|_F - \|I\|_F\right)^2 + 2\left(1 - \cos(AQ, I)\right)\|AQ\|_F \|I\|_F, \tag{19}$$
that is, the typical decomposition (valid in any inner product space) of the strong convergence into the convergence of the norms $(\|AQ\|_F - \|I\|_F)^2$ and the weak convergence $(1 - \cos(AQ, I))\|AQ\|_F \|I\|_F$. Note that for the special case that $Q$ is the optimal approximate inverse $N$ defined by (3), formula (18) states that the strong convergence is reduced just to the weak convergence and, indeed, just to the cosine $\cos(AN, I)$.
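Remark 9. More precisely, formula (18) states that the closer $\cos(AN, I)$ is to the unity, the smaller $\|AN - I\|_F$ will be, and conversely. Obviously, the optimal theoretical situation corresponds to the case
$$\cos(AN, I) = 1 \iff \|AN - I\|_F = 0 \iff N = A^{-1} \iff A^{-1} \in S.$$
Remark 10. Note that the ratio between $\cos(AN, I)$ and $\cos(A, I)$ is independent of the order $n$ of matrix $A$. Indeed, assuming that $\operatorname{tr}(A) \neq 0$ and using (16), we get
$$\frac{\cos(AN, I)}{\cos(A, I)} = \frac{\|AN\|_F / \sqrt{n}}{\operatorname{tr}(A) / (\|A\|_F \sqrt{n})} = \frac{\|AN\|_F \|A\|_F}{\operatorname{tr}(A)}.$$
The following lemma compares the trace and the Frobenius norm of the orthogonal projection $AN$.
Lemma 11. Let $A \in \mathbb{R}^{n \times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n \times n}$. Let $N$ be the solution to the problem (3). Then,
$$\operatorname{tr}(AN) \le 1 \iff \|AN\|_F \le 1 \iff \operatorname{tr}(AN) \le \|AN\|_F \iff \cos(AN, I) \le \frac{1}{\sqrt{n}},$$
$$\operatorname{tr}(AN) \ge 1 \iff \|AN\|_F \ge 1 \iff \operatorname{tr}(AN) \ge \|AN\|_F \iff \cos(AN, I) \ge \frac{1}{\sqrt{n}}.$$
Proof. Using (4), we immediately obtain the four leftmost equivalences. Using (16), we immediately obtain the two rightmost equivalences.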
The next lemma provides us with a relationship between the Frobenius norms of the inverses of matrix $A$ and of its best approximate inverse $N$ in subspace $S$.
Lemma 12. Let $A \in \mathbb{R}^{n \times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n \times n}$. Let $N$ be the solution to the problem (3). Then, [...]
Proof. [...] and the proof is concluded.
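The following lemma compares the minimum residual norm $\|AN - I\|_F$ with the distance (with respect to the Frobenius norm) $\|A^{-1} - N\|_F$ between the inverse of $A$ and the optimal approximate inverse $N$ of $A$ in any subspace $S \subset \mathbb{R}^{n \times n}$. First, note that for any two matrices $A, B \in \mathbb{R}^{n \times n}$ ($A$ nonsingular), from the submultiplicative property of the Frobenius norm, we immediately get
$$\|AB - I\|_F = \|A(B - A^{-1})\|_F \le \|A\|_F \|B - A^{-1}\|_F. \tag{29}$$
However, for the special case that $B = N$ (the solution to the problem (3)), we also get the following inequality.
Lemma 13. Let $A \in \mathbb{R}^{n \times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n \times n}$. Let $N$ be the solution to the problem (3). Then,
$$\|AN - I\|_F^2 \le \|A\|_F \|N - A^{-1}\|_F.$$
Proof. Using (5) and the Cauchy-Schwarz inequality, we get
$$\|AN - I\|_F^2 = n - \operatorname{tr}(AN) = \operatorname{tr}\left(A(A^{-1} - N)\right) = \langle A^{-1} - N, A^T \rangle_F \le \|A^{-1} - N\|_F \|A\|_F. \tag{31}$$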
The following extension of the Cauchy-Schwarz inequality, in a real or complex inner product space $(H, \langle \cdot, \cdot \rangle)$, was obtained by Buzano [14]. For all $a, x, b \in H$, we have
$$|\langle a, x \rangle \cdot \langle x, b \rangle| \le \frac{1}{2}\left(\|a\| \|b\| + |\langle a, b \rangle|\right)\|x\|^2.$$
The next lemma provides us with lower and upper bounds on the inner product $\langle AN, B \rangle_F$, for any $n \times n$ real matrix $B$.
Lemma 14. Let $A \in \mathbb{R}^{n \times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n \times n}$. Let $N$ be the solution to the problem (3). [...]
The next lemma provides an upper bound on the arithmetic mean of the squares of the $n^2$ terms in the orthogonal projection $AN$. It also provides us with an upper bound on the arithmetic mean of the $n$ diagonal terms in the orthogonal projection $AN$. These upper bounds (valid for any matrix subspace $S$) are independent of the optimal approximate inverse $N$, and thus they are independent of the subspace $S$ and only depend on matrix $A$.
Lemma 15. Let $A \in \mathbb{R}^{n \times n}$ be nonsingular with $\operatorname{tr}(A) \neq 0$ and let $S$ be a linear subspace of $\mathbb{R}^{n \times n}$. Let $N$ be the solution to the problem (3). Then, [...] (36)
Remark 16. Lemma 15 has the following interpretation in terms of the quality of the optimal approximate inverse $N$ of matrix $A$ in subspace $S$. The closer the ratio $\|A\|_F / |\operatorname{tr}(A)|$ is to zero, the smaller $\operatorname{tr}(AN)$ will be, and thus, due to (5), the larger $\|AN - I\|_F$ will be, and this happens for any matrix subspace $S$.
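Buzano's inequality, recalled earlier in this section, reads $|\langle a, x \rangle \langle x, b \rangle| \le \tfrac{1}{2}(\|a\|\|b\| + |\langle a, b \rangle|)\|x\|^2$, and it can be spot-checked in the Frobenius inner product. The sketch below is only a numerical sanity check on hypothetical $2 \times 2$ matrices of our choosing; the function name `buzano_sides` is an assumption, and a single instance of course proves nothing in general.

```python
import math


def frob_inner(X, Y):
    """Frobenius inner product <X, Y>_F (sum of entrywise products)."""
    return sum(x * y for rx, ry in zip(X, Y) for x, y in zip(rx, ry))


def frob_norm(X):
    return math.sqrt(frob_inner(X, X))


def buzano_sides(a, x, b):
    """Left- and right-hand sides of Buzano's inequality for matrices."""
    lhs = abs(frob_inner(a, x) * frob_inner(x, b))
    rhs = 0.5 * (frob_norm(a) * frob_norm(b) + abs(frob_inner(a, b))) * frob_inner(x, x)
    return lhs, rhs


# Hypothetical matrices; x is the identity, so <a, x>_F = tr(a).
a = [[2.0, 1.0], [0.0, 1.0]]
b = [[1.0, 0.0], [1.0, 3.0]]
x = [[1.0, 0.0], [0.0, 1.0]]

lhs, rhs = buzano_sides(a, x, b)
print(lhs, rhs)

# With a = b the inequality reduces to the Cauchy-Schwarz inequality.
cs_lhs, cs_rhs = buzano_sides(a, x, a)
print(cs_lhs, cs_rhs)
```

Taking $x = I$ is convenient in the present context because the inner products with the identity are traces, which is how such an inequality connects to bounds involving $\operatorname{tr}(AN)$.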

Spectral Properties
In this section, we present some new spectral properties for matrix $AN$, $N$ being the optimal approximate inverse of matrix $A$, defined by (3). Mainly, we focus on the case that matrix $AN$ is symmetric and positive definite. This has been motivated by the following reason. When solving a large nonsymmetric linear system (1) by using Krylov methods, a possible strategy consists of searching for an adequate optimal preconditioner $N$ such that the preconditioned matrix $AN$ is symmetric positive definite [5]. This enables one to use the conjugate gradient method (CG-method), which is, in general, a computationally efficient method for solving the new preconditioned system [2,15].
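Our starting point is Lemma 3, which has established that the sets of eigenvalues and singular values of any orthogonal projection $AN$ satisfy
$$\sum_{i=1}^{n} \lambda_i^2 \le \sum_{i=1}^{n} |\lambda_i|^2 \le \sum_{i=1}^{n} \sigma_i^2 = \|AN\|_F^2 = \operatorname{tr}(AN) = \sum_{i=1}^{n} \lambda_i \le \sum_{i=1}^{n} |\lambda_i| \le \sum_{i=1}^{n} \sigma_i. \tag{38}$$
Let us particularize (38) for some special cases.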
The rest of the paper is devoted to obtaining new properties of the eigenvalues of the orthogonal projection $AN$ for the special case that this matrix is symmetric positive definite.
First, let us recall that the smallest singular value and the smallest eigenvalue's modulus of the orthogonal projection $AN$ are never greater than 1 (see Lemma 4). The following theorem establishes the dual result for the largest eigenvalue of matrix $AN$ (symmetric positive definite).
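Theorem 19. Let $A \in \mathbb{R}^{n \times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n \times n}$. Let $N$ be the solution to the problem (3). Suppose that matrix $AN$ is symmetric and positive definite. Then, the largest eigenvalue of the orthogonal projection $AN$ of the identity onto the subspace $AS$ is never less than 1. That is,
$$\sigma_1 = \lambda_1 \ge 1.$$
Proof. Since $AN$ is symmetric, (41) gives $\sum_{i=1}^{n} \lambda_i^2 = \|AN\|_F^2 = \operatorname{tr}(AN) = \sum_{i=1}^{n} \lambda_i$, and we get
$$\sum_{i=1}^{n} \lambda_i^2 = \sum_{i=1}^{n} \lambda_i \implies \lambda_n^2 - \lambda_n = \sum_{i=1}^{n-1} \left(\lambda_i - \lambda_i^2\right). \tag{48}$$
Now, since $\lambda_n \le 1$ (Lemma 4), then $\lambda_n^2 - \lambda_n \le 0$. This implies that at least one summand in the rightmost sum in (48) must be less than or equal to zero. Suppose that such summand is the $k$th one ($1 \le k \le n - 1$). Since $AN$ is positive definite, then $\lambda_k > 0$, and thus
$$\lambda_k - \lambda_k^2 \le 0 \implies \lambda_k \le \lambda_k^2 \implies \lambda_k \ge 1 \implies \lambda_1 \ge 1,$$
and the proof is concluded.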
In Theorem 19, the assumption that matrix $AN$ is positive definite is essential for assuring that $|\lambda_1| \ge 1$, as the following simple counterexample shows. Moreover, from Lemma 4 and Theorem 19, respectively, we have that the smallest and largest eigenvalues of $AN$ (symmetric positive definite) satisfy $\lambda_n \le 1$ and $\lambda_1 \ge 1$, respectively. Nothing can be asserted about the remaining eigenvalues of the symmetric positive definite matrix $AN$, which can be greater than, equal to, or less than the unity, as the same counterexample also shows.
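Example 20. For $n = 3$, let $A_k = [\ldots]$, let $I_3$ be the identity matrix of order 3, and let $S$ be the subspace of all $3 \times 3$ scalar matrices; that is, $S = \operatorname{span}\{I_3\}$. Then the solution $N_k$ to the problem (3) for subspace $S$ can be immediately obtained by using formula (6) as follows:
$$N_k = \frac{\operatorname{tr}(A_k)}{\|A_k\|_F^2} I_3,$$
and then we get
$$A_k N_k = \frac{\operatorname{tr}(A_k)}{\|A_k\|_F^2} A_k.$$
Let us arrange the eigenvalues and singular values of matrix $A_k N_k$, as usual, in nonincreasing order (as shown in (11)).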
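The following corollary improves the lower bound zero on both $\operatorname{tr}(AN)$, given in (4), and $\cos(AN, I)$, given in (17). [...]
Let us mention that an upper bound on all the eigenvalue moduli and on all singular values of any orthogonal projection $AN$ can be immediately obtained from (38) and (4) as follows:
$$\sum_{i=1}^{n} |\lambda_i|^2 \le \sum_{i=1}^{n} \sigma_i^2 = \|AN\|_F^2 \le n \implies |\lambda_i|, \sigma_i \le \sqrt{n} \quad \forall i = 1, 2, \ldots, n. \tag{60}$$
Our last theorem improves the upper bound given in (60) for the special case that the orthogonal projection $AN$ is symmetric positive definite.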
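Theorem 22. Let $A \in \mathbb{R}^{n \times n}$ be nonsingular and let $S$ be a linear subspace of $\mathbb{R}^{n \times n}$. Let $N$ be the solution to the problem (3). Suppose that matrix $AN$ is symmetric and positive definite. Then, all the eigenvalues of matrix $AN$ satisfy
$$\sigma_i = \lambda_i \le \frac{1 + \sqrt{n}}{2} \quad \forall i = 1, 2, \ldots, n.$$
Proof. First, note that the assertion is obvious for the smallest eigenvalue since $|\lambda_n| \le 1$ for any orthogonal projection $AN$ (Lemma 4). For any eigenvalue $\lambda_j$ of $AN$, we use the fact that $x - x^2 \le 1/4$ for all $x > 0$. Then from (41), that is, from $\sum_{i=1}^{n} \lambda_i^2 = \sum_{i=1}^{n} \lambda_i$, we get
$$\lambda_j^2 - \lambda_j = \sum_{i \neq j} \left(\lambda_i - \lambda_i^2\right) \le \frac{n - 1}{4} \implies \lambda_j \le \frac{1 + \sqrt{n}}{2}. \tag{62}$$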
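The two eigenvalue bounds for symmetric positive definite projections can be illustrated on a family where the spectrum is available in closed form. The sketch below assumes $A$ diagonal with positive entries and $S = \operatorname{span}\{I\}$, so that $AN = (\operatorname{tr}(A)/\|A\|_F^2)\, A$ is diagonal and symmetric positive definite; the sample matrices and the helper name `scalar_projection_eigenvalues` are hypothetical choices of ours.

```python
import math


def scalar_projection_eigenvalues(diag):
    """Eigenvalues of AN for A = diag(diag) and S = span{I}, nonincreasing."""
    c = sum(diag) / sum(d * d for d in diag)  # tr(A) / ||A||_F^2
    return sorted((c * d for d in diag), reverse=True)


# Hypothetical diagonal SPD matrices, given by their diagonals.
samples = ([1.0, 2.0, 3.0], [1.0, 1.0, 1.0, 10.0], [0.5, 4.0])

results = []
for diag in samples:
    lam = scalar_projection_eigenvalues(diag)
    n = len(lam)
    upper = (1.0 + math.sqrt(n)) / 2.0
    results.append((lam, upper))
    # Smallest eigenvalue, largest eigenvalue, and the upper bound.
    print(lam[-1], lam[0], upper)
```

In each case the printed values should show the smallest eigenvalue at most 1, the largest at least 1, and the largest not exceeding $(1 + \sqrt{n})/2$. Diagonal matrices are used only because their eigenvalues can be read off directly; the theorems themselves are not restricted to this setting.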

Conclusion
In this paper, we have considered the orthogonal projection $AN$ (in the Frobenius sense) of the identity matrix onto an arbitrary matrix subspace $AS$ ($A \in \mathbb{R}^{n \times n}$ nonsingular, $S \subset \mathbb{R}^{n \times n}$). Among other geometrical properties of matrix $AN$, we have established a strong relation between the quality of the approximation $AN \approx I$ and the cosine of the angle $\angle(AN, I)$. Also, the distance between $AN$ and the identity has been related to the ratio $\|A\|_F / |\operatorname{tr}(A)|$ (which is independent of the subspace $S$). The spectral analysis has provided lower and upper bounds on the largest eigenvalue of the symmetric positive definite orthogonal projections of the identity.