Two-Step Relaxation Newton Method for Nonsymmetric Algebraic Riccati Equations Arising from Transport Theory

We propose a new idea for constructing an effective algorithm to compute the minimal positive solution of the nonsymmetric algebraic Riccati equations arising from transport theory. For a class of these equations, an important feature is that the minimal positive solution can be obtained by computing the minimal positive solution of a pair of fixed-point equations in vector form. Based on these fixed-point vector equations, we introduce a new algorithm, the two-step relaxation Newton method, derived by combining two different relaxation Newton methods to compute the minimal positive solution. The monotone convergence of the solution sequence generated by this new algorithm is established. Numerical results are given to show the advantages of the new algorithm for the nonsymmetric algebraic Riccati equations in vector form.


Introduction
In this paper, we are interested in iteratively solving the algebraic Riccati equation arising from transport theory (see, e.g., [1-5] and the references therein):

XCX − XE − AX + B = 0, (1.1)

where the matrices A, B, C, E ∈ R^{n×n} are given by

A = Δ − eq^T, B = ee^T, C = qq^T, E = D − qe^T,

with e = (1, 1, ..., 1)^T, q = (q_1, q_2, ..., q_n)^T, q_i = c_i/(2ω_i), and

Δ = diag(δ_1, ..., δ_n), δ_i = 1/(cω_i(1 + α)), D = diag(d_1, ..., d_n), d_i = 1/(cω_i(1 − α)). (1.5)

Here 0 < c ≤ 1, 0 ≤ α < 1, and {c_i} and {ω_i} are, respectively, the weights and nodes of a quadrature rule on the interval [0, 1], satisfying Σ_{i=1}^{n} c_i = 1 and 0 < ω_n < ··· < ω_1 < 1.
It has been shown that problem (1.1) has positive solutions in the componentwise sense; see [2, 4] for details. Since only the minimal positive solution is physically meaningful, research on iteration methods centers around the computation of the minimal positive solution of (1.1); see, for example, [3, 5-12]. For a discussion of more general nonsymmetric algebraic Riccati equations arising in the real world, we refer the interested reader to [13-17]. In the seminal paper by Lu [9], it was shown that the solution X of (1.1) can be written as

X = T ∘ (uv^T),

where ∘ denotes the Hadamard product, T is the matrix with elements T_{i,j} = 1/(δ_i + d_j), and u, v are two vectors satisfying the following vector equations:

u = u ∘ (Pv) + e, v = v ∘ (Qu) + e, (1.7)

where P = (P_{i,j}) with P_{i,j} = q_j/(δ_i + d_j), and Q = (Q_{i,j}) with Q_{i,j} = q_j/(δ_j + d_i).
Let w = (u^T, v^T)^T ∈ R^{2n}, with w_i = u_i and w_{n+i} = v_i for i = 1, 2, ..., n, and let g(w) = (g_1(w), g_2(w), ..., g_{2n}(w))^T with

g_i(w) = (Pv)_i, g_{n+i}(w) = (Qu)_i, i = 1, 2, ..., n. (1.8)

Then the vector equations (1.7) can be uniformly rewritten as

w = w ∘ g(w) + e. (1.9)

Based on the vector equations (1.7), Lu [9] investigated the simple iteration (SI) method for computing the minimal positive solution:

u_{k+1} = u_k ∘ (Pv_k) + e, v_{k+1} = v_k ∘ (Qu_k) + e, k = 0, 1, 2, ..., (1.10)

with v_0 = 0, u_0 = 0.
It was shown that the solution sequence {(u_k, v_k)} generated by (1.10) converges monotonically to the minimal positive solution of (1.7). We note that each step of the SI method costs about 4n^2 flops (for the definition of flops, see, e.g., [18]), while the Gauss-Jacobi (GJ) scheme defined by Juang [2, 11] costs about 6n^2 flops. Hence, the SI method is more efficient than the GJ method. In (1.10), we note that before computing v_{k+1} we have already obtained u_{k+1}, which should be a better approximation to u than u_k. Based on this consideration, Bao et al. [7] constructed the modified simple iteration method

u_{k+1} = u_k ∘ (Pv_k) + e, v_{k+1} = v_k ∘ (Qu_{k+1}) + e, k = 0, 1, 2, ..., (1.12)

with v_0 = 0, u_0 = 0.
It was shown theoretically and numerically that the modified simple iteration procedure (1.12) is more efficient than the original one (1.10).
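As a concrete illustration, the SI scheme (1.10) can be sketched in a few lines of Python; the positive data delta, d, q below are hypothetical placeholders, not the quadrature-based values used in the experiments of Section 4.

```python
# Sketch of the SI scheme (1.10): u_{k+1} = u_k ∘ (P v_k) + e,
# v_{k+1} = v_k ∘ (Q u_k) + e, starting from u_0 = v_0 = 0.
# delta, d, q are illustrative positive data, not the paper's values.
n = 4
delta = [2.0, 3.0, 4.0, 5.0]
d = [2.5, 3.5, 4.5, 5.5]
q = [0.2, 0.2, 0.2, 0.2]

# P[i][j] = q_j / (delta_i + d_j), Q[i][j] = q_j / (delta_j + d_i)
P = [[q[j] / (delta[i] + d[j]) for j in range(n)] for i in range(n)]
Q = [[q[j] / (delta[j] + d[i]) for j in range(n)] for i in range(n)]

def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]

u = [0.0] * n
v = [0.0] * n
for k in range(500):
    Pv, Qu = matvec(P, v), matvec(Q, u)
    u_new = [u[i] * Pv[i] + 1.0 for i in range(n)]   # e = (1, ..., 1)^T
    v_new = [v[i] * Qu[i] + 1.0 for i in range(n)]
    err = max(abs(u_new[i] - u[i]) + abs(v_new[i] - v[i]) for i in range(n))
    u, v = u_new, v_new
    if err < 1e-13:
        break
```

Replacing Qu by Q u_{k+1} in the second update gives the modified scheme (1.12); the iterates increase monotonically from zero toward the minimal positive solution.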
Recently, the so-called nonlinear splitting iteration methods defined as

u_{k+1} = u_{k+1} ∘ (Pv_k) + e, v_{k+1} = v_{k+1} ∘ (Qu_k) + e, (1.13a)

u_{k+1} = u_{k+1} ∘ (Pv_k) + e, v_{k+1} = v_{k+1} ∘ (Qu_{k+1}) + e, (1.13b)

were investigated by Bai et al. [6] and independently by Lin [19]. In [6], methods (1.13a) and (1.13b) were called the nonlinear block Jacobi (NBJ) iteration method and the nonlinear block Gauss-Seidel (NBGS) iteration method, respectively, and it was shown that the solution sequence {(u_k, v_k)} generated by the NBJ method and by the NBGS method converges monotonically to the minimal positive solution of (1.7). Moreover, numerical experiments given in [6, 19] show that the convergence speed of these two methods is higher than that of the SI method (1.10).
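In our reading of (1.13a), each half of the NBJ update is linear in its unknown iterate, so the new components can be solved for in closed form: (u_{k+1})_i = 1/(1 − (Pv_k)_i) and (v_{k+1})_i = 1/(1 − (Qu_k)_i). A sketch under the same hypothetical data as above:

```python
# One NBJ sweep (1.13a): solve u_{k+1} = u_{k+1} ∘ (P v_k) + e and
# v_{k+1} = v_{k+1} ∘ (Q u_k) + e componentwise. Each equation is linear
# in its unknown, giving (u_{k+1})_i = 1 / (1 - (P v_k)_i), and similarly
# for v. delta, d, q are illustrative positive data, not the paper's values.
n = 4
delta = [2.0, 3.0, 4.0, 5.0]
d = [2.5, 3.5, 4.5, 5.5]
q = [0.2, 0.2, 0.2, 0.2]
P = [[q[j] / (delta[i] + d[j]) for j in range(n)] for i in range(n)]
Q = [[q[j] / (delta[j] + d[i]) for j in range(n)] for i in range(n)]

def nbj_step(u, v):
    Pv = [sum(P[i][j] * v[j] for j in range(n)) for i in range(n)]
    Qu = [sum(Q[i][j] * u[j] for j in range(n)) for i in range(n)]
    return ([1.0 / (1.0 - Pv[i]) for i in range(n)],
            [1.0 / (1.0 - Qu[i]) for i in range(n)])

u = [0.0] * n
v = [0.0] * n
for k in range(200):
    u_new, v_new = nbj_step(u, v)
    err = max(abs(u_new[i] - u[i]) + abs(v_new[i] - v[i]) for i in range(n))
    u, v = u_new, v_new
    if err < 1e-13:
        break
```

Using u_{k+1} instead of u_k inside the second formula gives the NBGS variant (1.13b).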
In this paper, based on the fixed-point equations (1.7), we present an accelerated version of the NBJ scheme, namely, the two-step relaxation Newton method, to compute the minimal positive solution. The construction of the new method is based on the relaxation Newton method introduced by Wu et al. [20, 21]. Each iteration of the new algorithm is composed of two steps: we first use the NBJ method, as a simple relaxation Newton method, to obtain a coarse approximation of the solution of (1.7), and then, with this coarse approximation at hand, we use a relatively complicated relaxation Newton method to get a finer approximation. It is shown that the solution sequence generated by this method converges monotonically to the minimal positive solution of (1.7). We also prove that the new method is more efficient than the NBJ method and its two-step version.
The remainder of this paper is organized as follows. In Section 2, we introduce the relaxation Newton method and its two-step version. Section 3 is concerned with the monotone convergence of the new method. In Section 4, we carry out numerical experiments to compare the performance of the new method with that of the SI method and the NBJ method in terms of iteration count and CPU time. At the end of this work, we also construct another two-step relaxation Newton algorithm which performs much better than the SI and NBJ methods; unfortunately, at the moment we cannot theoretically prove the convergence of this method to the minimal positive solution of the vector equations (1.7). Therefore, we only report the numerical results of this method in Section 4.

The Two-Step Relaxation Newton Method
In this section, we introduce the basic idea of the relaxation Newton method investigated in [20, 21] and its two-step version for the following general nonlinear equations:

f(x) = 0, x ∈ R^n, (2.1)

where f: R^n → R^n.
The Relaxation Newton Algorithm
The key idea of the relaxation Newton algorithm is to choose a splitting function F: R^n × R^n → R^n which is minimally assumed to satisfy the consistency condition

F(x, x) = f(x) for any x ∈ R^n. (2.2)

Then, with an initial guess x_0 of the unknown solution x*, we start from the previous approximation x_k and compute the next approximation x_{k+1} by solving the problem

F(x_k, x) = 0 (2.3)

with some conventional method, such as the classical Newton method, a quasi-Newton method, or the conjugate gradient method. Obviously, upon convergence the generated sequence {x_k}_{k=0}^∞ approaches some value x* which satisfies F(x*, x*) = 0, that is, f(x*) = 0.
Wu et al. [20, 21] used the classical Newton method to solve (2.3), which explains the name relaxation Newton; the resulting algorithm, written compactly, is shown in Algorithm 1.
In Algorithm 1 and in what follows,

F_2(x, y) = ∂F(x, y)/∂y. (2.4)
If we set x^0 = x_k and M = 1 in the relaxation algorithm shown in Algorithm 1, then by the consistency condition (2.2) the method can be written as

x_{k+1} = x_k − F_2(x_k, x_k)^{-1} f(x_k). (2.5)

With a special choice of F, the Jacobian matrix F_2(x, x) will be a diagonal or block diagonal matrix that is invertible in R^n, and thus iteration method (2.5) can be carried out stably and simultaneously, with less storage, compared with the classical Newton method. We will see in the next section how to select the splitting function F such that the Jacobian matrix F_2(x, x) is a diagonal or block diagonal matrix.
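As a toy illustration of (2.5), consider the hypothetical system f(x) = (x_1^2 + x_2 − 3, x_1 + x_2^2 − 3)^T with the splitting F(x, y) = (y_1^2 + x_2 − 3, x_1 + y_2^2 − 3)^T (neither is taken from the paper). Then F(x, x) = f(x), and the Jacobian with respect to the second argument, F_2(x, x) = diag(2x_1, 2x_2), is diagonal, so each component of (2.5) is updated independently:

```python
import math

# Relaxation Newton step (2.5) for the hypothetical system
# f(x) = (x1^2 + x2 - 3, x1 + x2^2 - 3) with the splitting
# F(x, y) = (y1^2 + x2 - 3, x1 + y2^2 - 3), whose Jacobian in the
# second argument at y = x is the diagonal matrix diag(2*x1, 2*x2).
def f(x1, x2):
    return (x1 * x1 + x2 - 3.0, x1 + x2 * x2 - 3.0)

x1, x2 = 2.0, 2.0                       # initial guess
for k in range(200):
    f1, f2 = f(x1, x2)
    # x_{k+1} = x_k - F_2(x_k, x_k)^{-1} f(x_k), componentwise
    x1, x2 = x1 - f1 / (2.0 * x1), x2 - f2 / (2.0 * x2)

# The exact solution of this toy system has x1 = x2 = (-1 + sqrt(13)) / 2.
exact = (-1.0 + math.sqrt(13.0)) / 2.0
```

The iteration converges linearly (the error is roughly multiplied by 1/(2 x_1*) per step), which is the price paid for replacing the full Jacobian by the cheap diagonal factor F_2.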

The Two-Step Relaxation Newton Method
Now suppose that we have two splitting functions F(x, y) and G(x, y, z) and an initial approximation x_0 of the solution of (2.1) at hand. We start from the previous approximation x_k and compute a mid-approximation x_{k+1/2} by solving the equations

F(x_k, x) = 0, (2.6a)

and then, with x_k and x_{k+1/2} at hand, we compute x_{k+1} by solving the equations

G(x_k, x_{k+1/2}, x) = 0. (2.6b)

The splitting functions F and G are assumed to satisfy the consistency conditions

F(x, x) = f(x), G(x, x, x) = f(x) for any x ∈ R^n. (2.7)

Similar to the relaxation Newton algorithm described above, we use Newton's method with a single iteration to solve equations (2.6a) and (2.6b), and the resulting iteration scheme is

x_{k+1/2} = x_k − F_2(x_k, x_k)^{-1} f(x_k), (2.8a)

x_{k+1} = x_{k+1/2} − G_3(x_k, x_{k+1/2}, x_{k+1/2})^{-1} G(x_k, x_{k+1/2}, x_{k+1/2}), (2.8b)

where F_2 is defined by (2.4), and the Jacobian matrix G_3(x, y, z) of the function G is defined by

G_3(x, y, z) = ∂G(x, y, z)/∂z. (2.8c)
Throughout this paper, the iteration scheme (2.8a)-(2.8c) is called the two-step relaxation Newton (denoted "TSRN") algorithm. In particular, for the vector equations (1.7) we consider in this paper the splitting functions given in (2.9a) and (2.9b); here and hereafter, w_k = (u_k^T, v_k^T)^T for all k = 0, 1, 2, ...; Φ and Ψ are diagonal matrices whose diagonal elements are determined as in (2.10).
Generally speaking, the splitting function G is a function of three variables, but here we consider a special case in which only the second and third variables are involved; see (2.9b). At the end of Section 4, we will see another TSRN algorithm in which the function G is defined with three arguments (see (4.5b)). Define the matrices A_k and B_k as in (2.11); then it is clear that (2.12) holds. Therefore, substituting the function f defined by (1.7) and the splitting functions F, G defined in (2.9a), (2.9b) into (2.8a)-(2.8c) yields the iteration formulas (2.13a) and (2.13b).
We note that, with the special splitting functions F and G defined in (2.9a), (2.9b), equations (2.13a), (2.13b) imply (2.14). Furthermore, by (2.10) we know that A_k B_k must be the zero matrix for all k = 0, 1, 2, ...; hence (2.16) holds.
From (2.9a), (2.9b), (2.10), and (2.16), we obtain the following algorithm for implementing the TSRN method.
Algorithm 1 (two-step relaxation Newton iteration). Starting with the initial value (u_0, v_0), the solution sequence {(u_k, v_k)} is defined, for k = 0, 1, 2, ..., by (1) computing (u_{k+1/2}, v_{k+1/2}) and (u_{k+1}, v_{k+1}) explicitly and elementwise according to (2.16), and (2) setting k := k + 1 and returning to the first step.

Convergence Analysis
For our proof of the monotone convergence of the TSRN method, we need the following results proved by Lu [9].
Lemma 3.1 (see [9]). With the function g defined in (1.8), one has the conclusions (3.1) and (3.2). The first conclusion can be obtained straightforwardly from (1.7).

Theorem 3.2. Let (u_0, v_0) = (0, 0) be the initial point of the TSRN method. The solution sequence {(u_k, v_k)}_{k=1}^∞ generated by the TSRN method is strictly and monotonically increasing and converges to the minimal positive solution (u*, v*) of the vector equations (1.7); in particular, the relations (3.3) and (3.4) hold.

Proof. We first prove the first conclusion of Theorem 3.2 by induction; this is completed in two steps: first, we prove the conclusion for k = 0, and then, under the induction assumption with index k = j, we prove the case k = j + 1.

(1) For k = 0: it is easy to verify that G_3(w_0, w_{1/2}, w*) = G_3(w_0, w_{1/2}, w_{1/2}), and this, together with u_{1/2} < u* and v_{1/2} < v*, shows that the right-hand side of (3.5) is positive, which implies G(w_0, w_{1/2}, w*) > 0. Therefore, from (3.3) we get w_1 < w*. We have thus proved that the first conclusion of Theorem 3.2 is correct for k = 0.
(2) Assume that the first conclusion of Theorem 3.2 is correct for k = j, j ≥ 0; we will prove that it is also correct for k = j + 1. To this end, we note that under the induction assumption the relation (3.6) holds, where the inequalities e − Qu* > 0 and e − Pv* > 0 follow directly from equality (3.4). Considering (3.7) and using w_{j+1/2} < w_{j+1}, we get (3.8).
Hence, we arrive at (3.9)-(3.11). From (3.9) we have (3.12), and this, coupled with (3.11), gives (3.13).
Next, we prove w_{j+2} < w*. To this end, from (3.2) and (2.14) we have (3.14), and this implies (3.15). From (3.11), we get (3.16), where the first equality can be easily verified from (2.9b). Consider (3.4) and the relations in (3.17). Therefore, it follows from (2.9a), (2.9b), and (2.14) that (3.18)-(3.20) hold. By (3.9), (3.13), and (3.18) we have proved the validity of the first conclusion of Theorem 3.2 for k = j + 1; therefore, by induction, the proof of the first conclusion of Theorem 3.2 is complete. From the first conclusion of Theorem 3.2, it is obvious that there exist positive vectors ū* and v̄* such that lim_{k→∞} u_k = ū* and lim_{k→∞} v_k = v̄*.
This implies that the limit pair (ū*, v̄*) of the sequence {(u_k, v_k)} is a positive solution of the vector equations (1.7). According to the minimality of (u*, v*), it must hold that ū* = u* and v̄* = v*.
To finish this section, we compare the efficiency of the TSRN method with that of the NBJ method. To this end, we rewrite the original NBJ iteration scheme (1.13a) in the two-step form (3.21).
Clearly, the TSRN method reduces to the two-step version (3.21) of the NBJ method if Ψ = Φ = 0 in (2.9b).
The following theorem indicates that the TSRN method is more efficient than the NBJ method.
Theorem 3.3. Let both the NBJ method and the TSRN method start with the initial value (0, 0), and let {(u_k, v_k)} and {(U_k, V_k)} be the sequences generated by the TSRN method and the NBJ method, respectively. Then the relation (3.22) holds.

Proof. Starting from (0, 0), both methods give (u_1, v_1) = (U_1, V_1) = (e, e). Hence, we have (3.23).
This, coupled with (2.10), the second conclusion of Lemma 3.1, and the first conclusion of Theorem 3.2, gives (3.24).
Since P, Q > 0, it follows from (3.24) that (3.25) holds,
where in the last inequality we used the relation (3.6) and the fact, established in [6], that the sequence {(U_k, V_k)} is strictly and monotonically increasing and converges to the minimal positive solution (u*, v*). From (3.25) we have (3.26), and then we arrive at u_3 > U_3 and v_3 > V_3. Therefore, the proof of (3.22) can be completed by using relations (3.25) and (3.26) repeatedly.
Remark 3.4. Since the NBJ method is better suited than the NBGS method to parallel environments, in this paper we only focus on comparing the efficiency of the TSRN and NBJ methods.

Numerical Results
In this section, we compute the minimal positive solution of the vector equations (1.7) by the TSRN method, in order to compare numerically the feasibility and effectiveness of this algorithm with the SI method (1.10) and the NBJ method (1.13a) in terms of the number of iterations (denoted "IT") and the elapsed CPU time in seconds (denoted "CPU"). Clearly, these methods are very suitable for parallel environments. We remark that the Newton-type methods coupled with LU factorization [6, 10], the NBGS method (1.13b), and the modified simple iteration (1.12) are sequential methods and are not suitable for parallel implementation.
In all experiments, the constants c_i and ω_i, i = 1, 2, ..., n, are given by the composite Gauss-Legendre quadrature formula on the interval [0, 1]. More precisely, the interval [0, 1] is first divided into n/4 subintervals of equal length, and the Gauss-Legendre quadrature formula with 4 nodes is then applied to each subinterval (see [14]). In our implementations, all iterations start with the initial value (0, 0) and are terminated once the current residual error ERR_k satisfies ERR_k ≤ 10^{-13}. All codes are run in MATLAB version 7.0 on an Intel(R) Pentium(R) Dual E2110 @ 1.4 GHz PC with 1 GB of memory.
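The node-weight data can be generated as in the following sketch, which assumes the standard 4-node Gauss-Legendre rule on [−1, 1] mapped onto n/4 equal subintervals of [0, 1], with ω_i denoting the nodes and c_i the weights:

```python
# Composite 4-node Gauss-Legendre quadrature on [0, 1]: split [0, 1]
# into m equal subintervals and map the 4-node rule from [-1, 1] onto
# each, giving n = 4*m nodes omega_i and weights c_i with sum(c_i) = 1.
X4 = [-0.8611363115940526, -0.3399810435848563,
       0.3399810435848563,  0.8611363115940526]   # nodes on [-1, 1]
W4 = [ 0.3478548451374538,  0.6521451548625461,
       0.6521451548625461,  0.3478548451374538]   # weights on [-1, 1]

def composite_gl4(m):
    h = 1.0 / m
    omega, c = [], []
    for k in range(m):
        a = k * h                                  # subinterval [a, a + h]
        for x, w in zip(X4, W4):
            omega.append(a + 0.5 * h * (x + 1.0))  # affine map to [a, a+h]
            c.append(0.5 * h * w)
    return omega, c

omega, c = composite_gl4(8)                        # n = 32 nodes
total = sum(c)                                     # equals the interval length 1
```

Since the 4-node rule is exact for polynomials of degree up to 7, the composite rule reproduces, e.g., the integral of x^6 over [0, 1] to machine precision.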
To make a fair comparison, we rewrite the SI method (1.10) equivalently in the two-step fashion (4.2),
and the vectors (u_{k+1/2}, v_{k+1/2}) and (u_{k+1}, v_{k+1}) are computed explicitly and elementwise as in (4.3). The NBJ method is also performed in the two-step fashion defined in (3.21), with the vectors computed elementwise as in (4.4).
Example 4.1. In Table 1, for the fixed problem size n = 32 but different pairs (α, c), and in Table 2, for the fixed pair (α, c) = (10^{-7}, 1) but different problem sizes n, we list the ITs and CPUs of the SI, NBJ, and TSRN methods, respectively.
We see clearly from Table 1 that, with fixed problem size, the TSRN method converges faster than the SI and NBJ methods, and as α tends to 0 and c tends to 1, the advantages of the TSRN method become more significant. Moreover, for each pair (α, c), the TSRN method needs less CPU time than the SI and NBJ methods, even though the former needs a few more flops at each iteration. When the parameters α and c are fixed and the problem size n becomes large, the performance of the TSRN method approaches that of the NBJ method, as shown in Table 2.

Example 4.2. At the end of the work of this paper, we have constructed another two-step relaxation Newton algorithm, denoted TSRN* for the moment, with the splitting functions F and G defined as in (4.5a) and (4.5b); in contrast to (2.9b), the function G in (4.5b) involves all three of its arguments.

With these two splitting functions, we list in Table 3, for the fixed problem size n = 256 but different (α, c), and in Table 4, for the fixed pair (α, c) = (5 × 10^{-9}, 1 − 3 × 10^{-9}) but different problem sizes, the ITs and CPUs of the NBJ, TSRN, and TSRN* methods (from the numerical results given in the previous example we believe that the NBJ and TSRN methods converge faster than the SI method, and thus we omit the comparison of the TSRN* method with the SI method).

For each problem size n and pair (α, c) tested in Tables 3 and 4, it holds that

max{ ||U_TSRN − U_TSRN*||_∞, ||V_TSRN − V_TSRN*||_∞ } ≤ 10^{-14}, (4.6)

where (U_TSRN, V_TSRN) and (U_TSRN*, V_TSRN*) are the converged vectors generated by the TSRN and TSRN* methods, respectively. Therefore, the TSRN* method really converges to the minimal positive solution of the vector equations (1.7). Moreover, from the numerical results listed in Tables 3 and 4, we see clearly that the TSRN* method performs much better than the TSRN and NBJ methods. Unfortunately, at the moment we cannot prove theoretically the convergence of the TSRN* method to the minimal positive solution of the vector equations (1.7).

Finally, we note that, compared with the Newton-type methods investigated by Lu [10] and Lin et al. [8], no LU-factorization is needed at each iteration of the TSRN method.

Table 1: Numerical results for n = 32 and different (α, c).

Table 3: Numerical results for n = 256 and different (α, c).