A Twisted Block Tangential Filtering Decomposition Preconditioner

For block-tridiagonal linear system of equations, a variant of tangential filtering preconditioners is proposed in this paper. The new variant is based on a twisted block factorization along with certain filtering property. For practical usage, a class of composite preconditioners tested, which are constructed by combining the twisted tangential filtering decomposition preconditioner with the classical ILU 0 preconditioner in a multiplicative way. The performance of the new preconditioners is compared with other classical preconditioners; the superiority and the weakness of the preconditioners are pointed out.


Introduction
In many scientific and engineer applications, for example, simulation of laser propagation in a plasma 1 and study of transport in highly heterogeneous porous media 2 , we have to numerically solve certain partial differential equations in 2D or 3D.The discretization of these PDEs by finite difference or finite volume schemes usually leads to large block-tridiagonal linear system: Due to the size of the problem, preconditioned Krylov iterative methods have become one of the most popular choices.It is generally recognized that the efficiency of the linear systems solver heavily depends on the property of the preconditioners.Therefore, the construction of robust and efficient preconditioners has become an interesting research topic.Several incomplete block factorization preconditioners have been proposed by many researchers, see, for example, Axelsson 3-5 and Meurant 6-8 .Frequency filtering preconditioners 9-16 advocated by G. Wittum and his successor are a special kind of incomplete block factorization preconditioner.This class of preconditioning techniques has been illustrated particularly efficient for linear systems arising from the discretization of partial differential equations with discontinuous coefficients.
With the development of techniques of parallel computing, developing highperformance preconditioners that are suitable for parallel computing environment is becoming an important topic.In this paper, we propose a tangential filtering preconditioner constructed by the framework of twisted block factorization.Firstly, the constructed preconditioner has a filtering property.Secondly, the construction and solving procedures of the twisted factorization preconditioner are carried out from two sides, which can be done in parallel.The performance of the newly built preconditioner is compared with the tangential filtering preconditioner proposed in 10 .For practical applications, Achdou and Nataf 10 propose to combine the tangential filtering preconditioner with the ILU 0 preconditioner.In this paper, we also consider to combine the twisted tangential filtering preconditioner with the ILU 0 preconditioner in the following way: The performance of several different preconditioners is compared on some linear systems generated from the discretization of boundary value problems with discontinuous coefficients.The results show that the twisted block factorization preconditioner and its corresponding preconditioner output other preconditioners on some problems.In Section 2, we give a brief introduction of twisted block tangential filtering decomposition and then introduce the twisted block tangential filtering decomposition preconditioner.
In Section 3, we analyze the properties of the preconditioner.In Section 4, we give numerical experiments to compare the performances of different types of preconditioners.

A Twisted Block Tangential Filtering Decomposition
The block LDU factorization of A is where The matrices T i satisfy the induction formula

2.2
The block LDU factorization can be written as Similar to the block LDU factorization, the block UDL factorization of A has the form where

2.5
Then the twisted block factorization can be written as where

Mathematical Problems in Engineering
The index j satisfies 1 < j < m, the matrix T Blockdiag T 1 , T 2 • • • T m with the diagonal block T i satisfies the following relationship:

2.8
Different from diagonal block matrix D i of A, the matrices T i becomes dense quickly.Therefore, factorization 2.6 cannot be used for large problems in practice.However, the framework can be used to build an incomplete twisted block-factorization preconditioner for A. Precisely, we can replace the blocks T i by suitably chosen sparse or block-sparse approximations T i , i 1, 2, . . ., m .Then an incomplete factorization preconditioner M is constructed, which has the following form: From 2.9 it is easy to see that solving linear system Mx f is equivalent to solving the following two linear systems

2.10
By exploiting the structure, both of the linear systems can be solved by the forward and backward sweeps.Suppose y y and x x T 1 , x T 2 , . . ., x T m T according to the block structure of T .Then the process of solving Mx f can be described in Algorithm 1.
Remarks 2.1.Each of the solvers for E T y f and T −1 F I x y described in Algorithm 1 involves forward and backward sweeps, and the two sweeps have no relationship with each other, so the forward and backward sweeps can be run in parallel.The procedure of constructing of T is consistent with the idea presented in 10 .Suppose we have then Algorithm 1: Solving Mx f. which implies

2.13
It means that According to 2.8 , we have the following formula for Then the new block factorization preconditioner M based on the twisted factorization can be constructed by choosing β i−1 properly.Following the tangential filtering condition proposed in 10 , a diagonal approximation β i−1 can be determined such that where t is a filtering vector.

Mathematical Problems in Engineering
Lemma 2.2.If the matrices with 2.17 Proof.Consider the matrix M − A and observe that thus 2.17 holds.

Now we consider how to form a diagonal matrix
T be a given vector.If there are no zero entries in the vectors U i−1 t i 2 ≤ i ≤ j and L i t i j ≤ i ≤ m − 1 , then it is possible to find diagonal matrices β i−1 such that M produces the same effect with A when operating on the filtering vector t, that is, From 2.17 , we can see that it is sufficient to make These requirements can be satisfied by setting β i−1 as follows: where ./designs the pointwise vector division, and Diag v is the diagonal matrix constructed from the vector v.We refer to the preconditioner constructed by the above procedure as twisted block tangential filtering decomposition preconditioner.

Analysis of the Twisted Tangential Filtering Preconditioner
In this section, we restrict A to be symmetric positive definite, and use A B A B to denote that A − B is symmetric positive definite semidefinite .Consider the twisted tangential filtering preconditioner M formed by 2.9 , which ensures the filtering property 2.19 .Furthermore, the following lemma holds and it has been established in 10 .
Lemma 3.1.If A 0, then matrices T i T i , 1 ≤ i ≤ m.Moreover, M 0 and M − A 0 hold.The proof is similar to the proof of Lemma 2.1 of [10], so it is omitted here.From Lemma 3.1, one has the following result.The proof can be found in [17,18].
be the splitting of coefficient matrix A induced by the twisted tangential filtering decomposition preconditioner M, then ρ M −1 N < 1.

Numerical Expriments
In this section, we present some numerical results to test the performance of preconditioners discussed in this paper.The performance of composite preconditioners is compared with M ILU .Two kinds of approaches of constructing the filtering preconditioner M are considered.The combination approach 1.3 is used for all the composite preconditioners.Consider the boundary value problem used in 10 where Ω 0, 1 n n 2, or 3 , ∂Ω N ∂Ω \ ∂Ω D .The function η, the vector field a, and the tensor κ are the given coefficients of the partial differential operator.In 2D case, we have ∂Ω D 0, 1 × {0, 1}, and in 3D case, we have ∂Ω D 0, 1 × {0, 1} × 0, 1 .Due to the discontinuous coefficients in the PDE equation and the size of A, an efficient preconditioner plays an important role in solving 4.1 by preconditioned iterative methods.
Several types of preconditioners tested in our numerical experiments, we outline the notations as follows, M ILU : the ILU 0 preconditioner; M TFFD : the tangential frequency filtering decomposition preconditioner; M TBTD : the twisted block tangential filtering decomposition preconditioner; M ITF : the composite preconditioner generated by M ILU and M TFFD ; M ITB : the composite preconditioner generated by M ILU and M TBTD .Two filtering vectors are tested, the Ritz vector of A used in 10 , and 1, 1, . . ., 1 T which is used as a filtering vector in 19 .The index j is set to be j m/2, where α denotes the largest integer not exceeding α.The linear systems are solved by FGMRES 20 method preconditioned by the previously mentioned preconditioners.The algorithm is unrestarted and the maximum Krylov subspace is set to be 200.For comparison reasons, the number of iterations of the ILU 0 preconditioned FGMRES method for constructing the filtering test-vector has been chosen to be 20.The algorithm is stopped whenever the relative norm b − Ax k / b is less than 10 −12 .The exact solution is generated randomly.In the following tables, iters denotes the number of iterations, error denotes the infinite norm of the difference between the final approximate solution and the exact solution.We use "×" to denote that the method fails to converge within 200 iterations.For preconditioners M ILU , M TFFD , and M TBTD , every iteration requires only one preconditioner solve, so the total preconditioner solves are equal to the iteration number.For the composite preconditioners, assuming that the ILU 0 preconditioner has the same cost with the filtering preconditioner, so the costs for composite preconditioner is twice of the iteration.All the experiments are performed in MATLAB 21 .The codes have not been optimized for the highest efficiency and therefore we do not report the time, but we outline the number of iterations.
The considered boundary value problems 4.1 are discretized on a regular Cartesian grid with a cell-centred finite volume scheme.Full up-winding is used for the convective term in the partial differential equation.The following five different cases are considered.

Case 1. The advection-diffusion problem with a rotating velocity in two dimensions.
The tensor κ is the identity, and the velocity is a 2π x 2 − 0.5 , 2π x 1 − 0.5 T .The function η is zero.The uniform grid with n × n nodes n 50, 100, 200, 300 are tested respectively.Table 1 displays the results obtained by using different preconditioners.The ILU 0 preconditioner needs the most iterations to converges.The preconditioners M TFFD and M TBTD have better performances compared with ILU 0 .The composite preconditioners are more efficient and M ITB works a little better than M ITF .When changing the filtering vector, the iteration numbers have a small change, but the Ritz vector needs additional 20 steps to calculate.Figure 1 depicts the convergence curves of FGMRES method preconditioned by several different preconditioners.The filtering vector is set to be t 1, 1, . . ., 1 T .We can see that the composite preconditioners are efficient, the FGMRES method preconditioned by M TFFD and M TBTD produces nearly the same convergence curve.Case 2. Nonhomogenous problems with large jumps in the coefficients in two dimensions.

Case 3. Skyscraper problems.
The tensor κ is isotropic and discontinuous.The domain contains many zones of high permeability which are isolated from each other.Let x denote the integer value of x.In 2D, we have and in 3D The coefficients η and a are both zero.Tables 3 and 4 display the results obtained by using different preconditioners for 2D and 3D problems.The same happens with the Skyscraper problems except that the velocity field is changed to be a 1000, 1000, 1000 T .The tested results are displayed in Tables 5 and 6.The domain is made of 10 anisotropic layers with jumps of up to four orders of magnitude and an anisotropy ratio of up to 10 3 in each layer.For 3D problem, the cube is divided in to 10 layers parallel to z 0, of size 0.1, in which the coefficients are constant.The coefficient κ x in the ith layer is given by v i , the latter being the ith component of the vector v α, β, α, β, α, β, γ, α, α , where α 1, β 10 2 and γ 10 4 .We have κ y 10κ x and κ z 1000κ x .The velocity field is zero.Numerical results are shown in Tables 7 and 8. From the tests results presented in this paper, we can see that the composite preconditioners have better performance than using just a single preconditioner.The FGMRES method preconditioned by M TBTD produces nearly the same results as by preconditioner M TFFD , and also for the composite preconditioners M ITB and M ITF .However, the preconditioner proposed in this paper has the advantage of parallel computation.For Advection-diffusion and nonhomogeneous problems, there is a little difference between using ones or Ritz vector as the filtering vector.Considering the additional costs for Ritz vector, it is reasonable to use the ones as filtering vector.

Conclusion
In this paper, we introduce a new variant of tangential filtering decomposite preconditioner M TBTD , which is based on the twisted factorization of the coefficient matrix A. The new one is comparable to the preconditioner M TFFD presented in 10 .Considering the process of preconditioning with M TBTD described in Algorithm 1, the preconditioner M TBTD is superior to M TFFD for its parallel property.And for the same reason, the composite preconditioner M ITB surpasses M ITF .

Figure 1 :
Figure 1: Convergence history for FGMRES when 1/h 100 for the advection-diffusion problem with a rotating velocity in two dimensions.

Table 1 :
Results for Case 1: the advection-diffusion problem with a rotating velocity in two dimensions.Top resuls are using t 1, 1, . . ., 1 T , bottom resuls are using Ritz vector corresponding to the smallest eigenvalue of A.

Table 2 :
Results for Case 2: Nonhomogenous problems with large jumps in the coefficients in two dimensions.Top results are using t 1, 1, . . ., 1 T as filtering vector, bottom results are using Ritz vector corresponding to the smallest eigenvalue of A.

Table 4 :
Results for Case 3: skyscraper problems.Top results are in 2D and bottom results are in 3D, t is set to be the Ritz vector corresponding to the smallest eigenvalue of A.

Table 6 :
Results for Case 4: convective skyscraper problems.Top results are in 2D, bottom results are in 3D, t is set to be the ritz vector corresponding to the smallest eigenvalue of A.

Table 8 :
Results for Case 5: anisotropic layers.Top results are in 2D, bottom results are in 3D, t is set to be the Ritz vector corresponding to the smallest eigenvalue of A.