This study develops a novel fourth-order iterative alternating decomposition explicit (IADE) method of Mitchell and Fairweather (IADEMF4) for the solution of the one-dimensional linear heat equation with Dirichlet boundary conditions. The higher-order finite difference scheme is developed by representing the spatial derivative in the heat equation with the fourth-order finite difference Crank-Nicolson approximation. This leads to the formation of pentadiagonal matrices in the systems of linear equations. The algorithm also employs the higher accuracy of the Mitchell and Fairweather variant. Despite the scheme’s higher computational complexity, experimental results show that it not only enhances the accuracy of the corresponding original second-order method (IADEMF2), but its solutions are also in very good agreement with the exact solutions. In addition, it is unconditionally stable and is proven to be convergent. The IADEMF4 is also found to be more accurate, more efficient, and to have a better rate of convergence than the benchmarked fourth-order classical iterative methods, namely, the Jacobi (JAC4), the Gauss-Seidel (GS4), and the successive over-relaxation (SOR4) methods.
1. Introduction
Numerical methods with accuracy of the order $O((\Delta h)^{n})$, $n>2$, are referred to as higher-order methods ($h$ = mesh size). Recent work increasingly favors higher-order methods for obtaining more accurate numerical solutions of problems involving partial differential equations. When higher-order methods are evaluated, factors such as rate of convergence, stability, and boundary conditions have to be considered too. There are mainly two categories of higher-order finite difference schemes.
The first category is the noncompact stencils that utilize any grid point surrounding the grid points of interest where the difference schemes are implemented. The method involves larger matrix bandwidth, but in many cases, the increase is not large [1]. There will be a probable increase in execution time as more grid points are used. However, increasing the number of grid points provides advantages in terms of enhancement in accuracy and improvement in resolution [2]. To enhance accuracy, the approach should consider proper treatment of boundary conditions of comparable accuracy.
The second category is the compact stencils that use Lax-Wendroff’s idea [3] proposed by MacKinnon and Carey [4]. The method uses a smaller number of stencils, making it computationally efficient and highly accurate. However, it requires the inversion of a matrix to obtain the spatial derivative at each point. Also, the boundary stencil has a large effect on the stability and accuracy of the scheme [5]. Usually, compact schemes of order higher than four require the formation of auxiliary equations due to boundary conditions. This results in large bandwidth matrices which are not symmetric [6]. Furthermore, the schemes can get fairly complicated for more complex equations.
Some of the current findings on higher-order methods include the work by Sulaiman et al. [7], who suggested a fourth-order quarter sweep modified successive over-relaxation (QSMSOR) iterative method for solving a one-dimensional parabolic equation. It is found to be superior in terms of rate of convergence and execution time as compared to other SOR methods. Jha [8] formulated the sixth-order accurate quarter sweep alternating group explicit (QSAGE) iterative finite difference method for solving nonlinear singular two-point boundary value problems. The method can be implemented in parallel and is found to be superior to the corresponding full sweep alternating group explicit (AGE) and SOR methods. Jin et al. [9] proposed an AGE iteration method of fourth-order accuracy by integrating the grouping explicit method with numerical boundary conditions. The method was used to solve initial-boundary value problems of convection equations. Fu and Tan [10] showed that an unconditionally stable split-step FDTD method with higher-order spatial accuracy is more accurate than the lower-order methods. The dispersion error of the proposed method is comparable with that of the higher-order ADI-FDTD method.
Higher-order methods are also studied by Mohebbi and Dehghan [11], who applied a compact finite difference approximation of fourth order and the cubic $C^{1}$ spline collocation method to some one-dimensional heat and advection-diffusion equations. The scheme has fourth-order accuracy in both space and time and is unconditionally stable. Liao [12] proposed an efficient and highly accurate fourth-order compact finite difference method to solve the one-dimensional Burgers’ equation. Tian and Ge [13] studied a stable fourth-order compact ADI method for solving two-dimensional unsteady convection diffusion problems. The method is temporally second-order and spatially fourth-order accurate, and it requires only a regular five-point 2D stencil similar to that in the standard second-order methods. Chun [14] solved some nonlinear equations by applying fourth-order iterative methods containing King’s fourth-order family. It was observed that the proposed method performs at least as well as other methods of the same order. Zhu et al. [15] presented a high-order parallel finite difference algorithm based on the domain decomposition strategy. The study used the classical explicit scheme to calculate the interface values between subdomains and the fourth-order compact schemes for the interior values. The method has high accuracy and is convergent and stable. Gao and Xie [16] devised fourth-order alternating direction implicit compact finite difference schemes for the solution of two-dimensional Schrödinger equations. The method is highly competitive as compared to other existing methods, and it achieves the expected convergence rate.
This study develops a higher-order finite difference algorithm, with noncompact stencils, that is capable of delivering highly accurate solutions for a one-dimensional heat equation with Dirichlet boundary conditions. The study focuses on modifying the unconditionally stable and convergent second-order iterative alternating decomposition explicit (IADE) method of Mitchell and Fairweather (IADEMF2). The IADEMF2, which was originally proposed by Sahimi et al. [17], employs the fractional splitting of the Mitchell and Fairweather (MF) variant that has an accuracy of the order $O((\Delta t)^{2}+(\Delta x)^{4})$. The scheme executes a two-stage process involving the solution of sets of tridiagonal equations along lines parallel to the first and second time steps, respectively. It is found to be unconditionally stable, convergent, and more accurate than the classical AGE class of methods, namely, the AGE method based on the Peaceman and Rachford variant (AGE-PR), whose spatial accuracy is of second order, and the AGE method based on the Douglas and Rachford variant (AGE-DR), whose temporal accuracy is only of first order. The detailed derivation of the IADEMF2 can be obtained from [17].
In this paper, a fourth-order Crank-Nicolson (CN) difference approximation is applied to the spatial derivative in the heat equation, and the MF variant is employed, leading to the formation of the fourth-order IADEMF (IADEMF4) numerical algorithm. The convergence of the IADEMF4 is analyzed and proven. Numerical experiments verify the potential of the IADEMF4 in enhancing the accuracy of the IADEMF2. The results of the proposed higher-order scheme are also compared with the benchmarked fourth-order classical iterative methods, such as the fourth-order Gauss-Seidel (GS4), the fourth-order Jacobi (JAC4), and the fourth-order successive over-relaxation (SOR4) methods.
This paper is organized as follows. Section 2 discusses the implementation and stability of the fourth-order CN approximation on the heat equation. In Section 3, the IADEMF4 algorithm is formulated. Section 4 analyses the convergence of the IADEMF4. Section 5 provides the equations for the benchmarked fourth-order classical iterative methods. The computational complexity of the methods considered in this paper is given in Section 6, while the pseudocode for the IADEMF4 sequential algorithm is presented in Section 7. The experiments conducted are given in Section 8. Sections 9 and 10 provide some discussion and conclusion based on the obtained numerical results.
2. Fourth-Order Crank-Nicolson Approximation to the Spatial Derivative in the Heat Equation
Consider the following one-dimensional parabolic heat equation (1), which is assumed to have been suitably nondimensionalised. It models the flow of heat in a homogeneous, unchanging medium of finite extent in the absence of a heat source,
$$\frac{\partial U}{\partial t}=\frac{\partial^{2}U}{\partial x^{2}}, \tag{1}$$
subject to given initial and Dirichlet boundary conditions
$$U(x,0)=f(x),\quad 0\le x\le 1,\qquad U(0,t)=g(t),\quad 0<t\le T,\qquad U(1,t)=h(t),\quad 0<t\le T. \tag{2}$$
For the problem in (1), the finite difference approach discretizes the time-space domain by placing a rectangular grid over the domain, with grid spacings of $\Delta t$ and $\Delta x$ in the $t$- and $x$-directions, respectively. The grid consists of the set of lines parallel to the $t$-axis given by $x_i=i\Delta x$, $i=0,1,\dots,m,m+1$, and a set of lines parallel to the $x$-axis given by $t_k=k\Delta t$, $k=0,1,\dots,n,n+1$. For simplicity, the grid spacing is taken to be uniform, so that $\Delta x=1/(m+1)$ and $\Delta t=T/(n+1)$. At a grid point $P(x_i,t_k)$ in the solution domain, the dependent variable $U(x,t)$, which represents the nondimensional temperature at time $t$ and at position $x$, is approximated by $u_i^k$.
At the grid point $P(x_i,t_{k+1/2})$, the IADEMF4 scheme replaces the spatial derivative in the heat equation with a higher-order approximation, namely the fourth-order Crank-Nicolson (CN) difference approximation [18]. This is shown in the expression given in (3), with the central difference operators defined as $\delta_x^{2}u_i^{k}=u_{i-1}^{k}-2u_i^{k}+u_{i+1}^{k}$ and $\delta_x^{2}u_i^{k+1}=u_{i-1}^{k+1}-2u_i^{k+1}+u_{i+1}^{k+1}$. The approach gives the fourth-order scheme a spatial truncation error of the order $O((\Delta x)^{4})$. It enhances the accuracy of its second-order counterpart, which has a larger error of the order $O((\Delta x)^{2})$:
$$\frac{1}{\Delta t}\left(u_i^{k+1}-u_i^{k}\right)=\frac{1}{2(\Delta x)^{2}}\left(\delta_x^{2}-\frac{1}{12}\delta_x^{4}\right)\left(u_i^{k+1}+u_i^{k}\right). \tag{3}$$
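To determine the stability of (3), the von Neumann stability analysis can be applied. Let $\lambda=\Delta t/(\Delta x)^{2}$; since $\delta_x^{4}u_i^{k}=\delta_x^{2}(\delta_x^{2}u_i^{k})$, the discretization of (3) becomes
$$u_i^{k+1}-u_i^{k}=\frac{\lambda}{2}\left(-\frac{1}{12}u_{i-2}^{k+1}+\frac{4}{3}u_{i-1}^{k+1}-\frac{5}{2}u_i^{k+1}+\frac{4}{3}u_{i+1}^{k+1}-\frac{1}{12}u_{i+2}^{k+1}-\frac{1}{12}u_{i-2}^{k}+\frac{4}{3}u_{i-1}^{k}-\frac{5}{2}u_i^{k}+\frac{4}{3}u_{i+1}^{k}-\frac{1}{12}u_{i+2}^{k}\right). \tag{4}$$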
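The discretized equation in (4) is assumed to have a solution in the form of a Fourier harmonic function; that is, $u_i^{k}=\rho^{k}e^{s\beta i\Delta x}$, where $\rho$ is referred to as the amplification factor, $\beta$ is an arbitrary constant, and $s=\sqrt{-1}$. The amplification factor represents the time dependence of the solution. If the Fourier function is substituted into (4) and the result is solved for $\rho$, then
$$\rho=\frac{1-(\lambda/6)\left((\cos\tilde{\beta}-4)^{2}-9\right)}{1+(\lambda/6)\left((\cos\tilde{\beta}-4)^{2}-9\right)},\quad\text{with }\tilde{\beta}=\beta\Delta x. \tag{5}$$
Because $-1\le\cos\tilde{\beta}\le 1$, the quantity $(\cos\tilde{\beta}-4)^{2}-9$ is nonnegative, and with $\lambda>0$ it follows that $|\rho|\le 1$. The fourth-order CN approximation is therefore unconditionally stable for any choice of $\beta$, $\Delta t$, and $\Delta x$.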
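The bound on the amplification factor in (5) can also be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `amplification_factor` and the sampled values of λ are illustrative choices and are not taken from the original study.

```python
import numpy as np

def amplification_factor(lam, beta_dx):
    """Amplification factor rho of the fourth-order CN scheme, as in (5)."""
    q = (lam / 6.0) * ((np.cos(beta_dx) - 4.0) ** 2 - 9.0)
    return (1.0 - q) / (1.0 + q)

# Sample one full period of beta * dx for several mesh ratios (illustrative values).
beta_dx = np.linspace(0.0, 2.0 * np.pi, 721)
max_abs_rho = {
    lam: float(np.max(np.abs(amplification_factor(lam, beta_dx))))
    for lam in (0.1, 0.5, 1.0, 10.0, 100.0)
}
for lam, value in max_abs_rho.items():
    print(f"lambda = {lam:6.1f}   max |rho| = {value:.6f}")
```

The largest modulus is attained at $\tilde{\beta}=0$, where $\rho=1$, so the sampled maxima do not exceed one even for mesh ratios well above the range used in the experiments.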
3. The Formulation of the IADEMF4
The IADEMF4 is firstly developed based on the execution of the unconditionally stable fourth-order CN approximation (3) on the heat equation.
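Equation (4) can also be expressed as in (6), using the definitions of the constants given in (7):
$$a u_{i-2}^{k+1}+b u_{i-1}^{k+1}+c u_i^{k+1}+d u_{i+1}^{k+1}+e u_{i+2}^{k+1}=-a u_{i-2}^{k}-b u_{i-1}^{k}+\hat{c}u_i^{k}-d u_{i+1}^{k}-e u_{i+2}^{k},\quad i=2,3,\dots,m-1, \tag{6}$$
$$a=\frac{\lambda}{24},\quad b=-\frac{2\lambda}{3},\quad c=\frac{4+5\lambda}{4},\quad d=-\frac{2\lambda}{3},\quad e=\frac{\lambda}{24},\quad \hat{c}=\frac{4-5\lambda}{4}. \tag{7}$$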
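The approximation in (6) can be displayed in the matrix form $Au=f$ given in (8), where $A$ is a sparse pentadiagonal coefficient matrix and $u=(u_2,u_3,\dots,u_{m-2},u_{m-1})^{T}$ is the column vector containing the unknown values of $u$ at the time level $k+1$. The column vector $f=(f_2,f_3,\dots,f_{m-2},f_{m-1})^{T}$ consists of boundary values and known $u$ values at the previous time level $k$. The definitions for every entry in $f$ are given in (9):
$$Au=f:\quad\begin{bmatrix} c & d & e & & & & O\\ b & c & d & e & & & \\ a & b & c & d & e & & \\ & \ddots & \ddots & \ddots & \ddots & \ddots & \\ & & a & b & c & d & e\\ & & & a & b & c & d\\ O & & & & a & b & c \end{bmatrix}_{(m-2)\times(m-2)}\begin{bmatrix} u_2\\ u_3\\ \vdots\\ u_{m-2}\\ u_{m-1}\end{bmatrix}^{k+1}=\begin{bmatrix} f_2\\ f_3\\ \vdots\\ f_{m-2}\\ f_{m-1}\end{bmatrix}, \tag{8}$$
$$\begin{aligned} f_2&=-b\left(u_1^{k}+u_1^{k+1}\right)+\hat{c}u_2^{k}-d u_3^{k}-e u_4^{k},\\ f_3&=-a\left(u_1^{k}+u_1^{k+1}\right)-b u_2^{k}+\hat{c}u_3^{k}-d u_4^{k}-e u_5^{k},\\ f_i&=-a u_{i-2}^{k}-b u_{i-1}^{k}+\hat{c}u_i^{k}-d u_{i+1}^{k}-e u_{i+2}^{k}\quad\text{for } i=4,5,\dots,m-3,\\ f_{m-2}&=-a u_{m-4}^{k}-b u_{m-3}^{k}+\hat{c}u_{m-2}^{k}-d u_{m-1}^{k}-e\left(u_m^{k}+u_m^{k+1}\right),\\ f_{m-1}&=-a u_{m-3}^{k}-b u_{m-2}^{k}+\hat{c}u_{m-1}^{k}-d\left(u_m^{k}+u_m^{k+1}\right). \end{aligned} \tag{9}$$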
The evaluations of $f_2$, $f_3$, $f_{m-2}$, and $f_{m-1}$ require the values of $u$ at the boundaries $i=1$ and $i=m$. However, these values cannot be obtained numerically because their computations involve nodes at $i=-1$ and $i=m+2$, which are exterior to the considered solution domain. If the exact solutions of $u$ are available at $i=1$ and $i=m$, then it is appropriate to consider them as the required boundary values. Otherwise, the boundary conditions have to be formulated, bearing in mind that they should be of comparable accuracy [1].
The IADEMF4 scheme secondly employs the higher-order accuracy formula of MF [19]. The variant, whose accuracy is of the order $O((\Delta t)^{2}+(\Delta x)^{4})$, is as given in (10) and (11):
$$(rI+G_1)u^{(p+1/2)}=(rI-gG_2)u^{(p)}+f, \tag{10}$$
$$(rI+G_2)u^{(p+1)}=(rI-gG_1)u^{(p+1/2)}+gf, \tag{11}$$
where $r$, $p$, and $I$ represent an acceleration parameter, the iteration index, and an identity matrix, respectively. $G_1$ and $G_2$ are two constituent matrices. The vectors $u^{(p+1)}$ and $u^{(p+1/2)}$ represent the required solution at the iteration level $(p+1)$ and at some intermediate level $(p+1/2)$, respectively. The relation between $g$ and $r$ is given by $g=(6+r)/6$, $r>0$.
Substitute the following expression, obtained from (10), into (11):
$$u^{(p+1/2)}=(rI+G_1)^{-1}\left[(rI-gG_2)u^{(p)}+f\right], \tag{12}$$
and then simplify to obtain
$$(rI+G_2)u^{(p+1)}=(rI-gG_1)(rI+G_1)^{-1}(rI-gG_2)u^{(p)}+\left[(rI-gG_1)(rI+G_1)^{-1}+gI\right]f. \tag{13}$$
As $p$ in (13) becomes sufficiently large, the temperature solution reaches a steady state; that is, as $p\to\infty$, $u^{(p+1)}\to u$ and $u^{(p)}\to u$. Simplify the equation, and then multiply by $(rI+G_1)$. After some algebraic manipulations, the following expression is obtained:
$$\left[r(G_1+G_2)(1+g)+(1-g^{2})G_1G_2\right]u=r(1+g)f. \tag{14}$$
Multiply (14) by $1/(r(1+g))$, and use the definition of $g$ to finally obtain the following form:
$$\left[(G_1+G_2)-\frac{1}{6}G_1G_2\right]u=f. \tag{15}$$
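The algebraic manipulations between the steady state of (13) and the final form (15) can be spelled out as follows. This is a sketch of the omitted steps; it relies only on the fact that $(rI+G_1)$ and $(rI-gG_1)$ commute, being polynomials in the same matrix $G_1$.

$$\begin{aligned}
(rI+G_1)(rI+G_2)u&=(rI-gG_1)(rI-gG_2)u+\left[(rI-gG_1)+g(rI+G_1)\right]f,\\
\left[r^{2}I+r(G_1+G_2)+G_1G_2\right]u&=\left[r^{2}I-rg(G_1+G_2)+g^{2}G_1G_2\right]u+r(1+g)f,\\
\left[r(1+g)(G_1+G_2)+(1-g^{2})G_1G_2\right]u&=r(1+g)f,\\
\left[(G_1+G_2)+\frac{1-g}{r}G_1G_2\right]u&=f.
\end{aligned}$$

With $g=(6+r)/6$, the coefficient $(1-g)/r$ equals $-1/6$ for every $r>0$, so the limit of the iteration does not depend on the acceleration parameter.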
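The comparison between the expression $Au=f$ in (8) and the form given in (15) suggests that the coefficient matrix $A$ for the IADEMF4 can be decomposed into
$$A=G_1+G_2-\frac{1}{6}G_1G_2. \tag{16}$$
The IADEMF4 requires the constituent matrices $G_1$ and $G_2$ in (16) to be in the form of lower and upper tridiagonal matrices, respectively, in order to retain the pentadiagonal structure of $A$. Thus,
$$G_1=\begin{bmatrix} 1 & & & & & O\\ l_1 & 1 & & & & \\ \hat{m}_1 & l_2 & 1 & & & \\ & \hat{m}_2 & \ddots & \ddots & & \\ & & \ddots & l_{m-4} & 1 & \\ O & & & \hat{m}_{m-4} & l_{m-3} & 1 \end{bmatrix}_{(m-2)\times(m-2)},\qquad G_2=\begin{bmatrix} \hat{e}_1 & \hat{u}_1 & \hat{v}_1 & & & O\\ & \hat{e}_2 & \hat{u}_2 & \hat{v}_2 & & \\ & & \ddots & \ddots & \ddots & \\ & & & \hat{e}_{m-4} & \hat{u}_{m-4} & \hat{v}_{m-4}\\ & & & & \hat{e}_{m-3} & \hat{u}_{m-3}\\ O & & & & & \hat{e}_{m-2} \end{bmatrix}_{(m-2)\times(m-2)}. \tag{17}$$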
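By substituting $G_1$ and $G_2$ in (17) into the formula for the decomposition of $A$, the entries of the resultant matrix are compared with the entries of $A$ in (8), yielding the following definitions:
$$\hat{e}_1=\frac{6(c-1)}{5},\quad \hat{u}_1=\frac{6d}{5},\quad l_1=\frac{6b}{6-\hat{e}_1},\quad \hat{e}_2=\frac{6(c-1)+l_1\hat{u}_1}{5},\quad \hat{v}_i=\frac{6e}{5}\ \text{ for } i=1,2,\dots,m-4. \tag{18}$$
And for $i=2,3,\dots,m-3$,
$$\hat{u}_i=\frac{6d+l_{i-1}\hat{v}_{i-1}}{5},\quad \hat{m}_{i-1}=\frac{6a}{6-\hat{e}_{i-1}},\quad l_i=\frac{6b+\hat{m}_{i-1}\hat{u}_{i-1}}{6-\hat{e}_i},\quad \hat{e}_{i+1}=\frac{6(c-1)+l_i\hat{u}_i+\hat{m}_{i-1}\hat{v}_{i-1}}{5}. \tag{19}$$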
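The recurrences in (18) and (19) can be checked by building $G_1$ and $G_2$ explicitly and comparing $G_1+G_2-\frac{1}{6}G_1G_2$ with the pentadiagonal matrix $A$. The sketch below assumes NumPy; the function name `iademf4_factors`, the 0-based array layout, and the values $\lambda=0.5$ and $n=m-2=8$ are illustrative assumptions, not part of the original study.

```python
import numpy as np

def iademf4_factors(lam, n):
    """Entries of G1 and G2 from (18)-(19) for n = m - 2 unknowns (0-based arrays)."""
    a = e = lam / 24.0
    b = d = -2.0 * lam / 3.0
    c = 1.0 + 5.0 * lam / 4.0
    eh = np.zeros(n)                      # diagonal of G2 (e-hat)
    uh = np.zeros(n - 1)                  # first superdiagonal of G2 (u-hat)
    vh = np.full(n - 2, 6.0 * e / 5.0)    # second superdiagonal of G2 (v-hat)
    l = np.zeros(n - 1)                   # first subdiagonal of G1
    mh = np.zeros(n - 2)                  # second subdiagonal of G1 (m-hat)
    eh[0] = 6.0 * (c - 1.0) / 5.0
    uh[0] = 6.0 * d / 5.0
    for j in range(1, n):
        mu = mv = 0.0                     # the m-hat terms are absent in the second row
        if j >= 2:
            mh[j - 2] = 6.0 * a / (6.0 - eh[j - 2])
            mu = mh[j - 2] * uh[j - 2]
            mv = mh[j - 2] * vh[j - 2]
        l[j - 1] = (6.0 * b + mu) / (6.0 - eh[j - 1])
        eh[j] = (6.0 * (c - 1.0) + l[j - 1] * uh[j - 1] + mv) / 5.0
        if j <= n - 2:
            uh[j] = (6.0 * d + l[j - 1] * vh[j - 1]) / 5.0
    return mh, l, eh, uh, vh

lam, n = 0.5, 8
mh, l, eh, uh, vh = iademf4_factors(lam, n)
G1 = np.eye(n) + np.diag(l, -1) + np.diag(mh, -2)
G2 = np.diag(eh) + np.diag(uh, 1) + np.diag(vh, 2)
A = ((1.0 + 5.0 * lam / 4.0) * np.eye(n)
     + np.diag(np.full(n - 1, -2.0 * lam / 3.0), 1)
     + np.diag(np.full(n - 1, -2.0 * lam / 3.0), -1)
     + np.diag(np.full(n - 2, lam / 24.0), 2)
     + np.diag(np.full(n - 2, lam / 24.0), -2))
residual = float(np.max(np.abs(G1 + G2 - G1 @ G2 / 6.0 - A)))
print("max |G1 + G2 - G1 G2 / 6 - A| =", residual)
```

Dense matrices are used here only to keep the check short; an actual implementation would store the five coefficient arrays alone.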
Since $G_1$ and $G_2$ are three-banded matrices, $(rI+G_1)$ and $(rI+G_2)$ can be inverted easily. The equations in (10) and (11) are rearranged as in (20) and (21), respectively:
$$u^{(p+1/2)}=(rI+G_1)^{-1}(rI-gG_2)u^{(p)}+(rI+G_1)^{-1}f, \tag{20}$$
$$u^{(p+1)}=(rI+G_2)^{-1}(rI-gG_1)u^{(p+1/2)}+g(rI+G_2)^{-1}f. \tag{21}$$
The above two equations are computed and simplified, leading to the computational formulae at each of the half iteration levels as given in (22) and (23).
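At the $(p+1/2)$ iteration level,
$$u_i^{(p+1/2)}=\frac{1}{R}\left(E_{i-1}u_i^{(p)}+W_{i-1}u_{i+1}^{(p)}+V_{i-1}u_{i+2}^{(p)}-\hat{m}_{i-3}u_{i-2}^{(p+1/2)}-l_{i-2}u_{i-1}^{(p+1/2)}+f_i\right),\quad i=2,3,\dots,m-2,m-1. \tag{22}$$
At the $(p+1)$ iteration level,
$$u_i^{(p+1)}=\frac{1}{Z_{i-1}}\left(S_{i-3}u_{i-2}^{(p+1/2)}+Q_{i-2}u_{i-1}^{(p+1/2)}+Pu_i^{(p+1/2)}-\hat{u}_{i-1}u_{i+1}^{(p+1)}-\hat{v}_{i-1}u_{i+2}^{(p+1)}+gf_i\right),\quad i=m-1,m-2,\dots,3,2, \tag{23}$$
with
$$\begin{aligned} &\hat{m}_{-1}=\hat{m}_0=l_0=V_{m-2}=V_{m-3}=W_{m-2}=\hat{u}_{m-2}=\hat{v}_{m-2}=\hat{v}_{m-3}=Q_0=S_{-1}=S_0=0,\\ &R=1+r,\quad P=r-g,\\ &E_i=r-g\hat{e}_i,\quad Z_i=r+\hat{e}_i,\quad i=1,2,\dots,m-2,\\ &W_i=-g\hat{u}_i,\quad Q_i=-gl_i,\quad i=1,2,\dots,m-3,\\ &V_i=-g\hat{v}_i,\quad S_i=-g\hat{m}_i,\quad i=1,2,\dots,m-4. \end{aligned} \tag{24}$$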
The IADEMF4 algorithm is regarded as a two-stage process involving two iteration levels, $(p+1/2)$ and $(p+1)$. It is completed explicitly by using the required equations at the two levels in alternate sweeps along all the grid points in the interval $(0,1)$ until convergence is reached. In (22), the calculation to determine the unknown $u_i^{(p+1/2)}$ begins at the left boundary and then moves to the right. In a similar manner, $u_i^{(p+1)}$ in (23) is calculated by proceeding from the right boundary towards the left (Figure 1).
Figure 1: The two-stage IADEMF4 algorithm. The directions of the sweeps at the $(p+1/2)$ and $(p+1)$ iteration levels.
The computational molecules are depicted in Figures 2 and 3.
Figure 2: Computational molecule of the IADEMF4 at the $(p+1/2)$ iteration level.
Figure 3: Computational molecule of the IADEMF4 at the $(p+1)$ iteration level.
At each level of iteration, the computational molecules involve two known grid points at the new level and another three known ones at the old level. Clearly, the method is explicit.
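The two sweeps in (22) and (23) can be sketched for a single linear system $Au=f$ as follows. This is a minimal illustration, assuming NumPy, 0-based arrays, a zero initial guess, and the stopping rule $\|u^{(p+1)}-u^{(p)}\|_\infty\le$ `tol`; the names `iademf4_factors` and `iademf4_solve`, the right-hand side, and the values of $\lambda$, $r$, and the problem size are hypothetical choices, and the code is not the authors' implementation.

```python
import numpy as np

def iademf4_factors(lam, n):
    """Entries of G1 and G2 from (18)-(19) for n = m - 2 unknowns (0-based arrays)."""
    a = e = lam / 24.0
    b = d = -2.0 * lam / 3.0
    c = 1.0 + 5.0 * lam / 4.0
    eh, uh, l = np.zeros(n), np.zeros(n - 1), np.zeros(n - 1)
    vh, mh = np.full(n - 2, 6.0 * e / 5.0), np.zeros(n - 2)
    eh[0], uh[0] = 6.0 * (c - 1.0) / 5.0, 6.0 * d / 5.0
    for j in range(1, n):
        mu = mv = 0.0
        if j >= 2:
            mh[j - 2] = 6.0 * a / (6.0 - eh[j - 2])
            mu, mv = mh[j - 2] * uh[j - 2], mh[j - 2] * vh[j - 2]
        l[j - 1] = (6.0 * b + mu) / (6.0 - eh[j - 1])
        eh[j] = (6.0 * (c - 1.0) + l[j - 1] * uh[j - 1] + mv) / 5.0
        if j <= n - 2:
            uh[j] = (6.0 * d + l[j - 1] * vh[j - 1]) / 5.0
    return mh, l, eh, uh, vh

def iademf4_solve(lam, f, r, tol=1e-12, max_iter=500):
    """Alternate the sweeps (22) and (23) until successive iterates agree to tol."""
    n = len(f)
    mh, l, eh, uh, vh = iademf4_factors(lam, n)
    g = (6.0 + r) / 6.0
    u = np.zeros(n)
    for it in range(1, max_iter + 1):
        half = np.zeros(n)
        for j in range(n):                       # (22): sweep from left to right
            s = (r - g * eh[j]) * u[j] + f[j]
            if j + 1 < n:
                s -= g * uh[j] * u[j + 1]
            if j + 2 < n:
                s -= g * vh[j] * u[j + 2]
            if j >= 1:
                s -= l[j - 1] * half[j - 1]
            if j >= 2:
                s -= mh[j - 2] * half[j - 2]
            half[j] = s / (r + 1.0)
        new = np.zeros(n)
        for j in range(n - 1, -1, -1):           # (23): sweep from right to left
            s = (r - g) * half[j] + g * f[j]
            if j >= 1:
                s -= g * l[j - 1] * half[j - 1]
            if j >= 2:
                s -= g * mh[j - 2] * half[j - 2]
            if j + 1 < n:
                s -= uh[j] * new[j + 1]
            if j + 2 < n:
                s -= vh[j] * new[j + 2]
            new[j] = s / (r + eh[j])
        if np.max(np.abs(new - u)) <= tol:
            return new, it
        u = new
    return u, max_iter

lam, n = 1.0, 12
f = np.sin(np.linspace(0.1, 1.0, n))             # arbitrary right-hand side
A = ((1.0 + 5.0 * lam / 4.0) * np.eye(n)
     + np.diag(np.full(n - 1, -2.0 * lam / 3.0), 1)
     + np.diag(np.full(n - 1, -2.0 * lam / 3.0), -1)
     + np.diag(np.full(n - 2, lam / 24.0), 2)
     + np.diag(np.full(n - 2, lam / 24.0), -2))
u, iterations = iademf4_solve(lam, f, r=1.0)
error = float(np.max(np.abs(u - np.linalg.solve(A, f))))
print("iterations:", iterations, " max difference from direct solve:", error)
```

Each unknown is obtained from already available values only, which is the sense in which the method is explicit; the direct solve appears here purely as a reference for the check.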
4. Convergence Analysis of the IADEMF4
This section proves the convergence of the IADEMF4. Since λ>1 will not guarantee an accurate approximation for ∂U/∂t [18], the values of λ that are considered appropriate for the proof are 0<λ≤1.
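From the definitions in (18) and (19), the following results are obtained:
$$\begin{aligned} &\hat{e}_1=\frac{6(c-1)}{5}=\frac{3\lambda}{2},\ \text{implying that } 0<\hat{e}_1\le\frac{3}{2},\\ &\hat{u}_1=\frac{6d}{5}<0,\qquad l_1=\frac{6b}{6-\hat{e}_1}<0,\qquad \hat{v}_i=\frac{6e}{5}>0\ \text{ for } i=1,2,\dots,m-4,\\ &\hat{u}_2=\frac{6d+l_1\hat{v}_1}{5}=\hat{u}_1+\frac{l_1\hat{v}_1}{5}<\hat{u}_1,\ \text{since } l_1\hat{v}_1<0,\\ &\hat{e}_2=\frac{6(c-1)+l_1\hat{u}_1}{5}=\hat{e}_1+\frac{l_1\hat{u}_1}{5}>\hat{e}_1,\ \text{since } l_1\hat{u}_1>0. \end{aligned} \tag{25}$$
Simple computation shows that $l_1\hat{u}_1/5<1$. Clearly, $\hat{e}_2<6$.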
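Lemma 1.
$\hat{e}_i$ in (18) and (19) is such that $0<\hat{e}_i<6$ for all $i=1,2,3,\dots,m-2$.
Proof.
The results in (25) show that $6>\hat{e}_2>\hat{e}_1>0$. Assume it is true that $6>\hat{e}_k>\hat{e}_{k-1}>0$ for all $k=3,4,5,\dots,m-3$. Since the assumption implies that $6>\hat{e}_{k-1}>\hat{e}_{k-2}>0$, then $\hat{m}_{k-1}=6a/(6-\hat{e}_{k-1})>\hat{m}_{k-2}=6a/(6-\hat{e}_{k-2})>0$. Therefore, $\hat{m}_{k-1}\hat{v}_{k-1}>\hat{m}_{k-2}\hat{v}_{k-2}$.
It has also been shown that $\hat{u}_2<\hat{u}_1<0$. Assume it is true that $\hat{u}_j<\hat{u}_{j-1}<0$ for all $j=3,4,\dots,m-4$. The assumption implies that $\hat{u}_{j-1}<\hat{u}_{j-2}<0$. Thus, $\hat{m}_{j-1}\hat{u}_{j-1}<\hat{m}_{j-2}\hat{u}_{j-2}<0$, and
$$\hat{u}_{j+1}=\frac{6d+l_j\hat{v}_j}{5}=\hat{u}_1+\frac{l_j\hat{v}_j}{5}=\hat{u}_1+\frac{(6b+\hat{m}_{j-1}\hat{u}_{j-1})\hat{v}_j}{5(6-\hat{e}_j)}<\hat{u}_1+\frac{(6b+\hat{m}_{j-2}\hat{u}_{j-2})\hat{v}_{j-1}}{5(6-\hat{e}_{j-1})}=\hat{u}_1+\frac{l_{j-1}\hat{v}_{j-1}}{5}=\hat{u}_j<0. \tag{26}$$
Since it is true that, for $j+1$, $\hat{u}_{j+1}<\hat{u}_j<0$, then, by induction, the assumption $\hat{u}_j<\hat{u}_{j-1}<0$ is true for all $j$. An equivalent to the preceding statement would be $\hat{u}_k<\hat{u}_{k-1}<0$ for all $k=3,4,5,\dots,m-3$. It follows that
$$\begin{aligned} l_k\hat{u}_k&=\frac{(6b+\hat{m}_{k-1}\hat{u}_{k-1})\hat{u}_k}{6-\hat{e}_k}>\frac{(6b+\hat{m}_{k-2}\hat{u}_{k-2})\hat{u}_{k-1}}{6-\hat{e}_{k-1}}=l_{k-1}\hat{u}_{k-1},\\ \hat{e}_{k+1}&=\frac{6(c-1)+l_k\hat{u}_k+\hat{m}_{k-1}\hat{v}_{k-1}}{5}=\hat{e}_1+\frac{l_k\hat{u}_k+\hat{m}_{k-1}\hat{v}_{k-1}}{5}>\hat{e}_1+\frac{l_{k-1}\hat{u}_{k-1}+\hat{m}_{k-2}\hat{v}_{k-2}}{5}=\hat{e}_k>0. \end{aligned} \tag{27}$$
Since it is true that, for $k+1$, $\hat{e}_{k+1}>\hat{e}_k>0$, then, by induction, $\hat{e}_k>\hat{e}_{k-1}>0$ is also true for all $k$.
Suppose there is a $k$ such that $\hat{e}_k>6$. Then,
$$\hat{e}_{k+1}=\hat{e}_1+\frac{l_k\hat{u}_k+\hat{m}_{k-1}\hat{v}_{k-1}}{5}=\hat{e}_1+\frac{6b\hat{u}_k}{5(6-\hat{e}_k)}+\frac{\hat{m}_{k-1}}{5}\left(\frac{\hat{u}_{k-1}\hat{u}_k}{6-\hat{e}_k}+\hat{v}_{k-1}\right)<\hat{e}_1. \tag{28}$$
This is a contradiction since $\hat{e}_{k+1}>\hat{e}_1$. This verifies that $6>\hat{e}_k>\hat{e}_{k-1}>0$, $k=3,4,5,\dots,m-3$. Let $i=2,3,4,\dots,m-2$; then $6>\hat{e}_i>\hat{e}_{i-1}>0$. Therefore, $0<\hat{e}_i<6$, $i=1,2,3,\dots,m-2$. Lemma 1 is proved.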
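Lemma 2.
If $r>0$ and $g=(6+r)/6$, then $\max_i\left|(r-g\hat{e}_i)/(r+\hat{e}_i)\right|<1$, $i=1,2,3,\dots,m-2$.
Proof.
Let $\max_i\left|(r-g\hat{e}_i)/(r+\hat{e}_i)\right|=\left|(r-g\hat{e}_j)/(r+\hat{e}_j)\right|$ for some $j$.
Assume $\left|(r-g\hat{e}_j)/(r+\hat{e}_j)\right|\ge 1$. Since $r>0$ and $\hat{e}_j>0$, then $r+\hat{e}_j>0$.
If $(r-g\hat{e}_j)/(r+\hat{e}_j)\ge 1$, then $-((6+r)/6)\hat{e}_j\ge\hat{e}_j$, which implies that $r\le-12$. This contradicts the fact that $r>0$.
If $(r-g\hat{e}_j)/(r+\hat{e}_j)\le-1$, then $2r-((6+r)/6)\hat{e}_j\le-\hat{e}_j$, which implies that $\hat{e}_j\ge 12$. This contradicts Lemma 1.
So, the assumption that $\left|(r-g\hat{e}_j)/(r+\hat{e}_j)\right|\ge 1$ is false. Hence, $\max_i\left|(r-g\hat{e}_i)/(r+\hat{e}_i)\right|=\left|(r-g\hat{e}_j)/(r+\hat{e}_j)\right|<1$.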
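Proposition 3.
$\|(rI-gG_1)(rI+G_1)^{-1}\|_2<1$.
Proof.
Let $F=(rI-gG_1)(rI+G_1)^{-1}$; that is,
$$F=\begin{bmatrix} \dfrac{r-g}{d} & & & O\\ \vdots & \dfrac{r-g}{d} & & \\ \vdots & & \ddots & \\ \cdots & \cdots & \cdots & \dfrac{r-g}{d} \end{bmatrix}. \tag{29}$$
$F$ is a lower triangular matrix, with all the diagonal entries (whence the eigenvalues of $F$) equal to $(r-g)/d$, where $d=r+1$. Denote all the eigenvalues of $F$ by $\lambda_F$. If $\rho[F]$ is defined as the spectral radius of $F$, then
$$\rho[F]=\max|\lambda_F|=|\lambda_F|=\left|\frac{r-g}{r+1}\right|\le\|F\|_2. \tag{30}$$
But, by definition of the 2-norm,
$$\|F\|_2=\max_{\|x\|_2\ne 0}\frac{\|Fx\|_2}{\|x\|_2}=\max_{\|y\|_2=1}\frac{\|Fy\|_2}{\|y\|_2}=\max_{\|y\|_2=1}\|Fy\|_2\le\max_{\|w\|_2=1}\|Fw\|_2=\max_{\|w\|_2=1}\|\lambda_Fw\|_2. \tag{31}$$
Since all the eigenvalues of $F$ are equal, then
$$\|F\|_2\le|\lambda_F|\max_{\|w\|_2=1}\|w\|_2=|\lambda_F|=\rho[F]. \tag{32}$$
Thus, from (30) and (32), $\|F\|_2=|\lambda_F|=|(r-g)/(r+1)|$. By Lemma 2 with $\hat{e}_j$ replaced by 1, $|(r-g)/(r+1)|<1$ is obtained, leading to the result of Proposition 3, which is $\|F\|_2=\|(rI-gG_1)(rI+G_1)^{-1}\|_2<1$.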
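Proposition 4.
$\|(rI-gG_2)(rI+G_2)^{-1}\|_2<1$.
Proof.
Let $K=(rI-gG_2)(rI+G_2)^{-1}$; that is,
$$K=\begin{bmatrix} \dfrac{r-g\hat{e}_1}{z_1} & \cdots & \cdots & \cdots\\ & \dfrac{r-g\hat{e}_2}{z_2} & \cdots & \vdots\\ & & \ddots & \vdots\\ O & & & \dfrac{r-g\hat{e}_{m-2}}{z_{m-2}} \end{bmatrix}. \tag{33}$$
$K$ is an upper triangular matrix, with the diagonal entries equal to $(r-g\hat{e}_i)/z_i$, $i=1,2,\dots,m-2$, where $z_i=r+\hat{e}_i$. Since all the eigenvalues of $K$ are distinct, $K$ is similar to a diagonal matrix $D$.
By the Schur triangularization theorem [20], there is an orthogonal matrix $O$ such that $O^{T}KO=D$. The diagonal entries of $D$ are the eigenvalues of $K$, so that
$$\|K\|_2=\|O^{T}KO\|_2=\|D\|_2=\sqrt{\text{maximum eigenvalue of }D^{T}D}=\sqrt{\text{maximum eigenvalue of }D^{2}}=\sqrt{\rho(D^{2})}=\sqrt{\rho(K^{2})}=\rho(K). \tag{34}$$
By Lemma 2,
$$\|K\|_2=\rho(K)=\max_i\left|\frac{r-g\hat{e}_i}{r+\hat{e}_i}\right|=\left|\frac{r-g\hat{e}_j}{r+\hat{e}_j}\right|<1, \tag{35}$$
for some $j$. Hence, this proves Proposition 4.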
From (20) and (21),
$$u^{(p+1)}=(rI+G_2)^{-1}(rI-gG_1)\left[(rI+G_1)^{-1}(rI-gG_2)u^{(p)}+(rI+G_1)^{-1}f\right]+g(rI+G_2)^{-1}f. \tag{36}$$
Let
$$M(r)=(rI+G_2)^{-1}(rI-gG_1)(rI+G_1)^{-1}(rI-gG_2), \tag{37}$$
and let
$$q(r)=\left[(rI+G_2)^{-1}(rI-gG_1)(rI+G_1)^{-1}+g(rI+G_2)^{-1}\right]f. \tag{38}$$
Then, $u^{(p+1)}=M(r)u^{(p)}+q(r)$.
Theorem 5.
The IADEMF4 is convergent if $\rho[M(r)]<1$, for $r>0$.
Proof.
Define $\tilde{M}(r)=(rI+G_2)M(r)(rI+G_2)^{-1}$; then
$$\tilde{M}(r)=(rI-gG_1)(rI+G_1)^{-1}(rI-gG_2)(rI+G_2)^{-1}. \tag{39}$$
Thus, by similarity, $M(r)$ and $\tilde{M}(r)$ have the same set of eigenvalues. Therefore,
$$\rho[M(r)]=\rho[\tilde{M}(r)]\le\|\tilde{M}(r)\|_2=\|(rI-gG_1)(rI+G_1)^{-1}(rI-gG_2)(rI+G_2)^{-1}\|_2\le\|(rI-gG_1)(rI+G_1)^{-1}\|_2\,\|(rI-gG_2)(rI+G_2)^{-1}\|_2<1. \tag{40}$$
The last inequality is due to Propositions 3 and 4. The proven Theorem 5 assures the convergence of the IADEMF4.
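The statement of Theorem 5 can be illustrated numerically by assembling $M(r)$ from (37) for a small problem and computing its spectral radius. The sketch below assumes NumPy and dense matrices; the names `iademf4_factors` and `spectral_radius`, the size $n=m-2=20$, and the sampled values of $\lambda$ and $r$ are illustrative assumptions. It is a spot check for a few parameter values, not a substitute for the proof.

```python
import numpy as np

def iademf4_factors(lam, n):
    """Entries of G1 and G2 from (18)-(19) for n = m - 2 unknowns (0-based arrays)."""
    a = e = lam / 24.0
    b = d = -2.0 * lam / 3.0
    c = 1.0 + 5.0 * lam / 4.0
    eh, uh, l = np.zeros(n), np.zeros(n - 1), np.zeros(n - 1)
    vh, mh = np.full(n - 2, 6.0 * e / 5.0), np.zeros(n - 2)
    eh[0], uh[0] = 6.0 * (c - 1.0) / 5.0, 6.0 * d / 5.0
    for j in range(1, n):
        mu = mv = 0.0
        if j >= 2:
            mh[j - 2] = 6.0 * a / (6.0 - eh[j - 2])
            mu, mv = mh[j - 2] * uh[j - 2], mh[j - 2] * vh[j - 2]
        l[j - 1] = (6.0 * b + mu) / (6.0 - eh[j - 1])
        eh[j] = (6.0 * (c - 1.0) + l[j - 1] * uh[j - 1] + mv) / 5.0
        if j <= n - 2:
            uh[j] = (6.0 * d + l[j - 1] * vh[j - 1]) / 5.0
    return mh, l, eh, uh, vh

def spectral_radius(lam, n, r):
    """Spectral radius of the iteration matrix M(r) defined in (37)."""
    mh, l, eh, uh, vh = iademf4_factors(lam, n)
    G1 = np.eye(n) + np.diag(l, -1) + np.diag(mh, -2)
    G2 = np.diag(eh) + np.diag(uh, 1) + np.diag(vh, 2)
    I = np.eye(n)
    g = (6.0 + r) / 6.0
    inner = np.linalg.solve(r * I + G1, r * I - g * G2)        # (rI+G1)^{-1}(rI-gG2)
    M = np.linalg.solve(r * I + G2, (r * I - g * G1) @ inner)  # M(r)
    return float(np.max(np.abs(np.linalg.eigvals(M))))

radii = {(lam, r): spectral_radius(lam, 20, r)
         for lam in (0.5, 1.0) for r in (0.5, 0.7, 1.0)}
for (lam, r), rho in radii.items():
    print(f"lambda = {lam:.1f}, r = {r:.1f}: spectral radius = {rho:.4f}")
```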
5. The Fourth-Order Classical Iterative Methods
The system of linear equations in (8) may also be solved by using the classical iterative methods of the fourth order. They include the JAC4, the GS4, and the SOR4 methods. These iterative methods are capable of exploiting the sparse structure of the pentadiagonal matrix.
The JAC4 algorithm can be represented by
$$u_i^{(p+1)}=\frac{f_i-au_{i-2}^{(p)}-bu_{i-1}^{(p)}-du_{i+1}^{(p)}-eu_{i+2}^{(p)}}{c},\quad i=2,3,\dots,m-1. \tag{41}$$
The approximation of $u_i^{(p+1)}$ at the $(p+1)$th iteration level is computed using the relevant values in the $p$th iteration.
The GS4 method uses the most recent values of $u_{i-2}^{(p+1)}$ and $u_{i-1}^{(p+1)}$ to update the approximation value of $u_i^{(p+1)}$. The algorithm for the GS4 is as expressed in (42):
$$u_i^{(p+1)}=\frac{f_i-au_{i-2}^{(p+1)}-bu_{i-1}^{(p+1)}-du_{i+1}^{(p)}-eu_{i+2}^{(p)}}{c},\quad i=2,3,\dots,m-1. \tag{42}$$
The SOR4 iterative method accelerates the convergence rate of the GS4. If $\omega$ is a relaxation parameter, then for any $\omega\ne 0$, (42) can be rewritten as
$$u_i^{(p+1)}=(1-\omega)u_i^{(p)}+\omega\,\frac{f_i-au_{i-2}^{(p+1)}-bu_{i-1}^{(p+1)}-du_{i+1}^{(p)}-eu_{i+2}^{(p)}}{c},\quad i=2,3,\dots,m-1. \tag{43}$$
The SOR4 algorithm reduces to the GS4 if ω=1. Except for very special cases, it is difficult to obtain the analytic expression for ω. According to Young [21], the precise determination of the optimal ω in the SOR is only known for a small class of matrices. The value of ω generally lies in the range of 1<ω<2.
The computational molecules for the JAC4 and the GS4 are as illustrated in Figures 4 and 5, respectively. The SOR4 has the same form as the GS4.
Figure 4: Computational molecule of the JAC4.
Figure 5: Computational molecule of the GS4/SOR4.
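The three classical sweeps in (41) to (43) can be sketched as follows for a single system $Au=f$. The sketch assumes NumPy, a zero initial guess, boundary contributions already absorbed into $f$ as in (9), and the stopping rule $\|u^{(p+1)}-u^{(p)}\|_\infty\le$ `tol`; the function names `jac4` and `sor4`, the right-hand side, and the chosen $\lambda$ and $\omega$ are hypothetical.

```python
import numpy as np

def _coefficients(lam):
    """Constants a, b, c, d, e from (7)."""
    return lam / 24.0, -2.0 * lam / 3.0, 1.0 + 5.0 * lam / 4.0, -2.0 * lam / 3.0, lam / 24.0

def jac4(f, lam, tol=1e-12, max_iter=2000):
    """JAC4 iteration (41); every update uses values of the previous iterate only."""
    a, b, c, d, e = _coefficients(lam)
    n = len(f)
    u = np.zeros(n + 4)                  # two zero ghost cells on each side
    for it in range(1, max_iter + 1):
        new = np.zeros(n + 4)
        new[2:-2] = (f - a * u[:-4] - b * u[1:-3] - d * u[3:-1] - e * u[4:]) / c
        if np.max(np.abs(new - u)) <= tol:
            return new[2:-2], it
        u = new
    return u[2:-2], max_iter

def sor4(f, lam, omega=1.0, tol=1e-12, max_iter=2000):
    """SOR4 iteration (43); omega = 1 reduces it to the GS4 iteration (42)."""
    a, b, c, d, e = _coefficients(lam)
    n = len(f)
    u = np.zeros(n + 4)
    for it in range(1, max_iter + 1):
        old = u.copy()
        for i in range(2, n + 2):        # in-place update: u[i-2], u[i-1] are already new
            gs = (f[i - 2] - a * u[i - 2] - b * u[i - 1] - d * u[i + 1] - e * u[i + 2]) / c
            u[i] = (1.0 - omega) * u[i] + omega * gs
        if np.max(np.abs(u - old)) <= tol:
            return u[2:-2].copy(), it
    return u[2:-2].copy(), max_iter

lam, n = 1.0, 12
f = np.cos(np.linspace(0.0, 1.0, n))     # arbitrary right-hand side
a, b, c, d, e = _coefficients(lam)
A = (c * np.eye(n) + np.diag(np.full(n - 1, d), 1) + np.diag(np.full(n - 1, b), -1)
     + np.diag(np.full(n - 2, e), 2) + np.diag(np.full(n - 2, a), -2))
reference = np.linalg.solve(A, f)
results = {"JAC4": jac4(f, lam), "GS4": sor4(f, lam, omega=1.0), "SOR4": sor4(f, lam, omega=1.1)}
errors = {name: float(np.max(np.abs(sol - reference))) for name, (sol, _) in results.items()}
for name, (_, its) in results.items():
    print(f"{name}: {its} iterations, max difference from direct solve = {errors[name]:.2e}")
```

For $0<\lambda\le 1$ the matrix $A$ is symmetric and strictly diagonally dominant with a positive diagonal, so all three iterations converge for this test system; the value $\omega=1.1$ is an arbitrary choice inside the range $1<\omega<2$ and is not claimed to be optimal.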
6. Computational Complexity
The cost of implementing an algorithm can be assessed by examining its computational complexity. The number of arithmetic operations that the algorithm has to perform, such as additions (including subtractions) and multiplications (including divisions), can be counted straightforwardly. Table 1 gives the number of sequential arithmetic operations per iteration required by each algorithm.
Table 1: Sequential arithmetic operations per iteration ($m$: problem size).

| Method | Number of additions | Number of multiplications | Total operation count |
|---|---|---|---|
| IADEMF4 | 10(m-2) | 13(m-2) | 23(m-2) |
| IADEMF2 | 6m | 9m | 15m |
| JAC4/GS4 | 4(m-2) | 5(m-2) | 9(m-2) |
| SOR4 | 5(m-2) | 7(m-2) | 12(m-2) |
For higher-order schemes, there is usually a trade-off between accuracy and speed. The computational complexity has an effect on the efficiency of a particular scheme. Thus, this factor will be taken into account when discussing the results of the numerical experiments conducted in this study.
7. The Pseudocode of the IADEMF4 Sequential Algorithm
Algorithm 1 illustrates the pseudocode of the IADEMF4 sequential algorithm. This algorithm can also be generalized and applied to other methods discussed in this paper.
8. Numerical Experiments
Two experiments were conducted to test the sequential numerical performance of the proposed IADEMF4 method against those of the benchmarked classical iterative methods, namely, the JAC4, the GS4, and the SOR4. Comparison is also made with the corresponding IADE method of the second order.
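Experiment 1.
This problem was taken from Saulev [22]:
$$\frac{\partial U}{\partial t}=\frac{\partial^{2}U}{\partial x^{2}},\quad 0\le x\le 1, \tag{44}$$
subject to the initial condition
$$U(x,0)=4x(1-x),\quad 0\le x\le 1, \tag{45}$$
and the boundary conditions
$$U(0,t)=U(1,t)=0,\quad t\ge 0. \tag{46}$$
The exact solution to the given problem is given by
$$U(x,t)=\frac{32}{\pi^{3}}\sum_{k=1,(2)}^{\infty}\frac{1}{k^{3}}e^{-\pi^{2}k^{2}t}\sin(k\pi x), \tag{47}$$
where the notation $k=1,(2)$ indicates that $k$ increases in steps of 2, so that only odd values of $k$ contribute.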
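Experiment 2.
This problem was taken from Johnson and Reiss [23]:
$$\frac{\partial U}{\partial t}=\frac{\partial^{2}U}{\partial x^{2}},\quad 0\le x\le 1, \tag{48}$$
subject to the initial condition
$$U(x,0)=\sin(\pi x)\left(1+6\cos(\pi x)\right),\quad 0\le x\le 1, \tag{49}$$
and the boundary conditions
$$U(0,t)=U(1,t)=0,\quad t\ge 0. \tag{50}$$
The exact solution to the given problem is given by
$$U(x,t)=\sin(\pi x)e^{-\pi^{2}t}+3\sin(2\pi x)e^{-4\pi^{2}t}. \tag{51}$$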
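The exact solutions of the two experiments are straightforward to evaluate, which is useful when error measures are computed. The sketch below assumes NumPy; the function names and the truncation of the series in Experiment 1 to 200 odd terms are illustrative assumptions. It also checks that both formulas reproduce their initial conditions at $t=0$.

```python
import numpy as np

def exact_experiment1(x, t, terms=200):
    """Truncated Fourier series of Experiment 1; only odd k = 1, 3, 5, ... contribute."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    k = np.arange(1, 2 * terms, 2, dtype=float)[:, None]
    series = np.exp(-(np.pi * k) ** 2 * t) * np.sin(k * np.pi * x) / k ** 3
    return 32.0 / np.pi ** 3 * series.sum(axis=0)

def exact_experiment2(x, t):
    """Closed-form solution of Experiment 2."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return (np.sin(np.pi * x) * np.exp(-np.pi ** 2 * t)
            + 3.0 * np.sin(2.0 * np.pi * x) * np.exp(-4.0 * np.pi ** 2 * t))

x = np.linspace(0.0, 1.0, 11)
gap1 = float(np.max(np.abs(exact_experiment1(x, 0.0) - 4.0 * x * (1.0 - x))))
gap2 = float(np.max(np.abs(exact_experiment2(x, 0.0)
                           - np.sin(np.pi * x) * (1.0 + 6.0 * np.cos(np.pi * x)))))
print("Experiment 1, mismatch with the initial condition at t = 0:", gap1)
print("Experiment 2, mismatch with the initial condition at t = 0:", gap2)
```

The small residual in Experiment 1 comes from truncating the series; for $t>0$ the exponential factors make the series converge much faster.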
9. Results and Discussion
In each experiment, the exact solutions at $i=1$ and $i=m$ were taken as boundary values for the IADEMF4 and the other fourth-order classical iterative methods. The convergence criterion used in the testing of each method was taken as $\|u^{(p+1)}-u^{(p)}\|_\infty\le\varepsilon$, where $\varepsilon$ is the convergence tolerance. The optimum values of $r$ or $\omega$ were determined by experiment. As for the IADEMF2, the CN scheme with $\theta=1/2$ was employed.
Figures 6, 7, 8, and 9 visualize the behavior of the one-dimensional parabolic heat solutions for both experiments. By using $m=10$ and a tolerance requirement of $\varepsilon=10^{-4}$, two different mesh sizes, $\lambda=0.5$ and $\lambda=1.0$, were considered in each experiment. The exact solutions are compared with the IADEMF4 and the IADEMF2 numerical solutions. The optimum value of $r$ chosen for each method, as well as the corresponding number of iterations, $n$, is stated in the legend of each figure. Every figure reveals that the IADEMF4 is more accurate than the IADEMF2 and that the former converges with fewer iterations than the latter. The numerical solutions of the IADEMF4 appear to be in very good agreement with the exact solutions. For example, in Figure 7, at $x=0.5$, the difference between the exact solution and the numerical solution is about 3.6% for the IADEMF2 and 0% for the IADEMF4. These results imply that the accuracy and convergence rate of the second-order IADE method are enhanced by the implementation of the fourth-order CN approximation that leads to the formation of the corresponding fourth-order IADE scheme.
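Figure 6: Numerical and exact solutions for Experiment 1. $\lambda=0.5$, $\Delta x=0.1$, $\Delta t=0.005$, $t=0.25$, and $\varepsilon=10^{-4}$.
Figure 7: Numerical and exact solutions for Experiment 1. $\lambda=1.0$, $\Delta x=0.1$, $\Delta t=0.01$, $t=0.5$, and $\varepsilon=10^{-4}$.
Figure 8: Numerical and exact solutions for Experiment 2. $\lambda=0.5$, $\Delta x=0.1$, $\Delta t=0.005$, $t=0.25$, and $\varepsilon=10^{-4}$.
Figure 9: Numerical and exact solutions for Experiment 2. $\lambda=1.0$, $\Delta x=0.1$, $\Delta t=0.01$, $t=0.5$, and $\varepsilon=10^{-4}$.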
Figure 10 displays the graph of log(RMSE) versus log($\Delta x$) for decreasing values of $\Delta x$ in Experiment 2. With $\varepsilon=10^{-4}$ and $\Delta t=0.0001$ fixed, the value of $\Delta x$ was initially taken as $\Delta x=0.125$. It was then successively halved to $\Delta x/2$, $\Delta x/4$, and $\Delta x/8$.
Figure 10: Log(RMSE) versus Log($\Delta x$) for Experiment 2. $\Delta t=0.0001$, $t=0.002$, and $\varepsilon=10^{-4}$.
The figure shows that, amongst the tested methods, the root mean square error (RMSE) of the IADEMF4 and the IADEMF2 decreases linearly as the value of $\Delta x$ decreases. The slope of the IADEMF4 is approximately equal to 4, which corresponds to its fourth-order spatial accuracy. The IADEMF2, with second-order spatial accuracy, has a slope that is approximately equal to 2. It is clear that the IADEMF4 is always more accurate than the IADEMF2 for the different mesh sizes considered. The graphs of the SOR4 ($\omega=1.05$), the GS4, and the JAC4 show that their fourth-order accuracy tends to deteriorate as $\Delta x$ decreases, largely due to the effect of increasing round-off errors as the value of $m$ increases.
Tables 2, 3, 4, 5, 6, 7, 8, and 9 provide numerical results in terms of the average absolute error (AAE), root mean square error (RMSE), maximum error (ME), number of iterations (n), and execution time (ET) measured in seconds (s). The results are obtained from both experiments for two different values of m and mesh size, λ. It is generally observed that when m=700, λ=0.5, and ε=10-6, the IADEMF4 has the least average absolute error, root mean square error, and maximum error in comparison with the other methods under consideration (Tables 2–5). When the size of m was ten times bigger (m=7000) and a more stringent tolerance criterion was set (ε=10-10), the accuracy of the IADEMF4 clearly outperforms the other methods for a mesh size of λ=0.5 (Tables 6 and 8). However, the difference in the errors amongst the tested methods is not so obvious for the case of λ=1.0 (Tables 7 and 9). The achievement of the fourth-order IADE method in Experiments 1 and 2 can be clearly seen from the results in Tables 2 and 8, respectively, where it has caused a huge 94% reduction of RMSE from its corresponding second-order IADE method.
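The three error measures are not defined formally in the text. The sketch below assumes their usual definitions over the grid points (mean of the absolute errors, square root of the mean squared error, and largest absolute error) and also shows how a percentage reduction in RMSE can be computed from tabulated values. The function names are hypothetical, and NumPy is assumed.

```python
import numpy as np

def error_measures(numerical, exact):
    """AAE, RMSE, and ME under the standard definitions (an assumption here)."""
    err = np.abs(np.asarray(numerical, dtype=float) - np.asarray(exact, dtype=float))
    return {
        "AAE": float(np.mean(err)),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "ME": float(np.max(err)),
    }

def percent_reduction(new_error, old_error):
    """Relative reduction of an error measure, in percent."""
    return 100.0 * (1.0 - new_error / old_error)

# RMSE values as tabulated for the IADEMF4 and the IADEMF2 in Tables 2 and 8.
reduction_table2 = percent_reduction(3.30e-9, 4.95e-8)
reduction_table8 = percent_reduction(3.21e-12, 4.60e-11)
print(f"Table 2: {reduction_table2:.1f}% reduction in RMSE")
print(f"Table 8: {reduction_table8:.1f}% reduction in RMSE")
```

With the rounded tabulated values, both reductions evaluate to about 93%, in line with the figure of roughly 94% quoted in the text.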
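Table 2: Experiment 1. $m=700$, $\lambda=0.5$, $\Delta x=1.43\times10^{-3}$, $\Delta t=1.02\times10^{-6}$, $t=5.09\times10^{-5}$, and $\varepsilon=10^{-6}$.

| Method | AAE | RMSE | ME | n | ET (s) |
|---|---|---|---|---|---|
| IADEMF4 (r=0.7) | 2.67e-9 | 3.30e-9 | 2.12e-8 | 100 | 4.40e-2 |
| IADEMF2 (r=0.8) | 1.88e-8 | 4.95e-8 | 3.23e-7 | 100 | 6.35e-3 |
| SOR4 (ω=1.2) | 5.40e-7 | 5.44e-7 | 5.52e-7 | 108 | 4.60e-2 |
| GS4 | 5.40e-6 | 5.43e-6 | 3.32e-6 | 150 | 4.72e-2 |
| JAC4 | 2.28e-5 | 2.29e-5 | 2.32e-5 | 150 | 4.73e-2 |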
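Table 3: Experiment 1. $m=700$, $\lambda=1.0$, $\Delta x=1.43\times10^{-3}$, $\Delta t=2.04\times10^{-6}$, $t=5.09\times10^{-5}$, and $\varepsilon=10^{-6}$.

| Method | AAE | RMSE | ME | n | ET (s) |
|---|---|---|---|---|---|
| IADEMF4 (r=1.0) | 2.97e-7 | 3.01e-7 | 3.01e-7 | 50 | 2.01e-2 |
| IADEMF2 (r=0.9) | 5.57e-7 | 5.59e-7 | 8.48e-7 | 50 | 3.17e-3 |
| SOR4 (ω=1.2) | 3.29e-6 | 3.31e-6 | 3.35e-6 | 76 | 2.25e-2 |
| GS4 | 8.95e-6 | 9.00e-6 | 9.00e-6 | 100 | 2.38e-2 |
| JAC4 | 2.14e-5 | 2.15e-5 | 2.17e-5 | 125 | 2.50e-2 |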
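Table 4: Experiment 2. $m=700$, $\lambda=0.5$, $\Delta x=1.43\times10^{-3}$, $\Delta t=1.02\times10^{-6}$, $t=5.09\times10^{-5}$, and $\varepsilon=10^{-6}$.

| Method | AAE | RMSE | ME | n | ET (s) |
|---|---|---|---|---|---|
| IADEMF4 (r=0.7) | 2.07e-8 | 2.31e-8 | 3.45e-8 | 100 | 9.03e-3 |
| IADEMF2 (r=0.8) | 1.06e-7 | 1.75e-7 | 1.75e-7 | 100 | 6.47e-3 |
| SOR4 (ω=1.1) | 1.54e-6 | 1.71e-6 | 2.55e-6 | 200 | 1.10e-2 |
| GS4 | 2.94e-6 | 3.27e-6 | 4.87e-6 | 250 | 1.18e-2 |
| JAC4 | 1.24e-5 | 1.38e-5 | 2.06e-5 | 300 | 1.27e-2 |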
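Table 5: Experiment 2. $m=700$, $\lambda=1.0$, $\Delta x=1.43\times10^{-3}$, $\Delta t=2.04\times10^{-6}$, $t=5.09\times10^{-5}$, and $\varepsilon=10^{-6}$.

| Method | AAE | RMSE | ME | n | ET (s) |
|---|---|---|---|---|---|
| IADEMF4 (r=0.9) | 1.38e-6 | 1.54e-6 | 2.29e-6 | 75 | 6.50e-3 |
| IADEMF2 (r=0.5) | 1.87e-6 | 2.07e-6 | 3.09e-6 | 75 | 4.72e-3 |
| SOR4 (ω=1.2) | 3.12e-6 | 3.47e-6 | 5.17e-6 | 125 | 9.06e-3 |
| GS4 | 6.67e-6 | 7.42e-6 | 1.11e-5 | 175 | 9.32e-3 |
| JAC4 | 1.26e-5 | 1.41e-5 | 2.10e-5 | 250 | 1.05e-2 |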
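Table 6: Experiment 1. $m=7000$, $\lambda=0.5$, $\Delta x=1.43\times10^{-4}$, $\Delta t=1.02\times10^{-8}$, $t=5.10\times10^{-7}$, and $\varepsilon=10^{-10}$.

| Method | AAE | RMSE | ME | n | ET (s) |
|---|---|---|---|---|---|
| IADEMF4 (r=0.8) | 3.67e-10 | 4.33e-10 | 3.66e-9 | 173 | 1.46e-1 |
| IADEMF2 (r=0.5) | 3.75e-10 | 4.35e-10 | 3.76e-9 | 200 | 1.24e-1 |
| SOR4 (ω=1.12) | 4.55e-10 | 5.07e-10 | 3.67e-9 | 257 | 1.47e-1 |
| GS4 | 1.10e-9 | 1.11e-9 | 3.76e-9 | 300 | 1.60e-1 |
| JAC4 | 2.30e-9 | 2.31e-9 | 3.78e-9 | 400 | 1.63e-1 |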
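Table 7: Experiment 1. $m=7000$, $\lambda=1.0$, $\Delta x=1.43\times10^{-4}$, $\Delta t=2.04\times10^{-8}$, $t=5.1\times10^{-7}$, and $\varepsilon=10^{-10}$.

| Method | AAE | RMSE | ME | n | ET (s) |
|---|---|---|---|---|---|
| IADEMF4 (r=1.0) | 1.63e-7 | 1.63e-7 | 1.63e-7 | 112 | 9.60e-2 |
| IADEMF2 (r=1.2) | 1.63e-7 | 1.63e-7 | 1.63e-7 | 120 | 7.36e-2 |
| SOR4 (ω=1.15) | 1.63e-7 | 1.63e-7 | 1.64e-7 | 171 | 9.99e-2 |
| GS4 | 1.64e-7 | 1.64e-7 | 1.64e-7 | 216 | 1.07e-1 |
| JAC4 | 1.65e-7 | 1.65e-7 | 1.65e-7 | 312 | 1.11e-1 |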
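Table 8: Experiment 2. $m=7000$, $\lambda=0.5$, $\Delta x=1.43\times10^{-4}$, $\Delta t=1.02\times10^{-8}$, $t=5.10\times10^{-7}$, and $\varepsilon=10^{-10}$.

| Method | AAE | RMSE | ME | n | ET (s) |
|---|---|---|---|---|---|
| IADEMF4 (r=0.7) | 5.67e-13 | 3.21e-12 | 1.30e-10 | 150 | 1.31e-1 |
| IADEMF2 (r=0.7) | 1.99e-11 | 4.60e-11 | 1.82e-9 | 200 | 1.27e-1 |
| SOR4 (ω=1.15) | 2.45e-10 | 2.73e-10 | 3.99e-10 | 260 | 1.51e-1 |
| GS4 | 3.98e-10 | 4.43e-10 | 6.61e-10 | 400 | 1.71e-1 |
| JAC4 | 1.05e-9 | 1.17e-9 | 1.74e-9 | 550 | 1.76e-1 |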
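Table 9: Experiment 2. $m=7000$, $\lambda=1.0$, $\Delta x=1.43\times10^{-4}$, $\Delta t=2.04\times10^{-8}$, $t=5.1\times10^{-7}$, and $\varepsilon=10^{-10}$.

| Method | AAE | RMSE | ME | n | ET (s) |
|---|---|---|---|---|---|
| IADEMF4 (r=1.0) | 1.54e-6 | 1.71e-6 | 2.55e-6 | 96 | 9.12e-2 |
| IADEMF2 (r=1.0) | 1.54e-6 | 1.71e-6 | 2.55e-6 | 96 | 6.07e-2 |
| SOR4 (ω=1.2) | 1.54e-6 | 1.71e-6 | 2.55e-6 | 192 | 9.97e-2 |
| GS4 | 1.54e-6 | 1.71e-6 | 2.55e-6 | 288 | 1.28e-1 |
| JAC4 | 1.54e-6 | 1.72e-6 | 2.56e-6 | 408 | 1.46e-1 |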
The IADEMF4 has the advantage of featuring higher accuracy due to the execution of the fourth-order CN approximation coupled with the fourth-order accurate MF variant. The IADEMF2 is only derived from the second-order CN approximation, but its combination with its corresponding fourth-order MF variant managed to produce errors which are better than the GS4, JAC4, and the SOR4. Even though the classical iterative methods are also derived from the fourth-order accurate CN type approximation, they are lacking in accuracy due to the round-off errors that have been accumulated from the time the execution starts till it ends.
With regard to the rate of convergence, every table shows that the IADEMF2 and the fourth-order classical iterative methods need at least as many iterations as the IADEMF4. The IADEMF4 converges fastest in the case of m=7000 and ε=10⁻¹⁰ (Tables 6–8). Although its operation count is relatively large (Table 1), its higher accuracy increases the number of correct digits gained at each iteration, so it converges at a faster rate. The IADEMF2 requires fewer arithmetic operations; it is therefore competitive in convergence rate, but at the expense of accuracy. Applying the fourth-order CN approximation to the heat equation shows that the three classical iterative methods also converge, with the JAC4 being the slowest and the least accurate of all.
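The ordering of the classical methods can be reproduced on a small model problem. The sketch below is not the authors' implementation: it assumes the standard fourth-order central stencil (-1, 16, -30, 16, -1)/12 for the spatial derivative, truncates that stencil at the ends instead of applying the paper's boundary treatment, and uses a made-up right-hand side and a small m. The names `cn4_matrix` and `solve_iteratively` are hypothetical.

```python
import numpy as np

def cn4_matrix(m, lam):
    """Pentadiagonal left-hand matrix I - (lam/2)*D4 of a CN-type step,
    with D4 = (-1, 16, -30, 16, -1)/12 (assumed stencil, truncated at the ends)."""
    A = (1 + 5 * lam / 4) * np.eye(m)
    for k, coeff in ((1, -2 * lam / 3), (2, lam / 24)):
        A += coeff * (np.eye(m, k=k) + np.eye(m, k=-k))
    return A

def solve_iteratively(A, b, omega=None, eps=1e-10, max_iter=10_000):
    """omega=None gives Jacobi, omega=1.0 Gauss-Seidel, any other value SOR.
    Stops when the largest change between two iterates drops below eps."""
    d = np.diag(A).copy()
    u = np.zeros_like(b)
    for n in range(1, max_iter + 1):
        previous = u.copy()
        if omega is None:
            # Jacobi: every update uses only values from the previous iterate.
            u = (b - (A @ previous - d * previous)) / d
        else:
            # Gauss-Seidel / SOR: updated values are reused within the sweep.
            for i in range(len(b)):
                off_diag = A[i] @ u - d[i] * u[i]
                u[i] = (1 - omega) * u[i] + omega * (b[i] - off_diag) / d[i]
        if np.max(np.abs(u - previous)) < eps:
            return u, n
    raise RuntimeError("iteration did not converge")

m, lam = 50, 0.5                      # small illustrative size, not m=7000
A = cn4_matrix(m, lam)
x = np.linspace(0.0, 1.0, m + 2)[1:-1]
b = np.sin(np.pi * x)                 # stand-in right-hand side
exact = np.linalg.solve(A, b)

results = {
    name: solve_iteratively(A, b, omega)
    for name, omega in (("Jacobi", None), ("Gauss-Seidel", 1.0), ("SOR", 1.1))
}
for name, (u, n) in results.items():
    print(f"{name:13s} n={n:3d}  max error={np.max(np.abs(u - exact)):.2e}")
```

The matrix is strictly diagonally dominant for both λ=0.5 and λ=1.0, so all three iterations converge. The iteration counts printed here belong to this toy problem only and are not comparable with the values of n in the tables.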
The convergence rate of both the fourth-order and the second-order methods improves when λ is increased from 0.5 to 1.0, that is, when a coarser mesh is used (a larger Δt for the same Δx). The coarser mesh reduces the number of computational operations and so gives a better rate of convergence. However, a finer mesh gives a more accurate solution. For example, Table 6 shows that, for λ=0.5, the IADEMF4 has RMSE=4.33e-10 and n=173, whereas Table 7 shows that, for λ=1.0, its RMSE=1.63e-7 and n=112. Overall, the IADEMF4 remains at least as accurate as every other tested method, even on the coarser mesh.
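The mesh parameters quoted in the table captions can be checked with a few lines of arithmetic. The sketch assumes a unit-length spatial domain (so that Δx=1/m) and the usual mesh ratio λ=Δt/(Δx)²; both assumptions are inferred from the caption values rather than stated in this section.

```python
m = 7000
dx = 1.0 / m            # assumed unit-length domain: dx is about 1.43e-4
t_final = 5.1e-7        # final time from the table captions

mesh = {}
for lam in (0.5, 1.0):
    dt = lam * dx ** 2                   # assumed mesh ratio lam = dt / dx**2
    steps = round(t_final / dt)          # number of time levels needed to reach t_final
    mesh[lam] = (dt, steps)
    print(f"lambda={lam}: dt={dt:.3e}, time steps={steps}")
```

With these assumptions the computed Δt values match the captions (1.02×10⁻⁸ and 2.04×10⁻⁸), and doubling λ halves the number of time steps needed to reach the same final time, from 50 to 25.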
In terms of execution time, every table shows a shorter execution time for the IADEMF2 than for the IADEMF4. This is expected, since the IADEMF2 has lower computational complexity (Table 1). The IADEMF4 gains accuracy at the cost of more computational work, because it uses values of u at two grid points on either side of the point (iΔx,kΔt) along the kth time level. Thus, if accuracy is the priority, the preferred sequential numerical algorithm is the IADEMF4; if execution time matters more, the choice is the IADEMF2.
Amongst the fourth-order methods, the IADEMF4 executes in the least amount of time. Even though its computational complexity is relatively large, it requires the fewest iterations, which makes it the most efficient.
10. Conclusion
This paper proposes the novel fourth-order IADEMF4 finite difference scheme. The higher-order scheme is developed by representing the spatial derivative in the heat equation with the fourth-order finite difference CN approximation. This leads to pentadiagonal matrices in the systems of linear equations, with larger computational stencils. The algorithm also employs the higher accuracy of the Mitchell and Fairweather variant. Despite the higher computational complexity of the fourth-order IADE scheme, the technique proves valuable because it enhances the accuracy of its second-order counterpart, the IADEMF2. In addition, the IADEMF4 is verified to be convergent and unconditionally stable, and it is superior in terms of rate of convergence. It is also shown to execute more efficiently than the benchmarked fourth-order classical iterative methods, namely, the GS4, the SOR4, and the JAC4. The increasing number of correct digits at each iteration is an advantage for the IADEMF4, yielding a faster rate of convergence together with a higher level of accuracy.
In conclusion, the proposed IADEMF4 scheme affords users many advantages with respect to higher accuracy, stability, and rate of convergence, and it serves as an alternative, efficient technique for the solution of a one-dimensional heat equation with Dirichlet boundary conditions.
The proposed fourth-order scheme can be modified and adapted to more general multidimensional linear and nonlinear parabolic, elliptic, and hyperbolic partial differential equations with different types of boundary conditions. It can also be considered for problems that require higher-order accuracy with high resolution, such as problems in nanocomputing that require the solution of very large sparse systems of equations.
The explicit and highly accurate nature of the IADEMF4 can be exploited in two- and three-dimensional heat problems by applying the scheme as a pentadiagonal solver for the systems of linear equations arising in the sweeps of a higher-order alternating direction implicit (ADI) scheme. The lower-order ADI scheme was initially proposed by Peaceman and Rachford [24]. The higher space dimensions are not expected to degrade the performance of the ADI-IADEMF4, as long as the proposed scheme remains stable, convergent, and efficient as its implementation advances from one time step to the next. Pathirana et al. [25] implemented the ADI in developing a two-dimensional model for incorporating flood damage in urban drainage planning. Their model is found to be stable, numerically accurate, and computationally efficient. Mirzavand et al. [26] proposed the ADI-FDTD method for the physical modeling of high-frequency semiconductor devices. Their approach significantly reduces the full-wave simulation time.
It is possible to parallelize the IADEMF4 algorithm since the calculation of the new iteration only depends on the known values from the last iteration (Figures 2 and 3). Future work is to exploit the explicit computational properties and efficiency of the IADEMF4 for parallelization and execution on distributed parallel computing systems. The idea is to speed up the execution time without compromising its accuracy, especially on problems involving very large linear systems of equations.
References
[1] J. Noye.
[2] G. W. Recktenwald, Finite difference approximations to the heat equation.
[3] P. Lax and B. Wendroff, Systems of conservation laws.
[4] R. J. MacKinnon and G. F. Carey, Analysis of material interface discontinuities and superconvergent fluxes in finite difference theory.
[5] R. Hixon and E. Turkel.
[6] Y. Lin.
[7] J. Sulaiman, M. K. Hasan, M. Othman, and S. A. Abdul Karim, Fourth-order QSMOR iterative method for the solution of one-dimensional parabolic PDE's, Proceedings of the International Conference on Industrial and Applied Mathematics (CIAM '10), pp. 34–39, ITB, Bandung, Indonesia, 2010.
[8] N. Jha, The application of sixth order accurate parallel quarter sweep alternating group explicit algorithm for nonlinear boundary value problems with singularity, Proceedings of the 2nd International Conference on Methods and Models in Computer Science (ICM2CS '10), pp. 76–80, December 2010, doi:10.1109/ICM2CS.2010.5706722.
[9] Y. Jin, G. Jin, and J. Li, A class of high-precision finite difference parallel algorithms for convection equations.
[10] W. Fu and E. L. Tan, Development of split-step FDTD method with higher-order spatial accuracy.
[11] A. Mohebbi and M. Dehghan, High-order compact solution of the one-dimensional heat and advection-diffusion equations.
[12] W. Liao, An implicit fourth-order compact finite difference scheme for one-dimensional Burgers' equation.
[13] Z. F. Tian and Y. B. Ge, A fourth-order compact ADI method for solving two-dimensional unsteady convection-diffusion problems.
[14] C. Chun, Some fourth-order iterative methods for solving nonlinear equations.
[15] S. Zhu, Z. Yu, and J. Zhao, A high-order parallel finite difference algorithm.
[16] Z. Gao and S. Xie, Fourth-order alternating direction implicit compact finite difference schemes for two-dimensional Schrödinger equations.
[17] M. S. Sahimi, A. Ahmad, and A. A. Bakar, The Iterative Alternating Decomposition Explicit (IADE) method to solve the heat conduction equation.
[18] G. D. Smith.
[19] A. R. Mitchell and G. Fairweather, Improved forms of the alternating direction methods of Douglas, Peaceman, and Rachford for solving parabolic and elliptic equations.
[20] B. N. Datta.
[21] D. M. Young.
[22] V. K. Saulev.
[23] L. W. Johnson and R. D. Riess.
[24] D. W. Peaceman and H. H. Rachford Jr., The numerical solution of parabolic and elliptic differential equations.
[25] A. Pathirana, S. Tsegaye, B. Gersonius, and K. Vairavamoorthy, A simple 2-D inundation model for incorporating flood damage in urban drainage planning.
[26] R. Mirzavand, A. Abdipour, G. Moradi, and M. Movahhedi, Full-wave semiconductor devices simulation using ADI-FDTD method.