Mathematical Problems in Engineering (MPE), Hindawi Publishing Corporation. ISSN 1563-5147, 1024-123X. Volume 2012, Article ID 124029. doi:10.1155/2012/124029

Research Article

High Accurate Simple Approximation of Normal Distribution Integral

Hector Vazquez-Leal (1), Roberto Castaneda-Sheissa (1), Uriel Filobello-Nino (1), Arturo Sarmiento-Reyes (2), Jesus Sanchez Orea (1)

(1) Electronic Instrumentation and Atmospheric Sciences School, University of Veracruz, Cto. Gonzalo Aguirre Beltrán S/N, Zona Universitaria Xalapa, 91000 Veracruz, VER, Mexico (uv.mx)
(2) Electronics Department, National Institute for Astrophysics, Optics and Electronics, Luis Enrique Erro No. 1, 72840 Tonantzintla, PUE, Mexico (inaoep.mx)

Academic Editor: Ben T. Nohara

Received 8 September 2011; Revised 15 October 2011; Accepted 18 October 2011

Copyright © 2012 Hector Vazquez-Leal et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The integral of the standard normal distribution function has no closed-form solution; it represents the probability that a normally distributed random variable takes a value between zero and x. The normal distribution integral is used in several areas of science. This work provides an approximate solution to the Gaussian distribution integral by using the homotopy perturbation method (HPM). After solving the Gaussian integral by HPM, the result serves as a basis for approximating other integrals, such as the error function and the cumulative distribution function. The approximate error function is compared against other reported approximations, showing advantages such as lower relative error or lower mathematical complexity. In addition, some integrals related to the normal (Gaussian) distribution integral are solved with small relative error. The utility of the proposed approximations is verified by applying them to two heat flow examples. Last, a brief discussion is presented on how an electronic circuit could be built to implement the approximate error function.

1. Introduction

The normal distribution is considered one of the most important distribution functions in statistics because it is simple to handle analytically, that is, a large number of problems can be solved explicitly; the normal distribution also arises from the central limit theorem. The central limit theorem states that, in a series of repeated observations, the precision of the approximation improves as the number of observations increases. Besides, the bell shape of the normal distribution helps to model, in a practical way, a variety of random variables. The normal (or Gaussian) distribution integral is widely used in several branches of science, such as heat flow, statistics, signal processing, image processing, quantum mechanics, optics, social sciences, financial mathematics, hydrology, and biology, among others. The normal distribution integral has no analytical solution. Nevertheless, there are several methods which approximate the integral numerically: Taylor series, asymptotic series, continued fractions, and others. Numerical integration is expensive in computation time and is only suitable for numerical simulations; in several cases power series require many terms to obtain an accurate approximation. This makes them tools for computer calculations only, not adequate for quick calculations done by hand.

The homotopy perturbation method (HPM) was proposed by He; it was introduced as a powerful tool to approach various kinds of nonlinear problems. As is well known, nonlinear phenomena appear in a wide variety of scientific fields, such as applied mathematics, physics, and engineering. Scientists in these disciplines are constantly faced with the task of finding solutions of linear and nonlinear ordinary differential equations, partial differential equations, and systems of nonlinear ordinary differential equations. There are several methods for finding approximate solutions to nonlinear problems, such as variational approaches, the tanh method, the exp-function method, Adomian's decomposition method [19, 20], and parameter expansion. Nevertheless, the HPM is powerful, is relatively simple to use, and has been successfully tested in a wide range of applications [15, 22–29]. The homotopy perturbation method can be considered a combination of the classical perturbation technique and homotopy (whose origin is in topology), but it eliminates the limitations of the traditional perturbation methods. The HPM does not need a small parameter or linearization; in fact, it requires only a few iterations to obtain highly accurate solutions. Besides, the HPM has been used successfully for solving integral equations, for instance, Volterra integral equations. The method requires an initial approximation, which should contain as much information as possible about the nature of the solution. That can often be achieved through empirical knowledge of the solution. Therefore, we propose the use of the homotopy perturbation method to calculate an approximate analytical solution of the normal distribution integral.

In this work the HPM is applied to a problem with initial conditions. Nevertheless, it is also possible to apply it to problems with boundary conditions; in that case, the solutions of the differential equation must satisfy conditions at different values of the independent variable (two values for a second-order equation).

This paper is organized as follows. In Section 2, we solve the normal distribution integral (NDI) by the HPM. In Section 3, we use the approximate solution of the NDI to establish an approximate version of the error function, and the result is compared to other analytic approximations reported in the literature. In Section 4, an approximate cumulative distribution function is derived. In Section 5, a series of normal distribution-related integrals are solved. In Section 6, two application problems for the error function are solved. In Section 7, we discuss our findings and suggest possible directions for future investigations. Finally, a brief conclusion is given in Section 8.

2. Solution of the Gaussian Distribution Integral

This section deals with the solution of the Gaussian distribution integral; it is represented as follows:

$$y(x) = \int_0^x \exp(-\pi t^2)\,dt, \quad x \in \mathbb{R}. \tag{2.1}$$

This integral can be reformulated as a differential equation:

$$y' - \exp(-\pi x^2) = 0, \quad x \in \mathbb{R}. \tag{2.2}$$

Evaluating the integral (2.1) at x=0, the result will be zero, therefore, the initial condition for the differential equation (2.2) should be y(0)=0.

To apply the homotopy perturbation method, the following homotopy is constructed:

$$(1-p)\left(G(y) - G(y_i)\right) + p\left(y' - \exp(-\pi x^2)\right) = 0, \tag{2.3}$$

where $p$ is the homotopy parameter. The solution of (2.2) is qualitatively similar to a hyperbolic tangent: when $x$ tends to $\pm\infty$, the derivative $y'$ tends to zero; hence, by symmetry, $y$ tends to the same constant (in absolute value) in both directions. Therefore, it is desirable that the first approximation of the HPM contain a hyperbolic tangent term. Consequently, a differential operator $G(y)$ is established whose equation $G(y) = 0$ can be solved by a hyperbolic tangent; it is given by

$$G(y) = (1 + c^2x^2)\frac{d}{dx}y(x) - \left(d - \frac{y(x)^2}{d}\right)\left(a + ac^2x^2 + bc\right). \tag{2.4}$$

Also, the initial approximation for the homotopy is

$$y_i(x) = d\tanh\left(ax + b\arctan(cx)\right), \tag{2.5}$$

where $a$, $b$, $c$, and $d$ are adjustment parameters.
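The claim that the initial approximation (2.5) solves the auxiliary equation $G(y) = 0$ from (2.4) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names `y_i` and `G_residual` are illustrative, the derivative is taken by a central difference, and the default parameter values are the ones that reproduce the fitted approximation reported later in this section (any real values would serve for this check).

```python
import math

# Illustrative parameter values (those that reproduce the fitted approximation
# of this section); the identity G(y_i) = 0 holds for any a, b, c, d with d != 0.
A, B, C, D = 39 / 2, -111 / 2, 35 / 111, 1 / 2


def y_i(x, a=A, b=B, c=C, d=D):
    """Initial approximation (2.5): d * tanh(a*x + b*arctan(c*x))."""
    return d * math.tanh(a * x + b * math.atan(c * x))


def G_residual(x, a=A, b=B, c=C, d=D, h=1e-6):
    """Left-hand side of (2.4) at y = y_i, with dy/dx from a central difference."""
    dy = (y_i(x + h, a, b, c, d) - y_i(x - h, a, b, c, d)) / (2 * h)
    y = y_i(x, a, b, c, d)
    return (1 + c**2 * x**2) * dy - (d - y**2 / d) * (a + a * c**2 * x**2 + b * c)


for x in (-2.0, -0.5, 0.0, 0.3, 1.0, 2.5):
    print(f"x = {x:5.2f}   G(y_i) = {G_residual(x):.2e}")
```

The residual should vanish up to finite-difference and rounding error, which is what makes $y_i$ a valid starting point at $p = 0$.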

Now, by the HPM it is assumed that the solution of (2.3) has the following form:

$$y(x) = y_0(x) + p\,y_1(x) + p^2 y_2(x) + \cdots. \tag{2.6}$$

Setting $p = 1$, the approximate solution is obtained:

$$y(x) = y_0(x) + y_1(x) + y_2(x) + \cdots. \tag{2.7}$$

Substituting (2.6) into (2.3) and equating terms with equal powers of $p$, it is possible to solve for $y_0(x)$, $y_1(x)$, $y_2(x)$, and so on. In order to fulfill the initial condition of (2.2), $y(0) = 0$, it follows that $y_0(0) = 0$, $y_1(0) = 0$, $y_2(0) = 0$, and so on. Thus, the result is

$$y_0(x) = d\tanh\left(ax + b\arctan(cx)\right), \tag{2.8}$$

where $a = 39/2$, $b = -111/2$, $c = 35/111$, and $d = 1/2$; these parameters are adjusted (using the NonlinearFit command from Maple Release 15) to obtain a good approximation, and this adjustment allows the successive terms to be ignored. (Given a total of $k$ samples from the exact model, the NonlinearFit command finds values of the approximate model parameters such that the sum of the $k$ squared residuals is minimized.) Therefore, the proposed approximation for (2.1) is

$$y(x) = \int_0^x \exp(-\pi t^2)\,dt \approx \frac{1}{2}\tanh\left(\frac{39x}{2} - \frac{111}{2}\arctan\left(\frac{35x}{111}\right)\right), \quad -\infty < x < \infty. \tag{2.9}$$

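Also, it is known that the solution of (2.1) may be expressed in terms of the error function:

$$y(x) = \frac{1}{2}\operatorname{erf}\left(\sqrt{\pi}\,x\right), \tag{2.10}$$

where the error function is defined by

$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x \exp(-t^2)\,dt.$$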

Figure 1(a) shows the approximate solution (2.9) (continuous line) and the exact solution (2.1) (diagonal cross), and Figure 1(b) shows the relative error curve.

Figure 1: (a) Gaussian distribution integral and approximation. (b) Relative error.
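The comparison shown in Figure 1 can be reproduced with a few lines of code. This is a minimal sketch under the assumption that the exact integral is evaluated through the standard library error function via $y(x) = \tfrac{1}{2}\operatorname{erf}(\sqrt{\pi}\,x)$; the function names and the sampling grid are illustrative choices, not taken from the paper.

```python
import math


def gauss_integral_exact(x):
    """Exact value of (2.1), written through the error function as in (2.10)."""
    return 0.5 * math.erf(math.sqrt(math.pi) * x)


def gauss_integral_approx(x):
    """Proposed approximation (2.9)."""
    return 0.5 * math.tanh(39 * x / 2 - (111 / 2) * math.atan(35 * x / 111))


def max_relative_error(xs):
    """Largest |approx/exact - 1| over the sample points (x = 0 must be excluded)."""
    return max(abs(gauss_integral_approx(x) / gauss_integral_exact(x) - 1) for x in xs)


# Illustrative grid on [-3, 3] with step 0.01, skipping x = 0 where both sides vanish.
grid = [i / 100 for i in range(-300, 301) if i != 0]
print(f"max relative error on [-3, 3]: {max_relative_error(grid):.2e}")
```

Because the argument of the hyperbolic tangent has slope $39/2 - 35/2 = 2$ at the origin, the approximation has unit slope at $x = 0$, matching the integrand $\exp(-\pi x^2)$ there; this keeps the relative error bounded near zero.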

In Figure 1 it can be seen that the maximum relative error is negligible for many practical applications in fields like engineering and applied sciences. In fact, the precision of this approximation makes it possible to differentiate (2.9) with respect to $x$ and graphically compare the result to the exact function $\exp(-\pi x^2)$, as shown in Figure 2(a), reaching an acceptable relative error (see Figure 2(b)) in the range $-1.3 \le x \le 1.3$. Therefore,

$$y'(x) \approx \frac{3\left(16428 + 15925x^2\right)\exp\left(-39x + 111\arctan(35x/111)\right)}{\left(12321 + 1225x^2\right)\left[\exp\left(-39x + 111\arctan(35x/111)\right) + 1\right]^2}, \quad -\infty < x < \infty. \tag{2.11}$$

Figure 2: (a) Approximation (2.11) and exact function $\exp(-\pi x^2)$. (b) Relative error.
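As a consistency check, the closed-form derivative (2.11) can be compared with a numerical derivative of (2.9) and with the exact integrand. The sketch below uses illustrative function names and a central difference with a step size chosen for convenience; it is not the procedure used by the authors.

```python
import math


def y_approx(x):
    """Approximation (2.9) of the Gaussian distribution integral."""
    return 0.5 * math.tanh(39 * x / 2 - (111 / 2) * math.atan(35 * x / 111))


def dy_approx(x):
    """Closed-form derivative (2.11) of the approximation (2.9)."""
    e = math.exp(-39 * x + 111 * math.atan(35 * x / 111))
    return 3 * (16428 + 15925 * x**2) * e / ((12321 + 1225 * x**2) * (e + 1) ** 2)


def dy_numeric(x, h=1e-5):
    """Central-difference derivative of (2.9), used only as a cross-check."""
    return (y_approx(x + h) - y_approx(x - h)) / (2 * h)


for x in (0.0, 0.5, 1.0, 1.3):
    exact = math.exp(-math.pi * x**2)
    print(f"x = {x:.1f}  (2.11) = {dy_approx(x):.6f}  exp(-pi x^2) = {exact:.6f}")
```

Differentiation amplifies the approximation error, so the agreement with $\exp(-\pi x^2)$ is expected to be looser than the agreement of (2.9) with the integral itself, especially toward the ends of the range.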

From this approximation it is possible to generate normal distribution functions like the error function or the cumulative distribution function.

3. Error Function

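From (2.9) and (2.10) it can be concluded that

$$\operatorname{erf}(x) = 2y\left(\frac{x}{\sqrt{\pi}}\right) \approx \tanh\left(\frac{39x}{2\sqrt{\pi}} - \frac{111}{2}\arctan\left(\frac{35x}{111\sqrt{\pi}}\right)\right), \quad -\infty < x < \infty. \tag{3.1}$$

Now, function (3.1) is readjusted using the NonlinearFit command; then the command "convert" (with option "rational") of Maple 15 is applied in order to obtain an optimized version of (3.1):

$$\operatorname{erf}(x) \approx \tanh\left(\frac{4907x}{446} - \frac{1775}{32}\arctan\left(\frac{34x}{191}\right)\right), \quad -\infty < x < \infty. \tag{3.2}$$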

Next, three different approximations for the error function are presented; all of them are compared to the approximations proposed in this work ((3.1) and (3.2)).

(1) The following approximation has been presented in the literature:

$$\operatorname{erf}(x) \approx \sqrt{1 - \exp\left[-\frac{\left(4/\pi + 0.14x^2\right)}{\left(1 + 0.14x^2\right)}\,x^2\right]}, \quad x \ge 0. \tag{3.3}$$

Since (3.3) is only valid for $x \ge 0$, it has the disadvantage of limiting the range of $x$.

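(2) An approximation related to the error function has also been reported, for the function

$$F(x) = \int_x^{\infty} \exp\left(-\frac{t^2}{2}\right)dt, \quad x \in \mathbb{R}. \tag{3.4}$$

The approximation for (3.4) is

$$P(x) = P_0 + x^{-1}\left\{\exp\left(-\frac{x^2}{2}\right) - \left[P_0^2x^2 + \exp\left(-\frac{x^2}{2}\right)\frac{\left(1 + bx^2\right)^{1/2}}{1 + ax^2}\right]^{1/2}\right\}, \tag{3.5}$$

where

$$P_0 = \left(\frac{\pi}{2}\right)^{1/2}, \quad a = \frac{1 + \left(1 - 2\pi^2 + 6\pi\right)^{1/2}}{2\pi}, \quad b = 2\pi a^2. \tag{3.6}$$

In order to compare our work with (3.5), consider

$$F(x) = \int_x^{\infty} \exp\left(-\frac{t^2}{2}\right)dt = -\sqrt{\frac{\pi}{2}}\operatorname{erf}\left(\frac{x}{\sqrt{2}}\right) + \sqrt{\frac{\pi}{2}}. \tag{3.7}$$

Replacing $F(x)$ by $P(x)$, it is obtained that

$$P(x) \approx -\sqrt{\frac{\pi}{2}}\operatorname{erf}\left(\frac{x}{\sqrt{2}}\right) + \sqrt{\frac{\pi}{2}}, \tag{3.8}$$

and solving for $\operatorname{erf}(x/\sqrt{2})$,

$$\operatorname{erf}\left(\frac{x}{\sqrt{2}}\right) \approx 1 - \sqrt{\frac{2}{\pi}}\,P(x). \tag{3.9}$$

From (3.9), the conclusion is

$$\operatorname{erf}(x) \approx 1 - \sqrt{\frac{2}{\pi}}\,P\left(\sqrt{2}\,x\right), \tag{3.10}$$

where $P(\sqrt{2}\,x)$ is calculated from (3.5).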

(3) An approximation of the normal distribution integral has been reported, which is transformed here into an error function:

$$\operatorname{erf}(x) \approx \tanh\left(\frac{2x\left(1 + 0.089430x^2\right)}{\sqrt{\pi}}\right). \tag{3.11}$$

Figure 3(a) shows the exact error function contrasted to (3.1), (3.2), (3.3), (3.10), and (3.11), finding a high level of accuracy for all five approximations. Besides, Figure 3(b) shows the relative error for the five approximations, where (3.10) has the lowest relative error followed by (3.1) and (3.2), while (3.11) has the highest relative error.

Figure 3: (a) Error function erf(x) and approximations. (b) Relative error for each approximation.
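The five approximations compared in Figure 3 can be evaluated side by side. The sketch below implements (3.1), (3.2), (3.3), (3.10), and (3.11) as they are read from this section, with the standard library `math.erf` as the reference; the function names, the dictionary `APPROXIMATIONS`, and the grid (restricted to $x > 0$ because (3.3) requires $x \ge 0$ and (3.5) divides by $x$) are illustrative assumptions.

```python
import math

SQRT_PI = math.sqrt(math.pi)


def erf_3_1(x):
    return math.tanh(39 * x / (2 * SQRT_PI) - (111 / 2) * math.atan(35 * x / (111 * SQRT_PI)))


def erf_3_2(x):
    return math.tanh(4907 * x / 446 - (1775 / 32) * math.atan(34 * x / 191))


def erf_3_3(x):
    # Only valid for x >= 0.
    return math.sqrt(1 - math.exp(-x**2 * (4 / math.pi + 0.14 * x**2) / (1 + 0.14 * x**2)))


def tail_P(x):
    """Approximation (3.5) of F(x) in (3.4), constants from (3.6); requires x != 0."""
    p0 = math.sqrt(math.pi / 2)
    a = (1 + math.sqrt(1 - 2 * math.pi**2 + 6 * math.pi)) / (2 * math.pi)
    b = 2 * math.pi * a**2
    g = math.exp(-x**2 / 2)
    root = math.sqrt(p0**2 * x**2 + g * math.sqrt(1 + b * x**2) / (1 + a * x**2))
    return p0 + (g - root) / x


def erf_3_10(x):
    return 1 - math.sqrt(2 / math.pi) * tail_P(math.sqrt(2) * x)


def erf_3_11(x):
    return math.tanh(2 * x * (1 + 0.089430 * x**2) / SQRT_PI)


APPROXIMATIONS = {
    "(3.1)": erf_3_1, "(3.2)": erf_3_2, "(3.3)": erf_3_3,
    "(3.10)": erf_3_10, "(3.11)": erf_3_11,
}


def max_relative_errors(xs):
    """Largest |approx/erf - 1| for each approximation over the sample points."""
    return {name: max(abs(f(x) / math.erf(x) - 1) for x in xs)
            for name, f in APPROXIMATIONS.items()}


grid = [i / 100 for i in range(5, 301)]  # illustrative grid: 0.05 <= x <= 3
for name, err in max_relative_errors(grid).items():
    print(f"{name:7s} max relative error = {err:.2e}")
```

The numbers printed depend on the grid, so they are indicative only; the ranking of the approximations should be read from Figure 3(b).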

4. Cumulative Distribution Function

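The cumulative distribution function is defined as

$$\Phi(x) = \int_{-\infty}^{x}\phi(t)\,dt = \frac{1}{2}\left(1 + \operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right), \tag{4.1}$$

where $\phi(x)$ is the standard normal probability density function, defined as follows:

$$\phi(x) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{x^2}{2}\right). \tag{4.2}$$

From (3.1) and (4.1) it can be concluded that

$$\Phi(x) \approx \frac{1}{2}\tanh\left(\frac{39x}{2\sqrt{2\pi}} - \frac{111}{2}\arctan\left(\frac{35x}{111\sqrt{2\pi}}\right)\right) + \frac{1}{2}, \quad -\infty < x < \infty. \tag{4.3}$$

Then, the process applied to (3.1) is repeated to convert the coefficients of (4.3) into fractions. The result is an approximate version of (4.3) with rational coefficients, given by

$$\Phi(x) \approx \frac{1}{2}\tanh\left(\frac{179x}{23} - \frac{111}{2}\arctan\left(\frac{37x}{294}\right)\right) + \frac{1}{2}, \quad -\infty < x < \infty. \tag{4.4}$$

Converting the result into exponential terms and simplifying, the expression becomes

$$\Phi(x) \approx \left[\exp\left(-\frac{358x}{23} + 111\arctan\left(\frac{37x}{294}\right)\right) + 1\right]^{-1}, \quad -\infty < x < \infty. \tag{4.5}$$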

Figure 4(a) shows the cumulative distribution function versus (4.3) and (4.5). Both approximations show a high level of accuracy, except in the region $x < -3$, where the values of $\Phi(x)$ are close to zero and the relative error therefore grows.

Figure 4: (a) Cumulative distribution function (4.1) and approximations (4.3) and (4.5). (b) Relative error for each approximation, $x \in [-3, 0]$. (c) Relative error for each approximation, $x \in [0, 4]$.
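The behavior described for Figure 4 can be explored numerically. The following sketch evaluates (4.3) and (4.5) against the exact definition (4.1), computed with `math.erf`; the function names and sample points are illustrative assumptions rather than the authors' setup.

```python
import math


def cdf_exact(x):
    """Cumulative distribution function (4.1)."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def cdf_4_3(x):
    """Approximation (4.3), obtained directly from (3.1)."""
    s = math.sqrt(2 * math.pi)
    return 0.5 * math.tanh(39 * x / (2 * s) - (111 / 2) * math.atan(35 * x / (111 * s))) + 0.5


def cdf_4_5(x):
    """Approximation (4.5), the exponential form with rational coefficients."""
    return 1 / (math.exp(-358 * x / 23 + 111 * math.atan(37 * x / 294)) + 1)


def relative_error(approx, x):
    return approx(x) / cdf_exact(x) - 1


for x in (-4.0, -3.0, -1.0, 0.0, 1.0, 2.0, 4.0):
    print(f"x = {x:5.1f}  rel. err. (4.3) = {relative_error(cdf_4_3, x): .2e}"
          f"  rel. err. (4.5) = {relative_error(cdf_4_5, x): .2e}")
```

For negative $x$ the absolute error stays small, but dividing by a value of $\Phi(x)$ that is close to zero inflates the relative error, which is consistent with the remark about the region $x < -3$.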

5. Table for Normal Distribution Function-Related Integrals

The literature lists a series of integrals without exact closed-form solution which involve the cumulative distribution function (4.1) and the standard normal probability density function (4.2). In each of them, the cumulative distribution function is replaced by its approximate version (4.5) to generate the integrals in Table 1.

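Table 1: Gaussian function-related integrals. Each entry lists the integral, the parameter values chosen to evaluate the relative error, the approximate solution, and the exact solution. In the approximate solutions, $\tilde{\Phi}(z) = \left[\exp\left(-\tfrac{358}{23}z + 111\arctan\left(\tfrac{37}{294}z\right)\right) + 1\right]^{-1}$ denotes approximation (4.5) of $\Phi(z)$. The third column of the table (relative error figure) contains plots for (T.1)–(T.14) and numerical values for (T.15) and (T.16).

(T.1) $\int \phi(x)\,dx$; $C = 0$
Approximate: $\tilde{\Phi}(x) + C$
Exact: $\Phi(x) + C$

(T.2) $\int x^2\phi(x)\,dx$; $C = 0$
Approximate: $\tilde{\Phi}(x) - x\phi(x) + C$
Exact: $\Phi(x) - x\phi(x) + C$

(T.3) $\int x^{2k+2}\phi(x)\,dx$; $k = 1$, $C = 0$
Approximate: $-\phi(x)\sum_{j=0}^{k}\frac{(2k+1)!!}{(2j+1)!!}x^{2j+1} + (2k+1)!!\,\tilde{\Phi}(x) + C$
Exact: $-\phi(x)\sum_{j=0}^{k}\frac{(2k+1)!!}{(2j+1)!!}x^{2j+1} + (2k+1)!!\,\Phi(x) + C$

(T.4) $\int \phi(x)^2\,dx$; $C = 0$
Approximate: $\frac{1}{2\sqrt{\pi}}\tilde{\Phi}\left(x\sqrt{2}\right) + C$
Exact: $\frac{1}{2\sqrt{\pi}}\Phi\left(x\sqrt{2}\right) + C$

(T.5) $\int \phi(x)\phi(a+bx)\,dx$; $a = 1$, $b = 2$, $C = 0$
Approximate: $\frac{1}{t}\phi\left(\frac{a}{t}\right)\left[\tilde{\Phi}\left(tx + \frac{ab}{t}\right) - \frac{1}{2}\right] + C$, $t = \sqrt{1+b^2}$
Exact: $\frac{1}{t}\phi\left(\frac{a}{t}\right)\left[\Phi\left(tx + \frac{ab}{t}\right) - \frac{1}{2}\right] + C$, $t = \sqrt{1+b^2}$

(T.6) $\int x\phi(a+bx)\,dx$; $a = 1$, $b = 2$, $C = 0$
Approximate: $-\frac{1}{b^2}\phi(a+bx) - \frac{a}{b^2}\tilde{\Phi}(a+bx) + C$
Exact: $-\frac{1}{b^2}\phi(a+bx) - \frac{a}{b^2}\Phi(a+bx) + C$

(T.7) $\int x^2\phi(a+bx)\,dx$; $a = 2$, $b = 2$, $C = 0$
Approximate: $\frac{a^2+1}{b^3}\tilde{\Phi}(a+bx) + \frac{a-bx}{b^3}\phi(a+bx) + C$
Exact: $\frac{a^2+1}{b^3}\Phi(a+bx) + \frac{a-bx}{b^3}\phi(a+bx) + C$

(T.8) $\int \phi(a+bx)^n\,dx$; $a = 1$, $b = 2$, $n = 2$, $C = 0$
Approximate: $\frac{(2\pi)^{-(n-1)/2}}{b\sqrt{n}}\tilde{\Phi}\left(\sqrt{n}\,(a+bx)\right) + C$
Exact: $\frac{(2\pi)^{-(n-1)/2}}{b\sqrt{n}}\Phi\left(\sqrt{n}\,(a+bx)\right) + C$

(T.9) $\int \Phi(a+bx)\,dx$; $a = 1$, $b = 2$, $C = 0$
Approximate: $\frac{1}{b}(a+bx)\tilde{\Phi}(a+bx) + \frac{1}{b}\phi(a+bx) + C$
Exact: $\frac{1}{b}(a+bx)\Phi(a+bx) + \frac{1}{b}\phi(a+bx) + C$

(T.10) $\int x\Phi(a+bx)\,dx$; $a = 1$, $b = 2$, $C = 0$
Approximate: $\frac{1}{2b^2}\left(\left(b^2x^2 - a^2 - 1\right)\tilde{\Phi}(a+bx) + (bx-a)\phi(a+bx)\right) + C$
Exact: $\frac{1}{2b^2}\left(\left(b^2x^2 - a^2 - 1\right)\Phi(a+bx) + (bx-a)\phi(a+bx)\right) + C$

(T.11) $\int x^2\Phi(a+bx)\,dx$; $a = 1$, $b = 2$, $C = 0$
Approximate: $\frac{1}{3b^3}\left(\left(b^3x^3 + a^3 + 3a\right)\tilde{\Phi}(a+bx) + \left(b^2x^2 - abx + a^2 + 2\right)\phi(a+bx)\right) + C$
Exact: $\frac{1}{3b^3}\left(\left(b^3x^3 + a^3 + 3a\right)\Phi(a+bx) + \left(b^2x^2 - abx + a^2 + 2\right)\phi(a+bx)\right) + C$

(T.12) $\int x\phi(x)\Phi(a+bx)\,dx$; $a = 1$, $b = 2$, $C = 0$
Approximate: $\frac{b}{t}\phi\left(\frac{a}{t}\right)\tilde{\Phi}\left(xt + \frac{ab}{t}\right) - \phi(x)\tilde{\Phi}(a+bx) + C$, $t = \sqrt{1+b^2}$
Exact: $\frac{b}{t}\phi\left(\frac{a}{t}\right)\Phi\left(xt + \frac{ab}{t}\right) - \phi(x)\Phi(a+bx) + C$, $t = \sqrt{1+b^2}$

(T.13) $\int \Phi(x)^2\,dx$; $C = 0$
Approximate: $x\tilde{\Phi}(x)^2 + 2\tilde{\Phi}(x)\phi(x) - \frac{1}{\sqrt{\pi}}\tilde{\Phi}\left(x\sqrt{2}\right) + C$
Exact: $x\Phi(x)^2 + 2\Phi(x)\phi(x) - \frac{1}{\sqrt{\pi}}\Phi\left(x\sqrt{2}\right) + C$

(T.14) $\int e^{cx}\phi(bx)^n\,dx$; $c = 3$, $n = 2$, $b = 2$, $C = 0$
Approximate: $\frac{1}{b\sqrt{n(2\pi)^{n-1}}}\exp\left(\frac{c^2}{2nb^2}\right)\tilde{\Phi}\left(bx\sqrt{n} - \frac{c}{b\sqrt{n}}\right) + C$, $b \ne 0$, $n > 0$
Exact: $\frac{1}{b\sqrt{n(2\pi)^{n-1}}}\exp\left(\frac{c^2}{2nb^2}\right)\Phi\left(bx\sqrt{n} - \frac{c}{b\sqrt{n}}\right) + C$, $b \ne 0$, $n > 0$

(T.15) $\int_{-\infty}^{\infty}\Phi(a+bx)\phi(x)\,dx$; $a = 1$, $b = 2$, $C = 0$
Approximate: $\tilde{\Phi}\left(\frac{a}{\sqrt{1+b^2}}\right)$
Exact: $\Phi\left(\frac{a}{\sqrt{1+b^2}}\right)$
Relative error: -0.3691843425E-4

(T.16) $\int_0^{\infty} x\,\Phi(a+bx)\phi(x)\,dx$; $a = 1$, $b = 2$, $C = 0$
Approximate: $\frac{b}{t}\phi\left(\frac{a}{t}\right)\tilde{\Phi}\left(-\frac{ab}{t}\right) + (2\pi)^{-1/2}\,\tilde{\Phi}(a)$, $t = \sqrt{1+b^2}$
Exact: $\frac{b}{t}\phi\left(\frac{a}{t}\right)\Phi\left(-\frac{ab}{t}\right) + (2\pi)^{-1/2}\,\Phi(a)$, $t = \sqrt{1+b^2}$
Relative error: -0.1961451412E-4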

For illustrative purposes, we have chosen values for a, b, c, C, k, and n in order to calculate the relative error between the exact and approximate solutions. A solution for other values is obtained by assigning new values to the desired constant, or constants, and performing the required calculations. Here, the approximate version of the cumulative distribution function (4.1) with coefficients expressed as fractions, (4.5), was employed in order to make manual calculations easier. Nevertheless, it is also possible to use approximation (4.3).
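As a worked instance of one table entry, the definite integral (T.15) can be evaluated three ways: by direct numerical quadrature, by the exact closed form $\Phi(a/\sqrt{1+b^2})$, and by the same closed form with (4.5) in place of $\Phi$. The sketch below uses a hand-written composite Simpson rule on a truncated interval; the helper names, the truncation at $\pm 10$, and the number of subintervals are illustrative assumptions.

```python
import math


def phi(x):
    """Standard normal probability density function (4.2)."""
    return math.exp(-x**2 / 2) / math.sqrt(2 * math.pi)


def Phi(x):
    """Exact cumulative distribution function (4.1)."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def Phi_approx(x):
    """Approximation (4.5)."""
    return 1 / (math.exp(-358 * x / 23 + 111 * math.atan(37 * x / 294)) + 1)


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    total += 4 * sum(f(lo + i * h) for i in range(1, n, 2))
    total += 2 * sum(f(lo + i * h) for i in range(2, n, 2))
    return total * h / 3


a, b = 1.0, 2.0  # parameter values used for (T.15) in Table 1
numeric = simpson(lambda x: Phi(a + b * x) * phi(x), -10.0, 10.0)
closed_exact = Phi(a / math.sqrt(1 + b**2))
closed_approx = Phi_approx(a / math.sqrt(1 + b**2))

print(f"quadrature          : {numeric:.10f}")
print(f"exact closed form   : {closed_exact:.10f}")
print(f"approximate (4.5)   : {closed_approx:.10f}")
print(f"relative difference : {(closed_approx - closed_exact) / closed_exact:.3e}")
```

The sign convention of the relative error reported in Table 1 is not stated, so the last printed value should be compared with the table in magnitude only.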

6. Application Cases

As an example to apply the error function, two cases are considered for the unidimensional heat flow equation.

6.1. Example 1

Consider a thin semi-infinite bar ($x \ge 0$) whose surface is insulated (see Figure 5); it has a constant initial temperature $T_0$. Suddenly, zero temperature is applied at the end $x = 0$ and is kept at that value. The goal is to determine the temperature distribution of the bar, $T(x,t)$, at any point $x$ and time $t$. This problem is solved using Fourier integral techniques.

Figure 5: Scheme for Example 1.

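From heat flow theory, it is known that $T(x,t)$ should satisfy the heat conduction equation

$$a^2\frac{\partial^2 T}{\partial x^2} = \frac{\partial T}{\partial t}, \tag{6.1}$$

where $a^2 = k$ is known as the thermal diffusivity. It is subject to the initial condition $T(x,0) = T_0$ and the boundary condition $T(0,t) = 0$. The method of separation of variables is applied. This technique assumes that it is possible to obtain solutions in product form:

$$T(x,t) = X(x)Y(t). \tag{6.2}$$

Replacing (6.2) in (6.1) results in

$$T(x,t) = \exp\left(-k\lambda^2 t\right)\left(A\cos(\lambda x) + B\sin(\lambda x)\right), \tag{6.3}$$

where $\lambda$ is a separation constant, and $A$ and $B$ are constants to be determined.

From the boundary condition it follows that $A = 0$, and since there is no restriction on $\lambda$, it is possible to integrate over $\lambda$, replacing $B$ by a function $B(\lambda)$, such that the solution to the problem takes the form

$$T(x,t) = \int_0^{\infty} B(\lambda)\exp\left(-k\lambda^2 t\right)\sin(\lambda x)\,d\lambda, \tag{6.4}$$

where $B(\lambda)$ is determined from the following integral equation:

$$T_0 = \int_0^{\infty} B(\lambda)\sin(\lambda x)\,d\lambda. \tag{6.5}$$

Now, it is possible to express the heat distribution as follows:

$$T(x,t) = \frac{2T_0}{\pi}\int_0^{\infty}\int_0^{\infty}\exp\left(-k\lambda^2 t\right)\sin(\lambda V)\sin(\lambda x)\,d\lambda\,dV. \tag{6.6}$$

Using

$$\sin(\lambda V)\sin(\lambda x) = \frac{1}{2}\left[\cos\lambda(V-x) - \cos\lambda(V+x)\right], \qquad \int_0^{\infty}\exp\left(-\alpha\lambda^2\right)\cos(\beta\lambda)\,d\lambda = \frac{1}{2}\sqrt{\frac{\pi}{\alpha}}\exp\left(-\frac{\beta^2}{4\alpha}\right), \tag{6.7}$$

it is possible to express (6.6) in terms of the error function:

$$T(x,t) = \frac{2T_0}{\sqrt{\pi}}\int_0^{x/(2\sqrt{kt})}\exp\left(-w^2\right)dw, \tag{6.8}$$

or

$$T(x,t) = T_0\operatorname{erf}\left(\frac{x}{2\sqrt{kt}}\right). \tag{6.9}$$

Expressing (6.9) in terms of the approximate solution given by (3.1), we obtain

$$T(x,t) \approx T_0\tanh\left(\frac{19.5x}{2\sqrt{\pi kt}} - 55.5\arctan\left(\frac{35x}{222\sqrt{\pi kt}}\right)\right). \tag{6.10}$$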

Figure 6 shows how the temperature varies with distance (in meters) and time (in seconds); temperature is expressed in kelvin. The thermal diffusivity (k) employed for this example is 1.6563E-4 m²/s, which corresponds to pure silver (99.9%).

Figure 6: (a) Graphical comparison between exact (6.9) and approximate (6.10) results, time range between 0 s and 2000 s. (b) Same comparison, time range between 0 s and 250 s. (c)–(h) Relative error for the considered distances x = 0.1 m, 0.2 m, 0.3 m, 0.4 m, 0.5 m, and 0.6 m, respectively.
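The comparison of Figure 6 can be sketched in code by evaluating the exact solution (6.9) and the approximation (6.10) at the distances listed in the caption. The thermal diffusivity is the value quoted in the text for silver; the initial temperature `T0` is a hypothetical value chosen only for illustration, since the text does not state it, and the function names and sample times are likewise assumptions.

```python
import math

K_SILVER = 1.6563e-4  # thermal diffusivity quoted in the text, m^2/s
T0 = 300.0            # hypothetical initial temperature in K (not given in the text)


def temp_exact(x, t, k=K_SILVER, t0=T0):
    """Exact solution (6.9); requires t > 0."""
    return t0 * math.erf(x / (2 * math.sqrt(k * t)))


def temp_approx(x, t, k=K_SILVER, t0=T0):
    """Approximate solution (6.10); requires t > 0."""
    s = math.sqrt(math.pi * k * t)
    return t0 * math.tanh(19.5 * x / (2 * s) - 55.5 * math.atan(35 * x / (222 * s)))


def relative_error(x, t):
    return temp_approx(x, t) / temp_exact(x, t) - 1


for x in (0.1, 0.2, 0.3, 0.4, 0.5, 0.6):
    worst = max(abs(relative_error(x, t)) for t in range(1, 2001))
    print(f"x = {x:.1f} m   max |relative error| for 1 s <= t <= 2000 s: {worst:.2e}")
```

Since the relative error is a ratio, it does not depend on the value chosen for `T0`; only the temperature scale of the plotted curves does.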

6.2. Example 2

Another interesting example of heat flow is the nonstationary flux in an agricultural field due to the sun (Figure 7). Suppose that the initial temperature distribution in the field is given by $T(x,0) = T_f$, and that the surface temperature $T_s$ is always constant.

Figure 7: Scheme for Example 2.

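Let the origin be on the surface of the field, in such a way that the positive end of the $x$ axis points into the field. For symmetry reasons, it is possible to consider $T$ as a function of $x$ and time $t$ only, $T(x,t)$. The temperature distribution is governed, again, by (6.1). It is subject to the conditions $T(0,t) = T_s$ and $T(x,0) = T_f$. Unlike Example 1, (6.1) is expressed as an ordinary second-order differential equation in a single variable using the substitution

$$V(u) = T(x,t), \tag{6.11}$$

where

$$u = u(x,t) = \frac{x}{2a\sqrt{t}}. \tag{6.12}$$

This substitution immediately converts (6.1) into

$$\frac{d^2V(u)}{du^2} = -2u\frac{dV(u)}{du}. \tag{6.13}$$

Integrating (6.13) leads to the following:

$$V(u) = C_1\int_0^u \exp\left(-p^2\right)dp + C_2, \tag{6.14}$$

where $C_1$ and $C_2$ are integration constants.

Using the definition of the error function given with (2.10), together with (6.12), it is possible to express (6.14) in terms of the error function (with the constants renamed) by

$$T(x,t) = C_1 + C_2\operatorname{erf}\left(\frac{1}{2}\frac{x}{a\sqrt{t}}\right). \tag{6.15}$$

The constants $C_1$ and $C_2$ are determined by applying the initial and boundary conditions, such that the final result is given as

$$T(x,t) = \left(T_f - T_s\right)\operatorname{erf}\left(\frac{1}{2}\frac{x}{a\sqrt{t}}\right) + T_s. \tag{6.16}$$

The approximate solution is obtained by substituting (3.1) in (6.16) as follows:

$$T(x,t) \approx \left(T_f - T_s\right)\tanh\left(\frac{19.5x}{2a\sqrt{\pi t}} - 55.5\arctan\left(\frac{35x}{222a\sqrt{\pi t}}\right)\right) + T_s. \tag{6.17}$$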

Figure 8 shows how the temperature varies with depth; depth values are in meters, the temperature is in kelvin, and time is measured in seconds. For this case, the field temperature is $T_f = 285$ K, the surface temperature is $T_s = 300$ K, and the thermal diffusivity is $k = a^2 = 0.003$ m²/s.

Figure 8: (a) Graphical comparison between exact (6.16) and approximate (6.17) results, time range between 0 s and 1800 s. (b)–(f) Relative error for the considered depths x = 0.01 m, 0.05 m, 0.1 m, 0.2 m, and 0.3 m, respectively.
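The second example can be sketched in the same way, using the parameter values quoted in the text ($T_f = 285$ K, $T_s = 300$ K, $k = a^2 = 0.003$ m²/s). The approximate solution below follows the tanh/arctan form of (3.1) substituted into (6.16); the function names and the sample times are illustrative assumptions.

```python
import math

T_F = 285.0  # initial field temperature from the text, K
T_S = 300.0  # constant surface temperature from the text, K
K = 0.003    # k = a^2 from the text, m^2/s


def soil_temp_exact(x, t, k=K):
    """Exact solution (6.16); requires t > 0."""
    a = math.sqrt(k)
    return (T_F - T_S) * math.erf(x / (2 * a * math.sqrt(t))) + T_S


def soil_temp_approx(x, t, k=K):
    """Approximate solution (6.17), built from (3.1); requires t > 0."""
    s = math.sqrt(k) * math.sqrt(math.pi * t)
    arg = 19.5 * x / (2 * s) - 55.5 * math.atan(35 * x / (222 * s))
    return (T_F - T_S) * math.tanh(arg) + T_S


for x in (0.01, 0.05, 0.1, 0.2, 0.3):
    worst = max(abs(soil_temp_approx(x, t) - soil_temp_exact(x, t)) for t in range(1, 1801))
    print(f"x = {x:.2f} m   max |T_approx - T_exact| for 1 s <= t <= 1800 s: {worst:.2e} K")
```

Because the temperatures lie between 285 K and 300 K, an error of order $10^{-4}$ in the error function translates into a few millikelvin at most, so the relative error in $T$ is much smaller than the relative error of the error function approximation itself.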

7. Discussion

This paper presents the normal distribution integral as a differential equation. Then, instead of using a traditional linear operator L in the HPM, a nonlinear differential equation G is used which has an analytic solution (qualitatively related to the solution of the normal distribution integral). This is done to initiate the HPM at the "closest" point to the solution. In fact, in the research area of homotopy continuation methods, it is well known that when the homotopy parameter is p=0, the solution of the homotopy function must be trivial or simple to obtain. Hence, it is possible to use a nonlinear differential equation G instead of L as long as G has an analytic solution (at p=0). The arctan function nested into the tanh function helps to establish a better approximation for the normal distribution integral. Thus, the HPM was successfully applied to the normal distribution integral. From this result other related integrals were calculated, finding that the maximum relative error for the Gaussian integral was less than 180E-6 (Figure 1(b)). For the approximate error functions (3.1) and (3.2), the relative error was less than 200E-6 (Figure 3(b)), and for the cumulative distribution function the maximum error in the region x>0 was less than 90E-6 (Figure 4(c)). Therefore, those approximations have a high level of accuracy, comparable to other approximations found in the literature; moreover, the approximations proposed in this work are mathematically simple enough to be used in practical engineering applications where the relative error does not represent a severe constraint.

The approximate solution of the cumulative distribution function was used to solve some definite and indefinite integrals without known analytic solution, showing a low order of relative error. Besides, the approximate error function (3.1) was employed to express, analytically, the solution of two heat flow problems, obtaining results with a low order of relative error.

Approximations (2.9), (3.2), and (4.4) have low mathematical complexity, so they are suitable for hardware implementation in analog circuits. Therefore, approximations of the Gaussian, error, and cumulative functions may be part of a circuit for analog signal processing. Finally, the necessary blocks to implement some of the normal distribution functions in a current-mode circuit are as follows:

The hyperbolic tangent has been implemented in analog circuits.

An analog implementation of the arctangent function has also been reported.

To multiply a current by a factor, it is only necessary to use current mirrors.

The addition or subtraction of constant values is achieved by connecting a positive or negative current, respectively, to the terminal.

The addition of constants is equivalent to adding constant current sources by using current mirrors.

Finally, after qualitatively analyzing the approximation proposed in this work, it is possible to establish a better approximation (see Figure 9) to the error function with nested hyperbolic tangents, resulting in the following:

$$\operatorname{erf}(x) \approx \tanh\left(\frac{77x}{75} + \frac{116}{25}\tanh\left(\frac{147x}{73} - \frac{76}{7}\tanh\left(\frac{51x}{278}\right)\right)\right), \tag{7.1}$$

where the values of the constants were calculated by means of the same numerical adjustment procedure employed to calculate (2.9). This approximation of the error function may be implemented easily in a circuit by means of the repeated use of the analog block for the hyperbolic tangent and some current mirrors.

Figure 9: Relative error for approximation (7.1) of error function erf(x).
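The nested hyperbolic tangent approximation (7.1) uses only scaling, addition, and the tanh block, which mirrors the circuit structure described above. The following sketch evaluates it stage by stage against `math.erf`; the function name, the intermediate variable names, and the grid are illustrative assumptions.

```python
import math


def erf_7_1(x):
    """Nested-tanh approximation (7.1), evaluated from the innermost stage outward."""
    inner = math.tanh(51 * x / 278)
    middle = math.tanh(147 * x / 73 - (76 / 7) * inner)
    return math.tanh(77 * x / 75 + (116 / 25) * middle)


def max_relative_error(xs):
    """Largest |approx/erf - 1| over the sample points (x = 0 must be excluded)."""
    return max(abs(erf_7_1(x) / math.erf(x) - 1) for x in xs)


grid = [i / 100 for i in range(5, 401)]  # illustrative grid: 0.05 <= x <= 4
print(f"max relative error of (7.1) on the grid: {max_relative_error(grid):.2e}")
for x in (0.5, 1.0, 2.0):
    print(f"x = {x:.1f}  (7.1) = {erf_7_1(x):.6f}  erf = {math.erf(x):.6f}")
```

Each stage maps directly onto one tanh block fed by a weighted sum of currents, which is the reason this form is attractive for an analog implementation.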

8. Conclusion

This work presented approximate analytic solutions for the Gaussian distribution integral (obtained by the HPM), the error function, and the cumulative normal distribution, all with low-order relative errors. The relative error of the approximate error function is comparable to that of other approximations found in the literature, with the advantage of being a simple expression. It was also possible to solve, satisfactorily, a series of normal distribution-related integrals, which may have potential applications in several areas of applied sciences. It was also demonstrated that the approximate error function can be employed in the practical solution of engineering problems such as those related to heat flow. Finally, given their simplicity, the approximations are suitable for implementation in analog circuits, in particular in the area of analog signal processing.