The Irwin-Hall distribution is the distribution of the sum of a finite number of independent identically distributed uniform random variables on the unit interval. Many applications arise since round-off errors have a transformed Irwin-Hall distribution and the distribution supplies spline approximations to normal distributions. We review some of the distribution’s history. The present derivation is very transparent, since it is geometric and explicitly uses the inclusion-exclusion principle. In certain special cases, the derivation can be extended to linear combinations of independent uniform random variables on other intervals of finite length. The derivation adds to the literature about methodologies for finding distributions of sums of random variables, especially distributions that have domains with boundaries so that the inclusion-exclusion principle might be employed.

1. Introduction

The simple continuous uniform or rectangular distribution Uniform(0, 1) with probability density function (PDF) f(x)=1 for 0<x<1 and f(x)=0 otherwise is very important. Two applications arise in numerical simulation and Bayesian analysis of proportions. If F is the cumulative distribution function (CDF) of the continuous random variable X, then the random variable Y=F(X) has a Uniform(0,1) distribution. The random variable X can be simulated by first simulating Y and then letting $X=F^{-1}(Y)$. This is called the inversion method ([1, page 295], [2, pages 194–196]). The transformation is called the probability integral transformation ([3], [4, pages 203-204]). The uniform distribution is a Bayesian noninformative prior distribution for the distribution of a random variable defined on the unit interval, such as a beta distribution for a proportion ([2, page 33], [5, pages 82–90]). For other applications and generalizations of the uniform distribution, see [6–8].
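The inversion method can be sketched in a few lines of Python. This is a minimal illustration, assuming an exponential target distribution with rate 2 (an example chosen here, not taken from the references); the function names `exponential_inverse_cdf` and `sample_by_inversion` are hypothetical.

```python
import math
import random


def exponential_inverse_cdf(y, rate):
    """Inverse of F(x) = 1 - exp(-rate * x), valid for 0 <= y < 1."""
    return -math.log(1.0 - y) / rate


def sample_by_inversion(inverse_cdf, size, rng):
    """Simulate X = F^{-1}(Y), where Y is drawn from Uniform(0, 1)."""
    return [inverse_cdf(rng.random()) for _ in range(size)]


rng = random.Random(12345)  # fixed seed so that the run is repeatable
draws = sample_by_inversion(
    lambda y: exponential_inverse_cdf(y, rate=2.0), 100_000, rng
)
sample_mean = sum(draws) / len(draws)
print(f"sample mean = {sample_mean:.4f}, theoretical mean = {1 / 2.0:.4f}")
```

Any continuous distribution with a computable inverse CDF could be substituted for the exponential example.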

The present goal is to derive the CDF and the PDF of the sum $T=\sum_{i=1}^{n}X_i$, where the $X_i$ are independent identically distributed Uniform(0, 1) random variables for i=1,2,…,n. The CDF and PDF are

$$F(t)=\sum_{i=0}^{n}(-1)^{i}\binom{n}{i}\frac{(t-i)^{n}}{n!}\,s_i(t), \tag{1}$$

$$f(t)=\sum_{i=0}^{n}(-1)^{i}\binom{n}{i}\frac{(t-i)^{n-1}}{(n-1)!}\,s_i(t), \tag{2}$$

respectively, where $s_a(t)$ is the unit step function

$$s_a(t)=\begin{cases}0, & t<a,\\ 1, & a\le t.\end{cases} \tag{3}$$

The derivation in Section 2 is geometric and explicitly uses the inclusion-exclusion principle.
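Formulas (1) and (2) translate directly into code. The following Python sketch is one possible transcription, not a reference implementation; the function names are hypothetical, and the early returns outside [0, n] are a convenience added here to avoid rounding noise.

```python
from math import comb, factorial


def irwin_hall_cdf(t, n):
    """CDF (1) of the sum of n independent Uniform(0, 1) variables."""
    if t <= 0:
        return 0.0
    if t >= n:
        return 1.0  # shortcut: the full sum also equals 1 here, up to rounding
    total = 0.0
    for i in range(n + 1):
        if i <= t:  # unit step function s_i(t) from (3)
            total += (-1) ** i * comb(n, i) * (t - i) ** n
    return total / factorial(n)


def irwin_hall_pdf(t, n):
    """PDF (2); this sketch assumes n >= 1."""
    if t < 0 or t > n:
        return 0.0
    total = 0.0
    for i in range(n + 1):
        if i <= t:  # unit step function s_i(t) from (3)
            total += (-1) ** i * comb(n, i) * (t - i) ** (n - 1)
    return total / factorial(n - 1)


for t in (0.5, 1.0, 1.5, 2.5):
    print(t, irwin_hall_cdf(t, 3), irwin_hall_pdf(t, 3))
```

For large n the alternating sum suffers from floating-point cancellation, so exact rational arithmetic may be preferable in that regime.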

Derivations of the distribution, which more recently acquired its name Irwin-Hall, go back to Lagrange and Laplace in the latter 18th century and the early 19th century. Lagrange used generating functions based on $a^x$ to obtain the distribution of T ([9, pages 603–612], [10, page 283]). Those generating functions are a predecessor of characteristic functions [10, page 286]. Laplace often revisited the problem of finding the distribution of T and employed many methods ([9, pages 714-715], [10, pages 286–301]). The distribution is described in [1, pages 296–300], where it is called the Irwin-Hall distribution.

Some derivations employ characteristic functions in a variety of ways, since the characteristic function of a sum of independent random variables is the product of each summand’s characteristic function and the inverse transform is not intractable ([11, pages 188-189], [12–14], [15, pages 362-363], [16, 17]). Others utilize the convolution integral for sums and mathematical induction ([4, page 225], [11, pages 190-191 and 244–246], [18]). The distribution of the sum of uniform random variables that may have differing domains is found in [18–21]. Sums of dependent uniform random variables are examined in [22, 23].

Direct integration techniques can be used to obtain the distribution of a linear combination of Uniform(0, 1) random variables ([15, pages 358–360], [24, 25]). Similar techniques are used in [26] for uniform distributions whose domains are intervals with zero as their left endpoints. The distribution of the mean is obtained when all the constants are 1/n. In this case, the distribution is called the Bates distribution ([1, page 297], [27]), which can also be found by a simple transformation of the Irwin-Hall distribution ([15, page 359], [25, page 241]). Using moment generating functions, instead of characteristic functions, Gray and Odell [28] found the distribution of any linear combination of uniform random variables with different domains allowed. In Section 3, the present method or style of proof is extended to those cases giving the same distributions.

Because T is a sum, the Irwin-Hall distribution approximates a normal distribution with a spline, since the Irwin-Hall distribution in (2) is composed of polynomials. The support of T is the interval [0,n]; the mean, mode, and median of T are n/2; and its variance is n/12. By symmetry, all odd central moments are zero, so the skewness is zero. The kurtosis is 3-6/(5n) [1, page 300]. This is the measure of kurtosis that is 3 for a normal distribution, so Irwin-Hall distributions are platykurtic, and the kurtosis is close to 3 for large n. According to the Central Limit Theorem,

$$Z=\frac{T-n/2}{\sqrt{n/12}}\xrightarrow{D}\text{Normal}(0,1)\quad\text{as } n\to\infty \tag{4}$$

([4, pages 280–283], [11, pages 213–218 and 245], [29, pages 220–222]). Figure 1 contains a normal distribution with mean n/2=3/2 and variance n/12=3/12=1/4 and its approximating Irwin-Hall distribution with n=3. The approximation is very good even for this small value of n [30]. The uniform error bound for the normal(0, 1) CDF Φ(z) is

$$\lvert F(z)-\Phi(z)\rvert\le\frac{3}{20n} \tag{5}$$

([31], [32, page 51]). Approximations with spline fitting can be useful with or without complete information about the distributional shape [33, 34].

Figure 1: Irwin-Hall distribution with n=3 and the matching normal distribution with mean 3/2 and variance 1/4.
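The quality of the normal approximation can be explored numerically. The Python sketch below evaluates the largest difference between the CDF of the standardized sum Z and Φ on an evenly spaced grid, and prints it next to 3/(20n), which is how the bound in (5) is read here. The grid size and the function names are assumptions made for illustration.

```python
from math import comb, erf, factorial, sqrt


def irwin_hall_cdf(t, n):
    """CDF (1) of the sum of n independent Uniform(0, 1) variables."""
    if t <= 0:
        return 0.0
    if t >= n:
        return 1.0
    total = 0.0
    for i in range(n + 1):
        if i <= t:  # unit step function s_i(t)
            total += (-1) ** i * comb(n, i) * (t - i) ** n
    return total / factorial(n)


def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def max_abs_error(n, grid_points=2001):
    """Largest |P(Z <= z) - Phi(z)| over an even grid on the support of T."""
    sd = sqrt(n / 12.0)
    worst = 0.0
    for j in range(grid_points):
        t = n * j / (grid_points - 1)
        z = (t - n / 2.0) / sd  # standardization from (4)
        worst = max(worst, abs(irwin_hall_cdf(t, n) - normal_cdf(z)))
    return worst


for n in (1, 2, 3, 6, 12):
    print(n, max_abs_error(n), 3 / (20 * n))
```

A grid search only approximates the supremum, so the printed errors are slight underestimates of the true maximum deviation.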

Since round-off errors for random variables that are rounded to the nearest integer are distributed Uniform(−1/2, 1/2), the sum of n round-off errors has a linearly transformed Irwin-Hall distribution [12]. For large n, the sum of round-off errors is easily described with a normal distribution [29, page 222]. For small n, the Irwin-Hall distribution is also appropriate and not too complicated.
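A small simulation illustrates the round-off setting. The sketch below assumes that the unrounded values are spread uniformly over [0, 100), so that each rounding error is approximately Uniform(−1/2, 1/2); under that assumption the sum of n errors is distributed as T − n/2, with mean 0 and variance n/12. The range, seed, and function name are illustrative choices.

```python
import math
import random


def roundoff_error_sum(n, rng):
    """Sum of the errors made when n values are rounded to the nearest integer."""
    total = 0.0
    for _ in range(n):
        x = rng.uniform(0.0, 100.0)  # assumed spread of the unrounded values
        total += x - math.floor(x + 0.5)  # single round-off error in [-1/2, 1/2]
    return total


rng = random.Random(2017)  # fixed seed so that the run is repeatable
n = 5
sums = [roundoff_error_sum(n, rng) for _ in range(20_000)]
mean = sum(sums) / len(sums)
variance = sum((s - mean) ** 2 for s in sums) / (len(sums) - 1)
print(f"sample mean = {mean:.4f} (theory 0)")
print(f"sample variance = {variance:.4f} (theory n/12 = {n / 12:.4f})")
```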

Lee et al. [35] use the Irwin-Hall distribution to examine the efficacy of goodness-of-fit tests. Heinrich et al. [36] adapt the Irwin-Hall distribution in consideration of the accumulated accuracy of round-off errors. Inequalities for linear combinations of independent random variables whose domains have an upper bound are given in [37].

2. Derivation of the Irwin-Hall Distribution

Theorem 1.

Let $X_i$ for i=1,2,…,n be independent random variables, each having the continuous uniform distribution on the unit interval, and let $T=\sum_{i=1}^{n}X_i$. Then, the CDF and PDF of T are given by (1) and (2), respectively.

Proof.

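For m∈{0,1,2,…,n-1} and t∈[m,m+1), let

$$\begin{aligned}
A_n(t)&=\Bigl\{(x_1,x_2,\ldots,x_n): x_i\ge 0 \text{ for } i\in\{1,2,\ldots,n\},\ \sum_{i=1}^{n}x_i\le t\Bigr\},\\
B_j(t)&=\bigl\{(x_1,x_2,\ldots,x_n)\in A_n(t): x_j>1\bigr\},\\
C_n&=\bigl\{(x_1,x_2,\ldots,x_n): 0\le x_i\le 1\bigr\},
\end{aligned}\tag{6}$$

where $C_n$ is the n-dimensional unit cube. The set complement of $C_n$ with respect to $\mathbb{R}^n$ is denoted by $C_n'$.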

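The hypervolume of the n-dimensional solid $A_n(t)$ has value

$$\operatorname{Vol}\bigl(A_n(t)\bigr)=\frac{t^n}{n!} \tag{7}$$

[38], since the solid is a standard orthogonal simplex from the corner of an n-cube. Similarly, if k∈{1,2,…,m}, then the hypervolume of $\bigcap_{j=1}^{k}B_j(t)$ is

$$\operatorname{Vol}\Bigl(\bigcap_{j=1}^{k}B_j(t)\Bigr)=\frac{(t-k)^n}{n!}. \tag{8}$$

For k∈{m+1,m+2,…,n},

$$\bigcap_{j=1}^{k}B_j(t)=\emptyset,\qquad \operatorname{Vol}\Bigl(\bigcap_{j=1}^{k}B_j(t)\Bigr)=0, \tag{9}$$

since the sum of the nonnegative coordinates of any point in the intersection would exceed k, the number of coordinates that are greater than 1, while k≥m+1>t.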

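By the inclusion-exclusion principle,

$$\begin{aligned}
F(t)=P(T\le t)&=\operatorname{Vol}\bigl(A_n(t)\cap C_n\bigr)\\
&=\operatorname{Vol}\bigl(A_n(t)\bigr)-\operatorname{Vol}\bigl(A_n(t)\cap C_n'\bigr)\\
&=\operatorname{Vol}\bigl(A_n(t)\bigr)-\operatorname{Vol}\Bigl(\bigcup_{j=1}^{n}B_j(t)\Bigr)\\
&=\frac{t^n}{n!}-\sum_{k=1}^{m}(-1)^{k-1}\sum_{1\le j_1<j_2<\cdots<j_k\le n}\operatorname{Vol}\bigl(B_{j_1}(t)\cap B_{j_2}(t)\cap\cdots\cap B_{j_k}(t)\bigr)\\
&=\frac{t^n}{n!}-\sum_{k=1}^{m}(-1)^{k-1}\binom{n}{k}\operatorname{Vol}\bigl(B_1(t)\cap B_2(t)\cap\cdots\cap B_k(t)\bigr)\\
&=\frac{t^n}{n!}-\sum_{k=1}^{m}(-1)^{k-1}\binom{n}{k}\frac{(t-k)^n}{n!}\\
&=\sum_{k=0}^{m}(-1)^{k}\binom{n}{k}\frac{(t-k)^n}{n!}.
\end{aligned}\tag{10}$$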

In (1), F(n) is the Stirling number of the second kind with both parameters equal to n and has numerical value 1 [39, pages 38-39]. If t≥n, then $C_n\subset A_n(t)$, so F(t)=1 in this case. Since the full sum is a polynomial in t, $\sum_{k=0}^{n}(-1)^{k}\binom{n}{k}(t-k)^{n}/n!=1$ for all real-valued t. Introducing the unit step function gives (1), and differentiation with respect to t gives (2).
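The closing steps of the proof can be checked with exact rational arithmetic. The Python sketch below evaluates the full alternating sum over k=0,…,n, which should equal 1 for every real t, and the truncated sum over k=0,…,m with m the integer part of t, which is the last line of (10). The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial, floor


def full_alternating_sum(t, n):
    """Sum over k = 0..n of (-1)^k C(n,k) (t-k)^n / n!, computed exactly."""
    t = Fraction(t)
    terms = ((-1) ** k * comb(n, k) * (t - k) ** n for k in range(n + 1))
    return sum(terms) / factorial(n)


def cdf_from_last_line_of_10(t, n):
    """Sum over k = 0..m with m = floor(t); intended for 0 <= t < n."""
    t = Fraction(t)
    m = floor(t)
    terms = ((-1) ** k * comb(n, k) * (t - k) ** n for k in range(m + 1))
    return sum(terms) / factorial(n)


print(full_alternating_sum(Fraction(7, 3), 4))
print(full_alternating_sum(-5, 3))
print(cdf_from_last_line_of_10(Fraction(3, 2), 2))
print(cdf_from_last_line_of_10(Fraction(5, 2), 3))
```

Working with `Fraction` avoids the cancellation errors that floating-point evaluation of the alternating sum can introduce.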

3. Discussion and a Generalization

Figures 2 and 3 reveal the structure of the CDF

$$F(t)=\frac{1}{2}t^{2}s_0(t)-(t-1)^{2}s_1(t)+\frac{1}{2}(t-2)^{2}s_2(t) \tag{11}$$

for n=2. Figure 2 demonstrates how the hyperplane (line), which is the line of a constant sum of the values of the random variables and is perpendicular to the n-cube’s (square’s) main diagonal, accrues volume (area) below it. Figure 3 illustrates the regions that are included and excluded for various positions of the hyperplane (line) and how vertices are met in sets. For n=2, the binomial coefficients, which provide the counts of the vertices, are 1 for (0,0), 2 for (1,0) and (0,1), and 1 for (1,1), as seen in Figures 2 and 3. In (11), the first term is the area of the large triangle in Figures 3(a), 3(b), and 3(c); the second term is the sum of the areas of the two hatched triangles in Figure 3(b), where exactly one of {x1,x2} is greater than 1, and in Figure 3(c); and the third term is the area of the crosshatched triangle in Figure 3(c), where both x1 and x2 are greater than 1.

Figure 2: The CDF F(t) increases as t increases.

Figure 3: Computing the CDF for n=2 for increasing values of t. (a) 0≤t<1; (b) 1≤t<2; (c) t≥2.
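As a quick check of (11), take t=3/2, so that only the first two step functions are nonzero:

```latex
F\!\left(\tfrac{3}{2}\right)
  = \tfrac{1}{2}\left(\tfrac{3}{2}\right)^{2} - \left(\tfrac{3}{2}-1\right)^{2}
  = \tfrac{9}{8} - \tfrac{1}{4}
  = \tfrac{7}{8}.
```

This agrees with a direct computation: the part of the unit square above the line x1+x2=3/2 is a right triangle with legs of length 1/2 and area 1/8, which leaves 1−1/8=7/8 below the line.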

Figure 4 shows the same geometric interpretation for n=3. In its CDF

$$F(t)=\frac{1}{6}t^{3}s_0(t)-\frac{1}{2}(t-1)^{3}s_1(t)+\frac{1}{2}(t-2)^{3}s_2(t)-\frac{1}{6}(t-3)^{3}s_3(t), \tag{12}$$

the first term is the volume using (7) of the large orthogonal simplex in Figures 4(a), 4(b), and 4(c) with edges of length t. The second term is the sum of the volumes using (8) of the three orthogonal simplexes, where exactly one of {x1,x2,x3} is greater than 1. In Figure 4(b), the vertices P1, P2, P3, and P4 of the simplex with x1>1 are labeled. Their coordinates are P1:(t,0,0), P2:(1,0,t-1), P3:(1,t-1,0), and P4:(1,0,0). The lengths of the edges P1P4, P2P4, and P3P4 are t-1. The third term of (12) is the sum of the three volumes using (8), where exactly two of {x1,x2,x3} are greater than 1. In Figure 4(c), the vertices are labeled P3, P5, P6, and P7 in the region where both x1 and x2 are greater than 1. Their coordinates are P3:(1,t-1,0), P5:(t-1,1,0), P6:(1,1,t-2), and P7:(1,1,0). The lengths of the edges P3P7, P5P7, and P6P7 are t-2. The fourth term is the region that is shared by all the other regions, analogous to the crosshatched region in Figure 3(c).

Figure 4: Computing the CDF for n=3 for increasing values of t. (a) 0≤t<1; (b) 1≤t<2; (c) 2≤t<3.
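As a worked instance of (12), take t=5/2, so that the first three step functions are nonzero:

```latex
F\!\left(\tfrac{5}{2}\right)
  = \tfrac{1}{6}\left(\tfrac{5}{2}\right)^{3}
    - \tfrac{1}{2}\left(\tfrac{3}{2}\right)^{3}
    + \tfrac{1}{2}\left(\tfrac{1}{2}\right)^{3}
  = \tfrac{125}{48} - \tfrac{81}{48} + \tfrac{3}{48}
  = \tfrac{47}{48}.
```

The same value follows from the symmetry of T about 3/2, since F(5/2)=1−F(1/2)=1−(1/6)(1/2)^3=47/48.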

In the same way, for any n, the terms are the n-volumes of orthogonal n-simplexes, whose multiplicity is counted by binomial coefficients determined by the number of vertices of the n-cube in sets as the “moving” (n−1)-dimensional hyperplane “passes” them as t increases. The hyperplane is perpendicular to the diagonal line x1=x2=x3=⋯=xn. The volumes of the simplexes are computed using (7) and (8).

The website [40] has a free simulator for T, where selecting n yields the PDF (2). Other calculators are at [41, 42].

The method of proof in Section 2 can be extended to linear combinations of uniform random variables on different intervals. Suppose that X1,X2,…,Xn are independent, that Xk is uniformly distributed on the interval [ak,bk], and that c1,c2,…,cn are real constants. Then

$$P\Bigl(\sum_{k=1}^{n}c_kX_k\le t\Bigr)=P\Bigl(\sum_{k=1}^{n}d_kY_k\le t'\Bigr), \tag{13}$$

where

$$Y_k=\frac{X_k-a_k}{b_k-a_k},\qquad d_k=c_k(b_k-a_k),\qquad t'=t-\sum_{k=1}^{n}a_kc_k. \tag{14}$$

Here, Y1,Y2,…,Yn are independent uniform random variables on [0,1], and $P(\sum_{k=1}^{n}c_kX_k\le t)$ can be interpreted as the hypervolume of the solid that consists of all points that lie inside the unit hypercube $[0,1]^n$ and on one side of the hyperplane $\sum_{k=1}^{n}d_kY_k=t'$. Now, proceed by inclusion-exclusion as in Section 2. In general, the formula for $P(\sum_{k=1}^{n}c_kX_k\le t)$ is complicated because of the lack of symmetry that is caused by the presence of d1,d2,…,dn. This increases the number of cases and removes the congruence of the solids of each size whose hypervolumes need to be added or subtracted at each stage of the inclusion-exclusion process. Nevertheless, the correct distribution is obtained in this manner. A special case in which these problems disappear is d1=d2=⋯=dn=d, so that

$$P\Bigl(\sum_{k=1}^{n}c_kX_k\le t\Bigr)=\begin{cases}F\!\left(\dfrac{t'}{d}\right) & \text{for } d>0,\\[2ex] 1-F\!\left(\dfrac{t'}{d}\right) & \text{for } d<0,\end{cases} \tag{15}$$

where F is given in (1).
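The equal-scale special case lends itself to a short sketch. The Python code below assumes that every product c_k(b_k − a_k) equals a common nonzero value d and uses the shift t′ = t − Σ a_k c_k, which follows from substituting X_k = a_k + (b_k − a_k)Y_k into the linear combination. The function names and the tolerance used to compare the d_k are illustrative assumptions.

```python
from math import comb, factorial


def irwin_hall_cdf(t, n):
    """CDF (1) of the sum of n independent Uniform(0, 1) variables."""
    if t <= 0:
        return 0.0
    if t >= n:
        return 1.0
    total = 0.0
    for i in range(n + 1):
        if i <= t:  # unit step function s_i(t)
            total += (-1) ** i * comb(n, i) * (t - i) ** n
    return total / factorial(n)


def equal_scale_combination_cdf(t, a, b, c):
    """P(sum of c_k X_k <= t) for X_k ~ Uniform[a_k, b_k] with equal d_k."""
    n = len(c)
    d_values = [ck * (bk - ak) for ak, bk, ck in zip(a, b, c)]
    d = d_values[0]
    if d == 0 or any(abs(dk - d) > 1e-12 for dk in d_values):
        raise ValueError("this sketch requires a common nonzero d")
    t_prime = t - sum(ak * ck for ak, ck in zip(a, c))  # shift by sum of a_k c_k
    value = irwin_hall_cdf(t_prime / d, n)
    return value if d > 0 else 1.0 - value


# X1 ~ Uniform[0, 2] with c1 = 1 and X2 ~ Uniform[1, 2] with c2 = 2, so d = 2.
print(equal_scale_combination_cdf(4.0, a=[0.0, 1.0], b=[2.0, 2.0], c=[1.0, 2.0]))
# Two Uniform[0, 1] variables with c1 = c2 = -1, so d = -1.
print(equal_scale_combination_cdf(-0.5, a=[0.0, 0.0], b=[1.0, 1.0], c=[-1.0, -1.0]))
```

The general case with unequal d_k is not covered, because the simplexes are then no longer congruent and each inclusion-exclusion term has to be handled separately.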

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

References

[1] N. L. Johnson, S. Kotz, and N. Balakrishnan.
[2] K. R. Koch.
[3] C. P. Quesenberry, "Probability integral transformations," S. Kotz, N. L. Johnson, and C. B. Read, Eds.
[4] V. K. Rohatgi.
[5] J. O. Berger.
[6] P. C. Silva, J. O. Cerdeira, M. J. Martins, and T. Monteiro-Henriques, "Data depth for the uniform distribution."
[7] K. Jayakumar and K. K. Sankaran, "On a generalization of uniform distribution and its properties."
[8] C. P. Dettmann and M. K. Roychowdhury, "Quantization for uniform distributions on equilateral triangles."
[9] E. S. Pearson.
[10] O. B. Sheynin, "Finite random sums (a historical essay)."
[11] H. Cramer.
[12] S. K. Mitra and S. N. Banerjee, "On the probability distribution of round-off errors propagated in tabular differences."
[13] J. O. Irwin, "On the frequency distribution of the means of samples from a population having any law of frequency with finite moments, with special reference to Pearson's Type II."
[14] A. N. Lowan and J. Laderman, "On the distribution of errors in Nth tabular differences."
[15] A. Stuart and J. K. Ord.
[16] V. M. Kruglov, "On one identity for distribution of sums of independent random variables."
[17] H. Potuschak and W. G. Müller, "More on the distribution of the sum of uniform random variables."
[18] E. G. Olds, "A note on the convolution of uniform distributions."
[19] S. K. Mitra, "On the probability distribution of the sum of uniformly distributed random variables."
[20] D. M. Bradley and R. C. Gupta, "On the distribution of the sum of n non-identically distributed uniform random variables."
[21] S. M. Sadooghi-Alvandi, A. R. Nematollahi, and R. Habibi, "On the distribution of the sum of independent uniform random variables."
[22] H. Murakami, "A saddlepoint approximation to the distribution of the sum of independent non-identically uniform random variables."
[23] G. S. Lo, H. Sangare, and C. Ndiaye, "A review on asymptotic normality of sums of associated random variables."
[24] D. L. Barrow and P. W. Smith, "Classroom Notes: spline notation applied to a volume problem."
[25] P. Hall, "The distribution of means for samples of size N drawn from a population in which the variate takes values between 0 and 1, all such values being equally probable."
[26] S. A. Roach, "The frequency distribution of the sample mean where each member of the sample is drawn from a different rectangular distribution."
[27] G. E. Bates, "Joint distributions of time intervals for the occurrence of successive accidents in a generalized Polya scheme."
[28] H. L. Gray and P. L. Odell, "On sums and products of rectangular variates."
[29] R. V. Hogg, J. W. McKean, and A. T. Craig.
[30] J. P. Hoyt, "The teacher's corner: a simple approximation to the standard normal probability density function."
[31] G. Allasia, "Approximation of the normal distribution function by means of a spline function."
[32] J. K. Patel and C. B. Read.
[33] M. S. Muminov and K. Soatov, "A note on spline estimator of unknown probability density function."
[34] M. S. Muminov and K. S. Soatov, "On the approximation of maximum deviation spline estimation of the probability density Gaussian process."
[35] C. Lee, S. Kim, and J. Jeong, "A view on the validity of central limit theorem: an empirical study using random samples from uniform distribution."
[36] L. Heinrich, F. Pukelsheim, and V. Wachtel, "The variance of the discrepancy distribution of rounding procedures, and sums of uniform random variables."
[37] E. Rio, "Exponential inequalities for weighted sums of bounded random variables."
[38] P. Stein, "Classroom Notes: a note on the volume of a simplex."
[39] C. L. Liu.
[40] http://www.math.uah.edu/stat/special/IrwinHall.html, 2017.
[41] http://www.distributome.org/V3/calc/IrwinHallCalculator.html, 2017.
[42] http://randomservices.org/distributions/IrwinHall/Calculator.html, 2017.