Moments of Distance from a Vertex to a Uniformly Distributed Random Point within Arbitrary Triangles

We study the cumulative distribution function (CDF), probability density function (PDF), and moments of the distance between a given vertex and a uniformly distributed random point within a triangle. Using a computational technique that yields unified formulae for the CDF and PDF of this random distance, we compute its moments of arbitrary order, from which the variance and standard deviation can be easily derived. We conduct Monte Carlo simulations under various conditions to check the validity of our theoretical derivations. Our method can be adapted to study random distances sampled from arbitrary polygons by decomposing them into triangles.


Introduction
A Voronoi diagram is a partitioning of a space into convex polygons called Voronoi cells based on prespecified points (called seeds), such that each cell contains exactly one seed and the interior points of a cell are closer to this seed than to any other seed.
The Voronoi diagram was formally defined and studied for a general n-dimensional Euclidean space in 1908 [1]. Since its inception, it has been widely used in a variety of research fields. Below we list several applications as examples. A classical application of Voronoi diagrams is in the field of forest ecology. To make forestry inventories, the distances from sampling points to seedlings are often recorded [2]. A better understanding of this random distance will help ecologists estimate the average number of plants per unit area [3]. Another application is in distribution management [4], in which it is often necessary to study the expected random distances that result from dispatching vehicles to meet customers' demand. In recent years, Voronoi diagrams have been used to estimate the Shannon entropy of multidimensional probability densities [5], to analyze the complex nature of biomolecular structure [6], to enhance posterior distribution estimation via nonparametric sampling approximation [7], to model stochastic foam geometries [8,9], to simulate granular materials [10], and so forth.
Perhaps the most successful application of Voronoi diagrams is in the field of wireless communications. The analysis of the distance between given wireless base stations and random receivers is an important problem in wireless communication networks [11-13]. In fact, assuming that a random receiver always connects to the nearest base station, we can decompose the entire space into disjoint Voronoi cells [14] and simply study the properties of this random distance within one Voronoi cell, as shown in Figure 1(a). Because all Voronoi cells are convex polygons that can be further decomposed into disjoint triangles with the seed of the Voronoi cell serving as their common vertex, it suffices to study the distance between a given vertex (the seed of the Voronoi cell) and a uniformly distributed random point inside a triangle, such as one of the triangles shown in Figure 1(b).
All of these applications require researchers to study the distribution or the moments of a random distance between a given vertex and a random point that is uniformly distributed within an arbitrary triangle. Statistical properties such as the cumulative distribution function (CDF) and probability density function (PDF) of such a random distance have been actively studied in recent years [15-17]. One drawback of these prior works is that the resulting CDF and PDF formulae depend on the geometric properties of the triangle, such as whether the altitude from the fixed vertex of the triangle to the opposite side is inside or outside the triangle. This case dependence makes it difficult to use the CDF/PDF to study moments (e.g., the mean, variance, and skewness) of this random distance. In this paper, we use a parametrization tactic to reformulate the CDF and PDF of this random distance so that they share a consistent form for all cases, which enables us to provide a general formula for moments of arbitrary order. We conducted Monte Carlo simulations with various configurations and found that the empirical results match the theoretical results well.

The Unified Formulae of CDF and PDF of the Random Distance
For ease of presentation, we adopt the following notation throughout the discussion. We denote the triangle as △ABC, and the vertex A is always considered as the reference point. We denote P ∈ △ABC as the random location of a receiver within this triangle. We assume that P is uniformly distributed within △ABC; namely, the density of P equals 1/Ω at every point of △ABC and 0 elsewhere. Here Ω denotes the area of △ABC.
Our goal is to study the statistical properties of D := |AP|, the Euclidean distance between the reference point A and the random point P. To achieve this, we first compute F(d), the CDF of D.
In [15], the vertex A is considered as the reference point, and we may assume that |AB| ≤ |AC| without loss of generality. Two cases need to be treated separately according to ∠ABC:
(1) If 0 < ∠ABC ≤ π/2, the altitude with the edge BC as base is inside the triangle; this is called the acute or inside altitude case, as shown in Figure 2(a). Note that the right triangle case (∠ABC = π/2) is considered as a special example of this case.
(2) If ∠ABC > π/2, the altitude with the base edge BC is outside the triangle; this is called the obtuse or outside altitude case (Figure 2(b)).
We adopt a computational technique to merge these two cases into one form by using axial symmetry and employing an additional coefficient. As shown in Figure 2(c), the line containing the altitude from A is used as the symmetry axis. If the triangle is acute (the inside altitude case), we use information from △AB1C (Figure 2(c)), where B = B1. Similarly, we use △AB2C, where B = B2, if the triangle is obtuse (the outside altitude case). Points B1 and B2 are symmetric with respect to this axis. The additional coefficient will be discussed later when the CDF of D is calculated.
As in [15], we draw a circle centered at A with radius d to compute the distribution of the distance between A and P. The CDF, F(d), is the area of the intersection of △ABC with the disc bounded by this circle, divided by Ω, the total area of △ABC. Four possible cases are discussed below, corresponding to different ranges of d (Figure 3).
Case 1. Let ℎ be the length of the altitude from A to the line BC. When 0 ≤ d ≤ ℎ, as shown in Figure 3(a), the CDF of D is the area of a sector divided by Ω; that is, F(d) = ∠BAC · d² / (2Ω).
We would like to point out that the above formula is valid for both the inside and outside altitude cases.
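The area-ratio definition of the CDF can be cross-checked numerically. The following is a minimal sketch, not the paper's unified formula: it assumes a polar-coordinate description of the triangle around the reference vertex, in which every direction is measured from the altitude and the opposite edge lies at radius h / cos(angle). The function names `polar_params`, `cdf`, and `pdf` are hypothetical.

```python
import math

def polar_params(A, B, C):
    """Return (h, phi_B, phi_C, area): the altitude length from A to line BC,
    the signed angles of AB and AC measured from the altitude direction,
    and the triangle area. Assumes a non-degenerate triangle."""
    bx, by = B[0] - A[0], B[1] - A[1]
    cx, cy = C[0] - A[0], C[1] - A[1]
    ex, ey = cx - bx, cy - by                 # edge vector from B to C
    base = math.hypot(ex, ey)
    area = abs(bx * cy - by * cx) / 2.0
    h = 2.0 * area / base
    tb = (bx * ex + by * ey) / base           # signed offset of B from the altitude foot
    tc = (cx * ex + cy * ey) / base           # signed offset of C from the altitude foot
    return h, math.atan2(tb, h), math.atan2(tc, h), area

def _sector_angle(d, h, p1, p2):
    """Total angle of the directions in which the disc boundary (radius d)
    is reached before the edge BC."""
    if d <= h:
        return p2 - p1, None, None
    alpha = math.acos(h / d)                  # edge is closer than d when |angle| <= alpha
    lo, hi = max(p1, -alpha), min(p2, alpha)
    covered = max(hi - lo, 0.0)
    return (p2 - p1) - covered, lo, hi

def cdf(d, A, B, C):
    """P(distance <= d): area of (triangle intersected with disc) / triangle area."""
    h, p1, p2, area = polar_params(A, B, C)
    if d <= 0.0:
        return 0.0
    sector_angle, lo, hi = _sector_angle(d, h, p1, p2)
    inner = 0.0
    if lo is not None and hi > lo:            # part of the triangle fully inside the disc
        inner = 0.5 * h * h * (math.tan(hi) - math.tan(lo))
    return min(1.0, (inner + 0.5 * d * d * sector_angle) / area)

def pdf(d, A, B, C):
    """Derivative of the CDF: arc length inside the triangle divided by the area."""
    h, p1, p2, area = polar_params(A, B, C)
    if d <= 0.0:
        return 0.0
    sector_angle, _, _ = _sector_angle(d, h, p1, p2)
    return d * sector_angle / area
```

For radii up to the altitude length the sketch reduces to the sector formula of Case 1; for larger radii it subtracts the directions in which the opposite edge is reached first, which covers the inside and outside altitude configurations with the same code path.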
In formula (3),  1 = ∠ 2 ,   = ∠,   −  1 is ∠ 2 , and   = ||.Let  2 = ∠, and then   and   are variables related to  by the following formulae: In this case, the acute/obtuse cases are unified by using coefficient  defined as follows: where In summary, the CDF of  is where The PDF, (), is the derivative of () with respect to  and is provided as follows: where It is not hard to derive that Consequently the PDF formula (10) can be simplified as

Moments Calculation
To facilitate the computation of moments, we first need to compute the following important integrals.Recall that   := ∠ − arccos (ℎ/), and we have Here (, ) fl ∫  −1 (1 − ℎ 2 / 2 ) −1/2  is a special case of the incomplete beta function that can be calculated by trigonometric substitution.Let  = ℎ/ sin , and we have where After using the inverse trigonometric substitution, including sin  = ℎ/ and cos  = √  2 − ℎ 2 /, we have where For convenience, we provide some special cases in the following equation: By using formulae ( 14)-( 17), we can derive the general formula for moments of the random distance  as follows: Below we provide four low-order moments as special cases of the above general formula: With the above formulae, commonly used statistical quantities such as variance, skewness, and kurtosis of  can be calculated easily.
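As a cross-check of the moment computation, the k-th moment can also be obtained by integrating in polar coordinates around the reference vertex, which gives E[D^k] = h^(k+2) / ((k+2) Ω) times the integral of sec^(k+2) over the angular range of the triangle. The sketch below assumes this equivalent route rather than the incomplete-beta formulation above; the parametrization by the altitude length `h` and the signed angles `phi1 < phi2` (measured from the altitude) and all function names are hypothetical.

```python
import math

def sec_power_integral(n, phi):
    """Antiderivative of sec(phi)**n for an integer n >= 1, valid for
    -pi/2 < phi < pi/2, using the standard reduction formula."""
    t = math.tan(phi)
    if n == 1:
        return math.asinh(t)                  # equals ln(sec(phi) + tan(phi))
    if n == 2:
        return t
    sec = 1.0 / math.cos(phi)
    return (sec ** (n - 2) * t + (n - 2) * sec_power_integral(n - 2, phi)) / (n - 1)

def moment(k, h, phi1, phi2):
    """k-th moment (integer k >= 0) of the vertex-to-random-point distance."""
    n = k + 2
    area = 0.5 * h * h * (math.tan(phi2) - math.tan(phi1))
    integral = sec_power_integral(n, phi2) - sec_power_integral(n, phi1)
    return h ** n * integral / (n * area)

def summary(h, phi1, phi2):
    """Mean, variance, skewness, and kurtosis from the first four raw moments."""
    m1, m2, m3, m4 = (moment(k, h, phi1, phi2) for k in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    skew = (m3 - 3 * m1 * m2 + 2 * m1 ** 3) / var ** 1.5
    kurt = (m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4) / var ** 2
    return m1, var, skew, kurt
```

For example, `summary(1.0, -math.pi / 4, math.pi / 4)` describes an isosceles right triangle whose right angle sits at the reference vertex and whose altitude has length 1.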
Example 1 (rectangle). For a rectangle ABCD with side lengths a and b, the mean Euclidean distance D from the center O of the rectangle to a uniformly distributed random point in the rectangle is E(D) = (1/2) E(D | △OAB) + (1/2) E(D | △OBC) = (1/6) [√(a² + b²) + (a²/(2b)) ln((b + √(a² + b²))/a) + (b²/(2a)) ln((a + √(a² + b²))/b)], where E(D | △OAB) and E(D | △OBC) are the conditional expectations of this random distance within the isosceles triangles △OAB and △OBC, respectively. The result presented above is computed by our formula (23) and is equivalent to the result derived in [18, Formula (1)].
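The triangle-decomposition idea behind this example can be sketched for any convex polygon: fan the polygon into triangles around the reference point and combine the per-triangle contributions weighted by area. The sketch below is an illustration under assumptions, not the paper's formula (23): it uses a polar-coordinate expression for each triangle, requires the reference point to lie strictly inside the polygon, and the name `polygon_moment` is hypothetical.

```python
import math

def _sec_power_integral(n, phi):
    """Antiderivative of sec(phi)**n (integer n >= 1) via the reduction formula."""
    t = math.tan(phi)
    if n == 1:
        return math.asinh(t)
    if n == 2:
        return t
    sec = 1.0 / math.cos(phi)
    return (sec ** (n - 2) * t + (n - 2) * _sec_power_integral(n - 2, phi)) / (n - 1)

def polygon_moment(k, O, vertices):
    """k-th moment of the distance from O to a uniform random point of a
    convex polygon whose vertices are listed in order around the boundary."""
    n = k + 2
    total_area = 0.0
    weighted = 0.0                            # sum of area_i * E[D^k | triangle_i]
    m = len(vertices)
    for i in range(m):
        B, C = vertices[i], vertices[(i + 1) % m]
        bx, by = B[0] - O[0], B[1] - O[1]
        cx, cy = C[0] - O[0], C[1] - O[1]
        ex, ey = cx - bx, cy - by
        base = math.hypot(ex, ey)
        area = abs(bx * cy - by * cx) / 2.0
        h = 2.0 * area / base                 # altitude from O to this edge
        p1 = math.atan2((bx * ex + by * ey) / base, h)
        p2 = math.atan2((cx * ex + cy * ey) / base, h)
        weighted += h ** n * (_sec_power_integral(n, p2) - _sec_power_integral(n, p1)) / n
        total_area += area
    return weighted / total_area

def rectangle_mean_closed_form(a, b):
    """Closed-form mean distance from the center of an a-by-b rectangle."""
    d = math.hypot(a, b)
    return (d / 6.0
            + a * a / (12.0 * b) * math.log((b + d) / a)
            + b * b / (12.0 * a) * math.log((a + d) / b))
```

Calling `polygon_moment(1, (0, 0), [(-1.5, -1), (1.5, -1), (1.5, 1), (-1.5, 1)])` evaluates the mean distance for a 3-by-2 rectangle, which can be compared against `rectangle_mean_closed_form(3, 2)`.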
Example 2 (square).If the polygon is a square with length , then the square can be split to 4 isosceles right triangles by the diagonals.For each triangle,  = /2,  1 = 0,  2 = /4, and ℎ = /2.Based on formula (26), we derive the formula for (), which is the CDF of the distance from the center  to a random point that is uniformly distributed in this square, as follows: where   = (1/2) arcsin ( 2 /2 2 −1).Formula (30) is identical to Formula (14) given in [19].
With formulas (27) and (28), we can calculate the first two moments of D as follows: E(D) = (a/6)(√2 + ln(1 + √2)) and E(D²) = a²/6, where a is the side length of the square. The result for the first moment was also reported in several earlier studies [2,4]; to the best of our knowledge, the result for the second-order moment is novel.
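As a worked check of the square example, the first two moments can be re-derived by integrating in polar coordinates around the center. This is a sketch under assumptions rather than a restatement of formulas (27) and (28): the polar identity in the first line and the symbols $\varphi_1$, $\varphi_2$ (signed angles of the two triangle sides measured from the altitude) are introduced here for illustration, while $a$ denotes the side length of the square.

```latex
% Polar-coordinate identity for one triangle with altitude h and area \Omega
\[
E\!\left[D^{k}\right]
  = \frac{h^{k+2}}{(k+2)\,\Omega}\int_{\varphi_1}^{\varphi_2}\sec^{k+2}\varphi \, d\varphi .
\]
% Square of side a: each of the four triangles has
% h = a/2, \varphi_1 = -\pi/4, \varphi_2 = \pi/4, \Omega = a^2/4.
\[
\int_{-\pi/4}^{\pi/4}\sec^{3}\varphi\,d\varphi = \sqrt{2}+\ln\!\left(1+\sqrt{2}\right),
\qquad
\int_{-\pi/4}^{\pi/4}\sec^{4}\varphi\,d\varphi = \frac{8}{3}.
\]
\[
E[D] = \frac{(a/2)^{3}}{3\,(a^{2}/4)}\left(\sqrt{2}+\ln\!\left(1+\sqrt{2}\right)\right)
     = \frac{a}{6}\left(\sqrt{2}+\ln\!\left(1+\sqrt{2}\right)\right) \approx 0.3826\,a ,
\]
\[
E\!\left[D^{2}\right] = \frac{(a/2)^{4}}{4\,(a^{2}/4)}\cdot\frac{8}{3} = \frac{a^{2}}{6}.
\]
```

The second value agrees with the direct computation for a point with independent coordinates uniform on an interval of length $a$: each coordinate contributes $a^{2}/12$ to the expected squared distance from the center.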

Simulation Studies
We conduct several Monte Carlo studies to compare the theoretical results derived in Sections 2 and 3 to those obtained from simulations. For a given triangle △ABC, we sample a point P from the uniform distribution on △ABC with the following formula: P = A + u1(B − A) + u2(C − A), where u1 and u2 are generated from the uniform distribution U(0, 1) subject to the condition u1 + u2 ≤ 1. We would like to point out that this technique was first used in [20]. A total of n = 5,000 points P1, P2, ..., Pn are generated with formula (33), and we compute the distances Di = |APi| for i = 1, ..., n. Five cases of triangles are used in this simulation study, and they are illustrated in Figure 4. Technical details about these triangles are given as follows:
(a) The generic inside altitude (acute) case: we select A = (0, 0), B = (5, 0), and C = (2, 3.5), as shown in Figure 4(a).
(b) The generic outside altitude (obtuse) case: we select A = (0, 0), B = (7, 4), and C = (15, 30), as shown in Figure 4(b).
(c) The isosceles triangle case: we select A = (0, 0), B = (2, 5), and C = (5, 2), as shown in Figure 4(c).
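The sampling step can be sketched as follows. This is a minimal illustration that assumes the sampling formula has the form P = A + u1 (B - A) + u2 (C - A), with pairs (u1, u2) redrawn until u1 + u2 <= 1; the function names and the fixed seed are hypothetical choices, and only the Python standard library is used.

```python
import math
import random

def sample_point(A, B, C, rng):
    """Draw one point uniformly from triangle ABC by rejection:
    keep (u1, u2) only when u1 + u2 <= 1."""
    while True:
        u1, u2 = rng.random(), rng.random()
        if u1 + u2 <= 1.0:
            break
    x = A[0] + u1 * (B[0] - A[0]) + u2 * (C[0] - A[0])
    y = A[1] + u1 * (B[1] - A[1]) + u2 * (C[1] - A[1])
    return x, y

def sample_distances(A, B, C, n, seed=0):
    """Distances from the reference vertex A to n sampled points."""
    rng = random.Random(seed)                 # fixed seed for reproducibility
    out = []
    for _ in range(n):
        x, y = sample_point(A, B, C, rng)
        out.append(math.hypot(x - A[0], y - A[1]))
    return out

# Acute case (a) from the text: 5,000 distances from A = (0, 0).
distances = sample_distances((0.0, 0.0), (5.0, 0.0), (2.0, 3.5), 5000)
```

Rejection discards about half of the candidate pairs; reflecting a rejected pair to (1 - u1, 1 - u2) is a common alternative that keeps every draw.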
We compute the theoretical CDF and PDF of D by formulae (8) and (13), respectively. The theoretical CDF and PDF are compared with the empirical CDF and PDF estimated by an adaptive kernel density estimator [21], as implemented in the MATLAB package "Kernel Density Estimator".
As shown in Figures 5(a)-5(e), the empirical CDF and PDF match their theoretical counterparts very closely in all five cases.
Next, we compare the empirical moments with the theoretical results defined by (21). Here we define the empirical moment as the maximum likelihood estimator of the kth population moment, which is calculated by averaging the kth powers of the n sampled distances, (1/n) Σ Di^k. Relative errors of the sample moments and the sample standard deviation (as compared to their theoretical counterparts) are summarized in Table 1. We include the relative errors of the standard deviation in this table as well because it is a commonly used characteristic.
More specifically, we define the relative error rate of the kth moment as the absolute difference between the empirical and the theoretical kth moment divided by the theoretical kth moment, expressed as a percentage. The relative error rate for the standard deviation is defined similarly. We then repeat the above simulation experiments M = 100 times and report the mean and the maximum of the relative error rates of each moment and of the standard deviation over the M repetitions in Table 1. From this table, we observe that the largest average error rate is less than 2% and the largest maximum error rate is less than 10%. All these simulation results show the consistency between the theoretical results and those estimated from the simulations.
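The repetition scheme described above can be sketched as follows. This is an illustration under stated assumptions: the relative error is taken as 100 * |estimate - truth| / truth, the triangle is a hypothetical isosceles right triangle with reference vertex (0, 0), and its first two theoretical moments are hard-coded from a polar-coordinate calculation for that specific triangle. All names are hypothetical, and fewer repetitions than in the paper are suggested to keep the run short.

```python
import math
import random

# Hypothetical test triangle: right angle at the reference vertex A.
A, B, C = (0.0, 0.0), (1.0, -1.0), (1.0, 1.0)
# Theoretical E[D] and E[D^2] for this triangle only (assumed closed forms).
THEORY = {1: (math.sqrt(2.0) + math.asinh(1.0)) / 3.0, 2: 2.0 / 3.0}

def sample_distance(rng):
    """Distance from A to one uniformly sampled point of the triangle."""
    while True:
        u1, u2 = rng.random(), rng.random()
        if u1 + u2 <= 1.0:
            break
    x = A[0] + u1 * (B[0] - A[0]) + u2 * (C[0] - A[0])
    y = A[1] + u1 * (B[1] - A[1]) + u2 * (C[1] - A[1])
    return math.hypot(x - A[0], y - A[1])

def empirical_moment(distances, k):
    """Sample average of the k-th powers of the distances."""
    return sum(d ** k for d in distances) / len(distances)

def relative_error_percent(estimate, truth):
    """Assumed definition: absolute relative deviation in percent."""
    return 100.0 * abs(estimate - truth) / abs(truth)

def run_experiment(repetitions, n, seed=0):
    """Return {k: (mean error %, max error %)} over the repetitions."""
    rng = random.Random(seed)
    errors = {k: [] for k in THEORY}
    for _ in range(repetitions):
        ds = [sample_distance(rng) for _ in range(n)]
        for k, truth in THEORY.items():
            errors[k].append(relative_error_percent(empirical_moment(ds, k), truth))
    return {k: (sum(v) / len(v), max(v)) for k, v in errors.items()}
```

A call such as `run_experiment(20, 5000, seed=1)` returns, for each moment order, the average and the maximum error rate over the repetitions, which is the pair of summary statistics reported per triangle in Table 1.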
As in any simulation study, the error rate is a decreasing function of the number of sampling points (n), and it is important to understand the rate of convergence. Since we observe from Table 1 that the error rate is the largest for the obtuse case, we focus on this case in the following additional simulation studies. More specifically, we let n vary from 500 to 20,000 and record the observed average and maximum error rates in Figure 6. We see that the error rates decrease approximately at the O(n^(−1/2)) rate, which is consistent with the central limit theorem. We also observe that, in general, the higher the order of a moment, the larger its error rate. Based on this empirical evidence, if we want the maximum error rate to be no more than 5% for the 4th-order moment (the most difficult case), we should choose n no less than 13,000. As a remark, we also repeated these simulation studies for other shapes of triangles, and the patterns of the error rate curves are very similar.
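The least-squares fit of a curve of the form C * n^(-1/2) to observed error rates, and the sample size implied by a target error, can be sketched as follows. This is a minimal illustration: the one-parameter fit has the closed form C = sum(e_i / sqrt(n_i)) / sum(1 / n_i), the function names are hypothetical, and no numbers from the paper's Figure 6 are used.

```python
import math

def fit_rate_constant(ns, errors):
    """Least-squares estimate of C in the model error ~ C * n**(-1/2)."""
    numerator = sum(e / math.sqrt(n) for n, e in zip(ns, errors))
    denominator = sum(1.0 / n for n in ns)
    return numerator / denominator

def required_n(c_hat, target_error):
    """Smallest n for which the fitted curve C * n**(-1/2) is at most the
    target error (same units as the errors used in the fit)."""
    return math.ceil((c_hat / target_error) ** 2)

# Synthetic, noise-free errors generated from C = 3, purely for illustration.
ns = [500, 1000, 5000, 20000]
errors = [3.0 / math.sqrt(n) for n in ns]
c_hat = fit_rate_constant(ns, errors)
```

With a hypothetical fitted constant of 570 (in percent units), `required_n(570.0, 5.0)` evaluates to 12996, which shows how a threshold of the order of 13,000 points can arise from a 5% target; the constant itself is an assumption, not a value reported in the paper.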

Conclusion
Studying the statistical properties of the distance (denoted as D throughout this manuscript) from a fixed vertex of a triangle to a random point that is uniformly distributed in the interior of this triangle is important because many distance-optimization problems depend on Voronoi decomposition, in which the entire plane is divided into polygons called Voronoi cells, which can be further decomposed into triangles. This type of optimization has many applications in areas such as wireless communications, ecology, and distribution management.
In recent years, researchers have developed distinct CDF and PDF formulae for two types of triangles, namely, the inside altitude (acute) and outside altitude (obtuse) cases. Without a unified formula, it is difficult to derive useful statistical quantities such as the moments and standard deviation of D. In this paper, we consolidated the two special cases and gave a unified formula for the exact CDF and PDF of D. Our formula is consistent with the results obtained in [15]. The unified CDF/PDF formulae reduce the computational burden significantly and help us derive population moments of D of arbitrary order (see (20)); we also gave the exact formulae for the first four moments (see (21)). The reparametrization technique we use in consolidating the acute and obtuse cases may also be useful in similar research projects.
With our new PDF formula, the distribution of the distance between any point and a random point within an arbitrary polygon can be easily built using a method based on piecing triangles together [15,16,22,23]. In this manuscript, we derived the moment formulae for some polygons with special shapes, such as rectangles, regular n-sided polygons, and discs as limiting cases of regular n-sided polygons. Our results are consistent with those derived in prior studies [2,4,18].
We conducted Monte Carlo simulation studies to verify the theoretical results derived in this study and to provide empirical evidence on how fast the observed error rate converges to zero. These results conform to the convergence rate predicted by the central limit theorem.

Figure 1: For wireless communication networks, the entire space can be decomposed into disjoint triangles for the analysis of distance in two steps: (a) build a Voronoi diagram of which the base stations are the seeds and (b) decompose each Voronoi cell into disjoint triangles with the base station (black point) serving as their common vertex.

Figure 2: An illustration of the acute and obtuse cases. (a) The acute (inside altitude) case. (b) The obtuse (outside altitude) case. (c) Two cases can be merged.

Case 2. If ℎ < d ≤ |AB|, as shown in Figure 3(b), the CDF of D is the intersection area divided by Ω. When the additional coefficient equals 0, the intersection of the triangle with the disc is the deep blue area in Figure 3(b). When the coefficient equals 2, it is the union of both the light and deep blue areas.

Case 3. If |AB| < d ≤ |AC|, as shown in Figure 3(c), the CDF of D is the shaded area divided by Ω.

Figure 4: An illustration of five different shapes of triangles used in the simulation studies. (a) The generic inside altitude (acute) case. (b) The generic outside altitude (obtuse) case. (c) The isosceles triangle case. (d) The generic right triangle case. (e) The isosceles right triangle case.

Figure 5: Simulation versus theory curve. (a) The acute triangle. (b) The obtuse triangle. (c) The isosceles triangle. (d) The right triangle. (e) The isosceles right triangle.

Figure 6: Average and maximum error of moments calculated with the simulation method for the obtuse triangle case. Curves with "zig-zags" are the observed error rates, and the corresponding smooth curves are trajectories of functions Ĉ n^(−1/2) representing the O(n^(−1/2)) rate of convergence as predicted by the central limit theorem. Here Ĉ is estimated by the least squares fitting criterion. (a) Average error. (b) Maximum error.

Table 1: Error of the simulation results and the theory results. Here IsosRight means the isosceles right triangle case. All error rates are reported as percentages. In each round of simulation studies, n = 5000 points are sampled within each triangle, and the summary statistics reported in this table are calculated from M = 100 repetitions.