On the One-Dimensional Poisson Random Geometric Graph

Given a Poisson process on a bounded interval, its random geometric graph is the graph whose vertices are the points of the Poisson process, with an edge between two points if and only if their distance is below a fixed threshold. We compute explicitly the distribution of the number of connected components of this graph. The proof relies on the inversion of some Laplace transforms.


Motivation
As technology advances [1,2,3], one can expect a wide expansion of so-called sensor networks. Such networks represent the next evolutionary step in building, utilities, industrial, home, agriculture, defense and many other contexts [4].
These networks are built upon a multitude of small and cheap sensors, which are devices with limited transmission capabilities. Each sensor monitors a region around itself by measuring some environmental quantities (e.g., temperature, humidity), detecting intrusions, etc., and broadcasts its collected information to other sensors or to a central node. The question of whether information can be shared across the whole network is then of crucial importance.
Much research has recently been devoted to this problem, considering a variety of situations. Three main scenarios can be distinguished: those where it is possible to choose the position of each sensor, those where sensors are arbitrarily deployed in the target region under the control of a central station, and those where the sensor locations are random in a decentralized system.
The problem with the first scenario is that, in many cases, placing the sensors is impossible or too costly. Sometimes the cost of placing each sensor is too large, and sometimes the network has an inherently random behavior (as in the ad hoc case, where users move). In addition, this policy cannot adapt the configuration of the network in case of failure of some sensor.
The drawback of the second scenario is a higher unit cost of sensors, since each one has to communicate with the central station. Besides, the central station itself increases the cost of the whole system. Moreover, if sensors are supposed to know their positions, an absolute positioning system has to be included in each sensor, making their hardware even more complex and hence more expensive.
It is thus important to investigate the third scenario: randomly located sensors, no central station. Indeed, if we can predict some characteristics of the topology of a random network, the number of sensors (or, likewise, their power supply) can be determined a priori so that a given network operates with high probability. For instance, we can choose the mean number of sensors such that, if they are randomly deployed, the network is completely connected with probability greater than 99%.
Usually, sensors are deployed in the plane or in the ambient space; mathematically speaking, one then has to deal with configurations in R^2, R^3 or a manifold. The recent works of Ghrist and his collaborators [5,6] show how, in any dimension, algebraic topology can be used to compute the coverage of a given configuration of sensors. Trying to pursue their work in random settings, we quickly realized that the dimension of the ambient space plays a key role. We therefore began with the analysis of dimension 1, which appeared to be the simplest situation. There is no need here for the sophisticated tools of algebraic topology. However, the problem of coverage on a finite-length interval does not seem to have been solved to the full extent we do here. Higher dimensions will be the object of forthcoming papers.
We here address the situation where the radio communications are sufficiently polarized so that we can consider that there is a privileged dimension. Random coverage in one dimension has already been studied in different contexts. Some years ago, several analyses were carried out on the circle ([7,8] and references therein) for a fixed number of points uniformly distributed over the circle; the question addressed was that of full coverage. More recently, [9] gives efficient algorithms to determine whether a region is covered when the sensors are deployed over a circle and distributed as a Poisson point process. In [7,10], the distribution of a fixed number of clusters (see below for the definition) is given. In [11], sensors are placed in a plane and have a fixed radius of observation; the trace of the covered regions over a line is then studied.
Our main result is the distribution of the number of connected components for a Poisson distribution of sensors in a bounded interval. Our method is very much related to queueing theory. Indeed, clusters, i.e., sequences of neighboring sensors, are the strict analogues of busy periods. As will appear below, our analysis turns out to be that of an M/D/1/1 queue with preemption: when a customer arrives during a service, it preempts the server and, since there is no buffer, the customer who was in service is removed from the queueing system. To the best of our knowledge, such a system has never been studied, but the usual methods of Laplace transforms and renewal processes work perfectly and, with a bit of calculus, one can compute all the characteristics we are interested in.
The paper is organized as follows: Section II presents the physical and random assumptions and defines the relevant quantities to be calculated. The calculations and analytical results are presented in Section III. In Section IV, two other scenarios are considered: the number of incomplete clusters and clusters placed in a circle. In Section V, numerical examples are presented and analyzed.

Problem Formulation
Let L > 0 be the length of the domain in which sensors are located. We assume that sensors are distributed according to a Poisson process of intensity λ. Let (X_i, i ≥ 1) be the ordered positions of the sensors. The random variables ∆X_i = X_{i+1} − X_i are thus i.i.d. and exponentially distributed. Due to their technological limitations, each sensor can communicate only with other sensors within a range ǫ: two sensors, located respectively at x and y, are said to be directly connected whenever |x − y| ≤ ǫ. For i < j, two sensors located at X_i and X_j are indirectly connected if X_l and X_{l+1} are directly connected for every l = i, ..., j − 1. A maximal set of sensors directly or indirectly connected is called a cluster, and the connectivity of the whole network is measured by the number of clusters.
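To make these definitions concrete, the clustering of a sampled configuration can be sketched in a few lines of code (an illustration, not part of the analysis; function names are ours): points are drawn from i.i.d. exponential gaps and grouped by cutting at every gap larger than ǫ.

```python
import random

def poisson_points(lam, L, rng):
    """Sample the points of a Poisson process of intensity lam on [0, L],
    built from i.i.d. exponential inter-point distances."""
    points, x = [], rng.expovariate(lam)
    while x <= L:
        points.append(x)
        x += rng.expovariate(lam)
    return points

def clusters(points, eps):
    """Group a sorted list of points into clusters: consecutive points
    belong to the same cluster iff their distance is at most eps."""
    groups = []
    for x in points:
        if groups and x - groups[-1][-1] <= eps:
            groups[-1].append(x)
        else:
            groups.append([x])
    return groups

rng = random.Random(0)
pts = poisson_points(lam=2.0, L=10.0, rng=rng)
print(len(clusters(pts, eps=0.5)))  # number of clusters in this sample
```

Note that this routine counts every cluster intersecting [0, L]; the quantity β_0(L) studied below counts only the complete ones.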
The number of points in the interval [0, x] is denoted by N_x = Σ_{n≥1} 1_{{X_n ≤ x}}. The random variable A_i represents the beginning of the i-th cluster, denoted by C_i; in the same way, E_i denotes the end of this same cluster. The i-th cluster C_i thus contains N_{E_i} − N_{A_i} points. We define the length B_i of C_i as E_i − A_i. The intercluster size D_i is the distance between the end of C_i and the beginning of C_{i+1}, which means that D_i = A_{i+1} − E_i, and ∆A_i = A_{i+1} − A_i is the distance between the first points of two consecutive clusters.

Figure 1. Queueing representation of the proposed problem. A down arrow denotes that user i starts to be served. An up arrow indicates that user i leaves the system without having finished its service. A double up arrow indicates that the service of user i finishes. The beginning and the end of the i-th busy period, respectively A_i and E_i, are also shown.

The sensor network can thus be interpreted as a queueing system. In this non-conservative system, the service time is deterministic and equal to ǫ. When a customer arrives during a service, the served customer is removed from the system and replaced by the arriving customer. Within this framework, a cluster corresponds to what is called a busy period, and the intercluster size is the idle time.
The number of complete clusters in [0, L] corresponds to the number of connected components β_0(L) of the network (since in dimension 1, it coincides with the Euler characteristic of the union of intervals, see [12,5]). The distance between the beginning of the first cluster and the beginning of the (i + 1)-th one is defined as U_i = Σ_{k=1}^{i} ∆A_k. We also define ∆X_0 = D_0 = X_1. Fig. 2 illustrates these definitions.

Lemma 1. For any i ∈ N*, A_i and E_i are stopping times.

Figure 2. Definitions of the relevant quantities of the network: distance between points, distance between clusters, size of clusters, intercluster sizes, beginnings of clusters and ends of clusters.
Proof. Let us consider the filtration F_t = σ{N_a, a ≤ t}. For i = 1, the event {A_1 ≤ t} is measurable with respect to F_t, so A_1 is a stopping time. The same argument shows that A_2 is also a stopping time, and we proceed along the same lines for the other A_i, as well as for the E_i, to prove that they are stopping times.
Since N is a strong Markov process, the next corollary is immediate.

Corollary 2. The random variables (D_i, B_i)_{i≥1} form an independent family. Moreover, each D_i is distributed as an exponential random variable with mean 1/λ, and the random variables B_i are identically distributed.

Calculations
Theorem 3. The Laplace transform of the distribution of B_i is given by

E[e^{-sB_i}] = (λ + s) e^{-(λ+s)ǫ} / (s + λ e^{-(λ+s)ǫ}).

Proof. Since ∆X_j is an exponentially distributed random variable,

E[e^{-s∆X_j} 1_{{∆X_j ≤ ǫ}}] = (λ/(λ+s)) (1 − e^{-(λ+s)ǫ})  and  Pr(∆X_j > ǫ) = e^{-λǫ}.

A cluster containing k points consists of k − 1 consecutive gaps not larger than ǫ followed by a gap larger than ǫ, and the corresponding busy period ends ǫ after its last point. Hence, the Laplace transform of the distribution of B_1 is given by

E[e^{-sB_1}] = e^{-sǫ} Σ_{k≥1} [(λ/(λ+s))(1 − e^{-(λ+s)ǫ})]^{k−1} e^{-λǫ} = (λ + s) e^{-(λ+s)ǫ} / (s + λ e^{-(λ+s)ǫ}).

Using Corollary 2, we have E[e^{-sB_1}] = E[e^{-sB_i}], which concludes the proof.
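Differentiating the transform of Theorem 3 at s = 0 gives the mean busy period E[B_i] = (e^{λǫ} − 1)/λ. A short Monte Carlo check (illustrative; variable names are ours) compares this value with the empirical mean of simulated busy periods:

```python
import math
import random

def sample_busy_period(lam, eps, rng):
    """One busy period: gaps <= eps extend the cluster; the first gap
    larger than eps ends it, eps after the last arrival."""
    length = eps
    while True:
        gap = rng.expovariate(lam)
        if gap > eps:
            return length
        length += gap

lam, eps, n = 2.0, 1.0, 100_000
rng = random.Random(42)
empirical = sum(sample_busy_period(lam, eps, rng) for _ in range(n)) / n
theoretical = (math.exp(lam * eps) - 1) / lam  # mean busy period
print(empirical, theoretical)
```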
From this result, we can immediately calculate the Laplace transform of the distribution of ∆A_i: since ∆A_i = B_i + D_i, using Corollary 2 and Theorem 3,

E[e^{-s∆A_i}] = λ e^{-(λ+s)ǫ} / (s + λ e^{-(λ+s)ǫ}).

Corollary 4. The Laplace transform of the distribution of U_n is

E[e^{-sU_n}] = (λ e^{-(λ+s)ǫ})^n / (s + λ e^{-(λ+s)ǫ})^n.

Proof. Since U_n = Σ_{k=1}^{n} ∆A_k is a sum of i.i.d. random variables, its Laplace transform is the n-th power of that of ∆A_1.

Let us define the function p_n as p_n(x) = Pr(β_0(x) = n), i.e., p_n(x) is the probability of having n complete clusters in the interval [0, x]. Since 0 ≤ p_n(x) ≤ 1 for all x ∈ R_+, the Laplace transform of p_n with respect to x is well defined.
Theorem 5. For any n ≥ 0, the Laplace transform of p_n is given by

∫_0^∞ e^{-sx} p_n(x) dx = λ^n e^{-n(λ+s)ǫ} / (s + λ e^{-(λ+s)ǫ})^{n+1}.

Figure 3. Illustration of the condition equivalent to β_0 ≥ n.
The proof is thus complete.

Lemma 6. For any positive integer m, E[β_0(x)^m] tends to E[N_x^m] as ǫ goes to 0.

Proof. Since the points of the process are almost surely distinct, for almost all sample-paths there exists η > 0 such that ∆X_j ≥ η for any j = 1, ..., N_x. Hence, for ǫ < η, β_0(x) = N_x. This implies that β_0(x) tends almost surely to N_x as ǫ goes to 0. Moreover, it is immediate from the very definition of β_0(x) that β_0(x) ≤ N_x. Since E[N_x^m] is finite for any m, the proof follows by dominated convergence.
Let Li_t(z), z, t ∈ R, z < 1, be the polylogarithm function with parameter t, defined by Li_t(z) = Σ_{k=1}^∞ z^k / k^t. For m a positive integer, consider the function of x given by x ↦ (x − mǫ)^m 1_{{x ≥ mǫ}}. Its Laplace transform is given by m! e^{-mǫs} / s^{m+1}.

Corollary 7. The Laplace transform of the m-th moment of β_0(L) is

∫_0^∞ e^{-sL} E[β_0(L)^m] dL = Li_{-m}( λ e^{-(λ+s)ǫ} / (s + λ e^{-(λ+s)ǫ}) ) / (s + λ e^{-(λ+s)ǫ}),

which converges provided that the argument of the polylogarithm is smaller than 1 in modulus, in particular for all real s > 0.
Proof. Applying the Laplace transform to both sides of Eq. (4) concludes the proof.
We define S(m, k) as the Stirling number of the second kind [13], i.e., S(m, k) is the number of ways to partition a set of m objects into k groups. These numbers are intimately related to the polylogarithm by the following identity (see [14]), valid for any positive integer m:

Li_{-m}(z) = Σ_{k=1}^{m} k! S(m, k) z^k / (1 − z)^{k+1}.    (6)

Theorem 8. For any positive integer m,

M_m(L) = E[β_0(L)^m] = Σ_{k=1}^{m} S(m, k) λ^k e^{-kλǫ} (L − kǫ)^k 1_{{L ≥ kǫ}}.    (7)

Proof. Using (6) in the result of Corollary 7, we get an expansion of the Laplace transform of the m-th moment whose integer coefficients c_{k,m} are given in terms of Stirling numbers. Using a classical identity of Stirling numbers [15] and inverting each term of the expansion, we obtain, for any λ > 0, the expression (7) for any positive integer m.
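Identity (6) can be checked numerically against the defining series Li_{-m}(z) = Σ_{k≥1} k^m z^k; the sketch below (ours) computes the Stirling numbers by inclusion-exclusion:

```python
import math

def stirling2(m, k):
    """Stirling number of the second kind S(m, k), by inclusion-exclusion."""
    return sum((-1) ** (k - j) * math.comb(k, j) * j ** m
               for j in range(k + 1)) // math.factorial(k)

def polylog_neg_series(m, z, terms=400):
    """Li_{-m}(z) = sum_{k>=1} k^m z^k, truncated (valid for |z| < 1)."""
    return sum(k ** m * z ** k for k in range(1, terms + 1))

def polylog_neg_stirling(m, z):
    """Right-hand side of identity (6)."""
    return sum(math.factorial(k) * stirling2(m, k) * z ** k / (1 - z) ** (k + 1)
               for k in range(1, m + 1))

# The two expressions agree for every positive integer m
for m in (1, 2, 3, 4):
    assert abs(polylog_neg_series(m, 0.3) - polylog_neg_stirling(m, 0.3)) < 1e-9
```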
Theorem 9. For any n, L, λ and ǫ, we have:

Pr(β_0(L) = n) = Σ_{k=n}^{⌊L/ǫ⌋} (−1)^{k−n} C(k, n) (λ e^{-λǫ})^k (L − kǫ)^k / k!.    (8)

Proof. Since β_0(L) ≤ N_L and since E[e^{sN_L}] is finite for any s ∈ R, the generating function E[e^{sβ_0(L)}] is finite for any s ≥ 0. Rearranging the terms of the right-hand side of its series expansion and substituting the moments M_m(L) by the result of Eq. (7), we obtain an expansion in powers of λe^{-λǫ}. Using a further identity of Stirling numbers (see [15]) and inverting the Laplace transforms term by term (δ_a denoting the Dirac measure at point a), some simple algebra yields the above expression for the probability that an interval contains n complete clusters, concluding the proof.
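The resulting distribution is a finite alternating sum and is easy to evaluate. The sketch below (ours; it assumes the closed form Pr(β_0(L) = n) = Σ_{k=n}^{⌊L/ǫ⌋} (−1)^{k−n} C(k,n) (λe^{-λǫ})^k (L − kǫ)^k / k!, consistent with the Laplace transform of p_n) checks that the probabilities sum to one and reproduce the mean λ(L − ǫ)e^{-λǫ}:

```python
import math

def p_n(n, lam, eps, L):
    """Pr(beta_0(L) = n), written as a finite alternating sum
    (closed form assumed here, obtained by Laplace inversion)."""
    kmax = int(L / eps)
    q = lam * math.exp(-lam * eps)
    return sum((-1) ** (k - n) * math.comb(k, n) * q ** k
               * (L - k * eps) ** k / math.factorial(k)
               for k in range(n, kmax + 1))

lam, eps, L = 2.0, 1.0, 4.0
probs = [p_n(n, lam, eps, L) for n in range(int(L / eps) + 1)]
mean = sum(n * p for n, p in enumerate(probs))
print(probs)
print(mean, lam * (L - eps) * math.exp(-lam * eps))  # the two values coincide
```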
Lemma 10. For x ≥ 0, p_n(x) has the three following properties: i) it is continuous, and differentiable at every x such that x/ǫ is not an integer; ii) it tends to 0 as x goes to infinity; iii) so does its derivative.

Proof. Let j be a non-negative integer. The function p_n is obviously differentiable when x/ǫ ≠ j. Besides, the term entering the sum in Eq. (8) at x/ǫ = j vanishes at that point, so p_n is also continuous there, which proves i). Items ii) and iii) are direct consequences of the Final Value theorem applied to the Laplace transforms of p_n and of its derivative.
The expression of p_n gives us a Laplace pair between the x and s domains. We can use this relation to find the distributions of B_i and U_n.

Theorem 11. The distribution of B_i has an atom of mass e^{-λǫ} at x = ǫ and, for x > ǫ, the density

f_{B_i}(x) = e^{-λǫ} ( λ p_0(x − ǫ) + (d/dx) p_0(x − ǫ) ),

while the distribution of U_n has the density

f_{U_n}(x) = λ e^{-λǫ} p_{n−1}(x − ǫ),

where the expressions of p_0(x − ǫ) and (d/dx) p_0(x − ǫ) are straightforwardly obtained from Eq. (8).
We can also obtain the probability that the segment [0, L] is completely covered by the sensors. To do so, recall that the first point (if there is one) covers the interval [X_1 − ǫ, X_1 + ǫ].
Theorem 12. The probability that [0, L] is completely covered can be written in closed form by means of auxiliary functions R_{m,n}(x).

Proof. The condition of total coverage can be expressed in terms of X_1 and B_1. Since B_1 and X_1 are independent, the corresponding probability factorizes, and the result then follows from Lemma 10 and some tedious but straightforward algebra.
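Since the closed form is somewhat heavy, the total-coverage probability can also be estimated directly by simulation. The sketch below (ours) assumes, as above, that a sensor at X covers [X − ǫ, X + ǫ]:

```python
import random

def covered(points, eps, L):
    """True iff the union of the intervals [x - eps, x + eps], over the
    sorted sensor positions, covers the whole segment [0, L]."""
    if not points or points[0] > eps or L - points[-1] > eps:
        return False
    return all(b - a <= 2 * eps for a, b in zip(points, points[1:]))

def coverage_probability(lam, eps, L, n_trials, rng):
    """Monte Carlo estimate of Pr([0, L] completely covered)."""
    hits = 0
    for _ in range(n_trials):
        pts, x = [], rng.expovariate(lam)
        while x <= L:
            pts.append(x)
            x += rng.expovariate(lam)
        hits += covered(pts, eps, L)
    return hits / n_trials

rng = random.Random(1)
print(coverage_probability(lam=3.0, eps=0.5, L=4.0, n_trials=20_000, rng=rng))
```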

Other Scenarios
The method can be used to calculate p n for other definitions of the number of clusters. We consider two other definitions: the number of incomplete clusters and the number of clusters in a circle.

4.1. Number of incomplete clusters. The major difference with Sec. 3 is that a cluster is now taken into account as soon as one of its points lies inside the interval [0, L]. So, for instance, in Fig. 3, we actually count n + 1 incomplete clusters. We define β′_0(L) as the number of incomplete clusters in an interval [0, L].

Theorem 13. The distribution of β′_0(L) admits a closed form, expressed by means of an auxiliary function G(k).

Proof. The condition β′_0(L) ≥ n is now expressed through random variables Y_n defined from the cluster beginnings. Repeating the same calculations, we find the Laplace transform of Pr(β′_0(·) = n). With this expression, following the lines of Lemma 6, we rewrite it so as to exhibit a well-known Laplace transform inverse, and invert it. Expanding the Laplace transform of the distribution of β′_0(L) in a Taylor series and rearranging terms, then using another recurrence obeyed by Stirling numbers [15], we invert the resulting expression for any non-negative integer n, which gives the desired distribution.

4.2. Number of clusters in a circle. We now investigate the case where the points of the process are deployed over a circumference and we want to count the number of complete clusters, which corresponds to calculating the Euler characteristic of the total coverage; we therefore call this quantity χ. Without loss of generality, we can choose an arbitrary point to be the origin.
Theorem 14. The distribution of the Euler characteristic χ(L), when the points are deployed over a circumference of length L, is given, for n ≥ 0, by the following expression.
Proof. If there are no points on the circle, χ(L) = 0. Otherwise, if there is at least one point, we choose the origin at this point and we have an equivalence between the events; Fig. 4 presents an example of this equivalence. We can then define Y_n accordingly. The number of clusters is almost surely equal to the number of points when ǫ → 0. Expanding the Laplace transform in a Taylor series and rearranging terms, as we did previously, we can directly invert this Laplace transform; adding the case χ(L) = 0 where there are no points proves the theorem.

Examples
We consider some examples to illustrate the results of the paper. Here, the behavior of the mean and the variance of β_0(L), as well as Pr(β_0(L) = n), are presented.
From Eq. (7), we have that E[β_0(L)] is given by

E[β_0(L)] = λ(L − ǫ) e^{-λǫ}  for L ≥ ǫ.

This expression agrees with the intuition that there are three typical regions for a fixed ǫ. When λ is much smaller than 1/ǫ, the number of clusters is approximately the number of sensors, since connections between sensors are unlikely; indeed, E[β_0(L)] → Lλ when λ → 0. As we increase λ, the mean number of direct connections overcomes the mean number of sensors and, beyond some value of λ, E[β_0(L)] decreases, since adding a point is then likely to connect disconnected clusters. We remark that the maximum occurs exactly at λ = 1/ǫ, i.e., when the mean distance between two sensors equals the threshold distance for them to be connected. At this maximum, E[β_0(L)] takes the value (L/ǫ − 1)e^{-1}. Finally, when λ is very large, all sensors tend to be connected in a single cluster which even extends beyond L, so that there is no complete cluster inside the interval [0, L]; this is apparent by letting λ → ∞ in the last expression. Figure 5 shows this behavior for L = 4 and ǫ = 1.

The variance can also be obtained from Eq. (7): under the condition that L > 2ǫ,

Var(β_0(L)) = (L − ǫ)λe^{-ǫλ} + ǫ(3ǫ − 2L)λ²e^{-2ǫλ}.

Fig. 6 shows a plot of Var(β_0(L)) as a function of λ for L = 4 and ǫ = 1. When λ is small compared to 1/ǫ, the plot is approximately linear, since there are few connections in the network and the variance of the number of clusters is close to the variance λL of the number of sensors. Since β_0(L) tends almost surely to 0 when λ goes to infinity, Var(β_0(L)) also tends to 0 in this case. Both properties are observed in the plot. Besides, the critical points of this function are λ = 1/ǫ and the solutions of

λe^{-λǫ} = (L − ǫ) / (2ǫ(2L − 3ǫ)).

By using the second derivative, we see that λ = 1/ǫ is actually a minimum. Besides, if L ≤ 2ǫ, there is just one critical point, a maximum, at λ = 1/ǫ.
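Both the mean λ(L − ǫ)e^{-λǫ} (which reproduces the limit Lλ as λ → 0 and the maximum (L/ǫ − 1)e^{-1} at λ = 1/ǫ) and the variance formula above can be checked by a direct simulation (a sketch, ours). A complete cluster is counted through its last point: a point x with x ≤ L − ǫ whose gap to the next arrival exceeds ǫ ends a busy period inside [0, L].

```python
import math
import random

def complete_clusters(lam, eps, L, rng):
    """Count complete clusters in [0, L]: a point x with x <= L - eps whose
    gap to the next arrival exceeds eps ends a busy period inside [0, L]."""
    count = 0
    x = rng.expovariate(lam)
    while x <= L:
        nxt = x + rng.expovariate(lam)
        if x <= L - eps and nxt - x > eps:
            count += 1
        x = nxt
    return count

lam, eps, L, n = 2.0, 1.0, 4.0, 100_000
rng = random.Random(7)
samples = [complete_clusters(lam, eps, L, rng) for _ in range(n)]
mean = sum(samples) / n
var = sum((s - mean) ** 2 for s in samples) / n
mean_th = lam * (L - eps) * math.exp(-lam * eps)
var_th = (L - eps) * lam * math.exp(-eps * lam) \
    + eps * (3 * eps - 2 * L) * lam ** 2 * math.exp(-2 * eps * lam)
print(mean, mean_th)  # both close to 0.812
print(var, var_th)    # both close to 0.446
```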
Those expressions are simple and have at most four terms, since L = 4ǫ. We plot these functions in Fig. 7. The critical points at λ = 1/ǫ on those plots are explained by the fact that, as a function of λ, for every n, Pr(χ(L) = n) can be represented as a sum Σ_{i=0}^{j} q_{i,j} (λe^{-λǫ})^i, where the coefficients q_{i,j} do not depend on λ. Since (λe^{-λǫ})^i has a critical point at λ = 1/ǫ for all i > 0, this is also a critical point of Pr(χ(L) = n). If λ is small, we expect Pr(χ(L) = 0) to be close to one, since it is likely that there are no points; for this reason, in this region, Pr(χ(L) = n) is small for n > 0. When λ is large, we expect very large clusters, likely to be larger than L, so it is unlikely to have a complete cluster in the interval and, again, Pr(χ(L) = 0) approaches unity, while Pr(χ(L) = n) for n > 0 again becomes small.