Normal Limiting Distribution of the Size of Binary Interval Trees

The limiting distribution of the size of binary interval trees is investigated. Our approach is based on the contraction method and differs substantially from the one used for one-sided binary interval trees. First, we build a distributional recursive equation for the size. Then, we derive the expectation, the variance, and some higher-order moments. Finally, we show that the size, suitably standardized, converges to the standard normal random variable in the Zolotarev metric.


Introduction
Random trees are usually generated by combinatorial rules and also arise in the analysis of algorithms in computer science. There are many kinds of random trees with different structures, such as recursive trees, search trees, binary trees, and interval trees. The asymptotic probabilistic behavior of random variables defined on random trees has attracted growing attention and has become a popular research area. Drmota [1] introduced several labelled and unlabelled random trees in his book. Devroye and Janson [2] studied protected nodes in several random trees. Feng and Hu [3] investigated the phase changes of scale-free trees. The limiting laws for the height, size, and subtrees of binary search trees have also been considered (see [4][5][6]). Other researchers have investigated the Zagreb index and the nodes of random recursive trees (see [7][8][9]).
The binary interval tree is a random structure that underlies the random division of a line interval and parking problems, and it has recently become a popular subject. Sibuya and Itoh [10] showed that the numbers of internal and external nodes at different levels of the binary interval tree are asymptotically normal; the asymptotic normality of the size of the tree, however, cannot be obtained directly from this result. Prodinger [11] looked into various parameters of the incomplete trie, a one-sided version of a random tree with a digital flavor. Fill et al. [12] followed with a study of the nonexistence of a limit distribution for the height of the incomplete trie. Itoh and Mahmoud [13] considered five incomplete one-sided variants of binary interval trees and proved that their sizes are all asymptotically normal. Janson [14] obtained the same result for a larger class of one-sided interval trees by renewal theory, and one kind of fragmentation tree was discussed by Janson and Neininger [15]. Javanian et al. [16] investigated the paths in m-ary interval trees. Su et al. [17] studied complete binary interval trees and obtained a law of large numbers. In addition, Pan et al. [18] considered a construction algorithm for binary interval trees.
The binary interval tree is a tree associated with repeated divisions of a line interval of length $x$. The division process is as follows. If $x < 1$, no division takes place, and the associated interval tree consists of a single terminal node. If $x \ge 1$, we begin with the interval $(0, x)$ and divide it into two subintervals at $U_x$, a point uniformly distributed over $(0, x)$. This gives two intervals, $(0, U_x)$ and $(U_x, x)$. Each subinterval of length at least 1 is in turn divided at a point chosen uniformly over it, producing two smaller subintervals as before; a subinterval of length less than 1 is not divided further. The process is repeated until every interval (or subinterval) has length less than 1.
We take $x = 4$, for instance. Figures 1(a) and 1(b) show how the above random division process generates a binary interval tree.
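The division process described above can be sketched as a short simulation. This is a minimal illustration, not the authors' code: the function name `sample_size` and the use of an explicit stack of interval lengths are choices made here, and only the rule "divide at a uniform point while the length is at least 1" comes from the text.

```python
import random


def sample_size(x, rng=random):
    """Draw one sample of the size (total number of nodes) of a binary
    interval tree grown from an interval of length x.

    Hypothetical helper for illustration; `rng` is the `random` module or
    a `random.Random` instance.
    """
    size = 0
    stack = [x]                      # lengths of intervals still to be visited
    while stack:
        length = stack.pop()
        size += 1                    # every interval is a node of the tree
        if length >= 1:              # intervals shorter than 1 are terminal
            u = rng.uniform(0.0, length)   # uniform division point
            stack.append(u)                # left subinterval (0, u)
            stack.append(length - u)       # right subinterval (u, length)
    return size


if __name__ == "__main__":
    rng = random.Random(12345)
    # A few independent realizations for x = 4, the value used in Figure 1.
    print([sample_size(4.0, rng) for _ in range(10)])
```

Because every divided interval has exactly two children, each sampled size is an odd number, and an interval of length exactly 1 always yields a tree with 3 nodes.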
If additional conditions are imposed and the intervals satisfying them are not allowed to be divided (see [13,14]), we obtain various incomplete interval trees. In particular, if only one subinterval of every interval is divided, the resulting tree is the so-called one-sided interval tree (see [13]).
The interval tree captures many properties of the random division process and therefore gives rise to many interesting probabilistic questions. For example, for $x > 0$, the height of the interval tree is the greatest level among all subintervals produced by the divisions, and the total number of nodes of the tree is the total number of intervals produced by the random division process. Let $S_x$ be the size of the interval tree, that is, the total number of nodes of the binary interval tree. Our aim is to investigate the random variable $S_x$.
In this paper, a central limit theorem for the size of binary interval trees is established. Because the moment generating function of $S_x$ is difficult to calculate, the method we use is completely different from the one used for one-sided interval trees. In Section 2, we build a distributional recursive equation for $S_x$ and give the expectation, the variance, and some higher-order moments of $S_x$. In Section 3, via the contraction method, the limit law of $S_x$ is shown to be the unique solution of a fixed-point distributional equation in the Zolotarev metric space. Finally, we demonstrate that $S_x$, with suitable standardization, converges to a normal limiting random variable as $x \to \infty$.

The Moments of $S_x$
Compared with one-sided interval trees, binary interval trees have much more complex properties. In particular, the moment generating function of $S_x$ is difficult to obtain, so the method used for one-sided interval trees (see [13]) is no longer applicable. Instead, we build a distributional recursive equation for $S_x$, from which the expectation and the variance of $S_x$ can be calculated. Furthermore, we find that the fourth central moment of $S_x$ is of order $O(x^2)$ as $x$ goes to infinity.
From the definition of the binary interval tree, it is easy to see that $S_1 = 3$ and $S_x = 1$ for $x < 1$. To treat the case $x \ge 1$, let $U_x$ denote the point chosen uniformly from the interval $(0, x)$; hence, $U_x \sim U(0, x)$. For any fixed real number $0 < u < x$, if $U_x = u$, we denote by $S^{(1)}_u$ the size of the left subtree, associated with the interval $(0, u)$. Correspondingly, $S^{(2)}_{x-u}$ denotes the size of the right subtree, associated with the interval $(u, x)$. According to the rule of division, $S^{(1)}_u$ and $S^{(2)}_{x-u}$ are mutually independent. Thus, we have
$$(S_x \mid U_x = u) \stackrel{d}{=} 1 + S^{(1)}_u + S^{(2)}_{x-u}. \tag{1}$$
This formula means that, given $U_x = u$, $S_x$ has the same distribution as $1 + S^{(1)}_u + S^{(2)}_{x-u}$. Obviously, we can rewrite the above formula as
$$S_x \stackrel{d}{=} 1 + S^{(1)}_{U_x} + S^{(2)}_{x-U_x}. \tag{2}$$
Define [equation missing]. It is easy to see that [equation missing]. From the distributional recursive equation (2) and the above boundary conditions, Su et al. [17] calculated the expectation $E S_x$ and the variance $\operatorname{Var} S_x$ for any $x \ge 0$.
Lemma 1. Let $S_x$ be the size of a binary interval tree. Then [equation missing].
Lemma 2. Let $S_x$ be the size of a binary interval tree. Then [equation missing].
In order to prove that the asymptotic distribution of $S_x$ is normal, we also need the order of $E(S_x - E S_x)^4$ as $x \to \infty$. The following proposition gives the fourth central moment of $S_x$.
Proposition 3. Let $S_x$ be the size of a binary interval tree. Then [equation missing].
Proof. See the appendix.
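As a worked check of how the recursion determines the first moment, here is a short derivation carried out for this note rather than quoted from [17]. Write $m(x) = E S_x$. Taking expectations in the recursion gives $m(x) = 1 + \frac{2}{x}\int_0^x m(u)\,du$ for $x \ge 1$, with $m(u) = 1$ for $u < 1$. Differentiating $x\,m(x)$ yields $x\,m'(x) = m(x) + 1$, and the boundary value $m(1) = 3$ then gives $m(x) = 4x - 1$ for $x \ge 1$. The sketch below compares this closed form with a Monte Carlo estimate; the function names and sample sizes are illustrative assumptions.

```python
import random


def sample_size(x, rng):
    """One sample of the tree size for an initial interval of length x
    (hypothetical helper; iterative version of the division process)."""
    size = 0
    stack = [x]
    while stack:
        length = stack.pop()
        size += 1
        if length >= 1:
            u = rng.uniform(0.0, length)
            stack.append(u)
            stack.append(length - u)
    return size


def mean_size(x, n, rng):
    """Monte Carlo estimate of E S_x from n independent samples."""
    return sum(sample_size(x, rng) for _ in range(n)) / n


def mean_closed_form(x):
    """Mean derived in the lead-in: 1 for x < 1 and 4x - 1 for x >= 1."""
    return 1.0 if x < 1 else 4.0 * x - 1.0


if __name__ == "__main__":
    rng = random.Random(2024)
    for x in (0.5, 1.0, 2.5, 10.0):
        print(x, mean_size(x, 20000, rng), mean_closed_form(x))
```

The estimate and the closed form should agree up to Monte Carlo error, which shrinks like the inverse square root of the number of samples.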

The CLT for $S_x$
In this section, we prove the asymptotic normality of $S_x$ as $x \to \infty$. The main tool is the contraction method, which requires suitable probability metrics, in particular the Zolotarev metrics (see [19]).
First, we introduce the Zolotarev metrics. Denote the distribution of a random variable $X$ by $\mathcal{L}(X)$. Let $\mathcal{D}$ be the set of the distributions of all real random variables, and define [equation missing]. It can be verified that a random variable $X$ with $\mathcal{L}(X) = \mathcal{N}(0, \sigma^2)$ satisfies the following formula: for any $t \in [0, 1]$, [equation missing]. More generally, we have the following lemma (Lemma 4).
Proof. In fact, for any $t \in [0, 1]$, we have [equation missing]. Therefore, [equation missing]. Moreover, in the set $\mathcal{D}^*$, there is only one distribution, the standard normal $\mathcal{N}(0, 1)$, satisfying (10).
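Suppose that $m$ is a nonnegative integer. Denote by $\mathcal{F}^{(m)}$ the set of all real functions defined on the real line that are $m$ times continuously differentiable. Let
$$\mathcal{F}_{m,\alpha} = \{ f \in \mathcal{F}^{(m)} : |f^{(m)}(y) - f^{(m)}(z)| \le |y - z|^{\alpha} \text{ for all real } y, z \},$$
where $0 < \alpha \le 1$ is a fixed real number. Let $s = m + \alpha$ and
$$\zeta_s(\mathcal{L}(X), \mathcal{L}(Y)) = \sup_{f \in \mathcal{F}_{m,\alpha}} |E f(X) - E f(Y)|;$$
then $\zeta_s$ is the Zolotarev metric of order $s$ on the set $\mathcal{D}$.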
According to the corresponding inequality in [21], for any [equation missing], where $\Gamma$ is the gamma function. Consider a random variable with distribution $\mathcal{N}(0, 1)$. It follows from Proposition 3 that [equation missing]. Therefore, there exists a positive constant such that [equation missing]. Denote [equation missing], where $\Phi$ is the standard normal distribution and the comparison variable is a standard normal random variable; then we can see that [equation missing]. Now, we just need to prove that the quantity defined above equals 0; the theorem then follows.
As pointed out before, the standard normal distribution is the only distribution in the set $\mathcal{D}^*$ satisfying (10). From (25), (14), and Lemma 4, for $x > 4$, we have [chain of estimates missing; the surviving annotation refers to (15) and (29)]. Given $\varepsilon > 0$, let $\delta > 0$ be small enough such that $\delta^{5/2} < \varepsilon/8$. For any fixed $\delta > 0$, when $x$ is sufficiently large, [equation missing], where the constant is the same as before. It implies that [equation missing] when $x$ is sufficiently large. Therefore, [equation missing]. From this equation and the arbitrariness of $\varepsilon > 0$, we conclude that the quantity defined above equals 0, and the stated limit follows immediately. By (18), the theorem holds.
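The limit theorem can be illustrated numerically. The following is an informal sketch and no substitute for the proof: it standardizes simulated sizes with the empirical mean and standard deviation (an assumption made here, standing in for the exact centering and scaling of the theorem) and reports the fraction of standardized values within 1.96 of zero, which is about 0.95 for a standard normal variable. All function names are hypothetical.

```python
import random
import statistics


def sample_size(x, rng):
    """One sample of the tree size for an initial interval of length x
    (hypothetical helper; iterative version of the division process)."""
    size = 0
    stack = [x]
    while stack:
        length = stack.pop()
        size += 1
        if length >= 1:
            u = rng.uniform(0.0, length)
            stack.append(u)
            stack.append(length - u)
    return size


def standardized_sizes(x, n, rng):
    """Simulate n sizes and standardize them with the empirical mean and
    standard deviation (a stand-in for the exact standardization)."""
    sizes = [sample_size(x, rng) for _ in range(n)]
    mu = statistics.fmean(sizes)
    sigma = statistics.pstdev(sizes)
    return [(s - mu) / sigma for s in sizes]


def coverage(values, c=1.96):
    """Fraction of values in the interval [-c, c]."""
    return sum(abs(v) <= c for v in values) / len(values)


if __name__ == "__main__":
    rng = random.Random(7)
    for x in (5.0, 50.0, 200.0):
        z = standardized_sizes(x, 4000, rng)
        print(x, round(coverage(z), 3))
```

Since the size takes only odd integer values, the coverage fluctuates around 0.95 for moderate $x$ and is expected to settle as $x$ grows.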

Mathematical Problems in Engineering
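Appendix (proof of Proposition 3). We need to calculate the third central moment of $S_x$ before the fourth. For $x > 3$, we have [equation (A.4) missing]. In view of the independence of the two centered subtree sizes, and since each of them has mean zero for any $1 \le u \le x - 1$, we have [equation missing]. It is easy to see that [equation missing], and when $x > 3$, for the second part, we have [equation (A.7) missing]. Therefore, [equation missing]. That is, [equation (A.9) missing]. Differentiating with respect to $x$, we get a differential equation [equation missing]. The solution to this differential equation is [equation (A.11) missing], where $C_0$ is a real constant. Similarly, for the fourth central moment, when $x > 4$, we have [equation missing]. Because the two centered subtree sizes are independent and have mean zero for any $1 \le u \le x - 1$, we get [equation (A.13) missing]. In particular, for the first part, we have [equation missing], where $C_0$ is the same as in (A.11).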

Figure 1: (a) The division process. (b) The binary interval tree.

Lemma 4. If two standard normal random variables and a random variable uniformly distributed over the interval $[0, 1]$ are mutually independent, then one has [equation missing].