An Iterative Scheme to Compute Size Probabilities in Random Graphs and Branching Processes

We deal with a functional equation that plays an important role in random graphs and in branching processes. In branching processes, the functional equation relates offspring probabilities to population size probabilities, while in random graphs it relates degree probabilities to small component size probabilities. We present an iterative scheme that allows computing the size probabilities numerically. It is also theoretically possible to invert the iteration, although this inverse iteration is numerically unstable.


Introduction
Let $G(x)$ and $H(x)$ be two probability generating functions that are linked through the functional equation
$$H(x) = x\,G(H(x)). \tag{1}$$
Functions of this type occur in branching processes and in random graphs [1][2][3][4][5][6]. In branching processes, $G(x)$ represents the probabilities of new offspring from a member of the population and $H(x)$ represents the population size probabilities. In the configuration model [4] of random graphs, $G(x)$ represents the excess degree probabilities of a vertex in small components and $H(x)$ represents the small component size probabilities. Note that, in both cases, $H(x)$ can be a defective generating function; that is, $H(1) < 1$.
Usually $G(x)$ is given and $H(x)$ has to be computed. Only in rare cases is it possible to find an explicit analytic expression of $H(x)$. However, a numerical iteration to compute the coefficients of $H(x)$ is possible. To the best of our knowledge, this question has not been investigated, and it seems that the iteration we propose in this paper is new.
Interestingly enough, this iteration can be inverted; that is, from the size distribution, we can infer the degree probabilities. We also present this inverse iteration, although it must be remarked that it is numerically unstable.
The paper is organized as follows. In Section 2, we provide the mathematical background by referring to the case of random graphs. Then, in Section 3, we present the main result, that is, the iteration to compute the size probabilities of the small components of the graph. The possibility of inverting this computation is presented in Section 4. Then, in Section 5, we point out how the same iteration can be used for a branching process. Some conclusions are presented in Section 6.

Mathematical Background
We first present our result by explicitly referring to random graphs in the configuration model, for which the picture is more complex. In a later section, we show how to relate the iteration to branching processes. Hence, all definitions in this section and in Sections 3 and 4 are related to random graphs.
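A random graph has assigned degree probabilities $p_h$, $h = 0, 1, \ldots$, where $p_h$ is the probability that a randomly selected vertex has degree $h$. We recall that the degree of a vertex is the number of vertices adjacent to it. The study of random graphs through generating functions is asymptotic; that is, it assumes an infinite number of vertices. Let $G_0(x)$ be the probability generating function of the degree distribution; that is,
$$G_0(x) = \sum_{h \ge 0} p_h\,x^h. \tag{2}$$
Let $z = G_0'(1)$ be the average degree and
$$G_1(x) = \frac{G_0'(x)}{z} = \sum_{h \ge 0} q_h\,x^h, \tag{3}$$
where clearly
$$q_h = \frac{(h+1)\,p_{h+1}}{z}. \tag{4}$$
The $q_h$ values are known as excess degree probabilities. Let $H_0(x)$ and $H_1(x)$ be two generating functions that can be expressed as power series as
$$H_0(x) = \sum_{k \ge 1} s_k\,x^k, \qquad H_1(x) = \sum_{k \ge 1} r_k\,x^k, \tag{5}$$
and they are defined by the equations
$$H_1(x) = x\,G_1(H_1(x)), \qquad H_0(x) = x\,G_0(H_1(x)). \tag{6}$$
Our aim is to compute the coefficients $s_k$ and $r_k$.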
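As a small illustration of the definitions above, the following sketch computes the average degree and the excess degree probabilities from a finite list of degree probabilities. The function name `excess_degree` and the sample distribution are assumptions made for this example only.

```python
def excess_degree(p):
    """Given degree probabilities p[0], p[1], ..., return (z, q).

    z is the average degree, sum_h h * p[h], and q[h] = (h + 1) * p[h + 1] / z
    are the excess degree probabilities.
    """
    z = sum(h * ph for h, ph in enumerate(p))
    q = [(h + 1) * p[h + 1] / z for h in range(len(p) - 1)]
    return z, q


# Hypothetical degree distribution with maximum degree 2.
z, q = excess_degree([0.25, 0.5, 0.25])
print(z, q)
```

The list `q` is one entry shorter than `p`, since a vertex of maximum degree contributes the largest excess degree, one less than its degree.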
The motivation for the generating functions $H_0(x)$ and $H_1(x)$ derives from the analysis of the asymptotic properties of the random graph in the configuration model. If the graph is sufficiently dense, it exhibits the so-called giant component, that is, a connected component whose size asymptotically goes to infinity. The giant component, if present, is unique. The rest of the graph consists of an infinite number of finite trees, the so-called small components (see [3], among many possible references).
It can be shown that if $p_h$ is the probability that a randomly chosen vertex (in the whole graph) has degree $h$, then $s_k$ is the probability that a randomly chosen vertex belongs to a small component of size $k$, and $r_k$ is the probability that, after choosing a random vertex $i$ of degree at least one and then a random vertex $j$ adjacent to $i$, the vertex $j$ belongs to a small component of size $k$ after removing the edge $\{i, j\}$.
If the giant component is present, the conditional probability $\hat p_h$ of choosing in the small components a vertex of degree $h$ is different from $p_h$, and similarly for the excess degree probability $\hat q_h$. It can be shown that
$$\hat p_h = \frac{p_h\,u^h}{v}, \qquad \hat q_h = q_h\,u^{h-1}, \tag{7}$$
where $u$ is the smallest nonnegative solution of $u = G_1(u)$ and $v = G_0(u)$ is the fraction of vertices in the small components. We can briefly justify (7) by using Bayes' formula:
$$\hat p_h = \Pr\{D_h \mid S\} = \frac{\Pr\{S \mid D_h\}\,\Pr\{D_h\}}{\Pr\{S\}} = \frac{\Pr\{S \mid D_h\}\,p_h}{v},$$
with $S$ being the random event of choosing a vertex in a small component and $D_h$ being the random event of choosing a vertex of degree $h$. Clearly $\Pr\{S \mid D_0\} = 1$ and consequently $\hat p_0 = p_0/v$. If $h > 0$, $\Pr\{S \mid D_h\}$ is the probability that all $h$ adjacent vertices belong to a small component once we have removed the corresponding edges, and so its value is $u^h$. This explains the left expression in (7). To justify the right expression, we need to compute the average degree in the small components by taking the derivative of $G_0(ux)/v$ (by using (7)) and computing it for $x = 1$; that is, $u\,G_0'(u)/v = z\,u\,G_1(u)/v = z\,u^2/v$. From this, we immediately get the expression at the right.
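The quantities in (7) can be evaluated numerically for a finite degree distribution. The sketch below is only illustrative: it assumes that the smallest solution of the fixed-point equation for the excess degree generating function can be approached by plain fixed-point iteration started at zero, and that the probability of degree one is positive. The function name, the iteration count, and the sample distribution are hypothetical.

```python
def small_component_parameters(p, iterations=200):
    """Return (u, v, p_hat, q_hat) for degree probabilities p[0], p[1], ...

    u is approximated by iterating u <- G_1(u) from u = 0 (an assumption of
    this sketch), v = G_0(u), and p_hat, q_hat are the conditional degree and
    excess degree probabilities within the small components.
    """
    z = sum(h * ph for h, ph in enumerate(p))
    q = [(h + 1) * p[h + 1] / z for h in range(len(p) - 1)]
    u = 0.0
    for _ in range(iterations):
        u = sum(qh * u ** h for h, qh in enumerate(q))
    v = sum(ph * u ** h for h, ph in enumerate(p))
    p_hat = [ph * u ** h / v for h, ph in enumerate(p)]
    # Requires u > 0, which holds when p[1] > 0.
    q_hat = [qh * u ** (h - 1) for h, qh in enumerate(q)]
    return u, v, p_hat, q_hat


# Hypothetical distribution dense enough to have a giant component.
u, v, p_hat, q_hat = small_component_parameters([0.1, 0.2, 0.3, 0.4])
print(u, v)
```

For this sample distribution the fixed-point equation is quadratic with roots 1/6 and 1, so the iteration settles on 1/6, and both conditional distributions sum to one.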
It turns out that using $\hat p_h$ instead of $p_h$ in the definition of $G_0$ and $G_1$ has the only effect of dividing the $s_k$ values by the constant factor $v$ and the $r_k$ values by the constant factor $u$, which corresponds to conditioning on the choice being within the small components. In particular, we have $H_0(1) = v$ and $H_1(1) = u$ if we use $p_h$ and $q_h$ in the definition of $G_0$ and $G_1$, respectively, whereas we have $H_0(1) = 1$ and $H_1(1) = 1$ if we use $\hat p_h$ and $\hat q_h$.
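We also define the probability $\pi_k$ that a randomly selected small component has size $k$. Of course, if the $s_k$ are the conditional probabilities within the small components,
$$\pi_k = \mu\,\frac{s_k}{k}, \qquad \mu = \Big(\sum_{k \ge 1} \frac{s_k}{k}\Big)^{-1},$$
with $\mu$ being the average size of a small component. Here we have to divide $s_k$ by $k$ because a randomly chosen vertex falls in a given component with probability proportional to the size of that component.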
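The passage from vertex-based to component-based size probabilities amounts to dividing each probability by the size and renormalizing. A minimal sketch, assuming the input list holds conditional vertex-based probabilities indexed by size (index 0 unused) and is truncated where the tail is negligible; the function name is hypothetical.

```python
def component_size_distribution(s):
    """Convert vertex-based size probabilities s[1], s[2], ... (s[0] unused)
    into component-based probabilities pi[k] and the average component size.
    """
    weights = [0.0] + [s[k] / k for k in range(1, len(s))]
    total = sum(weights)
    pi = [w / total for w in weights]
    mu = sum(k * x for k, x in enumerate(pi))
    return pi, mu


# Hypothetical input: s[k] = k * (1 - a)**2 * a**(k - 1) with a = 0.3.
a = 0.3
s = [0.0] + [k * (1 - a) ** 2 * a ** (k - 1) for k in range(1, 61)]
pi, mu = component_size_distribution(s)
print(mu)
```

With this input the component sizes come out geometrically distributed, with average size close to 1/(1 - a).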
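We also derive the $s_k$ values from the expression $H_0(x) = x\,G_0(H_1(x))$, so that
$$s_k = \sum_{h=0}^{k-1} p_h\,r^{(h)}_{k-1},$$
where $r^{(h)}_m$ denotes the coefficient of $x^m$ in $H_1(x)^h$ (with $r^{(0)}_0 = 1$). In this case, the computation is straightforward, since it involves only previously computed quantities.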
Theoretically, the generating functions $H_0(x)$ and $H_1(x)$ involve an infinite series, but obviously only a finite number of coefficients can be computed. Hence, the computation has to be stopped after having computed the desired number of terms $s_k$ and $r_k$. Since each term is computed only once and is not the result of subsequent smaller and smaller additions, truncating the computation at a certain index has no effect on the accuracy of the values we compute. In other words, if we compute just a few terms, they are computed with the same accuracy as if we had computed all coefficients.
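The truncation property can be illustrated with a simple computation of the leading coefficients. The sketch below does not reproduce the iteration proposed in the paper; it uses a plain truncated fixed-point iteration on the defining functional equations, in which each pass fixes one more coefficient of the power series. All function names and the sample degree distribution are assumptions of this example.

```python
def poly_mul(a, b, n):
    """Product of two coefficient lists, truncated to degree n."""
    c = [0.0] * (n + 1)
    for i, ai in enumerate(a):
        if ai == 0.0:
            continue
        for j, bj in enumerate(b):
            if i + j > n:
                break
            c[i + j] += ai * bj
    return c


def compose(g, h, n):
    """Coefficients of g(h(x)) up to degree n, by Horner's scheme."""
    result = [0.0] * (n + 1)
    for coeff in reversed(g):
        result = poly_mul(result, h, n)
        result[0] += coeff
    return result


def size_probabilities(p, n):
    """Return (h0, h1): coefficients of x^0..x^n of the two size series."""
    z = sum(k * pk for k, pk in enumerate(p))
    q = [(k + 1) * p[k + 1] / z for k in range(len(p) - 1)]
    h1 = [0.0] * (n + 1)
    for _ in range(n):
        # H_1 <- x * G_1(H_1): after m passes, coefficients up to x^m are final.
        h1 = [0.0] + compose(q, h1, n)[:n]
    # H_0 = x * G_0(H_1) uses only already computed quantities.
    h0 = [0.0] + compose(p, h1, n)[:n]
    return h0, h1


# Hypothetical degree distribution (binomial with two trials, a = 0.5).
h0, h1 = size_probabilities([0.25, 0.5, 0.25], 8)
print(h1[1:4], h0[1:4])
```

Asking for fewer terms returns the same leading coefficients, which is the sense in which truncation does not affect accuracy. This sketch costs more arithmetic than a term-by-term recursion, a trade-off accepted here for brevity.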
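It is clear from the definitions and the previous iteration that $H_1(x)$ implies $H_0(x)$; that is, once we know the $r_k$ values, the $s_k$ values are also known. It is not difficult to see that the converse is also true. By differentiating (6) and using (3), we get
$$H_0'(x) = \frac{H_0(x)}{x} + z\,H_1(x)\,H_1'(x), \tag{19}$$
and by integrating (19), we get
$$H_0(x) - \int_0^x \frac{H_0(t)}{t}\,dt = \frac{z}{2}\,H_1(x)^2,$$
which leads to the following identities term by term:
$$\frac{k-1}{k}\,s_k = \frac{z}{2} \sum_{j=1}^{k-1} r_j\,r_{k-j}, \qquad k \ge 2.$$
For $k = 2$, we get in particular
$$s_2 = z\,r_1^2, \qquad \text{that is,} \qquad r_1 = \sqrt{s_2/z},$$
and, for $k > 2$, we have
$$\frac{k-1}{k}\,s_k = z\,r_1\,r_{k-1} + \frac{z}{2} \sum_{j=2}^{k-2} r_j\,r_{k-j},$$
which allows writing
$$r_{k-1} = \frac{1}{r_1} \Big( \frac{k-1}{k}\,\frac{s_k}{z} - \frac{1}{2} \sum_{j=2}^{k-2} r_j\,r_{k-j} \Big), \tag{25}$$
so that all $r_k$ values can be recursively computed once we know $z$. We note that $p_1 > 0$ implies that $q_0 = r_1 > 0$.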
Hence, the recursion is well defined if $p_1 > 0$, which is an almost necessary assumption if we investigate the presence of small components. Hence, knowledge of the $s_k$ values implies knowledge of the $r_k$ values. If we do not know $z$, we may still compute $z$ from the recursion. We first note that all $r_k$ depend on $z$ through the factor $1/\sqrt{z}$. Therefore, we initially guess the value $z = 1$ and compute tentative values $\hat r_k$. Since $\sum_k r_k = u$, we find the correct value for $z$ as $z = \big(\sum_k \hat r_k / u\big)^2$ and so we have the correct values:
$$r_k = \frac{\hat r_k}{\sqrt{z}}.$$
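The two-step procedure just described (tentative values with the average degree set to one, then rescaling) can be sketched as follows. The code assumes the term-by-term identity obtained from integrating the differentiated functional equation, namely that (k - 1)/k times the k-th coefficient of the first size series equals half the average degree times the k-th coefficient of the square of the second series. The function name is hypothetical, and the average degree is recovered from a truncated sum, so it is only as accurate as the neglected tail is small.

```python
import math


def r_from_s(s, u=1.0):
    """Recover (z, r) from s[1..n] (s[0] unused); r has entries r[1..n-1].

    u is the value that the r values must sum to (1 for conditional
    probabilities within the small components).
    """
    n = len(s) - 1
    r = [0.0] * n                      # tentative values for z = 1
    r[1] = math.sqrt(s[2])             # requires s[2] > 0
    for k in range(3, n + 1):
        conv = sum(r[j] * r[k - j] for j in range(2, k - 1))
        r[k - 1] = ((k - 1) * s[k] / k - 0.5 * conv) / r[1]
    z = (sum(r) / u) ** 2              # rescaling fixes the average degree
    return z, [x / math.sqrt(z) for x in r]


# Hypothetical input: s[k] = k * (1 - a)**2 * a**(k - 1) with a = 0.3.
a = 0.3
s = [0.0] + [k * (1 - a) ** 2 * a ** (k - 1) for k in range(1, 31)]
z, r = r_from_s(s)
print(z, r[1:4])
```

For this input the recovered average degree is close to 2a and the recovered values are close to a geometric sequence with ratio a.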

Inferring the Degree Probabilities from the Component Size Probabilities
We may also consider the inverse problem of finding $G_0(x)$ and $G_1(x)$ from $H_0(x)$ and $H_1(x)$, that is, computing the degree distribution which gives rise to a particular small component size distribution. This problem presents interesting features. Arbitrary size distributions $s_k$ and $r_k$ may not be feasible; that is, there may be no degree distribution that can lead to those values. Formally, the recursion can be easily inverted; that is, knowing the $r_k$ values, we can compute the $q_h$ values and $p_h$ values. Indeed, from (14), we have
$$r_k = \sum_{h=1}^{k-2} q_h\,r^{(h)}_{k-1} + q_{k-1}\,q_0^{k-1},$$
that is,
$$q_{k-1} = \frac{r_k - \sum_{h=1}^{k-2} q_h\,r^{(h)}_{k-1}}{q_0^{k-1}}, \tag{28}$$
where $r^{(h)}_m$ denotes the coefficient of $x^m$ in $H_1(x)^h$. Computing the $r^{(h)}_m$ values is straightforward once we know the $r_k$ values. From the $q_h$ values, we easily deduce the $p_h$ values, apart from the fact that $p_0$ cannot be derived from the $q_h$ values. However, $p_0 = s_1$, and so it is known a priori. Note also that $r_1 > 0$ implies that $q_0^{k-1} > 0$.
There is, however, a subtle point to be settled. Let us assume that a giant component may be present but we do not know the $u$ and $v$ values. Then, it is simpler to work with the conditional probabilities within the small components. Starting from the (conditional) $s_k$ probabilities, we compute the $r_k$ values as explained in the previous section but by using the normalization $\sum_k r_k = 1$. This way, we actually compute $\hat q_h$ and $\hat p_h$. Then, from (7), we get $p_h$ and $q_h$. The unknowns $u$ and $v$ are computed by imposing $\sum_h q_h = 1$ and $\sum_h p_h = 1$, which is equivalent to solving $G_1(u^{-1}) = u^{-1}$ and $G_0(u^{-1}) = v^{-1}$ with $G_0$ and $G_1$ defined on $\hat p_h$ and $\hat q_h$.
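A minimal sketch of the inverse recursion follows. It assumes that the forward relation expresses each coefficient of the second size series as a sum, over the degree index, of an excess degree probability times the matching coefficient of a power of that series, and it solves this relation for the highest-index excess degree probability at each step. The function name and the test sequence are hypothetical, and the sketch is meant for exact or nearly exact data only, given the instability of the inversion.

```python
def q_from_r(r):
    """Invert the size recursion: r[1..n] (r[0] = 0) -> q[0..n-1]."""
    n = len(r) - 1
    q = [r[1]]                         # q_0 equals r_1
    # powers[h][m] = coefficient of x^m in H_1(x)^h, truncated at degree n
    powers = [[1.0] + [0.0] * n]
    for k in range(2, n + 1):
        prev = powers[-1]
        nxt = [0.0] * (n + 1)
        for i, c in enumerate(prev):
            if c != 0.0:
                for j in range(1, n + 1 - i):
                    nxt[i + j] += c * r[j]
        powers.append(nxt)             # powers[k - 1] is now available
        partial = sum(q[h] * powers[h][k - 1] for h in range(1, k - 1))
        # Difference in the numerator, shrinking power of q_0 below it.
        q.append((r[k] - partial) / q[0] ** (k - 1))
    return q


# Hypothetical exact input: r[k] = 0.5**k.
q = q_from_r([0.0] + [0.5 ** k for k in range(1, 7)])
print(q)
```

For this input only the first two recovered probabilities are nonzero. Perturbing the input slightly and rerunning gives a direct feel for how the division by a shrinking power amplifies errors.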
However, the inverse recursion is numerically unstable, and, unless we use exact data, it can produce absurd outcomes, like probabilities outside the range $[0, 1]$. The reason for the instability is clear from (28), where we have a difference in the numerator while the denominator, a power of $q_0$, gets smaller and smaller as the index grows.
As a simple exercise, suppose that we ask which degree distribution gives rise to a size distribution of the small components of exponential type; that is,
$$\pi_k = (1 - a)\,a^{k-1}, \qquad k = 1, 2, \ldots,$$
with $0 < a < 1$. Hence, we have $\mu = 1/(1 - a)$ and
$$s_k = k\,(1 - a)^2\,a^{k-1}.$$
We remark that these $s_k$ are conditional probabilities. Now we have to compute the $r_k$ values from the $s_k$ values. As explained in the previous section, we initially fix $z = 1$ and compute from (25) the tentative values:
$$\hat r_k = \sqrt{2a}\,(1 - a)\,a^{k-1},$$
for which $\sum_k \hat r_k = \sqrt{2a}$. Hence, $z = 2a$, implying the correct values:
$$r_k = (1 - a)\,a^{k-1}.$$
If we carry out the computation in (28) symbolically, we get
$$\hat q_0 = 1 - a, \qquad \hat q_1 = a, \qquad \hat q_h = 0 \quad (h \ge 2),$$
from which
$$\hat p_0 = (1 - a)^2, \qquad \hat p_1 = 2a\,(1 - a), \qquad \hat p_2 = a^2. \tag{34}$$
From (7) and the normalization conditions, we have $u = 1$ and also $v = 1$. Hence, there is no giant component in this case.
Now assume that we have experimental data from which we infer the size probabilities, and that we apply the inverse recursion to them. Not only are there negative values among the computed quantities, but their absolute value also increases with the index, showing an amplifying effect of error propagation. Therefore, a lot of care should be exerted in order to carry out computations on experimental data. This can be a matter of further investigation and is beyond the scope of this paper.
We show a second example for the inverse computation. Assume that the size probabilities are given in terms of the Catalan numbers $C_n$. If we carry out the computation in (28) numerically, we see again the same inconsistencies and the amplifying effect. In any case, we may note that the values with odd index are correctly computed as null values.