We review recent results of ours concerning branching processes with general lifetimes and neutral mutations, under the infinitely many alleles model, where mutations can occur either at the birth of particles or at a constant rate during their lives. In both models, we study the allelic partition of the population at time

We consider a general branching model, where particles have i.i.d. (not necessarily exponential) life lengths and give birth at a constant rate

We enrich this genealogical model with mutations. In Model I, each child is a clone of her mother with probability

Branching processes (and especially birth and death processes) with mutations have many applications in biology. In carcinogenesis [

Branching processes with mutations are also used in epidemiology. Epidemics, and especially their onset, can be modeled by birth and death processes, where particles are infected hosts, births are disease transmissions, and deaths are recoveries or actual deaths. In [

Let us also mention the existence of models, for example, [

In ecology, the neutral theory of biodiversity [

In this paper, we are first interested in the allelic partition of the population, and more precisely in properties of the

In our models, it is not possible to obtain a counterpart of the Ewens sampling formula, but we obtain different kinds of results concerning the frequency spectrum (

We are not aware of any previous mathematical studies, other than ours, of branching processes with Poissonian mutations. There are, however, several existing mathematical results on branching models with mutations at birth, which we now briefly review.

In discrete time, Griffiths and Pakes [

In continuous time, Pakes [

The paper is organized as follows. In Section

Notice that in this paper, most of the results are stated for linear birth and death processes in order to simplify the notation. Most of them are also true with general life length distributions and are proved in Chapter 3 of the Ph.D. thesis [

We first define the model without mutations and give some of its properties. We then explain the two mutation mechanisms considered in this paper.

As a population model, we consider

at time

all particles have i.i.d. reproduction behaviors;

conditional on her birth date

It is important to notice that the common law of life lengths can be as general as possible. Let

The total population process

In our particular case, the common distribution of lifespans is

The advantage of homogeneous, binary CMJ processes is that they allow for explicit computations, for example, about one-dimensional marginals of

The one-dimensional marginals of

If

The following proposition justifies the fact that

If

In fact, convergence in distribution is proved in [

We now assume that particles in the population carry types, also called

In Model I, mutations occur at birth. More precisely, there is some

An example of a splitting tree in Model I and of the allelic partition of the whole extant population at time

In Model II, particles independently experience mutations during their lives at constant rate

An example of a splitting tree with mutations in Model II and of the allelic partition of the whole extant population at time

In what follows, an important role will be played by the

Concerning Model I, it can be seen [

Concerning Model II, it can be seen [

An interesting case that we will focus on is the

In this case,

The same results hold for

We will sometimes state results in the full generality of splitting trees, in which case an equation numbered (-a) (resp. (-b)) refers to Model I (resp. Model II), as above. Most of the time, however, we focus on the exponential case, where we use, whenever possible, the unified notation using

In the exponential case, notice that Models I and II are two (incompatible) cases of a more general class of linear birth and death processes with mutations, where particles mutate spontaneously at rate
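To make the two mutation mechanisms concrete in the exponential case, the following is a minimal simulation sketch, not the construction used in the proofs. It runs a Gillespie-type simulation of a linear birth and death process started from one progenitor, under the infinitely many alleles model. The names `b`, `d`, `p_mut`, and `theta` are our own illustrative parameters (per-particle birth rate, death rate, probability that a newborn is a mutant, and mutation rate during life) and are assumptions, not the notation of the paper. Taking `theta = 0` gives a process of the Model I type, and `p_mut = 0` gives one of the Model II type.

```python
import random


def simulate_alleles(b, d, p_mut, theta, t_max, seed=None, max_pop=100_000):
    """Simulate a linear birth and death process with neutral mutations.

    Hypothetical parameters (illustrative, not the paper's notation):
      b      per-particle birth rate
      d      per-particle death rate
      p_mut  probability that a newborn carries a new allele (mutation at birth)
      theta  per-particle mutation rate during life (Poissonian mutations)
    Requires b + d + theta > 0.
    Returns the allele labels of the particles alive at time t_max
    (or at the time the safety cap max_pop is reached, if that happens first).
    """
    rng = random.Random(seed)
    alleles = [0]        # one progenitor carrying allele 0
    next_allele = 1      # infinitely many alleles: each mutation gets a fresh label
    t = 0.0
    per_capita = b + d + theta
    while alleles and len(alleles) < max_pop:
        # Time to the next event in the whole population.
        t += rng.expovariate(per_capita * len(alleles))
        if t > t_max:
            break
        i = rng.randrange(len(alleles))   # particle concerned by the event
        u = rng.random() * per_capita     # which kind of event occurs
        if u < b:
            # Birth: the child is a mutant with probability p_mut, else a clone.
            if rng.random() < p_mut:
                alleles.append(next_allele)
                next_allele += 1
            else:
                alleles.append(alleles[i])
        elif u < b + d:
            # Death: remove particle i (swap with the last one, then pop).
            alleles[i] = alleles[-1]
            alleles.pop()
        else:
            # Mutation during life: particle i switches to a brand new allele.
            alleles[i] = next_allele
            next_allele += 1
    return alleles


if __name__ == "__main__":
    population = simulate_alleles(b=2.0, d=1.0, p_mut=0.0, theta=0.5, t_max=5.0, seed=1)
    print(len(population), "particles,", len(set(population)), "distinct alleles")
```

Since mutations are neutral, they do not affect the birth and death dynamics: the population size evolves as a linear birth and death process with rates `b` and `d` whatever the values of `p_mut` and `theta`.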

Recall that a

More precisely, we give properties of the allelic partition of the entire population by studying the

For instance, in Figure
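As a small illustration of how a frequency spectrum is read off an allelic partition, here is a hedged sketch. It assumes the usual definition, in which the k-th entry of the spectrum counts the alleles carried by exactly k extant particles. The function name `frequency_spectrum` and the toy sample are ours and purely illustrative.

```python
from collections import Counter


def frequency_spectrum(alleles):
    """Return {k: number of alleles carried by exactly k particles}.

    `alleles` is an iterable with one allele label per extant particle.
    """
    family_sizes = Counter(alleles)             # allele -> number of carriers
    spectrum = Counter(family_sizes.values())   # k -> number of families of size k
    return dict(sorted(spectrum.items()))


if __name__ == "__main__":
    # Toy population of 7 particles: families a (2), b (1), c (3), d (1).
    sample = ["a", "a", "b", "c", "c", "c", "d"]
    print(frequency_spectrum(sample))   # {1: 2, 2: 1, 3: 1}
```

By construction, summing k times the k-th entry over all k recovers the total population size (here 1·2 + 2·1 + 3·1 = 7), which gives a simple sanity check on any computed spectrum.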

In the case of branching processes, no closed-form formula is available for the law of the frequency spectrum, unlike the Wright-Fisher model, for which such a formula is given by the Ewens sampling formula [

We first give an exact expression of the expected frequency spectrum at any time

For

For

In [

The expected frequency spectra

For

The second terms that appear in the r.h.s. correspond to the probabilities that the progenitor has

In the exponential case, when the process

From Corollary

We suppose that

Notice that

In this section and all subsequent ones, we are interested in the long-time behavior of the two models we consider. Hence, from now on,

This paragraph deals with improvements of the convergence results (

The main technique we use to prove them is CMJ processes counted with random characteristics (see [

A characteristic is a random nonnegative function on

Let

In the exponential case, one has

Notice that (

Thanks to Proposition

In this paragraph, we only treat the exponential case. Let us assume that the clonal process is supercritical, that is,

In the exponential case, one has for both models

Notice that this result is consistent with [

The following proof of Proposition

Since

Recalling (

We want to obtain a result similar to Proposition

In the exponential case, one has

By a change of variables, we have set

We now state results about ages of the oldest families and about sizes of the largest ones. We mainly focus on the case when clonal populations are subcritical. Then, in Section

We need some notation. For

for

for

In this section, we are interested in finding the orders of magnitude of the ages and of the sizes of the families; that is, in finding numbers

In this section, we suppose that the clonal processes are subcritical and we are interested in ages of old families. Although we only state the results in the exponential case, they also hold in the general case and are proved in [

In the first result, which is a result in expectation, we show that in both models, the ages are of order of magnitude

One supposes that

This result is a consequence of the expected spectrum formula (

With the same assumptions as in Proposition

The proof of this proposition in the general case and for Model I, given in [

The last result deals with the convergence in distribution of the sequence of the ranked ages of extant families. Let

With the same assumptions as previously, let

In this paragraph, we still suppose that the clonal process is subcritical and we are interested in results similar to those of Section

Concerning Model I, this problem is still open. In contrast, in Model II it is possible to obtain the sizes of the largest families. In [

One sets

For

Conditional on the survival event, the sequence (

The case of a critical clonal process

If

As in the subcritical case, the problem of the sizes of the largest families is still open. Nevertheless, we can state the following conjecture about their order of magnitude.

If

The general case when

As in Model I, ages of the oldest families are of order

In [

Notice that we cannot obtain results similar to Proposition

In [

In Model II, for a general supercritical splitting tree, if

This work was supported by project MANEGE ANR-09-BLAN-0215 (French National Research Agency). The authors thank an anonymous referee for carefully checking this paper.