The method of potential solutions of Fokker-Planck equations is used to develop a transport equation for the joint probability of N coupled stochastic variables with the Dirichlet distribution as its asymptotic solution. To ensure a bounded sample space, a coupled nonlinear diffusion process is required: the Wiener processes in the equivalent system of stochastic differential equations are multiplicative with coefficients dependent on all the stochastic variables. Individual samples of a discrete ensemble, obtained from the stochastic process, satisfy a unit-sum constraint at all times. The process may be used to represent realizations of a fluctuating ensemble of N variables subject to a conservation principle. Similar to the multivariate Wright-Fisher process, whose invariant is also Dirichlet, the univariate case yields a process whose invariant is the beta distribution. As a test of the results, Monte Carlo simulations are used to evolve numerical ensembles toward the invariant Dirichlet distribution.
1. Objective
We develop a Fokker-Planck equation whose statistically stationary solution is the Dirichlet distribution [1–3]. The system of stochastic differential equations (SDE), equivalent to the Fokker-Planck equation, yields a Markov process that allows a Monte Carlo method to numerically evolve an ensemble of fluctuating variables that satisfy a unit-sum requirement. A Monte Carlo solution is used to verify that the invariant distribution is Dirichlet.
The Dirichlet distribution is a statistical representation of nonnegative variables subject to a unit-sum requirement. The properties of such variables have been of interest in a variety of fields, including evolutionary theory [4], Bayesian statistics [5], geology [6, 7], forensics [8], econometrics [9], turbulent combustion [10], and population biology [11].
2. Preview of Results
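The Dirichlet distribution [1–3] for a set of scalars $0\le Y_\alpha$, $\alpha=1,\dots,N-1$, $\sum_{\alpha=1}^{N-1}Y_\alpha\le 1$, is given by
$$\mathcal{D}(Y,\omega)=\frac{\Gamma\left(\sum_{\alpha=1}^{N}\omega_\alpha\right)}{\prod_{\alpha=1}^{N}\Gamma(\omega_\alpha)}\prod_{\alpha=1}^{N}Y_\alpha^{\omega_\alpha-1}, \tag{1}$$
where $\omega_\alpha>0$ are parameters, $Y_N=1-\sum_{\beta=1}^{N-1}Y_\beta$, and $\Gamma(\cdot)$ denotes the gamma function. We derive the stochastic diffusion process, governing the scalars, $Y_\alpha$,
$$\mathrm{d}Y_\alpha(t)=\frac{b_\alpha}{2}\left[S_\alpha Y_N-(1-S_\alpha)Y_\alpha\right]\mathrm{d}t+\sqrt{\kappa_\alpha Y_\alpha Y_N}\,\mathrm{d}W_\alpha(t),\quad \alpha=1,\dots,N-1, \tag{2}$$
where $\mathrm{d}W_\alpha(t)$ is an isotropic vector-valued Wiener process [12], and $b_\alpha>0$, $\kappa_\alpha>0$, and $0<S_\alpha<1$ are coefficients. We show that the statistically stationary solution of (2) is the Dirichlet distribution, (1), provided that the SDE coefficients satisfy
$$\frac{b_1}{\kappa_1}(1-S_1)=\dots=\frac{b_{N-1}}{\kappa_{N-1}}(1-S_{N-1}). \tag{3}$$
The restrictions imposed on the SDE coefficients, $b_\alpha$, $\kappa_\alpha$, and $S_\alpha$, ensure reflection towards the interior of the sample space, which is a generalized triangle or tetrahedron (more precisely, a simplex) in N-1 dimensions. The restrictions together with the specification of the Nth scalar as $Y_N=1-\sum_{\beta=1}^{N-1}Y_\beta$ ensure
$$\sum_{\alpha=1}^{N}Y_\alpha=1. \tag{4}$$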
Indeed, inspection of (2) shows that, for example, when Y1=0, the diffusion is zero and the drift is strictly positive, while if Y1=1, the diffusion is zero (YN=0) and the drift is strictly negative.
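The boundary behavior can be made explicit by evaluating (2) on two faces of the simplex. The following is a short worked evaluation, not part of the original derivation; it assumes $Y_N>0$ in the first case and $Y_\alpha>0$ in the second, and in both cases the diffusion term vanishes because it is proportional to the product $Y_\alpha Y_N$.

```latex
\begin{align}
Y_\alpha = 0,\ Y_N > 0:\quad
  \mathrm{d}Y_\alpha &= \frac{b_\alpha}{2}\,S_\alpha Y_N\,\mathrm{d}t > 0, \\
Y_N = 0,\ Y_\alpha > 0:\quad
  \mathrm{d}Y_\alpha &= -\frac{b_\alpha}{2}\,(1-S_\alpha)\,Y_\alpha\,\mathrm{d}t < 0.
\end{align}
```

Both signs rely on $b_\alpha>0$ and $0<S_\alpha<1$, which is why these restrictions keep the drift pointing toward the interior.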
3. Development of the Diffusion Process
The diffusion process (2) is developed by the method of potential solutions.
We start from the Itô diffusion process [12] for the stochastic vector, Yα,
$$\mathrm{d}Y_\alpha(t)=a_\alpha(Y)\,\mathrm{d}t+b_{\alpha\beta}(Y)\,\mathrm{d}W_\beta(t),\quad \alpha,\beta=1,\dots,N-1, \tag{5}$$
with drift, aα(Y), diffusion, bαβ(Y), and the isotropic vector-valued Wiener process, dWβ(t), where summation is implied for repeated indices. Using standard methods given in [12], the equivalent Fokker-Planck equation governing the joint probability, ℱ(Y,t), derived from (5), is
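$$\frac{\partial\mathcal{F}}{\partial t}=-\frac{\partial}{\partial Y_\alpha}\left[a_\alpha(Y)\mathcal{F}\right]+\frac{1}{2}\frac{\partial^2}{\partial Y_\alpha\partial Y_\beta}\left[B_{\alpha\beta}(Y)\mathcal{F}\right], \tag{6}$$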
with diffusion Bαβ=bαγbγβ. Since the drift and diffusion coefficients are time homogeneous, aα(Y,t)=aα(Y) and Bαβ(Y,t)=Bαβ(Y), (5) is a statistically stationary process and the solution of (6) converges to a stationary distribution, [12, Sec. 6.2.2]. Our task is to specify the functional forms of aα(Y) and bαβ(Y) so that the stationary solution of (6) is 𝒟(Y), defined by (1).
A potential solution of (6) exists if
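$$\frac{\partial\ln\mathcal{F}}{\partial Y_\beta}=B^{-1}_{\alpha\beta}\left(2a_\alpha-\frac{\partial B_{\alpha\gamma}}{\partial Y_\gamma}\right)\equiv-\frac{\partial\phi}{\partial Y_\beta},\quad \alpha,\beta,\gamma=1,\dots,N-1 \tag{7}$$
is satisfied, [12, Sec. 6.2.2]. Since the left-hand side of (7) is a gradient, the expression on the right must also be a gradient and can therefore be obtained from a scalar potential denoted by $\phi(Y)$. This puts a constraint on the possible choices of $a_\alpha$ and $B_{\alpha\beta}$ and on the potential, as $\phi_{,\alpha\beta}=\phi_{,\beta\alpha}$ must also be satisfied. The potential solution is
$$\mathcal{F}(Y)=\exp\left[-\phi(Y)\right]. \tag{8}$$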
Now functional forms of aα(Y) and Bαβ(Y) that satisfy (7) with ℱ(Y)≡𝒟(Y) are sought. The mathematical constraints on the specification of aα and Bαβ are as follows.
(1) $B_{\alpha\beta}$ must be symmetric positive semidefinite. This is to ensure the following:
   (a) the square root of $B_{\alpha\beta}$ (e.g., the Cholesky decomposition, $b_{\alpha\beta}$) exists, as required by the correspondence of the SDE (5) and the Fokker-Planck equation (6);
   (b) equation (5) represents a diffusion;
   (c) $\det(B_{\alpha\beta})\ne 0$, as required by the existence of the inverse in (7).
(2) For a potential solution to exist, (7) must be satisfied.
With $\mathcal{F}(Y)\equiv\mathcal{D}(Y)$, (8) shows that the scalar potential must be
$$\phi(Y)=-\sum_{\alpha=1}^{N}(\omega_\alpha-1)\ln Y_\alpha. \tag{9}$$
It is straightforward to verify that the specifications
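$$a_\alpha(Y)=\frac{b_\alpha}{2}\left[S_\alpha Y_N-(1-S_\alpha)Y_\alpha\right], \tag{10}$$
$$B_{\alpha\beta}(Y)=\begin{cases}\kappa_\alpha Y_\alpha Y_N & \text{for } \alpha=\beta,\\ 0 & \text{for } \alpha\ne\beta\end{cases} \tag{11}$$
satisfy the above mathematical constraints, (1) and (2). Here $b_\alpha>0$, $\kappa_\alpha>0$, $0<S_\alpha<1$, and $Y_N=1-\sum_{\beta=1}^{N-1}Y_\beta$. Summation is not implied for (9)–(11).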
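Substituting (9)–(11) into (7) yields a system with the same functions of Y on both sides but with different coefficients. Matching the coefficients gives the correspondence between the N parameters of the Dirichlet distribution, (1), and the coefficients of the Fokker-Planck equation (6) with (10)-(11) as
$$\omega_\alpha=\frac{b_\alpha}{\kappa_\alpha}S_\alpha,\quad \alpha=1,\dots,N-1,\qquad \omega_N=\frac{b_1}{\kappa_1}(1-S_1)=\dots=\frac{b_{N-1}}{\kappa_{N-1}}(1-S_{N-1}). \tag{12}$$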
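The substitution can be written out for a general index $\alpha$. The following worked step is a sketch that uses only (1), (7), (10), and (11), with no summation over $\alpha$ and with $\partial Y_N/\partial Y_\alpha=-1$; the intermediate grouping of terms is ours, not the source's.

```latex
\begin{align}
\frac{\partial\ln\mathcal{D}}{\partial Y_\alpha}
  &= \frac{\omega_\alpha-1}{Y_\alpha}-\frac{\omega_N-1}{Y_N}, \\
\frac{\partial B_{\alpha\alpha}}{\partial Y_\alpha}
  &= \kappa_\alpha\,(Y_N-Y_\alpha), \\
\frac{1}{B_{\alpha\alpha}}\left(2a_\alpha-\frac{\partial B_{\alpha\alpha}}{\partial Y_\alpha}\right)
  &= \frac{(b_\alpha S_\alpha-\kappa_\alpha)\,Y_N-\left[b_\alpha(1-S_\alpha)-\kappa_\alpha\right]Y_\alpha}
          {\kappa_\alpha Y_\alpha Y_N} \nonumber \\
  &= \left(\frac{b_\alpha S_\alpha}{\kappa_\alpha}-1\right)\frac{1}{Y_\alpha}
     -\left[\frac{b_\alpha(1-S_\alpha)}{\kappa_\alpha}-1\right]\frac{1}{Y_N}.
\end{align}
```

Equating the coefficients of $1/Y_\alpha$ and $1/Y_N$ in the first and last lines gives $\omega_\alpha=b_\alpha S_\alpha/\kappa_\alpha$ and $\omega_N=b_\alpha(1-S_\alpha)/\kappa_\alpha$ for every $\alpha$, which is why the last ratio must be common to all $\alpha$.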
For example, for N=3 one has Y=(Y1,Y2,Y3=1-Y1-Y2) and from (9) the scalar potential is
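$$-\phi(Y_1,Y_2)=(\omega_1-1)\ln Y_1+(\omega_2-1)\ln Y_2+(\omega_3-1)\ln(1-Y_1-Y_2). \tag{13}$$
The system in (7) then becomes
$$\begin{aligned}
\frac{\omega_1-1}{Y_1}-\frac{\omega_3-1}{Y_3}&=\left(\frac{b_1}{\kappa_1}S_1-1\right)\frac{1}{Y_1}-\left[\frac{b_1}{\kappa_1}(1-S_1)-1\right]\frac{1}{Y_3},\\
\frac{\omega_2-1}{Y_2}-\frac{\omega_3-1}{Y_3}&=\left(\frac{b_2}{\kappa_2}S_2-1\right)\frac{1}{Y_2}-\left[\frac{b_2}{\kappa_2}(1-S_2)-1\right]\frac{1}{Y_3},
\end{aligned} \tag{14}$$
which shows that by specifying the parameters, $\omega_\alpha$, of the Dirichlet distribution as
$$\omega_1=\frac{b_1}{\kappa_1}S_1, \tag{15}$$
$$\omega_2=\frac{b_2}{\kappa_2}S_2, \tag{16}$$
$$\omega_3=\frac{b_1}{\kappa_1}(1-S_1)=\frac{b_2}{\kappa_2}(1-S_2), \tag{17}$$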
the stationary solution of the Fokker-Planck equation (6) with drift (10) and diffusion (11) is 𝒟(Y,ω) for N=3. The above development generalizes to N variables, yielding (12) and reduces to the beta distribution, a univariate specialization of 𝒟 for N=2, where Y1=Y and Y2=1-Y, see [13].
If (12) holds, the stationary solution of the Fokker-Planck equation (6) with drift (10) and diffusion (11) is the Dirichlet distribution, (1). Note that (10)-(11) are one possible way of specifying a drift and a diffusion to arrive at a Dirichlet distribution; other functional forms may be possible. The specifications in (10)-(11) are a generalization of the results for a univariate diffusion process, discussed in [13, 14], whose invariant distribution is beta.
The shape of the Dirichlet distribution, (1), is determined by the N coefficients, ωα. Equation (12) shows that in the stochastic system, different combinations of bα, Sα, and κα may yield the same ωα and that not all of bα, Sα, and κα may be chosen independently to keep the invariant Dirichlet.
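The mapping in (12) from SDE coefficients to Dirichlet parameters can be sketched in a few lines. The function name `dirichlet_parameters` and the use of exact rational arithmetic are illustrative choices, not part of the paper; the example values are the coefficients listed in Table 1.

```python
from fractions import Fraction


def dirichlet_parameters(b, S, kappa):
    """Return [omega_1, ..., omega_N] implied by the SDE coefficients via (12).

    b, S, kappa are sequences of length N-1. Raises ValueError if the
    coefficient restrictions or the compatibility condition (3) are violated.
    """
    if not (len(b) == len(S) == len(kappa)):
        raise ValueError("b, S and kappa must have the same length")
    for b_a, S_a, k_a in zip(b, S, kappa):
        if not (b_a > 0 and k_a > 0 and 0 < S_a < 1):
            raise ValueError("need b > 0, kappa > 0 and 0 < S < 1")
    # omega_alpha = (b_alpha / kappa_alpha) * S_alpha, alpha = 1, ..., N-1
    omega = [b_a * S_a / k_a for b_a, S_a, k_a in zip(b, S, kappa)]
    # omega_N must be the same for every alpha: condition (3)
    omega_N = [b_a * (1 - S_a) / k_a for b_a, S_a, k_a in zip(b, S, kappa)]
    if any(w != omega_N[0] for w in omega_N):
        raise ValueError("coefficients violate (3): omega_N is not unique")
    return omega + [omega_N[0]]


F = Fraction
# Coefficients of Table 1
table1 = dirichlet_parameters([F(1, 10), F(3, 2)], [F(5, 8), F(2, 5)],
                              [F(1, 80), F(3, 10)])
# Doubling both b and kappa leaves every ratio b/kappa, hence omega, unchanged
scaled = dirichlet_parameters([F(1, 5), F(3, 1)], [F(5, 8), F(2, 5)],
                              [F(1, 40), F(3, 5)])
print(table1, scaled)
```

Exact fractions are used so that the equality test for condition (3) is meaningful; with floating-point inputs a tolerance-based comparison would be needed instead.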
4. Corroborating That the Invariant Distribution Is Dirichlet
For any multivariate Fokker-Planck equation there is an equivalent system of Itô diffusion processes, such as the pair of (5)-(6) [12]. Therefore, a way of computing the (discrete) numerical solution of (6) is to integrate (5) in a Monte Carlo fashion for an ensemble [15]. Using a Monte Carlo simulation we show that the statistically stationary solution of the Fokker-Planck equation (6) with drift and diffusion (10)-(11) is a Dirichlet distribution, (1).
The time evolution of an ensemble of particles, each with N=3 variables (Y1,Y2,Y3), is numerically computed by integrating the system in (5), with drift and diffusion (10)-(11), for N=3 as
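$$\mathrm{d}Y_1^{(i)}=\frac{b_1}{2}\left[S_1Y_3^{(i)}-(1-S_1)Y_1^{(i)}\right]\mathrm{d}t+\sqrt{\kappa_1Y_1^{(i)}Y_3^{(i)}}\,\mathrm{d}W_1^{(i)}, \tag{18}$$
$$\mathrm{d}Y_2^{(i)}=\frac{b_2}{2}\left[S_2Y_3^{(i)}-(1-S_2)Y_2^{(i)}\right]\mathrm{d}t+\sqrt{\kappa_2Y_2^{(i)}Y_3^{(i)}}\,\mathrm{d}W_2^{(i)}, \tag{19}$$
$$Y_3^{(i)}=1-Y_1^{(i)}-Y_2^{(i)}, \tag{20}$$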
for each particle $i$. In (18)-(19), $\mathrm{d}W_1$ and $\mathrm{d}W_2$ are independent Wiener processes, sampled from Gaussian streams of random numbers with mean $\langle\mathrm{d}W_\alpha\rangle=0$ and covariance $\langle\mathrm{d}W_\alpha\mathrm{d}W_\beta\rangle=\delta_{\alpha\beta}\,\mathrm{d}t$. 400,000 particle triplets, $(Y_1,Y_2,Y_3)$, are generated with two different initial distributions, a triple-delta and a box, displayed in Figures 1(a) and 2(a), respectively. Each member of both initial ensembles satisfies $\sum_{\alpha=1}^{3}Y_\alpha=1$. Equations (18)–(20) are advanced in time with the Euler-Maruyama scheme [16] with time step $\Delta t=0.05$. Table 1 shows the coefficients of the stochastic system (18)–(20), the corresponding parameters of the final Dirichlet distribution, and the first two moments at the initial time for the triple-delta initial condition. The final state of the ensembles is determined by the SDE coefficients, also given in Table 1, which are constant for these exercises, are the same for both simulations, and satisfy (17).
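A minimal sketch of such an Euler-Maruyama integration of (18)–(20) is given below, assuming NumPy. It is not the authors' code: the function name `simulate_dirichlet_sde`, the ensemble size of 20,000, the uniform-on-the-simplex initial condition, and the clipping of the diffusion argument at zero (a guard against small discretization overshoots of the boundary) are all assumptions made for illustration.

```python
import numpy as np


def simulate_dirichlet_sde(y0, b, S, kappa, dt=0.05, t_end=140.0, seed=0):
    """Advance an ensemble with Euler-Maruyama; returns an array (n, N).

    y0 has shape (n_particles, N-1); the N-th variable is always recovered
    as 1 - sum of the others, so the unit-sum constraint holds by construction.
    """
    rng = np.random.default_rng(seed)
    b, S, kappa = (np.asarray(v, dtype=float) for v in (b, S, kappa))
    y = np.array(y0, dtype=float)  # work on a copy of the initial ensemble
    n_steps = int(round(t_end / dt))
    for _ in range(n_steps):
        y_N = 1.0 - y.sum(axis=1, keepdims=True)
        drift = 0.5 * b * (S * y_N - (1.0 - S) * y)
        # Assumption: clip at zero so the square root stays real if a finite
        # time step carries a sample slightly outside the simplex.
        diffusion = np.sqrt(np.maximum(kappa * y * y_N, 0.0))
        dW = rng.standard_normal(y.shape) * np.sqrt(dt)  # <dW dW> = dt
        y += drift * dt + diffusion * dW
    y_N = 1.0 - y.sum(axis=1, keepdims=True)
    return np.hstack([y, y_N])


# Hypothetical initial ensemble: uniform on the simplex (not the paper's
# triple-delta or box), keeping only (Y1, Y2) as the evolved variables.
y0 = np.random.default_rng(1).dirichlet([1.0, 1.0, 1.0], size=20_000)[:, :2]
Y = simulate_dirichlet_sde(y0, b=[0.1, 1.5], S=[0.625, 0.4],
                           kappa=[1.0 / 80.0, 0.3])
print(Y.mean(axis=0))  # ensemble means of (Y1, Y2, Y3) at t = 140
```

With the Table 1 coefficients the printed means should lie close to the stationary Dirichlet means (1/2, 1/5, 3/10), up to sampling error.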
Table 1: Initial and final states of the Monte Carlo simulation starting from a triple delta. The coefficients, b1, b2, S1, S2, κ1, κ2, of the system of SDEs (18)–(20) determine the distribution to which the system converges. The Dirichlet parameters, implied by the SDE coefficients via (15)–(17), are in brackets. The corresponding statistics are determined by the well-known formulae of Dirichlet distributions [2].

| Coefficients of (18) | Coefficients of (19) | Implied Dirichlet parameter |
|---|---|---|
| b1 = 1/10 | b2 = 3/2 | (ω1 = 5) |
| S1 = 5/8 | S2 = 2/5 | (ω2 = 2) |
| κ1 = 1/80 | κ2 = 3/10 | (ω3 = 3) |

| Statistic | Initial state: triple delta, see Figure 1 | Statistically stationary state (implied Dirichlet distribution) |
|---|---|---|
| 〈Y1〉 | ≈ 0.05 | 1/2 |
| 〈Y2〉 | ≈ 0.42 | 1/5 |
| 〈Y3〉 | ≈ 0.53 | 3/10 |
| 〈y1²〉 | ≈ 0.03 | 1/44 |
| 〈y2²〉 | ≈ 0.125 | 4/275 |
| 〈y3²〉 | ≈ 0.13 | 21/1100 |
| 〈y1y2〉 | ≈ -0.012 | -1/110 |
| 〈y1y3〉 | ≈ -0.017 | -3/220 |
| 〈y2y3〉 | ≈ -0.114 | -3/550 |
Figure 1: Time evolution of the joint probability, ℱ(Y1,Y2), extracted from the numerical solution of (18)–(20). The initial condition is a triple-delta distribution, with unequal peaks at the three corners of the sample space. At the end of the simulation, t=140, the solid lines are those of the distribution extracted from the numerical ensemble, and the dashed lines are those of a Dirichlet distribution to which the solution converges in the statistically stationary state, implied by the constant SDE coefficients, sampled at the same heights.
Figure 2: Time evolution of the joint probability, ℱ(Y1,Y2), extracted from the numerical solution of (18)–(20). The initial condition is a box with diffused sides. By t=160, the distribution converges to the same Dirichlet distribution as in Figure 1.
The time evolutions of the joint probabilities are extracted from both calculations and displayed at different times in Figures 1 and 2. At the end of the simulations two distributions are plotted in Figures 1(d) and 2(d): the one extracted from the numerical ensemble and the Dirichlet distribution determined analytically using the SDE coefficients. The two are in excellent agreement in both figures, which indicates that the statistically stationary solution of the developed stochastic system is the Dirichlet distribution.
For a more quantitative evaluation, the time evolutions of the first two moments,
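$$\mu_\alpha=\langle Y_\alpha\rangle=\int_0^1\!\!\int_0^1 Y_\alpha\,\mathcal{F}(Y_1,Y_2)\,\mathrm{d}Y_1\,\mathrm{d}Y_2, \tag{21}$$
$$\langle y_\alpha y_\beta\rangle=\left\langle\left(Y_\alpha-\langle Y_\alpha\rangle\right)\left(Y_\beta-\langle Y_\beta\rangle\right)\right\rangle, \tag{22}$$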
are also extracted from the numerical simulation with the triple-delta-peak initial condition as ensemble averages and displayed in Figures 3 and 4. The figures show that the statistics converge to the precise state given by the Dirichlet distribution that is prescribed by the SDE coefficients, see Table 1.
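The stationary values that the moments converge to can be reproduced from the standard Dirichlet moment formulae, mean $\omega_\alpha/\omega_0$ and covariance $(\delta_{\alpha\beta}\mu_\alpha-\mu_\alpha\mu_\beta)/(\omega_0+1)$ with $\omega_0=\sum_\alpha\omega_\alpha$. The sketch below evaluates them exactly for the parameters (5, 2, 3) of Table 1; the function name `dirichlet_moments` is an illustrative choice.

```python
from fractions import Fraction


def dirichlet_moments(omega):
    """Exact means and covariance matrix of a Dirichlet distribution."""
    omega = [Fraction(w) for w in omega]
    total = sum(omega)  # omega_0, the sum of all parameters
    mean = [w / total for w in omega]
    n = len(omega)
    # cov_ij = (delta_ij * mu_i - mu_i * mu_j) / (omega_0 + 1)
    cov = [[mean[i] * ((1 if i == j else 0) - mean[j]) / (total + 1)
            for j in range(n)] for i in range(n)]
    return mean, cov


mean, cov = dirichlet_moments([5, 2, 3])
print("means:     ", [str(m) for m in mean])
print("variances: ", [str(cov[i][i]) for i in range(3)])
print("covariances:", str(cov[0][1]), str(cov[0][2]), str(cov[1][2]))
```

All off-diagonal entries are negative, consistent with the nonpositive covariances of the Dirichlet distribution.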
Figure 3: Time evolution of the means, extracted from the numerically integrated system (18)–(20), starting from the triple-delta initial condition. Dotted-solid lines: numerical solution, dashed lines: statistics of the Dirichlet distribution determined analytically using the constant coefficients of the SDE, see Table 1.
Figure 4: Time evolution of the second central moments, extracted from the numerically integrated system (18)–(20), starting from the triple-delta initial condition. The legend is the same as in Figure 3.
The solution approaches a Dirichlet distribution, with nonpositive covariances [2], in the statistically stationary limit, Figure 4(b). Note that during the evolution of the process, 0<t≲80, the solution is not necessarily Dirichlet, but the stochastic variables sum to one at all times. The point (Y1, Y2), governed by (18)-(19), can never leave the (N-1)-dimensional (here N=3) convex polytope and by definition Y3=1-Y1-Y2. The rate at which the numerical solution converges to a Dirichlet distribution is determined by the vectors bα and κα.
The above numerical results confirm that starting from arbitrary realizable ensembles, the solution of the stochastic system converges to a Dirichlet distribution in the statistically stationary state, specified by the SDE coefficients.
5. Relation to Other Diffusion Processes
It is useful to relate the Dirichlet diffusion process, (2), to other multivariate stochastic diffusion processes with linear drift and quadratic diffusion.
A close relative of (2) is the multivariate Wright-Fisher (WF) process [11], used extensively in population and genetic biology,
$$\mathrm{d}Y_\alpha(t)=\frac{1}{2}\left(\omega_\alpha-\omega Y_\alpha\right)\mathrm{d}t+\sum_{\beta=1}^{N-1}\sqrt{Y_\alpha\left(\delta_{\alpha\beta}-Y_\beta\right)}\,\mathrm{d}W_{\alpha\beta}(t),\quad \alpha=1,\dots,N-1, \tag{23}$$
where $\delta_{\alpha\beta}$ is Kronecker's delta, $\omega=\sum_{\beta=1}^{N}\omega_\beta$ with $\omega_\alpha$ defined in (1), and $Y_N=1-\sum_{\beta=1}^{N-1}Y_\beta$. Similar to (2), the statistically stationary solution of (23) is the Dirichlet distribution [17]. It is straightforward to verify that its drift and diffusion also satisfy (7) with $\mathcal{F}\equiv\mathcal{D}$; that is, WF is a process whose invariant is Dirichlet and this solution is potential. A notable difference between (2) and (23), other than the coefficients, is that the diffusion matrix of the Dirichlet diffusion process is diagonal, while that of the WF process is full.
Another process similar to (2) and (23) is the multivariate Jacobi process, used in econometrics,
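$$\mathrm{d}Y_\alpha(t)=a\left(Y_\alpha-\pi_\alpha\right)\mathrm{d}t+\sqrt{cY_\alpha}\,\mathrm{d}W_\alpha(t)-\sum_{\beta=1}^{N-1}Y_\alpha\sqrt{cY_\beta}\,\mathrm{d}W_\beta(t),\quad \alpha=1,\dots,N \tag{24}$$
of Gourieroux and Jasiak [9] with $a<0$, $c>0$, $\pi_\alpha>0$, and $\sum_{\beta=1}^{N}\pi_\beta=1$.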
In the univariate case, the Dirichlet, WF, and Jacobi diffusions reduce to
$$\mathrm{d}Y(t)=\frac{b}{2}(S-Y)\,\mathrm{d}t+\sqrt{\kappa Y(1-Y)}\,\mathrm{d}W(t), \tag{25}$$
see also [13], whose invariant is the beta distribution, which belongs to the family of Pearson diffusions, discussed in detail by Forman and Sørensen [14].
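For the Dirichlet diffusion, the univariate reduction can be checked directly by setting $N=2$, $Y_1=Y$, $Y_2=1-Y$ in (2) and dropping the index. The following worked step is a sketch for that process only (the WF and Jacobi reductions are not shown); the invariant beta parameters follow from (12).

```latex
\begin{align}
\frac{b}{2}\left[S(1-Y)-(1-S)Y\right] &= \frac{b}{2}\,(S-Y), \\
\sqrt{\kappa\,Y\,Y_2} &= \sqrt{\kappa\,Y(1-Y)}, \\
\omega_1 = \frac{b}{\kappa}\,S, \qquad
\omega_2 &= \frac{b}{\kappa}\,(1-S), \\
\mathcal{F}(Y) &\propto Y^{\,bS/\kappa-1}\,(1-Y)^{\,b(1-S)/\kappa-1}.
\end{align}
```

With a single variable there is only one ratio $b/\kappa$, so the compatibility condition (3) is satisfied trivially.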
6. Summary
The method of potential solutions of Fokker-Planck equations has been used to derive a transport equation for the joint distribution of N fluctuating variables. The equivalent stochastic process, governing the set of random variables, $0\le Y_\alpha$, $\alpha=1,\dots,N-1$, $\sum_{\alpha=1}^{N-1}Y_\alpha\le 1$, reads as
$$\mathrm{d}Y_\alpha(t)=\frac{b_\alpha}{2}\left[S_\alpha Y_N-(1-S_\alpha)Y_\alpha\right]\mathrm{d}t+\sqrt{\kappa_\alpha Y_\alpha Y_N}\,\mathrm{d}W_\alpha(t),\quad \alpha=1,\dots,N-1, \tag{26}$$
where $Y_N=1-\sum_{\beta=1}^{N-1}Y_\beta$ and $b_\alpha$, $\kappa_\alpha$, and $S_\alpha$ are parameters, while $\mathrm{d}W_\alpha(t)$ is an isotropic Wiener process with independent increments. Restricting the coefficients to $b_\alpha>0$, $\kappa_\alpha>0$, and $0<S_\alpha<1$ and defining $Y_N$ as above ensure $\sum_{\alpha=1}^{N}Y_\alpha=1$ and that individual realizations of $(Y_1,Y_2,\dots,Y_N)$ are confined to the (N-1)-dimensional convex polytope of the sample space. Equation (26) can therefore be used to numerically evolve the joint distribution of N fluctuating variables required to satisfy a conservation principle. Equation (26) is a coupled system of nonlinear stochastic differential equations whose statistically stationary solution is the Dirichlet distribution, (1), provided that the coefficients satisfy
$$\frac{b_1}{\kappa_1}(1-S_1)=\dots=\frac{b_{N-1}}{\kappa_{N-1}}(1-S_{N-1}). \tag{27}$$
In stochastic modeling, one typically begins with a physical problem, perhaps discrete, then derives the stochastic differential equations whose solution yields a distribution. In this paper we reversed the process: we assumed a desired stationary distribution and derived the stochastic differential equations that converge to the assumed distribution. A potential solution form of the Fokker-Planck equation was posited, from which we obtained the stochastic differential equations for the diffusion process whose statistically stationary solution is the Dirichlet distribution. We have also made connections to other stochastic processes, such as the Wright-Fisher diffusions of population biology and the Jacobi diffusions in econometrics, whose invariant distributions possess similar properties but whose stochastic differential equations are different.
Acknowledgments
It is a pleasure to acknowledge a series of informative discussions with J. Waltz. This work was performed under the auspices of the U.S. Department of Energy under the Advanced Simulation and Computing Program.
References
[1] N. L. Johnson, "An approximation to the multinomial distribution some properties and applications."
[2] J. E. Mosimann, "On the compound multinomial distribution, the multivariate β-distribution, and correlations among proportions."
[3] S. Kotz, N. L. Johnson, and N. Balakrishnan.
[4] K. Pearson, "Mathematical contributions to the theory of evolution. On a form of spurious correlation which may arise when indices are used in the measurement of organs."
[5] C. D. M. Paulino and C. A. de Bragança Pereira, "Bayesian methods for categorical data under informative general censoring."
[6] F. Chayes, "Numerical correlation and petrographic variation."
[7] P. S. Martin and J. E. Mosimann, "Geochronology of pluvial lake cochise, Southern Arizona, [part] 3, pollen statistics and pleistocene metastability."
[8] K. Lange, "Applications of the Dirichlet distribution to forensic match probabilities."
[9] C. Gourieroux and J. Jasiak, "Multivariate Jacobi process with application to smooth transitions."
[10] S. S. Girimaji, "Assumed beta-pdf model for turbulent mixing: validation and extension to multiple scalar mixing."
[11] M. Steinrucken, Y. X. Rachel Wang, and Y. S. Song, "An explicit transition density expansion for a multi-allelic Wright-Fisher diffusion with general diploid selection."
[12] C. W. Gardiner.
[13] J. Bakosi and J. R. Ristorcelli, "Exploring the beta distribution in variable-density turbulent mixing."
[14] J. L. Forman and M. Sørensen, "The Pearson diffusions: a class of statistically tractable diffusion processes."
[15] S. B. Pope, "PDF methods for turbulent reactive flows."
[16] P. E. Kloeden and E. Platen.
[17] S. Karlin and H. M. Taylor.