
Box G. E. P. and Draper N. R. (1987), Empirical Model Building and Response Surfaces, John Wiley & Sons, NY, p. 424.

In many cases relevant to biomedicine, objects of interest require a variable time, which follows a certain distribution, to pass from an initial state to an intermediate state, from which they exit at random to a final state. In such cases, the distribution of the times between exiting the initial state and entering the final state must conform to the convolution of the first distribution and a negative exponential distribution. A common example is the exponentially modified Gaussian (EMG), which is widely used in chromatography for peak analysis and has long been known as the ex-Gaussian in psychophysiology, where it is applied to times from stimulus to response. In molecular and cell biology, EMG provides better fits to the variability of times between consecutive cell divisions and between transcriptional bursts than commonly used simple distributions, such as lognormal, gamma, and Wald, and its parameters are more straightforward to interpret. However, since the domain of the Gaussian component of EMG is unbounded, data approximation with EMG may extend to the negative domain. This extension may seem negligible when the coefficient of variation of the Gaussian component is small but becomes considerable as the coefficient increases. Therefore, although in many cases EMG may be an acceptable approximation of data, an exponentially modified nonnegative peak function, such as the gamma-distribution, can make more sense in physical terms. In the present short review, EMG and the exponentially modified gamma-distribution (EMGD) are discussed with regard to their applicability to data on the cell cycle, gene expression, physiological responses to stimuli, and other cases, some of which may be interpreted as decision-making. In practical fitting terms, EMG and EMGD are equivalent in outperforming other functions; however, when the coefficient of variation of the Gaussian component of EMG is greater than ca. 0.4, EMGD is preferable.

The normal (Gaussian) and the exponential distributions are probably the most widely known ones, and both are ubiquitous. It is no wonder, then, that situations arise in which the two are expected to combine. The resulting composite distributions have been suggested to be relevant, for example, to times between consecutive cell divisions [

Generally speaking, suppose that the passage of objects of a certain type from their initial state to their final state consists of a variable time of transition to an intermediate state and a variable time of dwelling in that state, from which the objects exit at random to the final state. The random variable that represents the overall passage time of any such object is then the sum of two independent random variables, the transit time and the dwell time, and the distribution of the overall passage times is the convolution of the distributions of its summands [

The plots of a Gaussian

Data on intervals between the transcriptional bursts of the prolactin gene [

Data on response time distribution (RTD) picked up from [

EMG was introduced about 40 years ago in chromatography (see [

More recently, EMG was suggested to be applicable to time distributions related to cell proliferation and differentiation [

However, there are cases in which the deconvolution of an apparent EMG yields a Gaussian with a significant portion extending to the negative domain, which makes no physical sense. In such cases, nonnegative peak functions must be more appropriate for being convoluted with the negative exponential. In particular, a closed form expression for the exponentially modified gamma-distribution (EMGD) has been suggested and shown to be relevant to at least some such cases [

Exponentially modified functions result from applying the mathematical operator called convolution to nonmodified functions. The conventional definitions and notations for convolution are as follows:
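In the standard textbook form, sketched here with the generic symbols $f$, $g$, $t$, and $\tau$ (these symbols are assumptions and need not match the notation used elsewhere in this review), the convolution of two functions, and its special case for functions that both vanish at negative arguments, may be written as:

```latex
\begin{align*}
(f * g)(t) &= \int_{-\infty}^{\infty} f(\tau)\, g(t-\tau)\, d\tau
            = \int_{-\infty}^{\infty} f(t-\tau)\, g(\tau)\, d\tau , \\
(f * g)(t) &= \int_{0}^{t} f(\tau)\, g(t-\tau)\, d\tau
            \qquad \text{if } f(x) = g(x) = 0 \text{ for } x < 0 .
\end{align*}
```

The second line follows from the first because the integrand is nonzero only where both $\tau \ge 0$ and $t - \tau \ge 0$.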

When the domain of any of the convoluted functions is other than

In probability theory, convolution is used to define the probability density function (PDF) of the sum

Integration limits in (

If probability densities of one of the variables are zero in the negative domain, then

If probability densities of both variables are zero in the negative domain, then [

For the subsequent discussion, notations will be chosen to account for the biological meanings behind them. In particular, the stochastic variable is time

Using this notation, the convolution of a normal distribution and an exponential distribution is written as

The computation of a closed form solution of

It may be solved by introducing the error function erf, which is not representable with elementary functions:
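The error function is conventionally defined as erf(x) = (2/√π) ∫₀ˣ exp(−u²) du, and its complement is erfc(x) = 1 − erf(x). As a minimal sketch, assuming the widely used parameterization of EMG by the Gaussian mean `mu`, the Gaussian standard deviation `sigma`, and the exponential time constant `tau` (these names are illustrative and are not the notation of this review), the erfc-based closed form can be checked against a brute-force evaluation of the convolution integral:

```python
import math


def emg_pdf(t, mu, sigma, tau):
    """EMG density in its common erfc-based closed form.

    mu, sigma: mean and standard deviation of the Gaussian component.
    tau: mean of the exponential component (1 / rate constant).
    """
    exponent = mu / tau + sigma**2 / (2.0 * tau**2) - t / tau
    z = (mu + sigma**2 / tau - t) / (math.sqrt(2.0) * sigma)
    return math.exp(exponent) * math.erfc(z) / (2.0 * tau)


def normal_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def emg_pdf_numeric(t, mu, sigma, tau, upper=None, steps=20000):
    """Trapezoid-rule evaluation of the convolution integral over dwell time s >= 0."""
    if upper is None:
        upper = 40.0 * tau  # the exponential tail beyond 40 time constants is negligible
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        s = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-s / tau) / tau * normal_pdf(t - s, mu, sigma)
    return total * h


# Illustrative parameter values only; they are not fitted to any data set.
for t in (8.0, 10.0, 12.0, 16.0):
    print(t, emg_pdf(t, 10.0, 1.0, 2.0), emg_pdf_numeric(t, 10.0, 1.0, 2.0))
```

The two columns should agree to several decimal places. Note that this direct form multiplies a large exponential by a small erfc value when `sigma` is much larger than `tau`, so it can overflow there; rearranged forms are preferable in that regime.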

The resulting closed form solution of (

A generic plot of EMG built according to (

The equivalents of (

A problem with (

It has been suggested [

The convolution integral for EMGD may be written as

The gamma-distribution is a generalization of the Erlang distribution

The negative exponential component of (

The solution of the improper integral by (

Algorithms for numerical treatment of incomplete gamma-functions are included in standard curve-fitting software tools, such as TableCurve 2D used in the present paper and earlier [

Figure

Although fits to data according to the determination coefficient (

With all that, data points related to an empirical distribution may be, and often are, located on the time axis so far from its origin that EMG and EMGD become equivalent as analytical approximations. This is illustrated in Figure

In Figure

Increasing the height of the intersection of EMG (see (

Fitting data on an amperometric spike produced by adrenaline release from adrenal medulla cell synaptic vesicle. Data points (open circles) are obtained by manual tracking of an amperometric trace presented in [

In chromatography, EMG has been employed since the mid-1960s to describe chromatographic peaks, whose shape is presumably determined by the diffusion-caused Gaussian blur of a compound during its passage through a column and by extracolumn effects of its exponential dilution in a detector cell [

Different authors suggest a bewildering variety of mathematical expressions for EMG to be used in chromatography. The list compiled by Di Marco and Bombi [

Hanggi and Carr [

At present, (

The first attempts to use EMG in a biological context, which date back to the early 1960s, relate to response time distributions (RTD) in psychophysiological studies (reviewed in [

The conformance of EMG and EMGD to an exemplary RTD presented in [

The use of EMG in psychophysiology was motivated by the intent to attribute a physiological significance to observed changes in RTD shapes. In particular, the centroid

Some recent examples of using the deconvolution of EMG (ex-Gaussian) for distinguishing the different phases of psychophysical processes include cognitive changes that occur upon normal brain aging and Alzheimer disease [

It has been shown [

The attempt to fit IDT with EMG was prompted by the transition probability model of the cell cycle [

The intermediate state, which is mapped to G1 and is exited by cells at random, may be associated with the restriction point (RP) of the cell cycle ([

EMGD has been shown to approximate IDT distribution as well as EMG does [

EMG has been suggested to be relevant to possible transcriptional mechanisms of the events that are associated with the restriction point of the cell cycle and can determine cell fate [

The events are thought to result from discrete random changes in gene activity. The rationale for this suggestion is detailed elsewhere [

The preparation of an idle gene for its next bout of transcriptional engagement, which is switched on by the complete assembly of a proper preinitiation complex at the gene promoter, is likely to involve more than two steps, which amount to several sequential first-order processes at the cell population level. Therefore, the overall time distribution must conform to the convolution of several exponentials, which, if the exponentials are identical, is described by the Erlang distribution, a particular case of the gamma-distribution. When one of the processes stands out from all the others in having a far lower rate constant, the overall distribution will be the convolution of the exponential generated by this process with the distribution collectively generated by all the other processes (see Section

The same must be true for the cell cycle where reaching an RP-associated state is likely to be brought about by a combination of numerous loosely correlated events, each constituting, at the cell population level, a process having a much higher rate compared with the process associated with cell passage through the RP.
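The argument above can be illustrated with a small simulation. This is a sketch under stated assumptions: the number of fast steps (`n_fast`), their common rate constant (`fast_rate`), and the rate constant of the slow step (`slow_rate`) are arbitrary illustrative values, not estimates from any data discussed here. The sum of identical fast exponential steps is Erlang distributed (a peaked transit time), and adding the single slow exponential step (the dwell time) gives an exponentially modified peak distribution whose mean and variance are the sums of those of its two components.

```python
import random
import statistics


def simulate_passage_times(n_objects, n_fast, fast_rate, slow_rate, seed=0):
    """Draw overall passage times as Erlang transit time plus exponential dwell time."""
    rng = random.Random(seed)
    times = []
    for _ in range(n_objects):
        # Transit: n_fast sequential first-order steps with the same rate constant.
        transit = sum(rng.expovariate(fast_rate) for _ in range(n_fast))
        # Dwell: a single slow first-order step exited at random.
        dwell = rng.expovariate(slow_rate)
        times.append(transit + dwell)
    return times


def expected_moments(n_fast, fast_rate, slow_rate):
    """Theoretical mean and variance of the sum of independent components."""
    mean = n_fast / fast_rate + 1.0 / slow_rate
    variance = n_fast / fast_rate**2 + 1.0 / slow_rate**2
    return mean, variance


times = simulate_passage_times(50000, n_fast=20, fast_rate=4.0, slow_rate=0.5)
mean, variance = expected_moments(20, 4.0, 0.5)
print("mean:    ", statistics.mean(times), "expected", mean)
print("variance:", statistics.variance(times), "expected", variance)
print("median:  ", statistics.median(times))  # below the mean for a right-skewed distribution
```

With these values the slow step contributes most of the variance, so the simulated distribution has a Gaussian-like rising edge and an exponential tail.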

In either case, EMG- or EMGD-based analysis of time distributions makes it possible to distinguish influences on the slowest process, such as cell passage through RP, from influences on the other constituents of the overall process. With regard to cell proliferation, the importance of such distinction follows from the possibility of the involvement of cell dwelling in the RP-associated state in cell differentiation [

The conformance of a distribution to an exponentially modified peak function is an indication that the underlying processes collectively generate transit times, whose distribution is peaked, and exponentially distributed dwell times and that the mean dwell time is much longer than the mean characteristic time of any of the processes that determine the transit time. It has been shown [

In a study of microtubule growth in living cells by confocal fluorescence microscopy, the profiles of fluorescence intensity of complexes of microtubule-end binding protein with green fluorescent protein were fit to EMG based on the premise that these profiles are generated by a convolution of exponential decay with point Gaussian blur associated with microscopy [

EMG was used to model human skin conductance changes in response to stimuli, such as noise or image [

In several studies of neurosecretion [

More remote from the initial field of EMG applicability are studies where EMG was used to model channel holding time distribution in public telephony systems [

A search for other cases where EMG may be applicable to time distributions for the reasons discussed above suggests that this may be true for some situations related to city traffic, such as that described in the paper [

Making inferences from the shapes of distributions of variable parameters is a routine approach in physics; however, in biomedicine it has long been limited by the enormous amount of calculation involved. This situation has been changing in recent years owing to the availability of user-friendly software designed for this purpose.

A significant difference between biology and physics in this regard is that, in physics, distributions are often observed directly, whereas in biology they are generated by counting procedures, such as single-cell tracking, which may be very time-consuming and laborious. This problem will hopefully be ameliorated by the advent of novel automated single-cell tracking and other single-event monitoring techniques, which are increasingly being introduced in biomedicine.

In the biological context, EMG and EMGD are useful as models which suggest that, among the processes that generate the distribution of times required for each object in a population to pass between two observed consecutive states, there is a first-order process whose rate constant is much lower than those of all the other processes. This slow process may be regarded as being generated by random discrete events that occur at a frequency that makes the mean interevent time comparable with the mean time between the two apparent consecutive states of the objects in question [

However, no model can account for everything; therefore, any model is a compromise between physical (biological) relevance and mathematical tractability. To reiterate the epigraph to this paper: “… all models are wrong; the practical question is how wrong do they have to be to not be useful” (G. E. P. Box and N. R. Draper, 1987, p. 74). In this regard, EMG, which extends to the negative domain, is essentially wrong when it is applied to interevent time distributions; however, because of its easier mathematical tractability and the availability of ready-to-use curve-fitting tools, it is still useful as a descriptive option when the coefficient of variation of its Gaussian component is low, within ca. 0.4. When the coefficient is higher than that, the use of EMGD is warranted. When a distribution belonging to a class of phenomena to which EMG or EMGD has been found to be generally applicable fails to conform to them in a specific case, this may suggest that either a methodological bias or some unrecognized factor is at work.
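The role of the ca. 0.4 threshold can be made concrete with a short calculation. For a Gaussian with mean `mu` > 0 and standard deviation `sigma`, the probability mass below zero is Φ(−mu/sigma) = Φ(−1/CV), where Φ is the standard normal cumulative distribution function and CV = sigma/mu is the coefficient of variation. The sketch below (function and variable names are illustrative assumptions) tabulates this fraction; it concerns only the Gaussian component, not the full EMG.

```python
import math


def negative_mass_of_gaussian(cv):
    """Fraction of a Gaussian's probability mass at t < 0, given CV = sigma / mu with mu > 0."""
    if cv <= 0:
        raise ValueError("cv must be positive")
    # Phi(-1/cv) expressed through the complementary error function.
    return 0.5 * math.erfc(1.0 / (cv * math.sqrt(2.0)))


for cv in (0.2, 0.3, 0.4, 0.5, 0.7, 1.0):
    print(f"CV = {cv:.1f}: negative fraction = {negative_mass_of_gaussian(cv):.4f}")
```

At CV = 0.4 the Gaussian component places about 0.6% of its mass at negative times, and this grows to about 16% at CV = 1.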

The use of EMG and/or EMGD in cell biology may help to supplement the assessment of the duration of the different phases of the cell cycle (G1, S, and G2) based on DNA content distributions with the assessment of the probabilistic and the deterministic parts of the cell cycle, according to the transition probability model, based on interdivision time distributions [

Similarly, in psychophysiology, EMG or EMGD may help to distinguish between a phase required to make a decision to respond to a stimulus and a phase required to either execute the decision or to become prepared to make it [

Based on the apparent pervasiveness of exponentially modified peak functions, such as EMG and EMGD, in different biological contexts, from molecular through cellular to physiological, it is tempting to suggest that this reflects a common biological strategy of making decisions, for example, whether or not to commence the next burst of gene transcription, cycle of cell proliferation, or response to a stimulus, with the alternative options being chosen at random each time. Insofar as the probabilities of such choices depend on conditions, a population of biological systems can thus arrive at an optimal balance of different ways of coping with the unforeseeably changing world around them.

The author declares no conflicts of interest.