Classical Ergodicity and Modern Portfolio Theory

What role have theoretical methods initially developed in mathematics and physics played in the progress of financial economics? What is the relationship between financial economics and econophysics? What is the relevance of the ‘classical ergodicity hypothesis’ to modern portfolio theory? This paper addresses these questions by reviewing the etymology and history of the classical ergodicity hypothesis in 19th century statistical mechanics. An explanation of classical ergodicity is provided that establishes a connection to the fundamental empirical problem of using non-experimental data to verify theoretical propositions in modern portfolio theory. The role of the ergodicity assumption in the ex post / ex ante quandary confronting modern portfolio theory is also examined.


Introduction
At least since Markowitz (1952) initiated modern portfolio theory (MPT), it has often been maintained that the tradeoff between systematic risk and expected return is the most important theoretical element of financial economics, e.g., Campbell (1996). Extending Mirowski (1984), the static equilibrium methods used to develop propositions in MPT, such as the capital asset pricing model, can be traced to mathematical concepts developed from the deterministic 'rational mechanics' approach to 19th century physics. In the years since Markowitz (1952), financial economics has also adopted alternative mathematical methods from more recent contributions to physics, especially the diffusion processes employed by Black and Scholes (1973) to determine option prices. The emergence of econophysics during the last decade of the twentieth century, e.g., Roehner (2002); Jovanovic and Schinckus (2013), has provided a variety of theoretical and empirical methods adapted from physics, ranging from statistical mechanics to chaos theory, to analyze financial phenomena. Yet, despite considerable overlap in method, contributions to econophysics have gained limited attention in financial economics. In contrast, econophysicists generally consider financial economics to be primarily concerned with a core theory that is inconsistent with the empirical orientation of physical theory.
Physical theory has evolved considerably from the constrained optimization, static equilibrium approach of rational mechanics which underpins MPT. In detailing historical developments in physics since the 19th century, it is conventional to jump from the determinism of rational mechanics[2] to quantum mechanics to recent developments in chaos theory, overlooking the relevance of the initial steps toward modeling stochastic behavior of physical phenomena by Ludwig Boltzmann (1844-1906), James Maxwell (1831-1879) and Josiah Gibbs (1839-1903). As such, there is a point of demarcation between the intellectual pre-histories of MPT and econophysics that can, arguably, be traced to the debate over energistics around the end of the 19th century. While the evolution of physics after energistics involved the introduction and subsequent stochastic generalization of ergodic concepts, financial economics, fueled by the emergence of MPT following Markowitz, incorporated ergodicity into empirical methods aimed at generalizing and testing the capital asset pricing model and other elements of MPT.[1] Significantly, stochastic generalization of the static equilibrium approach of MPT required the adoption of a restricted class of ergodic processes, i.e., 'time reversible' probabilistic models, especially the unimodal likelihood functions associated with certain stationary distributions. In contrast, from the early ergodic models of Boltzmann to the fractals and chaos theory of Mandelbrot, physics has employed a wider variety of ergodic and non-ergodic stochastic models aimed at capturing key empirical characteristics of the physical problem at hand. Such models typically have a mathematical structure that differs from the constrained optimization techniques underpinning MPT, restricting the straightforward application of many physical models. Yet, the demarcation between the use of ergodic notions in physics and financial economics was blurred substantively by the introduction of diffusion process techniques to solve contingent claims valuation problems. Following contributions by Sprenkle (1961) and Samuelson (1965), Black and Scholes (1973) and Merton (1973) provided an empirically viable method of using diffusion methods to determine, using Ito's lemma, a partial differential equation that can be solved for an option price.[3] Use of Ito's lemma to solve stochastic optimization problems is now commonplace in financial economics, e.g., Brennan and Schwartz (1979), including the continuous time generalizations of MPT, e.g., Epstein and Ji (2013). In spite of the considerable progression of certain mathematical techniques employed in physics into financial economics, overcoming the difficulties of adapting the wide range of models developed for physical situations to the empirical properties of financial data is still a central problem confronting econophysics. Schinckus (2010, p.3816) accurately recognizes that the positivist philosophical foundation of econophysics depends fundamentally on empirical observation: "The empiricist dimension is probably the first positivist feature of econophysics". Following McCauley (2004) and others, this concern with empiricism often focuses on the identification of macro-level statistical regularities that are characterized by the scaling laws identified by Mandelbrot (1997) and Mandelbrot and Hudson (2004) for financial data.
Unfortunately, this empirically driven ideal is often confounded by the 'non-repeatable' experiment that characterizes most observed economic and financial data. There is a quandary posed by having only a single observed ex post time path with which to estimate the distributional parameters for the ensemble of ex ante time paths needed to make decisions involving future values of financial variables. In contrast to natural sciences such as physics, in the human sciences there is no assurance that ex post statistical regularity translates into ex ante forecasting accuracy.
Resolution of this quandary highlights the usefulness of employing a 'phenomenological' approach to modeling stochastic properties of financial variables relevant to MPT.
To this end, this paper provides an etymology and history of the 'classical ergodicity hypothesis' in 19th century statistical mechanics. Subsequent use of ergodicity in financial economics, in general, and MPT, in particular, is then examined. A modern interpretation of classical ergodicity is provided that uses Sturm-Liouville theory, a mathematical method central to classical statistical mechanics, to decompose the transition probability density of a one-dimensional diffusion process subject to regular upper and lower reflecting barriers. This 'classical' decomposition divides the transition density of an ergodic process into a possibly multi-modal limiting stationary density which is independent of time and initial condition, and a power series of time and boundary dependent transient terms. In contrast, empirical theory aimed at estimating relationships from MPT typically ignores the implications of the initial and boundary conditions that generate transient terms and focuses on properties of a particular class of unimodal limiting stationary densities with finite parameters. To illustrate the implications of the expanded class of ergodic processes available to econophysics, properties of the bi-modal quartic exponential stationary density are considered and used to assess the ability of the classical ergodicity hypothesis to explain certain 'stylized facts' associated with the ex post / ex ante quandary confronting MPT.

A Brief History of Classical Ergodicity
The Encyclopedia of Mathematics (2002) defines ergodic theory as the "metric theory of dynamical systems. The branch of the theory of dynamical systems that studies systems with an invariant measure and related problems." This modern definition implicitly identifies the birth of ergodic theory with proofs of the mean ergodic theorem by von Neumann (1932) and the pointwise ergodic theorem by Birkhoff (1931). These early proofs have had significant impact in a wide range of modern subjects. For example, the notions of invariant measure and metric transitivity used in the proofs are fundamental to the measure theoretic foundation of modern probability theory (Doob 1953; Mackey 1974). Building on the seminal contribution to probability theory by Kolmogorov (1933), it was recognized in the years immediately following that the ergodic theorems generalize the strong law of large numbers. Similarly, the equality of ensemble and time averages, the essence of the mean ergodic theorem, is necessary to the concept of a strictly stationary stochastic process.
Ergodic theory is the basis for the modern study of random dynamical systems, e.g., Arnold (1998).
In mathematics, ergodic theory connects measure theory with the theory of transformation groups.
This connection is important in motivating the generalization of harmonic analysis from the real line to locally compact groups.
From the perspective of modern mathematics, statistical physics or systems theory, Birkhoff (1931) and von Neumann (1932) are excellent starting points for a modern history of ergodic theory.
Building on the modern ergodic theorems, subsequent developments in these and related fields have been dramatic. These contributions mark the solution to a problem in statistical mechanics and thermodynamics that was recognized sixty years earlier when Ludwig Boltzmann introduced the classical ergodic hypothesis to permit the theoretical phase space average to be interchanged with the measurable time average. For the purpose of contrasting methods from physics and econophysics with those used in MPT, the selection of the less formally rigorous classical ergodic hypothesis of Boltzmann is a more auspicious beginning. Problems of interest in mathematics are generated by a range of subjects, such as physics, chemistry, engineering and biology. The formulation and solution of physical problems in, say, statistical mechanics or particle physics will have mathematical features which are inapplicable or unnecessary for MPT. For example, in statistical mechanics, points in the phase space are often multi-dimensional functions representing the mechanical state of the system, hence the desirability of a group-theoretic interpretation of the ergodic hypothesis. From the perspective of MPT, such complications are largely irrelevant. The history of classical ergodic theory captures the etymology and basic physical interpretation, providing a more revealing pre-history of the relevant MPT mathematics. This arguably more revealing pre-history begins with the formulation of theoretical problems that von Neumann and Birkhoff were later able to solve. Mirowski (1984; 1989a, esp. ch. 5) establishes the importance of 19th century physics in the development of the neoclassical economic system advanced by W. Stanley Jevons (1835-1882) and Léon Walras (1834-1910) during the marginalist revolution of the 1870s. Being derived using principles from neoclassical economic theory, MPT also inherited essential features of mid-19th century physics: deterministic rational mechanics; conservation of energy; and the non-atomistic continuum view of matter that inspired the energetics movement later in the 19th century.[2] More precisely, from neoclassical economics MPT inherited a variety of static equilibrium techniques and tools such as mean-variance utility functions and constrained optimization. As such, failings of neoclassical economics identified by econophysicists also apply to central propositions of MPT. Included in these failings is an over-emphasis on theoretical results at the expense of identifying models that have greater empirical validity, e.g., Roehner (2002); Schinckus (2010).
It was during the transition from rational to statistical mechanics during the last third of the 19th century that Boltzmann made contributions leading to the transformation of theoretical physics from the microscopic mechanistic models of Rudolf Clausius (1822-1888) and James Maxwell to the macroscopic probabilistic theories of Josiah Gibbs and Albert Einstein (1879-1955).[3] Coming largely after the start of the marginalist revolution in economics, this fundamental transformation in theoretical physics had little impact on the progression of financial economics until the appearance of diffusion equations in contributions on continuous time finance that started in the 1960s and culminated in Black and Scholes (1973). The deterministic mechanics of the energistic approach was well suited to the axiomatic formalization of neoclassical economic theory which culminated in: the von Neumann and Morgenstern expected utility approach to modeling uncertainty; the Bourbaki inspired Arrow-Debreu general equilibrium theory, e.g., Weintraub (2002); and, ultimately, MPT.
In turn, empirical estimation and the subsequent extension of static equilibrium MPT results to continuous time were facilitated by the adoption of a narrow class of ergodic processes.
Having descended from the deterministic rational mechanics of mid-19th century physics, defining works of MPT do not capture the probabilistic approach to modeling systems initially introduced by Boltzmann, an approach whose implications still resonate in many subjects of the modern era. The etymology for "ergodic" begins with an 1884 paper by Boltzmann, though the initial insight to use probabilities to describe a gas system can be found as early as 1857 in a paper by Clausius and in the famous 1860 and 1867 papers by Maxwell.[5] The Maxwell distribution is defined over the velocity of gas molecules and provides the probability for the relative number of molecules with velocities in a certain range. Using a mechanical model that involved molecular collision, Maxwell (1867) was able to demonstrate that, in thermal equilibrium, this distribution of molecular velocities was a 'stationary' distribution that would not change shape due to ongoing molecular collision. Boltzmann aimed to determine whether the Maxwell distribution would emerge in the limit, whatever the initial state of the gas. In order to study the dynamics of the equilibrium distribution over time, Boltzmann introduced the probability distribution of the relative time a gas molecule has a velocity in a certain range while still retaining the notion of probability for velocities of a relative number of gas molecules. Under the classical ergodic hypothesis, the average behavior of the macroscopic gas system, which can objectively be measured over time, can be interchanged with the average value calculated from the ensemble of unobservable and highly complex microscopic molecular motions at a given point in time. In the words of Wiener (1939, p.1): "Both in the older Maxwell theory and in the later theory of Gibbs, it is necessary to make some sort of logical transition between the average behavior of all dynamical systems of a given family or ensemble, and the historical average of a single system."

Use of the Ergodic Hypothesis in Financial Economics
At least since Samuelson (1976), it has been recognized that empirical theory and estimation in economics, in general, and financial economics, in particular, relies heavily on the use of specific unimodal stationary distributions associated with a particular class of ergodic processes. As reflected in the evolution of the concept in economics, the specification and implications of ergodicity have only developed gradually. The early presentation of ergodicity by Samuelson (1976) involves the addition of a discrete Markov error term into the deterministic cobweb model to demonstrate that estimated forecasts of future values, such as prices, "should be less variable than the actual data".
Considerable opaqueness about the definition of ergodicity is reflected in the statement that a "'stable' stochastic process ... eventually forgets its past and therefore in the far future can be expected to approach an ergodic probability distribution" (Samuelson 1976, p.2). The connection between ergodic processes and non-linear dynamics that characterizes present efforts in economics goes unrecognized, e.g., Samuelson (1976, p.1, 5). While some explicit applications of ergodic processes to theoretical modeling in economics have emerged since Samuelson (1976), e.g., Horst and Wenzelburger (1984); Dixit and Pindyck (1994), financial econometrics has produced the bulk of the contributions.
Initial empirical estimation for the deterministic models of neoclassical economics proceeded with the addition of a stationary, usually Gaussian, error term to produce a discrete time general linear model (GLM) leading to estimation using ordinary least squares or maximum likelihood techniques.
In the history of MPT, such early estimations were associated with tests of the capital asset pricing model such as the "market model", e.g., Elton and Gruber (1984). Iterations and extensions of the GLM to deal with complications arising in empirical estimates dominated early work in econometrics, e.g., Dhrymes (1974) and Theil (1971), leading to application of generalized least squares estimation techniques that encompassed autocorrelated and heteroskedastic error terms. Employing $L^2$ vector space methods with stationary Gaussian-based error term distributions ensured these early stochastic models implicitly assumed ergodicity. The generalization of this discrete time estimation approach to the class of ARCH and GARCH error term models by Engle and Granger was of such significance that a Nobel prize in economics was awarded for this contribution, e.g., Engle and Granger (1987).
By modeling the evolution of the volatility, this approach permitted a limited degree of non-linearity to be modeled providing a substantively better fit of MPT models to observed financial time series, e.g., Beaulieu et al. (2013).
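To make the volatility dynamics concrete, the following sketch simulates a GARCH(1,1) process. The parameter values and the simulation itself are illustrative assumptions rather than estimates from any particular financial series; the point is only that returns are serially uncorrelated while squared returns are not, the limited degree of non-linearity referred to above.

```python
import numpy as np

rng = np.random.default_rng(0)

# GARCH(1,1): r_t = sigma_t * z_t,  sigma_t^2 = omega + alpha*r_{t-1}^2 + beta*sigma_{t-1}^2
omega, alpha, beta = 0.05, 0.10, 0.85   # illustrative; alpha + beta < 1 gives covariance stationarity
T = 2000
r = np.zeros(T)
sigma2 = np.full(T, omega / (1.0 - alpha - beta))  # start at the unconditional variance

for t in range(1, T):
    sigma2[t] = omega + alpha * r[t - 1] ** 2 + beta * sigma2[t - 1]
    r[t] = np.sqrt(sigma2[t]) * rng.standard_normal()

def acf1(x):
    """Lag-1 sample autocorrelation."""
    x = x - x.mean()
    return (x[:-1] * x[1:]).sum() / (x * x).sum()

print("lag-1 autocorrelation of returns        :", round(acf1(r), 3))      # near zero
print("lag-1 autocorrelation of squared returns:", round(acf1(r ** 2), 3)) # clearly positive
```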
The emergence of ARCH, GARCH and related empirical models was part of a general trend toward the use of inductive methods in economics, often employing discrete, linear time series methods to model transformed economic variables, e.g., Hendry (1995). At least since Dickey and Fuller (1979), it has been recognized that estimates of univariate time series models for many financial time series reveal evidence of 'non-stationarity'. A number of approaches have emerged to deal with this apparent empirical quandary.[6] In particular, transformation techniques for time series models have received considerable attention. Extension of the Box-Jenkins methodology led to the concept of economic time series being I(0), stationary in the level, or I(1), non-stationary in the level but stationary after first differencing. Two I(1) economic variables are cointegrated if some linear combination of the two series produces an I(0) process, e.g., Hendry (1995). Extending early work on distributed lags, long memory processes have also been employed, where the time series is only subject to fractional differencing. Significantly, recent contributions on Markov switching processes and exponential smooth transition autoregressive processes have demonstrated the "possibility that nonlinear ergodic processes can be misinterpreted as unit root nonstationary processes" (Kapetanios and Shin 2011, p.620). Bonomo et al. (2011) illustrates the recent application of Markov switching processes in estimating the asset pricing models of MPT.
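A minimal numerical sketch of the I(0)/I(1) distinction (the simulated random walk and the split-sample variance comparison are purely illustrative): the level of an I(1) series has no stable variance, while its first difference does.

```python
import numpy as np

rng = np.random.default_rng(1)

eps = rng.standard_normal(5000)
x = np.cumsum(eps)   # random walk: I(1), non-stationary in the level
dx = np.diff(x)      # first difference: I(0), recovers the stationary shocks

# Split-sample variances diverge for the level but agree for the difference,
# a crude symptom of the non-stationarity that formal unit root tests detect.
half = len(x) // 2
print("level     :", round(x[:half].var(), 1), "vs", round(x[half:].var(), 1))
print("difference:", round(dx[:half].var(), 2), "vs", round(dx[half:].var(), 2))
```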
The conventional view of ergodicity in economics, in general, and financial economics, in particular, is reflected by Hendry (1995, p.100): "Whether economic reality is an ergodic process after suitable transformation is a deep issue" which is difficult to analyze rigorously. As a consequence, in the limited number of instances where ergodicity is examined in economics, a variety of different interpretations appear. In contrast, the ergodic hypothesis in classical statistical mechanics is associated with the physically transparent kinetic gas model rather than the often technical and targeted concepts of ergodicity encountered in modern economics, in general, and MPT, in particular. For Boltzmann, the classical ergodic hypothesis permitted the unobserved complex microscopic interactions of individual gas molecules to obey the second law of thermodynamics, a concept that has limited application in economics.[7] Despite differences in physical interpretation, a similar problem arises in economics: modeling 'macroscopic' financial variables, such as common stock prices, foreign exchange rates, 'asset' prices or interest rates, when it is not possible to derive a theory for describing and predicting empirical observations from known first principles about the (microscopic) rational behavior of individuals and firms. By construction, this involves a phenomenological approach to modeling.[8] Even though the formal solutions proposed were inadequate by standards of modern mathematics, the thermodynamic model introduced by Boltzmann to explain the dynamic properties of the Maxwell distribution is a pedagogically useful starting point to develop the implications of ergodicity for MPT.
To be sure, von Neumann (1932) and Birkhoff (1931) correctly specify ergodicity using Lebesgue integration, an essential analytical tool unavailable to Boltzmann, but the analysis is too complex to be of much value to all but the most mathematically specialized economists. The physical intuition of the kinetic gas model is lost in the generality of the results. Using Boltzmann as a starting point, the large number of mechanical and complex molecular collisions could correspond to the large number of microscopic, atomistic liquidity providers and traders interacting to determine the macroscopic financial market price. In this context, it is variables such as the asset price or the interest rate or the exchange rate, or some combination, that is being measured over time, and ergodicity would be associated with the properties of the transition density generating the macroscopic variables. Ergodicity can fail for a number of reasons and there is value in determining the source of the failure. In this vein, there are two fundamental difficulties associated with the classical ergodicity hypothesis in Boltzmann's statistical mechanics, reversibility and recurrence, that are largely unrecognized in financial economics.[9] Halmos (1949, p.1017) is a helpful starting point to sort out the differing notions of ergodicity that are of relevance to the issues at hand: "The ergodic theorem is a statement about a space, a function and a transformation". In mathematical terms, ergodicity or 'metric transitivity' is a property of 'indecomposable', measure preserving transformations. Because the transformation acts on points in the space, there is a fundamental connection to the method of measuring relationships such as distance or volume in the space. In von Neumann (1932) and Birkhoff (1931), this is accomplished using the notion of Lebesgue measure: the admissible functions are either integrable (Birkhoff) or square integrable (von Neumann). In contrast to, say, statistical mechanics, where spaces and functions account for the complex physical interaction of large numbers of particles, economic theories such as MPT usually specify the space in a mathematically convenient fashion. For example, in the case where there is a single random variable, the space is "superfluous" (Mackey 1974, p.182) as the random variable is completely described by the distribution. Multiple random variables can be handled by assuming the random variables are discrete with finite state spaces. In effect, conditions for an 'invariant measure' are assumed in MPT in order to focus attention on "finding and studying the invariant measures" (Arnold 1998, p.22) where, in the terminology of financial econometrics, the invariant measure usually corresponds to the stationary distribution or likelihood function.
The mean ergodic theorem of von Neumann (1932) provides an essential connection to the ergodicity hypothesis in financial econometrics. It is well known that, in the Hilbert and Banach spaces common to econometric work, the mean ergodic theorem corresponds to the strong law of large numbers. In statistical applications where strictly stationary distributions are assumed, the relevant ergodic transformation, $L^*$, is the unit shift operator:

\[
L^* \Psi[x(t)] = \Psi[x(t+1)], \qquad (L^*)^k \Psi[x(t)] = \Psi[x(t+k)], \qquad (L^*)^{-k} \Psi[x(t)] = \Psi[x(t-k)]
\]

with k being an integer and $\Psi[x]$ the strictly stationary distribution for x that, in the strictly stationary case, is replicated at each t.[10] Significantly, this reversible transformation is independent of initial time and state. Because this transformation can be achieved by imposing strict stationarity on $\Psi[x]$, $L^*$ will only work for certain ergodic processes. In effect, the ergodic requirement that the transformation be measure preserving is weaker than the strict stationarity of the stochastic process sufficient to achieve $L^*$. The practical implications of the reversible ergodic transformation $L^*$ are described by Davidson (1991, p.331): "In an economic world governed entirely by [time reversible] ergodic processes ... economic relationships among variables are timeless, or ahistoric in the sense that the future is merely a statistical reflection of the past".[11] Employing conventional econometrics in empirical studies, MPT requires that the real world distribution for $x(t)$, e.g., the asset return, be sufficiently similar to those for both $x(t+k)$ and $x(t-k)$, i.e., the ergodic transformation $L^*$ is reversible. The reversibility assumption is systemic in MPT, appearing in the use of long estimation periods to determine important variables such as the 'equity risk premium'. There is a persistent belief that increasing the length or sampling frequency of a financial time series will improve the precision of a statistical estimate, e.g., Dimson et al. (2002).
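The empirical content of the reversibility assumption can be illustrated with a hypothetical simulation. The sketch below compares a stationary Gaussian AR(1), which is time reversible, against a deliberately constructed 'slow up, sharp down' path; the skewness of increments is one simple statistic that is zero in expectation for any reversible process but not for the asymmetric path. Both processes and all parameter values are illustrative assumptions, not a method from the text.

```python
import numpy as np

rng = np.random.default_rng(2)

def increment_skew(x, k=1):
    """Skewness of k-step increments: zero in expectation for a time
    reversible process; nonzero values signal irreversibility."""
    d = x[k:] - x[:-k]
    return ((d - d.mean()) ** 3).mean() / d.std() ** 3

# Time reversible benchmark: stationary Gaussian AR(1).
T, phi = 100_000, 0.7
ar1 = np.zeros(T)
for t in range(1, T):
    ar1[t] = phi * ar1[t - 1] + rng.standard_normal()

# Stylized irreversible path: frequent small upticks, occasional sharp drops
# (a caricature of upswings being milder than downdrafts), zero drift overall.
steps = np.where(rng.random(T) < 0.95, 0.1, -1.9) + 0.05 * rng.standard_normal(T)
path = np.cumsum(steps)

print("AR(1) increment skew   :", round(increment_skew(ar1), 3))   # approximately 0
print("stylized increment skew:", round(increment_skew(path), 3))  # strongly negative
```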
Similarly, focus on the tradeoff between 'risk and return' requires the use of unimodal stationary densities for transformed financial variables such as the rate of return. The impact of initial and boundary conditions on financial decision making is generally ignored. The inconsistency of reversible processes with key empirical facts, such as the asymmetric tendency for downdrafts in prices to be more severe than upswings, is ignored in favor of adhering to 'reversible' theoretical models that can be derived from first principles associated with constrained optimization techniques, e.g., Constantinides (2002).

A Phenomenological Interpretation of Classical Ergodicity
In physics, phenomenology lies at the intersection of theory and experiment. Theoretical relationships between empirical observations are modeled without deriving the theory directly from first principles, e.g., Newton's laws of motion. Predictions based on these theoretical relationships are obtained and compared to further experimental data designed to test the predictions. In this fashion, new theories that can be derived from first principles are motivated. Confronted with non-experimental data for important financial variables, such as common stock prices, interest rates and the like, financial economics has developed some theoretical models that aim to fit the 'stylized facts' of those variables. In contrast, MPT is initially derived directly from the 'first principles' of constrained expected utility maximizing behavior by individuals and firms. Given the difficulties in economics of testing model predictions with 'new' experimental data, physics and econophysics have the potential to provide a rich variety of mathematical techniques that can be adapted to determining mathematical relationships among financial variables that explain the 'stylized facts' of observed non-experimental data.[12] The evolution of financial economics from the deterministic models of neoclassical economics to more modern stochastic models has been incremental and disjointed. The preference for linear models of static equilibrium relationships has restricted the application of theoretical frameworks that capture more complex non-linear dynamics, e.g., chaos theory; truncated Lévy processes. Yet, important financial variables have relatively innocuous sample paths compared to some types of variables encountered in physics. There is an impressive range of mathematical and statistical models that, seemingly, could be applied to almost any physical or financial situation. If the process can be verbalized, then a model can be specified. This begs the questions: are there transformations, ergodic or otherwise, that capture the basic 'stylized facts' of observed financial data? Is the random instability in the observed sample paths identified in, say, stock price time series consistent with the ex ante stochastic bifurcation of an ergodic process, e.g., Chiarella et al. (2008)? In the bifurcation case, the associated ex ante stationary densities are bimodal and irreversible, a situation where the mean calculated from past values of a single, non-experimental ex post realization of the process is not necessarily informative about the mean for future values.
Boltzmann was concerned with demonstrating that the Maxwell distribution emerged in the limit as $t \to \infty$ for systems with large numbers of particles. The limiting process for $t \to \infty$ requires that the system run long enough that the initial conditions do not impact the stationary distribution. At the time, two fundamental criticisms were aimed at this general approach: reversibility and recurrence.
In the context of financial time series, reversibility relates to the use of past values of the process to forecast future values. Recurrence relates to the properties of the long run average which involves the ability and length of time for an ergodic process to return to its stationary state. For Boltzmann, both these criticisms have roots in the difficulty of reconciling the second law of thermodynamics with the ergodicity hypothesis. Using Sturm-Liouville methods, it can be shown that classical ergodicity requires the transition density of the process to be decomposable into the sum of a stationary density and a mean zero transient term that captures the impact of the initial condition of the system on the individual sample paths; irreversibility relates to properties of the stationary density and non-recurrence to the behavior of the transient term.
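In symbols, using the notation developed in the next sections ($U$ the transition density, $\Psi$ the stationary density, $T$ the transient term), the decomposition takes the form:

\[
U[x, t \mid x_0] = \Psi[x] + T[x, t \mid x_0],
\qquad
\int_a^b T[x, t \mid x_0]\, dx = 0,
\qquad
\lim_{t \to \infty} T[x, t \mid x_0] = 0
\]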
Because the particle movements in a kinetic gas model are contained within an enclosed system, e.g., a vertical glass tube, classical Sturm-Liouville (S-L) methods can be applied to obtain solutions for the transition densities. These classical results for the distributional implications of imposing regular reflecting boundaries on diffusion processes are representative of the modern phenomenological approach to random systems theory which "studies qualitative changes of the densities of invariant measures of the Markov semigroup generated by random dynamical systems induced by stochastic differential equations" (Crauel et al. 1999, p.27).[13] Because the initial condition of the system is explicitly recognized, ergodicity in these models takes a different form than that associated with the unit shift transformation of unimodal stationary densities typically adopted in financial economics, in general, and MPT, in particular. The ergodic transition densities are derived as solutions to the forward differential equation associated with one-dimensional diffusions.
The transition densities contain a transient term that is dependent on the initial condition of the system and boundaries imposed on the state space. Path dependence, i.e., irreversibility, can be introduced by employing multi-modal stationary densities.
The distributional implications of boundary restrictions, derived by modeling the random variable as a diffusion process subject to reflecting barriers, have been studied for many years, e.g., Feller (1954). The diffusion process framework is useful because it imposes a functional structure that is sufficient for known partial differential equation (PDE) solution procedures to be used to derive the relevant transition probability densities. With $A[x]$ the drift coefficient and $B[x] > 0$ the diffusion coefficient, the transition probability density $U[x, t \mid x_0]$ of the diffusion solves the forward equation:

\[
\frac{\partial^2}{\partial x^2}\{B[x]\, U\} - \frac{\partial}{\partial x}\{A[x]\, U\} = \frac{\partial U}{\partial t} \qquad (1)
\]

At a reflecting barrier the probability flux vanishes:

\[
\left. \frac{\partial}{\partial x}\{B[x]\, U\} - A[x]\, U \right|_{x = \text{barrier}} = 0 \qquad (2)
\]

If the diffusion process is subject to upper and lower reflecting boundaries that are regular and fixed ($-\infty < a < b < \infty$), the classical "Sturm-Liouville problem" involves solving (1) subject to the separated boundary conditions:[16]

\[
\left. \frac{\partial}{\partial x}\{B[x]\, U[x,t]\} - A[x]\, U[x,t] \right|_{x=a} = 0 \qquad (3)
\]

\[
\left. \frac{\partial}{\partial x}\{B[x]\, U[x,t]\} - A[x]\, U[x,t] \right|_{x=b} = 0 \qquad (4)
\]

and the initial condition:

\[
U[x, 0] = f[x_0] \qquad (5)
\]

where $f[x_0]$ concentrates the initial probability mass at $x_0$.[18] In contrast, applications in financial econometrics employ the strictly stationary $U^*$, where the location of $x_0$ is irrelevant, while $U$ incorporates $x_0$ as an initial condition associated with the solution of a partial differential equation.[19]

Density Decomposition Results[20]
In general, solving the forward equation (1) for $U$ subject to (3), (4) and some admissible form of (5) is difficult, e.g., Feller (1954), Risken (1989). In such circumstances, it is expedient to restrict the problem specification to permit closed form solutions for the transition density to be obtained. Wong (1964) provides an illustration of this approach.
The transition probability density $U$ for the ergodic process can then be reconstructed by working back from a specific closed form for the stationary distribution using known results for the solution of specific forms of the forward equation. In this procedure, the $d_0$, $d_1$, $d_2$, $e_0$ and $e_1$ in the Pearson ODE are used to specify the relevant parameters in (1). The $U$ for important stationary distributions that fall within the Pearson system, such as the normal, beta, central t, and exponential, can be derived by this method.
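For reference, the Pearson system referred to here consists of the stationary densities $\Psi[x]$ satisfying a first order ODE; using the coefficient names in the text (the sign convention shown is one common normalization):

\[
\frac{d \Psi[x]}{dx} = \frac{e_0 + e_1 x}{d_0 + d_1 x + d_2 x^2}\, \Psi[x]
\]

Different admissible values of $(d_0, d_1, d_2, e_0, e_1)$ select the normal, beta, central t, exponential and the other members of the family.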
The solution procedure employed by Wong (1964) depends crucially on restricting the PDE problem sufficiently to apply classical S-L techniques. Using S-L methods, various studies have generalized the set of solutions for $U$ to cases where the stationary distribution is not a member of the Pearson system or $U$ is otherwise unknown, e.g., Linetsky (2005). In order to employ the separation of variables technique used in solving S-L problems, (1) has to be transformed into the canonical form of the forward equation. To do this, the following function associated with the invariant measure is introduced:

\[
r[x] = \frac{1}{B[x]} \exp\!\left[ \int_a^x \frac{A[s]}{B[s]}\, ds \right]
\]

Using this function, the forward equation can be rewritten in the form:

\[
\frac{\partial}{\partial x}\!\left[ B[x]\, r[x]\, \frac{\partial}{\partial x}\!\left( \frac{U}{r[x]} \right) \right] = \frac{\partial U}{\partial t} \qquad (6)
\]

Equation (6) is the canonical form of equation (1). The S-L problem now involves solving (6) subject to appropriate initial and boundary conditions.
Because the methods for solving the S-L problem are ODE-based, some method of eliminating the time derivative in (1) is required. Exploiting the assumption of time homogeneity, the eigenfunction expansion approach applies separation of variables, permitting (6) to be specified as:

\[
U[x, t] = e^{-\lambda t}\, \varphi[x] \qquad (7)
\]

where $\varphi[x]$ is only required to satisfy the easier-to-solve ODE:

\[
\frac{d}{dx}\!\left[ B[x]\, r[x]\, \frac{d}{dx}\!\left( \frac{\varphi[x]}{r[x]} \right) \right] + \lambda\, \varphi[x] = 0 \qquad (8)
\]

Transforming the boundary conditions involves substitution of (7) into (3) and (4) and solving to get:

\[
\left. \frac{d}{dx}\{B[x]\, \varphi[x]\} - A[x]\, \varphi[x] \right|_{x=a} = 0, \qquad
\left. \frac{d}{dx}\{B[x]\, \varphi[x]\} - A[x]\, \varphi[x] \right|_{x=b} = 0
\]

Significantly, the boundary conditions (3) and (4) ensure the problem is self-adjoint (Berg and McGregor 1966, p.91).
The classical S-L problem of solving (6) subject to the initial and boundary conditions admits a solution only for certain critical values of $\lambda$, the eigenvalues. Further, since equation (1) is linear in $U$, the general solution for (7) is given by a linear combination of solutions in the form of eigenfunction expansions. Details of these results can be found in Hille (1969, ch. 8), Birkhoff and Rota (1989, ch. 10) and Karlin and Taylor (1981). When the S-L problem is self-adjoint and regular, the solutions for the transition probability density can be summarized in the following (see Appendix for proof):

Proposition: Ergodic Transition Density Decomposition
The regular, self-adjoint Sturm-Liouville problem has an infinite sequence of real eigenvalues, $0 = \lambda_0 < \lambda_1 < \lambda_2 < \cdots$ with:

\[
\lim_{n \to \infty} \lambda_n = \infty
\]

To each eigenvalue there corresponds a unique eigenfunction $\varphi_n[x]$. Normalization of the eigenfunctions produces:

\[
\psi_n[x] = \varphi_n[x] \left[ \int_a^b \frac{\varphi_n[x]^2}{r[x]}\, dx \right]^{-1/2}
\]

The $\psi_n[x]$ eigenfunctions form a complete orthonormal system in $L^2[a,b]$. The unique solution in $L^2[a,b]$ to (1), subject to the boundary conditions (3)-(4) and initial condition (5), is, in general form:

\[
U[x, t \mid x_0] = \sum_{n=0}^{\infty} c_n\, \psi_n[x]\, e^{-\lambda_n t}
\]

where the coefficients $c_n$ are fixed by the initial condition (5). Given this, the transition probability density function for x at time t can be reexpressed as the sum of a stationary limiting equilibrium distribution associated with the $\lambda_0 = 0$ eigenvalue, which is linearly independent of the boundaries, and a power series of transient terms, associated with the remaining eigenvalues, that are boundary and initial condition dependent:

\[
U[x, t \mid x_0] = \Psi[x] + T[x, t \mid x_0] \qquad (9)
\]

where:

\[
\Psi[x] = \frac{r[x]}{\int_a^b r[x]\, dx} \qquad (10)
\]

and:

\[
T[x, t \mid x_0] = \sum_{n=1}^{\infty} c_n\, \psi_n[x]\, e^{-\lambda_n t} \qquad (11)
\]

Using the specifications of $\lambda_n$, $c_n$, and $\psi_n$, the properties of $T[x, t \mid x_0]$ are defined as:

\[
\int_a^b T[x, t \mid x_0]\, dx = 0 \qquad \text{and} \qquad \lim_{t \to \infty} T[x, t \mid x_0] = 0
\]

This Proposition provides the general solution to the regular, self-adjoint S-L problem of deriving $U$ when the process is subject to regular reflecting barriers. Taking the limit as $t \to \infty$ in (9), it follows from (10) and (11) that $\lim_{t \to \infty} U[x, t \mid x_0] = \Psi[x]$: the transition density converges to the stationary density, whatever the initial condition. The theoretical advantage obtained by imposing regular reflecting barriers on the diffusion state space for the forward equation is that an ergodic decomposition of the transition density is assured.
The relevance of bounding the state space and imposing regular reflecting boundaries can be illustrated by considering the well known solution (e.g., Cox and Miller 1965, p.209) for $U$ involving a constant coefficient standard normal variate $Y(t) = (x - x_0 - \mu t)/\sigma$ over the unbounded state space $(-\infty, \infty)$. In this case the forward equation (1) reduces to:

\[
\frac{1}{2} \frac{\partial^2 U}{\partial Y^2} = \frac{\partial U}{\partial t}
\]

By evaluating these derivatives, it can be verified that the principal solution for $U$ is:

\[
U[x, t \mid x_0] = \frac{1}{\sigma \sqrt{2\pi t}} \exp\!\left[ -\frac{(x - x_0 - \mu t)^2}{2 \sigma^2 t} \right]
\]

and as $t \to \infty$ then $U \to 0$ at every x, so the stochastic process is non-ergodic because it does not possess a non-trivial stationary distribution. The mean ergodic theorem fails: if the process runs long enough, then $U$ will evolve to where there is no discernible probability associated with starting from $x_0$ and reaching the neighborhood of a given point x. The absence of a stationary distribution raises a number of questions, e.g., whether the process has unit roots. Imposing regular reflecting boundaries is a certain method of obtaining a stationary distribution and a discrete spectrum (Hansen and Scheinkman 1998, p.13). Alternative methods, such as specifying the process to admit natural boundaries where the parameters of the diffusion are zero within the state space, can give rise to continuous spectrum and raise significant analytical complexities. At least since Feller (1954), the search for useful solutions, including those for singular diffusion problems, has produced a number of specific cases of interest. However, without the analytical certainty of the classical S-L framework, analysis proceeds on a case by case basis.
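The contrast between the unbounded and the reflected cases can be checked with a small simulation. The Euler discretization, the parameter values, and the reflection-by-folding device below are illustrative assumptions; the point is that the time average along a single reflected path settles at the stationary mean, while the free path's time average settles at nothing in particular.

```python
import numpy as np

rng = np.random.default_rng(3)

T, dt, a, b = 200_000, 0.01, -1.0, 1.0
dW = np.sqrt(dt) * rng.standard_normal(T)

free = np.cumsum(dW)             # Brownian motion on an unbounded state space

x, reflected = 0.0, np.empty(T)  # same noise, reflected into [a, b]
for t in range(T):
    x += dW[t]
    while x < a or x > b:        # fold any excursion back inside the barriers
        x = 2 * a - x if x < a else 2 * b - x
    reflected[t] = x

for n in (T // 100, T // 10, T):
    print(f"n = {n:6d}  free mean = {free[:n].mean():6.2f}"
          f"  reflected mean = {reflected[:n].mean():6.2f}")
# The reflected time average approaches 0, the mean of the uniform stationary
# density on [a, b]; the free time average does not converge to any constant.
```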
One possible method of obtaining a stationary distribution without imposing both upper and lower boundaries is to impose only a lower (upper) reflecting barrier and construct the stochastic process such that positive (negative) infinity is non-attracting, e.g., Linetsky (2005); Aït-Sahalia (1999). This can be achieved by using a mean-reverting drift term. In contrast, Cox and Miller (1965, p.223-5) use the Brownian motion, constant coefficient forward equation with $\mu < 0$, subject to the lower reflecting barrier at $x = 0$ given in (2), to solve for both $U$ and the stationary density. The principal solution is obtained using the 'method of images':

\[
U[x, t \mid x_0] = \frac{1}{\sigma\sqrt{2\pi t}} \left\{ \exp\!\left[ -\frac{(x - x_0 - \mu t)^2}{2\sigma^2 t} \right] + \exp\!\left[ -\frac{2\mu x_0}{\sigma^2} - \frac{(x + x_0 - \mu t)^2}{2\sigma^2 t} \right] \right\} - \frac{2\mu}{\sigma^2} \exp\!\left[ \frac{2\mu x}{\sigma^2} \right] \Phi\!\left[ -\frac{x + x_0 + \mu t}{\sigma\sqrt{t}} \right]
\]

where $\Phi$ is the standard normal distribution function; letting $t \to \infty$ with $\mu < 0$ gives the stationary exponential density $\Psi[x] = (2|\mu|/\sigma^2)\exp[-2|\mu| x/\sigma^2]$. Wong (1964) uses a different approach, initially selecting a stationary distribution and then solving for $U$ using the restrictions of the Pearson system to specify the forward equation. In this approach, the functional form of the desired stationary distribution determines the appropriate boundary conditions. While application of this approach has been limited to the restricted class of distributions associated with the Pearson system, it is expedient when a known stationary distribution, such as the standard normal distribution, is of interest. More precisely, let the stationary distribution be the standard normal $\Psi[x] = (1/\sqrt{2\pi})\, e^{-x^2/2}$, obtained within the Pearson system by taking the drift $A[x] = -x$ and the diffusion coefficient $B[x] = 1$. In this case, the boundaries of the state space $(-\infty, \infty)$ are non-attracting and not regular. Solving the Pearson ODE with these coefficients recovers the stated forward equation and, given this, as $t \to 0$, $U$ collapses onto the initial condition $x_0$ while, as $t \to +\infty$, $U$ achieves the stationary standard normal distribution.

The Quartic Exponential Distribution
The roots of bifurcation theory can be found in the early solutions to certain deterministic ordinary differential equations. Consider the deterministic dynamics described by the pitchfork bifurcation ODE:

\[
\frac{dx}{dt} = \gamma_0 + \gamma_1 x - x^3
\]

where $\gamma_0$ and $\gamma_1$ are the 'normal' and 'splitting' control variables, respectively (e.g., Cobb 1978, 1981). While $\gamma_0$ has significant information in a stochastic context, this is not usually the case in the deterministic problem, so $\gamma_0 = 0$ is assumed. Given this, for $\gamma_1 \le 0$, there is one real equilibrium ($dx/dt = 0$) solution to this ODE at $x = 0$, where "all initial conditions converge to the same final point exponentially fast with time" (Crauel and Flandoli 1998, p.260). For $\gamma_1 > 0$, the solution bifurcates into three equilibrium solutions, $x = \{0, \pm\sqrt{\gamma_1}\}$, one unstable and two stable. In this case, the state space is split into two physically distinct regions (at $x = 0$) with the degree of splitting controlled by the size of $\gamma_1$. Even for initial conditions that are 'close', the equilibrium achieved will depend on the sign of the initial condition. Stochastic bifurcation theory extends this model to incorporate Markovian randomness. In this theory, "invariant measures are the random analogues of deterministic fixed points" (Arnold 1998, p.469). Significantly, ergodicity now requires that the component densities that bifurcate out of the stationary density at the bifurcation point be invariant measures, e.g., Crauel et al. (1999, sec. 3). As such, the ergodic bifurcating process is irreversible in the sense that past sample paths (prior to the bifurcation) cannot reliably be used to generate statistics for the future values of the variable (after the bifurcation).
It is well known that the introduction of randomness to the pitchfork ODE changes the properties of the equilibrium solution, e.g., Arnold (1998, sec. 9.2). It is no longer necessary that the state space for the principal solution be determined by the location of the initial condition relative to the bifurcation point. The possibility for randomness to cause some paths to cross over the bifurcation point depends on the volatility of the process, $\sigma$, which measures the non-linear signal to white noise ratio. Of the different approaches to introducing randomness (e.g., multiplicative noise), the simplest approach to converting from a deterministic to a stochastic context is to add a Wiener process ($dW(t)$) to the ODE. Augmenting the diffusion equation to allow for $\sigma$ to control the relative impact of non-linear drift versus random noise produces the "pitchfork bifurcation with additive noise" (Arnold 1998, p.475), which in symmetric form is:

\[
dX(t) = \left( \gamma_1 X(t) - X(t)^3 \right) dt + \sigma\, dW(t)
\]

Applications in financial economics, e.g., Aït-Sahalia (1999), refer to this diffusion process as the double well process. While consistent with the common use of diffusion equations in financial economics, the dynamics of the pitchfork process captured by $T[x, t \mid x_0]$ have been "forgotten" (Arnold 1998, p.473).
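A minimal Euler-Maruyama sketch of the double well process (step size, horizon, and parameter values are illustrative assumptions) shows the bifurcation at work: for small $\sigma$ the sign of the initial condition selects the mode around which an ex post sample path spends its time, while for large $\sigma$ paths cross between wells and the two modes blur together.

```python
import numpy as np

rng = np.random.default_rng(4)

# Double well process: dX = (gamma1 * X - X^3) dt + sigma dW, modes near +/- sqrt(gamma1).
gamma1, dt, T = 1.0, 0.01, 100_000

def simulate(x0, sigma):
    x = np.empty(T)
    x[0] = x0
    for t in range(1, T):
        drift = gamma1 * x[t - 1] - x[t - 1] ** 3
        x[t] = x[t - 1] + drift * dt + sigma * np.sqrt(dt) * rng.standard_normal()
    return x

for sigma in (0.2, 1.5):
    means = [round(simulate(x0, sigma).mean(), 2) for x0 in (0.1, -0.1)]
    print("sigma =", sigma, " path means from x0 = +0.1 and -0.1:", means)
# sigma = 0.2: means near +1 and -1, set by the sign of the initial condition;
# sigma = 1.5: paths cross the barrier repeatedly and both means sit near 0.
```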
Models in MPT are married to the transition probability densities associated with unimodal stationary distributions, especially the class of Gaussian-related distributions. Yet, it is well known that more flexibility in the shape of the stationary distribution can be achieved using a higher order exponential density, e.g., Fisher (1921), Cobb et al. (1983), Crauel and Flandoli (1999). Increasing the degree of the polynomial in the exponential comes at the expense of introducing additional parameters, resulting in a substantial increase in analytical complexity that typically defies a closed form solution for the transition densities. However, at least since Elliott (1955), it has been recognized that the solution of the associated regular S-L problem will still have a discrete spectrum, even if the specific form of the eigenfunctions and eigenvalues in $T[x, t \mid x_0]$ are not precisely determined (Horsthemke and Lefever 1984, sec. 6.7). Inferences about transient stochastic behavior can be obtained by examining the solution of the deterministic non-linear dynamics. In this process, attention initially focuses on the properties of the higher order exponential distributions.
To this end, assume that the stationary distribution is a fourth degree or "general quartic" exponential:

\[
\Psi[x] = K \exp\!\left[ -\left( \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \beta_4 x^4 \right) \right]
\]

where $K$ is a constant determined such that the density integrates to one and $\beta_4 > 0$.[21] Following Fisher (1921), the class of distributions associated with the general quartic exponential admits both unimodal and bimodal densities and nests the standard normal as a limiting case where $\beta_4 = \beta_3 = \beta_1 = 0$ and $\beta_2 = \tfrac{1}{2}$ with $K = 1/\sqrt{2\pi}$. The stationary distribution of the bifurcating double well process is a special case of the symmetric quartic exponential distribution:

\[
\Psi[x] = K_S \exp\!\left[ -\left( \beta_2 (x - \mu)^2 + \beta_4 (x - \mu)^4 \right) \right]
\]

where $\mu$ is the population mean and the symmetry restriction requires $\beta_1 = \beta_3 = 0$. Such multi-modal stationary densities have received scant attention in financial economics, in general, and in MPT, in particular. To see why the condition on $\beta_1$ is needed, consider the change of origin $X = Y - \beta_3/(4\beta_4)$ to remove the cubic term from the general quartic exponential (Matz 1978, p.480):

\[
\Psi[y] = K_Q \exp\!\left[ -\left( \beta_1 (y - \mu_y) + \beta_2 (y - \mu_y)^2 + \beta_4 (y - \mu_y)^4 \right) \right] \qquad \text{where } \beta_4 > 0
\]

The substitution of $y$ for $x$ indicates the change of origin, which produces the corresponding relations between the coefficients of the general and centered cases; because the linear term $\beta_1$ survives the centering, it is the source of asymmetry in the density. Setting the derivative of the exponent to zero locates the modes at $\pm\sqrt{|\beta_2|/(2\beta_4)}$ (when $\beta_1 = 0$ and $\beta_2 < 0$), which reduces to $\pm 1$ for the double well process, as in Aït-Sahalia (1999, Figure 6B, p.1385).

INSERT FIGURE 1 HERE
As illustrated in Figure 1, the selection of $a_i$ in the stationary density $\Psi_i[x] = K_Q \exp\{-(0.25 x^4 - 0.5 x^2 - a_i x)\}$ defines a family of general quartic exponential densities, where $a_i$ is the selected value of the linear-term coefficient for that specific density.[22] The coefficient restrictions on the parameters $\beta_2$ and $\beta_4$ dictate that these values cannot be determined arbitrarily. For example, given that $\beta_4$ is set at $0.25$, then for $a_i = 0$ it follows that $\beta_2 = -0.5$. 'Slicing across' the surface in Figure 1 at $a_i = 0$ reveals a stationary distribution that is equal to the double well density. Continuing to slice across as $a_i$ increases in size, the bimodal density becomes progressively more asymmetrically concentrated in positive $x$ values.
Though the location of the modes does not change, the amount of density between the modes and around the negative mode decreases. Similarly, as $a_i$ decreases in size, the bimodal density becomes more asymmetrically concentrated in negative $x$ values. While the stationary density is bimodal over $a_i \in (-1, 1)$, for $|a_i|$ large enough the density becomes so asymmetric that only a unimodal density appears. For the general quartic, asymmetry arises as the amount of the density surrounding each mode (the sub-density) changes with $a_i$. In this, the individual stationary sub-densities have a symmetric shape. To introduce asymmetry in the sub-densities, the reflecting boundaries at a and b that bound the state space for the regular S-L problem can be used to introduce positive asymmetry in the lower sub-density and negative asymmetry in the upper sub-density.
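The behavior just described can be checked numerically. The sketch below normalizes the Figure 1 family on a grid (the grid, the selected $a_i$ values, and the mass statistic are illustrative assumptions) and tracks how probability mass shifts between the two sub-densities while the mode locations barely move:

```python
import numpy as np

# Quartic exponential family from Figure 1: Psi_i[x] ~ exp{-(0.25 x^4 - 0.5 x^2 - a_i x)}.
x = np.linspace(-4.0, 4.0, 4001)
dx = x[1] - x[0]

for a in (-0.3, 0.0, 0.3):
    w = np.exp(-(0.25 * x ** 4 - 0.5 * x ** 2 - a * x))
    psi = w / (w.sum() * dx)            # K_Q fixed so the density integrates to one
    mass_neg = psi[x < 0].sum() * dx    # weight of the lower (negative) sub-density
    interior = psi[1:-1]
    modes = x[1:-1][(interior > psi[:-2]) & (interior > psi[2:])]
    print(f"a_i = {a:+.1f}  mass(x < 0) = {mass_neg:.2f}  modes near {np.round(modes, 2)}")
# a_i = 0 gives the symmetric double well density (half the mass on each side);
# a_i > 0 shifts mass toward the positive mode, a_i < 0 toward the negative mode.
```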
Following Chiarella et al. (2008), the stochastic bifurcation process has a number of features which are consistent with the ex ante behavior of a securities market driven by a combination of chartists and fundamentalists. Placed in the context of the classical S-L framework, because the stationary distributions are bimodal and depend on forward parameters, such as $\beta_2$, $\beta_4$, and $a_i$ in Figure 1, it is possible for the properties of ex ante bifurcating ergodic processes to generate theoretical ex post sample paths that provide a better approximation to the sample paths of observed financial data.

Conclusion
The classical ergodicity hypothesis provides a point of demarcation in the pre-histories of MPT and econophysics. To deal with the problem of making statistical inferences from 'non-experimental' data, theories in MPT typically employ stationary densities that are: time reversible; unimodal; and allow no short or long term impact from initial and boundary conditions. The possibility of bimodal processes or of ex ante impact from initial and boundary conditions is not recognized or, it seems, intended. Significantly, as illustrated in the need to select an $a_i$ in Figure 1 in order to specify the 'real world' ex ante stationary density, a semantic connection can be established between the subjective uncertainty about encountering a future bifurcation point and, say, the possible collapse of an asset price bubble impacting future market valuations. Examining the quartic exponential stationary distribution associated with a bifurcating ergodic process, it is apparent that this distribution nests the Gaussian distribution as a special case. In this sense, results from classical statistical mechanics can be employed to produce a stochastic generalization of the unimodal, time reversible processes employed in modern portfolio theory.
Appendix

(a) The regular, self-adjoint S-L problem has an infinite sequence of real eigenvalues with strict inequality, $\lambda_0 < \lambda_1 < \lambda_2 < \cdots$, and the eigenfunction $\varphi_n[x]$ corresponding to the nth eigenvalue has exactly n zeroes in (a,b). This is a classical result for regular S-L problems, e.g., Birkhoff and Rota (1989, ch. 10).

(b) For each n, $\lambda_n \int_a^b \varphi_n[x]\, dx = 0$.

Proof:
Integrating the ODE (8) over (a,b) gives:

\[
\int_a^b \frac{d}{dx}\!\left[ B[x]\, r[x]\, \frac{d}{dx}\!\left( \frac{\varphi_n[x]}{r[x]} \right) \right] dx + \lambda_n \int_a^b \varphi_n[x]\, dx = 0
\]

The first integral vanishes since each $\varphi_n[x]$ satisfies the boundary conditions, leaving $\lambda_n \int_a^b \varphi_n[x]\, dx = 0$.

(c) $\lambda_k = 0$ for at least one k.

Proof:
From (8) and the eigenfunction expansion of the solution, $\int_a^b U[x, t \mid x_0]\, dx = \sum_n c_n e^{-\lambda_n t} \int_a^b \varphi_n[x]\, dx$, which must equal one for all t. But from part (b) this will $= 0$ (which is a contradiction) unless $\lambda_k = 0$ for some k.

(d) $\lambda_0 = 0$.

Proof:
From part (a), $\varphi_0[x]$ has no zeroes in (a,b). Therefore, either $\varphi_0[x] > 0$ throughout (a,b) or $\varphi_0[x] < 0$ throughout (a,b), and hence $\int_a^b \varphi_0[x]\, dx \neq 0$.
It follows from part (b) that $\lambda_0 = 0$.
(e) $\lambda_n > 0$ for $n \neq 0$. This follows from part (d) and the strict inequality conditions provided in part (a).

Notes

3. "To the view of perfect intelligence nothing is uncertain." What Boltzmann, Planck and others had observed in statistical physics was that, even though the behavior of one or two molecules can be completely determined, it is not possible to generalize these mechanics to describe the macroscopic motion of molecules in large, complex systems, e.g., Brush (1983, esp. ch. II).
4. As such, Boltzmann was part of the larger "Second Scientific Revolution, associated with the theories of Darwin, Maxwell, Planck, Einstein, Heisenberg and Schrödinger, (which) substituted a world of process and chance whose ultimate philosophical meaning still remains obscure" (Brush 1983, p.79). This revolution superseded the "First Scientific Revolution, dominated by the physical astronomy of Copernicus, Kepler, Galileo, and Newton, ... in which all changes are cyclic and all motions are in principle determined by causal laws." The irreversibility and indeterminism of the Second Scientific Revolution replaces the reversibility and determinism of the First.
5. There are many interesting sources on these points which provide citations for the historical papers that are being discussed. Cercignani (1998, p.146-50) discusses the role of Maxwell and Boltzmann in the development of the ergodic hypothesis. Maxwell (1867) is identified as "perhaps the strongest statement in favour of the ergodic hypothesis". Brush (1976) has a detailed account of the development of the ergodic hypothesis. Gallavotti (1995) traces the etymology of "ergodic" to the 'ergode' in an 1884 paper by Boltzmann. More precisely, an ergode is shorthand for 'ergomonode' which is a 'monode with given energy' where a 'monode' can be either a single stationary distribution taken as an ensemble or a collection of such stationary distributions with some defined parameterization. The specific use is clear from the context. Boltzmann proved that an ergode is an equilibrium ensemble and, as such, provides a mechanical model consistent with the second law of thermodynamics. It is generally recognized that the modern usage of 'the ergodic hypothesis' originates with Ehrenfest (1911).
6. Kapetanios and Shin (2011, p.620) capture the essence of this quandary: "Interest in the interface of nonstationarity and nonlinearity has been increasing in the econometric literature. The motivation for this development may be traced to the perceived possibility that nonlinear ergodic processes can be misinterpreted as unit root nonstationary processes. Furthermore, the inability of standard unit root tests to reject the null hypothesis of unit root for a large number of macroeconomic variables, which are supposed to be stationary according to economic theory, is another reason behind the increased interest."

7. The second law of thermodynamics is the universal law of increasing entropy, a measure of the randomness of molecular motion and the loss of energy to do work. First recognized in the early 19th century, the second law maintains that the entropy of an isolated system, not in equilibrium, will necessarily tend to increase over time. Entropy approaches a maximum value at thermal equilibrium. A number of attempts have been made to apply the entropy of information to problems in economics, with mixed success. In addition to the second law, physics now recognizes the zeroth law of thermodynamics that "any system approaches an equilibrium state" (Reed and Simon 1980, p.54). The implications of the second law for theories in economics were initially explored by Georgescu-Roegen (1971).

8. In this process, the ergodicity hypothesis is required to permit the one observed sample path to be used to estimate the parameters for the ex ante distribution of the ensemble paths. In turn, these parameters are used to predict future values of the economic variable.
9. Heterodox critiques are associated with views considered to originate from within economics. Such critiques are seen to be made by 'economists', e.g., Post Keynesian economists, institutional economists, radical political economists and so on. Because such critiques take motivation from the theories of mainstream economics, these critiques are distinct from econophysics. Following Schinckus (2010, p.3818): "Econophysicists have then allies within economics with whom they should become acquainted."

10. Dhrymes (1974, p.1-29) discusses the algebra of the lag operator.
11. Critiques of mainstream economics that are rooted in the insights of The General Theory recognize the distinction between fundamental uncertainty and objective probability. As a consequence, the definition of ergodic theory in heterodox criticisms of mainstream economics lacks formal precision, e.g., the short term dependence of ergodic processes on initial conditions is not usually recognized. Ergodic theory is implicitly seen as another piece of the mathematical formalism inspired by Hilbert and Bourbaki and captured in the Arrow-Debreu general equilibrium model of mainstream economics.
12. In this context though not in all contexts, econophysics provides a 'macroscopic' approach. In turn, ergodicity is an assumption that permits the time average from a single observed sample path to (phenomenologically) model the ensemble of sample paths. Given this, econophysics does contain a substantively richer toolkit that encompasses both ergodic and non-ergodic processes. Many works in econophysics implicitly assume ergodicity and develop models based on that assumption.
13. The distinction between invariant and ergodic measures is fundamental. Recognizing that a number of distinct definitions of ergodicity are available, following Medio (2005, p.70) the Birkhoff-Khinchin ergodic (BK) theorem for invariant measures can be used to demonstrate that ergodic measures are a class of invariant measures. More precisely, the BK theorem permits the limit of the time average to depend on initial conditions. In effect, the invariant measure is permitted to decompose into invariant 'sub-measures'. The physical interpretation of this restriction is that sample paths starting from a particular initial condition may only be able to access a part of the sample space, no matter how long the process is allowed to run. For an ergodic process, sample paths starting from any admissible initial condition will be able to 'fill the sample space', i.e., if the process is allowed to run long enough, the time average will not depend on the initial condition. Medio (2005, p.73) provides a useful example of an invariant measure that is not ergodic.
14. The phenomenological approach is not without difficulties. For example, the restriction to Markov processes ignores the possibility of invariant measures that are not Markov. In addition, an important analytical construct in bifurcation theory, the Lyapunov exponent, can encounter difficulties with certain invariant Markov measures. Primary concern with the properties of the stationary distribution is not well suited to analysis of the dynamic paths around a bifurcation point. And so it goes.
15. A diffusion process is 'regular' if, starting from any point in the state space I, any other point in I can be reached with positive probability (Karlin and Taylor 1981, p.158). This condition is distinct from other definitions of regular that will be introduced: 'regular boundary conditions' and 'regular S-L problem'.
16. The classification of boundary conditions is typically an important issue in the study of solutions to the forward equation. Important types of boundaries include: regular; exit; entrance; and natural. Also important in boundary classification are: the properties of attainable and unattainable; whether the boundary is attracting or non-attracting; and whether the boundary is reflecting or absorbing. In the present context, regular, attainable, reflecting boundaries are usually being considered, with a few specific extensions to other types of boundaries. In general, the specification of boundary conditions is essential in determining whether a given PDE is self-adjoint.

17. Heuristically, if the ergodic process runs long enough, then the stationary distribution can be used to estimate the constant mean value. This definition of ergodic is appropriate for the one-dimensional diffusion cases considered in this paper. Other combinations of transformation, space and function will produce different requirements. Various theoretical results are available for the case at hand. For example, the existence of an invariant Markov measure and exponential decay of the autocorrelation function are both assured.
18. For ease of notation it is assumed that $t_0 = 0$. In practice, solving (1) combined with (3)-(5) requires a and b to be specified. While a and b have ready interpretations in physical applications, e.g., the heat flow in an insulated bar, determining these values in economic applications can be more challenging. Some situations, such as the determination of the distribution of an exchange rate subject to control bands (e.g., Ball and Roma 1998), are relatively straightforward. Other situations, such as profit distributions with arbitrage boundaries or output distributions subject to production possibility frontiers, may require the basic S-L framework to be adapted to the specifics of the modeling situation.
19. The mathematics at this point are heuristic. More appropriate would be to observe that $U^*$ is the special case where $U = \Psi[x]$, a strictly stationary distribution. This would require discussion of how to specify the initial and boundary conditions to ensure that this is the solution to the forward equation.
20. A more detailed mathematical treatment can be found in de Jong (1994).

21. In what follows, except where otherwise stated, it is assumed that $\sigma = 1$. Hence, the condition that K be a constant such that the density integrates to one incorporates the $\sigma = 1$ assumption. Allowing $\sigma \neq 1$ will scale either the value of K or the $\beta$'s from that stated.

22. A number of simplifications were used to produce the 3D image in Figure 1: $x$ has been centered about $\mu$, and $\sigma = K_Q = 1$. Changing these values will impact the specific size of the parameter values.