The Foundations of Probability with Black Swans

Graciela Chichilnisky
Departments of Economics and Mathematical Statistics, Columbia University, 335 Riverside Drive, New York, NY 10027, USA (columbia.edu)

Research Article. Journal of Probability and Statistics (JPS), Volume 2010, Article ID 838240, doi:10.1155/2010/838240. ISSN 1687-9538, 1687-952X. Hindawi Publishing Corporation.
Academic Editor: Ričardas Zitikis
Received 8 September 2009; Accepted 25 November 2009; Published 14 December 2009

Copyright © 2010. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

We extend the foundation of probability in samples with rare events that are potentially catastrophic, called black swans, such as natural hazards, market crashes, catastrophic climate change, and species extinction. Such events are generally treated as “outliers” and disregarded. We propose a new axiomatization of probability requiring equal treatment in the measurement of rare and frequent events (the Swan Axiom) and characterize the subjective probabilities that the axioms imply: these are neither purely finitely additive nor countably additive but a combination of both. They exclude countably additive probabilities as in De Groot (1970) and Arrow (1971) and are a strict subset of Savage (1954) probabilities that are finitely additive measures. Our subjective probabilities are standard distributions when the sample has no black swans. The finitely additive part, however, assigns more weight to rare events than standard distributions do, and in that sense explains the persistent observation of “power laws” and “heavy tails” that eludes classic theory. The axioms extend earlier work by Chichilnisky (1996, 2000, 2002, 2009) to encompass the foundation of subjective probability and axiomatic treatments of subjective probability by Villegas (1964), De Groot (1963), Dubins and Savage (1965), Dubins (1975), Purves and Sudderth (1976), and of choice under uncertainty by Arrow (1971).

1. Introduction

Black swans are rare events with important consequences, such as market crashes, natural hazards, global warming, and major episodes of extinction. This article is about the foundations of probability when catastrophic events are at stake. It provides a new axiomatic foundation for probability requiring sensitivity to both rare and frequent events. The study culminates in Theorem 6.1, which proves the existence and representation of a probability satisfying three axioms. The last of these axioms requires sensitivity to rare events, a property that is desirable but not respected by standard probabilities. The article shows the connection between those axioms and the Axiom of Choice at the foundation of mathematics. It defines a new type of probabilities that coincide with standard distributions when the sample is populated only by relatively frequent events. Generally, however, they are a mixture of countably additive and finitely additive measures, assigning more weight to black swans than normal distributions do, and predicting more realistically the incidence of “outliers,” “power laws,” and “heavy tails” [1, 2].

The article refines and extends the formulation of probability in an uncertain world. It provides an argument, and a formalization, that probabilities must be additive functionals on $L_\infty(\mathcal{U})$ (where $\mathcal{U}$ is a σ-field of “events” represented by their indicator functions, which are bounded and real valued) that are neither countably additive nor purely finitely additive. The contribution is an axiomatization showing that subjective probabilities must lie in the full dual space $L_\infty^*$ rather than in $L_1$, as implied by the usual formalization (Arrow [3]) that forces countable additivity. The new axioms refine both Savage's axiomatization of finitely additive measures and Villegas' and Arrow's axiomatizations, which are based on countably additive measures, and extend both to deal more realistically with catastrophic events.

Savage axiomatized subjective probabilities as finitely additive measures representing the decision makers' beliefs, an approach that can ignore frequent events, as shown in the appendix. To overcome this, Villegas and Arrow introduced an additional continuity axiom (called “Monotone Continuity”) that yields countable additivity of the measures. However, Monotone Continuity has unusual implications when the subject is confronted with rare events. For example, it predicts that in exchange for a couple of cents, one should be willing to accept a small risk of death (measured by a countably additive probability), a possibility that Arrow called “outrageous” [3, pages 48–49]. This article defines a realistic solution: for some very large payoffs, and in certain situations, one may be willing to accept a small risk of death, but not in others. This means that Monotone Continuity holds in some cases but not in others, a possibility that leads to the axiomatization proposed in this article and is consistent with the experimental observations reported by Chanel and Chichilnisky [6, 7]. The results are as follows. We show that countably additive measures are insensitive to black swans: they assign negligible weight to rare events, no matter how important these may be, treating catastrophes as outliers. Finitely additive measures, on the other hand, may assign no weight to frequent events, which is equally troubling. Our new axiomatization balances the two approaches and extends both, requiring sensitivity in the measurement of rare as well as frequent events. We provide an existence theorem for probabilities that satisfy our axioms, and a characterization of all that do.

The results are based on an axiomatic approach to choice under uncertainty and sustainable development introduced by Chichilnisky, and illuminate the classic issue of continuity that has always been at the core of “subjective probability” axioms (Villegas, Arrow). To define continuity, we use a topology that tallies with the experimental evidence of how people react to rare events that cause fear (Le Doux, Chichilnisky), previously used by Debreu to formalize a market's Invisible Hand, and by Chichilnisky [9, 12, 14] to axiomatize choice under uncertainty with rare events that inspire fear. The new results provided here show that the standard axiom of decision theory, Monotone Continuity, is equivalent to De Groot's Axiom $SP_4$ that lies at the foundation of classic likelihood theory (Proposition 2.1) and that both of these axioms underestimate rare events no matter how catastrophic they may be. We introduce here a new Swan Axiom (Section 3) that logically negates them both, show that it is a combination of two axioms defined by Chichilnisky [9, 14], and prove that any subjective probability satisfying the Swan Axiom is neither countably additive nor purely finitely additive: it has elements of both (Theorem 4.1). Theorem 6.1 provides a complete characterization of all subjective probabilities that satisfy linearity and the Swan Axiom, thus extending earlier results of Chichilnisky [1, 2, 9, 12, 14].

There are other approaches to subjective probability, such as the Choquet Expected Utility model (CEU, Schmeidler) and Prospect Theory (Kahneman and Tversky [16, 17]). They use a nonlinear treatment of probabilities or likelihoods (see, e.g., Dreze, or Bernstein), while we retain linear probabilities. Both have a tendency to give higher weight to small probabilities, and both are theoretical answers to experimental paradoxes found by Allais in 1953 and Ellsberg in 1961, among others, that refute the Independence Axiom of the Subjective Expected Utility (SEU) model. Our work focuses instead directly on the foundations of probability by taking the logical negation of the Monotone Continuity Axiom. It is striking that weakening or rejecting this axiom (in decision theory and in probability theory, respectively) leads to probability models that are more in tune with observed attitudes when facing catastrophic events. Presumably each approach has advantages and shortcomings. It seems that the approach offered here may be superior on four counts: (i) it retains linearity of probabilities, (ii) it identifies Monotone Continuity as the reason for underestimating catastrophic events, an axiom that depends on a technical definition of continuity and has no other compelling feature, (iii) it seems easier to explain and to grasp, and therefore (iv) it may be easier to use in applications.

2. The Mathematics of Uncertainty

Uncertainty is described by a set of distinctive and exhaustive possible events represented by a family of sets $\{U_\alpha\}$, $\alpha \in N$, whose union describes a universe $\mathcal{U} = \bigcup_\alpha U_\alpha$. An event $U \in \mathcal{U}$ is identified with its characteristic function $\phi_U : \mathcal{U} \to R$, where $\phi_U(x) = 1$ when $x \in U$ and $\phi_U(x) = 0$ when $x \notin U$. The subjective probability of an event $U$ is a real number $W(U)$ that measures how likely it is to occur according to the subject. Generally we assume that the probability of the universe is 1 and that of the empty set is zero, $W(\emptyset) = 0$. In this article we make no distinction between subjective probabilities and likelihoods, using both terms interchangeably. Classic axioms for subjective probability (resp. likelihoods) are provided by Savage and De Groot. The likelihood of two disjoint events is the sum of their likelihoods: $W(U_1 \cup U_2) = W(U_1) + W(U_2)$ when $U_1 \cap U_2 = \emptyset$, a property called additivity. These properties correspond to the definition of a probability or likelihood as a finitely additive measure on a family (σ-algebra) of measurable sets of $\mathcal{U}$, which is Savage's definition of subjective probability. $W$ is countably additive when $W(\bigcup_{i=1}^{\infty} U_i) = \sum_{i=1}^{\infty} W(U_i)$ whenever $U_i \cap U_j = \emptyset$ for $i \neq j$. A purely finitely additive probability is one that is additive but not countably additive. Savage's subjective probabilities can be purely finitely additive or countably additive. In that sense they include all the probabilities in this article. However, as seen below, this article excludes probabilities that are either purely finitely additive or countably additive, and therefore our characterization of a subjective probability is strictly finer than Savage's, and different from the view of a measure as a countably additive set function (e.g., De Groot). The following axioms were introduced by Villegas and others for the purpose of obtaining countable additivity.

Monotone Continuity Axiom (MC) (Arrow [3])

For every two events $f$ and $g$ with $W(f) > W(g)$, and every vanishing sequence of events $\{E_\alpha\}_{\alpha = 1, 2, \ldots}$ (defined as follows: for all $\alpha$, $E_{\alpha+1} \subset E_\alpha$ and $\bigcap_{\alpha=1}^{\infty} E_\alpha = \emptyset$), there exists $N$ such that altering arbitrarily the events $f$ and $g$ on the set $E_i$, where $i > N$, does not alter the subjective probability ranking of the events, namely, $W(f') > W(g')$, where $f'$ and $g'$ are the altered events.

This axiom is equivalent to requiring that the probability of the sets along a vanishing sequence goes to zero. Observe that the decreasing sequence could consist of infinite intervals of the form $(n, \infty)$ for $n = 1, 2, \ldots$. Monotone Continuity therefore implies that the likelihood of this sequence of events goes to zero, even though all its sets are unbounded. A similar example can be constructed with a decreasing sequence of bounded sets, $(0, 1/n)$ for $n = 1, 2, \ldots$, which is also a vanishing sequence as it is decreasing and its intersection is empty.
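The statement that a countably additive probability sends the weights of a vanishing sequence to zero can be checked numerically. The following is a minimal sketch, assuming a standard normal distribution as the countably additive probability (an illustrative choice that is not made in the article); the helper name `normal_tail` is hypothetical.

```python
import math


def normal_tail(n: float) -> float:
    """Weight of the event (n, inf) under a standard normal distribution.

    The normal distribution is an assumption made for illustration only.
    """
    return 0.5 * math.erfc(n / math.sqrt(2.0))


# Weights of the vanishing sequence E_n = (n, inf), n = 1, ..., 10.
tail_weights = [normal_tail(n) for n in range(1, 11)]

for n, w in enumerate(tail_weights, start=1):
    print(f"W(E_{n}) = {w:.3e}")
```

Every set in the sequence is unbounded, yet the printed weights decrease rapidly toward zero, which is the behaviour that Monotone Continuity requires.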

De Groot's Axiom $SP_4$ (De Groot [20], Chapter 6, page 71)

If $A_1 \supset A_2 \supset \cdots$ is a decreasing sequence of events and $B$ is some fixed event that is less likely than $A_i$ for all $i$, then the probability of the intersection $\bigcap_i A_i$ is larger than that of $B$.

The following proposition establishes that the two axioms presented above are one and the same; both imply countable additivity.

Proposition 2.1.

A relative likelihood (subjective probability) satisfies the Monotone Continuity Axiom if and only if it satisfies Axiom SP4. Each of the two axioms implies countable additivity.

Proof.

Assume that De Groot's Axiom $SP_4$ is satisfied. When the intersection of a decreasing sequence of events is empty, $\bigcap_i A_i = \emptyset$, and the set $B$ is less likely to occur than every set $A_i$, then the set $B$ must be as likely as the empty set; namely, its probability must be zero. In other words, if $B$ is more likely than the empty set, then regardless of how small the set $B$ is, it is impossible for every set $A_i$ to be as likely as $B$. Equivalently, the probability of the sets that are far away in the vanishing sequence must go to zero. Therefore $SP_4$ implies Monotone Continuity. Conversely, assume that MC is satisfied. Consider a decreasing sequence of events $A_i$ and define a new sequence by subtracting from each set the intersection of the family, namely, $A_1 - \bigcap_i A_i$, $A_2 - \bigcap_i A_i$, $\ldots$. Let $B$ be a set that is more likely than the empty set but less likely than every $A_i$. Observe that the intersection of the new sequence is empty, $\bigcap_i (A_i - \bigcap_i A_i) = \emptyset$, and since $A_i \supset A_{i+1}$ the new sequence is, by definition, a vanishing sequence. Therefore by MC $\lim_i W(A_i - \bigcap_i A_i) = 0$. Since $W(B) > 0$, $B$ must be more likely than $A_i - \bigcap_i A_i$ from some $i$ onwards. Furthermore, $A_i = (A_i - \bigcap_i A_i) \cup (\bigcap_i A_i)$ and $(A_i - \bigcap_i A_i) \cap (\bigcap_i A_i) = \emptyset$, so that $W(A_i) > W(B)$ is equivalent to $W(A_i - \bigcap_i A_i) + W(\bigcap_i A_i) > W(B)$. Observe that $W(\bigcap_i A_i) < W(B)$ would contradict the inequality $W(A_i) = W(A_i - \bigcap_i A_i) + W(\bigcap_i A_i) > W(B)$, since, as we saw above, by MC $\lim_i W(A_i - \bigcap_i A_i) = 0$ while $W(A_i - \bigcap_i A_i) + W(\bigcap_i A_i) > W(B)$ for every $i$. It follows that $W(\bigcap_i A_i) > W(B)$, which establishes De Groot's Axiom $SP_4$. Therefore Monotone Continuity is equivalent to De Groot's Axiom $SP_4$. A proof that each of the axioms implies countable additivity is in Villegas, Arrow, and De Groot.

The next section shows that the two axioms, Monotone Continuity and SP4, are biased against rare events no matter how catastrophic these may be.

3. The Value of Life

The best way to explain the role of Monotone Continuity is by means of an example provided by Arrow [3, pages 48–49]. He explains that if $a$ is an action that involves receiving one cent, $b$ is another that involves receiving zero cents, and $c$ is a third action that involves receiving one cent and facing a small probability of death, then Monotone Continuity requires that the third action, involving death and one cent, should be preferred to the action with zero cents if the probability of death is small enough. Even Arrow says of his requirement “this may sound outrageous at first blush…” (Arrow [3, pages 48–49]). Outrageous or not, Monotone Continuity (MC) leads to the neglect of rare events with major consequences, like death. Death is a black swan.

To overcome the bias we introduce an axiom that is the logical negation of MC: this means that MC sometimes holds and sometimes does not. We call this the Swan Axiom, and it is stated formally below. To illustrate this, consider an experiment where subjects are offered a certain amount of money to choose a pill at random from a pile that is known to contain one pill that causes death. It was shown experimentally (Chanel and Chichilnisky) that in some cases people accept a sum of money and choose a pill provided that the pile is large enough, namely, when the probability of death is small enough, thus satisfying the Monotone Continuity axiom and determining the statistical value of their lives. But there are also cases where the subjects refuse to choose any pill, no matter how large the pile is. Some people refuse the payment of one cent if it involves a small probability of death, no matter how small the probability may be (Chanel and Chichilnisky [6, 7]). This conflicts with the Monotone Continuity axiom, as explicitly presented by Arrow.

Our Axiom provides a reasonable resolution to this dilemma that is realistic and consistent with the experimental evidence. It implies that there exist catastrophic outcomes, such as the risk of death, so terrible that one is unwilling to face a small probability of death to obtain one cent versus nothing, no matter how small the probability may be. According to our Axiom, no probability of death may be acceptable when one cent is involved. Our Axiom also implies that in other cases there may be a small enough probability that the lottery involving death is acceptable, for example, if the payoff is large enough to justify the small risk. This is a possibility discussed by Arrow. In other words, sometimes one is willing to take a risk with a small enough probability of a catastrophe; in other cases one is not. This is the content of our Axiom, which is formally stated as follows.

The Swan Axiom

This axiom is the logical negation of Monotone Continuity: There exist events $f$ and $g$ with $W(f) > W(g)$, and for every vanishing sequence of events $\{E_i\}_{i = 1, 2, \ldots}$ an $N > 0$ such that altering arbitrarily the events $f$ and $g$ on the set $E_i$, where $i > N$, does not alter the probability ranking of the events, namely, $W(f') > W(g')$, where $f'$ and $g'$ are the altered events. For other events $f$ and $g$ with $W(f) > W(g)$, there exists a vanishing sequence of events $\{E_i\}_{i = 1, 2, \ldots}$ where for every $N$, altering arbitrarily the events $f$ and $g$ on the set $E_i$, where $i > N$, does alter the probability ranking of the events, namely, $W(f') < W(g')$, where $f'$ and $g'$ are the altered events.

Definition 3.1.

A probability $W$ is said to be biased against rare events, or insensitive to rare events, when it neglects events that are small according to Villegas and Arrow; as stated in Arrow [3, page 48]: “An event that is far out on a vanishing sequence is ‘small’ by any reasonable standards.” Formally, a probability is insensitive to rare events when, given two events $f$ and $g$ and any vanishing sequence of events $\{E_j\}$, $\exists N = N(f, g) > 0$ such that $W(f) > W(g) \Leftrightarrow W(f') > W(g')$ for all $f', g'$ satisfying $f' = f$ and $g' = g$ a.e. on $E_j^c \subset R$ when $j > N$, where $E^c$ denotes the complement of the set $E$.

Proposition 3.2.

A subjective probability satisfies Monotone Continuity if and only if it is biased against rare events.

Proof.

This is immediate from the definitions of both [3, 12].

Corollary 3.3.

Countably additive probabilities are biased against rare events.

Proof.

It follows from Propositions 2.1 and 3.2 [9, 12].

Proposition 3.4.

Purely finitely additive probabilities are biased against frequent events.

Proof.

See example in the appendix.

Proposition 3.5.

A subjective probability that satisfies the Swan Axiom is neither biased against rare events, nor biased against frequent events.

Proof.

This is immediate from the definition.

4. An Axiomatic Approach to Probability with Black Swans

This section proposes an axiomatic foundation for subjective probability that is unbiased against rare and frequent events. The axioms are as follows:

Axiom 1.

Subjective probabilities are continuous and additive.

Axiom 2.

Subjective probabilities are unbiased against rare events.

Axiom 3.

Subjective probabilities are unbiased against frequent events.

Additivity is a natural condition, and continuity captures the notion that “nearby” events are thought of as being similarly likely to occur; this property is important to ensure that “sufficient statistics” exist. “Nearby” has been defined by Villegas and Arrow as follows: two events are close or nearby when they differ on a small set as defined in Arrow; see the previous section. We saw in Proposition 3.2 that the notion of continuity defined by Villegas and Arrow, namely, Monotone Continuity, conflicts with the Swan Axiom. Indeed, Corollary 3.3 shows that countably additive measures are biased against rare events. On the other hand, Proposition 3.4 and Example A.1 in the appendix show that purely finitely additive measures can be biased against frequent events. A natural question is whether there is anything left after one eliminates both biases. The following theorem addresses this issue.

Theorem 4.1.

A subjective probability that satisfies the Swan Axiom is neither purely finitely additive nor countably additive; it is a strict convex combination of both.

Proof.

This follows from Propositions 3.2, 3.4, and 3.5, Corollary 3.3 above, and the fact that convex combinations of measures are measures. It extends Theorem 6.1 of Section 6 below, which applies to the special case where the events are Borel sets in $R$ or in an interval $(a, b) \subset R$.

Theorem 4.1 establishes that neither Savage's approach nor Villegas' and Arrow's satisfies the three axioms stated above. These three axioms require more than the additive subjective probabilities of Savage, since purely finitely additive probabilities are finitely additive and yet they are excluded here. At the same time the axioms require less than the countable additivity of the subjective probabilities of Villegas and Arrow, since countably additive probabilities are biased against rare events. Theorem 4.1 above shows that a strict combination of both does the job.

Theorem 4.1 does not, however, prove the existence of likelihoods that satisfy all three axioms. What is missing is an appropriate definition of continuity that does not conflict with the Swan Axiom. The following section shows that this can be achieved by identifying an event with its characteristic function, so that events are contained in the space of bounded real-valued functions on the universe space $\mathcal{U}$, $L_\infty(\mathcal{U})$, and endowing this space with the sup norm.

5. Axioms for Probability with Black Swans, in $R$ or $(a, b)$

From here on events are the Borel sets of the real line $R$ or the interval $(a, b)$. This is a widely used case that makes the results concrete and allows them to be compared with the earlier axioms on choice under uncertainty of Chichilnisky [9, 12, 14]. We use a concept of “continuity” based on a topology that was used earlier by Debreu and by Chichilnisky [1, 2, 9, 10, 12, 14]: observable events are in the space of measurable and essentially bounded functions $L_\infty = L_\infty(R)$ with the sup norm $\|f\| = \operatorname{ess\,sup}_{x \in R} |f(x)|$. This is a sharper and more stringent definition of closeness than the one used by Villegas and Arrow, since two events can be close under the Villegas-Arrow definition but not under ours; see the appendix.

A subjective probability satisfying the classic axioms by De Groot is called a standard probability, and is countably additive. A classic result is that for any event $f \in L_\infty$ a standard probability has the form $W(f) = \int_R f(x) \cdot \phi(x) \, d\mu$, where $\phi \in L_1(R)$ is an integrable function in $R$.

The next step is to introduce the new axioms, show existence, and characterize all the distributions that satisfy the axioms. We need more definitions. A subjective probability $W : L_\infty \to R$ is called biased against rare events, or insensitive to rare events, when it neglects events that are small according to a probability measure $\mu$ on $R$ that is absolutely continuous with respect to the Lebesgue measure. Formally, a probability is insensitive to rare events when, given two events $f$ and $g$, $\exists \varepsilon = \varepsilon(f, g) > 0$ such that $W(f) > W(g) \Leftrightarrow W(f') > W(g')$ for all $f', g'$ satisfying $f' = f$ and $g' = g$ a.e. on $A \subset R$ and $\mu(A^c) < \varepsilon$. Here $A^c$ denotes the complement of the set $A$. $W : L_\infty \to R$ is said to be insensitive to frequent events when, given any two events $f, g$, $\exists \varepsilon(f, g) > 0$ such that $W(f) > W(g) \Leftrightarrow W(f') > W(g')$ for all $f', g'$ satisfying $f' = f$ and $g' = g$ a.e. on $A \subset R$ and $\mu(A^c) > 1 - \varepsilon$. $W$ is called sensitive to rare (respectively, frequent) events when it is not insensitive to rare (respectively, frequent) events.

The following three axioms are identical to the axioms in the last section, specialized to the case at hand, with sensitivity to frequent events now listed before sensitivity to rare events.

Axiom 1.

$W : L_\infty \to R$ is linear and continuous.

Axiom 2.

$W : L_\infty \to R$ is sensitive to frequent events.

Axiom 3.

$W : L_\infty \to R$ is sensitive to rare events.

The first and the second axiom agree with classic theory and standard likelihoods satisfy them. The third axiom is new.

Lemma 5.1.

A standard probability satisfies Axioms 1 and 2, but it is biased against rare events and therefore does not satisfy Axiom 3.

Proof.

Consider $W(f) = \int_R f(x) \phi(x) \, dx$, with $\int_R \phi(x) \, dx = K < \infty$. Then $W(f) + W(g) = \int_R f(x) \phi(x) \, dx + \int_R g(x) \phi(x) \, dx = \int_R [f(x) + g(x)] \cdot \phi(x) \, dx = W(f + g)$, since $f$ and $g$ are characteristic functions and thus positive. Therefore $W$ is linear. $W$ is continuous with respect to the $L_1$ norm $\|f\|_1 = \int_R |f(x)| \phi(x) \, d\mu$ because $\|f\|_\infty < \varepsilon$ implies $W(f) = \int_R f(x) \cdot \phi(x) \, dx \leq \int_R |f(x)| \cdot \phi(x) \, dx \leq \varepsilon \int_R \phi(x) \, dx = \varepsilon K$. Since the sup norm is finer than the $L_1$ norm, continuity in $L_1$ implies continuity with respect to the sup norm (Dunford and Schwartz). Thus a standard subjective probability satisfies Axiom 1. It is obvious that for every two events $f, g$ with $W(f) > W(g)$, the inequality is reversed, namely, $W(g') > W(f')$, when $f'$ and $g'$ are appropriate variations of $f$ and $g$ that differ from $f$ and $g$ on sets of sufficiently large Lebesgue measure. Therefore Axiom 2 is satisfied. A standard subjective probability is, however, not sensitive to rare events, as shown in Chichilnisky [1, 2, 9, 10, 12, 14, 23].

6. Existence and Representation

Theorem 6.1.

There exists a subjective probability $W : L_\infty \to R$ satisfying Axioms 1, 2, and 3. A probability satisfies Axioms 1, 2, and 3 if and only if there exist two continuous linear functions on $L_\infty$, denoted $\phi_1$ and $\phi_2$, and a real number $\lambda$, $0 < \lambda < 1$, such that for any observable event $f \in L_\infty$, $W(f) = \lambda \int_{x \in R} f(x) \phi_1(x) \, dx + (1 - \lambda) \phi_2(f)$ (6.1), where $\phi_1 \in L_1(R, \mu)$ defines a countably additive measure on $R$ and $\phi_2$ is a purely finitely additive measure.

Proof.

This result follows from the representation theorem by Chichilnisky [9, 12].

Example 6.2 (“Heavy” Tails).

The following illustrates the additional weight that the new axioms assign to rare events, in this example in a form suggesting “heavy tails.” The finitely additive measure $\phi_2$ appearing in the second term in (6.1) can be illustrated as follows. On the subspace of events with limiting values at infinity, $L'_\infty = \{f \in L_\infty : \lim_{x \to \infty} f(x) < \infty\}$, define $\phi_2(f) = \lim_{x \to \infty} f(x)$ and extend this to a function on all of $L_\infty$ using the Hahn–Banach theorem. The difference between a standard probability and the likelihood defined in (6.1) is the second term $\phi_2$, which focuses all the weight at infinity. This can be interpreted as a “heavy tail,” a part of the distribution that is not part of the standard density function $\phi_1$ and gives more weight to the sets that contain terminal events, namely, sets of the form $(x, \infty)$.
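The effect of the second term can be made concrete on tail events, whose indicator functions have a limit at infinity, so that the finitely additive part is just that limit and no extension is needed. The following is a minimal sketch, assuming a standard normal density for the countably additive part and an arbitrary illustrative value of the mixing weight; the names `normal_tail` and `swan_weight` are hypothetical and do not come from the article.

```python
import math


def normal_tail(x: float) -> float:
    """Integral of the indicator of (x, inf) against a standard normal density.

    The normal density stands in for phi_1; this is an assumption.
    """
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def swan_weight(x: float, lam: float) -> float:
    """W of the indicator of (x, inf) under the two-term representation.

    First term: lam times the countably additive weight of the tail.
    Second term: (1 - lam) times the limit of the indicator at +infinity,
    which equals 1 for every tail set (x, inf).
    """
    limit_at_infinity = 1.0
    return lam * normal_tail(x) + (1.0 - lam) * limit_at_infinity


lam = 0.9  # illustrative mixing weight, 0 < lam < 1
for x in (1.0, 3.0, 10.0, 50.0):
    print(f"x = {x:5.1f}  standard = {normal_tail(x):.3e}  "
          f"mixed = {swan_weight(x, lam):.6f}")
```

Under these assumptions the standard weight of the tail vanishes as $x$ grows, while the mixed weight never falls below $1 - \lambda$, which is one way to read the “heavy tail” interpretation.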

Corollary 6.3.

In samples without rare events, a subjective probability that satisfies Axioms 1, 2, and 3 is consistent with classic axioms and yields a countably additive measure.

Proof.

Axiom 3 is an empty requirement when there are no rare events while, as shown above, Axioms 1 and 2 are consistent with standard relative likelihood.

7. The Axiom of Choice

There is a connection between the new axioms presented here and the Axiom of Choice that is at the foundation of mathematics (Gödel), which postulates that there exists a universal and consistent fashion to select an element from every set. The best way to describe the situation is by means of an example; see also Dunford and Schwartz, Yosida [25, 26], Chichilnisky and Heal, and Kadane and O'Hagan.

Example 7.1 (illustration of a purely finitely additive measure).

Consider a possible measure $\rho$ satisfying the following: for every interval $A \subset R$, $\rho(A) = 1$ if $A \supset \{x : x > a\}$ for some $a \in R$, and otherwise $\rho(A) = 0$. Such a measure would not be countably additive, because the family of countably many disjoint sets $\{V_i\}_{i = 0, 1, \ldots}$ defined as $V_i = (i, i+1] \cup (-i-1, -i]$ satisfies $V_i \cap V_j = \emptyset$ when $i \neq j$, and $\bigcup_{i=0}^{\infty} V_i = \bigcup_{i=0}^{\infty} (i, i+1] \cup (-i-1, -i] = R$, so that $\rho(\bigcup_{i=0}^{\infty} V_i) = 1$, while $\sum_{i=0}^{\infty} \rho(V_i) = 0$, which contradicts countable additivity. Since the contradiction arises from assuming that $\rho$ is countably additive, such a measure could only be purely finitely additive.
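The failure of countable additivity in this example can be reproduced on sets that are finite unions of half-open intervals. The following is a minimal sketch, assuming each set is encoded as a list of `(left, right]` pairs with `left < right`; the encoding and the names `rho` and `V` are illustrative choices, not part of the article.

```python
import math
from typing import List, Tuple

Interval = Tuple[float, float]  # encodes the half-open interval (left, right]


def rho(pieces: List[Interval]) -> float:
    """Return 1 if the set contains a ray {x : x > a} for some real a, else 0.

    For a finite union of intervals this happens exactly when some piece
    extends to +infinity.
    """
    return 1.0 if any(right == math.inf for _, right in pieces) else 0.0


def V(i: int) -> List[Interval]:
    """The set V_i = (i, i+1] union (-i-1, -i] from the example."""
    return [(float(i), float(i + 1)), (float(-i - 1), float(-i))]


# Each V_i is bounded, so every term of the sum is zero.
sum_of_parts = sum(rho(V(i)) for i in range(10_000))

# The union of all V_i is the whole real line, which contains a ray.
weight_of_union = rho([(-math.inf, math.inf)])

print(sum_of_parts, weight_of_union)
```

The partial sums stay at zero however many sets are included, while the union of the whole family receives weight one.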

One can illustrate a function on $L_\infty$ that represents a purely finitely additive measure $\rho$ if we restrict our attention to the closed subspace $L'_\infty$ of $L_\infty$ consisting of those functions $f(x)$ in $L_\infty$ that have a limit when $x \to \infty$, by the formula $\rho(f) = \lim_{x \to \infty} f(x)$, as in Example 6.2 of the previous section. The function $\rho(\cdot)$ can be illustrated as a limit of a sequence of delta functions whose supports increase without bound. The problem, however, is to extend the function $\rho$ to another defined on the entire space $L_\infty$. This could be achieved in various ways but, as we will see, each of them requires the Axiom of Choice.

One can use the Hahn–Banach theorem to extend the function $\rho$ from the closed subspace $L'_\infty \subset L_\infty$ to the entire space $L_\infty$, preserving its norm. However, in its general form the Hahn–Banach theorem requires the Axiom of Choice (Dunford and Schwartz). Alternatively, one can extend the notion of a limit to encompass all functions in $L_\infty$, including those with no standard limit. This can be achieved by using the notion of convergence along a free ultrafilter arising from compactifying the real line $R$, as by Chichilnisky and Heal. However, the existence of a free ultrafilter also requires the Axiom of Choice.

This illustrates why any attempt to construct purely finitely additive measures requires using the Axiom of Choice. Since our criteria include purely finitely additive measures, this provides a connection between the Axiom of Choice and our axioms for relative likelihood. It is somewhat surprising that the consideration of rare events that are neglected in standard statistical theory conjures up the Axiom of Choice, which is independent from the rest of mathematics (Gödel).

Appendix

Example A.1 (Illustration of a probability that is biased against frequent events).

Consider the function $W(f) = \liminf_{x \in R} f(x)$. This is insensitive to frequent events of arbitrarily large Lebesgue measure (Dunford and Schwartz) and therefore does not satisfy Axiom 2. In addition it is not linear, failing Axiom 1.

Example A.2 (two approaches to “closeness”).

Consider the family $\{E_i\}$ where $E_i = [i, \infty)$, $i = 1, 2, \ldots$. This is a vanishing family because for all $i$, $E_i \supset E_{i+1}$ and $\bigcap_{i=1}^{\infty} E_i = \emptyset$. Consider now the events $f_i(t) = K$ when $t \in E_i$ and $f_i(t) = 0$ otherwise, and $g_i(t) = 2K$ when $t \in E_i$ and $g_i(t) = 0$ otherwise. Then for all $i$, $\sup_{E_i} |f_i(t) - g_i(t)| = K$. In the sup norm topology this implies that $f_i$ and $g_i$ are not “close” to each other, as the difference $f_i - g_i$ does not converge to zero. No matter how far along the vanishing sequence $E_i$ one goes, the two events $f_i, g_i$ differ by $K$. Yet since the events $f_i, g_i$ differ from $f_0$ and $g_0$, respectively, only in the set $E_i$, and $\{E_i\}$ is a vanishing sequence, for large enough $i$ they are as “close” as desired according to Villegas-Arrow's definition of “nearby” events.
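The constancy of the sup-norm distance along the vanishing sequence can be seen in a small numerical sketch. It assumes $K = 1$ and approximates the supremum over $E_i$ by a maximum over a finite integer grid; both choices, and the names `f`, `g`, and `sup_distance`, are illustrative assumptions.

```python
K = 1.0  # illustrative value of the constant K


def f(i: int, t: float) -> float:
    """f_i(t) = K on E_i = [i, inf), and 0 elsewhere."""
    return K if t >= i else 0.0


def g(i: int, t: float) -> float:
    """g_i(t) = 2K on E_i = [i, inf), and 0 elsewhere."""
    return 2.0 * K if t >= i else 0.0


grid = range(0, 1001)  # finite stand-in for the real line


def sup_distance(i: int) -> float:
    """Maximum of |f_i - g_i| over the grid, a proxy for the sup norm."""
    return max(abs(f(i, t) - g(i, t)) for t in grid)


distances = [sup_distance(i) for i in (1, 10, 100, 1000)]
print(distances)
```

The distance equals $K$ for every index tried, so the pair never becomes close in the sup norm even though the set on which the two events differ shrinks along the sequence.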

The Dual Space $L_\infty^*$: Countably Additive and Finitely Additive Measures

The space of continuous linear functions on $L_\infty$ with the sup norm is the “dual” of $L_\infty$, and is denoted $L_\infty^*$. It has been characterized, for example, in Yosida [25, 26]. $L_\infty^*$ consists of the sum of two subspaces: (i) $L_1$ functions $g$ that define countably additive measures $\nu$ on $R$ by the rule $\nu(A) = \int_A g(x) \, dx$, where $\int_R |g(x)| \, dx < \infty$, so that $\nu$ is absolutely continuous with respect to the Lebesgue measure, and (ii) a subspace consisting of purely finitely additive measures. A countably additive measure can be identified with an $L_1$ function, called its “density,” but purely finitely additive measures cannot be identified by such functions.

Example A.3.