This paper investigates simultaneous learning about both nature and others’ actions in repeated games and identifies a set of sufficient conditions under which Harsanyi’s doctrine holds. Players have utility functions over infinite histories that are continuous for the sup-norm topology. Nature’s draw after any history may depend on any past actions. Provided that (1) every player maximizes her expected payoff against her own beliefs, (2) every player updates her beliefs in a Bayesian manner, (3) prior beliefs about both nature and other players’ strategies have a grain of truth, and (4) beliefs about nature are independent of actions chosen during the game, we construct a Nash equilibrium that is realization-equivalent to the actual plays and in which Harsanyi’s doctrine holds. These assumptions are shown to be tight.
Consider a finite number of agents interacting simultaneously. Every agent possibly plays an infinite number of times, and her payoff depends on the joint choice of actions as well as events beyond agents’ control (called
We provide a learning foundation for this doctrine. We consider a class of games where nature’s choices may or may not depend on past actions by players, and payoff functions are continuous for the sup-norm topology over the set of infinite histories. Provided that Bayesian players have a grain of truth, we show that resulting outcomes converge for the sup-norm topology to a Nash equilibrium that we construct, where Harsanyi’s doctrine holds.
Kalai and Lehrer [
To derive a convergence result for the sup-norm topology with choices of nature, a significantly different approach than that in Kalai and Lehrer [ along every equilibrium play path, every player chooses the (possibly randomized) action that maximizes her subjective expected payoff against her beliefs on others’ strategies; in case of unilateral deviation, every player plays the strategy described in the beliefs of the deviator; in case of multilateral deviation, strategies are defined arbitrarily.
Continuity for the sup-norm topology allows for a form of control over future payoffs. We use this property to approximate resulting plays by a Nash equilibrium for a game with a finite number of histories derived from the original game (a somewhat equivalent notion of truncated games for our setting). This approximation is a direct consequence of continuity for the sup-norm topology, and global approximation by a Nash equilibrium for the whole game is also a consequence of the control over subsequent payoffs derived from continuity. The resulting equilibrium is also proven to satisfy Harsanyi’s doctrine.
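The control over future payoffs can be illustrated in the discounted special case that the paper cites as covered by its framework. The sketch below is only an illustration under stated assumptions: the normalized discounted sum, stage payoffs in [0, 1], and the helper names `discounted_payoff`, `tail_bound`, and `horizon_for` are all hypothetical and do not come from the paper. Under these assumptions, two histories that agree on the first T periods have payoffs that differ by at most delta**T, so a long enough finite horizon pins payoffs down to any desired precision.

```python
def discounted_payoff(stage_payoffs, delta):
    """Normalized discounted sum (1 - delta) * sum_t delta**t * u_t.

    Assumption: stage payoffs u_t lie in [0, 1] and 0 < delta < 1.
    """
    return (1 - delta) * sum(delta ** t * u for t, u in enumerate(stage_payoffs))


def tail_bound(T, delta):
    """Largest possible payoff gap between two histories that agree on
    periods 0..T-1, given stage payoffs in [0, 1].

    (1 - delta) * sum_{t >= T} delta**t equals delta**T.
    """
    return delta ** T


def horizon_for(eps, delta):
    """Smallest T such that everything after period T moves the payoff
    by at most eps (requires eps > 0 and 0 < delta < 1)."""
    T = 0
    while delta ** T > eps:
        T += 1
    return T


if __name__ == "__main__":
    delta, eps = 0.9, 0.01
    T = horizon_for(eps, delta)
    # Two histories sharing the first T periods, differing as much as possible afterwards.
    worst = [1.0] * T + [0.0] * 200
    best = [1.0] * T + [1.0] * 200
    gap = abs(discounted_payoff(best, delta) - discounted_payoff(worst, delta))
    print(T, gap, tail_bound(T, delta))
```

The general sup-norm continuity assumption plays the same role without requiring a discounted-sum form; the closed-form bound delta**T is specific to this illustrative case.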
The proof requires the introduction of the concept of
The literature typically deals with more restrictive settings than ours; for instance, most follow Kalai and Lehrer by assuming that the same game is repeated over time and that the payoff function is a discounted sum of one-shot payoffs. Continuity for the sup-norm topology, as considered here, goes far beyond this setting. Absolute continuity of beliefs, an issue of paramount importance in our work, is not a necessary condition for convergence in general. Sandroni [
The paper is organized as follows. In Section
The model and some assumptions needed to obtain the main result of the paper are now described.
Time is discrete and continues forever. A period is denoted by the letter
For every
Let
A
Nature draws a state in every period, after every possible history. We thus represent nature’s choices by a behavioral strategy
The game is played with
The concept of
Consider a
Denote by
The beliefs of the players about others’ strategies and the realizations of the states of nature are now formally described.
Every player is assumed to have subjective prior beliefs about both other players’ strategies and nature. Those prior beliefs are formed before the first period of the game, and they will be updated in every subsequent period in a Bayesian manner, according to available information (see Kalai and Lehrer [
Formally, the beliefs of player
The belief of player
We consider the following probabilistic representation of beliefs. We associate a
First, the measure
Define
So defined, we now uniquely extend this measure to
Define the probability measure
The above representation implicitly requires that the belief of every player about nature is independent (in a probabilistic sense) of the actions chosen by other players. Moreover, it also requires that every player believes that other players choose their actions independently of each other.
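One possible reading of this product structure, for finite histories, is sketched below. Everything in it is an assumption made for illustration: the function name `history_probability`, the encoding of a history as a list of (state, opponents' actions) pairs, and the choice to let the belief about nature depend only on past states (one way of capturing independence from actions) are not taken from the paper.

```python
from math import prod


def history_probability(history, nature_belief, opponent_beliefs):
    """Probability a player assigns to a finite history under a product belief.

    history          : list of (state, actions), with one action per opponent
    nature_belief    : maps a tuple of past states to a dict state -> probability
                       (assumption: it ignores past actions)
    opponent_beliefs : one function per opponent, mapping the past history to a
                       dict action -> probability
    """
    probability = 1.0
    past = []
    for state, actions in history:
        past_states = tuple(s for s, _ in past)
        # Belief about nature: depends on past states only.
        probability *= nature_belief(past_states).get(state, 0.0)
        # Opponents are believed to randomize independently of each other,
        # so the joint probability is the product of the marginals.
        probability *= prod(
            belief(tuple(past)).get(action, 0.0)
            for belief, action in zip(opponent_beliefs, actions)
        )
        past.append((state, actions))
    return probability


if __name__ == "__main__":
    nature = lambda past_states: {"r": 0.5, "s": 0.5}
    opponent_1 = lambda past: {"C": 0.8, "D": 0.2}
    opponent_2 = lambda past: {"C": 0.5, "D": 0.5}
    h = [("r", ("C", "C")), ("s", ("D", "C"))]
    print(history_probability(h, nature, [opponent_1, opponent_2]))
```

A correlated belief over opponents' joint actions could not be written as the product inside the loop, which is the kind of belief the representation rules out.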
In Section
For the sake of notational convenience, we shall denote by the same symbol
The intertemporal payoff functions of the players are now described. Every player has the utility function over the set of infinite histories
We assume that, for every player the function
For any
Moreover, every player is assumed to maximize the above expression, namely, her (subjective) expected payoff given her subjective belief about nature’s draws and against her subjective belief about other players’ strategies.
The above specification of payoffs encompasses the case treated in Kalai and Lehrer [
This section is devoted to defining the solution concepts that will be used throughout.
First, the concept of best-response against others’ strategies, given a belief about nature, is defined. Pick any player
Fix now
The next notion allows us to specify a concept of closeness, in a probabilistic sense, between two vectors of strategies and for two particular choices of nature. Define first, for
Fix
The concept of “playing
With the above definitions, it is now possible to introduce the concept of
Fix the strategy the strategy profile
In other words, in any stochastic subjective equilibrium, the following requirements hold: (i) every player maximizes her intertemporal utility function against her beliefs about others’ strategies and given her beliefs about nature, and (ii) the beliefs about others’ strategies and nature are realization-equivalent (up to
The above definition extends the notion of subjective equilibrium, as introduced in Kalai and Lehrer [
In this section, the main result of the paper is stated and discussed. Specifically, we give a set of sufficient conditions leading to convergence toward Nash equilibrium in finite time.
We first introduce a definition, which captures the concept of
Consider a
For any
Before stating the main result of this paper, a notion from measure theory is first defined. Consider two measures
Finally, for any realized play path
Consider a the strategy the beliefs are such that player the belief of player
Fix now any arbitrary
The above theorem says that, if (1) players maximize their intertemporal utility functions against their own beliefs, and if (2) beliefs are updated in a Bayesian manner, as long as the independence requirement is satisfied and the grain of truth holds, actual plays are realization-equivalent to an almost Nash equilibrium in finite time.
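The role of Bayesian updating and the grain of truth can be illustrated with a deliberately small example. The sketch below is not the construction used in the theorem: it assumes a single opponent who plays a stationary mixed action, a finite set of candidate strategies, and a fixed observation sequence whose empirical frequency matches the true strategy (used instead of random draws so that the run is reproducible). The names `bayes_update`, `predicted_prob_C`, and `learn` are hypothetical. It only shows the belief-merging ingredient: when the prior puts positive weight on the truth, one-step-ahead predictions approach the true play; when it puts zero weight on the truth, they need not.

```python
def bayes_update(weights, candidates, action):
    """One Bayesian step: reweight each candidate strategy by the
    likelihood it assigns to the observed action ("C" or "D")."""
    new = [w * (p if action == "C" else 1.0 - p) for w, p in zip(weights, candidates)]
    total = sum(new)
    return [w / total for w in new]


def predicted_prob_C(weights, candidates):
    """Predictive probability that the opponent plays "C" next period."""
    return sum(w * p for w, p in zip(weights, candidates))


def learn(prior, candidates, observations):
    """Update the prior along the observed play path and return the
    posterior weights together with the final one-step prediction."""
    weights = list(prior)
    for action in observations:
        weights = bayes_update(weights, candidates, action)
    return weights, predicted_prob_C(weights, candidates)


if __name__ == "__main__":
    candidates = [0.2, 0.5, 0.8]       # candidate probabilities of playing "C"
    observations = list("CCCCD") * 40  # empirical frequency 0.8, the assumed truth

    # Grain of truth: the true strategy (0.8) receives positive prior weight.
    _, with_grain = learn([0.45, 0.45, 0.10], candidates, observations)
    # No grain of truth: the true strategy receives zero prior weight.
    _, without_grain = learn([0.50, 0.50, 0.00], candidates, observations)
    print(with_grain, without_grain)
```

In the second run the posterior can never recover the excluded strategy, since Bayes' rule multiplies a zero prior weight by the likelihood; the prediction settles on the remaining candidate that fits the data best.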
One of the keys to proving the above result is that, when Assumptions (i)–(iv) are satisfied, actions along the realized play path satisfy the properties of an (almost-) stochastic subjective equilibrium in finite time. Since any (almost) Nash equilibrium is also an almost stochastic subjective equilibrium and trivially satisfies Assumptions (i)–(iv) above, Theorem
The assumptions used in Theorem
Assumptions (ii) and (iv) above are tight; for instance the reader is referred to Kalai and Lehrer [
When all assumptions in Theorem
In terms of possible extensions to Theorem
The proof of Theorem
This section is devoted to giving the main line of the proof of Theorem
The strategy of the proof of Theorem
The first proposition makes the link between strategies and beliefs satisfying Assumptions (i)–(iv) in Theorem
Consider a
For every
The above result implies that, as long as Assumptions (i)–(iv) hold, actual plays and beliefs about others’ strategies along almost every path will become, in finite time, an (almost-) stochastic subjective equilibrium.
The next proposition makes the link between (almost-) stochastic subjective equilibrium and (almost-) Nash equilibrium. Its proof is given in the Appendix.
Fix any vector of beliefs
The above result mainly states that, provided that beliefs are accurate enough, every stochastic subjective equilibrium is an (almost-) Nash equilibrium. Given the conclusion of Proposition
Arbitrary accuracy of beliefs follows from the next proposition, which is the well-known and important result proved by Blackwell and Dubins [
Before stating the result, let
Consider two
With all the above intermediary results, we next move to the proof of Theorem
Fix the strategies
By Proposition
Thus, we have found a period
The proof is now complete.
This section provides some extended discussions of the assumptions in Theorem
In this section, an example is given showing that Assumption (iv) in Theorem
Consider two players engaged in an infinitely repeated game. The repeated game is similar to the one studied so far, with the difference that a randomly generated (fixed-size) pair of payoff matrices
Player
Kalai and Lehrer [
The intuition of such a result is that, when
Instead of representing the belief of player
Of importance is the assumption that every player believes that others’ actions are uncorrelated with each other. Informally, this last assumption ensures that any player’s beliefs are represented by a product measure over beliefs about others’ strategies. When this assumption is not present, it is easy to find examples where convergence toward a Nash equilibrium does not obtain (see for instance Kalai and Lehrer, [
The Appendices are devoted to proving technical results left aside earlier in the paper. In what follows, we consider nature as an additional player maximizing a constant utility function. This does not yield any loss of generality.
Proposition
Fix
Therefore, Theorem
Consider any realized play paths
The proof of Proposition
We start with a technical lemma stating that, when two measures eventually become close for the sup-norm, the expectations of any continuous function under those measures also eventually become close. Consider a complete metric space
Consider two positive and finite measures
Consider any such function
The proof is complete.
We next state another technical lemma, related to the notion of stochastic subjective equilibrium.
For every
Fix any vector of beliefs
To prove the result, we first truncate the infinitely repeated game to a finitely repeated game, show that the result holds within this truncated game, and then extend the result to the original framework.
Fix
First, we have that, for every
We restrict our attention to the truncated game of length
Formally, for any behavioral strategy
Consider also the function
With this last function, only changes of individual strategy within the truncated game of length
In a first step, we show that
By applying Lemma
We next analyze the right-hand side of (
For every history
Further, for any given
Consider now the set of such histories assigned strictly positive probability by
Moreover, since
We next use the above remark to find a uniform upper bound on the right-hand side of (
For every set
Taking the maximum over such sets, we have that
Setting
Moreover, since the contribution of any strategy profile after period
We have thus derived the desired inequality, and the proof is now complete.
With the two previous lemmas, we can now prove Proposition
The proof goes as follows. Fix
We associate to for every for every
if if
To prove Proposition
We first claim that there exists
Indeed, by Lemma
Define
We next use the previous claim to get our result. As a first step, we prove the property for every individual deviation in the support of
By Lemma
Define
Combining (
Moreover, since
We now extend this result to any arbitrary behavioral strategy
All together, we have shown that
The author declares that there is no conflict of interest regarding the publication of this paper.