The Dynamics of the Discrete Ultimatum Game and the Role of the Expectation Level

We study an evolutionary ultimatum game with spatially arranged players who choose between two kinds of strategies (named greedy and altruist). The strategies are described by p(i) and a(i), that is, the probability that a proposer keeps the amount i for himself and the probability of accepting when receiving i. Using computer simulations written in C++ Builder, we present the dynamics of the greedy and altruistic strategies and find that the proportion of the "greedy" strategy converges to approximately 60% for different initial conditions. Explanations for this phenomenon are presented from different aspects. In addition, we illustrate that the expectation level (aspiration level) in the updating rule plays an important role in promoting altruistic behavior.


Introduction
The issues of altruism and selfishness are at the centre of some of the most fundamental questions concerning our evolutionary origins, our social relations, and the organization of society. The investigation of altruism is of great interest across biology and the social sciences [1,2]. Moreover, experimental evidence indicates that human altruism is a powerful force in the animal world [3]. The ultimatum game is a prime showpiece of altruistic behavior, and its rule is quite simple: two players are asked to divide a certain sum of money. One of the players, the proposer, suggests how to divide it, and the other player, the responder, has two choices: either agree to the division or reject it, in which case both get nothing.
For the division strategy of the ultimatum game, the past decades have witnessed many theoretical investigations [4-6]. Many factors have been found to influence the outcomes of the ultimatum game, such as mutation [7], background payoff [8], payoff-oriented mechanism [9], degree-based assignation of roles [10], role preference [11], stochastic evolutionary dynamics [12], and the empathy mechanism [13]. Several studies focus on analyzing different types of connectivity structures. For instance, investigations on the square lattice [7], small-world networks [14], scale-free networks [15], and adaptive networks [16,17] have been conducted primarily to clarify the possible role of the topology.
Most of the abovementioned works treat the strategies of the ultimatum game as continuous. However, treating the strategies as discrete can lead to different and interesting outcomes [6]. Nowak et al. [4] studied the four strategies of a minigame, 1 = $(h, l)$, 2 = $(l, l)$, 3 = $(l, h)$, and 4 = $(h, h)$, where $l$ represents the low amount and $h$ represents the high amount. They found that the rational strategy 1 = $(h, l)$ dominated the fair strategy 3 = $(l, h)$ in the stochastic case, whereas fairness was favored once the "reputation" factor was introduced. Szolnoki et al. [18] introduced a spatial ultimatum game with a discrete set of strategies and showed that this simple alteration could lead to fascinatingly rich dynamical behaviors. They [19] further illustrated the importance of discrete strategies in the ultimatum game and found that fine-grained strategy intervals promote the evolution of fairness in the spatial ultimatum game.
From a psychological perspective, models of social preferences [20] have been provided to formally explain the apparently irrational behavior [21] in the ultimatum game. The typical theories of negative reciprocity [22] focus on intentions and describe rejection as a tool to punish an unfair proposer. The theories of inequality aversion [23] focus on outcomes and claim that people are naturally averse to unequal distributions, especially disadvantageous ones. Taking the prospect idea into consideration, Chen and Wang [24] introduced appropriate payoff aspirations in a small-world networked game and found a profound effect on the promotion of cooperation. Perc and Wang [25] also showed that heterogeneous aspirations promote cooperation in the Prisoner's Dilemma game.
Motivated by the above considerations, we follow [26] and adopt two strategies to study their evolution on spatial lattices [27,28]. These two strategies, greedy and altruist, carry some emotional significance. Furthermore, by introducing the expectation level into the updating rule, we focus on the role of the expectation level for the altruist strategy.
The outline of this paper is as follows. Section 2 shows the evolution of the greedy and altruistic strategies, illustrated from the viewpoints of the payoff, the link types, and the transition probabilities. Subsequently, the evolution of the altruist strategy with the expectation level added is given in Section 3. Finally, our conclusions are drawn in Section 4.

Evolution of Greedy and Altruist Strategies
The rule for the ultimatum game has been described above. Suppose that the total sum is $M$; in the simulations, $M$ is set to 100. The players participating in the game have an equal opportunity to be proposer or responder.
The strategy is generally denoted by $S = (p, a)$, where $p$ represents the amount the proposer keeps for himself, so that the amount $M - p$ is offered to the responder, and $a$ denotes the corresponding acceptance probability of the responder. Following [26], we consider two particular forms of $(p, a)$ on the square lattice, in which the strategies are described as probability distributions:
(i) Greedy strategy: higher values are more probable to be accepted, but lower amounts are more probable to be proposed to others.
(ii) Altruist strategy: higher values are more probable to be proposed to others.
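To make the two strategy descriptions concrete, the following minimal Python sketch plays a single encounter. It is only an illustration under stated assumptions: the linear shapes chosen for p(i) and a(i), the altruist's acceptance rule (which the description above does not specify), and all function names are hypothetical, and the exact distributions of [26] are not reproduced here.

```python
import random

TOTAL = 100  # total sum to divide (the value used in the simulations)

def linear_weights(increasing):
    """Hypothetical linear distribution over kept amounts i = 0..TOTAL."""
    raw = [i if increasing else TOTAL - i for i in range(TOTAL + 1)]
    norm = sum(raw)
    return [w / norm for w in raw]

# p[i]: probability that the proposer keeps i for itself (assumed shapes).
GREEDY_P = linear_weights(increasing=True)     # tends to keep much, offer little
ALTRUIST_P = linear_weights(increasing=False)  # tends to keep little, offer much

def greedy_accept(offer):
    """Assumed rule: larger offers are accepted with higher probability."""
    return offer / TOTAL

def altruist_accept(offer):
    """Assumption only (not given in the text): accept every offer."""
    return 1.0

def play_round(p, accept, rng):
    """One encounter; returns (proposer payoff, responder payoff)."""
    kept = rng.choices(range(TOTAL + 1), weights=p, k=1)[0]
    offer = TOTAL - kept
    if rng.random() < accept(offer):
        return kept, offer
    return 0, 0  # rejection: both players get nothing

def expected_payoffs(p, accept):
    """Exact expected (proposer, responder) payoffs of one encounter."""
    proposer = sum(p[i] * accept(TOTAL - i) * i for i in range(TOTAL + 1))
    responder = sum(p[i] * accept(TOTAL - i) * (TOTAL - i) for i in range(TOTAL + 1))
    return proposer, responder
```

For example, `play_round(GREEDY_P, altruist_accept, random.Random(0))` simulates a greedy proposer facing an altruist responder, and `expected_payoffs` gives the corresponding averages without sampling.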
We first study the spatial and temporal evolution of the two strategies on the square lattice. Each player occupies one site of a two-dimensional square lattice, and the system size $N = L \times L$ is set to $N = 100 \times 100$ in the simulations. Initially, each site is occupied by one of the two strategies. In each round, each player plays the game with its eight immediate neighbors on the square lattice with periodic boundary conditions. The score of each player is the sum of the payoffs from these eight encounters. At the start of the next generation, each lattice site is occupied either by one of its neighbors or by its previous owner, depending on who obtained the highest payoff in that round, and so on for subsequent rounds [27,28]. One run of the model consists of 1000 generations. Each experimental condition is replicated in 30 runs, and all the data, other than the snapshots, are averages over these 30 runs.
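The update step described above can be sketched in a few lines of Python. This is a minimal sketch, not the authors' C++ Builder code: the function names, the `pair_payoff` callback, and the tie-breaking choice (the previous owner keeps the site when scores are equal) are assumptions made for illustration.

```python
def moore_neighbors(x, y, size):
    """The eight nearest neighbours of (x, y) on a periodic square lattice."""
    return [((x + dx) % size, (y + dy) % size)
            for dx in (-1, 0, 1) for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)]

def total_scores(grid, pair_payoff):
    """Score of each site: sum of its payoffs against its eight neighbours.

    pair_payoff(me, other) is a hypothetical callback returning the payoff
    that strategy `me` earns in one encounter with strategy `other`.
    """
    size = len(grid)
    return [[sum(pair_payoff(grid[x][y], grid[nx][ny])
                 for nx, ny in moore_neighbors(x, y, size))
             for y in range(size)]
            for x in range(size)]

def next_generation(grid, score):
    """Each site takes the strategy of the top scorer among itself and its neighbours."""
    size = len(grid)
    new_grid = [row[:] for row in grid]
    for x in range(size):
        for y in range(size):
            # The site itself is listed first, so it keeps its strategy on ties.
            candidates = [(x, y)] + moore_neighbors(x, y, size)
            bx, by = max(candidates, key=lambda s: score[s[0]][s[1]])
            new_grid[x][y] = grid[bx][by]
    return new_grid
```

All sites are updated synchronously from the old grid, which is why `next_generation` writes into a copy rather than modifying `grid` in place.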
The average proportion of the "greedy" strategy in a population containing only "greedy" and "altruist" strategies is shown in Figure 1. From Figure 1, it is clear that the final states are quite similar and that the "greedy" population makes up about 60 percent of the whole population, almost independently of the initial conditions. The simulation results reveal that the "altruist" and "greedy" strategies can coexist.
Comparing the average payoffs of the greedy and altruist populations, shown in Figure 2, one sees that the payoff of the altruists (about 320) is much larger than that of the greedy population (about 240), yet the fraction of the greedy population (about 60%) is larger than that of the altruists (about 40%). Naturally, one may wonder what causes this.
To give an intuitive illustration, we show the asymptotic pattern in Figure 3. It is worth stating that the same asymptotic results arise for other initial conditions. The color coding is as follows: yellow represents an altruist (A) site following A in the preceding generation; aqua is a greedy (G) site following G; red is A following G; and black is G following A. From Figure 3, one can easily see that yellow and aqua sites form clusters, while red and black sites lie between the yellow and aqua regions.
From the perspective of link types, Figure 4 further shows the evolution of the different link types, where A-A denotes a link connecting two nodes with the altruist strategy, G-G represents a link connecting two nodes with the greedy strategy, and A-G is a link connecting nodes with different strategies. We further show that the asymptotic results are the same for the different G : A ratios. Clearly, the asymptotic results show that, among all the links, G-G links are the most frequent, A-A links come second, and A-G links are the least frequent.
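The link-type fractions can be computed directly from a lattice snapshot. The sketch below is illustrative only; it assumes the eight-neighbor lattice with periodic boundaries used in the simulations, a grid of "A"/"G" labels, and a lattice side of at least 3 so that all neighbors are distinct. The function name is hypothetical.

```python
def link_fractions(grid):
    """Fractions of A-A, G-G, and A-G links on a periodic eight-neighbour lattice."""
    size = len(grid)
    counts = {"A-A": 0, "G-G": 0, "A-G": 0}
    # Visiting only four of the eight directions counts every link exactly once.
    half_directions = [(1, 0), (0, 1), (1, 1), (1, -1)]
    for x in range(size):
        for y in range(size):
            for dx, dy in half_directions:
                here = grid[x][y]
                there = grid[(x + dx) % size][(y + dy) % size]
                if here == there:
                    counts[f"{here}-{here}"] += 1
                else:
                    counts["A-G"] += 1
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()}
```

On a 4 × 4 lattice with a single altruist, for instance, 8 of the 64 links are A-G and the remaining 56 are G-G.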
The amounts of red and black in Figure 3 indicate how many sites change from one generation to the next, which is shown in Figure 5. Figure 5(a) shows the conditional probabilities, where $s_t$ denotes the strategy of a site in generation $t$. The relation $\Pr(s_t = G \mid s_{t-1} = G) + \Pr(s_t = A \mid s_{t-1} = G) = 1$ means that the G players in the preceding $(t-1)$th generation either remain G or change into A in the $t$th generation; likewise, $\Pr(s_t = A \mid s_{t-1} = A) + \Pr(s_t = G \mid s_{t-1} = A) = 1$. Figure 5(b) directly shows the proportions of the four different colors at the $t$th generation ($t = 0, 1, 2, \ldots, 100$). One can easily see that the proportions of A → G and G → A are the same in the asymptotic regime, which is reached after a few generations.
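Both quantities, the four color proportions and the conditional probabilities, follow from comparing two consecutive lattice snapshots. The following Python sketch shows one way to compute them; the function name and the dictionary layout are assumptions for illustration.

```python
from collections import Counter

def transition_stats(prev_grid, curr_grid):
    """Compare two consecutive generations of a lattice of 'A'/'G' labels.

    Returns (fractions, conditional):
      fractions["G->A"]      share of all sites that were G and are now A
      conditional[("A","G")] Pr(current = A | previous = G)
    """
    pairs = Counter((p, c)
                    for prev_row, curr_row in zip(prev_grid, curr_grid)
                    for p, c in zip(prev_row, curr_row))
    n_sites = sum(pairs.values())
    fractions = {f"{p}->{c}": pairs[(p, c)] / n_sites
                 for p in "GA" for c in "GA"}
    conditional = {}
    for p in "GA":
        n_prev = pairs[(p, "G")] + pairs[(p, "A")]
        for c in "GA":
            # Undefined (NaN) if no site held strategy p in the previous generation.
            conditional[(c, p)] = pairs[(p, c)] / n_prev if n_prev else float("nan")
    return fractions, conditional
```

By construction, the two conditional probabilities that share the same previous strategy sum to 1, and the four entries of `fractions` sum to 1.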

Expectation Effect on the Evolution of Altruist
In this section, we proceed to explore the greedy and altruist strategies by introducing the expectation level. Strategy updating relies on the difference between a player's actual payoff and its expectation level. Following previous work [24], we define a parameter $A \in [0, M]$, where $M$ is the total sum to be divided, as the average expectation level of the players, and each player $i$ calculates its expectation payoff $P_{ia} = k_i A$, where $k_i$ denotes the number of neighbors; on this square lattice it is identical for every player, $k_i = 8$. During the evolutionary process, player $i$ compares the sum of the payoffs from its neighbors (denoted by $P_i$) with the expectation level and changes its current strategy to the opposite strategy with a probability depending on the difference $(P_i - P_{ia})$ as
$$W_i = \frac{1}{1 + \exp[(P_i - P_{ia})/\kappa]},$$
where $\kappa$ characterizes the noise effects in the strategy adoption process. The aspiration level provides the benchmark used to evaluate whether player $i$ is satisfied with its current strategy. This evolutionary rule is stochastic. The rationale is that players can make efficient use of their own payoff information and evaluate their satisfaction with their current strategies more accurately. The probability $W_i$ characterizes the extent to which players change their current strategies. Herein, we simply set $\kappa = 1$ and concentrate on how the expectation payoff affects the evolution of the altruist strategy on this square lattice.
Discrete Dynamics in Nature and Society
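A short Python sketch of this updating rule may help. It assumes the Fermi-type switching probability commonly used with aspiration-based updating, that is, 1 / (1 + exp((payoff - expectation payoff) / noise)); the function names and default arguments are hypothetical, and the logistic function is evaluated in a numerically stable way because payoff differences of several hundred would otherwise overflow `math.exp`.

```python
import math
import random

def switch_probability(payoff, expectation_level, n_neighbors=8, kappa=1.0):
    """Probability of switching to the opposite strategy (assumed Fermi form)."""
    expectation_payoff = n_neighbors * expectation_level
    x = (payoff - expectation_payoff) / kappa
    # Evaluate 1 / (1 + exp(x)) without overflow for large |x|.
    if x >= 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))

def update_strategy(strategy, payoff, expectation_level, rng):
    """Flip 'G' <-> 'A' with the switching probability; otherwise keep it."""
    if rng.random() < switch_probability(payoff, expectation_level):
        return "A" if strategy == "G" else "G"
    return strategy
```

A player whose payoff equals its expectation payoff switches with probability 1/2; a payoff far below the expectation makes switching almost certain, and a payoff far above it makes switching very unlikely.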
From Figure 6, one can see that the evolution of the altruist strategy is related to the expectation level. Figure 6(b) clearly shows that the altruist fraction increases with the expectation level. The tendency is quite similar to that of the payoff shown in Figure 7. As the expectation level increases, the fractions of A → G and G → A increase simultaneously. Intuitively, this is quite reasonable: the higher the expectation, the bigger the difference between the expectation and the actual payoff, and thus the higher the probability of changing strategy.
As for the strategy pairs, Figure 9 shows the fractions of the link types, that is, of G (A) players connected to G (A) neighbors and of A players with G neighbors. As the expectation level increases, the fractions of both A-A and A-G links increase, while the fraction of G-G links decreases. This implies that a higher expectation level promotes the altruist strategy.

Conclusion
The competition between the "greedy" and "altruist" strategies is studied by numerical simulations, and the results of the evolutionary processes are plotted for some relevant cases. We give intuitive illustrations from the perspectives of payoff evolution, snapshots, the transition probabilities, and the link types. Furthermore, to make full use of each player's own information, the expectation level is introduced into the strategy updating mechanism. The simulation results show that the expectation level can, to some extent, increase the fraction of altruists.