New Insights on the Emergence of Classes Model

We present the results of a detailed replication of the Emergence of Classes Model by Axtell et al. (2004). We study the effect of possible biases on the original proposal and report additional results and conclusions. We also explore the effects of minor changes to the decision rules that agents follow.


Introduction
Efforts to replicate previously published models have grown in recent years. However, model replication is a very tough task, as shown by Axelrod [1] and Edmonds [2]. In this paper, we replicate the model by Axtell et al. [3] (hereafter AEY), where two agents want a portion of the same pie, and the portion that a particular agent gets depends on the portion demanded by the other agent. Our results are in agreement with their conclusions, both with nondistinguishable and distinguishable agents (the tag model), as López-Paredes et al. [4] and Dessalles et al. [5] also confirmed in a previous replication of this work.
In this paper, we analyze the hypotheses that researchers must make to obtain the results shown in AEY's model, and we pay special attention to (a) the initial conditions of the system (potential artefacts/biases, following Galán et al. [6] and Kubera et al. [7]), (b) how dependent the results are on the reward values in the payoff matrix, and (c) different ways in which an agent can take a decision. These considerations should be carefully explained to facilitate replication and to prevent researchers from making erroneous hypotheses and treating particular cases as general conclusions.
After that, we go one step further by introducing a change in the agents' decision rule: agents behave more realistically and do not compute average benefits. Their decision depends on the most likely option taken by their opponents in previous games.
Finally, we change the way in which the agents are paired by placing them on a regular spatial structure and forcing them to play against any of their neighbours.
Our results confirm the role that tags play in the emergent behaviour of artificial societies. The effect of tags in human decision processes has been empirically demonstrated by Ito et al. [8].

The Model
We begin by replicating the bargaining model by AEY. In this model, two players demand some portion of a "pie" (which is a metaphor for a property that is going to be shared out). The portion of pie that they get (i.e., the reward) depends on the other agent's demand. They can demand three different portions of the pie: low, medium or high. As long as the sum of the two demands is not more than 100 percent of the pie, they receive what they demanded; otherwise each gets nothing.
There is a population of n agents that play in random pairs. Each agent has a memory in which she stores the decisions taken by her opponents in previous matches. With probability 1 − ε, she uses the information collected in her memory to demand the portion of the pie that maximizes her expected benefit; with probability ε, she decides at random.
At first, the authors assume that the agents are not distinguishable from one another (except for the content of their memories). They conclude that, whenever there are no observable differences among the agents, there is only one possible state of equilibrium, in which all the pie is shared out among the agents (all the agents learn to compromise and demand "half of the pie"). However, under certain conditions, a "fractious state" can emerge: in this case, all the agents are either aggressive or passive (some of them demand low and some of them demand high), and no equilibrium is reached.
In a second stage of their research, they add a visual "tag" to the agents. The players are capable of identifying their opponent's tag and they store the decision taken by their opponents in a different memory set (depending on the opponent's tag). In this case, the authors show that, just by adding different tags to the players, discriminatory states can emerge under certain conditions, in which agents with different tags follow different behaviours.

Replication
In our replication of AEY's model, we used the original payoff matrix (i.e., the combination of rewards for the different demands): 30 percent for low, 50 percent for medium, and 70 percent for high. We also used the original decision rule.
When two players are paired to play, each one gets the portion that she demands as long as the sum of the two demands is less than or equal to 100 percent of the pie. For example:
(i) if player 1 demands 30, she will receive 30 independently of player 2's decision (when player 1 chooses 30, the sum of her demand and any possible demand of player 2 is less than or equal to 100 percent of the pie);
(ii) if player 1 demands 50, she will get 50 unless player 2 demands 70 (if player 2 chooses 70, the sum of the two demands is higher than 100 percent of the pie; in this case, both players get nothing);
(iii) if player 1 demands 70, she will get 70 only if player 2 demands 30 (if player 2 chooses 50 or 70, the sum of the two demands exceeds 100 percent of the pie and each agent gets nothing).
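The payoff rule in cases (i) to (iii) can be sketched in a few lines of Python. This is only an illustration under the rule stated above; the function name `payoffs` and the `pie` parameter are our own and do not come from AEY's implementation.

```python
LOW, MEDIUM, HIGH = 30, 50, 70  # demands as percentages of the pie


def payoffs(demand_a, demand_b, pie=100):
    """Return the rewards (a, b) of one match.

    Both players receive what they demanded when the demands are
    compatible (their sum does not exceed the pie); otherwise both
    receive nothing.
    """
    if demand_a + demand_b <= pie:
        return demand_a, demand_b
    return 0, 0


if __name__ == "__main__":
    # Reward of player 1 for each of her demands against L, M and H.
    for a in (LOW, MEDIUM, HIGH):
        print(a, [payoffs(a, b)[0] for b in (LOW, MEDIUM, HIGH)])
```

The printed rows correspond to cases (i), (ii) and (iii): a demand of 30 is always paid, a demand of 50 fails only against 70, and a demand of 70 is paid only against 30.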

Decision Rule
What makes an agent choose low, medium or high? An agent checks her memory to find how often each option has been chosen by her opponents. Then, she considers that the probability that her current opponent chooses 30 (L), for example, is equal to the relative frequency of 30 in her memory. In the same way, she calculates how likely it is for the opponent to choose 50 (M) and 70 (H). Once the agent knows this information, she estimates the expected benefit for the three possible options as follows:

$$B(30) = 30\,[P(30) + P(50) + P(70)] = 30,$$
$$B(50) = 50\,[P(30) + P(50)],$$
$$B(70) = 70\,P(30),$$

where B(x) is the mean benefit the agent gets if she chooses x and P(y) is the probability that the opponent demands y. Notice that this "rational behaviour" takes place with probability 1 − ε; with probability ε, a random decision is taken.
A simulation of this replication is shown in Figures 1 and 2. Both simulations were run with the same initial parameters (the same number of agents, the same memory size and the same uncertainty parameter ε).
The simplexes shown in Figures 1 and 2 represent the memory state of the agents. The more demands of L an agent keeps in her memory, the closer to the bottom-right vertex she is plotted. Equivalently, if a player's memory contains a considerable number of H's, she is placed near the top vertex. Finally, if most of the elements in an agent's memory are M's, she is plotted close to the bottom-left vertex.
The simplex is split into three different regions, separated by three "decision borders". The top region is dominated by frequent demands of H in previous matches. This is why agents in this region tend to demand L (with probability 1 − ε), as it maximizes their estimated benefit. In the right region, agents are likely to demand H (with probability 1 − ε) because L is the dominant element in their memories. Agents in the left region have often found that their opponents demand M; since demanding M maximizes the expected payoff, they are likely to choose M (with probability 1 − ε) in the current iteration.
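As an illustration, the expected-benefit rule described above can be sketched as follows. This is a minimal sketch, not AEY's code: the names `expected_benefits` and `choose_demand` are hypothetical, the memory is assumed to be non-empty, and ties between equally good demands are broken in favour of the lowest demand, which is an assumption the text does not specify.

```python
import random
from collections import Counter

DEMANDS = (30, 50, 70)  # low, medium, high


def expected_benefits(memory):
    """Estimate the mean benefit of each demand from a non-empty memory.

    The probability of each opponent demand is taken to be its relative
    frequency in memory; a demand x pays x only when x + y <= 100.
    """
    counts = Counter(memory)
    m = len(memory)
    benefits = {}
    for x in DEMANDS:
        p_compatible = sum(counts[y] for y in DEMANDS if x + y <= 100) / m
        benefits[x] = x * p_compatible
    return benefits


def choose_demand(memory, epsilon, rng=random):
    """Best reply with probability 1 - epsilon, random demand otherwise."""
    if rng.random() < epsilon:
        return rng.choice(DEMANDS)
    benefits = expected_benefits(memory)
    # Assumption: max() keeps the first maximum, i.e. the lowest demand on ties.
    return max(DEMANDS, key=lambda x: benefits[x])


if __name__ == "__main__":
    print(expected_benefits([30, 50, 70, 70]))
    print(choose_demand([70] * 10, epsilon=0.0))
```

With a memory full of H, only the low demand has a nonzero expected benefit, so the agent demands 30 unless noise intervenes.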
The three "decision borders" intersect at a point that represents the Nash equilibrium, in which agents have the same preference for L, M, or H. AEY state that the system reaches an "equitable equilibrium" when all the agents have at least (1 − ε)·m elements in their memories equal to M (where ε is the uncertainty factor and m is the memory size). Figure 1 shows an equitable equilibrium. In this state, all the agents have found frequent demands of M in the past, and they assume that M is the best response. Because all the agents demand M, all the pie is shared out among the players, which means that the system has reached an efficient state. Once the equitable equilibrium is established, it is very difficult for the system to leave this state; however, since the system is ergodic, there is still a chance that the system reaches every state in the long term, due to the noise parameter ε. Figure 2, by contrast, shows a fractious state, in which all the agents are either aggressive or passive (most of them select L or H; M is hardly chosen) and no equilibrium is reached. In this case, the system was started with different random initial conditions. Because the agents have not learnt to compromise, some portions of the pie remain undistributed, which shows the high inefficiency of this state.
Because the system is ergodic, there is a chance that the population evolves from the fractious state shown in Figure 2 to the equitable equilibrium depicted in Figure 1. The number of iterations needed to achieve this change in the state of the system was defined by AEY as the "transition time". AEY studied the transition time and analyzed the sensitivity of the results to the memory size (m) and the uncertainty factor (ε), and so did we in our replication. To this end, we forced the agents' memories so that the system started in a fractious state (Figure 2), and then we measured the number of iterations that the system needed to reach the equitable equilibrium (Figure 1). Figure 3 shows the results of our simulation.
Both experiments, the original and the replication, produce the same result regarding the transition time: it increases as the memory size grows. Notice that this simulation starts in a fractious state; this is why, at first, all the agents tend to demand L or H with high probability (1 − ε), because their memories contain mainly L and H. As a result, the agents continue demanding L or H (M never maximizes their expected benefit in the first stages of the simulation, when the system is in a fractious state). Therefore, we depend on the noise parameter ε to escape the fractious state, as this is the only way to make M appear in the agents' memories and, consequently, make the agents consider M a good option. When the system is started in a fractious state, the probability that an agent chooses M is ε/3: the probability of taking a random decision is ε and, given a random decision, the probability that it is M (i.e., that neither L nor H is randomly chosen) is one out of three. This is the reason why the higher ε, the higher the probability of leaving the fractious state and thus the faster the convergence to an equitable equilibrium, as Figure 3 shows.
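The equitable-equilibrium criterion quoted above (every agent holds at least (1 − ε)·m entries equal to M) can be checked with a short helper. This is a sketch under our own assumptions: the function name `is_equitable` is hypothetical, memories are plain lists of demands, and the threshold comparison is taken to be inclusive.

```python
def is_equitable(memories, epsilon, m, medium=50):
    """True when every memory holds at least (1 - epsilon) * m medium demands.

    `memories` is an iterable of per-agent memories (lists of demands).
    The inclusive ">=" comparison follows the "at least" wording above.
    """
    threshold = (1 - epsilon) * m
    return all(memory.count(medium) >= threshold for memory in memories)


if __name__ == "__main__":
    compromising = [[50] * 10 for _ in range(5)]
    fractious = compromising + [[30] * 5 + [70] * 5]
    print(is_equitable(compromising, epsilon=0.2, m=10))
    print(is_equitable(fractious, epsilon=0.2, m=10))
```

A transition-time experiment could call such a check after every iteration and stop at the first iteration for which it holds.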

Introduction of a New Decision Rule
After replicating the original scenario, we changed AEY's decision rule so that the agents demanded the portion of the pie maximizing their benefits against the most likely option taken by their opponents in previous games. In this case, an agent assumes that her opponent's option will be "the mode" of the content of her memory. An agent will choose H if L is the most frequent decision taken by her opponents in the previous matches; if the most repeated value in her memory is M, the player will choose M. If previous matches show that H is the most frequent decision taken by her opponents, she will choose L.
When the agents used this new decision rule, the chances of reaching the equitable equilibrium in the first place were considerably reduced (as López-Paredes et al. [4] concluded). Figures 4 and 5 show this comparison. To perform this simulation, all the agents were initialized with random memories (as they were in AEY's model), and we measured the percentage of experiments that first reached an equitable equilibrium versus the percentage that first reached a fractious state.
Furthermore, if we only consider the experiments that reached an equitable equilibrium, the time needed to reach it was longer than under the same conditions with AEY's original decision rule.
Figure 6 shows two simulations of our modification of AEY's model, in which the decision rule has been changed as described before. The left simplex shows an equitable equilibrium and the right simplex displays a fractious state, both after 100 iterations. The simulation was run with the same parameters as in Figures 1 and 2 (100 agents, memory length 30 and ε = 0.2). Notice how the "decision borders" have changed as a result of the introduction of the new decision rule.
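The mode-based rule just described can be sketched as follows. The names `BEST_REPLY` and `choose_demand_mode` are our own, the memory is assumed to be non-empty, and the tie-breaking between equally frequent values (the value seen first in memory wins) is an assumption, since the text does not say how ties are resolved.

```python
import random
from collections import Counter

# Best reply to the opponent's most frequent demand:
# against L demand H, against M demand M, against H demand L.
BEST_REPLY = {30: 70, 50: 50, 70: 30}


def choose_demand_mode(memory, epsilon, rng=random):
    """Reply to the mode of a non-empty memory with probability 1 - epsilon."""
    if rng.random() < epsilon:
        return rng.choice(list(BEST_REPLY))
    # Assumption: on ties, most_common() returns the value seen first.
    mode, _count = Counter(memory).most_common(1)[0]
    return BEST_REPLY[mode]


if __name__ == "__main__":
    print(choose_demand_mode([30, 30, 50], epsilon=0.0))
    print(choose_demand_mode([70, 50, 70], epsilon=0.0))
```

Unlike the expected-benefit rule, this rule ignores how often the less frequent demands occur, which is one way to see why the decision borders in the simplex change.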

Payoff Matrix Sensitivity Analysis
In AEY's model, the values of the possible demands are fixed: 30 percent of the pie for low (L), 50 percent of the pie for medium (M), and 70 percent of the pie for high (H). We have studied different combinations of the low (L) and high (H) rewards to analyze the effects on the behaviour of the system; in every case, the sum of the values of L and H is equal to 100 percent of the pie. The combinations of payoffs are shown in Table 1.
The analysis of the simulations showed that when the differences H − M and M − L are large, the transition time between the fractious state and the equitable equilibrium is longer. A comparison of the transition time for different payoff matrices is shown in Figure 7.
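A payoff matrix for any admissible low demand can be generated from the constraint that L and H sum to 100 percent of the pie. The sketch below is illustrative only: `payoff_matrix` is a hypothetical helper, and the value 20 used in the example is an arbitrary choice of ours, not a value taken from Table 1.

```python
def payoff_matrix(low, medium=50, pie=100):
    """Build the payoff matrix for demands (low, medium, pie - low).

    Returns a dict mapping (demand_a, demand_b) to the pair of rewards.
    Requires low < medium so that low < high, as in the model.
    """
    if not 0 < low < medium:
        raise ValueError("low must lie strictly between 0 and medium")
    high = pie - low
    demands = (low, medium, high)
    return {
        (a, b): (a, b) if a + b <= pie else (0, 0)
        for a in demands
        for b in demands
    }


if __name__ == "__main__":
    original = payoff_matrix(30)   # AEY's 30/50/70 matrix
    wider = payoff_matrix(20)      # hypothetical 20/50/80 variant
    print(original[(70, 30)], wider[(80, 20)], wider[(80, 50)])
```

Because H is always 100 − L, a single parameter is enough to sweep the family of matrices considered in the sensitivity analysis.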

Changing the Initial Conditions: "Progressive Memory"
In AEY's model, all the individuals in the experiment have a memory of fixed size (m) throughout all the matches. The agents are generated with m random values in their memories. Kubera et al. [7] explain that this could introduce biases in the results.
In this modification of the original model, we shall assume that the memory size of each individual grows at a rate of one unit per match, starting with a 0-size memory, until the memory size reaches AEY's fixed value (m). The memory size will not grow any further once it reaches this value.
To fix ideas, let us suppose that we have defined a memory size of 6 (m = 6). This means that each agent can remember the decisions taken by her latest six opponents. Therefore, all the agents have six memory positions. However, in the first match, their memories are empty, as they have never played against any other player before. This is the reason why, in the first match, the decision taken by each agent is random. Afterwards, all the agents store the decision taken by their opponents, as they did in AEY's model. They will use this information to take a decision in the second match, with the same criteria as in AEY's model. Then, the decision taken by their opponents will be stored in their memories once again. In the third match, each agent will have information about the two previous matches; they will take a decision based on this information and store the decision taken by their opponents, and so on. When the number of matches is higher than the memory size of each agent (m), the agents will store the decisions taken by their opponents in their memories, but will eliminate the oldest value so that the memory size stays equal to m in the following matches.
Figure 8 compares the time it takes for the system to reach the equitable equilibrium, both with and without progressive memory. If the system lacks progressive memory (AEY's original model), agents' memories are initialized with m = 12 random values. In the case of progressive memory, each agent's memory is started with one random value and grows by one element per iteration until it reaches length m = 12.
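One simple way to obtain this growing, then sliding, memory is a bounded double-ended queue. The sketch below assumes Python's `collections.deque` with `maxlen`; the helper name `new_memory` and the sequence of opponent demands are made up for illustration.

```python
from collections import deque


def new_memory(m):
    """Empty progressive memory that holds at most the last m demands."""
    return deque(maxlen=m)


if __name__ == "__main__":
    memory = new_memory(6)
    opponent_demands = [30, 70, 50, 50, 30, 70, 50, 50]  # illustrative only
    for demand in opponent_demands:
        # While fewer than m values are stored the memory grows by one;
        # afterwards each append silently discards the oldest value.
        memory.append(demand)
        print(len(memory), list(memory))
```

After the eighth match the memory holds only the six most recent demands, which matches the behaviour described for matches beyond the memory size.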
The simulation showed that just by changing the initial conditions, the results of the simulation are completely different.
First, as Figure 8 shows, the time it takes for the system to reach the equitable equilibrium is longer than in AEY's original model. Because the first decision is random, the chances of choosing L or H are twice the chances of choosing M, which makes the system approach the fractious state during the first steps of the simulation. The presence of noise in the system (ε ≠ 0) makes it possible for agents to choose M with a certain probability, which leads the system to the equitable equilibrium in the long term. Because of this transitory situation, in which the system tends to approach the fractious state during some iterations, the number of iterations until the system reaches the equitable equilibrium is higher than in AEY's model.
Secondly, notice that, in the case of progressive memory, the value assigned to ε is crucial. For low values of ε, the system tends to reach a fractious state; this outcome becomes less likely as ε grows. The presence of noise makes the agents choose M at some point of the simulation. The increasing presence of M in their memories makes the agents consider that M is a good reply: eventually, the agents learn to compromise and reach an equitable equilibrium.
Therefore, although the simulation shows that changing the initial conditions results in an increase of the time to reach the equilibrium, we conclude that initial conditions are irrelevant in the long run.

The Model with Two Agent Types (the "Tag" Model)
In a second experiment, AEY let the agents be distinguishable from one another by introducing a tag: they create two types of agents, each with a different tag (colour). The agents are capable of identifying their opponents' tag (colour) and they keep the portion of the pie demanded by their opponents in two memory sets, depending on the opponent's tag. AEY state that discrimination (segregation) can emerge spontaneously, both when the agents play with other agents of the same type (intratype matches) and when the agents play against players with a different tag (intertype matches).
To study the different cases of segregation, AEY use two simplexes: one shows the memory state of the agents when they play against agents with their same tag and the other one displays the agents' memories when they play against agents with a different tag.
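The two memory sets kept by each tagged agent can be sketched with one bounded memory per opponent tag. The class `TaggedAgent` and its method names are hypothetical; only the tags "dark" and "light" and the idea of a separate memory per opponent tag come from the text.

```python
from collections import deque

TAGS = ("dark", "light")


class TaggedAgent:
    """Agent that stores opponents' demands separately for each tag."""

    def __init__(self, tag, m):
        self.tag = tag
        # One memory of length m per opponent tag (intratype and intertype).
        self.memory = {t: deque(maxlen=m) for t in TAGS}

    def record(self, opponent, demand):
        """Store the opponent's demand in the memory set for her tag."""
        self.memory[opponent.tag].append(demand)

    def memory_for(self, opponent):
        """Memory consulted before playing against this opponent."""
        return self.memory[opponent.tag]


if __name__ == "__main__":
    a, b, c = TaggedAgent("dark", 3), TaggedAgent("light", 3), TaggedAgent("dark", 3)
    a.record(b, 70)   # intertype match
    a.record(c, 50)   # intratype match
    print(list(a.memory_for(b)), list(a.memory_for(c)))
```

The decision rule itself is unchanged; it is simply applied to whichever memory set corresponds to the current opponent's tag, which is what allows different behaviours to emerge towards each type.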

Intratype Segregation
Figure 9 shows the three scenarios that can arise when players with the same tag play among themselves (intratype matches):
(i) Equitable equilibrium (all the agents demand M independently of their tag).
(ii) Fractious state (the agents are either aggressive or passive and do not learn to compromise).
(iii) Intratype segregation (the agents with one tag reach an equitable equilibrium and the agents with the other tag reach a fractious state).
The first and the second scenarios do not show any kind of discrimination: the system reaches an equitable equilibrium or a fractious state independently of the agents' tag, as it did in AEY's model with one agent type. The third scenario is more interesting: when dark players play against dark players, they consider that M is the best response and reach an equitable equilibrium. However, when light players play among themselves, they do not learn to compromise and the system reaches a fractious state. This happens even though the decision rule is the same for both types of agents.

Intertype Segregation
In the case of intertype matches, we can appreciate the two different scenarios shown in Figure 10:
(i) Equitable equilibrium (all the agents demand M independently of their tag).
(ii) Fractious state (the agents of one colour are aggressive, choosing H, and the agents of the other colour are passive, choosing L).
Some of the experiments showed intertype discrimination. When agents with different tags are paired to play, the dark agents find that light agents have frequently demanded H. Consequently, they decide to choose L, which is the only demand that allows them to get a nonzero benefit. Conversely, after a number of iterations, the light agents have found that dark agents are likely to choose low (L). Therefore, they choose high (H), as it maximizes their benefit. This situation can be seen as a "stable fractious state", because the system stays in this state for long periods of time: all the agents with one tag are aggressive (they all choose H) and all the agents of the other tag are passive (all of them choose L).
After a series of simulations, we conclude that the chances that the system reaches a scenario different from the equitable equilibrium are very low. In fact, when we tried the same parameters that AEY used in their simulation (100 agents, memory size 20), segregation never emerged; we contacted Axtell to make sure that we were using the same decision rule that they did. We needed to reduce the number of agents and the memory length so that we could observe segregation (Figures 9 and 10).
Then, we tried changing the decision rule, so that the agents choose the best reply against the most frequent option taken by their opponents in previous matches (the mode of their memory; see Section 3.2). The simulation showed that just after changing the decision rule, segregation emerged spontaneously (much more often than when we used the original decision rule). In this case, we easily observed all the possible cases of segregation shown in AEY's model.

Distribution of the Agents in a Spatial Regular Structure
In AEY's model, the agents play in random pairs, which means that any agent can play against any other agent of the population. In this new extension, we consider a 10 × 10 toroidal surface on which 100 agents are placed. The agents are able to play against any of their eight surrounding neighbours (i.e., they can play against any other agent that belongs to their radius-1 Moore neighbourhood). As in the original model, there is the same number of dark-tagged agents and light-tagged agents.
Since the geographical position of each agent is now considered, we take into account how the tags are distributed in the grid. The effects of the presence of initial clusters have great relevance in spatial and geographical distribution issues. This is why we use three different distributions of tags, as Figure 11 shows.
When the agents were randomly paired, we obtained three different results in intratype games (Figure 9) and two different results in intertype games (Figure 10). The aim of this section is to test whether these five results can also be obtained when the agents are placed as Figure 11 depicts. To that end, the simulations are performed with the same parameters that we used in the replication of the original model. We use the "mode-based" decision rule described in Section 3.2, since it facilitates the emergence of segregation in the "tag model", as discussed in Section 4.
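The radius-1 Moore neighbourhood on a torus can be computed with modular arithmetic, so that agents on an edge of the grid wrap around to the opposite edge. This is a minimal sketch; the function name `moore_neighbours` and the (row, column) coordinate convention are our own choices.

```python
def moore_neighbours(row, col, size=10):
    """Return the eight radius-1 Moore neighbours of a cell on a torus.

    Coordinates wrap around modulo `size`, so every cell of the
    size x size lattice has exactly eight distinct neighbours
    (for size >= 3).
    """
    return [
        ((row + d_row) % size, (col + d_col) % size)
        for d_row in (-1, 0, 1)
        for d_col in (-1, 0, 1)
        if (d_row, d_col) != (0, 0)
    ]


if __name__ == "__main__":
    # A corner cell still has eight neighbours thanks to the wrap-around.
    print(moore_neighbours(0, 0))
```

In a simulation, an agent's opponent for a given match would then be drawn from the agents occupying these eight cells instead of from the whole population.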
The simulations confirmed that the same points of attraction that we got in the original model can be obtained with this new extension of the model, both in intratype and intertype games.
Nonetheless, we discovered that, with certain distributions of the tags, it is possible to get new points of attraction that did not appear when the agents were randomly paired to play. This is the case of intertype games when the tags are distributed in two zones, as shown in Figure 11(c). Figure 12 shows the four points of attraction that the system reaches in the intertype games when this distribution of the tags is used. Notice that, due to this distribution of the tags, only the agents that form the borders can play intertype games: in intertype games, each agent is paired with any of her neighbours with a different tag.
This distribution of the tags creates two borders between the dark and the light agents (notice that the lattice is a torus). Figure 12(a) shows an equitable equilibrium for both borders (regardless of the agents' tag). In Figure 12(b), all the dark agents demand high and all the light agents demand low, which leads to the emergence of segregation. Both borders reach a low-high equilibrium. Figure 12(c) shows a new case of intertype segregation. However, in this case, both types of agents demand low or high depending on which border they are on. Finally, Figure 12(d) shows another case of intertype segregation: the agents in one of the borders reach a medium-medium equilibrium and the agents in the other border reach a low-high equilibrium.
Notice that the results shown in Figures 12(a) and 12(b) also appeared in AEY's original model (they are equivalent to the points of attraction shown in Figure 10). By contrast, the results shown in Figures 12(c) and 12(d) only appeared after placing the agents on a grid and distributing the tags in two zones.
However, we conclude that, as there is no connection among the agents that form the two borders, the equilibria that they reach are independent of one another. Nevertheless, a more in-depth analysis showed that more complex equilibria can emerge when sets of agents act as borders between tags and these borders are not connected to each other.

Conclusions
In AEY's model, segregation emerges spontaneously, even though all the agents have the same behaviour rule (regardless of their tag). The recognition of the opponent's tag (which a priori need not influence the decisions, as it is an external property) makes the agents "learn" how to behave depending on whether the agent they play against is a same-tag agent or a different-tag agent.
The replication of AEY's no-tags model showed that there are two centres of attraction in the system: an equitable equilibrium, in which the agents learn to compromise; and a fractious state, in which all the agents are either aggressive or passive and no equilibrium is reached. Because of the ergodicity of the system, there is a possibility that the state of the system switches between these two regimes. We measured the transition time between the two regimes and observed that it rises as the memory size and the number of agents grow, as AEY [3] concluded. The simulation of our replication is completely in agreement with their results.
The modification of AEY's no-tags model showed interesting results. We conclude that simple changes within the original model (using the mode instead of the mean to take a decision) provoke dramatic changes in the studied system. In fact, when we introduced this new decision rule, the chances of reaching an equitable equilibrium in the first place were considerably lower than in AEY's original model.
Moreover, changing the original payoff matrix resulted in a considerable modification in the transition time: the higher the reward assigned to low, the longer it took for the system to reach the equitable equilibrium.
Initializing the agents with a progressive memory instead of using AEY's fixed-size memory showed an interesting scenario: at first, agents tend to be aggressive or passive, but after a number of iterations, they learn to compromise. This makes the system reach an equitable equilibrium in the long run. Therefore, the agents' fractious behaviour in the first stages of the simulation results in an increase of the transition time in comparison with AEY's original model. However, we observed that changing the initial conditions does not affect the system in the longer term.
After replicating the tag model, we conclude that our results are in accordance with AEY's original work. Additionally, we observed that the chances that segregation emerges were very low when we used the original decision rule. After replacing the original decision rule with the mode-based decision rule, segregation emerged much more often. Placing the agents on a regular spatial structure showed that the system could reach the same points of attraction as in the original model, although no geographical constraints were considered in AEY's original model. We are currently considering different distributions of the tags in the grid, which make possible the emergence of new equilibria that did not appear in the original model. In future research we will consider different social network topologies to study how these equilibria can be affected by the new topologies.

Payoff Matrix
Using mathematical notation, the payoff matrix shown in Table 1 can be explained as follows:
$S_i$: space of possible strategies of agent $i$ ($i = 1, \ldots, n$).
$j$: possible strategy ⇒ $j \in \{L, M, H\}$, with $M = 50$, $H = 100 - L$, $L < H$ (L: select Low, M: select Medium, H: select High).
$(v_1, v_2, \ldots, v_m)_i$: memory array of agent $i$, which stores the strategies $v_k \in \{L, M, H\}$ chosen by the opponents in the $m$ previous rounds.
$(A, B)$: pair of agents randomly matched ($n/2$ random pairs per round).
If agent A chooses strategy $i \in S_A$ and agent B chooses strategy $j \in S_B$, they will receive $(i, j)$ if $i + j \le 100$, and $(0, 0)$ if $i + j > 100$ (see Table 1, Combination of payoffs).

Decision Rule
The decision rule used in AEY's model (Section 3.1) is explained with mathematical notation below:
$n_j^A$: number of positions with value $j \in \{L, M, H\}$ in the memory array of agent A, $(v_1, v_2, \ldots, v_m)_A$.
$\Pr_A(B_j) = n_j^A / m$: probability estimated by agent A that the opponent B selects strategy $j$ (equivalent to the relative frequency of value $j$ in the memory array of agent A).
The utility function for agent A when she selects the strategy $i \in S_A = \{L, M, H\}$ is:

$$EU(A, i) = \sum_{j \in \{L, M, H\}} i \cdot \Pr\nolimits_A(B_j) \cdot V(i, j), \quad i \in S_A; \qquad V(i, j) = 1 \text{ if } i + j \le 100; \quad V(i, j) = 0 \text{ if } i + j > 100. \tag{A.1}$$

Then, each agent A selects, with probability $1 - \varepsilon$, the strategy $i$ that maximizes her utility function: A selects $i \in S_A = \{L, M, H\}$ such that $EU(A, i) = \max_{k \in S_A} EU(A, k)$, and selects a random strategy $i \in S_A$ with probability $\varepsilon$.

Decision Rule
Using mathematical notation, the mode-based decision rule used in Section 3.2 is explained below. Each agent A selects, with probability $1 - \varepsilon$, her strategy $i$ according to the statistical mode (Mo) of her memory array as follows:
If $\mathrm{Mo}(v_1, v_2, \ldots, v_m)_A = L$ ⇒ A selects strategy $i = H$.
If $\mathrm{Mo}(v_1, v_2, \ldots, v_m)_A = M$ ⇒ A selects strategy $i = M$.
If $\mathrm{Mo}(v_1, v_2, \ldots, v_m)_A = H$ ⇒ A selects strategy $i = L$.
She selects a random strategy $i \in S_A$ with probability $\varepsilon$.

Figure 3: Replication of AEY's model. Transition time as a function of the memory length (m); n = 10; various ε (uncertainty factor).

Figure 4: Percentage of experiments that reached a fractious state as the first centre of attraction. Uncertainty parameter ε = 0.2. Original decision rule.

Figure 5: Percentage of experiments that reached an equitable equilibrium as the first centre of attraction. Uncertainty parameter ε = 0.2. New decision rule.

Figure 7: Number of iterations to equitable equilibrium as a function of L (lowest payoff) and n (number of agents); uncertainty parameter ε = 0.1 and memory length m = 10.

Figure 8: Comparison of AEY's model with and without progressive memory. Number of iterations to equitable equilibrium. Uncertainty parameter ε = 0.1. Memory length m = 12.

Figure 11: Distribution of the agents in a spatial regular structure. n = 100 agents (50 of each type). Notice that the lattice is a torus.

Figure 12: Four possible results for intertype matches when the tags are distributed in two zones. n = 100 agents (50 of each type). Notice that the lattice is a torus.

Table 1: Possible payoff matrices (combination of demands).