Punishment and Feedback Mechanism for the Evolution Game on Small-World Network Based on Varying Topology

We address the problem of the punishment and feedback mechanism for the evolution game on a small-world network with varying topology. Based on the strategy updating rule, we propose a new punishment and feedback mechanism: all individuals in the network first play an n-round Prisoner’s Dilemma Game (PDG); then the neighbors of the individuals who defect most often punish them by breaking the links to them and setting up new links among themselves. The mechanism makes the degree of the whole network decrease. We find that the mechanism helps the cooperators survive and avoid being wiped out by the defectors. With the mechanism adopted, the number of rounds n of the PDG has almost no effect on the evolution game. Furthermore, the strategy-updating probability, the average connectivity ⟨k⟩, and the scale of the network are related to the result of the evolution game.


Introduction
Game-theoretic phenomena are widespread in our world, ranging from nature to human society. Cooperation is one of the most important properties of game theory [1,2]. In nature, two different kinds of animals work together for mutual benefit. In human society, cooperation is ubiquitous in many aspects such as politics and economy. Why can cooperation emerge in so many fields, and why is it selected by nature? These questions have been studied for decades.
Recently, cooperative behavior has been studied through the evolution game, which provides a very efficient framework for the emergence and maintenance of cooperation between self-interested individuals and the social well-being. A vertex in the evolution game represents an individual, and an edge represents the interaction between two players. There are two main models in the evolution game. One is the Public Goods Game (PGG), which can be used to study the problem of cooperation in games [3]. Wang et al. [4] studied the evolutionary dynamics of the PGG in finite populations. With these dynamics, players who contributed much could successfully resist invasion by the defectors. Moreover, they could attract more players to join their group. This work helps us understand cooperative behaviors regarding contributions in the real world. Furthermore, Wang et al. [5] studied the wealth distribution among the rich and the poor in the PGG to analyze cooperation among individuals under collective risk. The other model is the Prisoner's Dilemma Game (PDG), which has been used to resolve the contradiction between self-interest and social benefit [6]. The study of the PDG can be divided into two parts. One is the evolution strategy rule that benefits the emergence of cooperation, such as the "Best-Take-Over" rule, in which an individual imitates the strategy of the neighbor whose payoff is the largest [7]. Fu et al. [8] presented a punishment strategy with high heterogeneity that could make the cooperators survive and wipe out the defectors. The other part is the study of the structure of complex networks [9][10][11][12][13][14][15][16][17][18][19][20][21][22], such as lattice, small-world, scale-free, and community structures. Nowak and May found that the spatial structure could help the cooperators resist the defectors' invasion on a lattice network [23]. Tomassini et al. [24] used two kinds of complex network models, regular lattices and random graphs, to investigate the Hawk-Dove game and found that the fraction of cooperators in the network was related to the gain-to-cost ratio. Fu et al. [25] found that in a small-world network the underlying topological organization could help enhance and sustain cooperative behaviors. Santos found that the scale-free network could help cooperators emerge [26]. Perc and Szolnoki [27] found that the distribution of wealth and social status could promote cooperation in the evolution game. Ishibuchi and Namikawa [28] studied the evolution strategies of the iterated PDG, in which each player is located in a cell of a grid world. They found that the spatial structure could promote cooperation under random pairing.
In this paper, we pay attention to the evolution strategy rule. Based on the strategy updating rule, we propose a new punishment and feedback mechanism under which the degree of the network decreases as the evolution game proceeds. In this mechanism, each individual in the small-world network first plays an n-round PDG with their neighbors. Then, according to the payoff they gain and their self-interest, individuals who are not satisfied with their payoff punish the neighbor who chose defection most often by breaking the link to that neighbor. Meanwhile, these individuals build new interaction links among themselves.
The remainder of this paper is organized as follows. Section 2 introduces the model and the punishment strategy rule. Section 3 gives the simulation results. Section 4 concludes the paper.

The Model and the Punishment and Feedback Mechanism of the Evolution Game
There are two strategies, cooperation (C) and defection (D), in the PDG. According to the selected strategies, the two players receive their payoffs. If one player chooses C and the other chooses D, the individual who chooses C gets the lowest payoff S, while the individual choosing D gains the highest payoff T. If the two players both choose C or both choose D, they gain the payoff R or P, respectively, with T > R > P > S and 2R > T + S. We use a 2 × 2 matrix to denote the possible strategies and payoffs as follows:

$$M = \begin{pmatrix} R & S \\ T & P \end{pmatrix}. \qquad (1)$$

In the following, we use the limit payoff matrix introduced by Nowak and May [29]:

$$M = \begin{pmatrix} 1 & 0 \\ b & 0 \end{pmatrix}. \qquad (2)$$

Every individual in model (2) interacts with their neighbors. Since there are two strategies in the PDG, we denote $s_x = (1, 0)^{\top}$ if individual $x$ chooses the strategy C and $s_x = (0, 1)^{\top}$ for D. In each round, the individual keeps their strategy unchanged while interacting with all of their neighbors. At the end of each round, the individual accumulates the payoff obtained from their neighbors. The payoff of the player at site $x$ against their neighbor at site $y$ can be described as follows [8]:

$$P_{xy} = s_x^{\top} M s_y, \qquad (3)$$

where the total payoff $P_x$ of the player at site $x$ is the sum of the payoffs over all their neighbors [8]. In the next round, the player can adopt a neighbor's strategy with a certain probability. The probability that the player at site $x$ adopts the strategy of their neighbor at site $y$ depends on the payoff difference [30]:

$$W(s_x \leftarrow s_y) = \frac{1}{1 + \exp[\beta (P_x - P_y)]}, \qquad (4)$$

where $\beta$ characterizes the noise, which permits irrational choices.
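As an illustration of the payoff accumulation and the strategy-adoption probability described above, the following Python sketch evaluates both on a three-player example. It assumes the limit payoff matrix in which mutual cooperation pays 1, a defector earns b against a cooperator, and all other outcomes pay 0, together with a Fermi-type adoption rule. The names `payoff_matrix`, `pair_payoff`, `total_payoff`, `adoption_probability`, `b`, and `beta` are illustrative choices and are not taken from the original simulation code.

```python
import math

C = (1, 0)  # cooperation written as a unit vector
D = (0, 1)  # defection written as a unit vector


def payoff_matrix(b):
    """Limit payoff matrix: rows are own strategy (C, D), columns the neighbour's."""
    return [[1.0, 0.0],
            [b, 0.0]]


def pair_payoff(s_x, s_y, M):
    """Payoff s_x^T M s_y of the player at x against the neighbour at y."""
    return sum(s_x[i] * M[i][j] * s_y[j] for i in range(2) for j in range(2))


def total_payoff(x, strategies, neighbours, M):
    """Payoff of x accumulated over all of its neighbours in one round."""
    return sum(pair_payoff(strategies[x], strategies[y], M) for y in neighbours[x])


def adoption_probability(P_x, P_y, beta):
    """Fermi-type probability that x adopts the strategy of y."""
    return 1.0 / (1.0 + math.exp(beta * (P_x - P_y)))


# Hypothetical example: player 0 cooperates and is linked to a defector (1)
# and a cooperator (2).
M = payoff_matrix(1.3)
strategies = {0: C, 1: D, 2: C}
neighbours = {0: [1, 2], 1: [0], 2: [0]}
P0 = total_payoff(0, strategies, neighbours, M)
P1 = total_payoff(1, strategies, neighbours, M)
print(P0, P1, adoption_probability(P0, P1, beta=0.01))
```

Because the adoption probability depends only on the payoff difference, equal payoffs give a probability of exactly 0.5, and a better-off neighbour is imitated with probability above 0.5.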
In our model, the total number of players is N. Assume that every individual first plays an n-round PDG with their neighbors (n ≫ 1). After that, a number of individuals, much larger than 1 but much smaller than N, are chosen at random and permitted to adjust their links to their neighbors according to the payoff they gained. If an individual is satisfied with their income, they keep the link. Otherwise, they can break the link and seek a new partner. Here, we focus on the defector. A defector gets more benefit when their neighbors prefer to cooperate in the n-round PDG, which means that the defector is responsible for the lower payoff of the cooperator. Because of self-interest, the cooperator cannot tolerate the defector, so they break the link to the defector and seek another partner. Furthermore, over the n-round PDG, an individual with memory can record the strategies selected by their neighbors. Based on the above rule, our punishment mechanism is as follows. The individuals first play an n-round PDG with their neighbors. After that, according to the payoff they gain and their self-interest, individuals who are not satisfied with their payoff punish the neighbor who defected most of the time by breaking the link with them. Two of the neighbors of the defector break their links to it. At the same time, if these two individuals were not adjacent before, they build a new interaction link between themselves. In [8], the authors explained that a friend's friend could benefit from becoming a new link. Moreover, the two individuals who punish the defector prefer to cooperate because of the payoff they gained in the n-round PDG. When the two individuals interact, cooperation can emerge between them in the next n-round PDG. This punishment and feedback mechanism corresponds more closely to the real world.
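The link adjustment at the heart of the mechanism can be sketched as follows. This is a minimal sketch under stated assumptions: the network is stored as a dictionary mapping each node to the set of its neighbours, and the function name `punish_and_rewire` and helper `edge_count` are hypothetical. Two neighbours of the defector drop their links to it and, if they were not adjacent before, connect to each other, so each punishment removes two edges and adds at most one.

```python
def punish_and_rewire(adj, defector, a, b):
    """Neighbours a and b cut their links to `defector` and link to each other.

    adj maps every node to the set of its neighbours (undirected graph) and
    is modified in place. Returns the change in the number of edges.
    """
    if a == b or a not in adj[defector] or b not in adj[defector]:
        raise ValueError("a and b must be two distinct neighbours of the defector")
    # Both punishers break their interaction with the defector.
    for v in (a, b):
        adj[defector].discard(v)
        adj[v].discard(defector)
    # They build a new link only if they were not neighbours before.
    added = 0
    if b not in adj[a]:
        adj[a].add(b)
        adj[b].add(a)
        added = 1
    return added - 2


def edge_count(adj):
    """Number of undirected edges in the network."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2


# Hypothetical example: node 0 is the defector, nodes 1 and 2 punish it.
example = {0: {1, 2, 3}, 1: {0}, 2: {0, 3}, 3: {0, 2}}
print(edge_count(example), punish_and_rewire(example, 0, 1, 2), edge_count(example))
```

Since the return value is always -1 or -2, repeated punishment can only lower the total degree of the network, which matches the decreasing-degree property stated for the mechanism.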

Main Results and Simulations
We consider the evolution game on a small-world network with N individuals in total. Each vertex represents one individual, and if two individuals interact, there is an undirected edge between them. For the initial network, we set the average connectivity ⟨k⟩ = 4 and the rewiring probability to 0.6. An example of our small-world network model is given in Figure 1.
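One way to build such an initial network is sketched below. The text does not state which small-world construction is used, so this sketch assumes a Watts-Strogatz style procedure: a ring lattice in which every node is linked to its k nearest neighbours, followed by rewiring each lattice edge with probability p. The function name `small_world` and its parameters are illustrative.

```python
import random


def small_world(N, k, p, seed=None):
    """Ring lattice of N nodes with k nearest neighbours (k even), each edge
    rewired with probability p. Returns a dict node -> set of neighbours."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(N)}
    # Regular ring lattice: link every node to k/2 neighbours on each side.
    for v in range(N):
        for d in range(1, k // 2 + 1):
            u = (v + d) % N
            adj[v].add(u)
            adj[u].add(v)
    # Rewire: replace edge (v, u) by (v, w) for a random non-neighbour w.
    for d in range(1, k // 2 + 1):
        for v in range(N):
            u = (v + d) % N
            if u in adj[v] and rng.random() < p:
                candidates = [w for w in range(N) if w != v and w not in adj[v]]
                if not candidates:
                    continue
                w = rng.choice(candidates)
                adj[v].discard(u)
                adj[u].discard(v)
                adj[v].add(w)
                adj[w].add(v)
    return adj


# Initial network with average connectivity 4 and rewiring probability 0.6;
# the size 200 is an illustrative choice.
network = small_world(200, 4, 0.6, seed=1)
print(sum(len(nbrs) for nbrs in network.values()) / len(network))
```

Rewiring moves edges without creating or deleting any, so the average connectivity of the initial network stays exactly at k.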
The steps of the evolution game are implemented as follows.
(1) Every individual on the network first plays an n-round PDG with their neighbors. At the end of each round, the individual is allowed to change their strategy according to formula (4). Furthermore, the individual records the number of times each neighbor defects.
(2) Choose a subset of individuals at random from the total N individuals. According to their payoff and the recorded defection counts, the chosen individuals are permitted to adjust their interaction links. If a chosen individual is satisfied with their own payoff and the behavior of their neighbors, they keep the interaction links. Otherwise, for the individual who chose strategy D most of the time, two of their neighbors break the interaction with them and then build an interaction between themselves.
The situation is shown in Figure 2.
(3) Repeat the two aforementioned steps for 400 time steps.
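The three steps above can be sketched as a single loop. This is only a sketch under several assumptions that the text leaves open: a chosen individual counts as dissatisfied when its most-defecting neighbour defected in more than half of the n rounds; the two punishers are the two poorest neighbours of that defector; an isolated individual defects until the end; and the default parameter values are illustrative. The function name `simulate` and all argument names are hypothetical.

```python
import math
import random


def simulate(adj, b=1.3, beta=0.01, n=5, m=8, steps=400, seed=0):
    """Return (fraction of cooperators per time step, final network)."""
    rng = random.Random(seed)
    adj = {v: set(nb) for v, nb in adj.items()}  # work on a copy
    nodes = sorted(adj)
    # Equal shares of cooperators (True) and defectors (False), placed at random.
    coop = {v: i % 2 == 0 for i, v in enumerate(rng.sample(nodes, len(nodes)))}
    history = []
    for _ in range(steps):
        defections = dict.fromkeys(nodes, 0)
        earned = dict.fromkeys(nodes, 0.0)
        # Step (1): n rounds of the PDG, with a strategy update after each round.
        for _ in range(n):
            # Only cooperating neighbours pay: 1 to a cooperator, b to a defector.
            payoff = {v: (1.0 if coop[v] else b) * sum(coop[u] for u in adj[v])
                      for v in nodes}
            for v in nodes:
                earned[v] += payoff[v]
                defections[v] += not coop[v]
            updated = dict(coop)
            for v in nodes:
                if not adj[v]:
                    updated[v] = False  # isolated individuals keep defecting
                    continue
                u = rng.choice(sorted(adj[v]))
                w = 1.0 / (1.0 + math.exp(beta * (payoff[v] - payoff[u])))
                if rng.random() < w:
                    updated[v] = coop[u]
            coop = updated
        # Step (2): m randomly chosen individuals may trigger a punishment.
        for v in rng.sample(nodes, min(m, len(nodes))):
            if not adj[v]:
                continue
            worst = max(sorted(adj[v]), key=lambda u: defections[u])
            if 2 * defections[worst] <= n or len(adj[worst]) < 2:
                continue  # v is satisfied, or there are too few punishers
            p, q = sorted(adj[worst], key=lambda u: (earned[u], u))[:2]
            for u in (p, q):
                adj[worst].discard(u)
                adj[u].discard(worst)
            adj[p].add(q)
            adj[q].add(p)
        # Step (3): record the state and repeat.
        history.append(sum(coop.values()) / len(nodes))
    return history, adj
```

The function accepts any undirected network given as a dictionary of neighbour sets, so it can be run on a ring lattice or on a small-world network; averaging `history` over independent seeds gives curves comparable in spirit to the reported figures, although the exact values depend on the assumptions listed above.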
In the initial network, the cooperators and defectors are distributed randomly, and their percentages are equal. We run 50 independent simulations and use the average over these 50 runs for Figures 5-9. If the punishment and feedback mechanism is adopted, there are two ways of breaking the interaction links in our model. One is to choose two neighbors at random from the neighbors of the individual who defects most of the time. The other is to choose the two neighbors who gained the least payoff in the n-round PDG. We compare the two situations in Figure 3. From Figure 3, we can see that when the two neighbors are chosen by the payoff they gained, the fraction of cooperators in the network is larger. As analyzed above, the two neighbors of the most hostile individual, if they gained less payoff, prefer to cooperate with others. When they break the interaction with the defector and set up an interaction with each other, cooperation can emerge between them easily. This situation is closer to the real world; for example, the friends of a friend cooperate more easily in real life. Therefore, in the following we adopt the punishment and feedback mechanism in which the neighbors are chosen by payoff. If the mechanism is adopted, the degree of the whole network decreases. An extreme example is shown in Figure 4. Suppose that in one n-round PDG an individual is the most frequent defector and its neighbors choose to break the interaction. In the next n-round PDG, the same individual is again the most frequent defector, and its only two remaining neighbors also choose to break the interaction. Then this individual has no interaction with others. In this situation, because defection is the only Nash equilibrium of the game, we let this individual choose strategy D until the end of the evolution game.
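The two ways of selecting the punishing neighbours compared in Figure 3 can be sketched as one function with a switch. The function name `choose_punishers`, the flag `by_payoff`, and the data layout (a dictionary of neighbour sets and a dictionary of accumulated payoffs) are assumptions made for illustration.

```python
import random


def choose_punishers(adj, earned, defector, by_payoff=True, rng=random):
    """Pick the two neighbours of `defector` that will break their links.

    by_payoff=True  -> the two neighbours with the least accumulated payoff
    by_payoff=False -> two neighbours drawn uniformly at random
    Returns None if the defector has fewer than two neighbours.
    """
    candidates = sorted(adj[defector])
    if len(candidates) < 2:
        return None
    if by_payoff:
        # Stable sort: ties in payoff are broken by node label.
        return tuple(sorted(candidates, key=lambda u: earned[u])[:2])
    return tuple(rng.sample(candidates, 2))


# Hypothetical example: node 0 defected most often; node 2 earned the least.
adj = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
earned = {0: 9.1, 1: 5.0, 2: 1.0, 3: 2.0}
print(choose_punishers(adj, earned, 0))                   # payoff-based choice
print(choose_punishers(adj, earned, 0, by_payoff=False))  # random choice
```

Returning `None` for a defector with fewer than two neighbours covers the extreme case in which repeated punishment has already isolated the individual.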
First, we report the effect of the parameter b in the limit payoff matrix in Figure 5. As b increases, the fraction of cooperators in the network decreases. As we know, the defectors gain more payoff when b increases, which causes the fraction of defectors in the network to increase as well. With the punishment and feedback mechanism adopted, the fraction of cooperators in the network oscillates within a certain range. This means that the mechanism keeps the number of cooperators fluctuating within some range and prevents the defectors from invading them completely. It is similar to soliton disturbances in physics.
Next, we study the effect of the number of rounds n on the evolution game. In Figure 6, we can see that when n increases with the other parameters fixed, the average fraction of cooperators in the network hardly changes. This means that the decrease of the degree induced by the punishment and feedback mechanism can reduce the negative effect of the number of rounds n, which is different from [8].
Furthermore, we consider the effect of the probability on the evolution game. The main parameter β, which stands for the intensity of selection in (4), is discussed in Figure 7. From Figure 7, we can see that the fraction of cooperators decreases as β increases. When β increases, the probability of changing strategy also increases. Because the defectors can get the largest payoff and the change of strategy depends on the payoff that the individual gains, this helps individuals switch from strategy C to strategy D.
Finally, we investigate the effect of the average connectivity ⟨k⟩ and the scale of the network N. From Figure 8, we find that as ⟨k⟩ increases, the fraction of cooperators in the network decreases correspondingly. The average connectivity ⟨k⟩ describes how many others an individual interacts with, whereas the punishment and feedback mechanism takes effect on only one individual, the most frequent defector. When ⟨k⟩ increases, the effect of the mechanism diminishes and the properties of the PDG play an important role in the evolution game. Therefore, the defectors can invade the cooperators and the fraction of cooperators decreases correspondingly. When the scale of the network N increases, the effect of the mechanism also weakens because the ratio of the number of selected individuals to N drops, which implies that the more complex the network is, the weaker the effect of the mechanism is. This situation coincides with real life. As the old Chinese saying goes: one boy is a boy, two boys half a boy, three boys no boy.

Conclusion
In this paper, we have discussed the problem of the punishment and feedback mechanism on a small-world network with changing topology. We have proposed a new punishment and feedback mechanism in which all individuals in the network first play an n-round PDG; then the neighbors of the most frequent defectors punish them by breaking the links to them and setting up links among themselves. The mechanism causes the degree of the whole network to decrease. We have found that the mechanism helps the cooperators survive and avoid being wiped out by the defectors. The number of rounds n of the PDG has almost no effect on the evolution game. Furthermore, when the probability decreases, the fraction of cooperators in the network decreases. When the average connectivity ⟨k⟩ and the scale of the network increase, the fraction of cooperators decreases. The results show that as the size and complexity of the network increase, the effect of the punishment and feedback mechanism is weakened.

Figure 2: The punishment and feedback mechanism.

Figure 3: The fraction of cooperators for the two different ways of choosing the interaction links in the evolution game. The blue line represents the choosing situation and the red line stands for the unchosen situation with  = 200,  = 8,  = 5,  = 1.3, ⟨k⟩ = 4, and  = 0.01.

Figure 4: An extreme example of the punishment and feedback mechanism.