Punishment Effect of Prisoner Dilemma Game Based on a New Evolution Strategy Rule

Dehui Sun, Xiaoliang Kou, and Wei Zhang

Key Laboratory of Beijing for Field-Bus Technology & Automation, North China University of Technology, Beijing 100144, China (ncut.edu.cn)

Research Article. Mathematical Problems in Engineering (MPE), Hindawi Publishing Corporation, 2014, Article ID 108024. ISSN 1563-5147, 1024-123X. DOI: 10.1155/2014/108024

Dates: 11 March 2014; 6 April 2014; 27 April 2014

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

We discuss the effect of punishment in the prisoner’s dilemma game and propose a new evolution strategy rule that reflects the external factors acting on both players in the evolution game. In general, when the punishment exists, the D-D (defection-defection) structure (i.e., both players choose strategy D), which is the Nash equilibrium of the game, remains stable and never lets cooperation emerge. However, simulations show that, under the new evolution strategy rule, the D-D structure cannot remain stable and its fraction decreases during the game. In fact, the punishment mainly affects the C-D (cooperation-defection) structure in the network. After the fraction of the C-D structure reaches a certain level, the punishment keeps the C-D structure stable and prevents it from transforming into the C-C (cooperation-cooperation) structure. Moreover, considering both the stability of the structures and the payoffs that individuals gain, we find that the strategy-change probability, which depends on the payoff, can affect the result of the evolution game.

1. Introduction

Game theory is ubiquitous in nature and society [1–23], for example, in the invasion of alien species and in trade conflicts between two countries. However, how to resolve the contradiction between the selfish individual and social wellbeing, and thereby maximize the benefit for the whole society, has puzzled scientists for decades. There are two classic models in game theory: the public goods game (PGG) and the prisoner’s dilemma game (PDG). The PGG can be used to study cooperation in games [3]. Wang et al. [5] studied the evolutionary dynamics of the PGG in finite populations. Under these dynamics, players who contributed more could successfully resist invasion and invade others, which helps us understand cooperative behavior with respect to contributions in the real world. Furthermore, Wang et al. [6] considered the effect of wealth distribution in the PGG under collective risk to analyze cooperation among rich and poor individuals. On the other hand, the PDG has been used to study how to eliminate the dilemma between the individual and society [8]. Nowak and May [9] found that spatial structure could help cooperators resist invasion by defectors, which inaugurated a new field, complex networks, for studying game theory. In such a network, the vertices represent the individuals and the edges represent the interactions among the players. Tomassini et al. [11] used two kinds of complex network models, regular lattices and random graphs, to study the Hawk-Dove game and found that the fraction of cooperators in the network was related to the gain-to-cost ratio. Heterogeneity, one of the most important properties of complex networks, plays a very important role in the evolution game. Fu et al. [12] found that, in small-world networks, the underlying network topological organization could help in enhancing and sustaining cooperative behaviors. Furthermore, Fu et al. [13] presented a punishment strategy with the property of high heterogeneity, which could make the cooperators survive and wipe out the defectors. Perc and Szolnoki [15] found that the distribution of wealth and social status could promote cooperation in the evolution game. Ishibuchi and Namikawa [16] studied the evolution of strategies for the iterated PDG, in which each player was located in a cell of a grid world, and found that the structure could help promote cooperation under random pairing. Roca et al. [18] discussed the effect of spatial structure on the evolution of cooperation. Besides these results, they offered some new insights, such as the relation between the intensities of selection.

In this paper, we focus on the effect of the defection payoff on the evolution game. To simplify the PDG, most of the literature adopts the limit PDG model, in which the punishment is zero. Here, we take the punishment into account in the evolution game. The remainder of this paper is organized as follows. Section 2 gives the model and the strategy rule. Section 3 presents the simulation results and their explanations. Section 4 concludes the paper.

2. The Model and the Strategy Rule of the Evolution Game

There are only two strategies in the PDG, C (cooperation) and D (defection), and the two players receive payoffs according to the strategies they select. If one player chooses C and the other chooses D, the individual choosing C gets the lowest payoff S, while the individual choosing D gains the highest payoff T. If both players choose C, each gains the payoff R; if both choose D, each gains the payoff P (the punishment). The payoffs satisfy T>R>P>S and 2R>T+S. We use a 2×2 matrix to denote the possible strategies and payoffs:

$$\begin{pmatrix} R & S \\ T & P \end{pmatrix}. \tag{1}$$

To study the game concisely, researchers usually take the limit payoff matrix, that is, they let the element P be zero, as introduced by Nowak and May [19]. Here, we use the normal payoff matrix

$$A = \begin{pmatrix} 1 & 0 \\ b & p \end{pmatrix}, \quad 1<b<2, \quad 0<p<1. \tag{2}$$

Now, we give the strategy rule for the evolution game on the lattice network: if an individual’s payoff is larger than its neighbor’s, it keeps its strategy; otherwise, it randomly imitates the strategy of one of the other individuals with which that neighbor interacts. This evolution rule can reflect the external effect on the individuals in the game, as shown in Figure 1. If individual E’s payoff is larger than F’s, E keeps its strategy unchanged. Otherwise, E randomly imitates the strategy of H, I, or G. Since E does not interact with H, I, or G, we can regard them as the environment of E. In the evolution game, the probability that individual x adopts the strategy of individual y depends on the payoff difference [13]:

$$W[s_x \leftarrow s_y] = \frac{1}{1+\exp[(U_x-U_y)/\beta]}, \tag{3}$$

where U_x and U_y are the payoffs of x and y and β characterizes the noise that permits irrational choices. In the iterated PDG, both players gradually come to choose strategy D, so (D, D) is the only Nash equilibrium of the PDG. Researchers usually set the punishment payoff to zero to simplify the evolution game. However, in the real world, the punishment payoff is not always zero. When the external effect becomes an important factor in the evolution game, the effect of punishment cannot be ignored. Moreover, the punishment does not always affect the evolution game negatively.
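
As a minimal sketch of matrix (2) and formula (3), the following Python helpers build the payoff matrix and evaluate the strategy-adoption probability. The function names `payoff_matrix` and `adoption_probability` are hypothetical, the index convention (0 for C, 1 for D) is an assumption, and p = 0 is accepted only so that the limit model can be compared with the normal one.

```python
import math


def payoff_matrix(b, p):
    """Matrix (2): rows are the player's strategy, columns the opponent's.

    Index 0 stands for C and index 1 for D (an assumed convention).
    """
    if not (1 < b < 2) or not (0 <= p < 1):
        raise ValueError("expected 1 < b < 2 and 0 <= p < 1")
    return [[1.0, 0.0], [float(b), float(p)]]


def adoption_probability(u_x, u_y, beta):
    """Formula (3): W = 1 / (1 + exp((u_x - u_y) / beta)).

    The two branches are algebraically identical; splitting on the sign
    avoids an overflow in math.exp when beta is very small.
    """
    z = (u_x - u_y) / beta
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


A = payoff_matrix(b=1.35, p=0.3)
print(A)
print(adoption_probability(0.0, 0.3, beta=0.1))
```

The sign split matters for values such as β=0.0015, where a direct call to `math.exp` on a large positive argument would raise `OverflowError`.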

Figure 1

The imitating strategy rule.
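
The imitating rule of Figure 1 can be sketched in Python as follows. This is one possible reading, not the authors' implementation: it assumes a square lattice with four nearest neighbors and periodic boundaries, takes H, I, and G to be the neighbors of F other than E, and applies formula (3) to the payoffs of E and the randomly chosen model. All function names are hypothetical.

```python
import math
import random


def adoption_probability(u_x, u_y, beta):
    """Formula (3), evaluated without overflow for small beta."""
    z = (u_x - u_y) / beta
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def other_neighbors(f, e, size):
    """Lattice neighbors of site f except e (H, I, G in Figure 1).

    Assumes four nearest neighbors and periodic boundaries.
    """
    r, c = f
    four = [((r - 1) % size, c), ((r + 1) % size, c),
            (r, (c - 1) % size), (r, (c + 1) % size)]
    return [site for site in four if site != e]


def next_strategy(strategy, payoff, e, f, size, beta, rng):
    """Return the next strategy of E after comparing payoffs with neighbor F."""
    if payoff[e] > payoff[f]:
        return strategy[e]  # E earns more than F: keep the current strategy
    model = rng.choice(other_neighbors(f, e, size))  # random pick among H, I, G
    if rng.random() < adoption_probability(payoff[e], payoff[model], beta):
        return strategy[model]
    return strategy[e]


size = 5
sites = [(r, c) for r in range(size) for c in range(size)]
strategy = {site: "D" for site in sites}
payoff = {site: 1.2 for site in sites}
print(next_strategy(strategy, payoff, (2, 2), (2, 3), size, 0.1, random.Random(0)))
```

Passing the random generator explicitly keeps each run reproducible, which is convenient when averaging over many independent simulations.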

3. Main Result and Simulations

In this section, we present simulations on a lattice network of size N=50×50. The final time step is T=400, and all individuals select strategy D in the initial network. We run 100 independent simulations and average the data over them for Figures 2–12. From the model, it is easy to see that b in the payoff matrix can affect the result of the evolution game. Therefore, we discuss the two cases 1<b<1.5 and 1.5<b<2 separately. First, we take b=1.35, p=0.3, and β=0.1. For b=1.35, under the evolution strategy rule, the game evolves to an equilibrium state in which cooperators and defectors alternate on the lattice. With the punishment element p present, an individual would not ordinarily choose strategy C: because the others choose strategy D, the one selecting C gains no payoff, whereas two players who both select D still get payoffs. In the PDG, both players choosing strategy D is precisely the dilemma of the game. Generally, the punishment helps the D-D structure remain stable. From Figure 2, compared with p=0, cooperators appear more slowly in the early stage of the game for p=0.3. As the game goes on, however, the fraction of cooperators for p=0.3 becomes larger than that for p=0. Common sense suggests that the result for p=0 should be better than that for p=0.3, so why does this unusual situation happen? If one individual on the lattice selects strategy C, its neighbors get the maximum payoff. According to the evolution rule, the individuals connected to these neighbors, whose payoffs are smaller than the neighbors’, may then select strategy C. In addition, the probability of changing strategy depends on the individual’s payoff. Owing to the punishment element p, the probability of strategy D transforming into strategy C is large according to formula (3). At the same time, the probability of strategy C changing to strategy D is correspondingly smaller. So, the C-D structure formed for p=0.3 is more stable than that for p=0, which is why the percentage of cooperators for p=0.3 is higher than that for p=0 at the end of the game. We thus find that, under the evolution strategy rule, the D-D structure cannot stop cooperation from emerging in the network.

For a fixed b, different values of p also affect the evolution game. Here, let b=1.35, β=0.1, p1=0.3, and p2=0.5. From Figure 3, similarly to the above analysis, for p2=0.5 the fraction of cooperators increases more slowly and eventually gains more than for p1=0.3. Therefore, the punishment does not always have a negative effect. Under some particular evolution rules, the punishment can help promote cooperation.
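
The payoff argument above can be illustrated with a small sketch that places one cooperator in an all-D lattice and sums each individual's payoffs from matrix (2). The accumulation over four nearest neighbors, the periodic boundaries, the absence of self-interaction, and the 5×5 size are all assumptions made for illustration; `lattice_payoffs` is a hypothetical name.

```python
def lattice_payoffs(strategy, b, p):
    """Total payoff of every site against its four nearest neighbors.

    Assumes periodic boundaries and payoffs summed over the four games.
    """
    table = {("C", "C"): 1.0, ("C", "D"): 0.0, ("D", "C"): b, ("D", "D"): p}
    n = len(strategy)
    pay = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                neighbor = strategy[(r + dr) % n][(c + dc) % n]
                pay[r][c] += table[(strategy[r][c], neighbor)]
    return pay


n = 5
strategy = [["D"] * n for _ in range(n)]
strategy[2][2] = "C"  # a single cooperator in an all-D lattice
pay = lattice_payoffs(strategy, b=1.35, p=0.3)
print("cooperator:", pay[2][2])
print("defector next to the cooperator:", pay[2][3])
print("defector far from the cooperator:", pay[0][0])
```

Under these assumptions the lone cooperator earns nothing, each of its defecting neighbors earns b+3p, and every other defector earns 4p, which is the payoff ordering the paragraph relies on.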

Figure 2

The solid line represents the fraction of cooperators in the network with b=1.35 for p=0 and the dashed line represents that for p=0.3.

Figure 3

The solid line represents the fraction of cooperators in the network with b=1.35 and β=0.1 for p1=0.3 and the dashed line represents that for p2=0.5.

Figure 4

The solid line represents the fraction of cooperators in the network with b=1.55 and β=0.1 for p1=0.3 and the dashed line represents that for p2=0.5.

Figure 5

The solid line represents the fraction of C-C structure in the network with b=1.55 and β=0.1 for p1=0.3 and the dashed line represents that for p2=0.5.

Figure 6

The solid line represents the fraction of C-D structure in the network with b=1.55 and β=0.1 for p1=0.3 and the dashed line represents that for p2=0.5.

Figure 7

The solid line represents the fraction of D-D structure in the network with b=1.55 and β=0.1 for p1=0.3 and the dashed line represents that for p2=0.5.

Figure 8

The solid line represents the fraction of cooperators in the network with b=1.55 and β=0.0015 for p1=0.3 and the dashed line represents that for p2=0.5.

Figure 9

The solid line represents the fraction of C-C structure in the network with b=1.55 and β=0.0015 for p1=0.3 and the dashed line represents that for p2=0.5.

Figure 10

The solid line represents the fraction of C-D structure in the network with b=1.55 and β=0.0015 for p1=0.3 and the dashed line represents that for p2=0.5.

Figure 11

The solid line represents the fraction of D-D structure in the network with b=1.55 and β=0.0015 for p1=0.3 and the dashed line represents that for p2=0.5.

Figure 12

The triangle line represents the fraction of cooperators in the network with b=1.55 for β=0.1, the square line represents the fraction of cooperators in the network for β=0.03, and the circle line represents the fraction of cooperators in the network for β=0.0015.

Next, for 1.5<b<2, the result of the evolution game differs from that for 1<b<1.5. The cooperators can form triangle clusters to resist the defectors’ invasion efficiently and can expand further in the network [21, 22]. So, the cooperators can break up the equilibrium state and hold a clear advantage at the end of the game. Let b=1.55, β=0.1, p1=0.3, and p2=0.5. From Figure 4, for p2=0.5, the fraction of cooperators rises earlier than for p1=0.3, and once it reaches a certain level, its growth slows down. Furthermore, at the end of the evolution game, the percentage of cooperators for p2=0.5 is lower than that for p1=0.3.

We illustrate this with Figures 5–7. Because of the evolution rule, if someone chooses strategy C, the neighbors of that cooperator’s neighbors may also select strategy C. Therefore, the D-D structure cannot remain stable: it transforms into the C-D structure, and its fraction drops markedly as the game goes on. We can also see in Figure 7 that, for higher p, the fraction of the D-D structure decreases faster and further. As cooperators increase in the network, the C-D structure increases and, accordingly, so does the C-C structure. Because the cooperators can form triangle structures to resist the defectors’ invasion and the cooperator clusters can grow, the fraction of the C-C structure can increase. Judging from the rate at which the cooperators increase, as reflected in Figures 6 and 7, the C-D structure increases quickly. After reaching its peak, the C-D structure transforms quickly into the C-C structure, so its fraction decreases as the evolution game goes on. The effect of the C-C clusters is thereby enhanced, which is why the fraction of the C-C structure increases as the game goes on. With the punishment, the C-D structure decreases more slowly for p2=0.5 than for p1=0.3; that is, for higher p, the C-D structure stays more stable. This means that the punishment p can hold back the clusters of cooperators and keep the C-D structure stable. Therefore, the fraction of cooperators for p1=0.3 is larger than that for p2=0.5 at the end of the evolution game. In conclusion, under this particular evolution rule, the punishment affects certain strategy structures. Table 1 gives the data on how the fraction of the C-D structure on the lattice network changes over time.
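
The structure fractions discussed above can be computed from a strategy grid as in the sketch below. The paper does not spell out the definition, so this assumes that the fraction of a structure is the share of nearest-neighbor links of that type on a lattice with periodic boundaries; `structure_fractions` is a hypothetical name.

```python
def structure_fractions(strategy):
    """Share of nearest-neighbor links that are C-C, C-D, or D-D.

    Each link is counted once by looking only at the right and lower
    neighbor of every site (periodic boundaries assumed).
    """
    n = len(strategy)
    counts = {"C-C": 0, "C-D": 0, "D-D": 0}
    for r in range(n):
        for c in range(n):
            s = strategy[r][c]
            for t in (strategy[r][(c + 1) % n], strategy[(r + 1) % n][c]):
                key = s + "-" + s if s == t else "C-D"
                counts[key] += 1
    total = 2 * n * n  # number of links on an n-by-n periodic lattice
    return {key: value / total for key, value in counts.items()}


n = 5
grid = [["D"] * n for _ in range(n)]
grid[2][2] = "C"
print(structure_fractions(grid))
```

With a single cooperator in a 5×5 all-D grid, four of the 50 links are C-D and the rest are D-D, so the D-D share falls as soon as any cooperator appears.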

Table 1

The data about changing of the fraction of C-D structure in network.

T (time step)	C-D fraction, p = 0.3	C-D fraction, p = 0.5
20	0.035862	0.035819
40	0.246115	0.245783
60	0.554025	0.557239
80	0.643101	0.636969
100	0.611341	0.602177
120	0.567636	0.557766
140	0.515405	0.519670
160	0.470873	0.485273
180	0.443957	0.473424
200	0.418154	0.446014
220	0.400718	0.423388
240	0.394343	0.420653
260	0.381033	0.415273
280	0.368123	0.406363
300	0.353101	0.400152
320	0.351371	0.390591
340	0.348811	0.397233
360	0.336239	0.393287
380	0.334177	0.389619
400	0.33415	0.37703
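
The Table 1 values can be checked directly with a short script. The numbers below are copied from Table 1; the helper names `peak_and_decline` and `lasting_crossover` are hypothetical and only summarize the two series.

```python
times = list(range(20, 401, 20))
cd_p03 = [0.035862, 0.246115, 0.554025, 0.643101, 0.611341,
          0.567636, 0.515405, 0.470873, 0.443957, 0.418154,
          0.400718, 0.394343, 0.381033, 0.368123, 0.353101,
          0.351371, 0.348811, 0.336239, 0.334177, 0.33415]
cd_p05 = [0.035819, 0.245783, 0.557239, 0.636969, 0.602177,
          0.557766, 0.519670, 0.485273, 0.473424, 0.446014,
          0.423388, 0.420653, 0.415273, 0.406363, 0.400152,
          0.390591, 0.397233, 0.393287, 0.389619, 0.37703]


def peak_and_decline(series):
    """Return (time of the peak, peak value, drop from the peak to T = 400)."""
    peak = max(series)
    return times[series.index(peak)], peak, peak - series[-1]


def lasting_crossover(low_p, high_p):
    """First time step from which high_p stays above low_p until the end."""
    start = None
    for t, a, b in zip(times, low_p, high_p):
        if b > a:
            if start is None:
                start = t
        else:
            start = None
    return start


print(peak_and_decline(cd_p03))
print(peak_and_decline(cd_p05))
print(lasting_crossover(cd_p03, cd_p05))
```

Both series peak at T=80. From the peak to T=400, the C-D fraction drops by about 0.309 for p=0.3 and about 0.260 for p=0.5, and the p=0.5 series stays above the p=0.3 series from T=140 onward, in line with the statement that the C-D structure decreases more slowly for higher p.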

From the above analysis, we find that a change in the payoff an individual gains affects the strategy-change probability. We now focus on the effect of this probability, which is controlled by the parameter β in formula (3). Here, for b=1.55 and β=0.0015, let p1=0.3 and p2=0.5. When β becomes small, the probability of changing strategy also becomes small. In Figure 8, the fraction of cooperators for p1=0.3 rises earlier and is larger than that for p2=0.5 at the end of the game. From Figures 9–11, we can see that the C-D and C-C structures for p1=0.3 rise earlier and the D-D structure falls earlier. Then, the C-D structure for p2=0.5 reaches its peak; after that, its fraction decreases less than for p1=0.3, and the fraction of the C-C structure for p1=0.3 is larger than that for p2=0.5 almost throughout. Comparing with Figures 4–7, we can see the difference. In Figure 8, the fraction of cooperators increases more slowly for higher p, which is the opposite of what happens in Figure 4. Why does this difference arise? With a smaller probability, strategies change less often, which helps the D-D structure remain stable. Under the evolution strategy rule, however, the D-D structure cannot remain stable and transforms into the C-D structure. In other words, the smaller probability reduces the effect of the evolution rule to some degree. From Figures 9–11, we can also see that the punishment mainly affects the fraction of the C-D structure. For higher p, the C-D structure gains more and stays more stable. In Figure 12, the larger β is, the larger the fraction of cooperators. Moreover, as β increases, the fraction of cooperators in the network increases faster, which implies that the probability can affect the result of the evolution game.
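
To make the role of β in formula (3) concrete, the following worked values evaluate the adoption probability for the three β values of Figure 12. The payoff gap U_x − U_y = 0.3 is a hypothetical number chosen only for illustration; it is not taken from the simulations.

```latex
\begin{align*}
W[s_x \leftarrow s_y] &= \frac{1}{1+\exp[(U_x-U_y)/\beta]},
  \qquad U_x - U_y = 0.3 \\
\beta = 0.1:\quad    W &= \frac{1}{1+e^{3}}   \approx 4.7\times 10^{-2} \\
\beta = 0.03:\quad   W &= \frac{1}{1+e^{10}}  \approx 4.5\times 10^{-5} \\
\beta = 0.0015:\quad W &= \frac{1}{1+e^{200}} \approx 1.4\times 10^{-87}
\end{align*}
```

Under this assumed gap, an individual that earns more than its model adopts the model's strategy about once in 21 attempts for β=0.1 and essentially never for β=0.0015, which is one way to read the statement that a small β makes strategy changes rare.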

4. Conclusion

In this paper, we have discussed the effect of punishment on the evolution game on a lattice. We proposed an evolution strategy rule that can reflect external factors. Under this rule, we find that the punishment affects the evolution game. The punishment can help the cooperators increase earlier, which is contrary to the common-sense expectation that the D-D structure remains stable. In fact, the D-D structure cannot be stable under the evolution rule. Moreover, the punishment affects the result of the evolution game through the C-D structure. For higher p, once the C-D structure reaches its peak, it stays more stable and decreases less. Besides the payoff the players gain, we also find that the strategy-change probability affects the evolution game: the larger the probability, the larger the fraction of cooperators and the faster it increases.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work was supported by the National Natural Science Foundation of China under Grant no. 61174116.

References

[1] A. M. Colman, Game Theory and Its Applications in the Social and Biological Sciences, Butterworth-Heinemann, Oxford, UK, 1995.

[2] Pennisi, “How did cooperative behavior evolve,” Science, vol. 309, no. 5731, p. 93, 2005. DOI: 10.1126/science.309.5731.93. Scopus: 2-s2.0-21644449402.

[3] Olson, The Logic of Collective Action: Public Goods and the Theory of Groups, Harvard University Press, Cambridge, Mass, USA, 1965.

[4] Wang and Lin, “Flocking of multi-agents with a virtual leader,” IEEE Transactions on Automatic Control, vol. 54, no. 2, pp. 293–307, 2009. DOI: 10.1109/TAC.2008.2010897. MR2491958. Scopus: 2-s2.0-61349202353.

[5] Wang, Chen, and Wang, “Evolutionary dynamics of public goods games with diverse contributions in finite populations,” Physical Review E, vol. 81, no. 5, Article ID 056103, 8 pages, 2010. DOI: 10.1103/PhysRevE.81.056103. Scopus: 2-s2.0-77952391903.

[6] Wang, “Effects of heterogeneous wealth distribution on public cooperation with collective risk,” Physical Review E, vol. 82, no. 1, Article ID 016102, 13 pages, 2010. DOI: 10.1103/PhysRevE.82.016102. MR2736375.

[7] Zhang, M. Z. Q. Chen, and Wang, “Adaptive flocking with a virtual leader of multiple agents governed by locally Lipschitz nonlinearity,” Nonlinear Analysis: Real World Applications, vol. 14, no. 1, pp. 798–806, 2013. DOI: 10.1016/j.nonrwa.2012.06.010. MR2969874. Zbl 1258.68161.

[8] Poundstone, The Prisoner’s Dilemma, Doubleday, New York, NY, USA, 1992.

[9] M. A. Nowak and R. M. May, “Evolutionary games and spatial chaos,” Nature, vol. 359, no. 6398, pp. 826–829, 1992. DOI: 10.1038/359826a0. Scopus: 2-s2.0-0026613691.

[10] Chen, Wang, and Lam, “Semiglobal observer-based leader-following consensus with input saturation,” IEEE Transactions on Industrial Electronics, vol. 61, no. 6, pp. 2842–2850, 2014. DOI: 10.1109/TIE.2013.2275976.

[11] Tomassini, Luthi, and Giacobini, “Hawks and Doves on small-world networks,” Physical Review E, vol. 73, no. 1, Article ID 016132, 10 pages, 2006. DOI: 10.1103/PhysRevE.73.016132. Scopus: 2-s2.0-32844463079.

[12] Chen, Liu, and Wang, “Social dilemmas in an online social network: the structure and evolution of cooperation,” Physics Letters A, vol. 371, no. 1-2, pp. 58–64, 2007. DOI: 10.1016/j.physleta.2007.05.116. Scopus: 2-s2.0-35348954241.

[13] Chen, Liu, and Wang, “Promotion of cooperation induced by the interplay between structure and game dynamics,” Physica A, vol. 383, no. 2, pp. 651–659, 2007. DOI: 10.1016/j.physa.2007.04.099. Scopus: 2-s2.0-34447095289.

[14] Rong, Chen, Wang, Chen, and Wang, “Decentralized adaptive pinning control for cluster synchronization of complex dynamical networks,” IEEE Transactions on Cybernetics, vol. 43, no. 1, pp. 394–399, 2013.

[15] Perc and Szolnoki, “Social diversity and promotion of cooperation in the spatial prisoner’s dilemma game,” Physical Review E, vol. 77, no. 1, Article ID 011904, 5 pages, 2008. DOI: 10.1103/PhysRevE.77.011904.

[16] Ishibuchi and Namikawa, “Evolution of iterated prisoner's dilemma game strategies in structured demes under random pairing in game playing,” IEEE Transactions on Evolutionary Computation, vol. 9, no. 6, pp. 552–561, 2005. DOI: 10.1109/TEVC.2005.856198. Scopus: 2-s2.0-29244450853.

[17] M. Z. Q. Chen, Lam, and Lin, “Semi-global leader-following consensus of linear multi-agent systems with input saturation via low gain feedback,” IEEE Transactions on Circuits and Systems. I. Regular Papers, vol. 60, no. 7, pp. 1881–1889, 2013. DOI: 10.1109/TCSI.2012.2226490. MR3072458.

[18] C. P. Roca, J. A. Cuesta, and Sanchez, “Effect of spatial structure on the evolution of cooperation,” Physical Review E, vol. 80, no. 4, Article ID 046106, 16 pages, 2009. DOI: 10.1103/PhysRevE.80.046106.

[19] M. A. Nowak and R. M. May, “The spatial dilemmas of evolution,” International Journal of Bifurcation and Chaos in Applied Sciences and Engineering, vol. 3, no. 1, pp. 35–78, 1993. DOI: 10.1142/S0218127493000040. MR1218718. Zbl 0870.92011.

[20] Liu, X. L. Kou, and Zhang, “The imitating strategy rule about prisoner’s dillema game on lattice,” in Proceedings of the 5th International Conference on Intelligent Computation Technology and Automation (ICICTA '12), pp. 719–722, Hunan, China, January 2012. DOI: 10.1109/ICICTA.2012.187.

[21] Vukov, Szabó, and Szolnoki, “Cooperation in the noisy case: Prisoner’s dilemma game on two types of regular random graphs,” Physical Review E, vol. 73, no. 6, Article ID 067103, 4 pages, 2006. DOI: 10.1103/PhysRevE.73.067103.

[22] C. K. Chan and K. Y. Szeto, “Decay of invincible clusters of cooperators in the evolutionary prisoner's dilemma game,” in Applications of Evolutionary Computing, vol. 5484 of Lecture Notes in Computer Science, pp. 243–252, Springer, 2009. DOI: 10.1007/978-3-642-01129-0_28. Scopus: 2-s2.0-67650697052.

[23] P. M. Altrock and Wang, “Aspiration dynamics of multiplayer games in finite populations,” Journal of the Royal Society Interface, vol. 11, no. 94, Article ID 20140077, 2014. DOI: 10.1098/rsif.2014.0077.