Packet Forwarding Strategies in Multiagent Systems: An Evolutionary Game Approach

In multiagent systems (MASs), agents need to forward packets to each other to accomplish a target task. In this paper, we study packet forwarding among agents using evolutionary game theory under the mechanisms of Carrier Sense Multiple Access/Collision Avoidance (CSMA/CA). Packet forwarding among agents plays a key role in stabilizing the whole MAS. We study the transition probability of packet forwarding of agents in the idle state or the busy state and compute the probability of packet forwarding for a MAS. When agents decide whether to select the Forward or the No-Forward strategy, a packet forwarding evolutionary game model is built to reflect the utilities of the different packet forwarding strategies. Two incentive mechanisms are introduced into the game model. One motivates agents to strengthen cooperation; the other encourages agents to select the No-Forward strategy to save energy while they are in the busy state. The parameter value that encourages an agent to select the No-Forward strategy is inversely proportional to the average probability of packet forwarding. The replicator dynamics of the evolution of agent packet forwarding strategies are given. We propose and prove theorems indicating that evolutionarily stable strategies (ESSs) can be attained. The results of simulation experiments verify the correctness of the proposed theorems and show the effects of the two incentive mechanisms and the probability of packet forwarding, which assures the robustness of evolutionarily stable points among agents in MASs.


Introduction
Multiagent systems (MASs) are computerized systems composed of multiple interacting intelligent agents that have limited energy and self-organization ability. To achieve a common goal, MASs require the agents to self-organize into a network and collaborate with one another [1]. MASs are used most often in the engineering and technology fields. Examples of applications are disaster response, mobile robots, spacecraft, and sensor networks [2].
The agents need to interact with each other as much as possible to accomplish a target task. Cooperation is a fundamental problem in the distributed control community; the agents must effectively cooperate with each other for mutual benefit [3,4]. The cooperative results depend on the actions taken by the interacting agents. In the process of cooperation, agents transmit messages to each other through packet forwarding. We assume that all agents in MASs are rational and aim at maximizing their own profits. Because packet forwarding incurs certain costs, each agent decides whether to forward a packet or not according to its own benefit. Hence, packet forwarding can be described as a game, where the players are rational agents with selfish behaviors and the strategy of a player is to Forward or No-Forward a packet.
As a mathematical tool, game theory mainly focuses on the competitive and cooperative relationships among participants, and it has been widely applied in the field of cooperation. Typical examples of evolutionary games include the repeated prisoner's dilemma and snowdrift games [5-7]. In [8], Shen et al. compare different game methods and regard the evolutionary game as a good method for solving the problem of node cooperation. In the evolutionary game process, individuals maximize their utilities by selecting a higher-gain strategy, the ratio of individuals who select the corresponding strategies tends to be stable, and the population ultimately attains a dynamic equilibrium state. Thus, studying this cooperative packet forwarding model with evolutionary game theory in MASs is appropriate.
In this paper, using a Markov chain model, we calculate the probability of packet forwarding while agents are in the idle or busy state and analyze the packet forwarding behaviors of the agents. We then build a packet forwarding strategies game model, which defines the payoffs of the different strategies. Moreover, we introduce two incentive mechanisms into the game, one to increase agent cooperation by selecting the Forward strategy and the other to increase agent noncooperation by selecting the No-Forward strategy while the agent is in the busy state. Furthermore, using replicator dynamics, we obtain and prove the evolutionary stability theorems and inferences, which establish the conditions to attain steady states for the packet forwarding strategies game. The main contributions of this paper are as follows:
(1) The Markov chain model is used to calculate the probability of packet forwarding for an agent in the idle or busy state. We establish a packet forwarding strategies game model with two incentive mechanisms. One incentive mechanism promotes cooperation among agents by forwarding packets, and the other promotes noncooperation among agents by selecting the No-Forward strategy to save energy while the agent is in the busy state.
(2) Replicator dynamics is used to prove the evolutionary stability theorems and inferences, which provide the conditions to attain stable states for the packet forwarding strategies game model in MASs.
(3) The experiments verify the correctness of the provided theorems and inferences. The experiments also show the effects of the two incentive mechanisms and of the agents' probability of packet forwarding on the rate of convergence to stable states.
The rest of the paper is organized as follows. We first discuss the related research in Section 2 and set up a packet forwarding Markov model in Section 3. We propose an evolutionary game model of packet forwarding strategies and present the theorems, including the conditions of ESSs, in Sections 4 and 5. The simulation results are shown in Section 6 and, finally, Section 7 is the conclusion.

Related Work
In recent years, to provide an incentive for agents/nodes to cooperate, two kinds of mechanisms have been devised: incentive mechanisms and punishment mechanisms, which incentivize agents/nodes to forward others' packets and punish misbehaving nodes. Most research involves strengthening packet forwarding among nodes/agents by setting up incentive-based systems [9,10] or reputation-based systems [11,12]. Most incentive-based mechanisms give a reward to nodes/agents for participating in packet forwarding [9]. In a reputation-based system, a node's behavior is monitored and evaluated by its neighbors, and each node is required to keep track of its neighbors' reputation values, which are updated based on weighted calculations of the value of the node's own observations and the value of the other nodes' recommendations [12].
Zhu et al. [13] propose an adaptive repeated game scheme to ensure cooperation among nodes in wireless networks and implement a self-learning algorithm to improve cooperation probabilities. The simulation results show that selfish nodes are indeed more likely to cooperate with each other under the schemes above, but the schemes may be unsuitable for MASs. To stimulate selfish nodes to cooperate, Xu et al. [17] propose a Win-Stay, Lose-Likely Shift (WSLLS) approach in a Prisoner's Dilemma (PD) game, and a utility-based function is applied to evaluate a player's (i.e., node's) performance. Experimental results show that the approach is effective in stimulating cooperation in different settings. Akkarajitsakul et al. [18] propose an approach to cooperatively forward packets based on coalition formation among mobile nodes. They use a bargaining game to find the most suitable probabilities with which each node would help other mobile nodes. They present a distributed algorithm to obtain the stable coalitions, and they use a Markov-chain-based analysis to evaluate the stable coalitional structures that are obtained from the distributed algorithm. Performance evaluation results show that mobile nodes with coalition formation have higher payoffs than mobile nodes acting alone.
Among these methods of evolutionary game theory, some stimulation mechanisms are proposed to incentivize the nodes' cooperation. Shen et al. [19] propose stimulating mechanisms to promote trust cooperation among nodes, analyze the dynamic evolutionary process of nodes selecting the trust strategy, and derive several conditions under which the networks can attain stable states. Li et al. [20], based on [19], consider a factor of packet loss in the data retransmission process and introduce a strategy adjustment mechanism into the evolutionary game process. This strategy adjustment mechanism compensates for the fact that the replicator dynamic model cannot reflect the requirements of individual strategy adjustments. The experiments show that the strategy adjustment mechanism converges to the stable state faster than the normal replicator dynamic evolutionary method. Based on evolutionary game theory, Chen et al. [22] adopt a suitable dynamic incentive mechanism for WSNs, which emphasizes that nodes adjust their strategies both proactively and passively. This mechanism enables selfish nodes to cooperate with each other and keeps the network operating normally.
Among these methods of evolutionary game theory, some researchers study packet forwarding among nodes in different scenarios. Li et al. [23] study the adaptive packet forwarding of potentially selfish nodes in mobile social networks (MSNs) and propose an incentive compatible multiple-copy packet forwarding (ICMPF) protocol to reduce the delivery overhead and to sustain successful packet delivery. They design an evolutionary game model to guide the forwarding behavior of interacting nodes. Finally, they prove that the strategy dynamics eventually achieves the ESS and develop another strategy dynamics method, which includes a distributed learning algorithm for the nodes and also achieves the ESS. Considering the noisy observation of transmissions and the loss of packets, Tang et al. [24] study packet forwarding among cooperative nodes in a one-hop unreliable channel. Based on evolutionary game theory, they propose an indirect reciprocity framework and enforce packet forwarding strategies in mobile ad hoc networks (MANETs). They also analyze the evolutionary dynamics of the cooperative strategies game model and calculate the threshold of the cost-to-benefit ratio that ensures cooperation among nodes. Lastly, the proposed cooperative solution is verified by numerical simulations. To enforce cooperation in the case of channel noise, Wang et al. [25] focus on one-hop information exchange, design a packet forwarding game model with imperfect private monitoring, and propose a state machine-based strategy to reach a Nash equilibrium, which is proved to be a sequential state under carefully designed system parameters.
Compared to [19-25], our work is distinctive. Our packet forwarding strategy is derived partly from trust using game theory [19,20]. Considering the randomness of packet forwarding caused by an agent's communication state, derived from the Markov model [26], we calculate the probability of packet forwarding of agents in the idle or busy state using properties of a Markov chain, build a packet forwarding game model with the probability of packet forwarding, and introduce two incentive mechanisms into the game model. One incentive mechanism motivates agents to strengthen cooperation, and the other encourages agents to select the No-Forward strategy to save energy while they are in the busy state. We set the parameter value inversely proportional to the probability of packet forwarding. In contrast, there is only one simple incentive cooperation mechanism in [19,20]. Furthermore, we consider the agent's payoff as affected by the probability of packet forwarding, which is related to whether the agent's communication state is idle or busy, while Wang et al. [25] consider only the impact of the probability of packet loss on node payoff. Li et al. [23] and Tang et al. [24] calculate the payoffs of forwarding packets between nodes without taking into account the interactive cooperation of nodes, which is different from ours. Thus, our packet forwarding strategies game is different from the related literature described above.

Packet Forwarding Markov Model
3.1. Packet Forwarding Markov Chain.
Packet forwarding among agents in MASs is not a deterministic process. Given the present value of the process, its future values are independent of all past values. The process is a no-after-effects process and hence has the Markov property.
The time series of packet forwarding can be expressed by a Markov chain $X(t)$. If $t_1 < t_2 < \dots < t_n$ with $t_1, \dots, t_n \in \Gamma$, then

$$P\{X(t_n) = x_n \mid X(t_{n-1}) = x_{n-1}, \dots, X(t_1) = x_1\} = P\{X(t_n) = x_n \mid X(t_{n-1}) = x_{n-1}\}.$$

In this section, we build a Markov model of packet forwarding among agents in MASs. Let us assume the following:
(1) If an agent has a new packet to forward, it forwards it at the beginning of the next time-slot.
(2) If an agent successfully forwards its packet, it can forward a new packet in the next time-slot.
(3) If an agent detects a collision, it forwards its packet in each subsequent time-slot until the packet is successfully forwarded.
With these assumptions, each agent is in either an idle state or a busy (backlogged) state. An agent is in the busy state if the number of packets that need to be transmitted exceeds its transmitting capacity, which leads to buffer overflow and loss of data packets. Furthermore, because agents have limited computing and communication capability and use a multihop, many-to-one communication method, they may easily enter the busy state when the wireless communication channels are noisy, when the network topology changes, or when the number of packets suddenly increases because of an emergency; this causes network delay and loss of data packets. The decision to forward the packet or not depends only on whether the agent is busy or not. Therefore, the packet forwarding decision in one time-slot for each agent is Markovian. We assume that agents always have packets to forward; the notation is shown in Table 1.
Packet forwarding is regarded as a two-state Markov chain in MASs. The state transition diagram of the Markov chain is shown in Figure 1. The arrows indicate the direction of the state transition, and the symbols on the edges indicate the transition probabilities.
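The state diagram of a Markov chain describes all state transitions and their probabilities and is expressed in matrix form as follows [26]:

$$P_{IB} = \begin{pmatrix} \alpha & 1-\alpha \\ 1-\beta & \beta \end{pmatrix},$$

where the rows and columns are ordered as (idle, busy) and $P_{IB}$ is the state transition matrix of the Markov chain of an agent.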
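As an illustration of the two-state chain, the following Python sketch builds a transition matrix and samples a state trajectory. It assumes that rows and columns are ordered (idle, busy), that the idle row is (α, 1 − α), and that the busy row is (1 − β, β), which is the layout consistent with the steady-state probability used later in the paper. The function names and the values of `alpha` and `beta` are illustrative and are not taken from the paper.

```python
import random

def transition_matrix(alpha, beta):
    """Two-state matrix with rows/columns ordered (idle, busy); rows sum to 1."""
    return [[alpha, 1.0 - alpha],
            [1.0 - beta, beta]]

def simulate(matrix, steps, start=0, seed=0):
    """Sample a state trajectory, where 0 = idle and 1 = busy."""
    rng = random.Random(seed)
    state, path = start, [start]
    for _ in range(steps):
        # Move to idle with the probability in column 0 of the current row.
        state = 0 if rng.random() < matrix[state][0] else 1
        path.append(state)
    return path

# Hypothetical parameter values, chosen only for illustration.
P_IB = transition_matrix(alpha=0.7, beta=0.4)
path = simulate(P_IB, steps=10_000)
idle_fraction = path.count(0) / len(path)
print(f"fraction of time-slots spent idle: {idle_fraction:.3f}")
```

Over a long trajectory, the fraction of idle time-slots should approach (1 − β)/(2 − α − β) under these assumptions.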

The Probability of Packet Forwarding.
The issue is how to determine the probabilities of the other time-slots once the probability of the initial state is given.
Table 1: Notation of the packet forwarding Markov model.
$N$: The number of agents in the MASs.
$I$ and $B$: The idle and busy states of an agent, respectively.
$P_I$ and $P_B$: The probabilities of packet forwarding of an agent at the idle state and the busy state, respectively.
$\alpha$: The probability of packet forwarding at the idle state of an agent. The probability of no packet forwarding at the idle state of an agent is $1 - \alpha$.
$\beta$: The probability of no packet forwarding at the busy (backlogged) state of an agent. The probability of packet forwarding at the busy state of an agent is $1 - \beta$.
$P_{IB}$: The state transition matrix of the Markov chain of an agent.

Table 2: Symbols used in the packet forwarding strategy description.
$R_T$: Gain obtained by an agent selecting the Forward strategy, which also helps other agents to forward a packet successfully.
$R_D$: The incentive for noncooperation, that is, the gain obtained by an agent selecting the No-Forward strategy to save energy while the agent is at the busy (backlogged) state.
$R_C$: Cooperation gain of an agent due to its opponent selecting the Forward strategy and successfully forwarding a packet.
$C$: Cost incurred by an agent forwarding another agent's packet or sending its own packet.
$L$: Noncooperation loss of an agent when its opponent selects the No-Forward strategy.
$W$: The incentive for cooperation while an agent selects the Forward strategy ($W \ge 0$).

Suppose the initial probabilities of packet forwarding at the idle state and the busy state are $P_I(0)$ and $P_B(0)$. The initial state is expressed by the initial probability row vector, that is,

$$P(0) = [P_I(0),\ P_B(0)].$$

The probability of the next time-slot is

$$P(1) = P(0)\,P_{IB}.$$

As time progresses from $n$ to $n+1$, the general expression is

$$P(n+1) = P(n)\,P_{IB}.$$

Another way to write these equations is

$$P(n) = P(0)\,P_{IB}^{\,n}.$$

Now, let $n \to \infty$. The Markov convergence theorem says that the probability converges to a steady state, and $P_I$ and $P_B$ are the components of the eigenvector of $P_{IB}$ associated with the unit eigenvalue, which are, using Mathematica software,

$$P_I = \frac{1-\beta}{2-\alpha-\beta}, \qquad P_B = \frac{1-\alpha}{2-\alpha-\beta}. \tag{7}$$

Notice that $P_I + P_B = 1$, and $P_I$ and $P_B$ are the probabilities of packet forwarding of agents at the idle state and the busy state. The probability of packet forwarding of agents is simply denoted as $P$.
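The time-slot recursion and its limit can be checked numerically. The sketch below is a minimal illustration, assuming a transition matrix with idle row (α, 1 − α) and busy row (1 − β, β); it iterates the row-vector update and compares the result with the closed-form steady state (1 − β)/(2 − α − β) and (1 − α)/(2 − α − β). The function names and parameter values are hypothetical.

```python
def step(p, matrix):
    """One time-slot update: row vector p multiplied by the transition matrix."""
    return [p[0] * matrix[0][0] + p[1] * matrix[1][0],
            p[0] * matrix[0][1] + p[1] * matrix[1][1]]

def steady_state(alpha, beta):
    """Closed-form stationary probabilities (idle, busy); needs alpha + beta < 2."""
    denom = 2.0 - alpha - beta
    return [(1.0 - beta) / denom, (1.0 - alpha) / denom]

# Hypothetical parameter values, chosen only for illustration.
alpha, beta = 0.7, 0.4
P_IB = [[alpha, 1.0 - alpha],
        [1.0 - beta, beta]]

p = [1.0, 0.0]              # start in the idle state with certainty
for _ in range(50):         # 50 time-slots are ample for these values
    p = step(p, P_IB)

closed = steady_state(alpha, beta)
print("iterated:", p)
print("closed form:", closed)
```

The iteration converges geometrically at rate |α + β − 1|, so the starting vector matters only for the first few time-slots.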

Network Model Environment.
A multiagent system is a system that can accomplish a common goal with a number of mutually independent agents which possess the ability of self-organization, learning, and reasoning. It is critical that all agents have the independent ability to cooperate with each other, which ensures that the multiagent system accomplishes the expected task.
When agents perform a task in a noisy environment, they may be disturbed by some external factors. The main reason for packet loss is network congestion during the packet forwarding process. Assume that all agents have the same communication status factors in the MASs and that all agents have the same probability of successfully forwarding and receiving packets at any one time.

Packet Forwarding Strategy Description.
Communication among agents depends on packet forwarding with each other, and each agent has two strategies to select: Forward or No-Forward. In general, if an agent selects the Forward strategy, this will help other agents forward packets so as to enhance cooperation with each other and obtain profit at the same time. However, if an agent selects the No-Forward strategy, which drops the packets that other agents forwarded, it may obtain gains from saving energy, but the behavior of dropping packets will result in a loss for the other agents. Table 2 displays the symbols used to describe this process.

Packet Forwarding Game Model.
When agents interact with each other in MASs, each agent can decide to select either the Forward or the No-Forward strategy according to its gains; that is, the agents can get different gains by selecting a different strategy during the interactive process.
When two agents A and B transmit packets, each has two strategies: Forward and No-Forward. The two-agent packet forwarding model is shown in Figure 2.
There are three situations of packet forwarding strategy between two agents: both interactive agents select Forward, only one agent selects Forward, or both agents select No-Forward.
Table 3: Payoffs of two interactive agents selecting the Forward strategy.

Two Interactive Agents Selecting the Forward Strategy.
An agent might fail to transport packets due to the channel noise in MASs. Each packet forwarding strategy results in either success or failure. Suppose two agents, A and B, each select one strategy. There are 4 different events in total. Table 3 shows the gains of the agents and the probabilities of each event occurring.
In Tables 3-5, the symbol $P$ indicates the probability of packet forwarding of an agent, and 0 and 1 indicate failure and success, respectively.
In Table 3, A and B both select the Forward strategy. The expected payoff for each agent is the sum of each probability multiplied by its corresponding payoff [20]. Thus, the payoffs of agents A and B are equal, that is, $U_A = U_B = \sum_{i=1}^{4} p_i u_i$, where $i$ is the serial number of the event in Table 3, and $p_i$ and $u_i$ denote the probability and the payoff of event $i$.
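The expected-payoff rule, a sum over the four events of probability times payoff, can be sketched as follows. The event table here is hypothetical: it assumes that each agent's forwarding succeeds independently with probability P, and the per-event payoff function is a placeholder built from the symbols of Table 2, not the actual entries of Table 3. All numeric values are illustrative.

```python
from itertools import product

def expected_payoff(P, payoff_of_event):
    """Sum over the four (A outcome, B outcome) events of probability * payoff."""
    total = 0.0
    for a_ok, b_ok in product((0, 1), repeat=2):   # 0 = failure, 1 = success
        prob = (P if a_ok else 1.0 - P) * (P if b_ok else 1.0 - P)
        total += prob * payoff_of_event(a_ok, b_ok)
    return total

# Hypothetical parameter values (not from the paper).
R_T, R_C, W, C = 6.0, 2.0, 3.0, 4.0

def payoff_A(a_ok, b_ok):
    """Placeholder payoff of agent A when both agents select Forward:
    R_T if A's own forwarding succeeds, R_C if B's succeeds, plus W minus C."""
    return a_ok * R_T + b_ok * R_C + W - C

u = expected_payoff(0.5, payoff_A)
print(u)  # equals P*R_T + P*R_C + W - C = 3.0 for these values
```

Swapping in a different `payoff_of_event` function gives the expected payoff for the other two situations under the same independence assumption.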

Only One Agent Selects the Forward Strategy.
When two agents A and B interact, agent A selects the Forward strategy, while agent B selects the No-Forward strategy; agent B can still send its own packets. Here, the selfish rational agent sends only its own packets without forwarding others' packets. There are 4 different events in total, as shown in Table 4.
In Table 4, only one agent selects the Forward strategy. The expected payoff of each agent is the sum of each probability multiplied by its corresponding payoff [20]. Thus, the payoff of agent A selecting the Forward strategy is $U_A = \sum_{i=1}^{4} p_i u_i^{A}$, and the payoff of agent B selecting the No-Forward strategy is $U_B = \sum_{i=1}^{4} p_i u_i^{B}$, where $i$ is the serial number of the event in Table 4, $p_i$ is its probability, and $u_i^{A}$ and $u_i^{B}$ are the corresponding payoffs of agents A and B.
When two agents A and B both select the No-Forward strategy, they can still send their own packets. There are 4 different events in total, as shown in Table 5.
In Table 5, two agents A and B both select the No-Forward strategy. The expected payoff of each agent is the sum of each probability multiplied by its corresponding payoff [20]. Thus, the payoffs of agents A and B are equal, that is, $U_A = U_B = \sum_{i=1}^{4} p_i u_i$, taken over the events in Table 5.

Packet Forwarding Game
Definition 1. The packet forwarding game in MASs is a quadruple Γ(Ψ, N, Ω, U), where (1) Ψ is a population comprising an arbitrary number of individuals; (2) N is the set of individuals in the population Ψ; (3) Ω is the space of strategies from which an agent can select, $\Omega = \{s_1, s_2\} = \{\text{Forward}, \text{No-Forward}\}$; (4) U is the matrix of payoffs of all the participating individuals, as shown in Table 6.

Evolutionary Game Theory Based Packet Forwarding

ESSs of Packet Forwarding Game.
The dynamic evolutionary process can be described by many different replicator dynamics models. A widely used model is the replicator dynamics proposed by Taylor and Jonker [27]. In the evolutionary game process, the dynamic game has two mechanisms, selection and mutation [28]. Under selection, more agents adopt the strategy that obtained higher fitness in the prior game. Under mutation, agents select strategies randomly; a mutation that receives higher fitness will continue to be selected by agents in the game, and otherwise the mutation will be eliminated. The replicator equation describes how the frequency of a certain strategy changes over time. The nonlinear differential equation is given as follows [29]:

$$\frac{dx_i}{dt} = x_i\,[U(s_i, X) - U(X, X)], \tag{13}$$

where $x_i$ is the proportion of agents selecting the pure strategy $s_i$, $U(s_i, X)$ indicates the total expected payoff of agents selecting the pure strategy $s_i$, and $U(X, X)$ indicates the average payoff of the entire population Ψ, with $U(X, X) = \sum_{i=1}^{k} x_i\,U(s_i, X)$, where $k$ is the number of pure strategies. Equation (13) indicates that if the payoff of an agent selecting strategy $s_i$ is more than the average payoff, the number of agents selecting the strategy $s_i$ will increase, and the converse is also true. If the payoff of an agent with strategy $s_i$ is equal to the average payoff, the number of agents selecting the strategy $s_i$ will stay the same. As time passes, the population will attain an evolutionarily stable strategy (ESS), which is an equilibrium refinement of the Nash equilibrium. A population attaining an evolutionarily stable state shows that natural selection alone is sufficient to prevent mutant strategies from successfully invading.
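The replicator equation described above can be integrated numerically. The following Python sketch is a minimal illustration using explicit Euler steps, which is a convenience choice and not necessarily the method used in the paper's MATLAB experiments. The payoff matrix, step size, and initial state are hypothetical.

```python
def replicator_step(x, payoff, dt=0.01):
    """One Euler step of dx_i/dt = x_i * (U(s_i, X) - U(X, X))."""
    k = len(x)
    # Expected payoff of each pure strategy against the population mix x.
    fitness = [sum(payoff[i][j] * x[j] for j in range(k)) for i in range(k)]
    # Average payoff of the whole population.
    average = sum(x[i] * fitness[i] for i in range(k))
    return [x[i] + dt * x[i] * (fitness[i] - average) for i in range(k)]

# Hypothetical 2x2 payoff matrix: payoff[i][j] is the payoff of strategy i
# against strategy j. These numbers are not taken from the paper's Table 6.
payoff = [[3.0, 0.0],
          [1.0, 2.0]]

x = [0.6, 0.4]               # initial shares of strategy 0 and strategy 1
for _ in range(5000):
    x = replicator_step(x, payoff)
print(x)
```

With this particular matrix the interior rest point is at a share of 0.5 for strategy 0, so a population that starts at 0.6 drifts toward all agents selecting strategy 0.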
We have two strategies in the packet forwarding game model; we denote $X(t) = (x, 1 - x)$ as the mixed strategy for the population Ψ at time $t$, where $x$ is the rate of agents selecting Forward strategy (i.e., $s_1$) and $1 - x$ is the rate of agents selecting No-Forward strategy (i.e., $s_2$). According to (13), the expected payoff of agents selecting Forward strategy is

$$U(s_1, X) = x\,u_{11} + (1 - x)\,u_{12}, \tag{14}$$

and the expected payoff of agents choosing No-Forward strategy is

$$U(s_2, X) = x\,u_{21} + (1 - x)\,u_{22}, \tag{15}$$

where $u_{jk}$ denotes the payoff in Table 6 of an agent selecting $s_j$ against an opponent selecting $s_k$. The average payoff of the entire population Ψ is

$$U(X, X) = x\,U(s_1, X) + (1 - x)\,U(s_2, X). \tag{16}$$

Therefore, from (13) to (16), the dynamic replicator equation of Forward strategy is

$$F(x) = \frac{dx}{dt} = x\,[U(s_1, X) - U(X, X)] = x(1 - x)\,[U(s_1, X) - U(s_2, X)]. \tag{17}$$

Setting (17) equal to 0, there are three solutions of the equation: $x^*_1 = 0$, $x^*_2 = 1$, and $x^*_3 = (C + R_D + L - P R_T - W)/L$.
Proof. Mathematically, the stability theory of differential equations implies that if $x^*$ is a stable state, then $F'(x^*)$ must be less than 0.
When $P R_T + W - C - R_D - L > 0$, we get $F'(1) < 0$ and $F'(0) > 0$. According to the stability theory of differential equations, $x^*_2 = 1$ is the only ESS of the packet forwarding game. Thus, Theorem 3 is proved.
When $P R_T + W - C - R_D < 0$, we get $F'(0) < 0$ and $F'(1) > 0$. According to the stability theory of differential equations, $x^*_1 = 0$ is the only ESS of the packet forwarding game. Thus, Theorem 4 is proved.
Under the conditions of Theorem 2, when the initial rate of agents selecting Forward strategy is greater than $x^*_3 = (C + R_D + L - P R_T - W)/L$, the rate of agents selecting Forward strategy will eventually converge to 100%; otherwise, it will eventually converge to 0%.
Under the conditions of Theorem 3, whatever strategy agent B selects, the total payoff of agent A selecting the Forward strategy is always higher than that of selecting the No-Forward strategy. Regardless of the initial rate of selecting a strategy, the MASs will eventually attain a stable state in which all the agents select the Forward strategy after evolution.
Under the conditions of Theorem 4, whatever strategy agent B selects, the payoff of agent A selecting the No-Forward strategy is always higher than that of selecting the Forward strategy. No matter what the initial rate of selecting a certain strategy is, the MASs will eventually attain a stable state in which all the agents select the No-Forward strategy after evolution. Under this condition, the Nash equilibrium is harmful to network communication of MASs.
Under the conditions of Theorem 5, agent B selects a strategy with which the game cannot attain the steady state. In this case, the strategy is subject to mutations and disturbances and cannot be restored to its original state; hence, it cannot be an evolutionarily stable strategy.
Under the conditions of Inference 6, if $P > (C + R_D + L - W)/R_T$, all the agents will forward packets as much as possible and the MASs will eventually attain a stable state, and if $P < (C + R_D - W)/R_T$, the MASs will eventually attain a stable state in which all the agents select No-Forward strategy. (... agents select in the initial state) all of the agents will finally select Forward strategy and the system will attain a stable state.
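The case analysis of the theorems can be summarized in a small helper. The sketch below is an interpretation, not the paper's own code: it assumes the stability conditions reduce to comparing the quantity q = P·R_T + W − C − R_D with 0 and with L, which is inferred from the stated conditions of Theorems 3 and 4 and from the critical point of Theorem 2. The function name, return labels, and parameter values are hypothetical.

```python
def classify_ess(P, R_T, W, C, R_D, L):
    """Return (label, critical_point) for the assumed stability conditions.

    q - L > 0   -> only x* = 1 is stable (all agents Forward)
    q < 0       -> only x* = 0 is stable (all agents No-Forward)
    0 < q < L   -> both x* = 0 and x* = 1 are stable, separated by (L - q) / L
    Boundary cases (q == 0 or q == L) are left undetermined.
    """
    q = P * R_T + W - C - R_D
    if q - L > 0:
        return "forward", None
    if q < 0:
        return "no-forward", None
    if 0 < q < L:
        return "bistable", (L - q) / L
    return "undetermined", None

# Hypothetical parameter values (not the paper's Table 7).
print(classify_ess(P=0.5, R_T=6.0, W=4.0, C=2.0, R_D=3.0, L=3.0))
print(classify_ess(P=0.5, R_T=6.0, W=6.0, C=2.0, R_D=3.0, L=3.0))
print(classify_ess(P=0.5, R_T=6.0, W=0.0, C=2.0, R_D=3.0, L=3.0))
```

In the bistable case the returned critical point is the initial share of Forward agents above which the population is expected to move toward full cooperation.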

The Analysis of the
In the case that an agent's communication environment is backlogged and the agent selects Forward strategy too frequently, the agent's time and energy will be wasted; therefore, selecting No-Forward strategy is favorable in this case. We give the agent the incentive $R_D$ for selecting No-Forward strategy.
From (7), $P$ is the average probability of packet forwarding, $P = (1 - \beta)/(2 - \alpha - \beta)$. If $\alpha$ or $1 - \beta$ is increased, $P$ is increased. The positive relations among $\alpha$, $1 - \beta$, and $P$ are shown in Figure 3.
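The monotonic relation between P and the parameters α and 1 − β can be checked on a grid. The following Python sketch is only a numerical sanity check of the formula P = (1 − β)/(2 − α − β); the grid values and the function name are arbitrary choices, not taken from the paper.

```python
def forwarding_probability(alpha, beta):
    """P = (1 - beta) / (2 - alpha - beta); requires alpha + beta < 2."""
    return (1.0 - beta) / (2.0 - alpha - beta)

grid = [i / 10 for i in range(1, 10)]   # 0.1, 0.2, ..., 0.9

# P should grow when alpha grows (beta fixed).
increasing_in_alpha = all(
    forwarding_probability(a1, b) < forwarding_probability(a2, b)
    for b in grid
    for a1, a2 in zip(grid, grid[1:])
)

# P should shrink when beta grows, i.e. grow when 1 - beta grows (alpha fixed).
decreasing_in_beta = all(
    forwarding_probability(a, b1) > forwarding_probability(a, b2)
    for a in grid
    for b1, b2 in zip(grid, grid[1:])
)

print(increasing_in_alpha, decreasing_in_beta)
```

Both checks hold on the open unit square because the numerator depends only on 1 − β while the denominator shrinks as α grows.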
There are many ways to improve the probability of packet forwarding in MASs, such as increasing bandwidth or frequency or improving the hardware or the transmission protocol algorithm. From Theorem 3, if the value of $P$ is increased to meet the conditions of Theorem 3, that is, $P R_T + W - C - R_D - L > 0$, then agents will increasingly select Forward strategy and the MASs will attain the stable state more quickly. In this case, increasing $P$ does not affect the final evolutionarily stable state, but it improves the rate of convergence to that state. On the contrary, if the value of $P$ is increased, it becomes harder to meet the condition of Theorem 4, that is, $P R_T + W - C - R_D < 0$; then the MASs will attain the stable state $x^*_1 = 0$ slowly. In this case, increasing $P$ does not affect the final evolutionarily stable state, but it reduces the rate of convergence to that state.

Simulation Experiments
The experimental environment is MATLAB 2015a. The simulation experiments are divided into two parts.
(1) The experiments in the first part are to verify the correctness of Theorems 2-4.
(2) The experiments in the second part are to confirm the effects of parameters $W$, $R_D$, and $P$.

Verification Experiment on Theorems 2-4.
We set parameters to meet the conditions of Theorems 2-4 as shown in Table 7. The experimental results are shown in Figures 4-7, in which we can observe the changing trends of the evolution curves of agents in MASs.
In Figure 4, meeting the condition of Theorem 2, the parameters of the two groups correspond to the triangle curve with the critical point of 0.333 and the square curve with the critical point of 0.8. For the triangle curve, the initial value of $x$ in (17) is set to 0.334, which means that 33.4% of the agents initially select Forward strategy. The rate of agents selecting Forward strategy changes continuously, and after playing the game approximately 30 times, all agents eventually select Forward strategy. Conversely, when the initial value is set to 0.332, the rate of agents selecting Forward strategy is fixed at 0% after playing the game approximately 40 times, which means that all of the agents eventually select No-Forward strategy. For the square curve, the initial values are set to 0.801 and 0.799. The experimental results confirm that, under the conditions of Theorem 2, both $x^*_1 = 0$ and $x^*_2 = 1$ are ESSs of the packet forwarding game.
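The bistable behavior around the critical point can be reproduced qualitatively with a few lines of Python. The sketch below is not the paper's MATLAB experiment: it assumes the two-strategy replicator equation takes the form dx/dt = x(1 − x)(L·x + q − L) with q = P·R_T + W − C − R_D, which is inferred from the stated theorem conditions, and it uses hypothetical values q = 2 and L = 3 so that the critical point is 1/3. Step counts here are Euler steps and are not comparable to the number of game rounds in Figure 4.

```python
def evolve(x0, q, L, steps=6000, dt=0.01):
    """Euler integration of the assumed form dx/dt = x(1-x)(L*x + q - L)."""
    x = x0
    for _ in range(steps):
        x += dt * x * (1.0 - x) * (L * x + q - L)
    return x

# Hypothetical values giving a critical point (L - q) / L = 1/3.
q, L = 2.0, 3.0

above = evolve(0.334, q, L)   # starts just above the critical point
below = evolve(0.332, q, L)   # starts just below the critical point
print(above, below)
```

Starting slightly above the critical point drives the share of Forward agents toward 1, and starting slightly below drives it toward 0, which matches the qualitative pattern described for the triangle curve.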
In Figure 5, the parameters of the third group meet the conditions of Theorem 3, and the initial values of $x$ in (17) are set to 0.01 for the triangle curve, 0.02 for the square curve, and 0.03 for the star curve. When the rate of agents selecting Forward strategy is 1%, 2%, or 3% in the initial state, after playing more than 120 times the rate of agents selecting Forward strategy is fixed at 100%. This means that all the agents eventually select Forward strategy. The experimental results confirm that, under the conditions of Theorem 3, $x^*_2 = 1$ is the only ESS of the packet forwarding game.
In Figure 6, the parameters of the fourth group meet the conditions of Theorem 4, and the initial values of $x$ in (17) are set to 0.9999 for the triangle curve, 0.97 for the star curve, and 0.8 for the square curve. When the rate of agents selecting Forward strategy is 99.99%, 97%, or 80% in the initial state, after playing more than 70 times, the rate of agents selecting Forward strategy is fixed at 0%. This means that all the agents eventually select No-Forward strategy. The experimental results confirm that, under the conditions of Theorem 4, $x^*_1 = 0$ is the only ESS of the packet forwarding game.
In Figure 7, the parameters of the fifth group meet the conditions of Theorem 5, and the initial values of $x$ in (17) are set to 0.501 for the triangle curve, 0.5 for the square curve, and 0.499 for the vertical line curve. In the initial state, when the rate of agents selecting Forward strategy is 50%, which is exactly the critical point, we can see that the rate of agents selecting Forward strategy cannot resist a small deviation caused by a disturbance and cannot converge to a stable state. Therefore, Theorem 5 is verified.

Experiments on the
We illustrate the effects on the evolutionary process of the cooperation incentive $W$, the noncooperation incentive $R_D$, and the probability of packet forwarding $P$. We observe the changing trends of the evolution curves of agents in MASs. In Figures 8-10, the curves show the effects of the cooperation incentive $W$ on the evolutionary process of packet forwarding.
In Figure 8, we can see that the critical initial value of packet forwarding evolution of agents is 0.666 if $W = 3$ and 0.333 if $W = 4$. This means that although the critical initial rate of agents selecting Forward strategy decreases from 66.6% to 33.3% as the value of $W$ increases from 3 to 4, $x^*_1 = 0$ and $x^*_2 = 1$ are still the ESSs of the packet forwarding game of MASs.
In Figure 9, when the rate of agents selecting Forward strategy is 1% in the initial state, it takes about 110 rounds of playing the game if $W = 3$ to reach the stable point $x^*_2 = 1$, but it takes only about 40 rounds if $W = 4$ to reach the same stable point. We can see from Figure 9 that, under the conditions of Theorem 3, the larger the cooperation incentive, the faster the MASs attain the ESS.
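The effect of the cooperation incentive on convergence speed can be illustrated with a step counter. This Python sketch again assumes the replicator equation has the form dx/dt = x(1 − x)(L·x + q − L) with q = P·R_T + W − C − R_D, an inference from the theorem conditions rather than the paper's stated formula. All parameter values are hypothetical and chosen so that the Theorem 3 condition q − L > 0 holds; the Euler step counts are not comparable to the round counts in Figure 9.

```python
def q_value(P, R_T, W, C, R_D):
    """Assumed net advantage term q = P*R_T + W - C - R_D."""
    return P * R_T + W - C - R_D

def steps_to_reach(x0, target, q, L, dt=0.01, max_steps=1_000_000):
    """Count Euler steps until the Forward share x first reaches the target."""
    x, steps = x0, 0
    while x < target and steps < max_steps:
        x += dt * x * (1.0 - x) * (L * x + q - L)
        steps += 1
    return steps

# Hypothetical values; only W differs between the two runs.
slow = steps_to_reach(0.01, 0.99, q_value(0.5, 10.0, 3.0, 2.0, 1.0), L=1.0)
fast = steps_to_reach(0.01, 0.99, q_value(0.5, 10.0, 4.0, 2.0, 1.0), L=1.0)
print(slow, fast)
```

Because a larger W raises the growth rate at every value of x, the run with W = 4 needs fewer steps than the run with W = 3 under these assumptions.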
In Figure 10, when the rate of agents selecting Forward strategy is 99% in the initial state, it takes more than 100 rounds of playing the game if $W = 3$ to reach the stable point $x^*_1 = 0$, but it takes more than 80 rounds if $W = 4$ to reach the same stable point. We can see from Figure 10 that, under the conditions of Theorem 4, the gain of agents selecting Forward strategy is less than that of No-Forward strategy. However, the cooperation incentive promotes agents to cooperate and forward packets to each other and, in turn, slows the rate of attaining the evolutionarily stable state of MASs.
In Figures 11-13, the curves show the effect of the noncooperation incentive on the evolutionary process of packet forwarding. In Figure 11, we can see that the critical initial value of the packet forwarding evolution of agents is 0.333 if the noncooperation incentive is 5 and 0.666 if it is 6. This means that although the critical initial value increases from 33.3% to 66.6% as the noncooperation incentive increases from 5 to 6, $x_1^* = 0$ and $x_2^* = 1$ are still the ESSs of the packet forwarding game in MASs.
In Figure 12, when the initial rate of agents selecting the Forward strategy is 1%, it takes more than 30 rounds of the game to reach the stable point $x_2^* = 1$ if the noncooperation incentive is 3, but more than 45 rounds if it is 4. Figure 12 thus shows that, under the conditions of Theorem 3, the greater the noncooperation incentive, the slower the MAS attains the ESS.
In Figure 13, when the initial rate of agents selecting the Forward strategy is 99%, it takes more than 100 rounds of the game to reach the stable point $x_1^* = 0$ if the noncooperation incentive is 3, but more than 80 rounds if it is 8. Figure 13 thus shows that, under the conditions of Theorem 4, the greater the noncooperation incentive, the faster the MAS attains the ESS.
From Figures 12 and 13 we can see that, under the same conditions, the noncooperation incentive increases the number of rounds needed to attain the evolutionarily stable state $x_2^* = 1$ and reduces the number of rounds needed to attain the evolutionarily stable state $x_1^* = 0$ in MASs. Our original intention, however, is not simply to set this parameter. Most of the time, the communication environment of agents is in the busy (backlogged) state. In this case, agents that select the Forward strategy too frequently consume their own energy to no effect, so selecting the No-Forward strategy is favorable. To encourage agents not to cooperate and to select the No-Forward strategy to save energy, a multiagent management system should set the noncooperation incentive to be inversely related to P (the probability of packet forwarding). When P is small, that is, when packet forwarding is backlogged, more agents select the No-Forward strategy and save energy. When the backlog is resolved, P becomes larger. At this time, we want more agents to select the Forward strategy to obtain more gain and keep the network communication normal. We set a threshold value, strictly between 0 and 1, that indicates whether the network communication is backlogged or not. When P exceeds the threshold, the network communication is idle. We therefore define the noncooperation incentive as the following function of P:
$$\text{noncooperation incentive} = k\,(\text{threshold} - P) + b, \qquad k > 0,\ b > 0. \tag{25}$$
In Figures 14 and 15, the curves show the effect of the probability of packet forwarding P on the evolutionary process of packet forwarding in MASs.
In Figure 14, the thirteenth group of parameters meets the conditions of Theorem 3, and the value of (17) is initialized to 0.1 for any P ∈ (0, 1). From the curves, we can see that the packet forwarding game evolution eventually attains the stable state $x_2^* = 1$. Adjusting the value of P does not affect the final evolutionary stability, but it does affect the rate of attaining the stable state: the higher the value of P, the faster the MAS attains evolutionary stability.
In Figure 15, the fourteenth group of parameters meets the conditions of Theorem 4, and the value of (17) is initialized to 0.99 for any P ∈ (0, 1). From the curves, we can see that the packet forwarding game eventually attains the stable state $x_1^* = 0$. Adjusting the value of P does not affect the final evolutionary stability, but it does affect the rate of attaining the stable state: the higher the value of P, the slower the MAS attains evolutionary stability.
In Figure 16, as P changes, the noncooperation incentive changes according to (25), where we set the threshold to 1/2, k = 4, and b = 2. The other data are the same as for the fourteenth group in Table 8 and meet the conditions of Theorem 4. We set the initial ratio of agents selecting the Forward strategy to 99%. From Figure 16, we can see that when P becomes smaller than about 0.8, the noncooperation incentive increases, which prompts more agents to select the No-Forward strategy and speeds up convergence to the stable state $x_1^* = 0$. When P increases to about 0.8, the network communication becomes good, agents change their strategy from No-Forward to Forward, and the packet forwarding game eventually attains the stable state $x_2^* = 1$. This agrees with our original idea of describing the noncooperation incentive by (25): when the network is busy (backlogged), agents select the No-Forward strategy to save energy, and when the network becomes idle, agents select the Forward strategy to keep the communication normal.
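The linear rule (25) is simple enough to evaluate directly. The following is a minimal sketch using the values quoted for Figure 16 (threshold 1/2, k = 4, b = 2); the function and argument names are hypothetical and are not taken from the authors' MATLAB code.

```python
def noncooperation_incentive(p, threshold=0.5, k=4.0, b=2.0):
    """Linear rule (25): incentive = k * (threshold - p) + b, with k > 0, b > 0.

    p is the probability of packet forwarding, strictly between 0 and 1.
    The defaults are the values quoted for Figure 16; the argument names
    are hypothetical.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if k <= 0 or b <= 0:
        raise ValueError("k and b must be positive")
    return k * (threshold - p) + b


if __name__ == "__main__":
    # A backlogged network (small P) yields a larger incentive not to forward.
    for p in (0.2, 0.5, 0.8):
        print(f"P={p:.1f} -> incentive={noncooperation_incentive(p):.2f}")
```

With these values the incentive falls linearly from 3.2 at P = 0.2 to 2.0 at the threshold and 0.8 at P = 0.8, so a backlogged network favors the No-Forward strategy more strongly than an idle one.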

Conclusion
The packet forwarding mechanism is an important aspect of intelligent task allocation and collaborative work research in MASs. In this paper, based on evolutionary game theory, we have studied packet forwarding strategy decisions and the evolutionary process. Considering the real network communication environment, we have explored the probability of packet forwarding while the agent is idle or busy (backlogged). For the case in which the agent is in the busy state, we have introduced a noncooperation incentive to encourage agents to select the No-Forward strategy to save energy. We have also built a packet forwarding evolutionary game model in MASs. We have stated and proved theorems giving the conditions for attaining steady states and have analyzed the effects of the parameters on the rate of convergence to the ESSs. Lastly, the results of simulation experiments in MATLAB have verified the effectiveness of the proposed theorems and the incentive mechanisms.
We have established rules for packet forwarding decision-making in the process of dynamic evolution in MASs. In subsequent work, we can further study the relation between the value of the noncooperation incentive of agents and the probability that agents are in the idle state. The proposed packet forwarding strategy game model provides a method for network safety.

Figure 1: The state transition diagram of the Markov chain of an agent.
Packet Forwarding Game Model.From Theorem 3, if   +  −  −   − L > 0, then x=1 is the only ESS of the evolutionary game in MASs.Under this certain communication environment, we set P to a constant value.By increasing either   or , or by decreasing either ,   , or L in the formula, we can meet the condition of Theorem 3, which forces agents to avoid selecting the No-Forward strategy.As the value of  is increased to meet the condition of Theorem 3 (regardless of which strategy the

Figure 4: The evolution curves of the packet forwarding game (1).
Effect of Parameters. We set several groups of parameters to meet the conditions of the related theorems. Groups 1-6 vary the value of W and keep the other parameter values constant. Groups 7-14 vary the value of the noncooperation incentive and keep the other parameter values constant. Groups 13-14 vary the value of P and keep the other parameter values constant. The details are shown in Table

Figure 8: Evolution curves of the packet forwarding in terms of W (1).

Figure 9: Evolution curves of the packet forwarding in terms of W (2).

Figure 10: Evolution curves of the packet forwarding in terms of W (3).

Figure 11: Evolution curves of the packet forwarding in terms of the noncooperation incentive (1).

Figure 12: Evolution curves of the packet forwarding in terms of the noncooperation incentive (2).

Figure 13: Evolution curves of the packet forwarding in terms of the noncooperation incentive (3).

Figure 14: Evolution curves of the packet forwarding in terms of P (1).

Figure 15: Evolution curves of the packet forwarding in terms of P (2).

Figure 16:

Table 4: Payoffs of only one agent selecting the Forward strategy.

Table 5: Payoffs of two agents both selecting the No-Forward strategy.
strategy, only one agent selects Forward strategy, or both interactive agents select No-Forward strategy.These situations are discussed in what follows.