A Multiagent Cooperation Model Based on Trust Rating in Dynamic Infinite Interaction Environment

To improve the liveness of agents and enhance trust and collaboration in multiagent systems, a new cooperation model based on trust rating in a dynamic infinite interaction environment (TR-DII) is proposed. The TR-DII model is used to control agents’ autonomy and selfishness and to make agents take rational decisions. The model is based on two important components: a dynamic repeated interaction structure and a trust rating. The dynamic repeated interaction structure is formed with multistage inviting and evaluating actions. It transforms agents’ interactions into an infinite task allocation environment, in which the controlled and renewable cycle is a component that most agent models have ignored. Additionally, it influences expectations and behaviors of agents that may not appear in a one-shot interaction but may appear in long-term cooperation. Moreover, together with a rewards and punishments mechanism (RPM), the trust rating (TR) is proposed to control agent blindness in the selection phase. Here RPM, rather than reputation as other models have suggested, is the factor that directly influences decisions. Meanwhile, TR can monitor whether an agent’s status is trustworthy or untrustworthy. It also refines the notion of agent disrepute in a way that other models have ignored. Finally, a grids puzzle experiment is used to test the TR-DII model, and four other models are used for comparison. The results show that the TR-DII model can effectively adjust the trust level between agents and makes the solvers more trustworthy and more rational in their choices. Moreover, through interaction result feedback, the TR-DII model can adjust the income function to control cooperation reputation and achieve closed-loop control.


Introduction
An agent is a program module that emulates human consciousness. As a problem solver, each agent has certain functions and behaviors and can also make its own decisions and handle affairs. When an agent fails to solve a problem, requests, cooperation, and negotiation unfold among multiple agents [1]. Multiagent systems (MAS) play an important role in the treatment of complex large-scale problems. A MAS is like a microcosm of society, in which agents live, work, communicate, bargain, and watch out for others [2,3]. Furthermore, they want to survive sustainably and, meanwhile, to achieve their own values and collective goals. However, because a large number of homogeneous and heterogeneous agents exist in a MAS, resources and power are limited in their distribution [4,5]. Agents need to learn to cooperate and share when they have problems or want to stay active. They always keep an eye on their liveness, which means that an agent has been assigned a task and will not be eliminated by the multiagent system. Additionally, calling for others' help can be difficult for them, because they sometimes do not know each other very well, especially agents with different functions. Hence, in the absence of personal experience, the choice of a cooperator often has to be based on referrals from others [6,7]. Acting like humans, agents will always trust this referral as well as the cooperation result. Therefore, reputation can be considered as a collective measure of trustworthiness based on the referrals or ratings from members in a community [7,8].
However, the success of cooperation cannot depend only on mutual trust and recommendation. Benefits and self-interests could change the decision and outcome [9,10]. Owing to its autonomy, the referred agent will judge gains and losses before promising to cooperate. An autonomous agent with rational sense tends to consider future expectations in decision-making. As such, self-interest could make agents show unstable behaviors, betray partners, and even spare no effort to acquire any favorable opportunity [11]. Even though they have already indicated their determination to cooperate, such a self-interested decision will destroy their reputation, recommended ranking, and next opportunity to cooperate [12,13].
On the other hand, time is a good measure for checking agent decisions. Agents with fixed utilities [14] that focus on short-term interests in a finite interaction will magnify their self-interest, producing an unstable strategy [15]. When some referred agents consider long-term interests, they may not be traitors, even if they were before [15]. On the contrary, some honest agents with good reputation will turn into traitors, because they believe that there will still be cooperation in the future. As a result, both time and the number of interactions need to be taken into account in MAS.
Therefore, a multiagent cooperation model based on trust rating with dynamic infinite interaction (TR-DII) is proposed in this paper. The infinitely repeated interaction structure is formed with multistage inviting and evaluating actions. According to the cooperation priority, cooperating agents can be selected proactively. Moreover, the trust rating is proposed to control agents' selfishness in selection and their reputation, making them plan stage decisions more rationally. In addition, the cooperation priority can be adjusted through feedback of the interaction results to achieve closed-loop control.
The rest of the paper is organized as follows: Section 2 describes the most representative related models. The multiagent cooperation model is presented in Section 3, and a computational example is given in Section 4. The analyses and results are also discussed in Section 4. Finally, Section 5 contains the conclusions and recommendations for future work.

Background
To find a good cooperator, an agent often trusts referrals from others. Reputation is a vital measure of trust by other agents in a network [7,8]. It is the public opinion of an agent, which comes from the experiences of the agents that have worked with it before. Reputation is an accumulated measure, and it represents ability and honesty in two ways [16,17]. One is whether agents agree to cooperate; the other is whether they finish the cooperation. SPORAS [18] is a reputation mechanism for a loosely connected environment in which agents share the same interests. This model suggests a recursive reputation rating method in which recent ratings carry more weight than previous ratings. However, three main limitations exist in this model [18-20]. First, the number of collaborators who give ratings is limited to two agents. Second, an agent with a good reputation would be more self-interested, with smaller rating changes than one with a low reputation. Third, the reputation gives more attention to recent interactions and so ignores earlier interactions [20]. An integrated reliability-reputation model (TRR) [21] computes the reputation based on the agents that interacted previously. It focuses on every interaction and holds that a referral by agents with higher reputation is more credible than one by agents with lower reputation. However, both SPORAS and TRR treat agent decisions as purely based on ratings and reputation ranks, ignoring the fact that rankings are not the only factors influencing decision-making. The decision to cooperate is not arbitrary but is subject to certain restrictions. These restrictions are considered as rewards and punishments in MAS. After multilevel accumulation, rewards and punishments affect the reputation value and also the decision. Reputation ratings, by contrast, are only an indirect influence factor. Therefore, rewards and punishments are considered in the TR-DII model.
In MAS, agents are autonomous and behave out of self-interest, so they could be dishonest and incapable and be given disrepute [22,23]. Dissatisfying decisions in agent interactions constitute the disrepute [24]. It can serve as another rating to record agents' behaviors. If the disrepute value is low, the referred agent is considered reputable, and otherwise disreputable. The formal trust model (FTM) [25] divides the outcomes of past interactions into positive (satisfying) and negative (dissatisfying). It calculates the trust of each agent based on the posterior probability of previous interaction outcomes. The reputation-disrepute-conflict (R-D-C) model, an extension of the FTM model [19], additionally includes the interaction times, the interaction frequency, and the number of rater agents. The R-D-C model considers the three evaluation indicators together to find which agents are more trustworthy.
The literature review shows that FTM, R-D-C, and a trust-oriented mechanism [26] all focus on distinguishing between trustworthy and untrustworthy agents. They consider that agents can be categorized into these two statuses with some probability after several conflicts. However, what matters is not the probability of an agent's category; it is the probability of a positive or negative decision in each interaction, which makes agents trustworthy or untrustworthy. SPORAS, FTM, and R-D-C regard disrepute as dishonesty and incapability. However, owing to self-interest, an agent may earn disrepute in different ways, such as being dishonest but capable, or being dishonest and incapable. R-D-C recognizes that interaction time and frequency change agents' attributes. Nevertheless, a defined interaction cycle has more influence on agent decisions as well as on reputation rating. According to previous interaction experiences, agents adjust the probability of rationality in their autonomy based on the trust ranking. Autonomy means that an agent can perceive dynamic conditions in the environment, execute actions affecting the environment, and gradually establish its own activity plan to cope with possible environmental changes in the future. Meanwhile, this result affects the ranking itself. Therefore, the current study was undertaken with the aim of improving the liveness of agents based on the TR-DII model to control trust and cooperation reputation in MAS.

Dynamic Interaction Cooperation Model Based on Trust Degree
... from other agents. Initiative and interactivity help other agents to respond rapidly, accept the query, and deploy the plan. Nevertheless, following bionic thinking, an agent has a certain degree of selfishness, which is affected by two important factors in response to interactions. First, when there are many homogeneous agents in the system, any agent is afraid of staying idle for a long period, with few tasks assigned to it by the system. That could make the agent obsolete after the system phase update. Second, in interactive cooperation with other agents, if the numbers of requests made and received are too low, the agent's reputation rating in the system or hierarchy module suffers and the agent is left out in the cold. Therefore, an agent sometimes accepts a request without thinking about whether it can complete it, which leads to a loss for the requesting agent. When the requesting agent becomes aware of the blind decision made by the requested agent, later interactive cooperation is affected. Hence, cooperation activities in MAS can be abstracted as the multistage interaction structure shown in Figure 1, which is formed through repeated invitation and assessment. When the interaction structure appears repeatedly, cooperative behavior that may not appear in a one-shot game may appear in a repeated interaction [11]. When the interaction structure is extended to the whole task planning of the MAS, the number of cooperation interactions fluctuates unpredictably owing to the randomness of tasks. Hence, it evolves into a cooperation model based on dynamic infinite interaction (DII) in a defined interaction cycle. Suppose the number of agents in MAS is $n$; then the number of cooperation ways is $2^{n}$. Figure 1 shows the cooperation interaction of two agents. Suppose agent $i$, agent $j \in \Gamma$. If agent $i$ cannot finish the current task, it requests help from agent $j$; meanwhile, agent $i$ gives agent $j$ a certain amount of reward ($r$). If agent $j$ accepts the request, it is rewarded with $r$; alternatively, it can choose to take another agent's request, and the resulting opportunity cost is op. When agent $j$ accepts agent $i$'s request based on rational choice, it can collaborate with agent $i$ to complete the system assignment. In that case agent $j$ is rewarded with $r$, and agent $i$ makes a profit of $R$, with net profit $R - r$. If agent $j$ accepts the request with selfishness and blindness, the cooperation project cannot be finished.
In that case agent $j$ still reaps the benefit $r$, while agent $i$ suffers an economic loss of $-r$. Encouraging and restricting agents' cooperation expectations and selfishness therefore has an important effect on which kind of equilibrium the cooperative interaction reaches.
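The stage outcomes described above can be sketched as a small payoff function. This is only an illustration: the names `Response`, `stage_payoff`, `profit`, `reward`, and `opportunity_cost` are hypothetical, and the rejection payoff (0, op) is taken from the description of the grids puzzle experiment later in the paper.

```python
from enum import Enum


class Response(Enum):
    """Possible reactions of the requested agent (labels are illustrative)."""
    EXECUTE = "accept and execute"
    BLIND = "accept without executing"
    REJECT = "reject"


def stage_payoff(response, profit, reward, opportunity_cost):
    """Return (requester payoff, requested-agent payoff) for one interaction.

    profit           -- what the requester earns if the task is completed
    reward           -- what the requester pays the requested agent
    opportunity_cost -- what the requested agent earns by serving someone else
    """
    if response is Response.EXECUTE:
        # Rational cooperation: requester keeps the net profit.
        return (profit - reward, reward)
    if response is Response.BLIND:
        # Blind acceptance: reward is paid but the task is not completed.
        return (-reward, reward)
    # Rejection: requester gains nothing, requested agent takes the alternative.
    return (0, opportunity_cost)
```

For example, with a profit of 10, a reward of 4, and an opportunity cost of 3, rational execution yields (6, 4), while blind acceptance yields (-4, 4).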

Cooperation Mechanism Based on Trust Rating
In the interaction of a multiagent system, an agent's individual income depends not only on the benefit of a one-shot interaction but also on the interaction times, such as long-term cooperation history and future expectation. Based on this, the trust rating (TR) is proposed as an evaluation indicator to control the degree of agent rationality and to improve the cooperation prospect. The trust rating evaluates the degree of mutual trust between agents. It represents the mutual trust index between two agents: the higher TR is, the greater the mutual tolerance for mistakes. Additionally, the allowed ratio of noncooperative or nonrational choices is higher, and agents pay more attention to long-term interests; the converse holds for a lower TR. Suppose the total number of applications for cooperation is AC and the number of unsuccessful cooperations is NSC, where a cooperation may be unsuccessful because of the other agent's rejection or blindness. The trust rating is given in (1), where TR describes the trust information of the agent cooperation interaction. Through numerical adjustment of TR, the agent that requests cooperation can retain a certain degree of confidence in the requested agent and hope to cooperate with it next time, even after being refused or betrayed by it several times. Moreover, the value of TR can effectively control the degree of rationality of the requested agent. For long-term cooperation, in order to improve the degree of trust of the other agents, TR forces the requested agent to adopt a rational game strategy. According to the definition, TR is limited to the interval [0, 1] and is given the role of a discount factor when finding the equilibrium of the interaction structure. Suppose the requested agent $j$ makes the rational choice in every stage of the interaction structure and considers "execution cooperation" to be the optimal decision. The present value of its pure income in the infinitely repeated interaction is $V^{j}_{e}$, which can be represented by (2) and then turned into (3). When the requested agent receives a cooperation request, if it has the blind competition attribute, it makes a "nonexecution cooperation" decision. This decision not only makes the requesting agent suffer a loss and issue no further requests for cooperation, but also reduces the requested agent's revenue and lowers its trust ranking. Meanwhile, the agent is given disrepute for its dishonesty, even though it is capable of executing the cooperation. For example, if agent $j$ considers the nonexecution decision to be the best choice, the present value of its income $V^{j}_{une}$ is given by (5). If agent $j$ considers that the execution decision benefits its own value and profit, then $V^{j}_{e} \ge V^{j}_{une}$, so

$\mathrm{TR}\,(r - \mathrm{op}) \ge 0$ (7)

By (7), TR ≥ 0. If $(r - \mathrm{op}) \ge 0$, namely, the reward $r$ given to agent $j$ by the requesting agent $i$ is greater than or equal to the opportunity cost op obtainable from other agents, agent $j$ regards the execution cooperation decision as the system equilibrium in the infinitely repeated interaction, since the benefits of cooperation meet the expected revenue of cooperation.
If agent $i$ gets help from agent $j$ and successfully completes the task, $V^{i}_{e}$ is given by (8). If the requesting agent $i$ finds that agent $j$ acts blindly, which would make agent $i$ suffer a loss in requesting cooperation, it stops the cooperative task and shields agent $j$; then $V^{i}_{une}$ is given by (9). If choosing agent $j$ to complete the task together is the optimal choice for agent $i$, then $V^{i}_{e} \ge V^{i}_{une}$, which yields (10). If $R > r$, that is, the requester's profit $R$ exceeds the reward $r$ it pays, cooperation between agents can be successfully established, and the limit of TR can be set to 1, which is the highest value in the range allowed by the TR definition. Hence, if a request for cooperation satisfies the condition under which cooperation can be built, namely, $R > r$, agent $i$ and agent $j$, with TR ∈ [0, 1], regard the decision to implement cooperation as the system equilibrium.
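The displayed equations for the present values did not survive in this copy of the text. The following is one reconstruction that is consistent with inequality (7) and with the condition $R > r$; it is a sketch under stated assumptions, not necessarily the authors' exact formulas. It assumes that TR acts as the per-stage discount factor with $0 \le \mathrm{TR} < 1$, that $r$ is the reward paid by the requesting agent $i$ to the requested agent $j$, that $R$ is the requester's profit, that a blind agent $j$ collects $r$ once and earns op in every later stage, and that a betrayed agent $i$ loses $r$ once and then shields agent $j$. The symbols $V^{j}_{e}$, $V^{j}_{une}$, $V^{i}_{e}$, and $V^{i}_{une}$ are notation chosen here for the four present values.

```latex
\begin{align*}
V^{j}_{e}   &= r + \mathrm{TR}\,r + \mathrm{TR}^{2} r + \cdots
             = \frac{r}{1-\mathrm{TR}}, \\
V^{j}_{une} &= r + \mathrm{TR}\,\mathrm{op} + \mathrm{TR}^{2}\,\mathrm{op} + \cdots
             = r + \frac{\mathrm{TR}\,\mathrm{op}}{1-\mathrm{TR}}, \\
V^{j}_{e} - V^{j}_{une}
            &= \frac{r - r(1-\mathrm{TR}) - \mathrm{TR}\,\mathrm{op}}{1-\mathrm{TR}}
             = \frac{\mathrm{TR}\,(r-\mathrm{op})}{1-\mathrm{TR}} \ge 0
             \iff \mathrm{TR}\,(r-\mathrm{op}) \ge 0, \\[4pt]
V^{i}_{e}   &= \frac{R-r}{1-\mathrm{TR}}, \qquad V^{i}_{une} = -r, \\
V^{i}_{e} \ge V^{i}_{une}
            &\iff R - r \ge -r\,(1-\mathrm{TR})
             \iff \mathrm{TR} \le \frac{R}{r} \quad (r > 0).
\end{align*}
```

Under these assumptions, $R > r$ makes the bound $R/r$ exceed 1, so every admissible TR satisfies the requester's condition, which matches the statement that the limit of TR can be set to 1.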
The degree of trust and the expected value of cooperation between agents can be analyzed through TR. The smaller the value of TR, the fewer unsuccessful cooperations are allowed; agents pay more attention to immediate interests, and the level of mutual trust is lower. If the actual ratio of unsuccessful cooperation is greater than TR, the cooperation no longer exists. TR = 0 is one limit value of cooperation: as soon as one cooperation is unsuccessful, the two agents never trust each other again and cancel cooperation. This can be described as a "grim strategy" [27]. This strategy places high demands on rational choice between agents. It not only requires the requested agents to grant whatever is requested but also obliges them to perform their duties and complete the task according to the plan of cooperation. If not, there will be no more cooperation. On the contrary, a high value of TR indicates that both agents pay more attention to long-term interests, and it can also improve the level of mutual trust. A high value of TR can promote cooperation. That is to say, if the ratio of refused or nonexecuted cooperation is less than the preset value of TR, the requesting agent forgives the requested agent's past misdeeds and requests cooperation again, and the requested agent also enters a new round of the interaction. TR = 1 is the other limit value of cooperation, and it represents infinitely repeated interaction cooperation without considering the cooperation history. With TR = 1, however, the interaction system can be abstracted as a static game environment, with strong randomness and blindness in agents' game choices. In a static game environment, an agent does not know other agents well and has no interaction records. This could decrease the efficiency with which the system completes tasks. Therefore, TR can effectively control agents' cooperation.
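The role of TR as a tolerance threshold can be illustrated with a short sketch. It assumes that the quantity compared with TR is the ratio NSC/AC of unsuccessful cooperations to applications, and that a ratio exactly equal to TR is still tolerated; both points are assumptions, since the defining equation is not legible in this copy. The function names are hypothetical.

```python
def unsuccessful_ratio(nsc: int, ac: int) -> float:
    """Share of cooperation applications that were unsuccessful (NSC / AC)."""
    if ac <= 0:
        return 0.0  # no applications yet, so nothing counts against the partner
    return nsc / ac


def keeps_requesting(nsc: int, ac: int, tr: float) -> bool:
    """True if the requester still asks this partner under tolerance TR.

    tr = 0 behaves like a grim strategy: one failure ends cooperation.
    tr = 1 ignores the history entirely (static game environment).
    """
    if not 0.0 <= tr <= 1.0:
        raise ValueError("TR must lie in [0, 1]")
    return unsuccessful_ratio(nsc, ac) <= tr
```

With `tr = 0.0`, a single failure in ten applications already stops further requests, whereas with `tr = 1.0` even ten failures in ten applications do not.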
Meanwhile, in the system, the reputation rank among homogeneous agents can be composed of the accumulated revenues gained in the cooperative interaction stages. When an agent's rank is higher, the agent's value is larger; in the cooperative game, such agents have good mutual trust and try a variety of cooperations independently, with a wide range of options. Beyond this, they keep higher flexibility in replying to cooperation requests and in choosing to execute tasks. Otherwise, when the ranking level is low, in order to avoid becoming isolated individuals and being eliminated by the system, agents show strong rationality and homoplasy as well as low flexibility in responding to cooperation and choosing to execute tasks. In addition, agents tend to respond cooperatively for the sake of increasing the success rate of cooperation and improving their reputation rank. Consequently, the infinitely repeated interaction concept based on trust rating (TR-DII) introduced in MAS accelerates agents' cooperation and improves the liveness of agents. Through a closed-loop adjustment between cooperation reputation and trust rating, the ability of agents to make rational choices is reinforced.

Algorithm 1: TR-DII.
Input: TR, reward, profit, op, reputation
Output: adjusted TR, reward, profit, op
(1) Initialize the system agents (members of Γ) and the tasks
(2) for each task do
(3)   Define the set of cooperation agents for the task
(4)   Split it into two-agent interaction structures // use one agent of the set as the requester
(5)   for each structure do
(6)     Define the interaction participants (the requesting agent and the requested agent)

Algorithm Description.
The procedure of interaction, negotiation, and cooperation among agents in complex systems is described in Algorithm TR-DII (see Algorithm 1).
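Only the first steps of Algorithm 1 are legible in this copy, so the following is a minimal sketch of one task-allocation stage built from the surrounding description rather than the authors' algorithm. The function `run_stage`, its parameters, the default payoff values, and the caller-supplied `decide` callback are all hypothetical; the TR update at the end of a stage is left out because its rule is not given in the text.

```python
def run_stage(tasks, trust, decide, profit=10.0, reward=4.0, opportunity_cost=3.0):
    """Run one stage of two-agent interactions.

    tasks:  list of (requester, [partners]) pairs, one per task
    trust:  dict mapping (requester, partner) to a TR tolerance in [0, 1]
    decide: callable(requester, partner) -> "execute", "blind" or "reject"
    Returns (income per agent, {pair: [applications, unsuccessful]}).
    """
    income = {}
    counts = {}
    for requester, partners in tasks:
        # A multi-partner request is split into two-agent structures.
        for partner in partners:
            pair = (requester, partner)
            ac, nsc = counts.setdefault(pair, [0, 0])
            if ac > 0 and nsc / ac > trust[pair]:
                continue  # tolerance exceeded: the requester shields this partner
            choice = decide(requester, partner)
            if choice == "execute":
                gain_i, gain_j = profit - reward, reward
            elif choice == "blind":
                gain_i, gain_j = -reward, reward
            else:  # "reject"
                gain_i, gain_j = 0.0, opportunity_cost
            counts[pair][0] += 1
            if choice != "execute":
                counts[pair][1] += 1  # rejection and blindness both count as unsuccessful
            income[requester] = income.get(requester, 0.0) + gain_i
            income[partner] = income.get(partner, 0.0) + gain_j
    return income, counts
```

With a tolerance of 0 and a partner that always accepts blindly, three identical tasks result in a single interaction, after which the partner is shielded; with a tolerance of 1 all three interactions take place.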

Algorithm Complexity.
A dynamic interaction structure consists of a set $N = \{1, \ldots, n\}$ of agents and a set $\{S_i\}$ of interaction strategies. As dynamic interactions are implemented in a branching structure, the branching exploration is of complexity $O(m)$, where $m$ is the number of nodes. As shown in Figure 1, the number of nodes of the branching is bounded by $m \le 1 + \sum \ldots$, where $A_1$ and $A_2$ are the action sets of agent 1 and agent 2 before starting the interaction. The branch of each node represents a possible cooperation choice that can be made by an agent. After each choice of agent $i$, agent $j$ explores a possible combination of cooperation, including positive and negative interaction. This limits the size of the strategy set of agent $i$, so the complexity of a dynamic interaction structure is equal to $|S_i| \cdot |S_j| = s^{2}$, where $s$ is the common size of the two strategy sets. Therefore, the complexity $O(m)$ of a dynamic interaction structure is equal to $O(s^{2})$. However, the requesting agent can ask more than one agent for cooperation; such a request can also be split into two-agent interactions, as shown in Figure 2. Therefore, the same interaction structure is repeated until the allocation of all tasks in the multiagent environment has been completed. That is to say, in the limit of infinite interaction, the complexity of the system is equal to $O(s^{2})$.

Results of TR-DII.
The grids puzzle is used to test the effectiveness of the infinite interaction cooperation agent mechanism based on trust rating. The grids puzzle (shown in Figure 2) can be used to describe an infinitely repeated dynamic interaction environment. Suppose there are four agents in the system. The agent cooperation interaction structure is set as the basic test unit in the grids puzzle. An agent's initial position in the grid is set as the initial point of stage task allocation in MAS, and the initial position is random. Likewise, an agent's terminal position in the grid is set as the terminal point of stage task allocation. As shown in Figure 2, solid color grids are the initial positions: Agent A and Agent B start on the left, and Agent C and Agent D start on the right. Plaid grids are the terminal positions: Agent A and Agent B end on the right, and Agent C and Agent D end on the left. Agents can walk freely and choose any neighboring grid. During free movement, a meeting of agents represents a task distributed by MAS that needs negotiation for cooperation. If an agent accepts and executes the cooperation, it has performed the task. If an agent accepts the cooperation but does not execute it, it has been blind in making its decision. Similarly, if an agent rejects the cooperation, the agents cannot fulfil the task. The randomness of meetings expresses the dynamics of cooperative interaction and the unlimited repeatability of system task allocation. Besides, different values of TR control the degree of rationality in choosing to perform cooperation.
In this paper, the MAS uses four agents as the basic testing unit; this unit describes the agent cooperation relationship well. According to different initiators, there are 64 kinds of cooperative combinations. Furthermore, based on the sequence of completing tasks, the status of cooperation and execution, and the rationality of game selection, the forms of cooperation can be refined into 312 types. Figure 2 shows the action choice space with six kinds of actions, {up, down, upper left, lower left, upper right, lower right}, for agent A, agent B, agent C, and agent D; the action choice is random.
To test the effectiveness of TR, the experiment takes agent A as the initiator of cooperation. That is, when agent A meets other agents, agent A initiates cooperation. If the agents choose to cooperate, they remain in the same grid. If the cooperators rationally execute the cooperation, they obtain revenues $(R - r, r)$ as the equilibrium, where $R$ is the requester's profit and $r$ is the reward paid to the requested agent. If the requested agent decides blindly and does not perform the cooperation, the agents gain revenues $(-r, r)$, respectively, which causes a loss to the requesting agent. On the contrary, if the requested agent rejects cooperation, the two agents go back to their previous grids and acquire revenues $(0, \mathrm{op})$ as the equilibrium. When the four agents have completed all their own tasks, the MAS reaches the end of the stage task assignment. The end node of this experiment is reached when all four agents reach their finish lines. This also constitutes a defined interaction cycle.
The initial value of TR is set to 1. It depicts a static game environment, in which agents have rational sense and self-interest and make decisions with equal probability of rational choice. When evaluating the system cooperation reputation rank, the system uses (11) as the evaluation standard according to the earnings of agents. In (11), a weight ratio combines three revenue terms: se is the revenue of actually executed cooperation (some cooperation is not executed because of blindness and self-interest), ac is the revenue if all cooperation were accepted and executed, and pc is the revenue if all accepted cooperation were executed. Equation (11) can be simplified into (12), in which se/ac is the ratio of the number of actually executed cooperations to the total number of applications for cooperation; it describes the assessment of the trust level between agents for different values of TR. Similarly, se/pc is the ratio of the number of actually executed cooperations to the total number of accepted cooperations; it depicts the restriction of the degree of rationality in agent game selection under different TR.
Using the ratio combination as the ranking standard of cooperation reputation increases the comparability between agents and eliminates the randomness of combinations. When agent A is the cooperation initiator, the system considers 7 types of cooperation formations: AB, AC, AD, ABC, ABD, ACD, and ABCD. For each value of TR, each test is run 100 times, and the average value is recorded in the test results (because the interaction times of ABCD are too low, this combination is not considered); the weight ratio is equal to 0.5. Table 1 shows the cooperation results of AB, AC, AD, ABC, ABD, and ACD in a static system. Table 2 gives the assessment of system stage cooperation priority in accordance with the cooperation results. In a static game environment, rational choice is an equiprobable execution choice, namely, the rate of executing cooperation without any restriction by TR. After the system phase task allocation has finished, the requesting agent updates the TR values of the requested agents in line with the stage cooperation ranking. The updated result is shown in Table 3. Moreover, agents prescribe a limit to rational choice in order to improve their cooperation reputation ratings, to enhance cooperation benefits and times, and to avoid being neglected by other agents and elimination by the system.
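The ratio-based ranking can be sketched as follows. The exact form of the evaluation formula is not legible in this copy, so the sketch assumes a weighted sum of the two ratios se/ac and se/pc, with the weight applied to se/ac; with the weight 0.5 used in the experiment, the assignment of the weight does not matter. The function names and the sample counts are hypothetical and are not taken from the paper's tables.

```python
def reputation_score(se: float, ac: float, pc: float, weight: float = 0.5) -> float:
    """Weighted combination of the trust ratio se/ac and the rationality ratio se/pc.

    se -- number of cooperations actually executed
    ac -- number of applications for cooperation
    pc -- number of accepted cooperations
    """
    if ac <= 0 or pc <= 0:
        raise ValueError("ac and pc must be positive")
    return weight * (se / ac) + (1.0 - weight) * (se / pc)


def rank_combinations(results: dict) -> list:
    """Order cooperation combinations from highest to lowest score.

    results maps a combination label to its (se, ac, pc) counts.
    """
    return sorted(results, key=lambda name: reputation_score(*results[name]), reverse=True)
```

For instance, a combination with 30 executed, 100 applied, and 60 accepted cooperations scores 0.5 · 0.3 + 0.5 · 0.5 = 0.4.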
As shown in Table 3, TR is assigned different values according to the agent cooperation reputation. Interestingly, the higher the cooperation reputation rating, the larger the TR, and the greater the tolerance of noncooperation or nonexecution that agents allow. This reflects the willingness to pay more attention to long-term cooperation between the agents. When TR is lower, the credibility of agents in MAS is likewise lower. Therefore, to improve their cooperation level, agents adopt a higher rational choice ratio. If the rational choice ratio is equal to 1, the requested agents execute all accepted cooperation, and the agents are completely rational game players without selfishness or blindness. In the other situation, if the rational choice ratio is equal to 0.5, agents have a high cooperation reputation rating. Such agents have strong selectivity and selfishness in their game stages and are free to choose whether to execute cooperative tasks in the light of benefits and opportunity costs. According to the updated values in Table 3, a new stage of cooperation evaluation is shown in Table 4.
After analyzing Tables 4 and 5, it can be found that the updated TR causes agents to make corresponding adjustments to their rational choice ratios. Taking the three combinations AB, ABD, and AD as examples, rational choice leads to executing cooperation tasks at a high ratio, and the higher frequency of executed cooperation improves its ratios to accepted cooperation and to applications for cooperation. Thus, the value of se/pc increases, and the agents turn into rational players with a raised cooperation reputation rank. However, ABC, ACD, and AC have higher values of TR; these agents have strong flexibility and preoccupancy ability, which changes the cooperation reputation after the stage update. The experiment also reflects the higher autonomy of agents in MAS.
In the meantime, in order to increase reputation and trust ratings, the number of accepted cooperations among all applications for cooperation increases (shown in Figure 3 and Table 6). That is to say, the frequency with which agents coexist in one grid rises, whereas the frequency of going back to the previous step falls. As shown in Figure 3 and Table 6, AB, AD, and ABD were in a dangerous position in the last stage, with low reputation and trust ratings. To escape from this position, AB, AD, and ABD increased the number of applications they fulfilled. The blue plaid stacks are much higher than the blue stacks, and so are the yellow plaid columns. ABD, AB, and AD have good reputations in this new stage, with a more generous rational choice range. In a real environment, however, they would still make highly rational choices to maintain their high reputation. This puts other agents at low reputation ratings, so that they have to make rational choices all the time. This adjustment and cycle makes agents more active and improves the liveness of MAS.
Consequently, abstracting this working condition into an actual MAS, if agents reject cooperation, the item may be left unprocessed or need to be redistributed. This not only wastes system resources and communication costs but also damages the degree of trust between agents. Therefore, adjusting TR can tune the cooperation tendency and selection ability in MAS. Furthermore, it can make agents strongly cooperative and competitive and ensure the operating efficiency and stability of the system.

Comparison of TR-DII with Existing Models.
To verify the performance of the TR-DII model, four commonly used models, R-D-C, FTM, SPORAS, and TRR, were applied in comparative experiments. In the comparative experiments, the participating agents had semirational characteristics: they were 50% trustworthy and 50% untrustworthy. The five models were tested under four trust ratings, which include the two limit states, grim strategy and static strategy. Additionally, a synthetical evaluation value was used to reflect the performance of the models. The comparison results are shown in Table 7.
As shown in Table 7, the synthetical evaluation values of TR-DII under the four trust ratings were higher than those of the others. This indicates that the performance of TR-DII is better. TR-DII considers reputation, disrepute, and conflict based on each interaction, no matter whether it is positive or negative, whereas FTM, SPORAS, and TRR did not consider the negative interactions. TR-DII stores and displays agents' interactions and reputation openly and transparently in the system. Each agent, as a requester, can think, interact, and choose a cooperative partner independently and has stronger autonomy. R-D-C, however, is more dependent on advisors, and an advisor is sometimes not very sure about the attributes of an agent, so its advice has a certain error probability. That makes it harder to choose an agent that can be trusted and collaborated with. Additionally, TR-DII considers disrepute in a different way, such as the dishonesty of an agent that has the ability to complete the task. Other models do not consider this situation.
Therefore, the experimental results show that TR-DII has a better performance to achieve more successful cooperation.Figure 4 shows the synthetical evaluation of cooperation under four different trust ratings.It demonstrates that more

Conclusions
Aiming at improving the liveness of agents and addressing the lack of predictable forecasts about future cooperation, a multiagent cooperation model based on trust rating with dynamic infinite interaction (TR-DII) is proposed. The infinitely repeated interaction structure is formed with multistage inviting and evaluating actions. It controls the probability of cooperative behavior that may not appear in a one-shot game but may appear in a repeated interaction. It focuses on each previous interaction, no matter whether it is positive or negative. Moreover, the trust rating is proposed to control agent blindness in the selection phase and to ensure rational stage decisions. Meanwhile, with rewards and punishments, TR can monitor whether agents' statuses are trustworthy or untrustworthy. The model is more interested in finding the reasons why the attributes of agents change, which most existing trust models did not do. Through the interaction result feedback, the TR-DII model can adjust the income function to control cooperation reputation and achieve closed-loop control. Additionally, TR-DII includes disrepute arising from dishonesty despite capability, which almost all models ignored. Finally, the experiment used four agents as the basic testing unit to verify the impact of TR on execution cooperation during dynamic infinitely repeated interaction in the grids puzzle. Also, five groups of contrast experiments were done based on four kinds of trust ratings. Results show that the trust rating can effectively adjust the trust level between agents and that more successful cooperation can be established and promoted with the TR-DII model in a multiagent environment. For future work, the intent is to study the agent vacancy condition, to find a more efficient model for stimulating agent positivity, and to implement the TR-DII model in a real-world system.

Figure 2: Moving state of agents in grids (solid color grid is initial position; plaid grid is terminal position).

Figure 3: Comparison of cooperation times (Accoop means accepted cooperation times; Excoop means execution cooperation times; Apcoop means applications cooperation times).

Table 3: Updating TR in stage of system.

Table 4: Test results of cooperation in the new stage.

Table 5: Test results of collaboration in the new stage.

Table 6: Updating TR in new stage of system.

TR-DII pays more attention to what changed the attribute of agent, which made them trustworthy or untrustworthy.

Table 7: Test results for comparison of TR-DII and R-D-C, FTM, SPORAS, and TRR model.