A Novel Optimal Strategy for Communication System in the Maritime Industry Based on Game Theory

In this paper, we employ convex optimization and the saddle point equation to find the two-player optimal payoff in the iterated rock-paper-scissors game. We also describe the equivalent of the payoff written as a two-person non-zero-sum matrix in the hypothetical game system, which provides a possible way to make quantitative analyses. In addition, we use interior-point methods to simulate the rock-paper-scissors game, and our numerical results verify that our hypothesis of the payoff equation, $\max_Y \min_X Y^T A X = \min_X \max_Y Y^T A X$, still works very well even if $A$ changes with the payoff $g$, and that it is not affected by other factors.


Introduction
It is challenging for human beings to make optimal decisions in noncooperative strategic interactions [1]. The best strategy for rock-paper-scissors is to behave truly randomly: each choice is played about a third of the time, so that the opponent cannot predict what comes next. The rock-paper-scissors (RPS for short) game, widely used to study competitive phenomena in society and biology, especially species diversity and pattern formation [2][3][4][5][6][7], offers a new way to approach this problem. The concept of Nash equilibrium (NE), developed under the assumption that the players are sufficiently rational to learn the strategies of their competitors accurately and to optimize their own strategy accordingly, plays a fundamental role in both classic game theory and evolutionary game theory [1][8][9][10][11][12][13]. Furthermore, much effort has been devoted to investigating the RPS game using a variety of models, such as the "rock-paper-scissors dynamics model" [14,15], the scale-free memory model [16], and the cyclic dominance model [17][18][19]. For contrast, consider a game in which playing rock is a dominant strategy for both players (i.e., rock is the best choice whatever the opponent plays). The equilibrium of such a game is unique: both players always choose rock.
Convex optimization is a reliable and efficient optimization method that can find the global optimum precisely when an appropriate algorithm is selected. It is frequently used because of its strong scalability and wide range of applications, which include electronic technology [20][21][22][23], software engineering [24][25][26][27], and machine learning [28]; it can also be applied to game theory, where it sheds light on the optimal strategy in the iterated RPS game. When we check the payoff table for rock, paper, and scissors, we see that such an equilibrium does not exist. There is no pair of choices in which each player's selection is the best response to the other's. So, there is no pure-strategy Nash equilibrium.
In this paper, we employ convex optimization (CVX for short) and the saddle point equation to optimize the two players' payoffs in the iterated RPS game, which has a considerable amount to offer in both theoretical research and practical applications. A practical heuristic is to make a move that yields either a win or a draw, so that you cannot lose. For instance, if your opponent has thrown scissors twice, you can suppose that they will not play it a third time in a row and will instead play rock or paper; paper then either wins or draws. In order to verify the hypothesis of our study, we assign different values to the convex optimization model and then draw graphs with simulation software to find the pertinent rule, which shows the feasibility of the convex optimization method. Rock-paper-scissors (also known by other orderings of the three items, or as ro-sham-bo) is a hand game usually played between two people, in which each player simultaneously forms one of three shapes with an outstretched hand. The shapes are "rock" (a closed fist), "paper" (a flat hand), and "scissors" (a fist with the index finger and middle finger extended, forming a V). "Scissors" is the same as the two-fingered V sign (also denoting "victory" or "peace"), except that it is pointed horizontally rather than held upright.
The marginal contribution of our study is that we introduce the two-player non-zero-sum matrix $A = \begin{bmatrix} 1 & 0 & g \\ g & 1 & 0 \\ 0 & g & 1 \end{bmatrix}$ to describe the payoffs of the players quantitatively as $X^T A Y$ and $Y^T A X$, and we simulate the system with Newton's method. We use the saddle point equation for $Y^T A X$ to calculate how players can obtain the maximal payoff, and we find that the expected payoffs per round (EPRs) of player X and player Y increase with the payoff $g$. A history-based method, by contrast, takes into account the opponent's previous moves to decide whether the opponent favors one move over another. In essence, rock, paper, and scissors each have a "score," and the score of the opponent's move is increased after every round. Whether RPS is a "good" or a "bad" game is a matter of taste for players and spectators and is not taken into account here.
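The score-keeping idea mentioned above can be sketched as follows. This is only one possible reading of it, not the method studied in this paper: the class name `FrequencyPlayer` and the rule "counter the opponent's highest-scored move" are assumptions made for illustration.

```python
# Hypothetical sketch of a history-based player (not the paper's method).
# BEATS[m] is the move that defeats m.
BEATS = {"R": "P", "P": "S", "S": "R"}


class FrequencyPlayer:
    """Keeps one score per move and counters the opponent's favourite."""

    def __init__(self):
        self.scores = {"R": 0, "P": 0, "S": 0}

    def observe(self, opponent_move):
        # The score of the opponent's move is increased after every round.
        self.scores[opponent_move] += 1

    def next_move(self):
        # Assume the opponent repeats the highest-scored move
        # (ties are resolved in the order R, P, S) and play its counter.
        likely = max("RPS", key=lambda m: self.scores[m])
        return BEATS[likely]


player = FrequencyPlayer()
for move in "SSR":
    player.observe(move)
print(player.next_move())
```

Against a truly random opponent such a player gains nothing; it only helps against opponents whose choices are biased.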

Model
For the sake of simplicity, we take a two-person non-zero-sum RPS game model as an example to study the noncooperative game system. In this game, each player plays innumerable rounds and can choose only one action among R, P, and S in each round, as shown in Figure 1. The payoff $g$ ($g > 1$) of the winning action is the only parameter of this game [1] (see Figure 2), and rational players make decisions solely according to the value of $g$. Both players get a unit payoff when they choose the same action. When player X beats player Y, player X receives payoff $g$ and player Y receives zero payoff, and vice versa.
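As a minimal sketch of these rules, the payoffs of a single round can be tabulated from the matrix A given in the introduction. The function names and the row/column order R, P, S are assumptions made for this illustration.

```python
# Illustrative sketch; function names and the R, P, S ordering are assumptions.
ACTIONS = ["R", "P", "S"]


def payoff_matrix(g):
    """Matrix A from the introduction, rows and columns ordered R, P, S."""
    return [[1, 0, g],
            [g, 1, 0],
            [0, g, 1]]


def round_payoffs(x_action, y_action, g):
    """Return (payoff to X, payoff to Y) for one round of pure actions."""
    A = payoff_matrix(g)
    i = ACTIONS.index(x_action)
    j = ACTIONS.index(y_action)
    # A[i][j] is the payoff to the player choosing action i against action j.
    return A[i][j], A[j][i]


print(round_payoffs("P", "R", 2.0))  # paper beats rock: X gets g, Y gets 0
print(round_payoffs("S", "S", 2.0))  # same action: both get a unit payoff
```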
During competitions, players often plan their three gestures before the tournament begins. Some tournament players use tactics to confuse or trick other players into making an illegal move, which leads to a loss. One such tactic is to call out the name of one move in order to misdirect the opponent.

Convex Optimization Equation.
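The expected payoff per round (EPR) $W_X$ of player X and the EPR $W_Y$ of player Y are as follows:
$$W_X = X^T A Y, \qquad (1)$$
$$W_Y = Y^T A X, \qquad (2)$$
where $X = (X_R, X_P, X_S)^T$ and $Y = (Y_R, Y_P, Y_S)^T$. $X_R$, $X_P$, and $X_S$ denote the probabilities with which player X plays R, P, and S, and $Y_R$, $Y_P$, and $Y_S$ denote the corresponding probabilities of player Y.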
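A short sketch of the EPR computation, assuming (as stated in the introduction) that the payoffs are $W_X = X^T A Y$ and $W_Y = Y^T A X$ with mixed strategies given as probability triples in the order R, P, S. The function name is hypothetical.

```python
# Illustrative sketch; expected_payoffs is a hypothetical helper name.
def expected_payoffs(X, Y, g):
    """Return (W_X, W_Y) = (X^T A Y, Y^T A X) for mixed strategies X and Y."""
    A = [[1, 0, g],
         [g, 1, 0],
         [0, g, 1]]
    w_x = sum(X[i] * A[i][j] * Y[j] for i in range(3) for j in range(3))
    w_y = sum(Y[i] * A[i][j] * X[j] for i in range(3) for j in range(3))
    return w_x, w_y


uniform = [1 / 3, 1 / 3, 1 / 3]
w_x, w_y = expected_payoffs(uniform, uniform, g=2.0)
print(w_x, w_y)
```

When both players mix uniformly, every entry of A is weighted by 1/9, so both EPRs equal (1 + g)/3 under this matrix.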
Compared with the traditional payoff function, our payoff equation is strictly convex. The convex optimization problem is described by (3) and (4).
Here, player X selects a strategy $i \in \{1, 2, 3\}$, while player Y selects a strategy $j \in \{1, 2, 3\}$. As (1)-(4) above are convex, all their optima are global optima. If two opponents throw their gestures at random in rock-paper-scissors, each wins, loses, or draws with equal probability. It would then be a game of sheer luck, not skill; certainly, if everybody could be perfectly random, nobody could gain an edge over anybody else.
In this paper, the system we study is a two-person non-zero-sum game, so it can be written as $Y^T A X + X^T A Y \neq 0$. In order to study it quantitatively, we suppose that player X makes his decision first and player Y acts later according to player X's decision. Because paper is larger than rock, the rule that paper covers rock still makes sense: paper beats rock not because the rock is harmed but because, once covered, the rock is hidden from and useless to the rest of the world. For rational players X and Y, player X wants to minimize $W_Y$, while player Y wants to maximize $W_Y$. Similarly, player X wants to maximize $W_X$, while player Y wants to minimize $W_X$. Critics note that highly problematic assumptions about "rationality," equilibrium solutions, information, and knowledge can render game theory ineffective as an instrument for analysing real-world events.
Player X should therefore choose $X$ to minimize $Y^T A X$, while player Y should choose $Y$ to maximize one of the payoffs of the system, $\min_X Y^T A X$.
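We define $\chi = Y_R + g Y_P$, $\xi = Y_P + g Y_S$, and $\mu = Y_S + g Y_R$, where $\chi$, $\xi$, and $\mu$ denote the expected payoff (gain) of player Y when player X plays R, P, and S, respectively; they are the coefficients of $X_R$, $X_P$, and $X_S$ in $Y^T A X$.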
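The three quantities defined above are the coefficients of $X_R$, $X_P$, and $X_S$ when $Y^T A X$ is expanded with the matrix A from the introduction. A small sketch, with hypothetical function names, computes them and picks the pure action of player X with the smallest coefficient, which is the action that minimizes $Y^T A X$ for a fixed $Y$.

```python
# Illustrative sketch; function names are assumptions for this example.
def coefficients(Y, g):
    """Return (chi, xi, mu) for Y = (Y_R, Y_P, Y_S)."""
    y_r, y_p, y_s = Y
    chi = y_r + g * y_p
    xi = y_p + g * y_s
    mu = y_s + g * y_r
    return chi, xi, mu


def minimizing_action_of_x(Y, g):
    """Pure action of X with the smallest coefficient, and that coefficient."""
    coeffs = coefficients(Y, g)
    k = min(range(3), key=lambda i: coeffs[i])
    return "RPS"[k], coeffs[k]


# Example with chi smaller than xi and mu, so X puts all weight on R.
action, value = minimizing_action_of_x([0.2, 0.1, 0.7], g=2.0)
print(action, value)
```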
When $\chi < \xi, \mu$ and player X wants to minimize $Y^T A X = \chi X_R + \xi X_P + \mu X_S$, he should maximize $X_R$, whose coefficient $\chi$ is the smallest. That is to say, when $X_R = 1$, $X_P = 0$, and $X_S = 0$, $Y^T A X$ attains its minimal value. From the inner minimization in (8), we have $e_1^T = (X_R, X_P, X_S) = (1, 0, 0)$. The other two cases ($\xi < \chi, \mu$; $\mu < \chi, \xi$) proceed similarly.
Two-person games are the simplest form of competitive situation. These games have only two players; they are called zero-sum games when one player's win is the other player's loss.
Figure 1: The payoff tree: each player (player X or player Y) has three possible actions: rock (R) beats scissors (S), S beats paper (P), and P in turn beats R.
If the opponent is unlikely to play rock again, scissors cannot be beaten: scissors win against paper, and the round is a tie if the opponent also chooses scissors. RPS players mentally classify themselves as winners and losers, and a losing player is more likely to switch to another throw next time.
Figure 2: The payoff matrix: each element of the payoff matrix is from row player to column player.
A simultaneous zero-sum game has only two possible outcomes: a draw, or a win for one player and a loss for the other. A player who decides to play rock will beat another player who has chosen scissors ("rock smashes scissors" or, at times, "blunts scissors") but will lose to a player who has picked paper ("paper covers rock"); a play of paper will lose to a play of scissors ("scissors cuts paper"). If both players choose the same shape, the game is tied and is usually replayed immediately to break the tie. The game originated in China and spread with increased contact with East Asia, and different variants of the signs developed over time.
It is not possible to gain an edge over a genuinely random opponent. However, it is possible to gain a significant advantage by exploiting the psychological weaknesses of intrinsically nonrandom opponents; in practice, people tend to be nonrandom players. As a result, competitions for algorithms that play rock-paper-scissors have been held.

Wireless Communications and Mobile Computing
For $\max_Y (\min_X Y^T A X)$, the inner optimization can be described as follows:
$$\min_X Y^T A X = \min_{i = 1, 2, 3} e_i^T A^T Y,$$
where $e_i$ denotes the vector that is all zeros except for a one in the $i$th position, that is, deterministic strategy $i$. These optimization expressions are to be compared with the standard form of a convex optimization problem. The most difficult aspect of making decisions, according to a study presented at the annual conference of the Academy of Management this month, is not finding the proper answer but having the fortitude to act on that information.
In this paper, we introduce a scalar variable $\alpha$ representing the value of the inner minimization, $\alpha = \min_X Y^T A X$. The symbol $\ge$ denotes greater than or equal to, the symbol $\le$ denotes less than or equal to, and the symbol $\equiv$ denotes equivalence. Thus, we have obtained the standard convex optimization equation of our model.
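For orientation, the usual way to write a max-min problem with such a scalar variable is the epigraph-style linear program below. It is a sketch under the assumption that $Y$ ranges over probability vectors and that $\mathbf{1}$ denotes the all-ones vector; it is not necessarily the exact numbered equation of the original article.

```latex
\begin{aligned}
\text{maximize}\quad   & \alpha \\
\text{subject to}\quad & A^{T} Y \ge \alpha \mathbf{1}, \\
                       & Y \ge 0, \qquad \mathbf{1}^{T} Y = 1 .
\end{aligned}
```

The constraint $A^T Y \ge \alpha \mathbf{1}$ says that $\alpha$ is no larger than any component of $A^T Y$, so at the optimum $\alpha$ equals the smallest component, which is the value of the inner minimization over player X's pure strategies.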

The Saddle Point Equation.
The primal problem (11) is convex with convex payoff mainly decided by player X.

From the Lagrangian of problem (15), we obtain the dual function. For each pair $(\lambda, \nu)$ with $\lambda \ge 0$, the Lagrange dual function gives us a lower bound on the optimal value $p^*$ of the optimization problem (15). Thus, we have a lower bound that depends on some parameters $\lambda$ and $\nu$. Maximizing this lower bound over $(\lambda, \nu)$ with $\lambda \ge 0$ translates problem (15) into an optimization problem that is called the Lagrange dual problem corresponding to problem (15). The Slater condition [28] says that strong duality between (17) and (18) holds if the inequality constraints are strictly feasible, i.e., if there exists an $X$ with $-B_i^T X < 0$, $\sum_i B_i^T X = 1$, $i = 1, 2, 3$. Strong duality between (17) and (18) will be proven in the next section.
In the folklore of the game, throwing scissors is read as a sign of calm: scissors must be handled carefully when cutting an object or opening a box, so the throw suggests a crafty player who is waiting for a chance. A player who expected to smother rock by throwing paper has simply handed scissors an easy win.
Now, suppose the order of play is reversed; player Y chooses $Y_R$, $Y_P$, $Y_S$ first, and then player X chooses $X_R$, $X_P$, $X_S$. Following a similar argument, if the players follow the optimal strategy, player X should choose $X$ to minimize $\max_Y Y^T A X$, which results in a payoff of $\min_X \max_Y Y^T A X$. Equations (13) and (19) are the standard convex optimization payoff functions of players X and Y. Note that player Y's problem is dual to player X's in game theory.
Many powerful algorithms have been developed as a result of RPS programming competitions; for instance, Iocaine Powder, a heuristically designed compilation of strategies, won the First International RoShamBo Programming Competition in 1999. For each strategy it employs, it also has six metastrategies, which defeat the opponent's second-guessing, triple-guessing, and so on.
The previous hypothesis is based on the order in which player X and player Y make decisions. Now, suppose player X and player Y make decisions at the same time. Comparing Equations (13) and (19), we accordingly have
$$\max_Y \min_X Y^T A X = \min_X \max_Y Y^T A X. \qquad (20)$$
We call Equation (20) the saddle point equation. It describes the saddle point at which players get the maximal payoff. Equation (13) gives the left part of Equation (20), whereas (19) provides the right part of Equation (20).
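The equality of the max-min and min-max values can be checked numerically for the matrix A from the introduction. The sketch below uses only the uniform strategy as a candidate for both players, which is an assumption of this illustration and not the interior-point computation used in the paper. Fixing Y gives a lower bound on the max-min value, fixing X gives an upper bound on the min-max value, and the two bounds coincide.

```python
# Illustrative sketch; helper names and the uniform candidate are assumptions.
def matvec(M, v):
    """Product of a 3x3 matrix with a 3-vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]


def transpose(M):
    return [[M[j][i] for j in range(3)] for i in range(3)]


def saddle_bounds(g):
    A = [[1, 0, g],
         [g, 1, 0],
         [0, g, 1]]
    u = [1 / 3, 1 / 3, 1 / 3]
    # With Y = u fixed, min over X of Y^T A X is the smallest entry of A^T u.
    lower = min(matvec(transpose(A), u))
    # With X = u fixed, max over Y of Y^T A X is the largest entry of A u.
    upper = max(matvec(A, u))
    return lower, upper


lower, upper = saddle_bounds(2.0)
print(lower, upper)
```

Since lower <= max-min <= min-max <= upper always holds, equal bounds show that both sides of the saddle point equation take the common value (1 + g)/3 for this matrix.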
Similarly, looking at the payoff of player X, we have another saddle point equation
$$\max_X \min_Y X^T A Y = \min_Y \max_X X^T A Y. \qquad (21)$$
The above equation also describes the saddle point at which players can achieve the maximal payoff of the system we study. Just as Equations (13) and (19) represent the left part and the right part of Equation (20), respectively, we denote the left and right parts of Equation (21) as
$$\max\ \delta \quad \text{subject to} \quad \delta \mathbf{1} - AX \le 0,$$
$$\min\ \gamma \quad \text{subject to} \quad \gamma \mathbf{1} - AY \ge 0,$$
where $\mathbf{1}$ is the all-ones vector. We use the strong duality theorem to prove Equation (20). The same principle can also be used to prove Equation (21).
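To prove it, we first note that, for any $X$, the supremum of the Lagrangian over $\lambda \ge 0$ and $\nu$ equals the objective value when $X$ is feasible and $+\infty$ otherwise. This means that we can write the optimal value of the primal payoff as
$$p^* = \inf_X \sup_{\lambda \ge 0,\, \nu} L(X, \lambda, \nu).$$
By the definition of the dual function, we also have
$$d^* = \sup_{\lambda \ge 0,\, \nu} \inf_X L(X, \lambda, \nu).$$
Thus, strong duality can be described as the equality $d^* = p^*$. $Y^T A X$ satisfies the strong max-min property or the saddle point property [29].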

Optimization Algorithm.
With the linear equality and inequality constraints reduced to a sequence of linearly constrained problems, the optimization problem is written in standard form. Its optimality conditions, namely primal feasibility, dual feasibility ($\lambda \ge 0$), complementary slackness, and stationarity of the Lagrangian, are called the Karush-Kuhn-Tucker (KKT) conditions [29]. Because problem (15) is convex, the KKT conditions are also sufficient for the points to be primal and dual optimal. Thus, if $\check{X}$ and $(\check{\lambda}, \check{\nu})$ satisfy the KKT conditions, then $\check{X}$ and $(\check{\lambda}, \check{\nu})$ are primal and dual optimal, with zero duality gap. Interior-point methods solve problem (15) by applying Newton's method to a sequence of equality constrained problems.
First, we translate problem (15) into an optimization problem (30) in which the inequality constraints are implicit in the objective, where $I_- : \mathbb{R} \to \mathbb{R}$ is the indicator function for the nonpositive reals and $I_0$ is the indicator function of $\{0\}$. Then, we define the function $\hat{I}_-(z) = -(1/t) \log(-z)$, with domain $z < 0$, to approximate the indicator function by using the barrier method, where $t > 0$ is a parameter that determines the accuracy of the approximation. The function $\hat{I}_-$ is convex and nondecreasing and takes on the value $\infty$ for $z > 0$. $\hat{I}_-$ is differentiable and closed and increases to $\infty$ as $z$ increases to 0. Subsequently, we replace $I_-$ with $\hat{I}_-$ in (30).
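To make the role of the parameter t concrete, the sketch below compares the indicator function of the nonpositive reals with the logarithmic approximation $-(1/t) \log(-z)$ that the text refers to. The function names are hypothetical, and returning infinity at z = 0 for the approximation is a convention adopted here for points outside its domain.

```python
import math


# Illustrative sketch; function names are assumptions for this example.
def indicator_nonpositive(z):
    """I_-(z): 0 for z <= 0 and infinity for z > 0."""
    return 0.0 if z <= 0 else math.inf


def log_barrier_approx(z, t):
    """Approximation -(1/t) * log(-z), taken as infinity outside z < 0."""
    if z >= 0:
        return math.inf
    return -(1.0 / t) * math.log(-z)


# At a strictly feasible point, the approximation shrinks towards 0 as t grows.
for t in (1, 10, 100):
    print(t, log_barrier_approx(-0.5, t))
```

Larger t makes the approximation closer to the indicator on the interior, at the cost of a function that changes more sharply near the boundary z = 0.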
The objective function here is convex, since $-(1/t) \log(-z)$ is convex, increasing in $z$, and differentiable [29]. We obtain the function
$$\phi(X) = -\sum_{i=1}^{3} \log\left(B_i^T X\right),$$
with $\operatorname{dom} \phi = \{X \in \mathbb{R}^n \mid -B_i^T X < 0,\ i = 1, 2, 3\}$, which is the logarithmic barrier for problem (15). Finally, we obtain the gradient and Hessian of the logarithmic barrier function $\phi$:
$$\nabla \phi(X) = -\sum_{i=1}^{3} \frac{B_i}{B_i^T X}, \qquad \nabla^2 \phi(X) = \sum_{i=1}^{3} \frac{B_i B_i^T}{\left(B_i^T X\right)^2}.$$
This means that no matter whether $A = A_1$ or $A = A_2$, the left part of the saddle point equation equals the right part. As indicated in Figure 3, supposing $g_1 = (g + 5.1)/3$, the optimal EPR of player X ($W_X^*$) equals that of player Y ($W_Y^*$); it increases with $g$, approaching 0.95. Thus, we believe the numerical results are consistent with our theoretical hypothesis.
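The gradient and Hessian of a logarithmic barrier for constraints of the form $-B_i^T X < 0$ can be sketched as below, taking the barrier to be $-\sum_i \log(B_i^T X)$. The vectors in `B` are hypothetical placeholders (the rows of the identity, which encode X > 0); the actual $B_i$ of problem (15) are not reproduced here.

```python
import math


# Illustrative sketch; B holds placeholder constraint vectors, not the paper's.
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def barrier(B, X):
    """phi(X) = -sum_i log(B_i^T X), defined when every B_i^T X > 0."""
    return -sum(math.log(dot(b, X)) for b in B)


def barrier_gradient(B, X):
    """Gradient: -sum_i B_i / (B_i^T X)."""
    return [-sum(b[k] / dot(b, X) for b in B) for k in range(len(X))]


def barrier_hessian(B, X):
    """Hessian: sum_i B_i B_i^T / (B_i^T X)^2."""
    n = len(X)
    return [[sum(b[k] * b[l] / dot(b, X) ** 2 for b in B) for l in range(n)]
            for k in range(n)]


B = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
X = [0.2, 0.3, 0.5]
print(barrier_gradient(B, X))
print(barrier_hessian(B, X))
```

These two quantities are what a Newton step inside an interior-point method needs at each iteration; a finite-difference comparison against `barrier` is a quick way to check them.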

Conclusion
We have demonstrated in this paper how players can obtain the optimal payoff in the two-player iterated RPS game with the convex optimization method and the saddle point equation, and we have verified the hypothesis of the payoff equation with interior-point methods by using simulation software. Hence, it can be concluded that convex optimization is a feasible method to maximize the payoff in the two-player iterated RPS game regardless of other factors. Furthermore, the research method is operational, and the results of our study can be extended to game systems such as social cycling, species competition, elections, and economic issues and can provide insight into further related quantitative research.
Rock-paper-scissors (RPS) is not only a game popular with children but also a basic and classic model system for studying in depth the mechanism of decision-making in noncooperative strategic interactions. RPS is a topic of increasing interest and significance because it helps improve our understanding of many complex competition issues (species divergence, price cycling, human decision-making, rationality and cooperation, and so on). It is also a starting point for entering the interdisciplinary field between statistical physics and game theory [9].
As stated in Bi and Zhou's paper [8], cooperation in a finite-population RPS game system with more than two players may be much more difficult and complex to achieve than in the case of only two players. In this paper, we provide only the simplest model to probe into the optimal strategy in the iterated RPS game; much more complicated related research needs to be carried out in the future.

Data Availability
The figures used to support the findings of this study are included in the article.