A Dynamic Stackelberg Game of Supply Chain for a Corporate Social Responsibility

In this paper, we establish a dynamic game to allocate CSR (Corporate Social Responsibility) to the members of a supply chain. We propose a model of a decentralized supply chain that includes a supplier and a manufacturer. To analyze the performance of the decentralized supply chain and the relationships between its members, we formulate a multiperiod model with the help of a dynamic discrete Stackelberg game under two different information structures. We obtain an equilibrium point at which both the profits of the members and the level of CSR taken up by the supply chain are maximized.


Introduction
In recent years, companies and firms have shown a growing interest in CSR. This is mainly because of increasing consumer awareness of several CSR issues, for example, the environment, human rights, and safety. In addition, firms are also forced to accept CSR by government policies and regulations. Recently, CSR has gained recognition and importance as a field of research [1,2]. However, the field still lacks a consistent definition of CSR, and this has been the center of discussion for several decades. Dahlsrud [3] presented an overview of different definitions of CSR and summarized the number of dimensions included in each definition. A positive correlation between CSR and profit has been reported [4,5]. Moreover, CSR is an effective tool for supply chain management, covering coordination, purchasing, manufacturing, distribution, and marketing functions [6]. According to previous studies, long-term investment in CSR is beneficial for a supply chain. Furthermore, a sustainable supply chain requires consideration of the social aspects of the business [7]. Carter et al. [8] established an effective approach and demonstrated that environmental purchasing is significantly related to both net income and cost of goods sold. Carter and Jennings [9] also pointed to the importance of CSR in the supply chain, in particular the role played by purchasing managers in socially responsible activities and the effect of these activities on the supply chain. Sethi [10] introduced a taxonomy in which a firm's social activities include social obligations as well as more voluntary social responsibility. Carroll [11,12] developed a framework for CSR that consists of economic, legal, and ethical responsibilities.
A supply chain usually includes suppliers, manufacturers, distributors, wholesalers, and retailers. The members of a supply chain make their decisions by maximizing their individual net benefits. When they must accept CSR because of government regulations and policies or consumer concern, additional costs are imposed on them. However, many firms adopt CSR to enhance their reputation in response to public concern over social issues, or for other benefits such as increased sales, tax returns, and long-term profits in many respects, including stable suppliers, low cost delivery, and extended market demand. These benefits are illustrated by some well-known case studies of large international companies, such as Nike, Gap, H&M, Wal-Mart, and Mattel [13]. The situation in which the members of a supply chain seek individual benefits while having to bear a level of CSR is a conflict situation, and it leads to an equilibrium. Game theory is one of the most effective tools for dealing with this kind of management problem. A growing number of research papers use game theoretical applications in supply chain management [14][15][16][17]. Cachon and Zipkin [18] discuss Nash equilibrium in noncooperative cases in a supply chain with one supplier and multiple retailers. Hennet and Arda [19] evaluated the efficiency of different types of contracts between the industrial partners of a supply chain, applying game theory methods for decisional purposes. Tian et al. [20] presented a system dynamics model based on evolutionary game theory for green supply chain management. Gadimi et al. [21] used cooperative games in supply chain management to find fair allocation schemes for dividing the total profit of the grand coalition among the members.
In 1934, Stackelberg introduced the concept of a hierarchical solution, which is a simple dynamic game [22]. A Stackelberg game involves players with asymmetric roles, called the leader and the follower. The leader chooses a strategy first and the follower, knowing the leader's strategy, then chooses a policy. The leader anticipates the follower's optimal response and chooses the best possible point. There are different types of dynamic Stackelberg solutions depending on the information structure: the open-loop, the feedback, and the global Stackelberg solution (which uses a closed loop information structure). In an open-loop strategy, players choose their decisions at time t with information about the state at time zero. In contrast, in a closed loop information structure, the leader has perfect knowledge of all past and current values and, in a feedback information structure, players use their knowledge of the current state at time t to formulate their decisions at time t [23]. In this paper, we consider the feedback and closed loop structures. A discrete time version of the dynamic differential game is studied. Optimal control theory is the standard tool for analyzing differential games [24]. We formulate a model and study the behavior of decentralized supply chain networks under CSR conditions with one leader and one follower. The Stackelberg game model is applied here to find an equilibrium point at which the profit of the members of the supply chain is maximized and a level of CSR is adopted in the supply chain. We develop a Stackelberg game by selecting the supplier as the leader and the manufacturer as the follower. Using this approach, the supplier, as the leader, knows the optimal reaction of his follower and uses it to maximize his own profit. The manufacturer, as the follower, tries to maximize his profit by considering all the conditions. We propose a Hamiltonian matrix to solve the optimal control problem and obtain the equilibrium of this game.
The paper is organized as follows: Section 2 is devoted to the mathematical model. Management policies and numerical examples are illustrated in Section 3, and Section 4 contains a short conclusion.

Mathematical Model
Game theory has often been applied in diverse areas such as business, economics, and management to solve problems involving conflict and cooperation, and it analyzes problems with multiple criteria and multiple decision-makers. Supply chain management can be considered as a set of management processes; consequently, game theory is an effective method for supply chain management. Competition among members of a supply chain network is one of the significant topics emphasized in supply chain management. Furthermore, members of supply chain networks are either pressured to accept CSR by governments, organizations, and consumers or have to bear at least some CSR under policies and regulations. Naturally, however, members of a supply chain network want to maximize their individual net profits while having to take up a level of social responsibility across the entire network. These conditions pose a challenge. To deal with this situation, we use the Stackelberg game model, which is often applied to study dynamic problems. As mentioned, in a Stackelberg model the leader chooses a strategy first, and the follower then observes this decision and makes his own strategy choice. Intuitively, the first player chooses the best possible point based on the second player's best response function.

Problem Description and Assumptions.
We establish a dynamic Stackelberg game between the supplier as the leader and the manufacturer as the follower regarding the allocation of CSR to each member of a supply chain. The goal of each player is to maximize his own profit subject to the CSR condition. The model is a three-tier, decentralized, vertical control supply chain network (see Figure 1). All retailers and suppliers at the same level make the same decision. Therefore, the general model can be simplified to a model that has only one supplier, one manufacturer, and one retailer. Moreover, we assume that the manufacturer's retail price includes two parts, a fixed per-unit profit for the retailer plus a per-unit lot sale charge, so that the retailer, who is not a decision-maker, can be eliminated from the game. The Stackelberg game thus has two players who play over a fixed finite horizon.
This model has a state variable and control variables, like any dynamic game. We define the state variable as the level of social responsibility taken up by the companies, and the control variables are the capital amounts invested in fulfilling the social responsibility. Specifically, all of the social responsibility taken up by a firm in period t can be expressed as that firm's investment in period t, and the state evolves from one period to the next according to a rule driven by these investments. More specifically, we make the following assumptions. The social benefit function is proportional to the social responsibility taken up by the supply chain system [25]. The tax return function measures the value of the tax return to the members of the supply chain [26]; it has two tax return policy parameters, namely the rate of individual posttax return on investment (ROI) and the rate of the supply chain's posttax ROI. The market inverse demand is linear: the price equals the market potential minus the price sensitivity times the demand quantity [27]. The level of social responsibility taken up by the firms accumulates through the capital investments of the two members: the supplier's capital investment in CSR is converted into CSR taken up by the supply chain at one rate, and the manufacturer's capital investment in CSR is converted at a second rate [28].

Objective Functions.
The objective functions depend on the control vectors and the state variable. The members of the supply chain attempt to optimize their net profits, which includes minimizing the cost of raw materials and the investment in social responsibility and maximizing sale revenues and the benefits from taking social responsibility, as well as tax returns. The objective function of the supplier therefore involves the price of the supplier's raw material, the supplier's social benefit, and the supplier's tax return. The supplier's social benefit is quadratic in the state variable, scaled by the parameter of the supplier's social benefit, and the supplier's tax return depends on the supplier's investment and the state. Similarly, the objective function of the manufacturer involves the retail price of the manufacturer's product, the manufacturer's social benefit, and the manufacturer's tax return. The manufacturer's social benefit is quadratic in the state variable, scaled by δ, the parameter of the manufacturer's social benefit, and the manufacturer's tax return depends on the manufacturer's investment and the state.
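The assumptions of this subsection can be sketched in code. The block below is only a minimal illustration, assuming a linear accumulation rule for the CSR level, a tax return proportional to the investment and increasing in the CSR level, a quadratic social benefit, and a linear inverse demand. All names (`CsrParams`, `alpha`, `beta1`, `beta2`, `tau`, `theta`, `a`, `b`) and all numeric values are hypothetical and are not taken from the paper's numerical example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CsrParams:
    # Every name and default value here is an illustrative assumption.
    alpha: float = 0.9   # assumed share of the current CSR level carried over
    beta1: float = 0.5   # assumed conversion rate of the supplier's investment
    beta2: float = 0.4   # assumed conversion rate of the manufacturer's investment
    tau: float = 0.1     # assumed individual post-tax ROI rate
    theta: float = 0.05  # assumed supply-chain post-tax ROI rate
    a: float = 100.0     # assumed market potential
    b: float = 2.0       # assumed price sensitivity


def next_state(x, inv_s, inv_m, p):
    """Assumed linear accumulation of the CSR level from both investments."""
    return p.alpha * x + p.beta1 * inv_s + p.beta2 * inv_m


def tax_return(inv, x, p):
    """Assumed tax return: proportional to investment, increasing in CSR level."""
    return p.tau * inv * (1.0 + p.theta * x)


def social_benefit(x, coeff):
    """Assumed quadratic social benefit with a player-specific coefficient."""
    return coeff * x ** 2


def inverse_demand(q, p):
    """Linear inverse demand: market potential minus sensitivity times quantity."""
    return p.a - p.b * q


def simulate(x1, inv_s_path, inv_m_path, p):
    """Roll the CSR level forward for given investment paths."""
    xs = [x1]
    for inv_s, inv_m in zip(inv_s_path, inv_m_path):
        xs.append(next_state(xs[-1], inv_s, inv_m, p))
    return xs
```

Calling `simulate(0.0, [1.0, 1.0], [0.0, 0.0], CsrParams())` traces how a constant supplier investment alone builds up the CSR level under these assumed parameters.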

The Feedback Solution.
In a Stackelberg game under the feedback information structure, the players use their knowledge of the current state at time t to formulate their decisions at time t. Players select their strategies at the current time, and these strategies do not depend on the initial condition. Hence, feedback strategies are subgame perfect. To solve a Stackelberg game under the feedback information structure, we use the dynamic programming method with appropriate value functions [29].
In fact, feedback equilibrium strategies at any time t are functions of the values of the state variables at that time. In a feedback Stackelberg game the advantage of the leader over the follower is instantaneous, not global, since the differential game can be viewed as the limit of the discrete time game as the number of stages becomes unbounded. Therefore, corresponding to the leader's instantaneous strategy, the follower makes an instantaneous response which depends on the current state and the leader's current action.
Let T be the last period of the problem. We solve the Stackelberg game by backward induction. For the last period, the follower's response function, derived by solving the follower's optimization problem for a given leader action, is substituted into the leader's objective function. For a fixed supplier investment, the reaction function of the manufacturer is given directly by the arg max of the manufacturer's objective function in period T. Substituting the manufacturer's investment given by (4) into the supplier's objective function in period T, given by (5), and maximizing, we obtain after some algebra the solution for the last period. We then turn to the problem at period T − 1. Consider the value functions for the periods T − 1 to T. For any given policy of the leader, the follower's value function equation involves the terminal value function, which is known. Substituting the state equation into the manufacturer's objective function and maximizing the value function for any fixed supplier investment in period T − 1 gives an optimal action for the manufacturer for period T − 1. Subsequently, using the state equation and the manufacturer's optimal action again in the leader's value function equation, we obtain the supplier's optimal investment for period T − 1. This system can be solved to obtain the value functions at any time t and the feedback Stackelberg strategies. With some backward-forward equations we obtain the values of the supplier's investment, the manufacturer's investment, and the state.
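The backward induction described above can be illustrated numerically. The following is a minimal sketch, not the closed-form derivation of this section: it assumes one state variable, finite grids of candidate actions, and value functions interpolated linearly on a state grid. The function `feedback_stackelberg` and all its parameter names are hypothetical, and the stage payoffs and the transition rule are supplied by the caller.

```python
def feedback_stackelberg(T, states, leader_actions, follower_actions,
                         leader_payoff, follower_payoff, transition):
    """Grid-based backward induction for a two-player feedback Stackelberg game.

    Returns policy[t][k] = (leader action, follower action) at stage t and
    state states[k]. `states` must be sorted in increasing order.
    """
    def interp(values, x):
        # Piecewise-linear interpolation on the state grid, clamped at the ends.
        if x <= states[0]:
            return values[0]
        if x >= states[-1]:
            return values[-1]
        for k in range(len(states) - 1):
            lo, hi = states[k], states[k + 1]
            if lo <= x <= hi:
                w = (x - lo) / (hi - lo)
                return (1.0 - w) * values[k] + w * values[k + 1]

    # Continuation values after the last period are taken to be zero.
    v_lead = [0.0] * len(states)
    v_foll = [0.0] * len(states)
    policy = {}
    for t in range(T, 0, -1):
        new_lead, new_foll, stage_policy = [], [], []
        for x in states:
            best = None
            for u in leader_actions:
                # Follower's total payoff for each response v to the leader's u.
                def foll_total(v):
                    nxt = transition(x, u, v)
                    return follower_payoff(t, x, u, v) + interp(v_foll, nxt)

                v_star = max(follower_actions, key=foll_total)
                nxt = transition(x, u, v_star)
                lead_total = leader_payoff(t, x, u, v_star) + interp(v_lead, nxt)
                # The leader keeps the action with the best anticipated outcome.
                if best is None or lead_total > best[0]:
                    best = (lead_total, foll_total(v_star), u, v_star)
            new_lead.append(best[0])
            new_foll.append(best[1])
            stage_policy.append((best[2], best[3]))
        v_lead, v_foll = new_lead, new_foll
        policy[t] = stage_policy
    return policy
```

The grid search trades accuracy for generality: it needs no concavity assumptions, but its results are only as fine as the action and state grids supplied.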

The Global Stackelberg Solution.
In this section, the information structure is the closed loop model of Başar and Olsder [29]. This is the derivation of (global) Stackelberg solutions when the leader has access to dynamic information. In this structure the leader has perfect knowledge of all past and current values of the state and controls, and he tries to find an incentive strategy that lets him reach his global optimum. The idea of declaring a reward (or punishment) for a decision-maker according to his particular choice of action, in order to induce a certain desired behavior on the part of that decision-maker, is known as an incentive (or, in the case of punishment, as a threat) [29].
The global optimum of the supplier is assumed to be unique. A maximizing pair for the supplier's objective function exists if that function is strictly concave in its two arguments and if there is no singularity. To avoid this singularity, we add a constraint on one of the two variables, namely a zero-point constraint. One may guess that the supplier's global optimum at time t is reached when the manufacturer's profit from playing the game at time t is zero. The optimum for the supplier is therefore obtained under a kind of zero-point constraint.
Recall the manufacturer's objective function. The manufacturer's profit from playing the game at time t must be zero, and this condition admits two solutions. Since the zero solution is not a good choice, the other one is chosen.
Since we treat this dynamic game as an optimal control problem, the Hamiltonian function is a practical tool for solving the game [10].
The Hamiltonian-Lagrangian for the supplier is built from the supplier's objective function; to obtain the Stackelberg strategy of the supplier, we maximize the supplier's objective function through its Hamiltonian function. The supplier's maximization problem over its two decision variables yields a set of first-order conditions.
From these conditions we obtain, after some algebra, the unique optimal response of the players. Since we use closed loop information, the structure variables depend on the current time variable and the initial state variables. The initial state, at period 1, is given. Furthermore, a zero boundary condition is imposed at period T + 1.

Augmented Discrete Hamiltonian Matrix.
The related literature offers many methods for solving the optimal control problem. We have chosen an algorithm given by Medanic and Radojevic, which is based on an augmented discrete Hamiltonian matrix [30]. First, we write the system with the initial state at period 1 given and a zero boundary condition at period T + 1, and we obtain the values of the matrices used by the algorithm. We assume a linear relation between the costate and the state; thus, the optimal controls can be determined at each time step from the current estimated state, with coefficients determined by backward equations and their boundary conditions.
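As an illustration of the kind of backward computation described in this section, the sketch below solves a scalar linear-quadratic control problem by a standard Riccati-type backward sweep. This is a swapped-in textbook technique under simplifying assumptions, not the Medanic and Radojevic algorithm itself and not the model of this paper: it assumes dynamics x(t+1) = a·x(t) + b·u(t), a stage cost ½(q·x² + r·u²) to be minimized, a linear relation between costate and state with coefficient K(t), and the boundary condition K(T+1) = 0. The names `backward_sweep` and `rollout` are hypothetical.

```python
def backward_sweep(a, b, q, r, T):
    """Backward recursion for the coefficient K[t] and the feedback gain[t].

    Assumes the costate is K[t] * x[t] and uses the boundary K[T + 1] = 0.
    Lists are indexed from 1 to match the periods; index 0 is unused.
    """
    K = [0.0] * (T + 2)
    gain = [0.0] * (T + 1)
    for t in range(T, 0, -1):
        denom = r + b * b * K[t + 1]
        # First-order condition in u gives the linear feedback u = -gain * x.
        gain[t] = a * b * K[t + 1] / denom
        # Riccati-type update for the costate coefficient.
        K[t] = q + a * a * K[t + 1] * r / denom
    return K, gain


def rollout(a, b, gain, x1):
    """Forward pass: apply the feedback gains starting from the given x1."""
    xs, us = [x1], []
    for t in range(1, len(gain)):
        u = -gain[t] * xs[-1]
        us.append(u)
        xs.append(a * xs[-1] + b * u)
    return xs, us
```

The two passes mirror the backward-forward pattern of the text: coefficients are computed backward from the terminal boundary condition, and the state and controls are then recovered forward from the given initial state.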

Management Policies and Reflexes: A Numerical Example
A supply chain structure usually includes a certain number of players, and our model falls in this field. In particular, our model is a three-tier, decentralized, vertical control supply chain network. All retailers and suppliers at the same level make the same decision; consequently, the model has only one supplier, one manufacturer, and one retailer. The members of this type of supply chain make their decisions by maximizing their individual net benefits under a constraint: a given level of CSR that must be reached by the network. This situation leads to an equilibrium that has relevant management policy implications both for the individual players and for the supply chain network as a whole.
As the following numerical example shows, the model lets us set up the mechanism design of the relationships among the different levels and players of our management game. This output is not merely theoretical; in our opinion it bears on important issues in management decisions. As is well known, business logistics management refers to the production and distribution process within the company, while supply chain management includes the suppliers, manufacturers, and retailers that distribute the product to the end customer. Supply chains include every business that comes in contact with a particular product, including companies that assemble and deliver component parts to the manufacturer. This aspect is closely related to the corporate social responsibility discussed in our model. From a viewpoint that includes the social mission of firms while still pursuing the profits that are fundamental to the economic sustainability of the business, our model captures the existing correlations between these two issues.
Another aspect studied in depth is the information structure of the network. As the numerical example shows, if we choose feedback and closed loop information structures among the players, the global environment of the supply chain promotes interaction between all members of the network, who are naturally oriented toward playing the game, that is, toward strengthening the immaterial but productive structure represented by the supply chain.
We present the results of the model for the feedback and closed loop Stackelberg dynamic games. Figures 2 and 3 show the trend of the supplier's and the manufacturer's profits from periods one to ten in the feedback and closed loop Stackelberg games. For the supplier, who is the leader, the feedback and closed loop solutions show the same trend, and the supplier's profit increases over the periods. For the manufacturer, who is the follower, however, the closed loop solution yields zero profit, because our closed loop solution is based on a zero-profit constraint. In the feedback solution, the manufacturer's profit rises over time. Both the manufacturer and the supplier thus gain extra profit from playing the game in the feedback solution. In addition, playing the game in the closed loop solution is beneficial to the supplier, who is the leader of the game.
Table 1 shows the cumulated payoffs of the players in the feedback and closed loop Stackelberg games. Figures 4 and 5 compare the cumulated profits of the supply chain's members. JSO is the supplier's profit without playing the game and JMO is the manufacturer's profit without playing the game. In sum, in this case, where the supplier is the leader of the game, both the supplier and the manufacturer are motivated to play the game in the feedback solution because their benefits increase; in the closed loop situation, some incentive strategies are needed for the follower to play the game, while the leader's profit rises in this solution as well.

Conclusion
We investigated a decentralized three-tier supply chain consisting of a supplier and a manufacturer, with the aim of allocating CSR to the members of the supply chain over time. We considered a Stackelberg game consisting of a leader and a follower under two information structures. The members of a supply chain play games with each other to maximize their own profits; thus, the model used was a long-term coinvestment game model. The equilibrium point over the time horizon was determined as the point where the profit of the supply chain's members was maximized and CSR was implemented among the members of the supply chain. We applied control theory and used an algorithm based on the augmented discrete Hamiltonian matrix to obtain an optimal solution for the dynamic game model.

Notations and Definitions.
To facilitate the model, certain parameters and decision variables are used. The notations and definitions used in our model are as follows:
t: Period
T: Planning horizon
Demand quantity at period t
Market potential
Price sensitivity
State variable, degree of taking SR
Hamiltonian-Lagrangian function of the supplier
Objective function of the supplier
Objective function of the manufacturer
Social benefit of the manufacturer
Social benefit of the supplier
Tax return of the supplier
Tax return of the manufacturer
The amount of investment done by the manufacturer
The amount of investment done by the supplier
The percentage of investment of the supplier payoff
The price of the supplier's product which is delivered to the manufacturer
The price of the supplier's raw materials
Parameter of the supplier's social benefit
δ: Parameter of the manufacturer's social benefit
Deteriorating rate of the level of current social responsibility
The rate of individual posttax return on investment (ROI)
The rate of the supply chain's posttax return on investment (ROI)
β1: The rate of converting the supplier's capital investment in CSR to the amount of CSR taken up by the supply chain
β2: The rate of converting the manufacturer's capital investment in CSR to the amount of CSR taken up by the supply chain
is known.Using the definition   =  1   −1 + 2   −1 into    and maximizing the value function for any fixed   −1

Figure 2: Evolution of the supplier's profit, where FD and CL are the feedback and closed loop solutions, respectively.

Figure 4: Comparison of the supplier's profits with and without playing the game, in the feedback and closed loop solutions.