A Dynamic Stackelberg–Cournot Duopoly Model with Heterogeneous Strategies through One-Way Spillovers

Many works have studied the complex dynamics of Cournot or Stackelberg games, but few have discussed a dynamic game model that combines a Cournot game phase with a Stackelberg game phase. Under the assumption that R&D spillovers flow only from the R&D leader to the R&D follower, this paper considers a duopoly Stackelberg–Cournot game with heterogeneous expectations. Two firms with different R&D capabilities determine their R&D investments sequentially in the Stackelberg R&D phase and make output decisions simultaneously in the Cournot production phase. R&D spillovers, R&D investments, and technological innovation efficiency are introduced in our model. We find that (i) the boundary equilibrium of the dynamic Stackelberg–Cournot duopoly system, where the two players adopt boundedly rational and naïve expectations, respectively, is unstable if the Nash equilibrium of the system is strictly positive, and (ii) the Nash equilibrium of the same system is locally asymptotically stable only if the model parameters meet certain conditions. In particular, the results indicate that a small R&D spillover or a large output adjustment speed may yield bifurcations or even chaos. Numerical simulations exhibit maximum Lyapunov exponents, bifurcation diagrams, strange attractors, and sensitive dependence on initial conditions to verify our findings. It is also shown that the chaotic behavior can be controlled with the state variables feedback and parameter variation method.


Introduction
An oligopolistic market is one in which a small number of firms produce the same or homogeneous commodities and sell them in a common market. The classic oligopoly model, originally proposed by Augustin Cournot in 1838 [1], gives a mathematical description of competition in oligopolistic markets and shows how firms influence each other when making production decisions. In static Cournot oligopoly games, all firms know their opponents' strategy spaces and payoff functions and act simultaneously, which means that each firm adopts a naïve expectation when making its production decision: it assumes that its opponents' output stays at the level of the previous period. However, asymmetric information undoubtedly exists widely in production practice. To reflect this phenomenon, Stackelberg [2] made a significant addition to the formal theory of oligopoly, known as the "leader-follower model". In this model, two competing firms, one called the leader and the other the follower, determine their outputs successively [3], and the leader knows the follower's reaction function, so the leader usually earns more profit than the follower because of its first-mover advantage. Precisely because each oligopolistic enterprise must consider not only its own quantity decision but also the reactions of all its competitors, the behaviors of both Cournot games and Stackelberg games become increasingly complicated.
Many works focus on games with homogeneous strategies [3, 4, 16], but it is commonly believed that firms in real-world oligopolistic markets behave with different expectations [6-8, 10, 13, 25, 26]; consequently, our paper applies this belief to a duopoly game.
Technological innovation is the sustained driving force behind the survival and development of firms, and R&D activities are important carriers of technological innovation as well as crucial means for firms to acquire core competitiveness. As noted in many works [27-32], R&D spillovers inevitably occur in R&D activities, with both positive and negative effects. That is, on the one hand, R&D spillovers can dampen firms' enthusiasm for R&D because other firms free ride ("hitchhike"); on the other hand, they can reduce all firms' production costs through positive externalities. In addition, one-way R&D spillovers [27, 28] can arise from a gap in R&D capabilities between firms; that is, R&D spillovers flow only from enterprises with stronger R&D capabilities to weaker ones in the R&D process. Our paper adopts this assumption.
Given enterprises' bounded rationality and the universality of R&D spillovers, we need to consider the following questions: (i) In a perfectly rational duopoly market consisting of two stages, successive R&D and simultaneous production, what is the relationship between equilibrium output and R&D input? (ii) When the duopolists are boundedly rational and adopt different output adjustment mechanisms, is there a stable output? If a Nash equilibrium output exists, under what condition? To address these issues, our paper adopts a Stackelberg-Cournot model to analyze the decision-making process, which is divided into a Stackelberg R&D phase and a Cournot production phase. In our model, the oligopolistic market contains two firms: firm 1 is the R&D leader, and firm 2 is the R&D follower. The firms use heterogeneous strategies to adjust their outputs: we assume that firm 1 is a boundedly rational player and that firm 2 adopts a naïve expectation. Finally, our research gives the relationship between Nash equilibrium output and R&D input in a completely rational duopoly market and provides the region where the equilibrium output exists in a boundedly rational duopoly.
Our research contributes to the extant literature on the complex dynamics of Cournot or Stackelberg games. Cournot games and Stackelberg games have each been studied extensively, but few references have discussed a dynamic game model that combines a Cournot game phase with a Stackelberg game phase. Mathematical properties of a stochastic Stackelberg-Nash-Cournot game [33] and a discontinuous Cournot-Stackelberg model [34] have been studied. Flåm et al. [35] discussed continuity properties of the followers' reaction and provided sufficient conditions for the existence of a Stackelberg-Cournot equilibrium in oligopolistic markets. Julien [36] compared the Cournot equilibrium and the Stackelberg-Cournot equilibrium in a mixed-markets exchange economy. Ma and Ren [37] analyzed a dynamic Cournot-Stackelberg model involving a feedback regulation system with one manufacturer and two retailers in the market. Xu et al. [38] used a novel Stackelberg-Nash-Cournot equilibrium model to discuss the relationship between subarea managers and the reservoir authority from the perspective of water rights transactions. These papers, which study Stackelberg-Cournot or Cournot-Stackelberg games, are primarily based on perfect rationality. By contrast, our paper focuses on a Stackelberg-Cournot game with imperfect rationality.
Our research also complements the literature on R&D spillovers in nonlinear dynamic systems. Ever since D'Aspremont and Jacquemin proposed the AJ model [27], in which completely rational duopoly firms play a two-stage game with Cournot R&D and Cournot production, many papers have studied the influence of technology spillovers on enterprise competition and cooperation [28, 39], and imperfect rationality plays an important role in the dynamic analysis of R&D spillovers [29-32]. Our paper differs from the aforementioned references in three ways. First, the extant literature is generally based on the assumptions of bilateral spillovers and simultaneous actions in a two-stage game [29-32]. By contrast, this paper considers a Stackelberg-Cournot model that comprises a Stackelberg R&D phase with one-way spillovers and a Cournot production phase. Second, in the two-stage game, we assume that the duopoly firms are boundedly rational in their quantity decisions, the same hypothesis as in [15], whereas most previous literature assumes that oligopoly enterprises are boundedly rational in R&D activities. Unlike [15], we additionally study the R&D spillover coefficient. Third, like other papers, we study the spillover coefficient, but we also study the influence of technological innovation efficiency (TIE) on the equilibrium output.
Two important findings of our research are summarized as follows. First, in a perfectly rational duopoly market consisting of a successive R&D stage and a simultaneous production stage, the equilibrium quantity is ultimately determined by the firms' R&D spillover, TIE, and marginal cost rather than by the R&D input, which differs from the common view that the Nash equilibrium output is directly related to R&D input [29, 32]. This is because, by backward induction, the R&D input is itself determined by the firms' R&D spillover, TIE, and marginal cost. Second, we give the local stability condition of the Nash equilibrium. Unlike the extant references [29, 31, 32], our paper specifically studies the influence of R&D spillover and TIE on the stability of the Nash equilibrium output when the two firms adopt boundedly rational and naïve expectations, respectively, and we find that a small R&D spillover or a large output adjustment speed may yield bifurcations or even chaos.
The rest of this paper is organized as follows. In Section 2, the nonlinear duopoly Stackelberg-Cournot model is described, and a two-dimensional discrete system with heterogeneous players is formulated. In Section 3, the existence and stability of the equilibrium points of the dynamical system are analyzed, and the stability regions are calculated. To verify our theoretical results, Section 4 presents numerical simulations that show the complex dynamics, including maximum Lyapunov exponents, bifurcations, strange attractors, and sensitive dependence on initial conditions. In Section 5, a method called the control strategy of state variables feedback and parameter variation is employed to control the chaos of the system. Finally, the research results are summarized and discussed in Section 6.

The Duopoly Stackelberg-Cournot Model
In this paper, the duopoly Stackelberg-Cournot game is divided into two stages. Stage 1 is the Stackelberg R&D phase, in which the strategy space is the choice of R&D investment. Two firms with different R&D capabilities play a sequential noncooperative game in R&D investments for the sake of higher revenues: in the innovation process, the R&D leader decides on its R&D investment first, and the follower determines its investment after observing the leader's decision. Furthermore, we assume that R&D spillovers flow only from the R&D leader to the follower. Stage 2 is the Cournot production phase, in which the strategy space is the choice of output. In this phase, the R&D investments chosen in stage 1 are common knowledge, and the two oligopolists decide their outputs simultaneously.
We consider a duopoly Stackelberg-Cournot game in which two firms, labelled by $i = 1, 2$, produce perfect substitute goods with production levels $q_i$, $i = 1, 2$, respectively, and sell them at discrete time periods $t = 0, 1, 2, \ldots$ on a common market. Firm 1 is the Stackelberg leader, and firm 2 is the follower. We denote the output of firm $i$ at time period $t$ by $q_i(t)$, which is updated at discrete time steps.
Assume that the inverse demand function has the linear form $p(Q) = a - Q$, where $Q = q_1 + q_2$ is the total supply and the positive constant $a$ represents the maximum amount of output that can be brought to the market. The production cost function is $C_i(q_i) = c_i q_i$, where $c_i$ is the marginal cost of firm $i$'s product, and $c_1 = c_2 = c$ before the innovation.
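In the Stackelberg R&D phase, R&D generates autonomous cost reductions under decreasing returns to R&D investment; that is, each firm should choose the R&D level that balances the innovation cost against the marginal cost reduction. With sequential play at the Stackelberg R&D stage and a one-way flow of spillovers, the cost reduction accruing to firm 1 depends only on its own investment $x_1$, and the marginal cost of the R&D leader is $c_1 = c - \beta_1\sqrt{x_1}$ (the same assumption as in [28]), while that of the R&D follower is $c_2 = c - \beta_2\sqrt{x_2} - \theta\beta_2\sqrt{x_1}$, where $\beta_i$ is the technological innovation efficiency (TIE) of firm $i$. Here $\theta \in [0, 1]$ is the coefficient of R&D spillovers: some benefits of firm 1's R&D flow to firm 2 without payment, so the external effect of the leader's R&D is to lower the follower's marginal production cost. The value $\theta = 0$ means that the technological innovation of the R&D leader is not freely obtained by the follower, while $\theta = 1$ means that it is fully obtained without any payment. Moreover, the cost reduction $y_i = \beta_i\sqrt{x_i}$ represents the R&D production function, characterized by the inverse mapping of the R&D cost function used by D'Aspremont and Jacquemin [27], with $x_i = (y_i/\beta_i)^2$ and $\beta_i = (2/c)$. For firm 2, the cost reduction is due not only to its own R&D investment but also to the technology spillover from firm 1.
With these assumptions, the profit of each firm is
$$\pi_1 = (a - q_1 - q_2)\,q_1 - \left(c - \beta_1\sqrt{x_1}\right) q_1 - x_1, \tag{1}$$
$$\pi_2 = (a - q_1 - q_2)\,q_2 - \left(c - \beta_2\sqrt{x_2} - \theta\beta_2\sqrt{x_1}\right) q_2 - x_2. \tag{2}$$
Then, the marginal profit of each firm at the point $(q_1, q_2)$ is given by
$$\frac{\partial \pi_1}{\partial q_1} = a - c - 2q_1 - q_2 + \beta_1\sqrt{x_1}, \tag{3}$$
$$\frac{\partial \pi_2}{\partial q_2} = a - c - q_1 - 2q_2 + \beta_2\sqrt{x_2} + \theta\beta_2\sqrt{x_1}. \tag{4}$$
To maximize each firm's profit, set $\partial\pi_1/\partial q_1 = 0$ and $\partial\pi_2/\partial q_2 = 0$ and solve for $q_1$ and $q_2$; the Cournot Nash outputs are then
$$q_1 = \frac{a - c + (2\beta_1 - \theta\beta_2)\sqrt{x_1} - \beta_2\sqrt{x_2}}{3}, \tag{5}$$
$$q_2 = \frac{a - c + (2\theta\beta_2 - \beta_1)\sqrt{x_1} + 2\beta_2\sqrt{x_2}}{3}. \tag{6}$$
Substituting equations (5) and (6) into equation (2), differentiating $\pi_2$ with respect to $x_2$, and setting the derivative to zero, we find that the optimization problem of the follower has the unique solution
$$x_2 = \left[\frac{2\beta_2\left(a - c + (2\theta\beta_2 - \beta_1)\sqrt{x_1}\right)}{\Delta_1}\right]^2. \tag{7}$$
Substituting equations (5)-(7) into equation (1) and equating the partial derivative of $\pi_1$ with respect to $x_1$ to zero, we obtain the optimal actions of the firms:
$$x_1 = \left[\frac{(a - c)\,\Delta_2\,(9 - 6\beta_2^2)}{9\Delta_1^2 - \Delta_2^2}\right]^2, \tag{8}$$
$$x_2 = \left[\frac{2\beta_2 (a - c)\left(9\Delta_1^2 - \Delta_2^2 + (2\theta\beta_2 - \beta_1)(9 - 6\beta_2^2)\Delta_2\right)}{\Delta_1\left(9\Delta_1^2 - \Delta_2^2\right)}\right]^2, \tag{9}$$
where $\Delta_1 = 9 - 4\beta_2^2$ and $\Delta_2 = 18\beta_1 - 9\theta\beta_2 - 6\beta_1\beta_2^2$. Substituting equations (8) and (9) into equations (5) and (6), we obtain the equilibrium solution of the Stackelberg-Cournot game:
$$q_1^* = \frac{3(a - c)\,\Delta_1\,(9 - 6\beta_2^2)}{9\Delta_1^2 - \Delta_2^2}, \qquad q_2^* = \frac{3(a - c)\left(9\Delta_1^2 - \Delta_2^2 + (2\theta\beta_2 - \beta_1)(9 - 6\beta_2^2)\Delta_2\right)}{\Delta_1\left(9\Delta_1^2 - \Delta_2^2\right)}. \tag{10}$$
Discrete Dynamics in Nature and Society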
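The backward-induction steps can be checked numerically. The following is a minimal Python sketch, not the authors' code. It assumes the follower's marginal cost is $c_2 = c - \beta_2\sqrt{x_2} - \theta\beta_2\sqrt{x_1}$, a form inferred from the definition of $\Delta_2$ and from the equilibrium values reported in the numerical section, and it assumes that each profit is revenue minus production cost minus R&D investment. The function names are hypothetical.

```python
def solve_stackelberg_cournot(a, c, b1, b2, theta):
    """Backward-induction solution of the two-stage game.

    Assumed model (a reconstruction, not quoted from the paper):
      c1 = c - b1*sqrt(x1)
      c2 = c - b2*sqrt(x2) - theta*b2*sqrt(x1)
      profit_i = (a - q1 - q2 - c_i) * q_i - x_i
    Returns (x1, x2, q1, q2) at the equilibrium.
    """
    A = a - c
    d1 = 9 - 4 * b2**2                                   # Delta_1
    d2 = 18 * b1 - 9 * theta * b2 - 6 * b1 * b2**2       # Delta_2
    s1 = A * d2 * (9 - 6 * b2**2) / (9 * d1**2 - d2**2)  # sqrt(x1), leader
    s2 = 2 * b2 * (A + (2 * theta * b2 - b1) * s1) / d1  # sqrt(x2), follower
    q1 = (A + (2 * b1 - theta * b2) * s1 - b2 * s2) / 3  # Cournot outputs
    q2 = (A + (2 * theta * b2 - b1) * s1 + 2 * b2 * s2) / 3
    return s1**2, s2**2, q1, q2


def leader_profit(x1, a, c, b1, b2, theta):
    """Leader's profit when the follower and both outputs respond optimally."""
    A = a - c
    s1 = x1 ** 0.5
    s2 = 2 * b2 * (A + (2 * theta * b2 - b1) * s1) / (9 - 4 * b2**2)
    q1 = (A + (2 * b1 - theta * b2) * s1 - b2 * s2) / 3
    q2 = (A + (2 * theta * b2 - b1) * s1 + 2 * b2 * s2) / 3
    return (a - q1 - q2 - c + b1 * s1) * q1 - x1


params = dict(a=10.0, c=2.0, b1=0.6, b2=0.3, theta=0.2)
x1, x2, q1, q2 = solve_stackelberg_cournot(**params)
print(f"x1={x1:.3f}, x2={x2:.3f}, q1*={q1:.2f}, q2*={q2:.2f}")
# prints: x1=1.377, x2=0.267, q1*=3.06, q2*=2.58
```

With the parameter values used in the paper's simulations, the sketch returns the investments and outputs reported there, which supports the assumed cost form. `leader_profit` can be used to confirm by finite differences that the leader's profit is stationary at the computed $x_1$.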

Duopoly Stackelberg-Cournot Game with Heterogeneous Strategies
We consider two firms that use different strategies to decide their outputs for profit maximization. The leader is boundedly rational: it does not have complete knowledge of the market demand function and determines its output on the basis of the expected marginal profit $\partial\pi_1/\partial q_1$. Consequently, it increases (decreases) its production in the next period if the marginal profit is positive (negative). The dynamic adjustment mechanism can be modeled as
$$q_1(t+1) = q_1(t) + v\,q_1(t)\,\frac{\partial \pi_1}{\partial q_1(t)}, \tag{11}$$
where $v$ is a positive constant that represents the output adjustment speed of firm 1. We assume that the follower is a naïve player: it computes its output according to the reaction function derived from equation (4); that is, the dynamical equation of firm 2 has the form
$$q_2(t+1) = \frac{1}{2}\left[a - c - q_1(t) + \beta_2\sqrt{x_2} + \theta\beta_2\sqrt{x_1}\right]. \tag{12}$$
Combining equations (11) and (12), the two-dimensional system that characterizes the dynamics of a Stackelberg-Cournot duopoly game with heterogeneous players is given by
$$\begin{cases} q_1(t+1) = q_1(t) + v\,q_1(t)\left[a - c - 2q_1(t) - q_2(t) + \beta_1\sqrt{x_1}\right], \\ q_2(t+1) = \dfrac{1}{2}\left[a - c - q_1(t) + \beta_2\sqrt{x_2} + \theta\beta_2\sqrt{x_1}\right]. \end{cases} \tag{13}$$
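The two adjustment rules can be iterated directly. Below is a minimal Python sketch of the map, under the assumption that the follower's marginal cost is $c - \beta_2\sqrt{x_2} - \theta\beta_2\sqrt{x_1}$ and that $x_1$, $x_2$ are held at their backward-induction values. The helper name `make_map` and the starting point are illustrative choices, not taken from the paper.

```python
def make_map(a, c, b1, b2, theta, v):
    """Build one-step update (q1, q2) -> (q1', q2') for the heterogeneous duopoly.

    Firm 1 adjusts output in proportion to its marginal profit (bounded
    rationality); firm 2 plays its best reply to firm 1's last output (naive).
    The follower's cost form is an assumption, see the lead-in.
    """
    A = a - c
    d1 = 9 - 4 * b2**2
    d2 = 18 * b1 - 9 * theta * b2 - 6 * b1 * b2**2
    s1 = A * d2 * (9 - 6 * b2**2) / (9 * d1**2 - d2**2)  # sqrt(x1)
    s2 = 2 * b2 * (A + (2 * theta * b2 - b1) * s1) / d1  # sqrt(x2)

    def step(q1, q2):
        marginal_profit_1 = A - 2 * q1 - q2 + b1 * s1
        q1_next = q1 + v * q1 * marginal_profit_1
        q2_next = 0.5 * (A - q1 + b2 * s2 + theta * b2 * s1)
        return q1_next, q2_next

    return step


# Slow adjustment (v = 0.2): the orbit settles on the Nash equilibrium.
step = make_map(a=10.0, c=2.0, b1=0.6, b2=0.3, theta=0.2, v=0.2)
q1, q2 = 3.0, 2.5  # hypothetical starting outputs
for _ in range(500):
    q1, q2 = step(q1, q2)
print(round(q1, 2), round(q2, 2))
```

Keeping the R&D levels inside the closure mirrors the model's timing: investments are fixed in the first stage, and only outputs evolve over time.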

Equilibrium Points and Local Stability
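In this section, we solve for the equilibrium points of the dynamic duopoly game and study their qualitative behavior. Setting $q_i(t+1) = q_i(t)$, $i = 1, 2$, in (13), we obtain the nonlinear algebraic system
$$\begin{cases} v\,q_1\left[a - c - 2q_1 - q_2 + \beta_1\sqrt{x_1}\right] = 0, \\ \dfrac{1}{2}\left[a - c - q_1 + \beta_2\sqrt{x_2} + \theta\beta_2\sqrt{x_1}\right] - q_2 = 0. \end{cases} \tag{14}$$
The algebraic system (14) has two equilibrium points:
$$E_1 = \left(0,\ \frac{a - c + \beta_2\sqrt{x_2} + \theta\beta_2\sqrt{x_1}}{2}\right), \qquad E_2 = \left(q_1^*, q_2^*\right), \tag{15}$$
where the expressions of $x_1$ and $x_2$ are given by equations (8) and (9), respectively, and $q_1^*$ and $q_2^*$ are given in equation (10). Clearly, $E_1$ is a boundary equilibrium point, and $E_2$ is the unique Nash equilibrium point. $E_2$ has positive coordinates provided that
$$\begin{cases} a - c + (2\beta_1 - \theta\beta_2)\sqrt{x_1} - \beta_2\sqrt{x_2} > 0, \\ a - c + (2\theta\beta_2 - \beta_1)\sqrt{x_1} + 2\beta_2\sqrt{x_2} > 0. \end{cases} \tag{16}$$
To investigate the local stability of the equilibrium points $E_1$ and $E_2$, we need the Jacobian matrix of system (13) at any point $(q_1, q_2)$, which takes the form
$$J(q_1, q_2) = \begin{pmatrix} 1 + v\left(a - c - 4q_1 - q_2 + \beta_1\sqrt{x_1}\right) & -v\,q_1 \\ -\dfrac{1}{2} & 0 \end{pmatrix}. \tag{17}$$
An equilibrium point is stable if the eigenvalues $\varphi_i$, $i = 1, 2$, of the Jacobian matrix evaluated at that point satisfy $|\varphi_i| < 1$.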

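Proposition 1. If the Nash equilibrium $E_2$ is strictly positive, the boundary equilibrium point $E_1$ of the discrete dynamical system (13) is a saddle point.
At the boundary equilibrium point $E_1$, the Jacobian matrix becomes a triangular matrix and takes the form
$$J(E_1) = \begin{pmatrix} 1 + \dfrac{v}{2}\left[a - c + (2\beta_1 - \theta\beta_2)\sqrt{x_1} - \beta_2\sqrt{x_2}\right] & 0 \\ -\dfrac{1}{2} & 0 \end{pmatrix}. \tag{18}$$
The eigenvalues are given by the diagonal entries, i.e.,
$$\varphi_1 = 1 + \frac{v}{2}\left[a - c + (2\beta_1 - \theta\beta_2)\sqrt{x_1} - \beta_2\sqrt{x_2}\right], \qquad \varphi_2 = 0. \tag{19}$$
As we are only interested in positive trajectories, we can deduce $\varphi_1 > 1$ from equation (16); therefore, the eigenvalue $\varphi_1$ is greater than 1 while $|\varphi_2|$ is less than 1, and $E_1$ is a saddle point (unstable).
Similarly, we analyze the asymptotic stability of the Nash equilibrium of the two-dimensional map (13). The Jacobian matrix at $E_2$ has the form
$$J(E_2) = \begin{pmatrix} 1 - 2v q_1^* & -v q_1^* \\ -\dfrac{1}{2} & 0 \end{pmatrix}. \tag{20}$$
The characteristic equation of the matrix $J(E_2)$ is
$$\varphi^2 - \mathrm{Tr}(J)\,\varphi + \mathrm{Det}(J) = 0, \tag{21}$$
where $\mathrm{Tr}(J) = 1 - 2v q_1^*$ is the trace and $\mathrm{Det}(J) = -\tfrac{1}{2} v q_1^*$ is the determinant of the Jacobian matrix $J(E_2)$. Since $(\mathrm{Tr}(J))^2 - 4\,\mathrm{Det}(J) = (1 - 2v q_1^*)^2 + 2v q_1^* > 0$, the characteristic equation has two real roots.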
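The saddle property of $E_1$ and the real-root property at $E_2$ can be verified for the paper's parameter values. The sketch below is illustrative only. It assumes the map $q_1' = q_1 + v q_1(a - c - 2q_1 - q_2 + \beta_1\sqrt{x_1})$, $q_2' = \tfrac{1}{2}(a - c - q_1 + \beta_2\sqrt{x_2} + \theta\beta_2\sqrt{x_1})$, with the follower's cost form being an assumption, and the variable names are hypothetical.

```python
def rd_levels(a, c, b1, b2, theta):
    """sqrt(x1), sqrt(x2) from backward induction (assumed cost form)."""
    A = a - c
    d1 = 9 - 4 * b2**2
    d2 = 18 * b1 - 9 * theta * b2 - 6 * b1 * b2**2
    s1 = A * d2 * (9 - 6 * b2**2) / (9 * d1**2 - d2**2)
    s2 = 2 * b2 * (A + (2 * theta * b2 - b1) * s1) / d1
    return s1, s2


a, c, b1, b2, theta, v = 10.0, 2.0, 0.6, 0.3, 0.2, 0.2
A = a - c
s1, s2 = rd_levels(a, c, b1, b2, theta)

# Nash output of firm 1 and the boundary point E1 = (0, q2_boundary).
q1_star = (A + (2 * b1 - theta * b2) * s1 - b2 * s2) / 3
q2_boundary = 0.5 * (A + b2 * s2 + theta * b2 * s1)

# At E1 the Jacobian is lower triangular, so its eigenvalues are the diagonal.
phi1 = 1 + v * (A - q2_boundary + b1 * s1)
phi2 = 0.0

# Discriminant of the characteristic polynomial at E2 (positive => real roots).
trace = 1 - 2 * v * q1_star
det = -0.5 * v * q1_star
discriminant = trace**2 - 4 * det
print(phi1, phi2, discriminant)
```

Under these assumptions $\varphi_1$ simplifies to $1 + \tfrac{3}{2} v q_1^*$, so it exceeds 1 exactly when the Nash output of firm 1 is positive.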
As is known from stability theory, the necessary and sufficient condition for the local stability of the Nash equilibrium $E_2$ is that the eigenvalues of the Jacobian matrix $J(E_2)$ lie inside the unit circle in the complex plane, which is the case if and only if the following Jury conditions (see Peng et al. [15]) hold:
$$\text{(i)}\ 1 + \mathrm{Tr}(J) + \mathrm{Det}(J) > 0, \qquad \text{(ii)}\ 1 - \mathrm{Tr}(J) + \mathrm{Det}(J) > 0, \qquad \text{(iii)}\ 1 - \mathrm{Det}(J) > 0. \tag{22}$$
Inequalities (i), (ii), and (iii) define the region in which the Nash equilibrium point $E_2$ is locally stable. The violation of any single inequality, with the other two simultaneously fulfilled, leads to (1) a flip bifurcation (a real eigenvalue that passes through $-1$) when $1 + \mathrm{Tr}(J) + \mathrm{Det}(J) = 0$; (2) a fold or transcritical bifurcation (a real eigenvalue that passes through 1) when $1 - \mathrm{Tr}(J) + \mathrm{Det}(J) = 0$; and (3) a Neimark-Sacker bifurcation (i.e., the modulus of a complex eigenvalue pair passes through 1) when $1 - \mathrm{Det}(J) = 0$ and $|\mathrm{Tr}(J)| < 2$.
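The three Jury conditions can be evaluated side by side with the eigenvalues themselves. The sketch below assumes $\mathrm{Tr}(J) = 1 - 2vq_1^*$ and $\mathrm{Det}(J) = -\tfrac{1}{2}vq_1^*$ at $E_2$, as implied by the discriminant stated in the text, and uses a closed form for $q_1^*$ that rests on the assumed follower cost $c - \beta_2\sqrt{x_2} - \theta\beta_2\sqrt{x_1}$. Function names are hypothetical.

```python
import math


def q1_nash(a, c, b1, b2, theta):
    """Nash output of firm 1 under the assumed cost structure."""
    A = a - c
    d1 = 9 - 4 * b2**2
    d2 = 18 * b1 - 9 * theta * b2 - 6 * b1 * b2**2
    return 3 * A * d1 * (9 - 6 * b2**2) / (9 * d1**2 - d2**2)


def jury_conditions(v, q1_star):
    """Left-hand sides of conditions (i), (ii), (iii); all must be positive."""
    trace = 1 - 2 * v * q1_star
    det = -0.5 * v * q1_star
    return 1 + trace + det, 1 - trace + det, 1 - det


def eigenvalues(v, q1_star):
    """Real eigenvalues of the Jacobian at E2 (discriminant is positive)."""
    trace = 1 - 2 * v * q1_star
    det = -0.5 * v * q1_star
    root = math.sqrt(trace**2 - 4 * det)
    return (trace - root) / 2, (trace + root) / 2


q1_star = q1_nash(10.0, 2.0, 0.6, 0.3, 0.2)
for v in (0.25, 0.27):
    print(v, jury_conditions(v, q1_star), eigenvalues(v, q1_star))
```

For these parameters only condition (i) changes sign between $v = 0.25$ and $v = 0.27$, and the smaller eigenvalue crosses $-1$, which is the signature of a flip bifurcation.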
Substituting $\mathrm{Tr}(J)$ and $\mathrm{Det}(J)$ into inequalities (i), (ii), and (iii), the stability conditions in (22) can be written as
$$\text{(i)}\ 2 - \frac{5}{2} v q_1^* > 0, \qquad \text{(ii)}\ \frac{3}{2} v q_1^* > 0, \qquad \text{(iii)}\ 1 + \frac{1}{2} v q_1^* > 0. \tag{23}$$
Inequalities (ii) and (iii) are always satisfied because $v > 0$ and $q_1^* > 0$. Then, condition (23) becomes
$$4 - 5 v q_1^* > 0. \tag{24}$$
The threshold $v^* = 4/(5q_1^*)$ is given by the vanishing of the left-hand side of inequality (24). Therefore, the Nash equilibrium point $E_2$ can lose stability only through a flip bifurcation. We have the following proposition about the local stability of the Nash equilibrium point $E_2$.

Proposition 2. The Nash equilibrium point $E_2$ is locally asymptotically stable provided that $0 < v < v^* = 4/(5q_1^*)$.
From the foregoing, some information about the effects of the model parameters on the local stability of the equilibrium $E_2$ can be obtained. The Nash equilibrium point $E_2$ is stable for any given $v < v^*$, which means that, if the adjustment speed of firm 1 lies in the interval $0 < v < v^*$, the outputs of the two firms tend towards the Nash equilibrium $E_2$. With the other parameters held fixed, an increase in the output adjustment speed $v$ has a destabilizing effect; that is, the trajectory of the point $(q_1, q_2)$ crosses the flip bifurcation surface at $v = v^*$, and period-2 points bifurcate from $E_2$ when $v > v^*$.
A similar analysis applies to each of the parameters $a$, $c$, $\theta$, $\beta_1$, and $\beta_2$ with the other model parameters held fixed. Complex behaviors, such as period doubling and chaotic attractors, can also occur, with chaos indicated by a positive maximum Lyapunov exponent of system (13).
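To see how the flip threshold $v^* = 4/(5q_1^*)$ moves with the other parameters, one can tabulate it. The sketch below is a hedged illustration: the closed form for $q_1^*$ assumes the follower's marginal cost is $c - \beta_2\sqrt{x_2} - \theta\beta_2\sqrt{x_1}$, and the function names and the grid of $\theta$ values are arbitrary choices.

```python
def q1_nash(a, c, b1, b2, theta):
    """Nash output of firm 1 under the assumed cost structure."""
    A = a - c
    d1 = 9 - 4 * b2**2
    d2 = 18 * b1 - 9 * theta * b2 - 6 * b1 * b2**2
    return 3 * A * d1 * (9 - 6 * b2**2) / (9 * d1**2 - d2**2)


def flip_threshold(a, c, b1, b2, theta):
    """Adjustment speed v* at which E2 loses stability: v* = 4 / (5 q1*)."""
    return 4 / (5 * q1_nash(a, c, b1, b2, theta))


# Threshold for the baseline parameters, then a sweep over the spillover.
base = flip_threshold(a=10.0, c=2.0, b1=0.6, b2=0.3, theta=0.2)
print(f"baseline v* = {base:.3f}")
for theta in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
    v_star = flip_threshold(a=10.0, c=2.0, b1=0.6, b2=0.3, theta=theta)
    print(f"theta = {theta:.1f}  ->  v* = {v_star:.4f}")
```

In this sketch a larger spillover $\theta$ lowers $q_1^*$ and therefore raises $v^*$, while a larger $\beta_1$ does the opposite, which agrees with the qualitative statement that small spillovers favor instability.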

Numerical Simulation
The main purpose of this section is to show the qualitative behavior of the Stackelberg-Cournot duopoly game with heterogeneous players described by system (13) and to exhibit how the system evolves as the model parameters take different values. To provide numerical evidence for the existence of chaotic motion, we use several numerical tools, including bifurcation diagrams, strange attractors, maximum Lyapunov exponents, and sensitive dependence on initial conditions. Figure 1 presents a bifurcation diagram of system (13), showing $q_1$ and $q_2$ against $v$, for $a = 10$, $c = 2$, $\beta_1 = 0.6$, $\beta_2 = 0.3$, and $\theta = 0.2$. From Figure 1, we can see that the orbit of outputs $(q_1, q_2)$ approaches the stable fixed point $E_2 = (3.06, 2.58)$ for adjustment speeds $v < v^* = 0.261$; the corresponding optimal investments of the firms are $x_1 = 1.377$ and $x_2 = 0.267$. As $v$ increases, the Nash equilibrium point $E_2 = (3.06, 2.58)$ becomes unstable: a flip bifurcation of system (13) takes place at $v = v^* = 0.261$, period-2 points bifurcate for $v > 0.261$, and a cascade of further period-doubling bifurcations leads to chaos, which means that the dynamical game (13) exhibits complex dynamics for large values of the adjustment speed of the boundedly rational player, firm 1. In reality, the outputs of the firms fluctuate sharply when bifurcation and chaos occur; therefore, it is hard for the players to forecast their outputs and make future decisions. To show bifurcations and chaos, the maximum Lyapunov exponent is also plotted in Figure 1, where positive values indicate chaotic behavior, and the maximum Lyapunov exponent equals zero at the bifurcation points. Figures 2-4 show partial bifurcation diagrams of system (13) with respect to the parameters $\theta$, $\beta_1$, and $\beta_2$.
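The data behind a bifurcation diagram in $v$ and an estimate of the maximum Lyapunov exponent can be produced along the following lines. This is a minimal sketch rather than the authors' procedure: it assumes the follower's marginal cost is $c - \beta_2\sqrt{x_2} - \theta\beta_2\sqrt{x_1}$, fixes the paper's baseline parameters, and uses a hypothetical starting point and iteration counts. The exponent is estimated by propagating a tangent vector with the Jacobian and averaging the logarithm of its growth.

```python
import math

# Baseline parameters; A = a - c with a = 10, c = 2.
A, B1, B2, THETA = 8.0, 0.6, 0.3, 0.2
D1 = 9 - 4 * B2**2
D2 = 18 * B1 - 9 * THETA * B2 - 6 * B1 * B2**2
S1 = A * D2 * (9 - 6 * B2**2) / (9 * D1**2 - D2**2)  # sqrt(x1)
S2 = 2 * B2 * (A + (2 * THETA * B2 - B1) * S1) / D1  # sqrt(x2)


def step(q1, q2, v):
    """One iteration of the assumed output map."""
    q1_next = q1 + v * q1 * (A - 2 * q1 - q2 + B1 * S1)
    q2_next = 0.5 * (A - q1 + B2 * S2 + THETA * B2 * S1)
    return q1_next, q2_next


def orbit_tail(v, n_transient=2000, n_keep=200, q0=(3.0, 2.5)):
    """q1 values on the attractor: one column of a bifurcation diagram."""
    q1, q2 = q0
    for _ in range(n_transient):
        q1, q2 = step(q1, q2, v)
    tail = []
    for _ in range(n_keep):
        q1, q2 = step(q1, q2, v)
        tail.append(q1)
    return tail


def max_lyapunov(v, n_transient=2000, n_iter=5000, q0=(3.0, 2.5)):
    """Average log growth rate of a tangent vector along the orbit."""
    q1, q2 = q0
    for _ in range(n_transient):
        q1, q2 = step(q1, q2, v)
    w1, w2 = 1.0, 0.0
    total = 0.0
    for _ in range(n_iter):
        j11 = 1 + v * (A - 4 * q1 - q2 + B1 * S1)
        j12 = -v * q1
        # Second row of the Jacobian is (-1/2, 0).
        w1, w2 = j11 * w1 + j12 * w2, -0.5 * w1
        norm = math.hypot(w1, w2)
        total += math.log(norm)
        w1, w2 = w1 / norm, w2 / norm
        q1, q2 = step(q1, q2, v)
    return total / n_iter


for v in (0.20, 0.28, 0.41):
    tail = orbit_tail(v)
    print(v, round(max(tail) - min(tail), 4), round(max_lyapunov(v), 4))
```

Sweeping `v` over a fine grid and plotting each `orbit_tail(v)` against `v` gives a diagram of the kind described for Figure 1; a negative exponent indicates a stable cycle or fixed point, and a positive one indicates chaos.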
From Figures 2 and 4, we can see that system (13) experiences chaos and period-halving bifurcations: the system dynamics is chaotic for small values of $\theta$ or $\beta_2$, and period-halving bifurcations occur as $\theta$ or $\beta_2$ increases. Figure 3 gives the bifurcation diagram with respect to $\beta_1$; the firms' outputs are unstable even for small values of $\beta_1$, and as $\beta_1$ increases, complex dynamic behavior occurs, including higher-order cycles and chaos. The strange attractor is a standard tool for characterizing the chaos of a dynamic system, and it reflects the inherent regularity of the complex phenomena in a chaotic state. With its help, players can forecast their outputs in the short term. Figure 5 shows the graph of a strange attractor of the dynamic game (13) for the parameter values $a = 10$, $c = 2$,

To demonstrate the sensitivity of system (13) to initial conditions, we compute two pairs of orbits: one pair with initial points $(q_1(0), q_2(0))$ and $(q_1(0) + 0.0001, q_2(0))$, and the other with initial points $(q_1(0), q_2(0))$ and $(q_1(0), q_2(0) + 0.0001)$. The results are shown in Figure 6: the time series are indistinguishable at the beginning, but after a number of iterations, the difference between them builds up rapidly. Figure 6 shows the sensitive dependence on initial conditions for the $q_1$-coordinate (or $q_2$-coordinate) of the two orbits of system (13), plotted against time for the parameter values $a = 10$, $c = 2$, $\beta_1 = 0.6$, $\beta_2 = 0.3$, $\theta = 0.2$, and $v = 0.42$, where the $q_1$-coordinates (or $q_2$-coordinates) of the initial conditions differ by 0.0001 and the other coordinate is kept equal. From Figure 6, we can see that the time series of system (13) depends sensitively on initial conditions, i.e., complex dynamic behavior occurs in this model.
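The experiment described here can be reproduced in outline as follows. The sketch assumes the follower's marginal cost is $c - \beta_2\sqrt{x_2} - \theta\beta_2\sqrt{x_1}$, and the initial point $(3.0, 2.5)$ is a hypothetical choice because the paper's initial outputs are not stated in this section. It records the gap between the $q_1$-coordinates of two orbits whose initial $q_1$ differ by 0.0001.

```python
# Baseline parameters; A = a - c with a = 10, c = 2.
A, B1, B2, THETA = 8.0, 0.6, 0.3, 0.2
D1 = 9 - 4 * B2**2
D2 = 18 * B1 - 9 * THETA * B2 - 6 * B1 * B2**2
S1 = A * D2 * (9 - 6 * B2**2) / (9 * D1**2 - D2**2)  # sqrt(x1)
S2 = 2 * B2 * (A + (2 * THETA * B2 - B1) * S1) / D1  # sqrt(x2)


def step(q1, q2, v):
    """One iteration of the assumed output map."""
    q1_next = q1 + v * q1 * (A - 2 * q1 - q2 + B1 * S1)
    q2_next = 0.5 * (A - q1 + B2 * S2 + THETA * B2 * S1)
    return q1_next, q2_next


def separation(v, delta=1e-4, n_steps=100, q0=(3.0, 2.5)):
    """Gap |q1 - q1'| over time for two orbits whose initial q1 differ by delta."""
    qa = q0
    qb = (q0[0] + delta, q0[1])
    gaps = [abs(qa[0] - qb[0])]
    for _ in range(n_steps):
        qa = step(*qa, v)
        qb = step(*qb, v)
        gaps.append(abs(qa[0] - qb[0]))
    return gaps


stable_gaps = separation(v=0.20)   # below the flip threshold
chaotic_gaps = separation(v=0.42)  # value used for Figure 6
print(stable_gaps[-1], max(chaotic_gaps))
```

In the stable regime the gap shrinks towards zero, whereas a gap that grows by orders of magnitude at $v = 0.42$ would be the numerical counterpart of the divergence described for Figure 6.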

Chaos Control
In a nonlinear, dynamic discrete production system, many factors, such as the adjustment speed, R&D investments, TIE, and R&D spillovers, can make the market deviate from the equilibrium state and even become chaotic. In a chaotic state, the market depends sensitively on the parameter values, and parameter variations make the market's long-term trajectory unpredictable. Precisely because chaos in the market is unexpected and even harmful to the participants, certain methods should be adopted to suppress or eliminate bifurcations and chaos. Various methods for controlling chaos have been used in dynamical systems: the OGY method was presented by Ott et al. [40] and has been applied to dynamic game models to control chaos [41, 42]; a modified straight-line stabilization method [12], adaptive control [13], the time-delayed feedback method [43], and other feedback control methods [6-8] have also been studied for chaos control in economic models with homogeneous or heterogeneous expectations. Previous works show that feedback and parameter variation are two effective methods [9, 12, 13, 16, 27, 28, 40-44] for achieving chaos control. Recently, a new control method, called the control strategy of state variables feedback and parameter variation, was proposed [45] and has been used in [8, 13, 26]. In this section, the same method is used to control the chaos of system (13); hence, the two-dimensional discrete dynamic system (13) is changed into the following form:
$$\begin{cases} q_1(t+1) = (1 - \mu)\left\{q_1(t) + v\,q_1(t)\left[a - c - 2q_1(t) - q_2(t) + \beta_1\sqrt{x_1}\right]\right\} + \mu\,q_1(t), \\ q_2(t+1) = \dfrac{1 - \mu}{2}\left[a - c - q_1(t) + \beta_2\sqrt{x_2} + \theta\beta_2\sqrt{x_1}\right] + \mu\,q_2(t), \end{cases} \tag{25}$$
where $\mu > 0$ is the controlling factor.
[Figure: bifurcation diagram of system (25) with respect to the controlling factor $\mu$, with the other parameter values $a = 10$, $c = 2$, $\beta_1 = 0.6$, $\beta_2 = 0.3$, $\theta = 0.2$, $v = 0.41$. As $\mu$ increases, the chaos of the system is gradually controlled, and the system becomes stable when $\mu$ is large enough.]
As the adjustment speed of firm 1 goes up, system (13) falls into an instability region; Figure 1 shows the bifurcation diagram with respect to $v$, and Figure 5 gives a map of the strange attractor corresponding to the chaotic state ($a = 10$, $c = 2$, $\beta_1 = 0.6$, $\beta_2 = 0.3$, $\theta = 0.2$, $v = 0.41$). However, after the controlling factor $\mu$ is added to the chaotic state, the complex dynamics can be forced to become steady. Figure 7 indicates that the system gets rid of chaos successfully when the controlling parameter $\mu$ reaches 0.298, and Figure 8 shows that the chaotic system is controlled to a fixed point when $\mu = 0.31$.
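The stabilizing value of $\mu$ can be estimated from the linearization at $E_2$. The sketch below assumes that the controlled system mixes the uncontrolled map $f$ with the current state as $(1 - \mu) f(q) + \mu q$, and that the follower's marginal cost is $c - \beta_2\sqrt{x_2} - \theta\beta_2\sqrt{x_1}$; both are assumptions rather than quotations from the paper. Under this form each eigenvalue $\lambda$ of the uncontrolled Jacobian becomes $(1 - \mu)\lambda + \mu$, so stability requires $\mu > (-1 - \lambda_{\min})/(1 - \lambda_{\min})$ when $\lambda_{\min} < -1$. Names are hypothetical.

```python
import math

# Chaotic baseline: A = a - c with a = 10, c = 2, and v = 0.41.
A, B1, B2, THETA, V = 8.0, 0.6, 0.3, 0.2, 0.41
D1 = 9 - 4 * B2**2
D2 = 18 * B1 - 9 * THETA * B2 - 6 * B1 * B2**2
S1 = A * D2 * (9 - 6 * B2**2) / (9 * D1**2 - D2**2)  # sqrt(x1)
S2 = 2 * B2 * (A + (2 * THETA * B2 - B1) * S1) / D1  # sqrt(x2)
Q1_STAR = (A + (2 * B1 - THETA * B2) * S1 - B2 * S2) / 3
Q2_STAR = (A + (2 * THETA * B2 - B1) * S1 + 2 * B2 * S2) / 3


def controlled_step(q1, q2, mu):
    """Assumed controlled map: convex mix of the original map and the state."""
    f1 = q1 + V * q1 * (A - 2 * q1 - q2 + B1 * S1)
    f2 = 0.5 * (A - q1 + B2 * S2 + THETA * B2 * S1)
    return (1 - mu) * f1 + mu * q1, (1 - mu) * f2 + mu * q2


def control_threshold():
    """Smallest mu for which E2 is linearly stable under the assumed control."""
    trace = 1 - 2 * V * Q1_STAR
    det = -0.5 * V * Q1_STAR
    lam_min = (trace - math.sqrt(trace**2 - 4 * det)) / 2
    return max(0.0, (-1 - lam_min) / (1 - lam_min))


mu_star = control_threshold()
print(f"mu* = {mu_star:.3f}")

# With mu = 0.31, a small perturbation of E2 decays back to E2.
q1, q2 = Q1_STAR + 1e-3, Q2_STAR
for _ in range(5000):
    q1, q2 = controlled_step(q1, q2, mu=0.31)
print(round(q1, 4), round(q2, 4))
```

Under these assumptions the threshold evaluates to about 0.298, in line with the value read from Figure 7, which lends some support to the assumed form of the controlled system.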
As discussed in Section 3, many parameters, such as the R&D leader's adjustment speed, the R&D spillover, and TIE, affect the stability of the equilibrium output. Therefore, to facilitate a stable output, the R&D leader can slow its output adjustment speed with the other parameters held fixed, as shown in Figure 1; the firms can also foster an atmosphere of technology sharing, as shown in Figure 2; in addition, the R&D follower can improve its innovation efficiency, as shown in Figure 4. Besides the important impact of parameter changes on the equilibrium output under different adjustment mechanisms, we should also consider the impact of previous output, especially the production quantity in the last period. The control method shows that, when the duopoly market is unstable, the output of the previous period should be given more weight in the production adjustment for the next period.

Conclusions
This paper investigates a dynamic Stackelberg-Cournot duopoly game with one-way spillovers. Two types of heterogeneous players, who adopt the boundedly rational expectation and the naïve mechanism, respectively, determine their R&D investments sequentially in the Stackelberg R&D phase and make output decisions simultaneously in the Cournot production phase. The R&D investments made before the Cournot production phase have been solved by backward induction. The dynamics of the system under different regimes of the main parameters, such as the R&D leader's adjustment speed, R&D investments, technology spillovers, and TIE, have been explored. Basic properties of the discrete dynamical system have been analyzed numerically by computing Lyapunov exponents, bifurcation diagrams, sensitive dependence on initial conditions, and strange attractors and by controlling chaos. The results show that complex dynamic behaviors, such as cycles and chaos, occur as the model parameters vary and that the chaotic behavior of the system can be stabilized to a fixed point by introducing an appropriate controlling parameter.

Data Availability
Because the simulation graphs in the paper are based on virtual data generated under certain conditions, the data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.