Research on Dynamic Programming Game Model for Hydropower Stations

In an electricity market, the benefit of a hydropower station with price-making ability is the result of market gaming. The traditional energy maximization model is not appropriate in this circumstance. To study long-term operation variation in the market, a dynamic programming game model of hydropower stations is proposed to obtain a Nash equilibrium solution over a long-term time series. A dynamic programming algorithm solves the model iteratively. The proposed approach is applied to the Longtan, Xiaowan, and Goupitan hydropower stations in a hypothetical pure hydropower market. The results show that the dynamic programming game model clearly outperforms the energy maximization model in terms of hydropower station benefit, and the differences between the operation results of the game model and the optimization model are analyzed. For the studied cascaded reservoirs, the game model increases benefit by 4.1%–8.2% at the cost of a 0.2%–3.8% energy loss relative to the energy maximization model in this hypothetical electricity market.


Introduction
With the development of China's electricity market and the reform of the electricity system, new challenges have been posed to the market-oriented operation of China's power generation enterprises, especially hydropower enterprises. In principle, the objective of hydropower enterprises is to produce electricity and obtain greater benefits with lower market risks [1]. Therefore, it is an inevitable trend for hydropower stations to actively participate in the electricity market. The production objective of hydropower enterprises has changed from the pursuit of the maximum total power generation in the dispatching period to the pursuit of the maximum total benefit. In addition, unlike the traditional centralized dispatching operation mode, hydropower operation in the electricity market environment is centered on the market: the power grid operation status and dispatching plan are determined by the market. At the same time, hydropower generation relies heavily on natural runoff, which is unevenly distributed in space and time and strongly random. Therefore, in order to meet the power generation goal of the whole year or longer, multiple hydropower stations with different regulating performances in a cascade need to operate in a coordinated way in the long, medium, and short term so as to achieve orderly regulation and redistribution of natural runoff [2]. Moreover, there are close hydraulic and electric connections between the upstream and downstream hydropower stations of a cascade [3]. In recent years, Yunnan, Guizhou, Shenzhen, and western provinces or regions of Inner Mongolia have implemented new market reform measures, and the operation mode of hydropower stations will undergo significant changes [4]. In southwest China, many large hydropower stations have been built, and hydropower systems provide energy for regions and power load centers in east and southeast China [5].
It is worth noting that China's provinces have by now largely built an electricity market trading system in which medium- and long-term trading is the main component and short-term trading is the supplement. In the new electricity market environment, the competition for energy becomes more and more fierce. In order to adapt to this environment, the operation mode of hydropower stations must change.
Traditionally, in a centralized operation, both long-term and short-term optimal operation of large hydropower stations aim at energy maximization. Wang et al. proposed a short-term hydrogeneration optimization program, required by Fujian Electric Power Company Ltd. (FEPCL), to maximize hydropower production [6]. Zhang et al. tested a model that maximizes energy subject to firm power on a large-scale hydropower system consisting of eleven hydroplants in the middle-upper Yangtze River basin in China [7]. Wu et al. developed a long-term optimal operation model for absorbed energy maximization for the hydropower system of China's southern power grid (CSG) [8]. Ahmad and Hossain explored the maximization of hydropower generation by optimizing reservoir operations based on short-term inflow forecasts derived from publicly available numerical weather prediction (NWP) models [9]. Liu et al. introduced a short-term optimal operation model for the maximization of large and small combined hydropower consumption [10]. However, power generation does not represent power generation benefit. When scheduling with the energy maximization model, maximum benefit cannot be obtained [11,12]. Therefore, it is necessary to take electricity prices into consideration for an in-depth study.
Nash equilibrium, also known as noncooperative game equilibrium, is an important concept in game theory. In a game, a strategy that is optimal for one party no matter how the other parties choose their strategies is called a dominant strategy. If every participant's strategy is optimal given the strategies of all other participants, the strategy combination is defined as a Nash equilibrium [13][14][15]. Nash equilibrium theory has laid the foundation of modern mainstream game theory and economic theory and has been applied more and more in water conservancy engineering. When hydropower stations individually act on behalf of their respective interests, the optimized deployment of hydropower resources is not achieved. To deal with this problem, Han et al. proposed an agency mechanism based on Nash equilibrium which can ensure the efficient deployment of hydropower resources in the electricity market [16]. Fallah-Mehdipour et al. developed a simulation model coupled with the weighting method and Nash equilibrium in multiobjective models to overcome the impacts of existing conflicts among stakeholders with different utilities [17]. Considering both hydropower and thermal units as price-makers, Loschenbrand et al. computed the market equilibrium under uncertainty via time stage decomposition and nesting of a continuous Nash game into the original discontinuous Nash game, which can be solved via a search algorithm [18]. Moiseeva et al. modeled the strategic interaction of multiple producers in hydrodominated power systems under uncertainty as an equilibrium problem with equilibrium constraints (EPEC), reformulated it as a stochastic mixed-integer linear program with disjunctive constraints, and solved it to find a Nash equilibrium [19].
In past studies, the objective of optimal operation of hydropower stations has been to maximize power generation, without considering the electricity price. Nash equilibrium theory has not been studied sufficiently in water conservancy projects and has not been combined with power generation benefits. The novelty of the model proposed in this paper is the combination of reservoir simulation-optimization with a power market game model: the Nash equilibrium obtained is for the rule sets of the cascades and is evaluated by the profits over the whole simulation horizon, not for each single decision as in some other studies.
In this paper, a game model of cascade hydropower stations is established under a virtual market competition environment, with the Longtan, Xiaowan, and Goupitan hydropower stations taken as examples. First, the energy maximization model is established for the three hydropower stations, and its solutions are used as the initial operation process. Then, the long-term game model of the three hydropower stations is established, with maximum power generation benefit as its objective function. The model is solved iteratively with a dynamic programming algorithm. In the solution process, the final electricity price is formed by all hydropower stations. In the game model, when a cascade hydropower station analyzes the strategies of the other cascade hydropower stations, it changes its dispatching rules and makes the optimal decision to maximize its own benefit. However, this affects the income of the other cascade hydropower stations, which in turn change their dispatching rules, and market equilibrium is reached through repeated games, so that the hydraulic and electric power resources of the whole market can be optimized. Comparing the game model to the energy maximization model, the benefit increases by 4.1%-8.2% with a 0.2%-3.8% energy loss in this hypothetical electricity market. The paper is organized as follows: first, the energy maximization model and the game model of the three hydropower stations are established, and then the dynamic programming algorithm is used to solve the models. Finally, the two models are compared and analyzed from the perspectives of power generation, power generation efficiency, and the dispatching mode in the dry season and flood season.

Construction of a Hydropower Station Energy Maximization Model
The objective of the energy maximization model is

$$\max PG_m = \sum_{t=1}^{T} E_m^t \quad (1)$$

where $PG_m$ is the total generation of hydropower station m over all T periods and $E_m^t$ is the generation of hydropower station m in period t, which is computed from the following quantities: $e_m$ is the output coefficient of hydropower station m; $q_m^t$ is the turbine flow of hydropower station m during period t; $z_m^t$ and $z_m^{t+1}$ are the water levels of hydropower station m at the start and the end of period t, respectively; $zd_m^t$ is the tailwater level of hydropower station m in period t.
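A common way to obtain the period generation from these quantities is to multiply the output coefficient, the turbine flow, the net head, and the period length. The form below is a sketch under that assumption and is not necessarily the exact formula of the model: averaging the start and end water levels to obtain the head, and using $h^t$ (the hours in period t) as the period length, are both assumptions. The numbers in the example are hypothetical and follow the convention that the output is in kW when the flow is in m³/s and the head is in m.

```latex
\begin{align*}
E_m^t &= e_m \, q_m^t \left( \frac{z_m^t + z_m^{t+1}}{2} - zd_m^t \right) h^t \\
\text{example output:}\quad 8.5 \times 1000 \times 150 &= 1.275 \times 10^{6}\ \text{kW} = 1275\ \text{MW} \\
\text{example energy:}\quad 1275\ \text{MW} \times 730\ \text{h} &= 930\,750\ \text{MWh}
\end{align*}
```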

Constraints
(1) Water balance constraint: the storage at the end of period t equals the storage at the start of the period plus the inflow volume minus the outflow volume during the period. Here $s_m^t$ and $s_m^{t+1}$ are the water storage capacities of hydropower station m at the start and the end of period t, respectively; $Q_m^t$ and $r_m^t$ are the inflow and outflow of hydropower station m in period t, where the outflow comprises turbine flow and penstock releases.
(2) Storage bounds: $\underline{s}_m^t \le s_m^t \le \overline{s}_m^t$, where $\underline{s}_m^t$ is the lower limit of storage capacity, usually the dead storage capacity of the hydropower station or the minimum storage capacity required for comprehensive utilization; $\overline{s}_m^t$ is the upper limit of storage capacity, usually the utilizable capacity or, during the flood season, the storage capacity corresponding to the flood control limited level.
(3) Power constraint: $p_m^t \le \overline{p}_m^t$, where $p_m^t$ and $\overline{p}_m^t$ are the output and its upper limit, respectively; the output upper limit is typically the installed capacity.
(4) Minimum release constraint: $r_m^t \ge \underline{r}_m^t$, where $r_m^t$ and $\underline{r}_m^t$ are the release and the minimum release, respectively.
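The four constraints can be checked in a simulation step roughly as follows. This is a minimal sketch, not the authors' implementation: the names `Limits`, `next_storage`, and `violations` are hypothetical, the units (m³, m³/s, MW, hours) are assumptions, and losses such as evaporation are ignored.

```python
from dataclasses import dataclass

SECONDS_PER_HOUR = 3600.0


@dataclass
class Limits:
    s_min: float  # lower storage bound, m^3 (assumed unit)
    s_max: float  # upper storage bound, m^3
    p_max: float  # output upper limit, MW
    r_min: float  # minimum release, m^3/s


def next_storage(s_t, inflow, release, hours):
    """Water balance: end storage = start storage + (inflow - release) * duration."""
    return s_t + (inflow - release) * hours * SECONDS_PER_HOUR


def violations(s_next, output, release, lim):
    """Return the names of the constraints (2)-(4) that are violated."""
    found = []
    if not (lim.s_min <= s_next <= lim.s_max):
        found.append("storage")
    if output > lim.p_max:
        found.append("power")
    if release < lim.r_min:
        found.append("min_release")
    return found


# Hypothetical numbers, for illustration only.
lim = Limits(s_min=5.0e9, s_max=1.6e10, p_max=4900.0, r_min=300.0)
s1 = next_storage(1.0e10, inflow=1200.0, release=1000.0, hours=730.0)
print(s1, violations(s1, output=5000.0, release=250.0, lim=lim))
```

In a dynamic programming search, a candidate decision that produces a non-empty violation list would either be discarded or penalized, depending on how the constraint is handled.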

Objective Function.
The generation of all hydropower stations participating in the market determines the electricity price (electrovalence) received by each hydropower station in the electricity market environment. The electricity price is inversely proportional to the total power generated by all hydropower stations. All hydropower stations are assumed to participate in the electricity market; they belong to different generating companies, and all generating capacity participates in the market. In the electricity market environment, the electricity price $v^t$ of each hydropower station in period t is determined by the total generation $\sum_{m=1}^{M} PG_m^t$ of all hydropower stations:

$$v^t = \varphi^t\left(\sum_{m=1}^{M} PG_m^t\right) \quad (7)$$

where $\varphi^t(\cdot)$ is the demand curve in economics, a function relating electricity generation and price, and M is the number of hydropower stations. Assume there are M hydropower stations in a hydroelectric market that belong to separate generators and do not collude, but whose actions and scheduling procedures are transparent to one another. Under these conditions, each hydropower station's operation process should be developed to optimize its own benefit. The benefit of each hydropower station m is $\sum_{t=1}^{T} PG_m^t h^t v^t$, and the hydropower dynamic programming game model maximizes, for each station, an objective $F_m$ (Formula 8) built from this benefit and a penalty on releases below the minimum release. Here $F_m$ is the objective function, which reflects the benefit of hydropower station m; t is the period serial number, and T is the total number of periods; $PG_m^t$ is the generating capacity of hydropower station m in period t; $h^t$ is the number of hours in period t; $\alpha$ is the penalty factor, set to 10 in this study; $v^t$ is the electricity price in period t; $r_m^t$ and $\underline{r}_m^t$ are the release and minimum release of hydropower station m in period t, respectively. The energy maximization model transfers water energy from the flood season to the dry season, ensuring that electric energy is distributed as evenly as feasible throughout the year and that the output limit is guaranteed in the dry season.
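To make the price coupling concrete, the sketch below evaluates one station's benefit for given outputs of the other stations. It is illustrative only: the linear demand curve and its `intercept` and `slope` parameters are hypothetical stand-ins for $\varphi^t$, and the penalty form (the penalty factor times the release shortfall) is an assumption about how the penalty enters the objective. Only the penalty factor value of 10 comes from the text.

```python
ALPHA = 10.0  # penalty factor; 10 is the value stated in the text


def price(total_output, intercept, slope):
    """Hypothetical linear demand curve: the price falls as total output rises."""
    return max(0.0, intercept - slope * total_output)


def benefit(own_output, others_output, hours, release, min_release, intercept, slope):
    """Revenue of one station over all periods, minus an assumed release penalty."""
    total = 0.0
    for t in range(len(own_output)):
        v_t = price(own_output[t] + others_output[t], intercept, slope)
        revenue = own_output[t] * hours[t] * v_t
        shortfall = max(0.0, min_release[t] - release[t])
        total += revenue - ALPHA * shortfall
    return total


# Hypothetical two-period example.
f_m = benefit(
    own_output=[100.0, 200.0],
    others_output=[300.0, 300.0],
    hours=[10.0, 10.0],
    release=[50.0, 40.0],
    min_release=[45.0, 45.0],
    intercept=1.0,
    slope=0.001,
)
print(f_m)
```

Because the price depends on the sum of all outputs, raising `others_output` lowers `f_m` even when the station's own schedule is unchanged, which is the interaction the game model captures.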
When the hydropower station participates in the electricity market, market competition resolves the uneven distribution of electrical energy, and the station no longer needs to guarantee output. As a result, a noncooperative hydropower station game model is created, in which the participants are the hydropower stations, the strategy is the dispatching process, and the benefit function of each hydropower station m is $F_m$ in Formula 8. When a hydropower station expects the other hydropower stations to employ particular dispatching processes, it can modify its own decision to obtain its best dispatching process. Because the electricity price is decided by the total output of all hydropower stations, this can reduce the benefit of the other hydropower stations. The operations of the other hydropower stations are then reoptimized in the same way.
This gaming process is repeated until no hydropower station modifies its scheduling process. At this point, Nash equilibrium is reached, and the Nash equilibrium solution is the set of hydropower station scheduling processes.
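The stopping condition described above is the standard Nash equilibrium condition. It can be written out as follows, where $x_m$ denotes the dispatching process of station m, $X_m$ its feasible set under the constraints, and $x_{-m}$ the dispatching processes of all other stations. This notation is introduced here for illustration and is not taken from the model's own formulas.

```latex
F_m\left(x_m^{*}, x_{-m}^{*}\right) \ge F_m\left(x_m, x_{-m}^{*}\right)
\quad \text{for all } x_m \in X_m,\ m = 1, \ldots, M
```

In words, at the equilibrium $(x_1^{*}, \ldots, x_M^{*})$ no station can raise its own objective by changing only its own dispatching process.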

Constraints.
The constraints of the hydropower station game model are the same as those of the energy maximization model, as shown in Section 2.2. When the hydropower station competes in the electric power market environment, market competition overcomes the problem of unequal distribution of electric power over the year. Thus, it no longer needs to guarantee output.
Mathematical Problems in Engineering

Dynamic Programming Game Model Solution

Dynamic Programming Algorithm and Solution.
Dynamic programming, a subfield of operational research, is a mathematical tool for optimizing decision-making. In the 1950s, the American mathematician Bellman [20] and others proposed the well-known principle of optimality while studying multistage decision optimization problems. The principle converts a multistage process into a series of single-stage problems linked by the relationships between stages, which gave a new method for solving this type of process optimization problem. Applied to reservoir operation, the principle states that, whatever water levels and generation decisions led to the current state, the remaining generation decisions must constitute an optimal strategy with respect to that state. Dynamic programming algorithms divide the large reservoir operation problem into subproblems of the same type, one for each operation period, and solve them one by one. The operation and dispatching of a hydropower station with long-term regulation performance is a typical multistage decision process that can be solved by dynamic programming. The specific operation steps and the dynamic programming calculation flow chart are as follows:
(1) Identify the stage and the stage variables. The regulating cycle can be divided into T stages for hydropower stations with long-term regulating performance, with t denoting the stage variable and $t = 1, 2, 3, \ldots, T$. The facing period is the interval from t to t+1, while the remaining period is the interval from t+1 to T+1.
(2) Identify the state variables. To meet the condition of no aftereffect, the water level Z of the hydropower station at each stage is chosen as the state variable. The water levels of the hydropower station at the start and end of period t are denoted by $Z^t$ and $Z^{t+1}$, and the corresponding water storage capacities are denoted by $s^t$ and $s^{t+1}$, respectively.
(3) Identify the decision variables. When the water level of a hydropower station is given at a specific time t, the choice that moves the water level $Z^t$ at time t to the water level $Z^{t+1}$ at time t+1 is referred to as the decision. The output $p^t$ is selected as the decision variable, and its value determines the hydropower station's water level in the next period.
(4) Determine the state transition equation. The water balance equation gives the storage capacity $s^{t+1}$ at time t+1 from the storage capacity $s^t$ at time t, and the reservoir level-storage curve relates $Z^t$ and $Z^{t+1}$ to these storages. Together they form the state transition equation.
(5) Identify the objective function. The objective function is discussed in Section 2.1.
(6) Develop the recursive equation for optimal hydropower station operation. In general, when the recursion follows time order, the recursive equation is the maximum of the sum of the benefit of the facing period t and the benefit of the remaining period (t+1 to T+1), that is, the maximal benefit of the period from t to T+1:

$$F_m^t(Z^t) = \max_{p^t} \left\{ F_m^t(p^t, Z^t) + F_m^{t+1}(Z^{t+1}) \right\}$$

where $F_m^t(Z^t)$ is the benefit of station m at period t for water level $Z^t$; $Z^t$ is the water level of station m at the beginning of period t; $F_m^t(p^t, Z^t)$ is the benefit function for station m at period t, given state $Z^t$ and output $p^t$; $F_m^{t+1}(Z^{t+1})$ is the benefit of station m at period t+1 for water level $Z^{t+1}$; $Z^{t+1}$ is the water level of station m at the beginning of period t+1.

Game Model Solution.
Figure 2 depicts the solution procedure of the hydropower station game model. The specific solution process for one game round is as follows:
Step 1: using historical runoff data, apply the energy maximization model to generate the scheduling processes of the M hydropower stations, which serve as the game model's initial scheduling processes. Set m = 1.
Step 2: with the dispatching processes of the other hydropower stations fixed, optimize the dispatching process of the mth hydropower station, and set t = 1.
Step 3: in period t, calculate the generating capacity of the mth hydropower station using the dynamic programming algorithm, and calculate the generating capacity of the other hydropower stations from their fixed scheduling processes.
Step 4: using the demand curve (7), compute the electricity price in period t from the electricity generation of the M hydropower stations.
Step 5: compute the benefit of the mth hydropower station in period t using Formula (8).
Step 6: set t = t+1, return to step 3, and compute for all periods. When t > T, go to the next step.
Step 7: take as the dispatching process of the mth hydropower station the output process that produces the greatest simulated benefit.
Step 8: set m = m+1, return to step 2, and use the dynamic programming algorithm to optimize each hydropower station in turn. A round is completed when m > M.
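The recursion and Steps 1-8 can be sketched together on a toy problem. This is a simplified illustration, not the authors' Java implementation. It makes several assumptions: storage and releases are small integers, one unit of released water yields one unit of output (no head effect), the demand curve is linear with hypothetical parameters `a` and `b`, the initial schedules simply release each period's inflow instead of coming from the energy maximization model, and a station switches schedules only when its benefit strictly improves.

```python
def best_response(inflow, others, s0, s_max, a, b):
    """Backward DP over integer storage states for one station.

    inflow[t] is the water arriving in period t and others[t] is the total
    output of the other stations. Returns (benefit, releases).
    """
    T = len(inflow)
    value = [0.0] * (s_max + 1)  # best benefit from the next period onward
    policy = []
    for t in reversed(range(T)):
        new_value = [0.0] * (s_max + 1)
        choice = [0] * (s_max + 1)
        for s in range(s_max + 1):
            available = s + inflow[t]
            best, best_r = None, 0
            # The release r must leave the end storage within [0, s_max].
            for r in range(max(0, available - s_max), available + 1):
                v_t = max(0.0, a - b * (r + others[t]))
                total = r * v_t + value[available - r]
                if best is None or total > best:
                    best, best_r = total, r
            new_value[s], choice[s] = best, best_r
        value = new_value
        policy.append(choice)
    policy.reverse()
    # Forward pass: recover the release schedule from the initial storage.
    releases, s = [], s0
    for t in range(T):
        r = policy[t][s]
        releases.append(r)
        s = s + inflow[t] - r
    return value[s0], releases


def benefit(own, others, a, b):
    """Revenue of one schedule given the other stations' total outputs."""
    return sum(q * max(0.0, a - b * (q + o)) for q, o in zip(own, others))


def play_game(inflows, s0, s_max, a, b, max_rounds=50):
    """Round-robin best responses until no station changes its schedule."""
    M, T = len(inflows), len(inflows[0])
    schedules = [list(row) for row in inflows]  # assumed initial schedules
    for round_no in range(1, max_rounds + 1):
        changed = False
        for m in range(M):
            others = [sum(schedules[j][t] for j in range(M) if j != m)
                      for t in range(T)]
            current = benefit(schedules[m], others, a, b)
            best, releases = best_response(inflows[m], others, s0[m], s_max[m], a, b)
            if best > current + 1e-9:  # switch only on strict improvement
                schedules[m], changed = releases, True
        if not changed:
            return schedules, round_no
    return schedules, max_rounds


schedules, rounds = play_game(
    inflows=[[4, 0], [1, 3]], s0=[0, 0], s_max=[4, 4], a=10.0, b=0.5
)
print(schedules, rounds)
```

When the loop exits because no station changed its schedule during a full round, every station's schedule is a best response to the others, so the returned schedules form a Nash equilibrium of the toy game. If `max_rounds` is reached first, no such guarantee holds.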

Case Study
The studied hydropower system includes the Longtan, Xiaowan, and Goupitan hydropower stations. The characteristics of the hydropower stations are shown in Table 1. Longtan hydropower station and Goupitan hydropower station are the largest hydropower stations by installed capacity on the Hongshui River and the Wujiang River, respectively. Xiaowan hydropower station is the second-largest hydropower station by installed capacity on the Lancang River [21].
These three hydropower stations have significant regulating ability and are either annual regulating or multiyear regulating hydropower stations. Longtan, Xiaowan, and Goupitan hydropower stations all play important roles in their respective basins; therefore, choosing these three hydropower stations as research objects is appropriate. The runoff series of the three hydropower stations from 1953 to 2008 are used in this paper's calculation. This paper assumes that the three hydropower stations are part of the same hydropower market and that the monthly demand curve is defined by a formula in which m is the current hydropower station, $PG_m^t$ is the electricity generated by hydropower station m during period t, and $P_m$ is the installed capacity of hydropower station m. The electricity price is expressed in a generic monetary unit. First, the energy maximization models of the Longtan, Xiaowan, and Goupitan hydropower stations are established, with the penalty function applied when the power constraint and the minimum output of a hydropower station are not satisfied, and the initial generating capacity, generating benefit, and dispatching process of the three hydropower stations are obtained through dynamic programming. The Longtan, Xiaowan, and Goupitan hydropower stations have minimum outputs of 1680 MW, 1854 MW, and 746.4 MW, respectively. Then, based on the initial solution, a game model for these three hydropower stations is developed and solved using a dynamic programming algorithm. The calculation, implemented in Java, takes less than 40 minutes on a PC with four 2.90 GHz cores and 32 GB of memory. The dynamic programming game process of the three hydropower stations is depicted in Figure 3.
Each game is broken down into three stages. First, fix the scheduling rules of the Xiaowan and Goupitan hydropower stations, and use the dynamic programming game model to optimize the Longtan hydropower station, obtaining its optimal scheduling rules and benefit. Second, fix the scheduling rules of the Longtan and Goupitan hydropower stations, and use the dynamic programming game model to optimize the Xiaowan hydropower station, obtaining its optimal scheduling rules and benefit. Finally, fix the scheduling rules of the Longtan and Xiaowan hydropower stations, and use the dynamic programming game model to optimize the Goupitan hydropower station, obtaining its optimal scheduling rules and benefit and finishing a game round. Then, continue in this manner until the benefits of the three hydropower stations stabilize and Nash equilibrium is established.
In Figure 3, the game number "0" refers to the initial scheduling rules of the three hydropower stations, that is, the benefits obtained by optimizing the three hydropower stations with the energy maximization model. As shown in Figure 3, the benefits of all three hydropower stations change markedly from the initial solution to the results of the first game, demonstrating that the three hydropower stations can significantly increase their benefits by adopting the game model. After the first game, the benefits change only slightly from round to round until the third game, when the objective functions of the three hydropower stations become stable and none of the hydropower stations intends to change its scheduling rules, at which point Nash equilibrium is reached. Table 2 shows the simulation process and optimization results of the three hydropower stations. In the first game, the benefits of the Longtan, Xiaowan, and Goupitan hydropower stations increased by 6.6%, 4.1%, and 8.2%, respectively, compared to the energy maximization model, while electricity generation decreased by 3.8%, 0.4%, and 0.2%. As the game progresses, the changes in the benefit and power generation of each hydropower station diminish and gradually stabilize. When the game reaches Nash equilibrium, the benefits of the Longtan, Xiaowan, and Goupitan hydropower stations are 6.9%, 4.0%, and 7.8% higher than under the energy maximization model, respectively, while electricity generation is 3.4%, 0.5%, and 0.4% lower. Implementing a game model for hydropower stations therefore improves benefits considerably in this electricity market scenario. The flood seasons of the three hydropower stations begin and end at different times. In this paper, the flood season is defined as May to September, with the remaining months designated as the dry season.
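One way to read the reported percentages is through the average price each station realizes. If the benefit is treated as price times energy (an assumption that ignores any penalty terms), the relative change in the average realized price follows directly from the benefit and energy changes. The helper name `implied_price_change` is hypothetical, and the inputs are the equilibrium percentages reported above.

```python
def implied_price_change(benefit_change, energy_change):
    """Relative change in the average realized price, taken as benefit / energy."""
    return (1.0 + benefit_change) / (1.0 + energy_change) - 1.0


# (benefit change, energy change) at Nash equilibrium, relative to the
# energy maximization model, as reported in the text.
equilibrium = {
    "Longtan": (0.069, -0.034),
    "Xiaowan": (0.040, -0.005),
    "Goupitan": (0.078, -0.004),
}
for name, (d_benefit, d_energy) in equilibrium.items():
    print(f"{name}: {implied_price_change(d_benefit, d_energy):+.1%}")
```

Under this reading, the average realized price rises by roughly 10.7% for Longtan, 4.5% for Xiaowan, and 8.2% for Goupitan, which is consistent with generation being moved toward higher-priced periods.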
Table 3 shows the generating capacity of the three hydropower stations during the dry and flood seasons. Because the electricity price is comparatively high in the dry season, the benefit growth of the game model is obtained by transferring water energy from the flood season to the dry season.
This strategy reduces the hydropower station's power generation during the flood season while increasing it during the dry season. In comparison to the energy maximization model, the power generation of the Longtan, Xiaowan, and Goupitan hydropower stations decreases by 17.4%, 1.9%, and 5.1% in the flood season, while the corresponding power generation benefits increase by 1.0%, 13.8%, and 15.3%. In the dry season, electricity generation increases by 14.4%, 0.9%, and 6.3%, while the power generation benefits increase by 11.1%, decrease by 2.5%, and increase by 2.0%, respectively. Longtan hydropower station has a large installed capacity, so it faces a high risk when generating power during the flood season. Its generation decreases the most during the flood season, while its generating benefit increases the most during the dry season, and it is the most inclined to transfer energy from the flood season to the dry season to increase its benefit. The calculated reservoir level processes of the Longtan, Xiaowan, and Goupitan hydropower stations from 1953 to 2008 are shown in Figures 4-6. The water level processes exhibit periodic changes. Therefore, the calculation results from January 1953 to December 1954 are chosen for a comparative examination of the water level and output of the Longtan, Xiaowan, and Goupitan hydropower stations under the energy maximization model and the game model, as illustrated in Figures 7-9. The water level of the game model at Longtan hydropower station from July 1953 to May 1954 is clearly higher than that of the energy maximization model. The reason is that the game model transfers water energy from the flood season to the dry season in order to increase the benefit, resulting in a large output in the dry season.
For the Xiaowan hydropower station, the water level of the game model is slightly lower than that of the energy maximization model from November 1953 to August 1954, and the output of the energy maximization model from June to October 1954 is clearly higher than that of the game model. The output of the game model for the Goupitan hydropower station from June to September 1954 is clearly less than that of the energy maximization model. According to the detailed comparison in Figures 7-9, the game model plainly affects the operating process of hydropower stations, and hydropower stations can obtain more benefit by reducing output in the flood season and increasing output in the dry season. Figure 10 shows a monthly average output comparison between the hydropower station game model and the energy maximization model. Compared to the energy maximization model, the output of the Longtan and Goupitan hydropower stations in the game model decreases considerably from June to October and is transferred to other months. The output of the Xiaowan hydropower station decreases considerably from June to September, whereas the average outputs of the two models do not differ significantly in the other months. The average output of the three hydropower stations grows markedly from January to May and in December but falls in the other months. This demonstrates that in the power market context, even in the absence of stringent power limits, hydropower stations tend to shift output from the flood season to the dry season in order to maximize the benefit.

Conclusion
For long-term competition analysis, this paper proposes a dynamic programming game model of hydropower stations, with the Longtan, Xiaowan, and Goupitan hydropower stations as the case study. First, the historical runoff and characteristic curves of the hydropower stations are used to establish the energy maximization model for the three hydropower stations, which is solved with the dynamic programming algorithm. The results are used as the initial scheduling curve. Then, the noncooperative game model of the three hydropower stations is proposed, with maximum power generation benefit as the objective function. The game model and the energy maximization model are compared in terms of power generation and benefit in the flood and dry seasons. The simulation results show that under the electricity market environment, compared to the energy maximization model, the total power generation falls, but the benefit improves significantly, and power output increases significantly during the dry season. Hydropower stations in the electricity market tend to shift their output from the flood season to the dry season in order to maximize the benefit.

Data Availability
Some data, models, or code generated or used during the study are available from the corresponding author by request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.