A Parallelized Variable Fixing Process for Solving Multistage Stochastic Programs with Progressive Hedging

Long time horizons, typical of forest management, make planning more difficult due to added exposure to climate uncertainty. Current methods for stochastic programming limit the incorporation of climate uncertainty in forest management planning. To account for climate uncertainty in forest harvest scheduling, we discretize the potential distribution of forest growth under different climate scenarios and solve the resulting stochastic mixed integer program. Increasing the number of scenarios allows for a better approximation of the entire probability space of future forest growth but at a computational expense. To address this shortcoming, we propose a new heuristic algorithm designed to work well with multistage stochastic harvest-scheduling problems. Starting from the root-node of the scenario tree that represents the discretized probability space, our progressive hedging algorithm sequentially fixes the values of decision variables associated with scenarios that share the same path up to a given node. Once all variables from a node are fixed, the problem can be decomposed into subproblems that can be solved independently. We tested the algorithm's performance on six forests considering different numbers of scenarios. The results showed that our algorithm performed well when the number of scenarios was large.


Introduction
Uncertainty is common in all disciplines that involve decision making. In forestry, planners need to prescribe, several decades in advance, actions that should be taken in a forest in order to achieve a given management objective.
Those actions include the segments of roads that should be built in a given period to allow hauling, the forest units or stands that should be treated to reduce the risk of fire, the stands that should be logged in each time period to produce timber and secure employment for the local community, etc. The most common objective is the maximization of the net present value subject to environmental, budgetary, and logistic restrictions [39]. The prescriptive models that assign forest units to different actions in different time periods are known as harvest-scheduling models.
To build harvest-scheduling models, forest planners have traditionally used expected growth and yield coefficients to predict future merchantable timber volumes. However, uncertainties in long-term temperature and precipitation coupled with increasing wildfire, windstorms, or landslides due to climate change may affect forest development. The traditional approach fails to account for uncertainty in forest growth and leaves forest planners unprepared should any of these uncertainties materialize. Uncertainty in forest planning can be addressed by a particular class of mathematical programming known as stochastic programming. One can distinguish between two-stage stochastic programming and multistage stochastic programming. In two-stage stochastic programming, a decision is made, then the uncertainty is revealed, and a recourse action that depends on the revealed uncertainty is taken. In multistage stochastic programming, however, this sequence of decision and revelation of uncertainty occurs more than once, giving the decision maker more flexibility to take recourse actions as uncertainty unfolds progressively.
Multistage stochastic programs are mainly composed of scenarios and stages. Scenarios are the set of possible future outcomes of the uncertain parameter, and stages represent the points at which decisions can be made and/or recourse actions can be taken. We call "period" the time that separates two consecutive stages. Uncertainty unfolds progressively in each period. Since at the beginning of the planning we cannot anticipate which scenario will unfold, we require that the decision made at the first stage be the same for all scenarios. This requirement is known as the nonanticipativity constraints. Nonanticipativity constraints (NACs) impose that if two scenarios cannot be distinguished up to a given stage, then the decisions made in the two scenarios up to that stage must be the same. Multistage stochastic programming has been commonly used to model uncertainty in forest harvest scheduling because of the long planning horizon that characterizes forest harvest scheduling and the presence of interdependent relationships between periods, such as the requirement that the maximum contiguous area harvested from one period to another not exceed a given limit. For instance, [41] solved a multistage stochastic harvest-scheduling model with uncertainty in wood price, wood demand, and productivity, while Alonso-Ayuso et al. [1] focused on uncertainty in price and on risk aversion. Finally, Álvarez-Miranda et al. [2] assessed how key ecosystem services change in forest management under growth uncertainty stemming from climate change.
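As a toy illustration of the nonanticipativity requirement, the sketch below checks whether a pair of scenario-specific plans "anticipates" the future. The scenario names, growth outcomes, and decisions are invented for illustration and are not from our experiments.

```python
def violates_nonanticipativity(history, decisions, t):
    """Return True if two scenarios share the same uncertainty history up to
    period t but prescribe different decisions up to period t."""
    for w1 in history:
        for w2 in history:
            if w1 == w2:
                continue
            if history[w1][:t] == history[w2][:t] and decisions[w1][:t] != decisions[w2][:t]:
                return True
    return False

# Two scenarios that branch only after period 1: identical growth in period 1.
history = {"wet": ["avg", "wet"], "dry": ["avg", "dry"]}

# First-period decisions differ -> the plan anticipates the future (invalid).
bad = {"wet": ["harvest", "harvest"], "dry": ["wait", "harvest"]}
# First-period decisions agree -> the plan is nonanticipative.
good = {"wet": ["harvest", "harvest"], "dry": ["harvest", "wait"]}
```

In period 2 the two scenarios are distinguishable, so their decisions are free to differ; only the shared first period is constrained.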
Although climate change might be one of the biggest challenges of forest management, it has received little attention in forest harvest planning, in part because stochastic programs, especially multistage stochastic programs, are considered one of the most challenging classes of optimization problems to solve [5,23,40,45]. For instance, the number of scenarios in [2] was limited to 32, although climate scientists forecast at least four climate pathways [31], which may translate into hundreds of possible forest growth trajectories.
Real-life applications of multistage stochastic problems lead to large models that are hard to solve directly [37]. The size of the models is directly related to the number of scenarios used to represent the uncertainty. Although uncertain parameters may be continuous, to make the problem suitable for stochastic programming, the discrete realization of the uncertainty is cast in a structure known as a scenario tree. In the scenario tree, each node represents a decision state and the branches of the tree represent realizations of the uncertainty. As more scenarios are included in the tree, the model better approximates the entire probability space of the uncertain parameter [26]. However, increasing the number of scenarios makes the resulting stochastic programming models hard to solve. It is therefore necessary to develop decomposition algorithms. The two most common decomposition methods are Benders decomposition [7], also known as stagewise or vertical decomposition, and progressive hedging [36], also called scenario-wise or horizontal decomposition. The study [38] provided an overview of decomposition algorithms for stochastic programming. An additional summary of solution methods for stochastic programming is available in [16].
Benders decomposition (BD) is a delayed constraint generation approach for solving mixed integer programs. It has mainly been applied to two-stage stochastic programming problems under some assumptions on the nature of the first-stage and second-stage variables. The authors in [27] give a summary of the conditions for applying BD to two-stage stochastic programs. According to [15], BD is well suited when only a small set of constraints prevents the decomposition of the problem into blocks. This is not the case in forest harvest scheduling, which is characterized by long planning horizons and by constraints, such as even flow of wood and ending inventory, that link variables from one decision stage to another of multistage stochastic programs. For instance, Egging [15] had limited success when extending their BD algorithm to multistage stochastic programs. We acknowledge, however, that there are multistage applications where BD has been used [17,43,44].
Progressive hedging (PH), on the other hand, was developed by [36] for convex two-stage stochastic programs and is proven to produce a globally optimal solution for continuous problems. For nonconvex problems such as stochastic mixed integer programs (SMIPs), the algorithm is not proven to converge. It relaxes the nonanticipativity constraints and iteratively solves the stochastic program by solving its scenarios independently while penalizing violations of the nonanticipativity constraints. PH is appealing because it is based on scenario-wise decomposition: at each iteration, each scenario subproblem has the same form as the deterministic problem that ignores uncertainty. Consequently, the algorithm can handle a large number of scenarios.
Despite its benefits, PH performs poorly for nonconvex multistage stochastic programs. Several researchers have proposed PH heuristics for solving SMIPs in different applications. The most promising PH-based heuristic explored is fixing the variables that participate in defining nonanticipativity constraints as they meet consensus (see Section 2.2 for the definition of nonanticipativity variables). For instance, Veliz et al. [41] fixed nonanticipativity variables during PH iterations as they met consensus. After a given percentage of variables, for instance 80%, agree on the value they should take, the reduced problem, which is the SMIP with some NAC variables fixed (named hereafter the reduced extensive form, REF), is solved directly. The inconvenience of this approach is that the reduced extensive form may be infeasible because too many variables have been fixed (for a proof, see the Appendix). The infeasibility occurs when there is uncertainty in the yield of the forest, as in the case of climate change; in that event, the algorithm has wasted considerable time iterating. On the other hand, solving the reduced extensive form can still be difficult if the number of fixed variables is too low. The third limitation stems from the fact that the reduced form problem is not separable, and thus solving it cannot be parallelized. Similarly, for a two-stage problem in resource management, the authors in [42] proposed a scheme for fixing variables. Their algorithm's applicability was limited to a special class of resource management problems in which constraints are one-sided. They proposed a "slamming" technique which forces nonanticipativity variables to converge even though they have not yet met consensus, and they showed that this technique accelerated PH convergence. Recently, the authors in [30] proposed fixing variables as well. In their case, they had both binary and continuous decision variables and therefore proposed to fix the binary NAC variables; the resulting linear stochastic program, being convex, can easily be solved using progressive hedging.
Aside from fixing variables, researchers have investigated other strategies for improving the application of PH to nonconvex problems. Some of these strategies are only applicable to two-stage stochastic programs. For instance, Atakan and Sen [4] proposed a branch-and-bound algorithm for stochastic mixed integer programs and tested it on a stochastic server location problem. Similarly, Barnett et al. [6] proposed a combination of branch-and-bound and PH, using the former as a wrapper. In the same context, Gade et al. [19] proposed an algorithm for computing a lower bound of PH for two-stage SMIPs. Other researchers invested in reducing the duality gap that arises from the relaxation of the nonanticipativity constraints [9,19]. In the same spirit, Boland et al. [10] looked into the minimum number of nonanticipativity constraints that need to be enforced to ensure the quality of the SMIP solution. The authors in [8,33] looked at reducing the number of scenarios by clustering them into bundles that can be solved independently. All these algorithms remain dependent on assumptions about the nature of the first-stage variables, the second-stage variables, or both. In addition, their applicability is limited to two-stage SMIPs and is specific to certain disciplines. The main challenge that remains is that PH does not converge for nonconvex problems, which are common in forest planning. In forest planning, some problems possess only binary decision variables (there may be continuous accounting variables, but these do not participate in decision making). Some of those problems are found in spatially explicit forest management planning, where the decision variables are the forest units that should receive a specific treatment. This class of problems with pure binary decision variables is the focus of this research.
In this work, we aim to overcome some of the limitations posed by using PH for multistage stochastic harvest-scheduling problems with uncertainty in the yield. The objective is a heuristic that efficiently solves multistage stochastic harvest scheduling with little loss of optimality. (i) We propose decomposing the SMIP into scenarios and solving them by a special heuristic form of progressive hedging that is completely parallelizable. We capitalize on the idea of fixing variables as they meet consensus and extend it so that the reduced extensive form problem is parallelizable. Our form of the PH heuristic exploits the structure of the scenario tree by fixing variables starting from the root-node. (ii) We assess the impact of fixing variables on the value of the optimal solution. (iii) We also investigate acceleration strategies, extending the slamming technique proposed for two-stage stochastic programs by [42] to multistage stochastic programming. (iv) We use climate change to illustrate one source of uncertainty in the yield, although there can be other sources, such as errors in the measurement of forest growth or uncertainty of future yield due to prediction models; the algorithm is also suitable for price uncertainty. Our aim is not to model perfectly the uncertainty in forest growth due to climate change but to provide an algorithm that can be used for real-life applications in stochastic forest harvest scheduling. The remainder of this paper is organized as follows. In Section 2, we provide background on stochastic programming problem formulation and formally introduce the progressive hedging algorithm. In Section 3, the special form of PH for solving forest harvest planning problems is presented. In Section 4, we provide the computational experiments and the interpretation of the results. Finally, Section 5 concludes the paper and presents the limitations of the algorithm and future work.

Background
To illustrate the variable-fixation variant of the progressive hedging algorithm, we first represent the scenario tree, which is the abstraction of the realization of uncertainty in growth and yield, and then formally introduce the progressive hedging (PH) algorithm.

Scenario Representation.
The stochastic program can be visualized as a tree, the so-called "scenario tree." It can be represented as follows. Let T denote the set of periods in the planning horizon, with T = |T| being the number of periods. In the tree, a node represents the realization of the uncertain parameter and the decision variables at a given time period. Let n and N describe a node and the lexicographically numbered set of nodes {1, ..., |N|} in the tree, respectively. From each node n, for t ∈ T∖{T}, there is at least one branch leading to another node m with conditional probability p_m, where m ∈ I_n, the set of immediate successors of node n, and Σ_{m∈I_n} p_m = 1. We denote by N_t the subset of nodes belonging to period t, so that N = ∪_{t∈T} N_t and N_t ∩ N_{t+1} = ∅ for t ∈ T∖{T}. Let Ω represent the finite set of representative scenarios in the tree. A scenario ω ∈ Ω is a particular realization of the uncertain parameter, represented as a path from the root-node to a leaf-node. Each scenario ω has an associated probability or weight denoted by w_ω, with Σ_{ω∈Ω} w_ω = 1. Similarly, let N_ω represent the set of nodes forming scenario ω, that is, the set of nodes on a path from the root-node to a leaf-node. A number of scenarios traverse each node. Let Ω(n) denote the set of scenarios that traverse node n. We have Ω(1) = Ω and Ω(n) ∩ Ω(n′) = ∅ for all n ≠ n′ with n, n′ ∈ N_t. Finally, let S_n denote the set of successor nodes of node n, for n ∈ N. N_1 is a singleton, and S_n = ∅ for n ∈ N_T (leaf-nodes). Furthermore, let α_n denote the immediate ancestor of node n for n ∈ N∖{1}, since n = 1 is the root-node of the scenario tree. To introduce nonanticipativity, we define η as the set of nodes having more than one leaf-node as successor (|S_n ∩ N_T| > 1 ⟹ n ∈ η).
For instance, in Figure 1, nodes 2 and 3 have as leaf-node successors the sets {5, 6} and {7}, respectively. Therefore, node 2 ∈ η but node 3 ∉ η. Finally, let X_ω represent the matrix of variables over all t ∈ T for scenario ω ∈ Ω. As a result, the stochastic program can be stated as follows:

maximize Σ_{ω∈Ω} w_ω f(X_ω)  (1)
subject to X_ω ∈ C_ω, ∀ω ∈ Ω  (2)
X_ω and X_ω′ agree up to period t whenever ω and ω′ are indistinguishable up to period t  (3)

Equation (1) maximizes the expectation over all scenarios. Equation (2) requires that every solution be feasible with respect to the constraints of its scenario; C_ω is the feasible set for scenario ω. Finally, equation (3) imposes the nonanticipativity constraints (NACs), which require that the solutions of two scenarios be the same up to a period t if the two scenarios are indistinguishable up to period t. We call this model the extensive form (EF).
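The node sets above can be computed directly from the parent structure of the tree. The short sketch below is a minimal illustration, with node numbering taken from the Figure 1 example; it derives Ω(n) and η for every node.

```python
def build_tree(parent):
    """parent maps each node to its immediate ancestor (None for the root)."""
    children = {n: [] for n in parent}
    for n, a in parent.items():
        if a is not None:
            children[a].append(n)
    leaves = [n for n in parent if not children[n]]

    def path(n):
        nodes = set()
        while n is not None:
            nodes.add(n)
            n = parent[n]
        return nodes

    # Omega(n): scenarios (identified with leaf-nodes) traversing node n
    omega = {n: {l for l in leaves if n in path(l)} for n in parent}
    # eta: nodes with more than one leaf-node descendant
    eta = {n for n in parent if len(omega[n]) > 1}
    return omega, eta

# Figure 1 example: nodes 2 and 3 lead to the leaf sets {5, 6} and {7}.
parent = {1: None, 2: 1, 3: 1, 5: 2, 6: 2, 7: 3}
omega, eta = build_tree(parent)
```

Here node 2 belongs to η because two scenarios pass through it, while node 3, with a single leaf descendant, does not.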
Notice that we could write constraint (3) in a nodal form. Let x_n^ω = (x_{1n}^ω, ..., x_{|S|n}^ω) be the vector of variables at node n ∈ η under scenario ω, where x_{sn}^ω is the state variable of the forest unit or "stand" s. Let Z_n represent a vector of binary variables at node n. The NAC can then be written as

x_n^ω = Z_n, ∀ω ∈ Ω(n), n ∈ η.  (4)

Imposing constraint (4) is equivalent to enforcing the NAC. Progressive hedging exploits this formulation.

Progressive Hedging.
Progressive hedging (PH) is a scenario-based decomposition algorithm proposed by [36] for stochastic programming models. The idea of progressive hedging is to relax the nonanticipativity constraints (NACs, equation (4)) in an augmented Lagrangian manner [18,22,35,36] so that each scenario (subproblem) can be solved independently. This assumes that solving the subproblems independently is much easier and faster. Nonanticipativity constraints require that the values of variables that share the same ancestor nodes be equal across scenarios up to that node. In other words, if two scenarios ω and ω′ are indistinguishable up to period t, then the solutions of the two scenarios should be the same up to that period. For instance, in Figure 2, each variable at time t = 1 should be the same across all five scenarios, and the values of variables in period t = 2 should be the same for scenarios 1 and 2. However, these constraints are not enforced when the scenarios are solved independently. Through progressive penalization of NAC violations, the algorithm is proven to ultimately converge to the optimal solution for convex stochastic programs. In a nutshell, progressive hedging follows these steps:
(1) Solve each scenario without penalization.
(2) Compute the average (z) of each variable.
(3) If the solutions have sufficiently converged, stop.
(4) Update the penalization terms: λ ← λ + ρ(x − z).
(5) Solve the scenarios with the penalization terms.
(6) Go to step 2.
Algorithm 1 describes progressive hedging for multistage stochastic programming. The inputs of the algorithm are the penalty factor ρ, the maximum number of iterations k_max, and the termination criterion ϵ, which indicates the acceptable level of consensus on the nonanticipativity constraints. ϵ = 0 means that the algorithm stops only if all the nonanticipativity constraints are satisfied. In the algorithm, line 2 initializes a Lagrangian multiplier (λ) for each NAC constraint.
A Lagrangian multiplier is associated with each constraint (4). To compute the average of each variable, we need the conditional probability associated with each branch of the scenario tree in the detached form (Figure 2). Equation (5) computes the conditional probability from a node n to an immediate successor m. That probability (q_{nm}^ω) is proportional to the number of leaf-nodes associated with node m, because the number of leaf-nodes indicates the number of scenarios passing through node m. For instance, from Figure 2, q_{12}^1 = q_{12}^2 = p_2/2, q_{25}^1 = p_5, and q_{13}^3 = p_3. Lines 6 and 17 compute the average of each variable at each node n ∈ η. Lines 7 and 18 update the Lagrangian multiplier associated with each nonanticipativity constraint. Line 14 computes the Euclidean distance measuring NAC convergence.
This distance is zero if all the NACs are satisfied.
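For intuition, the averaging and multiplier update of lines 6-7 of Algorithm 1 can be sketched for a single NAC variable as follows. This is a toy example with made-up weights; in the real algorithm each x comes from a MIP subproblem solve.

```python
def ph_step(x, w, lam, rho):
    """One PH update for one NAC variable across scenarios.
    x, w, lam: per-scenario value, weight, and multiplier; returns (z, lam)."""
    z = sum(w[i] * x[i] for i in range(len(x))) / sum(w)
    new_lam = [lam[i] + rho * (x[i] - z) for i in range(len(x))]
    return z, new_lam

x = [1.0, 1.0, 0.0]     # scenario solutions for one binary variable
w = [0.25, 0.25, 0.5]   # scenario weights
z, lam = ph_step(x, w, [0.0, 0.0, 0.0], rho=1.0)
# the multipliers now push the scenarios toward consensus on the next solve
```

A positive multiplier penalizes keeping the variable above the consensus value z, and a negative one penalizes keeping it below.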

Variable Fixation
The progressive hedging variable fixing (PHVF) algorithm is identical to the classic progressive hedging algorithm (Algorithm 1), except that instead of letting variables converge progressively, PHVF fixes variables as they converge. Line 10 in the progressive hedging algorithm (Algorithm 1) is replaced by Algorithm 2. The algorithm starts by fixing variables in the root-node as they converge. For instance, for a given node n ∈ η, if a variable has a value of 1 across all scenarios ω ∈ Ω(n), that variable will be fixed to 1 across all those scenarios. This process is described in Algorithm 2. However, unless a variable belongs to the root-node, it can only be fixed if all variables in its immediate ancestor α_n have been entirely fixed. The first thing to notice is that the root-node is traversed by more scenarios than any other node (Figure 2), and, as we move away from the root-node toward the leaf-nodes, fewer scenarios pass through any given node. Hence, if, for instance, three branches originate from the root-node (as displayed in Figure 2), |I_1| = 3, then after fixing the root-node we obtain three distinct subextensive forms (SEFs) (over Ω(2), Ω(3), and Ω(4)) that can be solved independently (Figure 3). Furthermore, the three SEFs are independent. The algorithm description exploits this property of the scenario tree.
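The decomposition after fixing the root-node can be sketched as follows; the scenario-to-branch assignments mirror the Figure 2 example and are hypothetical.

```python
def split_into_sefs(first_branch_of):
    """Group scenarios by the first branch they take out of the (fixed)
    root-node; each group is an independent subextensive form (SEF)."""
    sefs = {}
    for scenario, branch in sorted(first_branch_of.items()):
        sefs.setdefault(branch, []).append(scenario)
    return sefs

# Five scenarios, three branches out of the root (|I_1| = 3).
first_branch = {1: 2, 2: 2, 3: 3, 4: 4, 5: 4}
sefs = split_into_sefs(first_branch)
# three independent SEFs: branch 2 -> {1, 2}, branch 3 -> {3}, branch 4 -> {4, 5}
```

Because the root-node decisions are identical in every scenario once fixed, no constraint links the three groups, and each SEF can be handed to a separate solver process.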

Algorithm Description.
During PHVF, at each iteration and for each scenario, the problem on line 10 of Algorithm 1 needs to be solved. We refer to C_ω as the feasible set created by constraint (2) for each scenario. Note that if the uncertainty is only in the objective function (for instance, price uncertainty), then C_ω = C_ω′ for all ω ≠ ω′. In this case, each solution of ω is feasible for C_ω′ (ω ≠ ω′). Since z(·) is the average of the values of the variables across scenarios (see line 6 of Algorithm 1), if its value is 1 (or 0), then all the scenarios have met consensus that the variable should take the value 1 (or 0), respectively. This realization is exploited in Algorithm 2 (the variable fixation module) when the slamming factor θ_t = 1 for all t. The algorithm then moves to the node at the next stage and performs variable fixation if that node belongs to the set of nonanticipativity nodes η and its predecessor is entirely fixed. The new feasible set created by constraint (2) together with the fixed variables is denoted by C_ω*. Note that this subproblem is much smaller than the original subproblem. In Algorithm 2, the inputs are θ, a function defining the value of the slamming factor depending on the stage t (0 ≤ θ_t ≤ 1; see Section 3.3 for more information on this parameter); ω, the scenario; and k, the iteration.
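The logic of Algorithm 2 can be sketched in Python as follows. This is our reconstruction from the description above, not the authors' listing: the names (fix_variables, z_bar, fixed) are ours, and the traversal order is assumed to be breadth-first from the root.

```python
def fix_variables(nodes_in_order, theta, fixed, parent):
    """Fix node variables, root-node first. A stand's variable is fixed to 1
    (resp. 0) when its scenario average z reaches theta[t] (resp. 1 - theta[t]);
    a node is processed only once its immediate ancestor is entirely fixed."""
    for n, t, z_bar in nodes_in_order:          # breadth-first from the root
        if parent[n] is not None and parent[n] not in fixed:
            break                               # ancestor not entirely fixed
        values = {}
        for s, z in enumerate(z_bar):           # scenario average per stand
            if z >= theta[t]:
                values[s] = 1
            elif z <= 1 - theta[t]:
                values[s] = 0
        if len(values) < len(z_bar):
            break                               # node n not yet at consensus
        fixed[n] = values                       # node n entirely fixed
    return fixed

# Root-node (t = 1) has reached consensus; node 2 (t = 2) has not.
theta = {1: 0.9, 2: 1.0}
parent = {1: None, 2: 1}
nodes = [(1, 1, [0.95, 0.02]), (2, 2, [1.0, 0.5])]
fixed = fix_variables(nodes, theta, {}, parent)
```

With θ_1 = 0.9 the root-node is slammed (95% and 98% agreement suffice), while node 2, requiring full consensus, stays free because one of its stands averages 0.5.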
Fixing variables impacts optimality, since some variables may be fixed to values they would not take in the optimal solution. The objective of fixing variables is not to completely solve the model through this procedure but to obtain a reduced extensive form (REF) that is tractable. Let τ denote the percentage of NAC variables that are fixed before switching to solving the REF; τ is a proxy for the number of nodes fixed. The question is: what value of τ makes the REF tractable while not severely impacting optimality? This question is investigated in Section 4.4.1.

Parallel Implementation.
Algorithm 3 is designed for the parallel implementation of PHVF. Hence, line 10 of Algorithm 1 is replaced by Algorithm 3. At each iteration of progressive hedging, the subproblems can be solved in parallel because they are independent. However, fixing τ variables and switching to solving the reduced extensive form (the extensive form with some variables fixed, REF) is not parallelized. Nonetheless, since our algorithm (PHVF) fixes variables from the root-node toward the leaf-nodes, after each node is fixed, the problem can be separated into subextensive forms (SEFs) that can be solved independently. For example, as illustrated in Figure 3, after fixing the root-node, we obtain three separate SEF models. Recursively, each subextensive form can be solved through variable fixation, leading to a reduced subextensive form (RSEF). Each RSEF can then be solved directly if its size is deemed tractable. At the end, the solutions from all the RSEFs can be combined to obtain the solution of the stochastic program.
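The parallel pattern can be sketched as follows. Here solve_sef is a stand-in for a real MIP solve; the paper's implementation calls CPLEX from Java, so this Python fragment is illustrative only.

```python
from concurrent.futures import ThreadPoolExecutor

def solve_sef(sef_scenarios):
    """Placeholder "solve": in the real algorithm this would build and solve
    the SEF model for the given scenario group with a MIP solver."""
    return sum(sef_scenarios)   # mock objective value

def solve_all(sefs):
    """Dispatch the independent SEFs to a worker pool and combine results."""
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(solve_sef, sefs))
    return sum(results)

objective = solve_all([[1, 2], [3], [4, 5]])
```

Because the SEFs share no variables or constraints, the combined objective is simply the sum of the subproblem objectives.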

Acceleration Methods.
Note from Algorithm 2 that if θ_t = 1, then we fix a variable to 1 (or, alternatively, to 0) only if all scenarios of interest agree that the value of the variable should be 1 (or 0). This requirement is hard to meet when dealing with hundreds of scenarios, especially for the root-node variables, which are replicated in all scenarios. A variable may have the same value in all but one scenario for many iterations; that one scenario then slows convergence. To avoid such situations, "slamming" [42] forces variables to converge once the percentage of concordant scenarios for a variable reaches a threshold θ ≤ 1. However, instead of defining a scalar θ as in [42], we define θ_t = f(θ_0, Q(t)) as a function of the stage, because a value of θ that is acceptable for the root-node may not be acceptable closer to the leaf-nodes (to avoid infeasibility). Here, θ_0 is the initial value of θ, whereas Q(t) can be a linear or exponential term that increases θ as a function of t. Notice that if the uncertainty is in the objective function and we apply slamming, the problem will always remain feasible. In our case, preliminary tests allowed us to find the range of acceptable values of θ_0; low values of θ_0 led to infeasibility.
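One simple choice of f, an assumption on our part since the text only requires that θ_t grow toward 1 near the leaf-nodes, is a linear ramp in the stage t:

```python
def theta(t, theta0, horizon):
    """Stage-dependent slamming threshold: theta0 at the root (t = 1),
    rising linearly to 1.0 at the final stage t = horizon."""
    return theta0 + (1.0 - theta0) * (t - 1) / (horizon - 1)

# With theta0 = 0.8 over 5 stages, root-node variables can be slammed once
# 80% of scenarios agree, while leaf-stage variables need full consensus.
```

An exponential Q(t) would keep the threshold near θ_0 longer and only tighten it sharply near the leaf-nodes; either shape satisfies 0 ≤ θ_t ≤ 1.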
In the same line of thought, even with slamming, the algorithm may become locked in a situation where NAC convergence does not improve for many consecutive iterations. We therefore define a "cascading" effect, which lowers the value of θ to θ_min for one iteration if the algorithm does not improve (no additional nonanticipativity constraints converge) for α consecutive iterations. This behavior avoids getting stuck, since lowering θ allows fixing variables for which almost all scenarios have already reached consensus: for those variables, the percentage of scenarios agreeing on the value is close to θ_t, and consensus would only be reached after several more iterations. We have noticed that the cascading effect may lead to infeasibility because it may force some variables to values they would not otherwise take. When infeasibility arises, we roll back and eliminate the cascading effect. In general, there is a tradeoff between infeasibility and the value of θ: low values of θ risk infeasibility, while raising θ may make the acceleration methods less efficient.

Stopping Criteria.
In classic progressive hedging, the algorithm stops when an acceptable level of NAC consensus is reached or when all the NACs are satisfied. The algorithm also terminates if the maximum number of iterations (k_max) is reached. In our algorithm, we keep these two stopping criteria, although we know they may never be met. We instead rely on the fact that, at some point, many nodes will be fixed (NAC consensus is reached for those nodes) and the reduced subextensive forms can be solved directly. Progress toward NAC consensus is checked through the parameter τ, the percentage of variables fixed. When the percentage of variables fixed is greater than or equal to τ, the algorithm switches to solving the REF or RSEF.
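The combined stopping test can be written compactly; the function below is a sketch (the names are ours) of the three criteria described above.

```python
def terminate(iteration, k_max, nac_distance, eps, frac_fixed, tau):
    """Stop PH iterations when any criterion is met: the iteration cap,
    sufficient NAC consensus, or enough variables fixed to hand the
    reduced (sub)extensive form to the MIP solver directly."""
    return iteration >= k_max or nac_distance <= eps or frac_fixed >= tau
```

In practice, the τ test is expected to fire first: for example, with τ = 0.8, PH stops iterating as soon as 80% of the NAC variables are fixed.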

Numerical Experiment
We describe here an empirical performance analysis of our proposed algorithm. To make our experiment easy to follow and replicate, we formulate the stochastic version of the so-called "Model I" of forest harvest scheduling.

Problem Definition and Formulation
Indices
ω, ω′: scenarios
s: stand
t: time period

Sets
S: set of stands
Ω: set of scenarios
Ω(n): set of scenarios passing through node n
η: set of nodes on which NAC should be enforced
N_t: set of nodes at stage t
T: set of time periods in the planning horizon

Parameters
r_t: profit from selling wood in period t ($/mbf); it is the discounted profit and includes the selling cost
c_st: cost of harvesting and hauling wood from stand s in period t ($); it is the discounted cost
y_st^ω: volume of wood harvestable per unit area from stand s in period t according to scenario ω (mbf/ac); this volume depends on the climate that materializes and therefore on the scenario ω
a_s: area of stand s (acres)
age_st: age of stand s at the end of the planning horizon if harvested in period t
age_s·: current age of stand s
age_s0: age of stand s if not harvested during the planning horizon
f_min: allowable percentage decrease of the volume harvested from one period to the next
f_max: allowable percentage increase of the volume harvested from one period to the next
w_ω: probability or weight of scenario ω

Variables
x_st^ω: binary variable taking the value 1 if stand s should be harvested in period t according to scenario ω, and 0 otherwise
n_s^ω: binary variable taking the value 1 if stand s should not be harvested during the planning horizon under scenario ω, and 0 otherwise
H_t^ω: accounting variable storing the volume of wood harvested in period t according to scenario ω (mbf)

The model is

maximize Σ_{ω∈Ω} w_ω Σ_{t∈T} (r_t H_t^ω − Σ_{s∈S} c_st x_st^ω)  (6)

subject to
Σ_{t∈T} x_st^ω + n_s^ω = 1, ∀s ∈ S, ω ∈ Ω  (7)
H_t^ω = Σ_{s∈S} y_st^ω a_s x_st^ω, ∀t ∈ T, ω ∈ Ω  (8)
H_{t+1}^ω ≥ (1 − f_min) H_t^ω, ∀t ∈ T∖{T}, ω ∈ Ω  (9)
H_{t+1}^ω ≤ (1 + f_max) H_t^ω, ∀t ∈ T∖{T}, ω ∈ Ω  (10)
Σ_{s∈S} a_s (Σ_{t∈T} age_st x_st^ω + age_s0 n_s^ω) ≥ Σ_{s∈S} a_s age_s·, ∀ω ∈ Ω  (11)
x_st^ω = x_st^ω′, ∀ω ≠ ω′; ω, ω′ ∈ Ω(n); n ∈ η; t such that n ∈ N_t; ∀s ∈ S  (12)
x_st^ω, n_s^ω ∈ {0, 1}; H_t^ω ≥ 0  (13)

Expression (6) maximizes the profit from timber harvest over all scenarios, weighted by their respective probabilities. Constraints (7) require that a stand be harvested at most once during the planning horizon. We use the variable n_s^ω to capture stands that are not prescribed for harvest during the whole planning horizon.
We need that variable to compute the average ending age of the forest, as shown in constraints (11). Constraints (8) compute the volume harvested in each period of the planning horizon, using the parameter y_st^ω, whose values depend on the forest growth scenario ω. Note that the set of constraints (8) is not strictly necessary: the same model could be written without it, but doing so would require rewriting constraints (9) and (10) and would hurt the readability of the model. Constraints (9) and (10) impose the even-flow requirement, so that the volume harvested from one period to the next remains within an allowed fluctuation range. Constraints (11) require that, on average, the forest at the end of the planning horizon be at least as old as the forest at the beginning of the planning horizon. Finally, constraints (12) impose the nonanticipativity constraints: for each stand s at time t, if two scenarios are indistinguishable, then the decision must be the same in both scenarios at that time. If N_ω ∩ N_ω′ ≠ ∅, then there exists t such that the two scenarios are indistinguishable at t. Constraints (13) enforce that the decision variables are binary and the accounting variables continuous. We remind the reader that the variables H_t^ω are not strictly required for this model.
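To make the accounting concrete, the sketch below evaluates constraints (8)-(10) for one scenario on a two-stand, two-period toy instance; the function names and all data values are invented for illustration.

```python
def harvest_volumes(x, y, a):
    """Constraint (8): x[s][t] in {0,1}; y[s][t] volume per acre (mbf/ac);
    a[s] stand area (acres). Returns H_t for each period."""
    T = len(next(iter(x.values())))
    return [sum(x[s][t] * y[s][t] * a[s] for s in x) for t in range(T)]

def even_flow_ok(H, f_min, f_max):
    """Constraints (9)-(10): each period's volume stays within the
    allowed fluctuation range of the previous period's volume."""
    return all((1 - f_min) * H[t] <= H[t + 1] <= (1 + f_max) * H[t]
               for t in range(len(H) - 1))

x = {"s1": [1, 0], "s2": [0, 1]}            # harvest s1 in period 1, s2 in 2
y = {"s1": [10.0, 12.0], "s2": [8.0, 11.0]}
a = {"s1": 40.0, "s2": 40.0}
H = harvest_volumes(x, y, a)
flow_ok = even_flow_ok(H, f_min=0.15, f_max=0.15)
```

With these numbers the two periods yield 400 and 440 mbf, a 10% increase that satisfies a ±15% even-flow band.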

Climate Change Data.
The potential mean annual increment, an indicator of annual forest growth, might change in the Pacific Northwest because of climate change. The change depends on temperature, precipitation, and air moisture content, all driven by human activities and economic development [28]. In addition, the change is not geographically uniform: it tends to be more negative in Oregon than in Washington State, and it tends to be negative at low altitudes in Oregon. We used the data from [28], which forecast the potential mean annual increment change (pMAI) by the year 2100. The pMAI (m^3/ha/year) is the potential change in forest growth that will be observed in a given year; hence, it is a volume given as a function of time and area. We assumed a linear change in growth from now until that year. For instance, the growth change for the next two decades is double the growth change for the next decade. The climate paths defined for the Pacific Northwest are A2, A1B, B1, and Commit (or C). Tables 1 and 2 present the values of the potential mean annual increment change for each of the climate paths. In Table 2, climate paths D1 to D4 are artificial climate paths that assume a higher pMAI. We built forest growth scenarios by assuming it is possible to transition from one climate path to another because of mitigation or intensification of climate change due to human actions.

Experimental Design.
The experiments were conducted on a DELL desktop computer running Windows with an Intel(R) Core(TM) 2 Quad CPU @ 3.70 GHz and 8 GB of memory. During PHVF iterations, each scenario is solved to an optimality gap of 1%. This is a premature stop; nevertheless, this criterion proved to accelerate the solution time for each scenario. Furthermore, the last iterations of a mixed integer program solve are the most difficult ones, with little to no improvement of the objective function value. All the models were solved using IBM ILOG CPLEX 12.6 (CPLEX, [12]). The memory allocated for storing the nodes was 3,000 MB, and the nodes were set to be stored in a compressed format on the hard drive. All other parameters were left at their default values. The code was implemented in Java 10 using the Concert Technology of CPLEX.
For parallel computation, we used the master-workers paradigm. The master is in charge of coordinating the PHVF algorithm while distributing the task of solving individual submodels or the REF to the workers. Each worker sends back its solution upon completion. The workers compete for access to memory. Therefore, the number of workers must be chosen judiciously to avoid overhead, i.e., the amount of time spent coordinating parallel tasks as opposed to doing useful work. Preliminary experiments showed that two workers were optimal given the configuration of the computer. The general framework of the model is presented in Figure 4. The input data contain the information on the forest, the climate change (growth change), and the regulations that must be met by the harvest plan. The input is fed to the PHVF module, which is the master governing the optimization process. The workers are the ones interacting with the optimizer (CPLEX in this case): they send the model to the optimizer, which returns the solution. The advantage of this framework is that we could change the optimizer without having to readapt our algorithm. At the end, the output module collects the harvest plan and informs decision making.
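The master-workers scheme described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the class name, the dummy solveScenario stand-in (a real worker would call CPLEX here), and the objective values are all hypothetical; only the structure (a master submitting one subproblem per scenario to a fixed pool of two workers and collecting results) reflects the text.

```java
import java.util.*;
import java.util.concurrent.*;

/** Minimal master-workers sketch: the master submits one subproblem per
 *  scenario to a fixed pool of workers and collects each solution. */
public class MasterWorkers {

    /** Placeholder for "solve one scenario subproblem"; a real worker
     *  would build the scenario model and call the optimizer (CPLEX). */
    static double solveScenario(int scenario) {
        return 1000.0 + scenario; // dummy objective value
    }

    /** Master: distributes the scenarios to the pool and gathers the
     *  objective value returned by each worker. */
    static Map<Integer, Double> solveAll(List<Integer> scenarios, int nWorkers)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(nWorkers);
        Map<Integer, Future<Double>> pending = new LinkedHashMap<>();
        for (int s : scenarios) {
            final int scen = s;
            pending.put(s, pool.submit(() -> solveScenario(scen)));
        }
        Map<Integer, Double> results = new LinkedHashMap<>();
        for (Map.Entry<Integer, Future<Double>> e : pending.entrySet()) {
            results.put(e.getKey(), e.getValue().get()); // blocks until done
        }
        pool.shutdown();
        return results;
    }
}
```

The fixed pool size corresponds to the two workers found optimal in the preliminary experiments; the master never touches the optimizer itself, which is what allows the optimizer to be swapped without readapting the algorithm.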
Forest growth scenarios were generated using the information on growth change reported in [28] (see Tables 1 and 2). We assumed that, due to climate change and climate change mitigation efforts, it is possible to transition from one climate path to another in two consecutive periods. Hence, for instance, it is possible to transition from climate path A2 in year 2020 to climate path B1 in year 2030. We further assumed that the probability of transitioning from one climate path to another is the same for all paths; as a result, the scenarios are considered equally probable. Note that this assumption does not have much bearing on the performance of the algorithm. Based on the number of scenarios generated, we defined small, medium, and big instances. The small instances have 64 scenarios (1 × 4 × 4 × 4 × 1), which corresponds to four branches in each of the first three periods. The medium instances have 256 scenarios, corresponding to four branches in each of the first four periods (1 × 4 × 4 × 4 × 4). For the small and medium instances, the pMAI corresponding to each climate path is reported in Table 1. The big instances have 512 scenarios, built from eight branches in each of the first three periods (1 × 8 × 8 × 8 × 1). An in-depth description of the scenario tree generation methods and their quality is beyond the scope of this research; the interested reader could refer to [11, 14, 21, 24, 25, 29, 33] and, more importantly, [34], who describe scenario generation in the forest management framework. We tested our algorithm on six different forests, three of which are real forests whose data are publicly available (P1, P34, P36).

(1) function PHVF(θ, Ω(j), k)
(2)   fixed_nodes ← fixVariables((θ, ω, k), ∀ω ∈ Ω(j))
(3)   if problem small enough then
(4)     Solve reduced extensive form.
(5)   else
(6)     for n in fixed_nodes whose successors are not fixed do
(7)       for m ∈ I_n do
(8)         PHVF(θ, Ω(m), k + 1)
(9)       end for
(10)     end for
(11)   end if
(12)   return x^ω
(13) end function

ALGORITHM 3: PH variable fixing.
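The recursive decomposition of Algorithm 3 can be sketched on a toy scenario tree. This is an illustrative skeleton only: the Node class, the "small enough" threshold expressed as a scenario count, and the stand-in for the variable-fixing step are assumptions; the real algorithm runs progressive hedging at each node before fixing and then solves a reduced extensive form on each sufficiently small subtree.

```java
import java.util.*;

/** Schematic of Algorithm 3 (PHVF): fix the variables of the current node,
 *  then either solve the reduced extensive form directly (if the remaining
 *  problem is small enough) or recurse into each child subtree. */
public class Phvf {

    static class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
        /** Leaves below this node = scenarios passing through it. */
        int scenarioCount() {
            if (children.isEmpty()) return 1;
            int n = 0;
            for (Node c : children) n += c.scenarioCount();
            return n;
        }
    }

    /** Appends to `solved` the subtrees on which a reduced extensive form
     *  would be solved, mimicking the recursion of Algorithm 3. */
    static void phvf(Node node, int smallEnough, List<String> solved) {
        // Step (2): run PH on the scenarios through `node` and fix its
        // variables -- omitted here; we assume the node is now fixed.
        if (node.scenarioCount() <= smallEnough) {
            solved.add(node.name);                // step (4): solve REF
        } else {
            for (Node child : node.children) {
                phvf(child, smallEnough, solved); // step (8): recurse
            }
        }
    }
}
```

On a 1 × 4 × 4 tree (16 scenarios) with a threshold of 4 scenarios, the recursion fixes the root and hands each of the four second-stage subtrees to a worker, which matches the decomposition the parallel framework exploits.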
The other three were computer-generated forests (P75, P83, P100). The six forests span three different altitude classes and are assumed to be located in Oregon and Washington State. The values of pMAI for the six forests for the large instances are reported in Table 2. Table 3 shows the characteristics of the forests and the scenario instances. In the tables, the values show the potential mean annual increment change by 2100; hence, negative values mean there will be a decrease in forest growth, whereas positive values mean there will be an increase. The medium instances were used as a reference to investigate how many variables should be fixed in order to obtain a computationally tractable problem. For that purpose, we solved the stochastic models using the serial version of our algorithm (Algorithms 1 and 2), setting τ to 20%, 40%, 60%, and 70% before switching to solving the reduced extensive form. τ = 20% means some variables are fixed but the root node is not completely fixed, and τ = 40% means that all the root-node variables are fixed and some variables from the second-period nodes are fixed (as stated before, τ is a proxy for the number of nodes fixed; this scheme is possible because 25% of the NAC variables lie at each stage, and thus at the root node as well).
We ran the experiment allocating 15, 30, 45, and 60 minutes to each problem. The time limits were chosen to allow the effect of τ to manifest itself. The tests were run with five repetitions, and the average objective function value was reported. Notice that if the time limit is too long, then eventually all problems can be solved, even the extensive form. Conversely, if the time limit is too short, then PHVF may not have enough time to finish fixing variables before the limit, and hence low values of τ will be favored. Throughout the experiment, we defined θ_t = min{0.999, 1.05^(t−1) θ_0}. As a result, for the first stage (t = 1), θ_1 = θ_0. Similarly, we defined θ_min = max{0.75, θ_t − 0.05t}. These values were chosen based on empirical experimentation.
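The two threshold schedules just defined are simple enough to state directly in code; this sketch merely transcribes the formulas from the text (the class and method names are ours).

```java
/** Consensus thresholds used in the experiments, as stated in the text:
 *  theta_t   = min(0.999, 1.05^(t-1) * theta_0)
 *  theta_min = max(0.75,  theta_t - 0.05 * t)   */
public class ThetaSchedule {

    /** Stage-t threshold: grows geometrically from theta_0, capped at 0.999. */
    static double theta(int t, double theta0) {
        return Math.min(0.999, Math.pow(1.05, t - 1) * theta0);
    }

    /** Stage-t floor: relaxes linearly with t, but never below 0.75. */
    static double thetaMin(int t, double theta0) {
        return Math.max(0.75, theta(t, theta0) - 0.05 * t);
    }
}
```

Note the two caps pull in opposite directions: θ_t tightens toward 0.999 as the stage index grows, while θ_min relaxes, widening the band within which a variable is considered to have reached consensus and can be fixed.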
To assess our algorithm's performance, each problem was solved using our algorithm (PHVF), and that solution was compared to the one obtained by CPLEX solving the EF directly for the equivalent wall-clock time. In addition, we report the optimality gap produced by CPLEX.

Results of the Experiment.
In addition to the number of scenarios, the SMIP size also grows with the number of units (stands), because the nonanticipativity constraints must be defined for each stand at the time periods at which constraints (12) are imposed. The problem size increases almost tenfold when going from the small instances to the medium ones. The smallest problem in this experiment has over 600k binary variables and over 69k constraints (Table 3).

Impact of τ on the Optimality and Solution Time.
Tables 4-9 report the effect of τ on the value of the objective function with respect to the runtime for forests P1, P34, P36, P75, P83, and P100, respectively. We also report the improvement (%), which is the percentage increase (if positive) or decrease (if negative) of the objective function value relative to that obtained with τ = 20% for the same runtime.
In summary, as expected, everything else being equal, longer runtimes allow a higher objective function value. For τ > 40%, which corresponds to fixing the root node entirely and fixing some variables from the second stage, we encounter numerical issues. These numerical issues have two sources. On one hand, the algorithm may run out of time. This is the case for forest P34, with no solution when τ = 60% even with a time limit of 1 hour; however, for the same forest, there is an integral solution when τ is raised to 70%. This behavior occurred mainly for forests P34 and P100 and is not observed when τ = 40%. The second aspect is the impact of τ on optimality. The advantage that higher values of τ have over smaller ones tends to vanish when the runtime is long. Similarly, the improvement in optimality from τ = 40% to higher values of τ is low. In fact, for some forests, such as P83, above τ = 40% the objective function value diminishes as τ increases, which can be interpreted as PHVF fixing many variables to suboptimal values. It clearly appears that fixing just the root node is the best choice for these problems: it allows sufficient time for the REF to find an integral solution while limiting both the impact on optimality and the disturbance to the structure of the basis matrix. Table 10 presents the results from PHVF with τ = 40%, which is equivalent to fixing the root node and then solving the SEF in parallel. For the small instances (64 scenarios), solving the extensive form directly outperformed PHVF. This means that the overhead of decomposing the problem into scenarios and solving each one outweighs the benefit. For the medium instances, the results are mixed: out of the six forests, PHVF outperformed the EF in two cases. Even when PHVF underperformed, its results were less than 1% away from the optimal solution.
The benefit of using PHVF over a state-of-the-art commercial solver such as CPLEX is highlighted when dealing with the big instances (512 scenarios). PHVF outperformed the EF for all six forests. In the case of P1, which is the biggest model with 1,363 units (Table 3), the EF completely failed to solve because the model could not fit in memory. For the other cases, at the same runtime, PHVF achieved gains ranging from 0.84% to over 767%, corresponding to forests P34 and P100, respectively. Even after leaving the EF running for 24 h, PHVF run for 15 min still outperformed it in some cases. Comparing the gain to the gap from the EF suggests that PHVF nearly reached the optimal solution, or that its solution has a relative optimality gap of less than 1%.

Conclusions
In this paper, we have developed a method for solving multistage stochastic mixed integer programs that arise in natural resource management, such as forest harvest planning. Although tested for growth uncertainty due to climate change, the method is equally valid for other sources of uncertainty, such as errors in the model predicting forest growth and uncertainty in the price of wood. The algorithm is also applicable in other disciplines where the decision variables are binary.
In the forest industry, managers solve multiple instances of the same forest model. It is therefore necessary to have an algorithm that allows one to solve the SMIP quickly. Furthermore, a faster algorithm allows practitioners to explore different management options. However, the problems tested in this experiment are quite small compared to industrial-standard models; this limitation may be compensated by the availability of more powerful computational resources in industry. The PHVF algorithm presented here overcomes some limitations noted in the literature regarding the choice of the penalty term ρ. It is documented that large values of ρ lead to cycling [9]; however, because PHVF fixes variables, this problem is avoided. Notwithstanding these benefits, PHVF has more parameters to set than classic PH. Fortunately, most of those parameters are easy to set, and the main difficulty remains the choice of the penalty term ρ. In our preliminary explorations, values of ρ that were too low led to numerical issues, while values that were too high, although they accelerated convergence, negatively impacted the objective function value.
A future extension of this work is to investigate the impact of the different parameters on solution quality. Similarly, an algorithm that can dynamically determine the value of the penalty term would be an improvement for practitioners. On the practical side, the question of the optimal number of scenarios needed to approximate growth change under climate change remains open. This can be addressed by computing the value of the stochastic solution and the expected value of perfect information for different scenario trees [3, 13, 20, 32].

Proof of Infeasibility of Classic Progressive Hedging Coupled with Variables Fixation
Let us consider a hypothetical problem with the studied structure, with four scenarios and three stages, as shown in Figure 5. Let us suppose the forest has four stands. The decision variable for each scenario is therefore a matrix x of binary variables x_ij, where x_ij = 1 means that stand i should be harvested in period j. Since a stand cannot be harvested more than once, Σ_j x_ij ≤ 1 ∀i.
During progressive hedging, at iteration k, we can obtain the following solution. We use the superscript in x^ω to refer to the solution from scenario ω ∈ {1, 2, 3, 4} whenever necessary. Since the second-stage solutions are identical for scenarios 1 and 2, we can fix all second-stage variables for these two scenarios. Similarly, we can fix all second-stage variables for scenarios 3 and 4. Comparing the first-stage variables across the four scenarios, we can fix all variables except those of stand 3, which is scheduled for harvest in period 1 under scenarios 1 and 2 but in period 3 under the other two scenarios (equation (A.4)). Therefore, the reduced problem will have all variables fixed except the third-period variables and x_31 for all scenarios. For this problem to be feasible, either x_31 = 0 or x_31 = 1 must hold for all scenarios. However, there are constraints linking the volume harvested in one period to the next (constraints (9) and (10)). Hence, setting x_31 = 0 means there will be less than the acceptable volume in period 1 for scenarios 1 and 2. Similarly, setting x_31 = 1 leads to a volume of zero in period 3 for scenarios 3 and 4, which may not be acceptable either.
This situation occurs because the productivity of each stand differs across scenarios.
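The first-stage comparison used in this argument can be made concrete: a first-stage variable may be fixed only when every scenario agrees on its value. The 0/1 solutions in the test below are hypothetical, echoing the example in which stand 3 is harvested in period 1 under scenarios 1 and 2 but not under scenarios 3 and 4; the class and method names are ours.

```java
import java.util.*;

/** Consensus check underlying the variable-fixing step: for each
 *  first-stage variable, return the agreed value across all scenarios,
 *  or null if the scenarios disagree (the variable cannot be fixed). */
public class ConsensusFix {

    static Integer[] consensus(int[][] firstStage) {
        int nVars = firstStage[0].length;
        Integer[] fixed = new Integer[nVars];
        for (int v = 0; v < nVars; v++) {
            int val = firstStage[0][v];
            boolean agree = true;
            for (int[] scen : firstStage) {
                if (scen[v] != val) { agree = false; break; }
            }
            fixed[v] = agree ? val : null; // null = left free in the REF
        }
        return fixed;
    }
}
```

In the appendix example, the entry for stand 3's first-period variable comes back unfixable, and the even-flow constraints (9) and (10) then make both of its candidate values infeasible for some scenario, which is precisely why classic PH combined with naive fixing can fail.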
Data Availability

The code and the data used in this paper are available upon request or on GitHub: https://github.com/MartinBagaram/proressive-hedging.

Conflicts of Interest
The authors declare no conflicts of interest.