An Improved L-Shaped Method for Solving Process Flexibility Design Problems

Process flexibility, where a plant is able to produce different types of products, is introduced to mitigate mismatch risk caused by demand uncertainties. The long chain design proposed by Jordan and Graves in 1995 has been shown to be able to reap most benefits of full-flexibility structure (where each plant is able to produce all products) in balanced systems (where the numbers of products and plants are equal). However, when systems are not balanced or asymmetric or when response dimension is taken into consideration, long chain design may not be the best configuration. Therefore, this paper models the process flexibility design problem in more general settings.The paper considers both balanced and unbalanced systems with asymmetric plants considering response dimension.The problem is formulated as a two-stage stochastic programwhich is solved by an adapted L-shapedmethod, combining it with several enhancements. To the best of our knowledge, this is the first time L-shaped method is used to solve the process flexibility design problem.The effectiveness and efficiency of the proposedmethod and enhancements are evaluated. Finally, the comparison between design methods proposed in this paper and in existing literature shows the superiority of the former.


Introduction
Demand for products or services has become more volatile and unpredictable due to seasonal fluctuations (see [1]), price fluctuations, or competitor uncertainties.In such an environment, matching capacitated supply and demand is vital for manufacturing companies and service centers.However, traditional strategies such as increasing supply capacities or inventories do not perform well (see [2][3][4]).A new operation strategy, namely, process flexibility, is introduced to mitigate such mismatch risk caused by uncertainties (see [5,6]).
Process flexibility, also known as "mixed flexibility" and "manufacturing flexibility," is defined as the ability "to build different types of products in the same plant or production facility at the same time" in [7].Without adequate flexibility, capacity-demand mismatch would occur (see [2,8]), which might further hurt firms' performances.Such an example could be found in [9].Because of the economic crisis and high gasoline prices in 2008, overall vehicle sales declined and consumers preferred buying small or new energy-efficient cars instead of pickup trucks or SUVs.Consequently, many automakers in Detroit without process flexibility suffered severe losses.Conversely, because of their ability to produce different vehicle models in at least one plant, Honda and Toyota survived and even gained in increase in their market share.
Although flexibility is an effective way to hedge uncertainties, it is costly to implement.Motivated by their work at General Motors, Jordan and Graves [7] introduced the concepts of limited flexibility (i.e., a plant produces only a few types of products, also called "partial flexibility" or "sparse flexibility") and chaining.They found that limited flexibility can reap most benefits of full flexibility when configured to chain products and plants together to the greatest extent possible, which is called long chain design.To make things simple, the term plant is used to refer to resource throughout this paper.Figure 1  and only can produce two products, and total flexibility where each plant can produce all products.
Because of the costly investment of flexibility and high performance of the long chain design, it has attracted much attention from both academicians and practitioners.However, it performs well in special settings such as in balanced and symmetric systems and when only demand shortfall is taken into consideration.Recently, several research works found that the long chain design does not perform in the best way when settings change.For balanced systems, Mak and Shen [2] numerically showed that long chain is not the optimal design under conditions with high or low demand variability, and Désir et al. [10] showed that the optimal structure among designs with at most 2 arcs need not be the long chain if there is no connection restriction.For unbalanced systems where the numbers of resources and demands are not equal, Deng and Shen [9] proposed an alternative better design.In addition, in practice managers care more about the cost of flexibility.This means that, besides range dimension, response dimension is another important and practical dimension for process flexibility.Chou et al. [11] studied such a problem and argued that response dimension rather than range dimension should come first when resource is limited.When cost parameters are not perfectly flexible, Mak and Shen [2] found that the long chain design does not always perform in the best way.Legros et al. [12] claimed to have the same findings in the study of asymmetric call center systems.Therefore, in order to implement process flexibility in a practical unbalanced and asymmetric system with nonhomogeneous cost parameters, it requires further exploration to solve the process flexibility design (PFD) problems.
This paper formulates the PFD problem with more general settings, including nonhomogeneous demand/cost and unbalanced system without assuming symmetric capacities as a two-stage stochastic programing problem.The model is based on the recent work of [2,8].Contrary to developing heuristic algorithms, this paper adapts a widely used method applied in stochastic programing, namely, L-shaped method, to solve the PFD problem.To the best of our knowledge, this is the first time L-shaped method is used to solve PFD problem.In order to obtain high solution quality and computational efficiency, this study further explores enhancements to speed up the algorithm.Results of numerical experiments show that the long chain design is not always the best configuration although the optimal design is still sparse.Finally, the proposed design method is compared with existing heuristic methods in the context of both balanced and unbalanced systems.The comparison results show the superiority of the proposed method in this paper.
The rest of the paper is organized as follows.Literature related to PFD and L-shaped method is reviewed in Section 2. Section 3 describes the problem and formulates models for the PFD problem.Following that, in Section 4, the basic Lshaped method adapted for our problem is developed and a set of enhancement strategies to speed up the computations is discussed.The effectiveness and efficiency of the adapted Lshaped method and the enhancement strategies are explored in Section 5, where the proposed L-shaped algorithm is also compared with a recently developed neighborhood search algorithm by Zhang et al. [4].Finally, Section 6 concludes the paper with some fruitful directions for future research.

Literature Review
2.1.Process Flexibility Design.In the seminal paper, Jordan and Graves [7] proposed the following guidelines for process flexibility design: (i) Try to equalize the number of plants (measured in total units of capacity) to which each product in the chain is directly connected.
(ii) Try to equalize the number of products (measured in total units of expected demand) to which each plant in the chain is directly connected.
(iii) Try to create a circuit(s) that encompasses as many plants and products as possible.
They conducted numerical examples and empirical analysis to show that the long chain design using the above guidelines could achieve almost all benefits of total flexibility.Such a configuration requires much less investment without losing many benefits compared with total flexibility design.Another benefit of the guidelines is that they are simple to follow in practice.As a result, many extensions and applications of the above guidelines have been carried out, such as extension to unbalanced system (e.g., [9]), application in queueing systems (e.g., [13,14]), multistage supply chain context (e.g., [15,16]), and cross-training workforce (e.g., [17][18][19][20][21]).For more applications of sparse flexibility, please refer to [22].These extensions and applications support the effectiveness of the long chain design concept.
Another stream of research is trying to find rigorous (or approximate) theoretical justifications to understand the effectiveness of the long chain or sparse flexibility designs (e.g., [3,[23][24][25][26][27][28]).Iravani et al. [24] studied properties of the long chain design by introducing a new perspective of "structure flexibility" which is based on the topological structure.Chou et al. [23] provided a theoretical justification for the effectiveness of limited flexibility using random sampling method by evaluating the asymptotic performance of the long chain design for balanced and symmetric systems.Chou et al. [26] proved the existence of sparse structure under worst-case scenarios using graph expansion properties.Simchi-Levi and Wei [3] were the first to provide rigorous justification of the optimality of the long chain design among 2-flexibility designs (i.e., each plant produces two products and each product can be produced by two plants).Most recently, Simchi-Levi and Wei [27] proved that the long chain flexibility is symmetrically more robust than any other 2flexibility designs.
Existing literature mentioned above concerns minimizing shortfall of demands, which is the range dimension of process flexibility, that is, "the extent to which a system can adapt" (see [11]).In practice, managers may be more concerned about economic aspects like "the rate at which the system can adapt" which is the response dimension (see [11]).Therefore, the objective of PFD problem could be to minimize system cost, including capacity investment cost, flexibility investment cost, production cost, and shortfall penalty cost.Chou et al. [11] were the first to conduct quantitative analysis on this issue.Then, Mak and Shen [2] developed a stochastic program for the problem but they assumed the system to be balanced.This paper follows [2] to formulate the problem as a two-stage stochastic problem, with more general settings including nonhomogeneous demand/cost and unbalanced system without assuming symmetric plants' capacities.
Since the stochastic programing problem is NP-hard (see [8]), Schneider et al. [8] and Mak and Shen [2] designed heuristic algorithms to get approximate solutions, respectively.Instead of heuristic algorithms, this paper adapted Lshaped method with enhancements to optimally solve the balanced and unbalanced flexibility design problems without any restriction on the capacities or the number of products (plants) that can be assigned to a plant (product).

L-Shaped Method.
Ruszczyński [29] and You and Grossmann [30] summarized decomposition-based methods to solve stochastic programing problems, including Benders decomposition also known as L-shaped method [31], nested decomposition [32], augmented Lagrangian decomposition [33], subgradient methods [34], stochastic decomposition [35][36][37], and disjunctive decomposition [38,39].Among these methods, L-shaped method has become the most widely used approach to tackle stochastic programing problem "because of its ease of implementation" [30].It has been proven that the method finitely converges if the second-stage function is either a finite convex function or identically negative infinity (see [31,40]).With finite realizations of scenarios, the stochastic problem can be reformulated into an extensive form, namely, Deterministic Equivalent Problem (DEP).The block structure of DEP is appropriate to be decomposed into a master problem (MP) and subproblems (SP).At each iteration, MP is solved without second-stage variables and then first-stage solutions are passed to the second stage.Solutions for the second stage generate optimality or feasibility cuts which will be added to MP to eliminate infeasible or nonoptimal solutions at the next iteration.The algorithm terminates once optimality criteria are satisfied (see [31,40]).
Classic L-shaped method [31] is also called single-cut version because at each iteration only one cut is generated.It might require too many iterations because a single cut could not transfer adequate information of all scenarios.Therefore, Birge and Louveaux [41] developed a multicut version, where the number of cuts generated at each iteration could be as high as the number of scenarios.Both You and Grossmann [30] and Lei et al. [42] evaluated the efficiency of the multicut version when the first-stage problem is a mixed integer problem.
An obvious disadvantage of the multicut version is that it may become relatively more complicated to solve the firststage problem with more and more cuts added to the MP.To overcome this drawback, Birge and Louveaux [41] suggested a hybrid approach, where the number of cuts generated at each iteration is between one and the number of scenarios.The combined number of cuts is referred to as aggregated level.This raises the problem of whether there exists an optimal aggregated level and if so what it is.Trukhanov et al. [43] suggested a method to change the aggregated level dynamically.Wolf and Koberstein [44] proposed a cut consolidation technique to retain an aggregated cut from those inactive cuts which are to be removed.Both these dynamical adjustment techniques are based on dual solutions of the corresponding cuts.If the dual value is zero, then the corresponding cut is considered to be inactive.Most recently, Song and Luedtke [45] proposed an adaptive approach to dynamically change the aggregated level to solve stochastic linear programs.Their computational results show that their proposed adaptive partition-based algorithm outperforms the basic singlecut L-shaped algorithm of Van Slyke and Wets [31] for their problem instances with fixed recourse and technology matrices.
Adequate scenarios are needed to solve the PFD problem optimally, which increases the computational complexity.Existing single-cut or multicut version of L-shaped method could not solve the model efficiently.Therefore, this study adapts a hybrid-cut version and explores combinations of different enhancements to solve the problem efficiently.

Problem Formulation
It is assumed that there is a set  = {1, . . ., } of products (indexed by ) and a set  = {1, . . ., } of plants (indexed by ).While the demand (D) for these  products is uncertain, the distribution of demand is already known.A fixed investment cost   for flexibility occurs if the connection between plant  and product  exists.For an elementary outcome , let   be the marginal benefit defined as the unit price of product  minus unit production cost of product  produced by plant .Notations are summarized as follows: Mathematical Problems in Engineering i: index of products with values 1, . . ., , : index of plants with values 1, . . ., ,   : the capacity of plant , D: a random demand vector defined on the probability space (Ω, Ξ, ), with elementary outcomes  ∈ Ω,   : fixed investment cost allowing plant  to produce product ,   : marginal benefits of producing product  by plant ,   : binary variable which equals 1 if product  can be produced by plant ,   : the quantity of product  produced by plant .
With these notations, the PFD problem is formulated as the following mathematical program: The objective is to minimize the total cost which equals the total investment cost of flexibility minus the expected benefits from sale of the manufactured products realized through the elementary outcomes  ∈ Ω. Constraint (2) guarantees that the total resource used does not exceed its capacity.Constraint (3) ensures that production does not exceed the demand for each product.Constraint (4) guarantees that no product  is produced by plant  if there is no connection (link) between them.
The PFD problem is a natural two-stage decision problem.In the first stage, strategic decisions are made prior to the resolution of uncertainty.Then in the second stage, after demands are observed tactical decisions are made to allocate limited capacities to demands based on the firststage solution.Let the value function (, ) be the objective function value of each specific scenario subproblem .Then the first-stage problem can be formulated as follows: The allocation of capacities to gain maximum profits is decided with an observed demand scenario  and fixed flexibility configuration  obtained from PFD1.The secondstage formulation is presented as follows, where random variables only appear on the right-hand side: It is obvious, that for any solution () of PFD1, there always exist feasible solutions for PFD2; that is, the problem has complete recourse (see [40]).It is assumed that demands follow random distribution with known mean and variance.Up to this point, a two-stage stochastic programing model with binary variables only in the first stage for the PFD problem has been developed.
To convert such a problem into a set of multiple deterministic problems, Kleywegt et al. [54] provide a technique called Sample Average Approximation (SAA) to handle large scale (or even infinite) scenarios which may be necessary for solving PFD1.They also provided a method to calculate the required sample size to obtain -optimal solution to the true problem with a given level of probability.Furthermore, they showed that the estimated solution converges to the true optimal solution at an exponential rate in the sample size.With this technique, the Deterministic Equivalent Problem (DEP) for the original problem (1) is obtained: where  denotes the sample size,  is a specific scenario,    is the decision variable of quantity of product  produced by plant  in scenario , and    is a realization of demand for product  in scenario .

Adapted L-Shaped Method and Enhancements
Preliminary computational results showed that, for the problem developed above, the single-cut version of L-shaped method is effective and efficient for only small size problems (e.g.,  ≤ 6 and  ≤ 6) but not for medium (e.g., 7 ≤  ≤ 9 and 7 ≤  ≤ 9) or large size problems (e.g.,  ≥ 10 and  ≥ 10).The multicut version of L-shaped method (see [30,40]) performs much better for solving medium and large size PFD problems.Therefore, the multicut version of L-shaped method is explored in this paper.

Multicut Version of L-Shaped Method.
The multicut version of L-shaped method consists of the following main steps.
Step 0. Set upper and lower bounds UB = +∞ and LB = −∞, respectively.Set the iteration accumulated flag  = 0 (indexed by V) and   to be a sufficiently large negative value for all .
Step 1. Solve the master problem (MP): The objective function of ( 10) is equivalent to that of DEP (9), where  is the sample size and   is an estimation of the value function of the second-stage problem with specific scenario .Solving MP can obtain a lower bound for PFD1 (7).Let (  ,   ) and   be the optimal solution and objective value, respectively.If   > LB, update LB =   .Constraints (11) are the optimality cuts generated by each scenario  at each iteration V.The corresponding new introduced variables V and   V will be calculated later in Step 3.
The dual problem (DP) for subproblem ( 8) is as follows: where   ,   , and   are corresponding dual multipliers.Let (  ,   ,   ) and   be the optimal solution and optimal objective value, respectively.Calculate Step 3. If UB − LB <  stop and output   as the optimal solution.Otherwise, for each scenario , if update Add optimality cut     +   ≥    to the master problem.Return to Step 1.
Recall that the PFD problem has complete recourse, which means that its subproblems are always feasible.Therefore, no feasibility cut is generated during the algorithm.

Enhancements.
Several enhancement techniques to speed up the L-shaped method are to be discussed in this section.Multicut version of L-shaped method generates a large number of new cuts at each iteration, which increases the computational effort required to solve the MP.Therefore, most enhancements are designed for speeding up the MP solving process, including hybrid-cut version, trust region, cut strengthening, approximating master solve, and warm startup.Parallel computation of subproblems is an enhancement using modern computing technology to reduce wallclock times when solving numerous independent subproblems.

Hybrid-Cut Version.
In multicut L-shaped algorithm, more information from second stage is passed to the first stage by disaggregate cuts, resulting in fewer iteration steps compared to single-cut L-shaped method (see [40]).However, multicut version would enlarge the first-stage problem, which increases the total computational effort to solve the master problem, especially for problems with integer variables in the first stage.In order to balance the tradeoff between information and total computational effort, this paper follows the static aggregate method [43] to fix an aggregated level.This enhancement is referred to as hybridcut (HC).The steps of the HC algorithm are presented below.
Step 0. Following some grouping scheme, cluster  scenarios into  groups.Each group has a set of scenarios Ω  .Set accumulation index () of each group  to 0. Set iteration index V = 1 and LB = −∞ and UB = +∞.
Step 1. Solve the master problem: Let ( V ,  V ) and  V be the optimal solution and optimal objective value for iteration V, respectively.If  V > LB, update LB =  V . ()  +   ≥  () is optimality cuts generated by each group of scenarios.
Step 2. For each group , there are |Ω  | scenarios.For each scenario  ∈ Ω  , solve the dual problem (DP) (13).Denote the dual optimal solution for scenario  in group  to be ( , ,  , ,  , ).Define Let add an optimality cut and () = () + 1.If inequality (18) does not hold for any , stop and output  V as the optimal solution.
Step 3. Denote  V , to be the optimal objective value of scenario  in iteration V.
Because the convergence of this algorithm follows directly from that of the multicut method (see [30,40,43]), the proof of convergence is omitted.
It should be noted that different aggregated levels and different grouping schemes would lead to different computational efforts.In order to fix an appropriate aggregated level and find out an efficient grouping scheme, Section 5 will discuss these two aspects by conducting numerical experiments.

Trust Region.
Trust region is a kind of regularizing technique adapted from regularized decomposition for continuous problems (see [40,46,55]), which helps to mitigate two kinds of difficulties in cutting plane methods: (a) growth in the number of cuts added to the master problem and (b) the fact that there is no easy way to use a good starting solution (see [55]).Preliminary experiments and existing literature show that solution oscillates wildly in early iterations.Thus trust region is used to limit the number of first-stage binary variables that change from iteration to iteration.For stochastic programing problems with binary variables in the first stage, Santoso et al. [56] and Oliveira et al. [57] showed that the 2-norm or infinity-norm distance is not effective.Therefore, this paper uses Hamming distance.Let  V = {(, ) :  V  = 1} such that the trust region to be added to the master problem for iteration V + 1 is where Δ V is the trust region radius, denoting the limit on the number of variables that can change from iteration V to iteration V + 1.
In the implementation, the initial value of Δ 1 is set to be ( × )/2 so that, at the very beginning, half of the links between products and plants are allowed to be changed.Luis et al. [58] argued that the trust region radius and the time to discard the use of trust region require careful settings to improve performance.In addition, the method of dynamically updating radius Δ with iterations should be well designed, which could improve the computation efficiency (see [55]).This paper follows recommendation from [50,58].That is, if the current objective value estimated by the MP is not better than the estimated lower bound, Δ V+1 = ⌈0.6ΔV ⌉ is used to shrink the trust region intending to obtain a higher objective value in the next iteration.Otherwise, Δ V+1 = ⌈1.3ΔV ⌉ is used to expand the trust region radius.
For integer programing problems, trust region cannot guarantee convergence (see [50]).Therefore, the trust region constraint needs to be dropped once the procedure has reached certain criteria (see [56]).In this study, trust region constraint is dropped once the difference in adjacent gaps is lower than a given tolerance (i.e., 0.004 in this paper).

Cut Strengthening.
The network substructure problem (8) typically has multiple dual solutions; hence a subproblem may generate multiple optimality cuts.Among these cuts, one cut may dominate another one.Since the addition of cuts generated during iterations makes the MP harder to solve, choosing the best one from these alternative cuts would be beneficial in solving the MP by reducing the number of iterations.Let ((  V ) 1 ,(  V ) 1 ) and ((  V ) 2 ,(  V ) 2 ) be coefficients of two alternative cuts.Suppose that  * is the optimal solution of the stochastic problem.If 2 is said to be dominated by cut (  V ) 1 +   ≥ (  V ) 1 (see [51]).A cut is called Pareto-optimal cut (also called the deepest cut in [59]) if it is not dominated by any other cut.Since  * is not known priorly, Magnanti and Wong [51] used a core point instead.A core point is a point in the interior of the convex hull of the feasible region for the original problem.
Given a core point Ŷ, a Pareto-optimal cut can be generated by solving the following subproblem: , ,  ≤ 0, where   is a solution of the master problem and (  ) is the optimal objective function value of subproblem ( 13) with solution .Equality (22) ensures that the dual variable solution to the problem is chosen from one of the optimal solutions of the dual subproblem (see [51,56,58]).Generally speaking, it is intractable to obtain a core point.Fortunately, previous work has shown that points which are close to core points can be strong.Alternative methods to obtain such points are proposed.Santoso et al. [56] used the fractional optimal solution from the LP relaxation of the MP as an approximated core point.Luis et al. [58] updated core point by averaging the last core point and the latest MP solution.Rei et al. [59] simply fixed each component of core point to be 0.5.In this paper, Pareto-optimal cuts are generated by solving the corresponding problem of (20) and adopting the method from [56] to obtain core points.

Approximate Master Solve.
For two-stage stochastic programing problems with integer variables in the first stage, most computational effort is exerted on solving the MP.In early iterations, when the solution is still far away from optimality, it is acceptable to solve the MP within -optimality, where  > 0.1%.When the optimal gap of the whole problem is small enough resume the default tolerance for the MP.If the same master solutions are obtained in consecutive iterations but the optimality gap is still not small enough, update  = 0.5.Such enhancement technique is widely used in related studies (see [42,50,58]).

Warm Startup.
From MIP theory it is known that a good initial solution, called warm startup, can help solve the problem efficiently.There are several means to generate initial solutions.In this paper the initial solution is obtained by solving an equivalent deterministic problem with small number of scenarios.The number of scenarios is set to be 50 for computation experiment in this paper.The effectiveness of warm startup is also studied by Fábián and Szőke [60], Luis et al. [58], and Wolf and Koberstein [44].

Parallel Computing of Subproblems.
A big advantage of decomposition-based algorithm is that all the subproblems are independent.Therefore, parallel computing is conducted to solve the second-stage problems by using modern parallel computing technique [30,44], such as multicore technology, symmetric multiprocessing, distributed computing, and grid computing.Wolf and Koberstein [44] summarized applications of these techniques in stochastic programing.Because this paper intends to compare the computational efficiency of L-shaped method with that of MIP solver, parallel computing only by multicore technology is implemented.Using this method one needs to pay careful attention to processor load balancing [61].For the experimental platform used in this paper (details will be provided in the next section), ten threads are opened to finish solving subproblems, which performed well in preliminary study.

Computational Experiments
In this section, the effectiveness of the L-shaped method with different enhancements to solve the small size problems is firstly evaluated by comparing the solutions obtained with the corresponding result of DEP solved by a commercial solver (here CPLEX 12.6 is used).Before that, preliminary experiments are conducted to test the impact of sample size and discuss some key parameters for hybrid-cut version.Then the efficiency of the improved L-shaped method is analyzed by conducting medium and large size problems.A summary of efficiency of different enhancements is presented in Section 5.3.4.Finally, the superiority of the design method proposed by this paper is shown by comparing with existing heuristic design methods.Abbreviations for different enhancements are shown in the Abbreviations (abbreviation for enhancement strategies).

Data Generation.
Because of a lack of applicable problem instances in the literature, test instances have been generated based on the method proposed by Schneider et al. [8].Small size combinations of product types () and plants () are set to / = {4/4, 5/5, 6/6}, while / = {8/8, 10/10, 20/20} are set for medium and large size problems.Flexibility cost/profit margin ratio (lcpm) is set to be 5.Ten instances are generated for each combination of / for small size problems and four instances are generated for both medium and large size problems.In addition, the algorithm is applied to unbalanced systems where / = {6/4, 8/5, 15/12}.
The algorithm is implemented with Java on a standard desktop PC with 2 Intel R Xeon processors at 2.13 GHz and 16 GB RAM running Windows Server 8 R2 Enterprise, by calling the optimization solver CPLEX 12.6 for solving the master problem and each subproblem.Each experiment is terminated when it satisfies the stop criterion in the algorithm design or the solving time exceeds 3600 seconds. .Furthermore, the performance of Lshaped method is also affected by the sample size.Therefore, preliminary experiments of small size problems are carried out in this subsection to test the impact of different sample sizes.
Firstly, the impact of sample size on objective value is tested.In this experiment, the DEP model ( 9) is used to solve the PFD problem.Sample size  varies within the value set {500, 1000, 2000, 4000}./ is set to be 4/4 and 5/5 because the DEP model of such scale problems could be solved exactly within limited time (3600 s).Other parameters are generated following Section 5.1.Ten instances are generated for each sample size.The average objective values and differences are reported in Table 1.The differences are relative differences between two objective values of adjacent sample sizes.For instance, 1.67% is obtained by |(−22626.65− (−23011.95))/− 23011.95|× 100.The result shows that the differences become smaller as sample size increases, which implies that sample size of 2000 could well approximate the original function for small size problem.Based on this preliminary experiment, sample size will be fixed to be 2000 unless otherwise specified.
Then, the impact of sample size on the performance of L-shaped method is tested.In this preliminary experiment small size problems with / = {4/4, 5/5, 6/6} will be examined, while sample size varies within {500, 1000, 2000}.Other parameters are generated following Section 5.1.Because Lshaped method with different enhancements could solve these problems (nearly) optimally within given time, Table 2 only reports the average solving time (second) which represents the performance.The result shows that the solving time increases as both the problem scale and sample size increase.Furthermore, for small size problems different enhancement combinations have similar performance.However, when problem size increases to / = 6/6, the performance of "HC-PC-TR-AM" becomes much better than others.The performance of L-shaped method with different enhancement combinations will be further examined in the next subsection.

Key Parameter Values for Hybrid-Cut Version.
As indicated in Section 4.2, different aggregated levels and different grouping schemes would result in different computational efforts.However, it is intractable to determine the optimal aggregated level in advance (see [43,44]).Therefore, this subsection will firstly conduct computational experiments to select an appropriate aggregated level.Different grouping schemes will then be discussed. and  are set to be 4. Scenario number is set to be 2000.Then solving time and gap are observed when aggregated level varies within the set {10, 30, 50, 70, 90}.As it is shown in Table 3, with aggregated level being 50, the computational time decreased to 65.2 seconds while the number of iterations decreased to 19 and gap converges to 0.00%, which implies that, for settings (//) = (4/4/2000), an appropriate aggregated level is 50.In the following numerical studies, the aggregated level is fixed to be 50.
Different grouping schemes could lead to different computational efficiencies.In preliminary experiments, the hybridcut version took 3207 CPU seconds to converge to a gap of 0.08% by using round-robin method [43] (i.e., for a fixed aggregated level , cuts 1,  + 1, 2 + 1, . . .are aggregated to group 1, cuts 2,  + 2, 2 + 2, . . .are aggregated to group 2, and so on), while it took only 127 seconds to converge to a gap of 0.10% by using -means cluster method.Therefore, in order to speed up the computations, this paper adopts -means algorithm as the grouping scheme to cluster scenarios into groups, which is also used to speed up Bender' decomposition algorithm in [62].

Effectiveness of of the Adapted L-Shaped Method.
In this subsection, we present the experimental results on combinations of enhancements mentioned above for small and medium and large size problems.Table 4 shows results for small size problems.The measure of algorithm effectiveness used is the gap (%) defined as follows: where Algorithm  represents the solution value using enhancement combination  and LB Best is the best lower bound found using all algorithms tested.

Results
for Small Size Problem.Table 4 shows the average computational results for small size balanced problems.It shows that, for small size problems, almost all combined strategies solve the model to acceptable tolerance level in less CPU time than MIP does, which implies that enhanced Lshaped methods are superior to MIP for small size problems.
As is shown in Table 4, among the combined enhanced strategies, "HC-PC-TR-AM" outperforms others in terms of both the CPU time and percentage gap, which is also illustrated in Figure 2. In 5/5 and so forth.Enhancement strategy combination "HC-PC-TR-AM" solves problems under acceptable tolerance in the least CPU time.
The single-cut version of L-shaped method performs more poorly than any other enhanced methods for problems considered in this paper.Enhancements aiming at solving the master problem (i.e., AM, TR, and WS) improve the algorithm performance significantly.By contrast, enhancements based on subproblems structures (i.e., HC, PC) improve algorithm performance slightly.It is mainly because there would be a long tail for the algorithm.That is, after dramatic decrease during the first iterations, gap reduces extremely slowly, even with no improvement for several successive iterations.5 presents average results of two medium size balanced problems (/ = 8/8, / = 10/10) and one large size (/ = 20/20) balanced problem with four independent instances for each.As is expected, MIP could not solve even medium size problem (/ = 8/8) into tolerant gap within the given CPU time.Several combination strategies are chosen to check their performances on medium size and large size problems due to their good performances in small size problems.

Results for Medium and Large Size Problem. Table
With the problem size increases to medium or large size, no method is able to obtain optimal solutions for all problem instances within the given time."HC-PC-TR-AM" performs in the best way to obtain the lowest gaps within given CPU time.Even for the large size problem instances (e.g., / = 20/20), the percentage gap for the "HC-PC-TR-AM" algorithm is only 0.26% within given CPU time.Table 6 shows computational results for unbalanced systems where number of plants and products is different.Again, "HC-PC-TR-AM" performs best in terms of both solving time and gap.

Efficiency of Cut
Strengthening.Pareto-optimal cut is the strongest cut among alternative cuts to speed up the process of solving the master problem.However, as discussed in Section 4.2.3, to obtain a Pareto-optimal cut, it is needed to solve twice the number of subproblems.There is a trade-off between the CPU time saving for solving the master problem and the CPU time consumed for obtaining the Paretooptimal cuts.That is, if solving time for the master problem is large, then spending twice the time solving subproblems to get the strongest cut is supposed to improve the total solving time.Some experiments are presented in Table 7,  (i) Parallel Computing (PC).This enhancement reduces total computation time for small size problems.However, with the increase of problem size, benefits from "PC" fade away because the computational effort is dominated by the solving of master problem.This conclusion is drawn by comparing results of "HC" with "HC-PC" and "HC-AM" with "HC-PC-AM" in Tables 4 and 6.
(ii) Approximate Master (AM) Solve and Warm Startup (WS)."AM" and "WS" contribute to improvement of L-shaped method.However, "AM" does not add extra benefits to enhancement of "WS."The reason might be that with initial solution the master problem could be solved efficiently.In other words, benefit of "WS" dominates that of "AM." (iii) Trust Region (TR).Benefit from "TR" is obvious.With the increase of problem size, this benefit becomes notable.
From the summarization and numerical study result reported above, it is shown that enhancement combination "HC-PC-TR-AM" performs in the best way.Therefore, the adapted L-shaped method enhanced by this combination will be compared with other PFD methods in the next subsection.

Comparison with Other Design
Methods.The stochastic programing (SP) model and method proposed in this paper could solve PFD problem with general settings, such as no assumption of balanced systems with symmetric cost parameters or unbalanced systems with symmetric capacities.Section 5.4.1 will compare it with a currently proposed heuristic design method under balanced systems, while Section 5.4.2 will conduct a comparison with the design principles for unbalanced systems proposed by Deng and Shen [9].

Comparison with Heuristic Long Chain Design Method.
In this subsection, the results obtained from the proposed Lshaped method are compared with that of Zhang et al. [4].An  [4] and "HC-PC-TR-AM," respectively; "Impr (%)" represents the improvement of "HC-PC-TR-AM" compared to [4] (i.e., |(Obj 2 −Obj 1 )/Obj 1 |); and "# LC/Total" means the number of long chains being the optimal solution by our method out of the total experiment cases.Table 8 shows that the proposed method in this study could obtain better solutions than does [4].This improvement could be quite significant for medium and large cases.The last column tells that long chain structure is not the optimal solution for the majority of experiment cases.There are mainly two reasons for the superiority of the proposed method.Firstly, when nonhomogeneous parameters are taken into consideration, long chain design need not always be the best.The other reason is that model developed in this study considers the uncertainty of demands, while it is ignored in [4].Figure 4 shows an example where (/ = 6/6).Figure 4(a) represents the optimal solution by Zhang et al. [4] which we call "LCD," while the optimal design of our method "HC-PC-TR-AM" is a non-long chain structure as shown on the right panel.The total cost of "LCD" turns out to be −25575.58as compared with −28205.02from "HC-PC-TR-AM."It shows again that the proposed method also generates sparse flexibility design, but at the same time it costs less than long chain design does.

Comparison with Heuristic Unbalanced Network Design
Method.Deng and Shen [9] proposed additional flexibility design guidelines (hereinafter referred to as Deng) for symmetric unbalanced systems where all plants have the same capacity and product demands are ...In this subsection, we compare results of SP by setting the system with 4 plants and 5 products to be symmetric and asymmetric, respectively.Firstly, the system is set to be symmetric.That is, all the plants have the same capacities and all products' demands are ...as designed in Section 5.1.In order to compare our solution and Deng's solution, we vary the flexibility cost (  ) to produce a spectrum of solutions with different flexibility cost.The average sales profit from the secondstage and the total profit (i.e., sales profit minus flexibility investment) are plotted in Figure 5.It is observed that Deng's solution is slightly better than SP solution with regard to sales profit.This is due to the fact that Deng's guidelines are designed specially for minimizing shortfall, while our method is for minimizing total cost.As shown in Figure 5, SP solution slightly outperforms Deng's solution in terms of total profit.With the increasing of flexibility cost, the superiority becomes more obvious.This experiment suggests that our method performs at least as well as Deng's method in terms of response dimension under the conditions of symmetric unbalanced system.Then, the system is set to be asymmetric such that plants have different capacities.Flexibility cost is fixed to be 4 times the marginal benefit ().Capacities are randomly generated from uniform distributions with mean 125.We vary the ranges of the uniform distributions to test the effect of different degrees of capacity asymmetry and simulate 100 instances for each value.The average result is plotted in Figure 6.It is observed that Deng's method is robust for low degrees of capacity asymmetry.For the majority of instances, Deng's design principles outperform SP method in terms of minimizing shortfall.However, SP method performs better than Deng's method if the degree of capacity asymmetry is high.This suggests that SP method is more proper for more general systems.In reality, asymmetric systems could occur by uncertainty of plants, such as machines breakdown, which suggests that SP method might also be applied under the condition of uncertain capacity.

Conclusions
This paper formulated the PFD problem under demand uncertainty using two-stage stochastic programing model.The model considers nonhomogeneous system parameters for balanced and unbalanced systems without assuming symmetric capacities of resources.An exact optimization algorithm, namely, L-shaped method, is introduced solve the proposed model.To the best of our knowledge, this is the first time that L-shaped method is used to solve the PFD problem.Based on classic L-shaped method (the single-cut version) and multicut version, it is found that the hybridcut version, which aggregates all scenarios to a certain level between the former two extreme levels, can help speed up the solution process significantly.
Several prevalent enhancements are discussed in this paper.While all enhancement strategies improve the algorithm in terms of time or quality, computational results show that the adapted L-shaped method using the "HC-PC-TR-AM" combination is the most effective in solving the PFD problem, which cannot be handled by commercial solvers directly.This shows that the proposed developments can be generalized to solve more general problems related to PFD and perhaps some other design problems as well.For instance, in the context of call centers, managers need to decide on how many operators to employ and what kinds of services each operator needs to be trained in.
In practice, manufacturing or service systems are always unbalanced and system parameters could hardly be homogeneous.Therefore, this paper offers an appropriate formulation to model such practical problems.The results show that under such conditions the widely advocated chaining structure is not always the best configuration.However, as shown in the computational experiment, the proposed method still guarantees a sparse flexible structure which is quite easy and economical to implement.Therefore, the developments and solution methods discussed in this paper can be used by managers and executives to solve practical PFD problems.
Several issues are worthy of future investigations.First, considering extensions of the PFD problem that include cases where demands for different products are correlated will enhance the practicality of the proposed approach.Second, more research is needed to further improve algorithm enhancements.From Table 3, it can be observed that the aggregated level affects the computational performance of hybrid-cut version a lot.Therefore, it would be an interesting issue to design effective schemes to find out the best aggregated level in the proposed L-shaped algorithm.Third, development of hybrid solution techniques-like combining the proposed L-shaped method with enhancements and suitably developed metaheuristics-could increase the efficiency of the proposed L-shaped algorithm in this paper.Finally, extension of our results to more complex PFD problems-like the inclusion of transportation costs in cases involving multiple plant sites which could well be separated by cities and countries-will enhance the practicality of the developments in this paper.

Figure 6 :
Figure 6: Comparison with Deng for asymmetric systems.

Table 1 :
Impact of sample size on objective value.

Table 2 :
Impact of sample size on the performance of L-shaped method.

Table 4 :
Computational results for small size balanced problems.

Table 7 :
Efficiency of cut strengthening.

Table 8 :
[4]parison with optimal design by Zhang et al.[4].(2000 in this paper) times PFD2 problems are then solved to get the total cost value, equal to flexibility investment minus the expected benefits from sales of manufactured products.The results are compared with those obtained by the enhanced method "HC-PC-TR-AM" in Sections 5.3.1 and 5.3.2.Table8shows the comparison results, where Obj 1 and Obj 2 are average total cost values from