Various optimization problems in engineering and management are formulated as nonlinear programming problems. Because of the nonconvex nature of such problems, no efficient general-purpose approach is available to derive their global optima. Locating a global optimal solution of a nonlinear programming problem is therefore an important issue in optimization theory. In the last few decades, piecewise linearization methods have been widely applied to convert a nonlinear programming problem into a linear programming problem or a mixed-integer convex programming problem so that an approximate global optimal solution can be obtained. In the transformation process, extra binary variables, continuous variables, and constraints are introduced to reformulate the original problem, and these extra variables and constraints largely determine the solution efficiency of the converted problem. This study therefore reviews piecewise linearization methods and analyzes their computational efficiency.
1. Introduction
Piecewise linear functions are frequently used in various applications to approximate nonlinear programs with nonconvex functions in the objective or constraints by adding extra binary variables, continuous variables, and constraints. They naturally appear as cost functions of supply chain problems to model quantity discount functions for bulk procurement and fixed charges. For example, the transportation cost, inventory cost, and production cost in a supply chain network are often constructed as a sum of nonconvex piecewise linear functions due to economies of scale [1]. Optimization problems with piecewise linear costs arise in many application domains, including transportation, telecommunications, and production planning. Specific applications include variants of the minimum cost network flow problem with nonconvex piecewise linear costs [2–7], the network loading problem [8–11], the facility location problem with staircase costs [12, 13], the merge-in-transit problem [14], and the packing problem [15–17]. Other applications also include production planning [18], optimization of electronic circuits [19], operation planning of gas networks [20], process engineering [21, 22], engineering design [23, 24], appointment scheduling [25], and other network flow problems with nonconvex piecewise linear objective functions [7].
Various methods for piecewise linearization of a nonlinear function have been proposed in the literature [26–39]. Two well-known mixed-integer formulations for piecewise linear functions are the incremental cost [40] and the convex combination [41] formulations. Padberg [35] compared the linear programming relaxations of the two mixed-integer programming models for piecewise linear functions in the simplest case, when no other constraint exists. He showed that the feasible set of the linear programming relaxation of the incremental cost formulation is integral; that is, the binary variables are integral at every vertex of the set. He called such formulations locally ideal. The convex combination formulation, on the other hand, is not locally ideal, and its feasible set strictly contains that of the linear programming relaxation of the incremental cost formulation. Sherali [42] then proposed a modified convex combination formulation that is locally ideal. Alternatively, Beale and Tomlin [43] suggested a formulation for piecewise linear functions similar to the convex combination model, except that no binary variable is included and the nonlinearities are enforced algorithmically, directly in the branch-and-bound algorithm, by branching on sets of variables, which they called special ordered sets of type 2 (SOS2). It is also possible to formulate piecewise linear functions similarly to the incremental cost model without binary variables, enforcing the nonlinearities directly in the branch-and-bound algorithm. Two advantages of eliminating binary variables are the substantial reduction in the size of the model and the ability to exploit the polyhedral structure of the problem [44, 45]. Keha et al. [46] studied formulations of linear programs with piecewise linear objective functions, with and without additional binary variables, and showed that adding binary variables does not improve the bound of the linear programming relaxation. Keha et al.
[47] also presented a branch-and-cut algorithm for solving linear programs with continuous separable piecewise-linear cost functions. Instead of introducing auxiliary binary variables and other linear constraints to represent SOS2 constraints used in the traditional approach, they enforced SOS2 constraints by branching on them without auxiliary binary variables.
Due to the broad applications of piecewise linear functions, many studies have been conducted on this topic. Their main purpose is to find a better way to represent a piecewise linear function or to tighten the linear programming relaxation. A superior representation of piecewise linear functions can effectively reduce the problem size and enhance computational efficiency. However, to express a piecewise linear function of a single variable x with m+1 break points, most of the methods in textbooks and the literature require adding m extra binary variables and 4m constraints, which may cause a heavy computational burden when m is large. Li et al. [48] developed a representation method for piecewise linear functions requiring fewer binary variables than the traditional methods. Although their method needs only ⌈log2 m⌉ extra binary variables to piecewise linearize a nonlinear function with m+1 break points, the approximation process still requires 8 + 8⌈log2 m⌉ extra constraints, 2m nonnegative continuous variables, and 2⌈log2 m⌉ free-signed continuous variables. Vielma et al. [39] presented a note on Li et al.'s paper and showed that the two representations for piecewise linear functions introduced by Li et al. [48] are both theoretically and computationally inferior to standard formulations. Tsai and Lin [49] applied the techniques of Vielma et al. [39] to express a piecewise linear function for solving a posynomial optimization problem. Croxton et al. [31] showed that most models for expressing piecewise linear functions are equivalent to each other. Additionally, it is well known that the numbers of extra variables and constraints required in the linearization process significantly impact the computational performance of the converted problem. This paper therefore reviews and discusses recent advances in piecewise linearization methods.
Section 2 reviews the piecewise linearization methods. Section 3 compares the formulations of various methods with the numbers of extra binary/continuous variables and constraints. Section 4 discusses error evaluation in piecewise linear approximation. Conclusions are made in Section 5.
2. Formulations of Piecewise Linearization Functions
Consider a general nonlinear function f(x) of a single variable x, where f(x) is continuous and x lies in the interval [a0, am]. Commonly used textbooks on nonlinear programming [26–28] approximate the nonlinear function by a piecewise linear function as follows.
Firstly, denote $a_k$ ($k = 0, 1, \dots, m$) as the break points of f(x), with $a_0 < a_1 < \cdots < a_m$; Figure 1 illustrates the piecewise linearization of f(x).
Figure 1: Piecewise linearization of f(x).
f(x) can then be approximately linearized over the interval [a0,am] as
$$L(f(x)) = \sum_{k=0}^{m} f(a_k)\, t_k, \tag{1}$$
where $x = \sum_{k=0}^{m} a_k t_k$, $\sum_{k=0}^{m} t_k = 1$, and $t_k \ge 0$, in which only two adjacent $t_k$'s are allowed to be nonzero. This adjacency condition is enforced with binary variables; in the standard textbook formulation (reconstructed here as equation (2)) it reads
$$t_0 \le y_0, \quad t_k \le y_{k-1} + y_k \;\; (k = 1, \dots, m-1), \quad t_m \le y_{m-1}, \quad \sum_{k=0}^{m-1} y_k = 1, \quad y_k \in \{0,1\}. \tag{2}$$
The above expressions involve m new binary variables $y_0, y_1, \dots, y_{m-1}$. The number of newly added 0-1 variables for the piecewise linearization of a function f(x) equals the number of break intervals (i.e., m). If m is large, this may cause a heavy computational burden.
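As a concrete illustration of Method 1, the following minimal Python sketch evaluates the convex combination approximation L(f(x)) directly. The function name `convex_combination_approx` is ours, not from the cited textbooks; the adjacency condition is enforced here by locating the containing segment rather than through the binary variables $y_k$ a MILP solver would branch on.

```python
import bisect

def convex_combination_approx(f, breakpoints, x):
    """Evaluate Method 1's approximation L(f(x)) = sum_k f(a_k) * t_k,
    with x = sum_k a_k * t_k, sum_k t_k = 1, t_k >= 0, and only two
    adjacent t_k nonzero.  The containing segment is located directly
    instead of via the binary variables y_k."""
    a = breakpoints
    # index k of the segment [a_k, a_{k+1}] containing x
    k = max(0, min(bisect.bisect_right(a, x) - 1, len(a) - 2))
    t_hi = (x - a[k]) / (a[k + 1] - a[k])   # weight of break point a_{k+1}
    t_lo = 1.0 - t_hi                        # weight of break point a_k
    return t_lo * f(a[k]) + t_hi * f(a[k + 1])

# approximate f(x) = x^2 over break points 0, 1, 2, 3
print(convex_combination_approx(lambda x: x * x, [0.0, 1.0, 2.0, 3.0], 1.5))  # 2.5
```

At x = 1.5, the approximation is the chord value between f(1) = 1 and f(2) = 4, that is, 2.5, whereas f(1.5) = 2.25; the gap is the linearization error discussed in Section 4.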
Li and Yu [33] proposed another global optimization method for nonlinear programming problems in which the objective function and the constraints may be nonconvex. A univariate function is first expressed as a piecewise linear function with a sum of absolute terms. Denote $s_k$ ($k = 0, 1, \dots, m-1$) as the slope of the line segment between $a_k$ and $a_{k+1}$, that is, $s_k = [f(a_{k+1}) - f(a_k)]/(a_{k+1} - a_k)$. f(x) can then be written as follows:
$$L(f(x)) = f(a_0) + s_0 (x - a_0) + \sum_{k=1}^{m-1} \frac{s_k - s_{k-1}}{2} \left( |x - a_k| + x - a_k \right). \tag{3}$$
f(x) is convex in the interval $[a_{k-1}, a_{k+1}]$ if $s_k - s_{k-1} \ge 0$; otherwise f(x) is nonconvex there and needs to be linearized by adding extra binary variables. By linearizing the absolute terms, Li and Yu [33] converted the nonlinear function into a piecewise linear function as shown below.
where $x \ge 0$, $d_l \ge 0$, $z_k \ge 0$, $u_k \in \{0,1\}$; $\bar{x}$ is an upper bound of x, and the $u_k$ are extra binary variables used to linearize the nonconvex part of f(x) in the interval $[a_{k-1}, a_k]$.
Comparing Method 2 with Method 1: Method 1 uses binary variables to linearize f(x) over the whole x interval, whereas the binary variables in Method 2 are applied only to the nonconvex parts of f(x). Method 2 therefore uses fewer 0-1 variables than Method 1. However, for an f(x) with q nonconvex intervals, Method 2 still requires q binary variables to linearize f(x).
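The absolute-value form of equation (3) can be checked numerically. The sketch below, with the hypothetical helper name `abs_term_approx`, evaluates (3) directly and should reproduce ordinary linear interpolation between the break points.

```python
def abs_term_approx(fvals, a, x):
    """Evaluate L(f(x)) via Li and Yu's absolute-value form (equation (3)):
    L(f(x)) = f(a_0) + s_0 (x - a_0)
              + sum_{k=1}^{m-1} (s_k - s_{k-1}) / 2 * (|x - a_k| + x - a_k),
    where s_k is the slope of the segment between a_k and a_{k+1}."""
    m = len(a) - 1
    s = [(fvals[k + 1] - fvals[k]) / (a[k + 1] - a[k]) for k in range(m)]
    val = fvals[0] + s[0] * (x - a[0])
    for k in range(1, m):
        val += (s[k] - s[k - 1]) / 2.0 * (abs(x - a[k]) + x - a[k])
    return val

# f(x) = x^2 sampled at break points 0, 1, 2, 3
a = [0.0, 1.0, 2.0, 3.0]
fvals = [t * t for t in a]
print(abs_term_approx(fvals, a, 1.5))  # 2.5, the chord value between (1, 1) and (2, 4)
```

Note that for $x \le a_k$ the term $|x - a_k| + x - a_k$ vanishes, so each absolute term switches on exactly one slope change, which is why (3) reproduces the interpolant.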
Another general form of representing a piecewise linear function is proposed in the articles of Croxton et al. [31], Li [32], Padberg [35], Topaloglu and Powell [36], and Li and Tsai [38]. The expressions are formulated as shown below.
where $\sum_{k=0}^{m-1} \lambda_k = 1$, $\lambda_k \in \{0,1\}$, and
$$f(a_k) + s_k (x - a_k) - M(1 - \lambda_k) \le f(x) \le f(a_k) + s_k (x - a_k) + M(1 - \lambda_k), \quad k = 0, 1, \dots, m-1, \tag{6}$$
where M is a large constant and $s_k = (f(a_{k+1}) - f(a_k))/(a_{k+1} - a_k)$.
The above expressions require m extra binary variables and 4m constraints, where m+1 break points are used to represent the piecewise linear function.
From the above discussion, we can see that Methods 1, 2, and 3 require numbers of extra binary variables and extra constraints that are linear in m. When approximating a nonlinear function by a piecewise linear function, the numbers of extra binary variables and constraints significantly influence the computational efficiency: if fewer binary variables and constraints are used to represent a piecewise linear function, less CPU time is needed to solve the transformed problem. To decrease the number of extra binary variables involved in the approximation process, Li et al. [48] developed a representation method for piecewise linear functions with a number of binary variables logarithmic in m. Consider the same piecewise linear function f(x) discussed above, where x lies in the interval $[a_0, a_m]$ containing m+1 break points. Let θ be an integer, 0 ≤ θ ≤ m-1, expressed as
$$\theta = \sum_{j=1}^{h} 2^{j-1} u_j, \qquad h = \lceil \log_2 m \rceil, \qquad u_j \in \{0, 1\}. \tag{7}$$
Let $G(\theta) \subseteq \{1, 2, \dots, h\}$ be the set of all indices such that $\sum_{j \in G(\theta)} 2^{j-1} = \theta$. For instance, $G(0) = \emptyset$ and $G(3) = \{1, 2\}$.
Denote $|G(\theta)|$ as the number of elements in $G(\theta)$. For instance, $|G(0)| = 0$ and $|G(3)| = 2$.
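The index set G(θ) and the binary expansion in equation (7) are easy to compute with bit operations; the helper names below (`binary_expansion`, `G`) are ours, not from Li et al. [48].

```python
from math import ceil, log2

def binary_expansion(theta, h):
    """Bits u_1..u_h with theta = sum_{j=1}^{h} 2^(j-1) u_j (equation (7))."""
    return [(theta >> (j - 1)) & 1 for j in range(1, h + 1)]

def G(theta):
    """Index set G(theta) = { j : u_j = 1 }, so sum_{j in G(theta)} 2^(j-1) = theta."""
    j, out = 1, set()
    while theta:
        if theta & 1:
            out.add(j)
        theta >>= 1
        j += 1
    return out

m = 8                  # number of line segments
h = ceil(log2(m))      # h = 3 binary variables suffice for theta in 0..m-1
print(G(0), G(3))      # set() {1, 2}; |G(0)| = 0 and |G(3)| = 2
```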
To approximate a univariate nonlinear function by using a piecewise linear function, the following expressions are deduced by the Li et al. [48] method.
where $u_j' \in \{0,1\}$; $c_{\theta,j}$, $z_j$, and $\delta_j$ are free-signed continuous variables; $r_\theta$ and $w_\theta$ are nonnegative continuous variables; and all other variables are as defined before.
The expressions of Method 4 for representing a piecewise linear function f(x) with m+1 break points use ⌈log2 m⌉ binary variables, 8 + 8⌈log2 m⌉ constraints, 2m nonnegative continuous variables, and 2⌈log2 m⌉ free-signed continuous variables. Compared with Methods 1, 2, and 3, Method 4 indeed reduces the number of binary variables, which improves computational efficiency. Although Li et al. [48] developed a way of expressing a piecewise linear function with fewer binary variables, Vielma et al. [39] showed that this representation is theoretically and computationally inferior to standard formulations for piecewise linear functions. Vielma and Nemhauser [50] later developed a novel piecewise linear expression requiring fewer variables and constraints than earlier piecewise linearization techniques: their method needs only a logarithmic number of binary variables and constraints to express a piecewise linear function. The formulation is described below.
Let $P = \{0, 1, 2, \dots, m\}$ and $p \in P$. Choose an injective function $B: \{1, 2, \dots, m\} \to \{0,1\}^{\theta}$, $\theta = \lceil \log_2 m \rceil$, such that the vectors $B(p)$ and $B(p+1)$ differ in at most one component for all $p \in \{1, 2, \dots, m-1\}$ (i.e., a Gray code on the segment indices).
Let $B(p) = (u_1, u_2, \dots, u_\theta)$, with $u_k \in \{0,1\}$, $k = 1, 2, \dots, \theta$, and define $B(0) = B(1)$. Some notation is introduced below.
$S^+(k)$: the set of all break points p such that $u_k = 1$ in both $B(p)$ and $B(p+1)$ for $p = 1, 2, \dots, m-1$, or $u_k = 1$ in $B(p)$ for $p \in \{0, m\}$; that is, $S^+(k) = \{p \mid u_k = 1 \text{ in } B(p) \text{ and } B(p+1),\ p = 1, \dots, m-1\} \cup \{p \mid u_k = 1 \text{ in } B(p),\ p \in \{0, m\}\}$.
$S^-(k)$: the set defined analogously with $u_k = 0$; that is, $S^-(k) = \{p \mid u_k = 0 \text{ in } B(p) \text{ and } B(p+1),\ p = 1, \dots, m-1\} \cup \{p \mid u_k = 0 \text{ in } B(p),\ p \in \{0, m\}\}$.
The linear approximation of a univariate f(x), a0≤x≤am, by the technique of Vielma and Nemhauser [50] is formulated as follows.
Method 5.
Denote L(f(x)) as the piecewise linear approximation of f(x), where $a_0 < a_1 < a_2 < \cdots < a_m$ are the m+1 break points of L(f(x)). L(f(x)) can be expressed as
$$L(f(x)) = \sum_{p=0}^{m} f(a_p) \lambda_p, \qquad x = \sum_{p=0}^{m} a_p \lambda_p, \qquad \sum_{p=0}^{m} \lambda_p = 1,$$
$$\sum_{p \in S^+(k)} \lambda_p \le u_k, \qquad \sum_{p \in S^-(k)} \lambda_p \le 1 - u_k, \qquad \lambda_p \in \mathbb{R}_+, \quad u_k \in \{0, 1\}. \tag{9}$$
Method 5 uses ⌈log2m⌉ binary variables, m+1 continuous variables, and 3+2⌈log2m⌉ constraints to express a piecewise linearization function with m line segments.
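To see how the constraints in (9) act, the following sketch (our own construction, using a standard reflected Gray code for B) enumerates every assignment of the binary variables $u_k$ and lists which $\lambda_p$ are not forced to zero by the $S^+/S^-$ constraints. For each assignment, exactly the two end points of one segment remain free, which is the adjacency condition.

```python
from itertools import product
from math import ceil, log2

def gray(p):
    """Standard reflected Gray code of p."""
    return p ^ (p >> 1)

def allowed_support(m):
    """For each assignment of the binaries (u_1, ..., u_theta) in (9),
    return the break points p whose lambda_p is not forced to zero by
    the S+(k)/S-(k) constraints."""
    theta = max(1, ceil(log2(m)))
    # Gray codes of the m segments; B(0) = B(1) as in Method 5
    B = {p: [(gray(p - 1) >> k) & 1 for k in range(theta)] for p in range(1, m + 1)}
    B[0] = B[1]
    def incident(p):  # segment codes incident to break point p
        return [B[p]] if p in (0, m) else [B[p], B[p + 1]]
    result = {}
    for u in product((0, 1), repeat=theta):
        allowed = set(range(m + 1))
        for k in range(theta):
            kill = 1 if u[k] == 0 else 0   # u_k = 0 zeroes S+(k); u_k = 1 zeroes S-(k)
            allowed -= {p for p in range(m + 1)
                        if all(c[k] == kill for c in incident(p))}
        result[u] = sorted(allowed)
    return result

# with m = 4, every assignment frees exactly the two end points of one segment
print(allowed_support(4))
```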
3. Formulation Comparisons
The comparison of the above five methods in terms of the numbers of binary variables, continuous variables, and constraints is given in Table 1. The numbers of extra binary variables of Methods 1 and 3 are linear in the number of line segments. Methods 4 and 5 require a logarithmic number of extra binary variables for m line segments, and the number of extra binary variables of Method 2 equals the number of concave piecewise segments. In deterministic global optimization of a minimization problem, inverse, power, and exponential transformations generate nonconvex expressions that must be linearly approximated in the reformulated problem. As shown in Table 1, Methods 4 and 5 are superior to Methods 1, 2, and 3 in terms of the numbers of extra binary variables and constraints. Moreover, Method 5 requires fewer extra continuous variables and constraints than Method 4 when linearizing a nonlinear function.
Table 1: Comparison results of five methods for expressing a piecewise linear function with m line segments (i.e., m+1 break points).

| Items                       | Method 1 | Method 2                    | Method 3 | Method 4       | Method 5      |
| No. of binary variables     | m        | q (no. of concave segments) | m        | ⌈log2 m⌉       | ⌈log2 m⌉      |
| No. of continuous variables | m+1      | m+1                         | 0        | 2m + 2⌈log2 m⌉ | m+1           |
| No. of constraints          | m+5      | m+1                         | 4m       | 8 + 8⌈log2 m⌉  | 3 + 2⌈log2 m⌉ |
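The counts in Table 1 can be tabulated for growing m to see how quickly the logarithmic formulations pull ahead. The snippet below simply encodes the table's formulas, with the worst case q = m assumed for Method 2 (all segments nonconvex).

```python
from math import ceil, log2

def counts(m):
    """Extra binary/continuous variables and constraints from Table 1,
    with the worst case q = m assumed for Method 2."""
    h = ceil(log2(m))
    return {
        "Method 1": {"bin": m, "cont": m + 1, "cons": m + 5},
        "Method 2": {"bin": m, "cont": m + 1, "cons": m + 1},  # q = m
        "Method 3": {"bin": m, "cont": 0, "cons": 4 * m},
        "Method 4": {"bin": h, "cont": 2 * m + 2 * h, "cons": 8 + 8 * h},
        "Method 5": {"bin": h, "cont": m + 1, "cons": 3 + 2 * h},
    }

for m in (8, 64, 1024):
    c = counts(m)
    print(m, c["Method 3"], c["Method 5"])
```

For m = 1024, Method 3 needs 1024 binaries and 4096 constraints, while Method 5 needs only 10 binaries and 23 constraints.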
Till et al. [51] reviewed the literature on the complexity of mixed-integer linear programming (MILP) problems and summarized that the computational complexity varies from $O(d \cdot n^2)$ to $O(2^d \cdot n^3)$, where n is the number of constraints and d is the number of binary variables. Therefore, reducing the numbers of constraints and binary variables has a greater impact on the computational efficiency of solving MILP problems than reducing the number of continuous variables. When searching for a global solution of a nonlinear programming problem by a piecewise linearization method, a linearization that generates a large number of additional constraints and binaries decreases computational efficiency and causes heavy computational burdens. According to the above discussion, Method 5 is more computationally efficient than the other four methods; experimental results from the literature [39, 48, 49] also support this statement.
Beale and Tomlin [43] suggested a formulation for piecewise linear functions using continuous variables in special ordered sets of type 2 (SOS2). Although no binary variables are included in the SOS2 formulation, the nonlinearities are enforced algorithmically, directly in the branch-and-bound algorithm, by branching on sets of variables. Since the traditional SOS2 branching schemes have too many dichotomies, the piecewise linearization technique in Method 5 induces an independent branching scheme of logarithmic depth and provides a significant computational advantage [50]. The computational results in Vielma and Nemhauser [50] show that Method 5 outperforms the SOS2 model without binary variables.
The factors affecting the computational efficiency in solving nonlinear programming problems include the tightness of the constructed convex underestimator, the efficiency of the piecewise linearization technique, and the number of transformed variables. An appropriate variable transformation constructs a tighter convex underestimator and requires fewer break points in the linearization process to satisfy the same optimality and feasibility tolerances. Vielma and Nemhauser [50] indicated that the formulation of Method 5 is sharp and locally ideal and has favorable tightness properties. They presented experimental results showing that Method 5 significantly outperforms other methods, especially when the number of break points becomes large. Vielma et al. [39] explained that the formulation of Method 4 is not sharp and is theoretically and computationally inferior to standard MILP formulations (the convex combination and logarithmic convex combination models) for piecewise linear functions.
4. Error Evaluation
To evaluate the error of a piecewise linear approximation, Tsai and Lin [49, 52] and Lin and Tsai [53] used the expression $|f(x) - L(f(x))|$ to estimate the error indicated in Figure 2. If f(x) is the objective function, $g_i(x) \le 0$ is the ith constraint, and $x^*$ is the solution derived from the transformed program, then the linearization does not need to be refined further once $|f(x^*) - L(f(x^*))| \le \varepsilon_1$ and $\max_i g_i(x^*) \le \varepsilon_2$, where $|f(x^*) - L(f(x^*))|$ is the evaluated error in the objective, $\varepsilon_1$ is the optimality tolerance, $g_i(x^*)$ is the error in the ith constraint, and $\varepsilon_2$ is the feasibility tolerance.
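A direct way to estimate the approximation error $|f(x) - L(f(x))|$ is to sample each segment and compare f with the chord through its end points; the sketch below (the helper name `max_linearization_error` is ours) does this on a uniform grid.

```python
def max_linearization_error(f, breakpoints, samples_per_segment=100):
    """Sample each interval [a_k, a_{k+1}] and return an estimate of
    max |f(x) - L(f(x))|, where L is the piecewise linear interpolant
    through the break points."""
    worst = 0.0
    for a, b in zip(breakpoints, breakpoints[1:]):
        fa, fb = f(a), f(b)
        for i in range(samples_per_segment + 1):
            x = a + (b - a) * i / samples_per_segment
            chord = fa + (fb - fa) * (x - a) / (b - a)
            worst = max(worst, abs(f(x) - chord))
    return worst

# for f(x) = x^2 with unit-width segments, the chord error peaks at 0.25
print(max_linearization_error(lambda x: x * x, [0.0, 1.0, 2.0, 3.0]))  # 0.25
```

If the returned value exceeds the optimality tolerance $\varepsilon_1$, the break point set should be refined by one of the strategies of Section 4.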
Figure 2: Error evaluation of the linear approximation.
The accuracy of the linear approximation depends significantly on the selection of break points, and more break points increase the accuracy. Since adding numerous break points leads to a significant increase in the computational burden, break point selection strategies can be applied to improve the computational efficiency of solving optimization problems by deterministic approaches. Existing break point selection strategies are classified into three categories as follows [54]:
add a new break point at the midpoint of each interval of existing break points;
add a new break point at the point with largest approximation error of each interval;
add a new break point at the previously obtained solution point.
According to the deterministic optimization methods for solving nonconvex nonlinear problems [29, 33, 38, 39, 48, 49, 53–56], inverse or logarithmic transformations must be approximated by piecewise linearization functions. For example, the function $y = \ln x$ or $y = x^{-1}$ must be piecewise linearized using an appropriate break point selection strategy. If a new break point is added at the midpoint of each interval of existing break points, or at the point with the largest approximation error in each interval, the number of line segments doubles in each iteration; if a new break point is added at the previously obtained solution point, only one break point is added per iteration. How to improve computational efficiency with a better break point selection strategy still needs more investigation and experiments to obtain concrete results.
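The first two break point selection strategies above can be sketched as follows (the helper names are ours; the third strategy requires a solver loop and is omitted). Both doubling strategies are shown refining $y = \ln x$ on [1, 4].

```python
import math

def refine_midpoint(breakpoints):
    """Strategy (i): insert the midpoint of every existing interval
    (the number of line segments doubles each iteration)."""
    out = []
    for a, b in zip(breakpoints, breakpoints[1:]):
        out += [a, (a + b) / 2.0]
    return out + [breakpoints[-1]]

def refine_max_error(f, breakpoints, samples=200):
    """Strategy (ii): in each interval, insert the sampled point where
    |f(x) - chord(x)| is largest (also doubles the segment count)."""
    out = []
    for a, b in zip(breakpoints, breakpoints[1:]):
        fa, fb = f(a), f(b)
        xs = [a + (b - a) * i / samples for i in range(1, samples)]
        worst = max(xs, key=lambda x: abs(f(x) - (fa + (fb - fa) * (x - a) / (b - a))))
        out += [a, worst]
    return out + [breakpoints[-1]]

# two refinement rounds for f(x) = ln x on [1, 4]
pts = [1.0, 4.0]
for _ in range(2):
    pts = refine_max_error(math.log, pts)
print(len(pts) - 1)  # 4 line segments
```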
5. Conclusions
This study provides an overview of some of the most commonly used piecewise linearization methods in deterministic optimization. From the formulation point of view, the numbers of extra binary variables, continuous variables, and constraints have been decreasing in recently developed methods, especially the number of extra binary variables, which may cause heavy computational burdens. Additionally, a good piecewise linearization method must consider tightness properties such as being sharp and locally ideal. Since an effective break point selection strategy is important for enhancing computational efficiency in linear approximation, more work should be done to study the optimal positioning of break points. Although a logarithmic piecewise linearization method with good tightness properties has been proposed, it is still too time consuming for finding an approximate global optimum of a large-scale nonconvex problem. Developing an efficient polynomial-time algorithm for solving nonconvex problems by piecewise linearization techniques remains a challenging question. Obviously, this contribution gives only a few preliminary insights and might point toward issues deserving additional research.
Acknowledgment
The research is supported by Taiwan NSC Grants NSC 101-2410-H-158-002-MY2 and NSC 102-2410-H-027-012-MY3.
References
[1] Tsai, J. F. An optimization approach for supply chain management models with quantity discount policy.
[2] Aghezzaf, E. H., Wolsey, L. A. Modelling piecewise linear concave costs in a tree partitioning problem.
[3] Balakrishnan, A., Graves, S. A composite algorithm for a concave-cost network flow problem.
[4] Croxton, K. L.
[5] Chan, L. M. A., Muriel, A., Shen, Z. J., Simchi-Levi, D. On the effectiveness of zero-inventory-ordering policies for the economic lot-sizing model with a class of piecewise linear cost structures.
[6] Chan, L. M. A., Muriel, A., Shen, Z. J. M. Effective zero-inventory-ordering policies for the single-warehouse multiretailer problem with piecewise linear cost structures.
[7] Croxton, K. L., Gendron, B., Magnanti, T. L. Variable disaggregation in network flow problems with piecewise linear costs.
[8] Bienstock, D., Günlük, O. Capacitated network design: polyhedral structure and computation.
[9] Gabrel, V., Knippel, A., Minoux, M. Exact solution of multicommodity network optimization problems with general step cost functions.
[10] Günlük, O. A branch-and-cut algorithm for capacitated network design problems.
[11] Magnanti, T. L., Mirchandani, P., Vachani, R. Modeling and solving the two-facility capacitated network loading problem.
[12] Holmberg, K. Solving the staircase cost facility location problem with decomposition and piecewise linearization.
[13] Holmberg, K., Ling, J. A Lagrangean heuristic for the facility location problem with staircase costs.
[14] Croxton, K. L., Gendron, B., Magnanti, T. L. Models and methods for merge-in-transit operations.
[15] Li, H. L., Chang, C. T., Tsai, J. F. Approximately global optimization for assortment problems using piecewise linearization techniques.
[16] Tsai, J. F., Li, H. L. A global optimization method for packing problems.
[17] Tsai, J. F., Wang, P. C., Lin, M. H. An efficient deterministic optimization approach for rectangular packing problems.
[18] Fourer, R., Gay, D. M., Kernighan, B. W.
[19] Graf, T., van Hentenryck, P., Pradelles-Lasserre, C., Zimmer, L. Simulation of hybrid circuits in constraint logic programming.
[20] Martin, A., Möller, M., Moritz, S. Mixed integer models for the stationary case of gas network optimization.
[21] Bergamini, M. L., Aguirre, P., Grossmann, I. Logic-based outer approximation for globally optimal synthesis of process networks.
[22] Bergamini, M. L., Grossmann, I., Scenna, N., Aguirre, P. An improved piecewise outer-approximation algorithm for the global optimization of MINLP models involving concave and bilinear terms.
[23] Tsai, J. F. Global optimization of nonlinear fractional programming problems in engineering design.
[24] Lin, M. H., Tsai, J. F., Wang, P. C. Solving engineering optimization problems by a deterministic global approach.
[25] Ge, D., Wan, G., Wang, Z., Zhang, J. A note on appointment scheduling with piece-wise linear cost functions. Working paper, 2012.
[26] Bazaraa, M. S., Sherali, H. D., Shetty, C. M.
[27] Hillier, F. S., Lieberman, G. J.
[28] Taha, H. A.
[29] Floudas, C. A.
[30] Vajda, S.
[31] Croxton, K. L., Gendron, B., Magnanti, T. L. A comparison of mixed-integer programming models for nonconvex piecewise linear cost minimization problems.
[32] Li, H. L. An efficient method for solving linear goal programming problems.
[33] Li, H. L., Yu, C. S. Global optimization method for nonconvex separable programming problems.
[34] Li, H. L., Lu, H. C. Global optimization for generalized geometric programs with mixed free-sign variables.
[35] Padberg, M. Approximating separable nonlinear functions via mixed zero-one programs.
[36] Topaloglu, H., Powell, W. B. An algorithm for approximating piecewise linear concave functions from sample gradients.
[37] Kontogiorgis, S. Practical piecewise-linear approximation for monotropic optimization.
[38] Li, H. L., Tsai, J. F. Treating free variables in generalized geometric global optimization programs.
[39] Vielma, J. P., Ahmed, S., Nemhauser, G. A note on "a superior representation method for piecewise linear functions".
[40] Markowitz, H. M., Manne, A. S. On the solution of discrete programming problems.
[41] Dantzig, G. B. On the significance of solving linear programming problems with some integer variables.
[42] Sherali, H. D. On mixed-integer zero-one representations for separable lower-semicontinuous piecewise linear functions.
[43] Beale, E. L. M., Tomlin, J. A. Special facilities in a general mathematical programming system for nonconvex problems using ordered sets of variables. In J. Lawrence (Ed.), Proceedings of the 5th International Conference on Operations Research, Tavistock Publications, London, UK, 1970, pp. 447-454.
[44] de Farias, I. R., Jr., Johnson, E. L., Nemhauser, G. L. Branch-and-cut for combinatorial optimization problems without auxiliary binary variables.
[45] Nowatzki, T., Ferris, M., Sankaralingam, K., Estan, C.
[46] Keha, A. B., de Farias, I. R., Nemhauser, G. L. Models for representing piecewise linear cost functions.
[47] Keha, A. B., de Farias, I. R., Nemhauser, G. L. A branch-and-cut algorithm without binary variables for nonconvex piecewise linear optimization.
[48] Li, H. L., Lu, H. C., Huang, C. H., Hu, N. Z. A superior representation method for piecewise linear functions.
[49] Tsai, J. F., Lin, M. H. An efficient global approach for posynomial geometric programming problems.
[50] Vielma, J. P., Nemhauser, G. L. Modeling disjunctive constraints with a logarithmic number of binary variables and constraints.
[51] Till, J., Engell, S., Panek, S., Stursberg, O. Applied hybrid system optimization: an empirical investigation of complexity.
[52] Tsai, J. F., Lin, M. H. Global optimization of signomial mixed-integer nonlinear programming problems with free variables.
[53] Lin, M. H., Tsai, J. F. Range reduction techniques for improving computational efficiency in global optimization of signomial geometric programming problems.
[54] Lundell, A.
[55] Lundell, A., Westerlund, T. On the relationship between power and exponential transformations for positive signomial functions.
[56] Lundell, A., Westerlund, J., Westerlund, T. Some transformation techniques with applications in global optimization.