Motivated by problems coming from planning and operational management in
power generation companies, this work extends the traditional two-stage linear stochastic program by adding probabilistic constraints in the second stage. We describe, under special assumptions, how two-stage stochastic programs with mixed probabilities can be treated computationally. We obtain a convex conservative approximation of the chance constraints defined in the second stage of our model and use Monte Carlo simulation techniques to approximate the expectation function in the first stage by a sample average. This approach raises another question: how to solve the linear program with the convex conservative approximation (nonlinear constraints) for each scenario?
1. Introduction
Optimization problems involving stochastic models occur in almost all areas of science and engineering. Financial planning and unit commitment in power systems are just a few examples of areas in which ignoring uncertainty may lead to inferior or simply wrong decisions.
Stochastic programming models are optimization problems in which decisions have to be made under uncertainty because some of the parameters are random variables; such models may use probabilistic constraints and/or penalties in the objective function. In practice, the numerical solvability of the problem plays an important role, and there is a tradeoff between correct statistical modeling and computability. For earlier reviews on the various aspects of stochastic programming see, for example, [1–5].
Two-stage stochastic programming is useful for problems where an analysis of strategy scenarios is desired and where the right-hand-side coefficients are random. The main idea of this model is the concept of recourse, which is the possibility of taking corrective actions after a realization of the random event. A first-stage decision is made before the values of the random variables are known; then, after the random events have occurred and their values are known, a second-stage decision is made to minimize “penalties” that may arise from any infeasibility. For a good introduction to and deeper treatment of various aspects of these models, see the books [4, 6].
Chance-constrained optimization problems were introduced by Miller and Wagner [7] and Prékopa [8]. An alternative to the scenario approximation (Monte Carlo sampling techniques) is an approximation based on analytical upper bounding of the probability that the randomly perturbed constraint is violated. The simplest approximation scheme of this type was proposed in [9]; for a new class of analytical approximations (referred to as Bernstein approximations), see the work by Nemirovski and Shapiro [10]. Another approximation of probabilistic constraints uses the Boole–Bonferroni inequalities; see, for example, [11, 12].
When the stochastic program includes nonlinear terms, or when continuous random variables are explicitly included, a finite-dimensional linear programming deterministic equivalent does not exist. In this case we must resort to nonlinear programming procedures; see, for instance, [3, 13–16].
In previous work [17], the traditional two-stage linear stochastic program was extended by probabilistic constraints imposed in the second stage. In the next section, we present a summary of the assumptions under which the mixed-probability stochastic program is structurally well behaved and stable under perturbation of both probability measures. Moreover, in [17] one can find, under general conditions, first qualitative continuity properties for the expectation of the objective function and the constraint set-valued maps. From these, quantitative stability results for the optimal value function and the solution set under perturbations of the probability measures were deduced.
In the third section, we show two possible applications of this model: the first is a summary of the case of planning and operational management in power generation companies presented in [17], and the other is an application to the problem of air pollution.
2. Some Preliminaries: Basic Well-Posedness
In previous work [17], the following parametric family of mixed-probability stochastic programs P(μ,λ) was introduced:

min { c′x + ∫_{ℝ^s} Q(z − Ax, λ) μ(dz) : x ∈ C },  (μ,λ) ∈ Δ × Λ,

where Q(t,λ) is the optimal value function of the second-stage problem

min { q′y : Wy = t, y ≥ 0, λ(Hj(y)) ≥ pj, j = 1,…,d }
and
Hj, j=1,…,d, are set-valued mappings from ℝm to ℝr with closed graph;
pj, j=1,…,d, are predesigned probability levels;
if 𝒫(ℝs), 𝒫(ℝr) denote the sets of all Borel probability measures on ℝs and ℝr, respectively, we assume that Δ and Λ are subsets of 𝒫(ℝs) and 𝒫(ℝr);
C is a closed subset of ℝm̅.
All remaining vectors and matrices have suitable dimensions.
This model extends the traditional two-stage linear stochastic program by introducing some probabilistic constraints λ(Hj(y))≥pj, j=1,…,d in the second stage of the problem. These types of constraints add nonlinearities to the problem and basic arguments to analyze the well-posedness of P(μ,λ) were studied in [17].
The major difficulty in understanding the structure of P(μ,λ) rests in a dilemma about the function Q.
On the one hand, Q is the optimal-value function of a nonlinear program with parameters t and λ, and parametric optimization mainly provides local results about the structure of Q but global results are very scarce and require specific assumptions that are often hard to verify.
On the other hand, Q arises as an integrand in P(μ,λ). For studying properties of the related integral we require global information about Q.
From this viewpoint, it is not surprising that most of the structural results about two-stage stochastic programs concern the purely linear and the linear mixed integer cases, that is, the widest problem classes where parametric optimization offers broader results about global stability.
To lay a foundation for the structural analysis of Q we formulate the following general assumptions.
Assumption A.1.
For any λ∈Λ there exists a nonempty set ℛλ⊆ℝs and a Lebesgue null set 𝒩λ⊆ℝs such that the function Q(·,λ) is real valued and measurable on ℛλ, and continuous on ℛλ∖𝒩λ.
Assumption A.2.
It holds that
⋃_{μ∈Δ} supp μ ⊆ ⋂_{λ∈Λ} ⋂_{x∈C} {Ax + ℛ_λ},
where suppμ denotes the smallest closed set in ℝs with μ-measure one.
Assumption A.3.
There exists a real-valued, measurable function h on ℝs, called a bounding function, with the following properties. (1) Q-Majorization
It holds that |Q(t,λ)|≤h(t) for all t∈ℛλ and all λ∈Λ.
(2) Integrability
It holds that ∫_{ℝ^s} h(z) μ(dz) < +∞ for all μ ∈ Δ.
(3) Generalized Subadditivity
There exists a κ>0 such that h(t1+t2)≤κ(h(t1)+h(t2)) for all t1,t2∈ℝs.
(4) Local Boundedness
For each t∈ℝs there exists an open neighborhood of t where h is bounded.
The essence of Assumptions A.1–A.3 is the following: since Q(·,λ) is the optimal-value function of a minimization problem, it may well attain the value +∞ if the problem is infeasible and −∞ if the problem is unbounded. Assumption A.1 makes sure that Q(·,λ) is finite on some set ℛλ, and A.2 guarantees that the arguments z − Ax lie in ℛλ for all relevant z and x. Otherwise, Q(z − Ax, λ) would attain infinite values with positive probability, immediately preventing finiteness of the integral

G(x,μ,λ) ≔ ∫_{ℝ^s} Q(z − Ax, λ) μ(dz).
The continuity part of Assumption A.1 together with Assumption A.3 provides a framework for applying dominated convergence to show continuity of G(·,μ,λ).
Introducing the exceptional set 𝒩λ in Assumption A.1 makes sense, since Q(·,λ) often lacks continuity on lower-dimensional subsets of its domain of finiteness.
Furthermore, Assumption A.3 ensures an integrable upper bound for the functions |Q(· − Ax, λ)| when x varies in some neighborhood. Any other set of conditions ensuring this could be used instead.
Clearly, h reflects the global growth of |Q(·,λ)| whose quantitative analysis is acknowledged nontrivial for nonlinear problems.
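To make properties (3) and (4) concrete, here is a quick numerical check, not part of the original analysis, of an illustrative candidate bounding function h(t) = κ₀(1 + ‖t‖), for which generalized subadditivity holds with κ = 1; both the function and the constant are assumptions chosen purely for illustration:

```python
import math
import random

def h(t, kappa0=1.0):
    """Illustrative candidate bounding function h(t) = kappa0 * (1 + ||t||_2).
    (An assumed example, not one prescribed in the text.)"""
    return kappa0 * (1.0 + math.sqrt(sum(ti * ti for ti in t)))

random.seed(0)
kappa = 1.0  # generalized subadditivity constant for this h
for _ in range(1000):
    t1 = [random.uniform(-10, 10) for _ in range(3)]
    t2 = [random.uniform(-10, 10) for _ in range(3)]
    s = [a + b for a, b in zip(t1, t2)]
    # property (3): h(t1 + t2) <= kappa * (h(t1) + h(t2)),
    # which follows from the triangle inequality for the norm
    assert h(s) <= kappa * (h(t1) + h(t2)) + 1e-12
print("generalized subadditivity holds on all sampled pairs")
```

Local boundedness (property (4)) holds trivially for this h, since it is continuous on all of ℝs.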
3. Applications
Motivated by the study of stochastic programming problems coming from planning and operational management in power generation companies, an example was presented in previous work [17] in which a system of power plants is operated over a time horizon. In this setting, the first-stage variable x in the model represents generation capacity investment decisions, such as (continuous) changes of the maximum generation capacity of thermal plants, the variable z is a random demand, and y is the second-stage operational variable representing the level of energy production.
The latter is also limited by emission rights for carbon dioxide that may concern single plants or consortia of plants. The level of permitted emissions is considered random, since emission rights are traded at predesigned markets, for instance via auctions, whose outcomes are uncertain to market participants. This motivates modeling the limitations on the operational variables resulting from emission rights by probabilistic rather than deterministic constraints.
The works [18, 19] present several applications modeling the problem of air pollution; in these papers the authors combine different techniques, including two-stage stochastic programming. We now present, based on these previous works, a variation of these models that includes restrictions of the “chance constraint” type in the second stage of the model, that is, an example in which there are two completely independent probability measures of different natures.
In air quality management systems, there are uncertainties in a variety of pollution-related processes, such as pollutant characteristics, emission rates, and mitigation measures. These uncertainties affect the efforts to model pollutants. On the other hand, because it is economically infeasible and sometimes technically impossible to design processes with zero emissions, decision makers and authorities seek to control emissions to levels at which their effects are minimized. The problem is how to minimize the expected system cost for pollution abatement while satisfying the policy in terms of allowable pollutant-emission levels.
The SO2 generation rates may vary with the type of coal used at the power plants, as well as with the related combustion conditions, which can be expressed as a random variable. As an illustrative example, consider a power system consisting of plants i=1,2,…,I to be operated over a time horizon with subintervals t=1,2,…,T and a set of control methods j=1,2,…,J. The first-stage variables xijt represent the amount of SO2 generated from source i to be mitigated through control measure j in period t under the regulated emission allowance, and cjt is the operating cost of control measure j during period t. The second-stage variables yijt represent the probabilistic excess SO2 from source i to be mitigated through control measure j in period t under SO2 generation rate z(ξ), and djt is the operating and penalty cost for excess SO2 emission during period t. In general, this cost is considered to be much greater than the operating cost of the first-stage variables.
The objective is to minimize the total regular and penalty costs for SO2 abatement:

min ∑_{i=1}^I ∑_{j=1}^J ∑_{t=1}^T cjt xijt + ∑_{i=1}^I ∑_{j=1}^J ∑_{t=1}^T djt E(yijt).
If we denote by zit(ξ) the random variable of SO2 generation rate in source i during period t, the constraints of pollution control demand are
∑_{j=1}^J (xijt + yijt(ξ)) = zit(ξ), ∀ i, t.
Finally, the function H(yt(ξ),ζ) represents the accumulation of SO2 in a particularly sensitive area, such as a city surrounded by emission sources or power plants. It depends, on the one hand, on the excess amount of emissions from each source i, given the control measure j taken in period t, and, on the other hand, on the random variable ζ associated with climatic conditions, which predicts SO2 concentrations in the specific area under different meteorological conditions. We then add the probabilistic limitations on the second-stage variables:

Pr{H(yt(ξ),ζ) ≤ 0} ≥ pt, ∀ t,

where pt are the probability levels with which the limitations are to be met.
4. Numerical Method
In order to give some idea of how two-stage stochastic programs with mixed probabilities can be treated computationally, we study the following stochastic linear programming problem:
min{cTx+E(Q(x,ξ))∣Bx=b,x≥0},
where

Q(x,ξ) = min{ qTy(ξ) ∣ Ax + Wy(ξ) = ξ, y(ξ) ≥ 0 },
s.t.: Pr{H(y(ξ),ζ) ≤ 0} ≥ 1 − p,

and ξ and ζ are independent random variables.
ξ ∈ Ξ denotes the possible realizations of the random variable ξ, supported on Ξ ⊂ ℝs.
𝔼 stands for expectation with respect to the random variable ξ and y(ξ)∈ℝm for each realization ξ.
B∈Ml×n(ℝ), A∈Ms×n(ℝ), and W∈Ms×m(ℝ) are deterministic matrices and the probability level p∈(0,1).
ζ ∈ Θ denotes the possible realizations of the random variable ζ, supported on Θ ⊂ ℝr.
The fundamental idea is to give a convex conservative approximation of the chance-constrained subproblems (4.2). For this, we follow the work of Nemirovski and Shapiro (see [10]) and thus obtain an efficiently solvable deterministic optimization program whose feasible set is contained in that of the chance-constrained subproblem.
Let H : ℝm × Θ → ℝ be defined by

H(y,ζ) = h0(y) + ∑_{j=1}^r ζj hj(y),

where we assume that the functions hj(y), j = 1,2,…,r, are convex, that the components ζj, j = 1,2,…,r, of the random vector ζ are independent of the other random variables, and that the moment generating functions
Mj(t)∶=E[exp(tζj)],j=1,2,…,r
are finite valued for all t∈ℝ and are efficiently computable.
Then, the problem

min{ qTy ∣ Ax + Wy = ξ, y ≥ 0 },
s.t.: inf_{t>0} [ h0(y) + ∑_{j=1}^r t Λj(t^{-1} hj(y)) − t log p ] ≤ 0

is a conservative convex approximation of the chance-constrained subproblem (4.2) for each realization of the random variable ξ (ξ ∈ Ξ ⊂ ℝs), where
Λj(t)=logMj(t).
Note that this approximation (known as the Bernstein approximation) is an explicit convex program with efficiently computable constraints and, as such, is efficiently solvable.
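As a small illustration of this construction, the sketch below assumes Rademacher components ζj (that is, ζj = ±1 with probability 1/2, so Λj(s) = log cosh s) and assumed values of h0(y) and hj(y) at a fixed y; none of this data comes from the paper. It evaluates ω(y) by a one-dimensional search over t and compares against the exact constraint probability, computed by enumeration:

```python
import math
from itertools import product

def logcosh(s):
    """Numerically stable log(cosh(s))."""
    a = abs(s)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)

def bernstein_omega(h0, h, p, t_lo=1e-6, t_hi=100.0, iters=200):
    """omega = inf_{t>0} [ h0 + sum_j t*Lambda_j(h_j/t) - t*log p ], with
    Lambda_j(s) = log cosh(s), the log-MGF of a Rademacher zeta_j.
    The objective is convex in t, so ternary search locates the infimum."""
    def f(t):
        return h0 + sum(t * logcosh(hj / t) for hj in h) - t * math.log(p)
    lo, hi = t_lo, t_hi
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if f(m1) <= f(m2):
            hi = m2
        else:
            lo = m1
    return f((lo + hi) / 2)

# Assumed data: the values h_j(y) at some fixed y (illustrative only).
h0, h, p = -3.0, [0.5, 0.5, 0.5, 0.5], 0.05
omega = bernstein_omega(h0, h, p)

# Exact probability Pr{h0 + sum_j zeta_j h_j(y) <= 0} by enumerating signs.
exact = sum(1 for s in product((-1, 1), repeat=len(h))
            if h0 + sum(sj * hj for sj, hj in zip(s, h)) <= 0) / 2 ** len(h)

print(f"omega(y) = {omega:.4f}, exact probability = {exact:.3f}")
assert omega <= 0 and exact >= 1 - p  # conservative on this instance
```

On this instance ω(y) ≤ 0, and the exact probability indeed satisfies the chance constraint, consistent with conservativeness of the approximation.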
Now we can use Monte Carlo simulation: suppose that we can generate a sample ξ1, ξ2, …, ξN of N replications of the random vector ξ; then we can approximate the expectation function by the average

E(Q(x,ξ)) ≈ (1/N) ∑_{k=1}^N Q(x,ξk),

and consequently we have the sample average approximation method:

min{ cTx + (1/N) ∑_{k=1}^N Q(x,ξk) ∣ Bx = b, x ≥ 0 },
where

Q(x,ξk) = min{ qTy ∣ Ax + Wy = ξk, y ≥ 0 },
s.t.: inf_{t>0} [ h0(y) + ∑_{j=1}^r t Λj(t^{-1} hj(y)) − t log p ] ≤ 0.
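To isolate just the averaging step, the following sketch uses a deliberately simple one-dimensional recourse function with a closed form (the chance constraint is dropped, and all data are assumptions, not from the paper); the sample average then converges to the true expectation:

```python
import random

def Q(x, xi, q=2.0):
    """One-dimensional recourse cost: min{ q*y : y >= xi - x, y >= 0 }.
    A deliberately simple stand-in for the second-stage subproblem;
    the chance constraint is omitted to focus on the averaging step."""
    return q * max(xi - x, 0.0)

random.seed(42)
x = 0.3
N = 200_000
sample = [random.random() for _ in range(N)]      # xi ~ Uniform(0, 1)
saa = sum(Q(x, xi) for xi in sample) / N          # sample average approximation
exact = 2.0 * (1.0 - x) ** 2 / 2.0                # E[q*max(xi-x,0)] = q*(1-x)^2/2
print(f"SAA estimate = {saa:.4f}, exact expectation = {exact:.4f}")
assert abs(saa - exact) < 0.02
```

For this uniform ξ the error behaves like O(N^{-1/2}), which motivates the quasi-Monte Carlo improvements mentioned in Remark 4.1.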
If we denote

ω(y) = inf_{t>0} [ h0(y) + ∑_{j=1}^r t Λj(t^{-1} hj(y)) − t log p ],

we have that ω(yk) ≤ 0 is a convex and conservative constraint for each k = 1,2,…,N, in the sense that if for
yk∈{y∈Rm∣Ax+Wy=ξk,y≥0}
it holds that ω(yk)≤0, then
Pr{H(yk,ζ)≤0}≥1-p
or equivalently
Pr{h0(yk)+∑j=1rζjhj(yk)≤0}≥1-p
and we obtain the following deterministic problem with nonlinear constraints:

min_{x,y1,…,yN} c⊤x + (1/N) ∑_{k=1}^N q⊤yk,
s.t. Bx = b,
Ax + Wyk = ξk, ∀ k = 1,2,…,N,
ω(yk) ≤ 0, ∀ k = 1,2,…,N,
x ≥ 0, yk ≥ 0, ∀ k = 1,2,…,N.
Remark 4.1.
Theoretical studies and numerical experiments have demonstrated that quasi-Monte Carlo techniques can significantly improve the accuracy of the sample average approximation problem; for a general discussion of quasi-Monte Carlo methods, see the works by Niederreiter [20, 21]. Moreover, problem (4.5) is not the only way to arrive at a conservative convex approximation of the chance-constrained problems; we can also use the convex approximation obtained via Conditional Value at Risk (see [10] and the work by Rockafellar and Uryasev [22]). However, our aim in this paper is focused on presenting a numerical methodology for tackling this type of model.
Denote by X = {x ∈ ℝn ∣ Bx = b, x ≥ 0} and Y = {y ∈ ℝm ∣ ω(y) ≤ 0, y ≥ 0} the convex subsets of the feasible set of problem (4.14) that do not depend on the sample generated by the random vector ξ; then problem (4.14) can be rewritten as

min c⊤x + (1/N) ∑_{k=1}^N q⊤yk,
s.t. Wyk = ξk − Ax, k = 1,2,…,N,
x ∈ X, yk ∈ Y, k = 1,2,…,N,
and then we can take advantage of separability. If we denote

v(uk) = min{ q⊤y ∣ Wy = uk, y ∈ Y },

where uk = ξk − Ax for all k = 1,2,…,N, we have

v(ξk − Ax) = max{ r(λ) − λ⊤(ξk − Ax) ∣ λ ∈ ℝs },

for all k = 1,2,…,N, and

r(λ) = inf{ q⊤y + λ⊤Wy ∣ y ∈ Y }.
Note that r(λ) − λ⊤(ξk − Ax) is the dual function corresponding to v, and the master problem

min c⊤x + (1/N) ∑_{k=1}^N v(ξk − Ax), x ∈ X
can be solved using a differentiable descent method if r(λ) = inf{(q + W⊤λ)⊤y ∣ y ∈ Y} is a strictly concave function over the set {λ ∣ r(λ) > −∞}. However, this last assumption is very restrictive; in fact, in our specific case it is not satisfied because the objective function is linear, so we must study under what conditions the gradient of the value function v(u) can be computed explicitly.
Since the master problem (M.P.) has linear constraints, it can be solved using the Frank–Wolfe method, for which we only need the gradient of v(ξk − Ax); but

∇_x v(ξk − Ax) = −A⊤ ∇_{uk} v(uk) = A⊤ λk*
for each k = 1,…,N, where λk* is the Lagrange multiplier associated with the linear constraint at the optimal solution of subproblem (Pk). Therefore, our problem now is how to find the value of this multiplier λk*.
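The multiplier-gradient relation above can be sanity-checked, under the dual convention used here (r(λ) = inf{q⊤y + λ⊤Wy}), on an assumed one-constraint instance; W = [1 1] and q = (2, 5) below are illustrative choices, not data from the paper:

```python
# Tiny assumed instance: W = [1, 1], q = (2, 5), so for u >= 0
#   v(u) = min{ q.y : y1 + y2 = u, y >= 0 } = min(q1, q2) * u.
q = (2.0, 5.0)

def v(u):
    # with a single equality constraint the optimum sits at a vertex:
    # y = (u, 0) or y = (0, u)
    return min(q[0] * u, q[1] * u)

# Dual convention of the text: v(u) = max{ r(lam) - lam*u }, where
# r(lam) = inf{ (q + W^T lam)^T y : y >= 0 } = 0 iff q_i + lam >= 0 for all i.
lam_star = -min(q)  # maximizer of -lam*u over {lam >= -min(q)}, for u > 0

# Check nabla_u v(u) = -lam_star by central finite differences,
# which gives nabla_x v(xi - A x) = A^T lam_star as stated in the text.
u, eps = 3.0, 1e-6
fd = (v(u + eps) - v(u - eps)) / (2 * eps)
print(f"finite-difference dv/du = {fd:.6f}, -lam* = {-lam_star}")
assert abs(fd - (-lam_star)) < 1e-6
```

The finite-difference slope matches −λ*, illustrating why the multiplier of the linear constraint is exactly the quantity needed by the Frank–Wolfe iteration on the master problem.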
5. Normal Distribution
In this section we investigate the case in which the random vector ζ = (ζ1,ζ2,…,ζr)⊤, supported on Θ ⊂ ℝr, has all its components normally distributed.
Let us suppose that ζj ~ N(μj, σj²), j = 1,2,…,r; then the moment generating function and its logarithm are

Mj(t) = exp(μj t + σj² t² / 2),  Λj(t) = log Mj(t) = μj t + σj² t² / 2
for each j=1,2,…,r.
Proposition 5.1.
The Bernstein Approximation of the chance constrained subproblems is given by
ω(y) = h0(y) + ∑_{j=1}^r μj hj(y) + √( −2 log p · ∑_{j=1}^r σj² hj²(y) ).
Proof.
As we saw before, the Bernstein Approximation is a conservative convex approximation of the chance constraints defined as
ω(y) = inf_{t>0} [ h0(y) + ∑_{j=1}^r t Λj(t^{-1} hj(y)) − t log p ]
and substituting the expression given in (5.2) in the above relationship, we obtain
ω(y) = inf_{t>0} { h0(y) + ∑_{j=1}^r t [ μj hj(y)/t + σj² hj²(y)/(2t²) ] − t log p } = inf_{t>0} { h0(y) + ∑_{j=1}^r μj hj(y) + (1/(2t)) ∑_{j=1}^r σj² hj²(y) − t log p }
and then
ω(y) = h0(y) + ∑_{j=1}^r μj hj(y) + inf_{t>0} { (1/(2t)) ∑_{j=1}^r σj² hj²(y) − t log p }.
Let us denote by f(t) = a/(2t) − t log p the auxiliary function. It is easy to see that the stationary point t̂ = √(−a/(2 log p)) is a global minimum of f; therefore, from equation (5.6) we conclude, after some calculations, that

ω(y) = h0(y) + ∑_{j=1}^r μj hj(y) + √(−2a log p)
and finally, substituting a = ∑_{j=1}^r σj² hj²(y), we have

ω(y) = h0(y) + ∑_{j=1}^r μj hj(y) + √( −2 log p · ∑_{j=1}^r σj² hj²(y) ).
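The closed form of Proposition 5.1 can be checked against a direct numerical minimization of f over t > 0; the values of h0, hj, μj, σj, and p below are illustrative assumptions, not data from the paper:

```python
import math

def omega_closed(h0, h, mu, sigma, p):
    """Closed-form Bernstein approximation for normal zeta_j ~ N(mu_j, sigma_j^2):
    omega = h0 + sum_j mu_j*h_j + sqrt(-2*log(p) * sum_j sigma_j^2*h_j^2)."""
    a = sum(s * s * hj * hj for s, hj in zip(sigma, h))
    return h0 + sum(m * hj for m, hj in zip(mu, h)) + math.sqrt(-2.0 * math.log(p) * a)

def omega_numeric(h0, h, mu, sigma, p, iters=300):
    """Direct minimization of f(t) = base + a/(2t) - t*log(p) over t > 0
    by ternary search (f is convex for t > 0)."""
    a = sum(s * s * hj * hj for s, hj in zip(sigma, h))
    base = h0 + sum(m * hj for m, hj in zip(mu, h))
    def f(t):
        return base + a / (2.0 * t) - t * math.log(p)
    lo, hi = 1e-9, 1e3
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if f(m1) <= f(m2):
            hi = m2
        else:
            lo = m1
    return f((lo + hi) / 2)

# Illustrative data (assumed, not from the paper)
h0, h = -10.0, [1.0, 2.0]
mu, sigma, p = [0.5, -0.2], [1.0, 0.5], 0.05

c = omega_closed(h0, h, mu, sigma, p)
n = omega_numeric(h0, h, mu, sigma, p)
print(f"closed form = {c:.6f}, numerical infimum = {n:.6f}")
assert abs(c - n) < 1e-4
```

The two values agree, confirming that the infimum over t has been carried out correctly in the proof.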
Proposition 5.2.
Let
y¯k∈argmin{q⊤y∣Wy=uk,y≥0}
and suppose that ω(y¯k)>0. Then, there is
yk*∈argmin{q⊤y∣Wy=uk,y≥0,ω(y)≤0}
such that ω(yk*)=0.
Proof.
The existence of yk* depends only on whether the feasible set

S = {y ∈ ℝm ∣ Wy = uk, y ≥ 0} ∩ {y ∈ ℝm ∣ ω(y) ≤ 0}

is nonempty. On the other hand, ω(y) is a convex function, so both sets above are convex; hence the segment

yk(α) = α yk* + (1−α) y¯k, α ∈ [0,1],

lies in the polyhedron {y ∈ ℝm ∣ Wy = uk, y ≥ 0}, and by continuity of ω, since ω(y¯k) > 0 ≥ ω(yk*), there is α¯ ∈ [0,1] such that ω(yk(α¯)) = 0.
Denote θ(α) = q⊤yk(α); then θ′(α) = q⊤(yk* − y¯k) ≥ 0 for all α ∈ [0,1], because q⊤yk* ≥ q⊤y¯k. This implies that θ(α) is monotone nondecreasing on [0,1], and therefore
θ(0)≤θ(α¯)≤θ(1)
and then we have q⊤yk(α¯)≤q⊤yk* and
yk(α¯)∈argmin{q⊤y∣Wy=uk,y≥0,ω(y)≤0},
where ω(yk(α¯)) = 0. Finally, it suffices to set yk* = yk(α¯).
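The α¯ whose existence the proof establishes can be located by bisection on the convex function α ↦ ω(yk(α)); the sketch below uses an assumed simple convex ω and assumed endpoints in place of the Bernstein function and the actual iterates:

```python
def find_alpha(omega, y_bar, y_star, tol=1e-10):
    """Bisection for alpha_bar with omega(y(alpha_bar)) = 0, where
    y(alpha) = alpha*y_star + (1-alpha)*y_bar and
    omega(y_bar) > 0 >= omega(y_star)."""
    def y(alpha):
        return [alpha * a + (1 - alpha) * b for a, b in zip(y_star, y_bar)]
    lo, hi = 0.0, 1.0  # omega(y(lo)) > 0, omega(y(hi)) <= 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if omega(y(mid)) > 0:
            lo = mid
        else:
            hi = mid
    return hi, y(hi)

# Assumed convex omega (illustrative, not the Bernstein one):
omega = lambda y: y[0] ** 2 + y[1] ** 2 - 1.0
alpha_bar, y_alpha = find_alpha(omega, y_bar=[2.0, 0.0], y_star=[0.5, 0.0])
print(f"alpha_bar = {alpha_bar:.6f}, omega(y(alpha_bar)) = {omega(y_alpha):.2e}")
assert abs(omega(y_alpha)) < 1e-8
```

Here the segment crosses the boundary {ω = 0} at y1 = 1, so bisection returns α¯ = 2/3, the point at which the chance-constraint boundary is exactly active.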
Now we analyze the two possible cases for each k = 1,…,N. Let

y¯k ∈ argmin{ q⊤y ∣ Wy = uk, y ≥ 0 }.
Case 1.
If ω(y¯k) ≤ 0, then we set yk* = y¯k, and λk* is the Lagrange multiplier associated with the linear constraint Wy = uk.
Case 2.
If ω(y¯k) > 0, then, using the result of the previous proposition, we have to find the solution of the penalized problem

min q⊤y + Ck ω²(y), s.t. Wy = uk, y ≥ 0,

for a sufficiently large penalty parameter Ck. To solve this problem, we can again apply the iterative Frank–Wolfe method, in which each iteration solves a linear problem; then we have
yk(j+1)=yk(j)+γj(y¯k(j)-yk(j)),
where γj is chosen by the limited minimization rule or the Armijo rule,
y¯k(j)∈argmin{(q+2Ckω(yk(j))∇ω(yk(j)))⊤(y-yk(j))∣Wy=uk,y≥0}
and, if we denote by λk(j) the Lagrange multiplier associated with the linear constraint Wy = uk in (5.18), and let yk* be an accumulation point of the sequence {y¯k(j)}, that is, there is a subsequence {y¯k(j)}j∈𝒥 converging to yk*, then we define λk* as the corresponding limit point of the subsequence {λk(j)}j∈𝒥.
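A minimal sketch of Case 2, under strong simplifying assumptions: a tiny instance whose feasible polytope's vertices are known explicitly (so the Frank–Wolfe linear subproblem reduces to vertex enumeration), an assumed affine ω rather than the Bernstein function, and the limited minimization step rule implemented by a one-dimensional search; q, Ck, W, and uk below are all illustrative:

```python
def frank_wolfe_penalized(q, C, vertices, y0, iters=60):
    """Frank-Wolfe with the limited minimization step rule for
        min q^T y + C * omega(y)^2   over conv(vertices),
    with an assumed affine omega(y) = y1 - 0.4 (not the Bernstein one).
    The linear subproblem over the polytope is solved by checking vertices."""
    omega = lambda y: y[0] - 0.4
    grad_omega = [1.0, 0.0]

    def obj(y):
        return sum(qi * yi for qi, yi in zip(q, y)) + C * omega(y) ** 2

    y = list(y0)
    for _ in range(iters):
        w = 2.0 * C * omega(y)
        g = [qi + w * gi for qi, gi in zip(q, grad_omega)]  # objective gradient
        # linear subproblem: vertex minimizing g^T v
        v = min(vertices, key=lambda v: sum(gi * vi for gi, vi in zip(g, v)))
        # limited minimization rule: 1-D search for the step size gamma in [0,1]
        lo, hi = 0.0, 1.0
        for _ in range(100):
            m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
            p1 = [yi + m1 * (vi - yi) for yi, vi in zip(y, v)]
            p2 = [yi + m2 * (vi - yi) for yi, vi in zip(y, v)]
            if obj(p1) <= obj(p2):
                hi = m2
            else:
                lo = m1
        gamma = (lo + hi) / 2
        y = [yi + gamma * (vi - yi) for yi, vi in zip(y, v)]
    return y

# Feasible set {y >= 0, y1 + y2 = 1}: a segment with vertices (1,0) and (0,1).
y = frank_wolfe_penalized(q=(1.0, 3.0), C=100.0,
                          vertices=[(1.0, 0.0), (0.0, 1.0)], y0=[0.5, 0.5])
print(f"y = ({y[0]:.4f}, {y[1]:.4f})")
# analytic minimizer of 3 - 2*y1 + 100*(y1 - 0.4)^2 on [0,1] is y1 = 0.41
assert abs(y[0] - 0.41) < 1e-3
```

The iterate approaches the penalized optimum y1 = 0.4 + 1/Ck, which tends to the constraint boundary ω(y) = 0 as the penalty parameter Ck grows, in line with Proposition 5.2.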
6. Conclusions
In this paper we present a strategy, or methodology, for numerically solving two-stage stochastic linear programs in which chance constraints are included in the second stage. It treats the two probability measures involved in the problem differently. Since the major difficulty of the problem lies in the second stage, we chose to assume that a sample of replications of the random vector appearing in the expected value function of the objective is available and to approximate the expectation by the sample average. For the chance constraints defined in the second stage, the main idea was to obtain a convex conservative approximation, leading to an efficiently solvable deterministic nonlinear optimization program for each scenario in the sample. Since the number of replications, or sample size, is generally very large, and a nonlinear optimization problem must be solved for each one, a decomposition method for the general deterministic problem was proposed. Although the problem looks computationally unwieldy, for the special case in which the random vector of the second-stage probabilistic constraint has all its components normally distributed, an explicit Bernstein approximation function was obtained, and we showed how each nonlinear optimization problem can be solved separately.
Acknowledgments
The author wishes to thank the referees for their careful reading and constructive remarks. This work has been supported by CONICYT (Chile) under FONDECYT Grant 1090063.
References
[1] J. R. Birge and F. Louveaux, Introduction to Stochastic Programming, Springer Series in Operations Research, Springer, New York, NY, USA, 1997.
[2] J. Dupačová, “Stability and sensitivity analysis for stochastic programming,” Annals of Operations Research, vol. 27, no. 1–4, pp. 115–142, 1990.
[3] R. Wets, “Stochastic programming: solution techniques and approximation schemes,” in A. Bachem, M. Grötschel, and B. Korte (eds.), Mathematical Programming: The State of the Art, Springer, Berlin, Germany, pp. 566–603, 1983.
[4] R. J.-B. Wets, “Stochastic programming,” in G. L. Nemhauser, A. H. G. Rinnooy Kan, and M. J. Todd (eds.), Handbooks in Operations Research and Management Science, vol. 1, North-Holland, Amsterdam, The Netherlands, pp. 573–629, 1989.
[5] R. J.-B. Wets, “Challenges in stochastic programming,” Mathematical Programming, vol. 75, no. 2, pp. 115–135, 1996.
[6] A. Ruszczyński and A. Shapiro (eds.), Stochastic Programming, Handbooks in Operations Research and Management Science, vol. 10, Elsevier, Amsterdam, The Netherlands, 2003.
[7] L. B. Miller and H. Wagner, “Chance-constrained programming with joint constraints,” Operations Research, vol. 13, pp. 930–945, 1965.
[8] A. Prékopa, “On probabilistic constrained programming,” in Proceedings of the Princeton Symposium on Mathematical Programming (Princeton Univ., 1967), Princeton University Press, Princeton, NJ, USA, pp. 113–138, 1970.
[9] A. Ben-Tal and A. Nemirovski, “Robust solutions of linear programming problems contaminated with uncertain data,” Mathematical Programming, vol. 88, pp. 411–424, 2000.
[10] A. Nemirovski and A. Shapiro, “Convex approximations of chance constrained programs,” SIAM Journal on Optimization, vol. 17, no. 4, pp. 969–996, 2006.
[11] A. I. Kibzun and Y. S. Kan, Stochastic Programming Problems with Probability and Quantile Functions, Wiley-Interscience Series in Systems and Optimization, Wiley, 1996.
[12] A. Prékopa, “Boole-Bonferroni inequalities and linear programming,” Operations Research, vol. 36, no. 1, pp. 145–162, 1988.
[13] J. R. Birge and L. Q. Qi, “Subdifferential convergence in stochastic programs,” SIAM Journal on Optimization, vol. 5, no. 2, pp. 436–453, 1995.
[14] J. Birge and M. Teboulle, “Upper bounds on the expected value of a convex function using gradient and conjugate function information,” Mathematics of Operations Research, vol. 14, no. 4, pp. 745–759, 1989.
[15] Y. Ermoliev and R. J.-B. Wets (eds.), Numerical Techniques for Stochastic Optimization, Springer Series in Computational Mathematics, vol. 10, Springer, Berlin, Germany, 1988.
[16] S. Uriasiev, “Adaptive stochastic quasigradient methods,” in Numerical Techniques for Stochastic Optimization, Springer, Berlin, Germany, pp. 373–384, 1988.
[17] P. Bosch, A. Jofré, and R. Schultz, “Two-stage stochastic programs with mixed probabilities,” SIAM Journal on Optimization, vol. 18, no. 3, pp. 778–788, 2007.
[18] Y. Li, G. H. Huang, A. Veawab, X. Nie, and L. Liu, “Two-stage fuzzy-stochastic robust programming: a hybrid model for regional air quality management,” Journal of the Air & Waste Management Association, vol. 56, no. 8, pp. 1070–1082, 2006.
[19] K. Saenchai, L. Benedicenti, and G. H. Huang, “A mixed-integer two-stage interval stochastic programming model for regional air quality management,” vol. 5, pp. 168–176, 2007.
[20] H. Niederreiter, Random Number Generation and Quasi-Monte Carlo Methods, CBMS-NSF Regional Conference Series in Applied Mathematics, vol. 63, SIAM, Philadelphia, PA, USA, 1992.
[21] H. Niederreiter, “Quasi-Monte Carlo methods for multidimensional numerical integration,” in H. Braß and G. Hämmerlin (eds.), International Series of Numerical Mathematics, vol. 85, Birkhäuser, Basel, Switzerland, pp. 157–171, 1988.
[22] R. T. Rockafellar and S. P. Uryasev, “Optimization of conditional value-at-risk,” Journal of Risk, vol. 2, pp. 21–41, 2000.