The Flattened Aggregate Constraint Homotopy Method for Nonlinear Programming Problems with Many Nonlinear Constraints

Let $\Omega = \{x \in \mathbb{R}^n \mid g(x) \le 0\}$ denote the feasible set of (1), $\Omega^0 = \{x \in \mathbb{R}^n \mid g(x) < 0\}$ the interior of $\Omega$, and $\partial\Omega = \Omega \setminus \Omega^0$ the boundary of $\Omega$. The symbols $\mathbb{R}^m_+$ and $\mathbb{R}^m_{++}$ denote the nonnegative and positive quadrants of $\mathbb{R}^m$, respectively. The active index set at $x \in \Omega$ is denoted by $I(x) = \{i \in \{1, \dots, m\} \mid g_i(x) = 0\}$. For a function $F \colon \mathbb{R}^n \to \mathbb{R}^m$, $\nabla F(x)$ is the $n \times m$ matrix whose $(i,j)$th element is $\partial F_j(x)/\partial x_i$; $F^{-1}(Y) = \{x \mid F(x) \in Y\}$ is the set-valued inverse for a set $Y \subseteq \mathbb{R}^m$.

Definition 1 (see [10]). The set $\Omega$ satisfies the normal cone condition if and only if the normal cone of $\Omega$ at any $x \in \partial\Omega$ does not meet $\Omega$ except at $x$ itself; that is,
\[
\Bigl\{\, x + \sum_{i \in I(x)} y_i \nabla g_i(x) \;\Bigm|\; y_i \ge 0 \,\Bigr\} \cap \Omega = \{x\} \quad \text{for any } x \in \partial\Omega.
\]

Since the mid-1980s, much attention has been paid to interior point methods for mathematical programming, and many results on theory, algorithms, and applications have been obtained for linear programming, convex programming, complementarity problems, semidefinite programming, and linear cone programming (see the monographs [1-6] and references therein). For nonlinear programming, the typical algorithms are Newton-type methods applied to the perturbed first-order necessary conditions, combined with line search or trust region strategies based on a proper merit function (e.g., [7-9]). The standard conditions for global convergence of these methods require that the feasible set be bounded and that the Jacobian matrix be uniformly nonsingular. Another typical class of globally convergent methods for nonlinear programming consists of probability-one homotopy methods (e.g., [10-12]), whose global convergence can be established under weaker conditions than those for Newton-type methods. Their salient feature is that, unlike line search or trust region methods, they do not depend on the descent of a merit function and so are insensitive to local minima of the merit function, near which no search direction is a descent direction.
In [10, 11], Feng et al. proposed a homotopy method for the nonlinear programming problem (1), called the combined homotopy interior point (CHIP) method; its global convergence was proven under the normal cone condition (see Definition 1) on the feasible set together with some common conditions. On the basis of the CHIP method, modified CHIP methods were presented in [13, 14], with global convergence established under the quasi-normal cone condition and the pseudocone condition on the feasible set, respectively. In [12], Watson described several probability-one homotopy methods for unconstrained and inequality-constrained optimization, whose global convergence was established under rather weak assumptions. Recently, Yu and Shang proposed a constraint shifting combined homotopy method in [15, 16], in which not only the objective function but also the constraint functions are regularly deformed. Its global convergence was proven under the condition that the initial feasible set, which deforms into the feasible set of (1) as the homotopy parameter goes from 1 to 0 and need not coincide with it, satisfies the normal cone condition.
Let $g_{\max}(x) = \max_{1 \le i \le m} g_i(x)$; then (1) is equivalent to
\[
\min f(x) \quad \text{s.t.} \quad g_{\max}(x) \le 0,
\]
which has only one, but nonsmooth, constraint. In [17], the following aggregate function, a smooth approximation of $g_{\max}(x)$ with a smoothing parameter $t > 0$ induced from the max-entropy theory, was introduced:
\[
\hat{g}(x, t) = t \ln \sum_{i=1}^{m} \exp\Bigl(\frac{g_i(x)}{t}\Bigr).
\]
It is also known as the exponential penalty function (see [18]). By applying it to all constraint functions of problem (1), an aggregate constraint homotopy (abbreviated ACH) method was presented by Yu et al. in [19], whose global convergence was obtained under the condition that the feasible set satisfies the weak normal cone condition. Although the ACH method has only one smooth constraint, the gradient and Hessian of the aggregate constraint function are complicated combinations of the gradients and Hessians of all constraint functions and, hence, are expensive to calculate when $m$ is very large. Throughout this paper, we assume that the nonlinear programming problem (1) possesses a very large number of nonlinear constraints but a small number of variables, and that the objective and constraint functions are not sparse. For such a problem, the number of constraints can be so large that computing the gradients and Hessians of all constraint functions is very expensive and the results cannot be stored in memory; hence, general numerical methods for nonlinear programming are not efficient. Although active set methods need to calculate the gradients and Hessians of only a part of the constraint functions, require less storage, and solve faster, the working set is difficult to estimate without knowing the internal structure of the problem.
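To make the cost structure concrete, the following NumPy sketch (ours, not from [17] or [19]) evaluates the aggregate function and its gradient in a numerically stable way; the gradient is a convex combination of all $m$ constraint gradients, which is exactly the expense the flattened variant introduced below avoids.

```python
import numpy as np

def aggregate(g, t):
    """Aggregate function  t * ln(sum_i exp(g_i / t)).

    Factoring out max(g) keeps the exponentials from overflowing
    when the smoothing parameter t is small.
    """
    gmax = np.max(g)
    return gmax + t * np.log(np.sum(np.exp((g - gmax) / t)))

def aggregate_grad(g, J, t):
    """Gradient of the aggregate function at x.

    g : (m,) vector of constraint values g_i(x)
    J : (m, n) matrix whose rows are the gradients of the g_i at x
    The result is the convex combination sum_i w_i * grad g_i(x),
    so all m constraint gradients enter the computation.
    """
    gmax = np.max(g)
    w = np.exp((g - gmax) / t)
    w /= w.sum()
    return w @ J
```

As $t \downarrow 0$, the value tends to $g_{\max}(x)$ and the weights concentrate on the active constraints.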
In this paper, we present a new homotopy method, called the flattened aggregate constraint homotopy (abbreviated FACH) method, for nonlinear programming (1), based on a new smoothing technique in which only a part of the constraint functions is aggregated. Under the normal cone condition on the feasible set and some other general assumptions, we prove that the FACH method determines a smooth homotopy path from a given interior point of the feasible set to a KKT point of (1); preliminary numerical results demonstrate its efficiency.
The rest of this paper is organized as follows. We conclude this section with some notation, definitions, and a lemma. The flattened aggregate constraint function and some of its properties are given in Section 2. The homotopy map and the existence and convergence of a smooth homotopy path, with proofs, are given in Section 3. A numerical procedure for tracking the smooth homotopy path and numerical test results with some remarks are given in Section 4. Finally, we conclude the paper with some remarks in Section 5.
When discussing scalars and scalar-valued functions, subscripts refer to the iteration step so that superscripts can be used for exponentiation. In contrast, for vectors and vector-valued functions, subscripts indicate components, whereas superscripts indicate the iteration step. The identity matrix is denoted by $I$.

The Flattened Aggregate Constraint Function
In this section, we suppose that the following assumptions hold.
Assumption 5. $\Omega^0$ is nonempty and $\Omega$ is bounded.

Assumption 6. For any $x \in \Omega$, $\{\nabla g_i(x) \mid i \in I(x)\}$ are positive independent; that is,
\[
\sum_{i \in I(x)} y_i \nabla g_i(x) = 0,\; y_i \ge 0,\; i \in I(x) \;\Longrightarrow\; y_i = 0,\; i \in I(x).
\]

Assumption 7. $\Omega$ satisfies the normal cone condition of Definition 1.

The first assumption combines Slater's condition with the boundedness of the feasible set, two basic conditions. The second assumption provides the regularity of the constraints and is weaker than the linear independence constraint qualification. The last assumption, the normal cone condition on the feasible set, is a generalization of convexity of the feasible set: indeed, if $\Omega$ is convex, then it satisfies the normal cone condition. Some simple nonconvex sets satisfying the normal cone condition are shown in [10].
(1) Since the flattening function is nonnegative, the left inequality can be obtained; since it is bounded above by 1, the right inequality follows from Proposition 2.3 in [19].
where the first equality comes from the fact that $y^*_i = 0$ for $i \notin I(x^*)$. By Assumption 6 and (34), $\{y^k\}_{k=1}^{\infty}$ is a bounded sequence, so there exists a subsequence converging to some $\bar{y} > 0$; then, by (33), we obtain a relation that contradicts Assumption 7.
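Since the definition (10) of the flattened aggregate constraint function is only referenced in this extract, the following sketch illustrates one plausible flattening construction rather than the authors' exact function: constraint values below a threshold are smoothly clamped to a constant, so their derivatives vanish and only near-active constraints contribute gradients and Hessians. The names `flatten` and `flattened_aggregate`, the cutoff interval, and the cubic blend are our illustrative assumptions (a $C^1$ blend for brevity; the paper requires a smoother, higher-degree polynomial blend).

```python
import numpy as np

# Illustrative thresholds; 0.05 and 0.5e-5 are the values the paper
# reports fixing in its tests (there called c1 and c2).
C1, C2 = 0.05, 0.5e-5

def flatten(s):
    """Hypothetical flattening of a single constraint value s.

    Values above -C1 pass through unchanged; values below -(C1 + C2)
    are clamped to the constant -(C1 + C2); in between, a cubic
    Hermite blend joins the two pieces with C^1 smoothness.
    """
    lo, hi = -(C1 + C2), -C1
    if s >= hi:
        return s
    if s <= lo:
        return lo
    u = (s - lo) / (hi - lo)        # u in (0, 1) on the blending interval
    h = u * u * (3.0 - 2.0 * u)     # smoothstep: h(0)=0, h(1)=1, h'(0)=h'(1)=0
    return lo + (s - lo) * h

def flattened_aggregate(g, t):
    """Aggregate of flattened constraint values (stable log-sum-exp).

    Far-from-active constraints all map to the same constant, so their
    derivative contributions vanish identically.
    """
    p = np.array([flatten(si) for si in g])
    pmax = p.max()
    return pmax + t * np.log(np.exp((p - pmax) / t).sum())
```

With this clamping, the contribution of a flattened constraint to the gradient is proportional to the derivative of the clamp and hence identically zero, so its gradient and Hessian never need to be evaluated.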

The Flattened Aggregate Constraint Homotopy Method
In [19], the following aggregate constraint homotopy was introduced, where $\hat{g}(x, t) = t \ln \sum_{i=1}^{m} \exp(g_i(x)/t)$ with $t \in (0, 1]$, $y$ is the Lagrange multiplier of the aggregate constraint function $\hat{g}(x, t)$, and $(x^0, y^0)$ is the starting point in $\Omega^0 \times \mathbb{R}_{++}$. Under the weak normal cone condition, which is similar to the normal cone condition, it was proved that the ACH determines a smooth interior path from a given interior point to a KKT point. A predictor-corrector procedure can then be applied to trace the homotopy path from $(x^0, y^0, 1)$ to $(x^*, y^*, 0)$, in which $(x^*, y^*)$ is a KKT point of (1).
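The displayed ACH map did not survive extraction here. A plausible form, following the CHIP structure of Feng et al. [10, 11] with the aggregate constraint in place of the individual constraints (our reconstruction, not necessarily the paper's exact equation), is
\[
H(w, w^{0}, \lambda) =
\begin{pmatrix}
(1-\lambda)\bigl(\nabla f(x) + y\,\nabla_{x}\hat{g}(x,\lambda)\bigr) + \lambda\,(x - x^{0})\\
y\,\hat{g}(x,\lambda) - \lambda\, y^{0}\,\hat{g}(x^{0},1)
\end{pmatrix},
\qquad w = (x, y).
\]
At $\lambda = 1$ its zero is the starting point $(x^{0}, y^{0})$, and at $\lambda = 0$ it reduces to the KKT system of the aggregated problem.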

The Flattened Aggregate Constraint Homotopy.
Using the flattened aggregate constraint function $\hat{g}(x, t)$ defined in (10), we construct the flattened aggregate constraint homotopy (39), in which $y$ is the Lagrange multiplier of the flattened aggregate constraint and the starting point $(x^0, y^0)$ can be randomly chosen from $\Omega^0 \times \mathbb{R}_{++}$.
We give the main theorem on the existence and convergence of a smooth path from $(x^0, y^0, 1)$ to $(x^*, y^*, 0)$, in which $(x^*, y^*)$ is a solution of the KKT system of (1); from it, the global convergence of our proposal, namely the FACH method, follows.
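For reference, the KKT system of (1) mentioned in the theorem is the standard one for inequality-constrained problems:
\[
\nabla f(x) + \sum_{i=1}^{m} y_{i}\,\nabla g_{i}(x) = 0, \qquad
y_{i}\, g_{i}(x) = 0, \quad g_{i}(x) \le 0, \quad y_{i} \ge 0, \quad i = 1, \dots, m.
\]
The theorem asserts that the limit point $(x^{*}, y^{*})$ of the homotopy path solves this system.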

The Modified Flattened Aggregate Constraint Homotopy.
From Proposition 8(3), we know that the term appearing in (51) tends to 0 as $\lambda \downarrow 0$ for any $x \in \Omega$; hence, we can use the following modified flattened aggregate constraint homotopy (MFACH) in place of the FACH.

Remarks. (i) Since this term is dropped in (51), the expressions of the homotopy map (50) and its Jacobian, and hence the corresponding code, become simpler. Moreover, the computation of $H$ in (50) and of its Jacobian matrix $DH$ is somewhat cheaper than for the FACH method.
(ii) To guarantee that the homotopy map in (39) is a $C^2$ map, the flattening function must be $C^3$; hence, if it is chosen as a polynomial, its total degree should be seven. In contrast, for the homotopy map in (50), the flattening function need only be $C^2$; see (53).

(iii) The existence and convergence of the homotopy path defined by the MFACH can be proven in a similar way to that of the FACH.
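To see where the total degree seven in Remark (ii) comes from (a standard count, our illustration): a polynomial on $[0,1]$ matching prescribed endpoint values together with $k$ vanishing derivatives at both endpoints must satisfy $2(k+1)$ conditions and hence has degree $2k+1$; for $C^3$ contact, $k = 3$ gives degree seven. Explicitly,
\[
h(u) = 35u^{4} - 84u^{5} + 70u^{6} - 20u^{7},
\qquad h'(u) = 140\,u^{3}(1-u)^{3},
\]
so $h(0) = 0$, $h(1) = 1$, and $h^{(j)}(0) = h^{(j)}(1) = 0$ for $j = 1, 2, 3$.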

The FACH-S-N Procedure.
In this section, we give a numerical procedure, the FACH-S-N procedure, to trace the flattened aggregate constraint homotopy path by secant predictor and Newton corrector steps. It consists of three main stages: the predictor step, the corrector step, and the end game. The predictor step is an approximate step along the homotopy path: it uses a predictor direction $d^k$ and a steplength $h_k$ to get a predictor point. The first predictor direction is the tangent direction; subsequent ones are secant directions. The steplength is determined by several parameters. It is capped at 1, which ensures that the predictor point stays close to the homotopy path and hence within the convergence domain of Newton's method in the corrector step. If the angle between the predictor direction and the previous one exceeds $\pi/4$, the steplength is decreased to avoid using the opposite direction as the predictor direction. If the corrector criteria are satisfied within at most four Newton iterations three times in succession, the steplength is increased or kept unchanged; otherwise, it is decreased.
At each predictor and corrector step, the feasibility of the predictor point and of the corrector point needs to be checked. If the homotopy parameter falls below its threshold in the predictor step or becomes negative in the corrector step, a damping step is used to get a new point $(w^{(k,0)}, 0)$. Then, if $w^{(k,0)}$ is feasible, the end game, a more efficient strategy than predictor-corrector steps when the homotopy parameter is close to 0, is invoked: starting with $w^{(k,0)}$, Newton's method is used to solve the end-game system, where $\varepsilon$ is a small positive constant. In other situations, the steplength is decreased and new predictor-corrector steps are taken (see Algorithm 1).
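The following Python sketch shows the steplength logic just described (angle test at $\pi/4$, at most four Newton iterations, three cheap corrections in a row before lengthening). It is our illustration, not the authors' MATLAB implementation; the Moore-Penrose corrector and the SVD tangent are common choices we assume here, and the end game is elided.

```python
import numpy as np

def tangent_direction(A):
    """Unit kernel vector of the p x (p+1) Jacobian A, oriented so the
    homotopy parameter (last component) decreases along it."""
    _, _, vt = np.linalg.svd(A)
    d = vt[-1]                      # right singular vector for sigma ~ 0
    return -d if d[-1] > 0 else d

def newton_correct(H, DH, w, tol, max_iter=10):
    """Gauss-Newton corrector: pull w back onto the path H = 0 with
    Moore-Penrose steps. Returns (point, iterations used, success)."""
    for i in range(1, max_iter + 1):
        r = H(w)
        if np.linalg.norm(r) < tol:
            return w, i - 1, True
        w = w - np.linalg.pinv(DH(w)) @ r
    return w, max_iter, np.linalg.norm(H(w)) < tol

def track(H, DH, w0, h=0.1, h_max=1.0, tol=1e-8, lam_end=1e-6):
    """Secant-predictor / Newton-corrector tracking of the zero path of
    H : R^{p+1} -> R^p, starting from w0 = (x0, y0, 1)."""
    w_prev, w = None, w0.copy()
    d_prev, cheap = None, 0
    while w[-1] > lam_end:              # a real procedure switches to the
        if w_prev is None:              # end game near lambda = 0
            d = tangent_direction(DH(w))            # first step: tangent
        else:
            d = (w - w_prev) / np.linalg.norm(w - w_prev)  # secant
        if d_prev is not None and np.dot(d, d_prev) < np.cos(np.pi / 4):
            h *= 0.5                    # direction turned by more than pi/4
        w_new, iters, ok = newton_correct(H, DH, w + h * d, tol)
        if not ok or iters > 4:
            h, cheap = 0.5 * h, 0       # expensive correction: shorten step
            continue                    # and retry the predictor step
        cheap += 1
        if cheap >= 3:                  # three cheap corrections in a row:
            h, cheap = min(2.0 * h, h_max), 0   # lengthen (capped at 1)
        w_prev, d_prev, w = w, d, w_new
    return w
```

The cap `h_max = 1.0` and the halving/doubling factors mirror the rules stated above; any production version would add the feasibility checks and the damping step described in the text.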

The Numerical Experiment.
Although there exist many test sets, such as the CUTEr collection [21], we could not find a large collection of test problems with moderately many variables and very many complicated nonlinear constraints. In this paper, six test problems are chosen to test the algorithm. Problem 4.1 is chosen from the CUTEr test set and is used to illustrate a special situation; the others are derived from discretized semi-infinite programming problems. We also give two artificial test problems, 4.2 and 4.3, and use the three problems 4.4-4.6 from [22]. The number of variables in problem 4.2 and the numbers of constraints in problems 4.2-4.6 can be arbitrary. For each test problem, the gradients and Hessians of the objective and constraint functions are evaluated analytically.
The FACH-S-N procedure was implemented in MATLAB. To illustrate its efficiency, we also implemented the ACH method of [19] using a similar procedure. In addition, we downloaded the state-of-the-art solver KNITRO, which provides three algorithms for solving large-scale nonlinear programming problems, and used its interior-point direct method with default parameters to compare with the FACH and MFACH methods. Any iterate $x^k$ with $g_{\max}(x^k) < 10^{-6}$ was treated as a feasible point. The test results were obtained by running MATLAB R2008a on a desktop with the Windows XP Professional operating system, an Intel(R) Core(TM) i5-750 2.66 GHz processor, and 8 GB of memory. The default parameters were chosen as follows. (v) Initial Lagrange multipliers: 1 for the ACH, FACH, and MFACH methods.
For each problem with different parameters, we list the value of the objective function $f(x^*)$ and of the maximal constraint function $g_{\max}(x^*)$ at $x^*$, the number of iterations, the number of evaluations of gradients of individual constraint functions in the FACH and MFACH methods (for the ACH method, this number is the number of constraints times the number of iterations), and the running time in seconds. For problems that were not solved by the conservative setting, we also give the reason for failure. The notation "fail$^1$" indicates that the steplength in the predictor step fell below $10^{-10}$, generally due to a poorly conditioned Jacobian matrix. The notation "fail$^2$" means out of memory. The notation "fail$^3$" means no result within 5000 iterations or 3600 seconds.
Example 14 (see [21]). Consider (55).

Remark 15. In this example, the starting point happens to be an unconstrained minimizer of the objective function, and we found that the $x$-components of the iterates generated by the FACH and MFACH methods (but not by other homotopy methods) remain invariant. This is not an accidental phenomenon.
In fact, if $(x^0, 0) \in \Omega^0 \times \mathbb{R}_+$ is a solution of the KKT system, that is, $x^0$ is a stationary point of the objective function $f(x)$, and the parameters $c$, $c_1$, and $c_2$ in the FACH and MFACH methods satisfy $g_{\max}(x^0) \le -(c_1 + c_2)$, then the $x$-components of points on the homotopy path $\Gamma_{w^0}$ remain invariant; this can be derived from the fact that $(x^0,\; \lambda y^0 \hat{g}(x^0, 1)/\hat{g}(x^0, \lambda),\; \lambda)$ solves (39) and (50) for $\lambda \in (0, 1]$. Moreover, the structure of the homotopy map at $(x^0, y^0, 1)$ implies that the $x$-component of the first predictor direction is 0. By a similar discussion, the $x$-component of the correction is 0, and hence $x^1 = x^{(1,j)} = x^0$. Continuing in this way, $x^k = x^{(k,j)} = x^0$ for all $k$ and $j$ in algorithm FACH-S-N.

Example 18 (see [22]). Consider (59).

Example 19 (see [22]). Consider (60), $j = 0, \dots, q-1$.

Example 20 (see [22]). Consider (61), $j = 0, \dots, q-1$, with $x^0 = (-1, 100) \in \mathbb{R}^2$.

To explain the numerical efficiency, we make the following remarks based on the results in Tables 1, 2, 3, 4, 5, and 6.

(i) If $g_{\max}(x)$ stays below the flattening threshold for every $(x, y, \lambda) \in \Gamma_{w^0}$, the FACH and MFACH methods do not need to calculate the gradient or Hessian of any constraint function.

(ii) For problems whose constraint gradients and Hessians are expensive to evaluate, the FACH and MFACH methods perform much better than the ACH method based on a similar numerical tracing procedure.

(iii) Compared to the interior-point direct algorithm of the state-of-the-art solver KNITRO, the test results show that the FACH and MFACH methods perform worse when $m$ is small but much better when $m$ is large for most problems. In addition, their time cost increases more slowly than KNITRO's as $m$ increases.

(iv) The flattening function is essential to the flattened aggregate constraint function. Theoretically, the parameters $c_1$, $c_2$, and $c$ in the FACH and MFACH methods can be chosen freely; in practice, however, they matter for efficiency and should be chosen suitably. If they are too large, too many gradients and Hessians of individual constraint functions must be evaluated, causing low efficiency; if they are too small, the Hessian of the flattened aggregate constraint function (10) may become ill-conditioned. In our numerical tests, we fixed $c_1 = 0.05$, $c_2 = 0.5 \times 10^{-5}$, and $c = 2$. The flattening function can be defined in many ways, and preliminary numerical experiments show that the algorithms have similar efficiency with different choices.
(v) Algorithm FACH-S-N is only a simple implementation of the ACH, FACH, and MFACH methods. To improve the implementation of the FACH and MFACH methods, much work remains on all parts of numerical path tracing, such as the predictor and corrector schemes, steplength updating, linear system solving, and the end game. Other practical strategies from the literature on large-scale nonlinear programming (e.g., [9, 23-26]) are also very important for improving efficiency.

Table 1: Test results for Example 14.

Table 2: Test results for Example 16.

Table 4: Test results for Example 18.

Table 5: Test results for Example 19.

Table 6: Test results for Example 20.