On the Simplex Algorithm Initializing

,


Introduction
There are two main methods for solving linear programming problem: the Simplex method and the interior point method. In both of these two methods it is necessary to determine the initial point. It is known that the application of the simplex algorithm requires at least one basic feasible solution. For the interior point method, in most cases it is enough that the point belongs to the so-called central path. Note also that in most algorithms based on interior point methods, the starting point need not to be feasible. However, in practical implementation it is shown that the numerical stability of the algorithm still depends on the choice of the starting point. These issues are investigated in papers 1, 2 .
Two classical algorithms for generating the initial point for the Simplex algorithm are the two-phase Simplex algorithm and the Big-M method. The main drawback of these algorithms is requiring the introduction of artificial variables, increasing the dimension of the problem. Table 1 x N,1 x N,2 · · · x N,n −1 a 11 a 12 · · · a 1n b 1 −x B,1 · · · · · · · · · · · · · · · · · · a m1 a m2 · · · a mn b m −x B,m c 1 c 2 · · · c n d f From that reason, algorithms which do not require the introduction of artificial variables are developed. A heuristic for finding the starting point of a linear programming problem, which is based on the method of minimal angles, is proposed in 3 . Unfortunately, these heuristics can be used only for certain classes of problems. Similar heuristics are investigated in the papers 4-6 . Different versions of the algorithm which does not introduce artificial variables are given in the papers 7-12 . One variant of these algorithms is introduced and discussed in the paper 11 , which was the inspiration for our paper. We compare three different methods for finding an initial basic feasible solution the starting point for the Simplex algorithm throughout performed numerical test examples. Two methods, called M1 and M2, are introduced in 13 , while the third method, called NFB, is established in 11 .

The Simplex Method and Modifications
Consider the linear programming LP problem in the standard matrix form: subject to Ax b, where A ∈ R m× m n is the full row rank matrix rank A m , c ∈ R n m , and the system Ax b is defined by x ∈ R m n , b ∈ R m . It is assumed that i, j -th entry in A is denoted by . , x n m , and c T c 1 , . . . , c n m . Then we choose m basic variables x B,1 , . . . , x B,m , express them as a linear combination of nonbasic variables x N,1 , . . . , x N,n , and obtain the canonical form of the problem 2.1 . We write this canonical form in Table 1.
Coefficients of the transformed matrix A and the transformed vector c are again denoted by a ij and c j , respectively, without loss of generality.
For the sake of completeness we restate one version of the two-phase maximization simplex algorithm from 12, 14 for the problem 2.1 , represented in Table 1.
Within each iteration of the simplex method, exactly one variable goes from nonbasic to basic and exactly one variable goes from basic to nonbasic. Usually there is more than one choice for the entering and the leaving variables. The next algorithm describes the move from the current base to the new base when the leaving-basic and entering-nonbasic variables have been selected.

2.2
The next algorithm finds an optimal solution of the LP problem when the condition b 1 , . . . , b m ≥ 0 is satisfied.
Step S1A. If c 1 , . . . , c n ≤ 0, then the basic solution is an optimal solution.
Step S1B. Choose c j > 0 according to the Bland's rule 15 .
Step S1C. If a 1j , . . . , a mj ≤ 0, stop the algorithm. Maximum is ∞. Otherwise, go to the next step.
Step S1D. Compute If the condition b 1 , . . . , b m ≥ 0 is not satisfied, we use the algorithm from 12, 14 to search for the initial basic feasible solution. In contrast to approach used in 16 , it does not use artificial variables and therefore does not increase the size of the problem. It is restated here as the following Algorithm 2.3.
Step S4. Otherwise, find a ij < 0, compute choose x N,j as the entering-nonbasic variable, x B,p as the leaving-basic variable, apply Algorithm 2.1, and go to Step S2.
Note that Algorithm 2.3 and the NFB algorithm from 11 are equivalent. As shown in the paper 13 , the lack of the algorithm from 12 is that when it is applied to transform the selected coordinate from the basic solution into a positive, it can increase the number of negative coordinates.
The problem of selecting a leaving-basic variable and corresponding enteringnon-basic variable in the two-phase simplex method is contained in Step S1D of Algorithm 2.2 and Step S4 of Algorithm 2.3. We observed two drawbacks of Step S4. By i we denote the index of the last negative b i .
in the next iteration x B,t becomes negative: Although there may exist b t < 0, t < i such that In such case, it is possible to choose a tj for the pivot element and obtain Also, since b t /a tj ≤ b k /a kj , each b k > 0 remains convenient for the next basic feasible solution: Therefore, it is possible that the choice of entering and leaving variable defined by Step S4 reduces the number of positive b's after the application of Algorithm 2.1. Our goal is to obviate the observed disadvantages in Step S4. For this purpose, we propose a modification of Step S4, which gives a better heuristic for the choice of basic and nonbasic variables.
The following statements are valid.
1 It is possible to produce a new basic solution x 1 {x 1 B,1 , . . . , x 1 B,m } with at most q − 1 negative coordinates in only one step of the simplex method in the following two cases: a q m, b q < m and there exist r ∈ I and s ∈ {1, . . . , n} such that 2 It is possible to produce a new basic solution Proof. We restate the proof from 13 for the sake of completeness.
1 a If q m, for an arbitrary pivot element a js < 0, we get a new basic solution with at least one positive coordinate: The existence of negative a js is ensured by the assumption that the problem 2.1 is feasible. b Now assume that the conditions q < m, r ∈ I, and 2.11 are satisfied. Choose a rs for the pivot element and apply Algorithm 2.1. Choose arbitrary b k ≥ 0, k / r.
In the case a ks < 0 it is obvious that In the case a ks > 0, using b k /a ks ≥ b r /a rs , we conclude immediately On the other hand, for b r < 0 we obtain from Algorithm 2.1 Therefore, all nonnegative b k remain nonnegative and b r < 0 becomes nonnegative. 2 If neither condition a nor condition b is valid, let r / ∈ I and s ∈ {1, . . . , n} be such that By choosing a rs as the pivot element and by applying the transformations defined in Algorithm 2.1 we obtain the same number of negative elements in the vector b. This fact can be proved similarly as the part 1 b .
Remark 2.5. From Proposition 2.4 we get three proper selections of the pivot element in Step S4: i arbitrary a js < 0 in the case q m; ii arbitrary a rs < 0 satisfying 2.11 when the conditions 0 < q < m, r ∈ I are satisfied; iii arbitrary a rs > 0 satisfying 2.16 when 0 < q < m, and there is no a rs < 0 satisfying conditions in the previous case.
In accordance with Proposition 2.4 and considerations in Remark 2.5, we propose the following improvement of Algorithm 2.3.

Algorithm M1 (modification of Algorithm 2.3)
Step Step 2. Select the first b i s < 0.
Step 3. If a i s ,1 , . . . , a i s ,n ≥ 0, then STOP. LP problem is infeasible. Otherwise, construct the set initialize variable p by p 1 and continue.
Step 4. Compute Step 5. If b i s /a i s ,j p ≤ b h /a h,j p , then interchange entering-nonbasic variable x N,j p and leavingbasic variable x B,i s apply Algorithm 2.1 and go to Step 1. Otherwise go to Step 6.
Step Algorithm M2 Step Step 2. Set s 1 and perform the following.
Step 2.1. If a i s ,1 , . . . , a i s ,n ≥ 0 then STOP. LP problem is infeasible.
Step 2.2. Find the minima:

2.21
If b k /a k,j p ≤ M j p , then choose a p ,j p for the pivot element, apply Algorithm 2.1, and go to Step 1.
In the next iteration b k becomes positive.
Step 2.3. If p < t, then put p p 1 and go to Step 2.2, otherwise continue.
Step 2.4. If s < q, then put s s 1 and go to Step 2.1, otherwise continue.
Step 3. If q < m and there do not exist r ∈ I and s ∈ {1, . . . , n} such that , a rs < 0, then move to the following.
Step 3.2. Compute Step 3.3. Choose a p ,j 0 for pivot element, apply Algorithm 2.1, and go to Step 1.

Abstract and Applied Analysis
Algorithm M1 and Algorithm M2 are well defined. If there is no b r < 0 such that the condition 2.11 is valid, we choose pivot element according to Remark 2.5 to obtain a solution with the same number of negative b's. To avoid the cycling in this case, we will present an anticycling rule which is based on the following result from 13 . Therefore, according to Proposition 2.6, finite number of iterations value of b i s will start to increase or we will conclude that the problem is infeasible a i s ,j are positive for all j 1, . . . , n .
In the sequel we show that our methods have better performances than the Nabli's NFB method from 11 . Algorithms M1 and M2 are implemented in the software package MarPlex, reused from 13 and tested on some standard linear programming test problems.

Results of Comparison
We already mentioned that we were motivated with the paper 11 to consider the number of iterations needed to find the starting point. This yields an exact indicator of the quality of the applied linear programming solvers. In the paper 11 that the quotient of the total number of iterations required to determine the initial solution and the number of variables is equal to 0.334581 268/801 for the NFB, 0.483249 439/985 for the two-phase algorithm and 0.445685 476/985 for the Big-M method. It is obvious that the NFB algorithm is better than the Two-phases and the Big-M methods with respect to this criterion.
Note that the results of the paper 11 are obtained on a small sample of test problems in which the number of variables is relatively small. To resolve this drawback, we applied the NFB algorithm to the Netlib collection test problems. In addition, we tested our two algorithms, marked with M1 and M2. In this paper we consider the results of practical application of these algorithms in accordance with the methodologies used in the papers 11, 18 .
The comparison of different optimization softwares from 11 is based on the number of iterative steps. In the tables below, columns labeled by n contain the number of variables in the problem, while the columns marked with NFB, M1, and M2 contain the number of iterations that is required for the corresponding algorithm. The names of the Netlib problems are given in the column LP. The number of iterations that are necessary to find initial basic feasible solutions in particular test examples is presented in Table 2.
The quotient of the total number of iterations required to determine the initial solution Simplex algorithm and the total number of variables is equal to 32856/52509 0.625721 for the NFB algorithm, for the Algorithm M1 is equal to 27005/52509 0.514293, and for the M2 method is 16118/52509 0.306957. For the NFB algorithm we get a higher value compared to the result from 11 , as consequence of the larger size and the computational effort to solve the tested problems. Algorithm M1 shows a slightly better efficiency, while M2 produces significantly smaller quotient.
From Tables 2 and 3 it follows that the total number of iterations spent in the Simplex algorithm to generate the optimal solution is equal to 15409,16400,14170 for the NFB, M1, and M2 methods, respectively. Therefore, it is clear that the total number of iterations required for the initial basic feasible solution is greater than the number of iterations required by the Simplex algorithm to find the optimal solution. This fact is very important, especially if we observe from Table 2 that about 25% of the considered problems are suitable for determining the starting point in one step. Taking into account the results of the paper 11 from which it follows that the number of iterations of the classical two-phase method for determining the starting point is significantly higher than the number of iterations of algorithms that do not introduce artificial variables, the fact becomes more important. This means that finding the starting point represents a very important part in solving LP problems.
The better performances of the M2 method primarily and particularly the M1 method , compared to the NFB method, can also be confirmed by using the so-called performance profile, introduced in 18 . The underlying metric is defined by the number of iterative steps. Following the notations given in the paper 18 we have that the number of solvers is n s 3 the NFB, M1, and M2 and the number of numerical experiments is n p 49. For the performance metrics we use the number of iterative steps. By i p,s we denote the number of iterations required to solve problem p by the solver s. The quantity where τ ∈ R and P represents the set of problems. Figure 1 shows the performance profiles for NFB, M1, and M2 regarding the number of iterations. Figure 1 a resp., Figure 1 b illustrates data arranged in Table 2 resp., Table 3 . It is clear from Figure 1 a , that M1 and M2 methods show better performances compared to NFB. Comparison of M1 and M2 algorithms gives somewhat contradictory results. Namely, the probability of being the optimal solver regarding the number of iterative steps required for generating the starting point is in favor of the M1 algorithm, which is confirmed by ρ M1 1 0.64 > ρ M2 1 0.60 > ρ NFB 1 0.38. On the other hand, the probability ρ M2 τ of Algorithm M2 is greater than the probability ρ M1 τ for all values τ > 1. On the other hand, ρ M2 τ > ρ NFB τ is always satisfied.
From Figure 1 b , it is evident that the inequality ρ M2 τ ≥ max{ρ NFB τ , ρ M1 τ } is always satisfied, which means that Algorithm M2 has the highest probability of being the optimal solver with respect to the total number of iterations required to solve the problem.
In sequel, the total CPU time is the comparison criteria. The required CPU time for NFB, M1 and M2 methods is given in Table 4. The time ratio and the number of   Table 2. b Performance profile regarding the number of iterations in Table 3. iteration ratio, of Algorithms M1 and M2 with respect to the NFB algorithm are also given in Table 4. The situation in this case is not quite clear. Namely, comparing the time ratio with respect to the number of iterations ratio given in columns labeled M1/NFB and M2/NFB, we conclude the number of required iterations of M1 and M2 algorithms for problems czprob, share1b, ship04l, and ship08l is smaller then the number of iterations of NFB algorithm but this is not in accordance with required CPU time ratio. This is the consequence of the greatest number of conditions that must be checked in the M1 and M2 algorithms with respect to the NFB algorithm. The difficulties of that type may occur especially when the LP problem has dense matrix. Note that for all others test problems accordance of the number of iterations ratio with respect to the required CPU time ratio does exist. Figure 2 shows the performance profiles for NFB, M1, M2 regarding the required CPU time.
It is clear from Figure 2 that the probability of being the optimal solver regarding the CPU time required is in favor of the M2 algorithm, which is confirmed by the inequalities ρ M2 1 > ρ M2 1 > ρ NFB 1 . On the other hand, further comparison when τ increases gives contradictory results. Unfortunately, M2 algorithm is not the best solver for all value τ. In the case 1.5 ≤ τ ≤ 4.5 it follows that M1 algorithm is the best option except for τ 2 when NFB algorithm takes advantage. Further, for 5 ≤ τ ≤ 9 we have ρ M2 τ ρ NFB τ ρ M1 τ , which means that the considered solvers are equivalent. Finally, for values τ > 9 Algorithm M2 is the best again.

Conclusion
The importance of selecting the starting point is known problem from the linear programming theory. In this paper, we investigated that problem in terms of the number of iterations required to find the initial point and the optimal point as well as in the terms of the required CPU time.
We observed that our Algorithm M2 achieved the lowest total number of iterations for finding the starting point for the Simplex method as well as the minimal total number of iterations for determining the optimal point. On the other hand, Algorithm M1 has the highest probability to achieve the minimal number of iterations to find the initial basic feasible solution. Algorithm M2 has the highest probability of being the optimal solver with respect to the total number of iterations required to solve the problem.
Comparison with respect to the required CPU time gives that the probability of being the optimal solver is in favor of the M2 algorithm for τ 1 and for τ ≥ 9.5. It is clear that CPU time generally depends on the number of iterations, but it is obvious that there are some others important factors. Further research will be in that direction.