Solving the Multiscenario Max-Min Knapsack Problem Exactly with Column Generation and Branch-and-Bound

Unlike other variants of the standard knapsack problem, the multiscenario max-min knapsack problem has attracted very few solution approaches. The problem consists in finding the subset of items whose total profit is maximized under the worst possible scenario. In this paper, we describe an exact solution method for this problem based on column generation and branch-and-bound. Our approach relies on a reformulation of the standard compact integer programming model based on the Dantzig-Wolfe decomposition principle. The resulting model is potentially stronger than the original one since the corresponding pricing subproblem does not have the integrality property. The details of the reformulation are presented and analysed together with those concerning the column generation and branch-and-bound procedures. To evaluate the performance of our algorithm, we conducted extensive computational experiments on large scale benchmark instances, and we compared our results with other state-of-the-art approaches under similar circumstances. We focused in particular on different relevant aspects that allow an objective evaluation of the efficacy of our approach. From different standpoints, the branch-and-price algorithm proved to outperform the other state-of-the-art methods described so far in the literature.


Introduction
The multiscenario max-min knapsack problem is a variant of the well-known knapsack problem. The problem is characterized by a set of items, each with a given weight and a set of profits, and by a knapsack whose capacity determines the unique constraint that applies. In this context, a scenario is defined by the profits that apply to the items under that scenario. Hence, the profit of an item depends directly on the scenario that is considered. The objective of the multiscenario max-min knapsack problem is to determine the subset of items whose total weight is smaller than or equal to the knapsack capacity, and whose total profit is maximized in the worst scenario over all the possible scenarios.
The standard knapsack problem is a special case of the multiscenario max-min knapsack problem in which there is only one scenario. In this case, the worst scenario is always the only one that applies, and the max-min knapsack problem reduces to the problem of finding the items with the maximum total profit under this scenario. As a consequence, the multiscenario max-min knapsack problem is NP-hard. In fact, Yu showed in [1] that the problem is pseudopolynomially solvable when the number of scenarios is bounded, while it is strongly NP-hard when this number is unbounded. The complexity of min-max combinatorial optimization problems was further studied in [2], where different approximation results are provided. The authors proved in particular that the min-max regret knapsack problem is not approximable at all even in the case of two scenarios, and that the problem is also strongly NP-hard for a nonconstant number of scenarios. Specific variants of the min-max knapsack problem were also explored in [3] for the case where the item sizes are all equal to 1, and one has to choose a given number of items so as to minimize the total cost under the existing scenarios. In [3], Kasperski et al. showed that the problem is not approximable within a constant factor unless P = NP.
Unlike the standard knapsack problem and other variants which have been explored in depth in the literature [4, 5], the multiscenario max-min knapsack problem has received much less attention, and only recently have different approaches begun to be explored [1, 6-10]. In practice, solving large scale instances of the multiscenario max-min knapsack problem up to optimality remains a challenge. In this paper, we describe and analyse an exact solution approach for the problem that relies on the combination of branch-and-bound and column generation. To the best of our knowledge, this is the first time that these methods are used together to solve this particular problem.
The first contributions towards the solution of the multiscenario max-min knapsack problem are due to Yu [1] and Iida [6]. Yu [1] proposed and analysed lower and upper bounds for the problem computed through surrogate relaxation. He described an exact branch-and-bound algorithm for the problem which solved instances with up to 60 items and 30 scenarios. Different lower and upper bounds based on linear programming were proposed later by Iida in [6]. To evaluate their effectiveness, the author embedded the bounds in the branch-and-bound algorithm proposed by Yu [1]. Although Iida claimed that the bounds were not as tight as those proposed by Yu [1], his computational experiments showed that the bounds were sufficient to solve the instances used by Yu up to optimality in less than one minute on average.
Taniguchi et al. [7] explored the use of the pegging test to reduce the size of the problem. The pegging test is applied after lower and upper bounds have been computed through surrogate relaxation. The optimal solution of the multiscenario max-min knapsack problem is then searched by applying branch-and-bound on the reduced problem. The authors report on computational experiments conducted on instances generated as described in [1]. The size of the instances goes up to 30 scenarios and 1000 items, although the method frequently had difficulty solving the largest instances up to optimality within the time limit of 1200 seconds. The authors also compared their approach with those by Yu [1] and Iida [6] on smaller instances with 60 items and up to 30 scenarios and concluded that it outperformed them both.
The same authors extended their work in [11], focusing on the two-scenario max-min knapsack problem. They described a heuristic algorithm with which they found solutions for instances with up to 16000 items. In their approach, lower and upper bounds are still obtained through surrogate relaxation, while the pegging test is used to reduce the size of the instance. The authors also describe a method to further decrease the size of the problem based on a so-called virtual pegging test.
In [8], a cooperative approach based on tabu search and combining two local search algorithms is proposed. The author described a generalized local search procedure which is used to diversify the search and a restricted local search procedure which is used to intensify the search in a given region. To generate initial feasible solutions for the problem, the author resorted to an iterative greedy heuristic. The computational experiments reported in [8] showed that good approximate solutions can be found with this approach within small computing times.
In [9], Hanafi et al. explored a hybrid approach to solve the max-min knapsack problem with two scenarios. Their approach is based on the iterative improvement of lower and upper bounds obtained through relaxations of mixed integer programming models, temporary variable fixing, and by enforcing pseudocuts that exclude prior solutions of the problem. The authors reported on computational experiments using instances with up to 20000 items which showed a better performance on correlated instances of the max-min knapsack problem with two scenarios.
More recently, Song et al. [10] proposed a fast heuristic to solve the multiscenario max-min knapsack problem. Their approach remains based on the solution of a surrogate relaxation of the original problem obtained by applying a subgradient algorithm. The authors explored different incomplete $m$-exchange algorithms consisting in swapping the values of some of the binary variables of the problem. Their approach proved able to generate good approximate solutions for large scale instances within reasonable time limits, although its effectiveness seems to decrease with strongly correlated instances.
In this paper, we explore a novel approach for the exact resolution of the multiscenario max-min knapsack problem based on column generation and branch-and-bound. Preliminary results on the use of column generation to compute lower and upper bounds were discussed in [12] for the problem with only two scenarios. Here, we extend this approach by generalizing it to the case of multiple scenarios, and we propose a branch-and-price algorithm to search for optimal integer solutions. The performance of the global approach is evaluated through extensive computational experiments performed on large scale benchmark instances from the literature. The results illustrate the effectiveness of the branch-and-price algorithm in finding optimal solutions within reasonable computing times even for strongly correlated instances. Branch-and-price has never been used before to solve the multiscenario max-min knapsack problem exactly. However, it is important to note that there are already some successful attempts reported in the literature concerning the application of column generation to other variants of the standard knapsack problem. One such example can be found in [13] for the multiple-choice knapsack problem.
The outline of the paper is as follows. In Section 2, we describe formally the elements that characterize the multiscenario max-min knapsack problem, and we introduce the notation that will be used throughout the paper. In Section 3, we describe two integer programming formulations for the problem: a standard compact formulation and a column generation reformulation that relies on a Dantzig-Wolfe decomposition of the previous model. The components of our branch-and-price algorithm are described in Section 4. In Section 5, we report on the computational experiments performed to evaluate the performance of our approach. Some conclusions are finally drawn in Section 6.

The Multiscenario Max-Min Knapsack Problem
As mentioned above, the multiscenario max-min knapsack problem is a generalization of the well-known standard knapsack problem. In the latter, one is given a set of items characterized by a weight $w_j$ and a profit $p_j$. The objective of this standard problem is to find the subset of items with the maximum total profit that fit into a given knapsack of capacity $c$; that is, whose total weight is smaller than or equal to $c$. Let $n$ be the total number of items. The standard knapsack problem can be formulated using the following integer programming model:

$$\max \sum_{j=1}^{n} p_j x_j \quad \text{s.t.} \quad \sum_{j=1}^{n} w_j x_j \le c, \quad x_j \in \{0,1\},\ j = 1, \dots, n, \tag{1}$$

with the binary variables $x_j$ indicating whether the item $j$ is selected or not.
The max-min knapsack problem shares most of the defining characteristics of the standard knapsack problem. It is defined from a set of $n$ items of weight $w_j$, $j = 1, \dots, n$, and from a knapsack of capacity $c$. The difference lies in the definition of the profits associated with the items. In the max-min knapsack problem, the profit of an item depends on the particular scenario that is considered. For a given scenario $s$, the profit of a given item $j$, $j = 1, \dots, n$, under this scenario will be denoted by $p_j^s$. Hence, a scenario is defined as a set of profits that apply to the items of the problem. In contrast, the weights of the items are independent of the scenarios, and remain equal to $w_j$ for an item $j$ whatever the scenario that may apply. Throughout the paper, we will denote by $S$ the total number of scenarios. In the general case where the number $S$ of scenarios is unbounded, the problem is referred to as the multiscenario max-min knapsack problem. In this context, the objective of the problem consists in finding the subset of items that fit in the knapsack and with the maximum total profit under the worst scenario, that is, under the scenario with the minimum total profit over all the scenarios. In the case where $S = 1$, the problem reduces to the standard knapsack problem, and, as a consequence, the latter can be considered as a special case of the multiscenario max-min knapsack problem.
In this paper, we will address the general multiscenario max-min knapsack problem where the number $S$ of scenarios is unbounded. Furthermore, we will assume that all the items have positive integer profits, that the weight of each item is smaller than or equal to the capacity $c$ of the knapsack, and that the items do not all fit in the knapsack, that is, $\sum_{j=1}^{n} w_j > c$.
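To make the problem definition concrete, the following minimal sketch evaluates the worst-scenario profit of a selection and finds the best selection by plain enumeration. It only illustrates the definition on a toy instance and is not the solution method of this paper; the function names and the data are hypothetical.

```python
from itertools import combinations


def worst_case_profit(selected, profits):
    """Total profit of the selected items under the worst scenario.

    profits[s][j] is the profit of item j under scenario s.
    """
    return min(sum(row[j] for j in selected) for row in profits)


def brute_force_max_min(weights, profits, capacity):
    """Enumerate every feasible subset (only viable for very small n)."""
    n = len(weights)
    best_value, best_subset = 0, ()
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            if sum(weights[j] for j in subset) <= capacity:
                value = worst_case_profit(subset, profits)
                if value > best_value:
                    best_value, best_subset = value, subset
    return best_value, best_subset


# Toy instance (illustrative data): 3 items, 2 scenarios.
weights = [4, 3, 2]
profits = [[10, 4, 3],  # scenario 1
           [2, 9, 4]]   # scenario 2
capacity = 5
print(brute_force_max_min(weights, profits, capacity))  # prints (7, (1, 2))
```

Item 0 is the most profitable one under scenario 1, yet it is left out: its profit under scenario 2 is too low, and the pair of items 1 and 2 guarantees a profit of 7 whatever the scenario.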

Integer Programming Formulations
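The multiscenario max-min knapsack problem can be formulated as

$$\max \min_{s = 1, \dots, S} \sum_{j=1}^{n} p_j^s x_j \quad \text{s.t.} \quad \sum_{j=1}^{n} w_j x_j \le c, \quad x_j \in \{0,1\},\ j = 1, \dots, n, \tag{2}$$

where the binary variables $x_j$ represent the selection or not of an item $j$, $j = 1, \dots, n$. This model is a direct extension of the integer programming model (1) for the standard knapsack problem.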
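The minimization part of the objective function can be reformulated using a set of equivalent constraints stating that the total profit under every scenario must be greater than or equal to a given variable $z$. The resulting model becomes a single-objective problem as illustrated next:

\begin{align}
\max\ & z \tag{3}\\
\text{s.t.}\ & \sum_{j=1}^{n} p_j^{1} x_j \ge z, \tag{4}\\
& \sum_{j=1}^{n} p_j^{2} x_j \ge z, \tag{5}\\
& \qquad \vdots \tag{6}\\
& \sum_{j=1}^{n} p_j^{S} x_j \ge z, \tag{7}\\
& \sum_{j=1}^{n} w_j x_j \le c, \tag{8}\\
& x_j \in \{0,1\}, \quad j = 1, \dots, n. \tag{9}
\end{align}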
In [9], Hanafi et al. explored different mixed integer programming reformulations of the two-scenario max-min knapsack problem, and described different approaches to convert the multiobjective formulation (2) into a single-objective one.

A Column Generation Reformulation.
A stronger model for the multiscenario max-min knapsack problem can be obtained from (3)-(9) by applying an appropriate Dantzig-Wolfe decomposition. The result is a column generation reformulation of (3)-(9) which is the basis of the solution approach described in this paper.
Applying the Dantzig-Wolfe decomposition principle to a linear integer problem leads to a reformulation into a master problem and one or more subproblems defined from the constraints of the original formulation. Here, we consider a reformulation of (3)-(9) in which the master problem is defined from the constraints (4)-(7), and the subproblem from the knapsack constraints (8) and (9). Since the subproblem does not have the integrality property, the resulting model is potentially stronger than the original formulation (3)-(9).
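For reference, a master problem consistent with this description could be written as shown next. This is a sketch under assumed notation rather than a transcription of the model: $E_p = (e_{1p}, \dots, e_{np})$ is taken to be the $p$th extreme point of the knapsack polytope $X$ defined by (8) and (9), $\lambda_p$ is the weight of $E_p$ in the convex combination $x = \sum_{p} \lambda_p E_p$, and the equation numbers are matched to the references (15)-(21) used in the remainder of the text. The integrality requirement on $x$ is omitted, so the block shows the linear relaxation that is solved by column generation.

```latex
\begin{align}
\max\ & z \tag{15}\\
\text{s.t.}\ & \sum_{p=1}^{P} \Big( \sum_{j=1}^{n} p_j^{1} e_{jp} \Big) \lambda_p \ge z, \tag{16}\\
& \sum_{p=1}^{P} \Big( \sum_{j=1}^{n} p_j^{2} e_{jp} \Big) \lambda_p \ge z, \tag{17}\\
& \qquad \vdots \tag{18}\\
& \sum_{p=1}^{P} \Big( \sum_{j=1}^{n} p_j^{S} e_{jp} \Big) \lambda_p \ge z, \tag{19}\\
& \sum_{p=1}^{P} \lambda_p = 1, \tag{20}\\
& \lambda_p \ge 0, \quad p = 1, \dots, P. \tag{21}
\end{align}
```

Each column of this model carries the total profit of one knapsack solution under each of the $S$ scenarios, which is why a column can be built directly from any feasible selection of items.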
The number of variables $\lambda_p$, $p = 1, \dots, P$, of the master problem is exponential, as is the number $P$ of extreme points of $X$. Constraint (20) corresponds to the convexity constraint (12). An upper bound for the multiscenario max-min knapsack problem can be computed from the linear relaxation of the master problem. This bound is at least as tight as (that is, smaller than or equal to) the bound computed from the linear relaxation of (3)-(9).
In a column generation approach where the master problem is solved iteratively by considering only a restricted subset of its variables, the subproblem defined from the constraints (8) and (9) of the original problem is used to price out the variables (columns) that are not in the restricted master problem and that may potentially improve its solution. The solutions provided by the subproblem are in fact the extreme points $E_p$ of the knapsack polytope related to the constraints (8) and (9) of the original formulation.
Let $\pi_1, \pi_2, \dots, \pi_S, \pi_0$ denote, respectively, the dual variables related to the constraints (16), (17), ..., (19), and (20) of the master problem. A variable $\lambda_p$, $p = 1, \dots, P$, of the master problem is attractive if and only if its reduced cost (22) is positive. The most attractive column corresponds to the extreme point of $X$ that maximizes (22). As a consequence, the pricing subproblem related to the Dantzig-Wolfe decomposition described in this section is the standard knapsack problem (23)-(25), where $x_j$, $j = 1, \dots, n$, are the decision variables. Note that (23)-(25) provides a lower bound for the multiscenario max-min knapsack problem since it generates solutions which are feasible for this problem. Hence, a column generation algorithm applied to the reformulation discussed in this section generates at each iteration not only an upper bound from the solutions of the master problem, but also a lower bound from the solutions of the subproblem.
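Since the pricing subproblem is a standard knapsack problem, it can be solved by dynamic programming over the capacity. The sketch below is a minimal illustration, assuming integer weights and pricing coefficients that are passed in as given numbers; how these coefficients are derived from the dual values is not reproduced here. It returns a single best solution, and the function name `knapsack_dp` is hypothetical.

```python
def knapsack_dp(profits, weights, capacity):
    """Solve max sum(profits[j] * x[j]) s.t. sum(weights[j] * x[j]) <= capacity.

    Weights and capacity are assumed to be nonnegative integers; profits may
    be fractional. Returns (best value, sorted list of selected item indices).
    """
    n = len(weights)
    # best[j][c]: best value using only items 0..j-1 with capacity c.
    best = [[0.0] * (capacity + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        p, w = profits[j - 1], weights[j - 1]
        for c in range(capacity + 1):
            best[j][c] = best[j - 1][c]            # item j-1 left out
            if w <= c and best[j - 1][c - w] + p > best[j][c]:
                best[j][c] = best[j - 1][c - w] + p  # item j-1 selected
    # Backtrack: an item was selected exactly when the value changed.
    selected, c = [], capacity
    for j in range(n, 0, -1):
        if best[j][c] != best[j - 1][c]:
            selected.append(j - 1)
            c -= weights[j - 1]
    return best[n][capacity], sorted(selected)


# Illustrative pricing coefficients for a 3-item instance.
print(knapsack_dp([6.0, 5.5, 4.0], [4, 3, 2], 5))  # prints (9.5, [1, 2])
```

The table has $(n + 1)(c + 1)$ entries, so the running time grows linearly with the capacity of the knapsack.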

A Branch-and-Price Algorithm
As alluded to above, for any medium or large size instance, the size of the model (15)-(21) prevents its resolution based on a complete enumeration of the variables (columns). As an alternative, we will consider an iterative resolution of a restricted master problem defined from (15)-(21) where only a subset of the variables is used. Columns that are not in the restricted master problem, but which may potentially improve its solution, are added on the fly using a column generation procedure.
The solution of (15)-(21) provides an upper (continuous) bound for the value of the optimal integer solution of the problem. Since the columns of the restricted master problem required to find the optimal solution of (15)-(21) may not be enough to find the optimal integer solution of the problem, we will resort to branch-and-bound where, at the nodes of the search tree, the column generation procedure is used to find attractive columns that may not have been generated yet. This approach results in a so-called branch-and-price algorithm. The details of our branch-and-price algorithm are described in the sequel.

Column Generation Procedure.
In our implementation, the restricted master problem is initialized with a limited subset of the variables of (15)-(21). A simple approach to generate an initial set of columns is to choose the solutions with only one item, and to add the corresponding columns to the restricted master problem (by assumption, all these solutions will satisfy the constraint on the capacity of the knapsack). The restricted master problem is then solved up to optimality.
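The initialization with single-item solutions can be sketched as follows. The representation of a column as a dictionary holding the item set and its total profit under each scenario is an assumption made for illustration, as is the function name `initial_columns`.

```python
def initial_columns(weights, profits, capacity):
    """Build one column per single-item solution.

    profits[s][j] is the profit of item j under scenario s. Each column keeps
    the items it contains and its total profit under every scenario, which
    are its coefficients in the scenario constraints of the master problem.
    """
    columns = []
    for j, w in enumerate(weights):
        if w <= capacity:  # always true under the assumption on the weights
            columns.append({
                "items": (j,),
                "scenario_profits": [row[j] for row in profits],
            })
    return columns


cols = initial_columns([4, 3, 2], [[10, 4, 3], [2, 9, 4]], 5)
print(len(cols))                    # prints 3
print(cols[1]["scenario_profits"])  # prints [4, 9]
```

With these $n$ columns the restricted master problem is feasible from the first iteration, since any single column can receive the full weight in the convexity constraint.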
The values of the dual variables related to the optimal solution of the restricted master problem are used as an input for the pricing subproblem. A set $R$ of the best integer solutions for the subproblem is then computed using a dynamic programming algorithm. The maximum size of $R$ is set through a parameter $r_{\max}$. Let $v_r$ denote the value of a solution $r \in R$. If $-v_r - \pi_0$ is greater than 0, then the column related to this solution is attractive. Note that here we will consider that the set $R$ includes only solutions that correspond to attractive columns. Hence, in practice, the size of $R$ may be smaller than $r_{\max}$. The columns in $R$ are added to the restricted master problem, which is solved once again. Meanwhile, the set $R$ is used to update, whenever possible, the value of the best lower bound for the problem. The process stops when there are no more solutions of the subproblem such that $-v_r - \pi_0$ is greater than 0.
The optimal solution of the restricted master problem provides an upper bound for the value of the optimal integer solution of the problem, while the solutions generated by the pricing subproblems may improve the value of the lower bound for the problem as referred to above. Hence, our column generation procedure can be seen as a bounding algorithm for the multiscenario max-min knapsack problem.
In our implementation, instead of generating only one attractive column at each iteration, which is the most usual approach in column generation, we generate a set of different attractive columns with the objective of visiting a significant number of feasible solutions for the problem. Note that each solution of the subproblem is feasible for the original multiscenario max-min knapsack problem. In our computational experiments, this strategy proved to be effective in the search for good incumbents.
Algorithm 1 describes formally our column generation procedure.
Algorithm 1 does not ensure that an optimal integer solution for the problem is found. At the end of the column generation procedure, the optimality gap may still be greater than 0. To overcome this issue, we resort to a branch-and-bound procedure which is described in the next section.

Branch-and-Bound.
The branch-and-bound algorithm devised in this section is an exact algorithm that ensures that an optimal integer solution of the multiscenario max-min knapsack problem is found. The efficiency of branch-and-bound approaches depends critically on two main issues: the bounding strategy used to compute lower and upper bounds for the problem, and the branching strategy. By focusing on tightening the optimality gap, the former makes it possible to reduce the size of the search tree, while the latter is used to guide the search so that the bounding strategy can be more effective.
In our algorithm, upper and lower bounds are computed using the column generation procedure described in the previous section (Algorithm 1). For the branching part of the algorithm, we exploit the information provided at each node of the search tree by the last pricing subproblem defined within the column generation process.
Our branch-and-bound algorithm is described formally in Algorithm 2. The algorithm is based on a LIFO (Last In, First Out) strategy. At each node of the search tree, the column generation algorithm (Algorithm 1) is used to solve the master problem (15)-(21) taking into account the set of variables fixed according to the branching strategy. After completion, we get an upper bound ($UB$) at the current node. Let $\bar{p}_j$ denote the coefficients (profits) in the objective function (23) of the variables $x_j$ in the last pricing subproblem (23)-(25) solved during the column generation process, and let $\tilde{x}$ be the solution of this last pricing subproblem. Note that we exclude from $\tilde{x}$ those variables whose value has been fixed during the branching process. Furthermore, let $V^*$ be the value of the best solution of the multiscenario max-min knapsack problem found so far (incumbent solution).
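The LIFO exploration with a 1-branch followed by a 0-branch on backtracking can be illustrated with the following self-contained sketch. Two simplifications are made and should be read as assumptions: the upper bound at a node is the minimum over the scenarios of the fractional (greedy by ratio) knapsack bound, used here in place of the column generation bound, and the branching variable is simply the next item in index order. The sketch is therefore exact but much weaker than the algorithm of this paper; all names are hypothetical.

```python
def fractional_bound(scenario_profits, weights, free, cap):
    """Greedy profit-to-weight bound with a fractional last item."""
    total = 0.0
    order = sorted(free, key=lambda j: scenario_profits[j] / weights[j],
                   reverse=True)
    for j in order:
        if weights[j] <= cap:
            total += scenario_profits[j]
            cap -= weights[j]
        else:
            total += scenario_profits[j] * cap / weights[j]
            break
    return total


def branch_and_bound(weights, profits, capacity):
    """Depth-first (LIFO) search; weights are assumed positive."""
    n = len(weights)
    best_value, best_items = 0, ()
    stack = [((), 0)]  # (items fixed to 1, index of the next item to branch on)
    while stack:
        ones, nxt = stack.pop()  # last node pushed is explored first
        used = sum(weights[j] for j in ones)
        if used > capacity:
            continue  # infeasible node
        # The items fixed to 1 form a feasible solution: update the incumbent.
        value = min(sum(row[j] for j in ones) for row in profits)
        if value > best_value:
            best_value, best_items = value, ones
        if nxt == n:
            continue
        free = range(nxt, n)
        upper = min(
            sum(row[j] for j in ones)
            + fractional_bound(row, weights, free, capacity - used)
            for row in profits
        )
        if upper <= best_value:
            continue  # this node cannot improve the incumbent
        stack.append((ones, nxt + 1))           # x = 0, reached on backtracking
        stack.append((ones + (nxt,), nxt + 1))  # x = 1, explored first
    return best_value, best_items


print(branch_and_bound([4, 3, 2], [[10, 4, 3], [2, 9, 4]], 5))  # prints (7, (1, 2))
```

The bound is valid because the best worst-case profit can never exceed the best profit attainable under any single scenario, and the fractional bound overestimates each of those.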
Our branching scheme is based on the variables of the original compact formulation (3)-(9). In particular, the focus is put on the variables whose value has the greatest propensity to change in the optimal solution. From a heuristic standpoint, we consider that the $j$th variable is in this situation if it satisfies one of these two conditions: (1) the value $\tilde{x}_j$ in $\tilde{x}$ is equal to 1, and its reduced cost (its approximation $\bar{p}_j / w_j$) is weak, that is, $j^* = \arg\min_j \{\bar{p}_j / w_j : \tilde{x}_j = 1\}$; (2) the value of $\tilde{x}_j$ in $\tilde{x}$ is equal to 0, and its reduced cost (its approximation $\bar{p}_j / w_j$) is strong, that is, $j^* = \arg\max_j \{\bar{p}_j / w_j : \tilde{x}_j = 0\}$. These two conditions translate the fact that the $j^*$th variable may have more chances to change its value in the optimal solution if this one has not been found yet. The $j^*$th variable is then selected for branching. It is fixed first to the value 1, while in the backtracking phase, a solution is sought for the case where its value is fixed to 0. These branching constraints are easily translated into the master problem (15)-(21).

Computational Experiments
In order to compare our approach objectively with current state-of-the-art methods, we divided our tests into two parts. In the first part, we report on results obtained using instances of the max-min knapsack problem with two scenarios. Our branch-and-price algorithm is evaluated from different standpoints, namely by considering a limit on the total computing time, by analysing the quality of the upper bounds obtained from it, and by analysing its potential in finding optimal integer solutions in reasonable amounts of computing time. In the first case, the focus is put on the capacity of the algorithm in generating good incumbents for the problem, as if the algorithm was being used as a heuristic. In the second part of our experiments, we analyse the behavior of our approach in the general case where there are more than two scenarios. All the experiments were conducted on a PC with a 2.4 GHz processor and 4 GB of RAM. Additionally, we used version 11.1 of the CPLEX optimization solver.

Test Set I: Tests on Instances with Only Two Scenarios.
The experiments reported in this section were conducted on the weakly and strongly correlated instances used in [9], which were obtained from the generator described in [11]. We used three sets of instances which are divided according to the capacity of the knapsack. The capacity $c$ of the knapsack is set to $\alpha \times \sum_{j=1}^{n} w_j$, where $\alpha \in \{0.25, 0.5, 0.75\}$. For each value of $n$ (number of items) and $\alpha$, there are 15 instances. For the set of weakly correlated instances, we have $n \in \{5000, 7000, 10000, 13000, 15000, 18000, 20000\}$, while for the strongly correlated instances, we have $n \in \{500, 1000, 4000, 5000, 7000, 10000\}$. Our results are compared with the exact resolution of the original model (3)-(9) through the commercial solver CPLEX, and with the approaches described by Taniguchi et al. in [11] and by Hanafi et al. in [9].
The presentation of the results is organized as follows. In the first part, we compare all these approaches on the aforementioned instances for a very limited execution time (5 seconds). The objective is to evaluate the capacity of each approach in finding good feasible solutions quickly, as if they were used as heuristics. In the second part, we compare the quality of the upper bounds given by our approach with those obtained with CPLEX and the approach of Taniguchi et al. [11]. Note that the algorithm described by Hanafi et al. [9] initially uses the same upper bounds as CPLEX before improving them by adding cuts during the execution of their algorithm. Finally, we compare the results of the different approaches for a larger execution time to evaluate the capacity of each one in finding proven optimal solutions for the problem.

Evaluating the Branch-and-Price Algorithm as a Heuristic.
In this section, we evaluate the capacity of our branch-and-price algorithm in finding good feasible solutions for the problem quickly, and we compare it with the methods described by Taniguchi et al. [11] and by Hanafi et al. [9]. For this purpose, we used a time limit of 5 seconds. The results for the instances described above are presented in Tables 1 and 2. The meaning of the corresponding columns is the following: (i) $n$: number of items in the corresponding instance; (ii) #t: number of times the corresponding algorithm found the best solution; (iii) $t$: average computing time (in seconds) required to find the corresponding solution.
Table 1 shows that CPLEX provides the best solutions in a significant number of cases, although this performance tends to decrease with the size of the problem. The approach of Taniguchi et al. [11] exhibits the worst results among all the methods that were tested. The performance of the approach of Hanafi et al. [9] is comparable to the performance of our branch-and-price algorithm, although it tends to be better on these instances for larger values of $\alpha$. In fact, the algorithm of Hanafi et al. fixes the values of the variables by exploiting its improved upper and lower bounds. These fixing rules are useful when the correlation between the coefficients of the problem is weak. The fixing of a large number of variables in the algorithm of Hanafi et al. helps it to deal with this type of instance independently of the value of $\alpha$. In contrast, this parameter has a non-negligible impact on our algorithm. When $\alpha = 0.25$, our algorithm is the best among all the methods, while it is the second best for $\alpha = 0.5$ and $\alpha = 0.75$. The value of $\alpha$ defines the capacity of the knapsack. The capacity of the knapsack becomes larger as the value of $\alpha$ increases, which affects the dynamic programming algorithm used to solve the pricing subproblems in column generation.
For the strongly correlated instances (Table 2), the performance of CPLEX when used directly on (3)-(9) tends to decrease with the size of the problem independently of the value of $\alpha$. Among all the algorithms, the algorithm of Taniguchi et al. had the most difficulty finding the best solutions, even for the small size instances. This difficulty increases with the size of the problem, while it seems independent of the value of $\alpha$. Table 2 shows that the algorithm of Hanafi et al. outperforms these two approaches, although its performance also decreases with the size of the problem. The branch-and-price algorithm, whose results are provided in the last columns of Table 2, clearly outperforms all the other approaches. The algorithm found all the best solutions in a small amount of time. The computing time increases with the size of the instance, but it remained almost always less than one second on average.

Quality of the Upper Bounds.
In this section, we compare the upper bounds provided by the different approaches. The quality of the upper bounds is important to improve the convergence of branch-and-bound algorithms, and in methods that rely on fixing the values of the variables. Using the same instances as in the previous tests, we compare the upper bounds given by the linear relaxation of (3)-(9) (solved with CPLEX) with the algorithm of Taniguchi et al. [11] and Algorithm 1 applied to the master problem (15)-(21). We do not compare these upper bounds with those obtained with the method of Hanafi et al. [9] because the latter relies on the resolution of the model (3)-(9) and, as a consequence, it provides upper bounds which are similar to those reported for CPLEX. Table 3 gives the average results for the weakly and strongly correlated instances. Column $UB$ associated with CPLEX presents the average upper bound provided by CPLEX over the different groups of 15 instances. The columns $\Delta$ related to the algorithm of Taniguchi et al. and Algorithm 2 give the average absolute gap between their corresponding upper bound and the upper bound provided by CPLEX ($\Delta$ = CPLEX upper bound - upper bound provided by the corresponding algorithm). Hence, in Table 3, when $\Delta$ is positive, the upper bound of the corresponding algorithm is better than the one provided by CPLEX.
Table 3 shows that the bounds provided by our column generation algorithm are always better than those provided by the other two approaches. Recall that since our pricing subproblem does not have the integrality property, the upper bounds provided by our reformulation (15)-(21) are potentially better than those obtained by using the compact model (3)-(9). Furthermore, the approach of Taniguchi et al. relies on a surrogate relaxation that is worse than the linear relaxation of (3)-(9). The comparative quality of the upper bounds provided by our column generation increases as the instances become more strongly correlated.

Evaluating the Branch-and-Price Algorithm as an Exact Algorithm.
Here, we report on the results obtained with our branch-and-price algorithm on the strongly correlated instances with a time limit of 10 minutes. The objective is to evaluate the capacity of our algorithm in finding proven optimal solutions for the problem. The results are presented in Table 4. Column # gives the number of proven optimal solutions found for each set of 15 instances, while column $t$ indicates the average computing time required to reach the corresponding solution.
Our branch-and-price algorithm found the optimal solution for 181 instances out of 270 within the time limit, which represents more than 67% of the tested instances. In many cases, the solution is found in the first nodes of the branching tree, which is due to the quality of the bounds computed using our column generation approach. For the other cases where the optimality of the solution has not been proved, the optimality gap remains much smaller than the one obtained by using CPLEX with the same time limit. The other approaches from Taniguchi et al. [11] and Hanafi et al. [9] are also clearly outperformed by the branch-and-price algorithm described in this paper. Indeed, the approach of Taniguchi et al. solved only 3 of the 270 instances up to optimality, while the algorithm of Hanafi et al. solved exactly only 5 of these 270 instances.

Test Set II: Tests on Instances with More Than Two Scenarios.
To evaluate the performance of our branch-and-price algorithm on general instances of the multiscenario max-min knapsack problem, we generated a set of instances using the generator proposed in [7]. In particular, to test the limits of our algorithm, we consider the case of strongly correlated instances with $\alpha = 0.75$. Recall that the capacity of the knapsack increases with $\alpha$. This increase has a non-negligible impact on the resolution of the pricing subproblem through

Table 1: Comparative results on weakly correlated instances for a limited execution time (5 seconds).

Table 2: Comparative results on strongly correlated instances for a limited execution time (5 seconds).