Top-k Based Adaptive Enumeration in Constraint Programming

1 Pontificia Universidad Católica de Valparaíso, 2362807 Valparaíso, Chile
2 Universidad Autónoma de Chile, 7500138 Santiago, Chile
3 Universidad Central de Chile, 8370178 Santiago, Chile
4 Universidad Finis Terrae, 7501015 Santiago, Chile
5 Facultad de Ingeniería y Tecnología, Universidad San Sebastián, 8420524 Santiago, Chile
6 CNRS, LINA, University of Nantes, 44322 Nantes, France
7 Universidad de Valparaíso, 2362735 Valparaíso, Chile
8 Universidad Técnica Federico Santa María, 2390123 Valparaíso, Chile
9 Escuela de Ingeniería Industrial, Universidad Diego Portales, 8370109 Santiago, Chile


Introduction
Constraint programming (CP) is a modern and efficient programming paradigm for solving complex constraint satisfaction and optimization problems. It is commonly employed for solving real-life problems in various application domains; practical examples can be seen in rostering [1], scheduling [2], manufacturing [3], engineering [4], and bioinformatics [5], among others. Under this paradigm, problems are seen as constraint networks, which consist of a sequence of variables, each with a domain of values, and a set of constraints imposing conditions that those variables must satisfy. A solution is then an assignment of values to the variables that satisfies the whole set of constraints. Constraint networks, also named constraint satisfaction problems (CSPs), are usually solved by a backtracking-based algorithm that explores a search tree in which the potential solutions are distributed. The exploration proceeds by combining two main phases: enumeration and propagation. Enumeration is the process of assigning values to variables in order to generate partial solutions, while propagation attempts to remove from the constraint network the conflicting values that do not lead to any feasible solution.
The enumeration process involves two actions. First, a variable from the constraint network is selected, and then a value is assigned to this variable. These actions are handled by the variable and value ordering heuristics, which together form the enumeration strategy. Enumeration is a crucial phase in constraint satisfaction that may lead to very different solving processes depending on the strategy employed. Indeed, by selecting the correct strategy, a solution can be reached without performing backtracks and consequently in less solving time. However, a main concern in this context is the difficulty of selecting the correct strategy a priori, as the performance of strategies is notably hard to predict.
In recent years, various approaches have been proposed to tackle this concern; most of them are based on performing adaptive enumeration during the search process. The idea is to enable the search algorithm to control and automatically choose its own ordering heuristics based on information collected during the solving process. A preliminary framework is proposed in [6], which allows the solver to choose and replace its heuristics during solving time by using priorities. The idea is to penalize badly performing strategies while the efficient ones receive more credit. The quality of strategies is evaluated by using performance indicators of the search process. A more modern and efficient approach is presented in [7], where a hyperheuristic is proposed to control the enumeration. A hyperheuristic [8, 9] can be seen as a method to choose heuristics [10]; in particular, a metaheuristic is used on top of the hyperheuristic in order to select and adapt the ordering heuristics. Although this hyperheuristic approach has shown promising results [11-13], the top layer is quite expensive, adding a significant overhead to the whole resolution of the constraint satisfaction problem.
In this paper, we propose a novel and more lightweight framework for adaptive enumeration. We follow a different approach that avoids the use of hyperheuristics by incorporating a simple and efficient classification technique named Top-$k$ (see Figure 1) in order to rapidly evaluate and rank the strategies during the resolution. We report encouraging results in which our approach competes favorably with classical and modern adaptive enumeration methods for constraint satisfaction.
The rest of this paper is organized as follows. The related work is given in Section 2, followed by an overview of constraint programming and associated concepts in Section 3. The new Top-$k$ based approach for adaptive enumeration is presented in Section 4. In Section 5, we present the experiments and the analysis of results. Finally, we give conclusions and some future research directions.

Related Work
In constraint solving, the enumeration strategy is composed of a pair of variable and value ordering heuristics. The enumeration is adaptive when the solver is able to control and automatically choose its own ordering heuristics. We can distinguish two dimensions depending on the source of feedback information: online and offline methods. The feedback, when used, corresponds to the information that is learned while solving (online) or while using a set of training instances (offline). In the context of offline methods, preliminary approaches proposed to sample and learn good strategies after solving a problem. For instance, in [14-16] learning algorithms are used to analyze the resolution process of successfully solved problems. This information is then used by advisors that are able to recommend good heuristics. Another interesting approach is suggested in [17], which proposes to select the variable with the largest weighted degree. The weighted degree is computed by associating weights with constraints; a weight is incremented each time its constraint leads to the emptying of a variable's domain. The weighted degree of a variable then corresponds to the sum of the weights of the constraints in which it participates. This idea follows the contention principle, which states that variables directly related to conflicts are more likely to cause failures. A variation of the weighted degree strategy, known as random probing, is reported in [18, 19]; it incorporates sampling during an initial gathering phase, arguing that initial choices are often the most important.
Among online methods, a pioneering work is the one presented in [6]. This framework introduced a four-component architecture that allows the dynamic replacement of enumeration strategies. The strategies are evaluated via performance indicators of the search process, and better evaluated strategies replace worse ones during solving time. This framework served as the basis for several related works. For instance, a more modern approach based on this idea is reported in [7]. This approach employs a two-layered framework in which a hyperheuristic placed on the top layer controls the dynamic selection of the enumeration strategies of the solver placed on the lower layer. A hyperheuristic can be regarded as a method to choose heuristics [10]. In this approach, two different top layers have been proposed, one using a genetic algorithm [11, 12] and another using a particle swarm optimizer [13]. Similar approaches have also been implemented for solving optimization problems instead of pure CSPs [21].
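The weighted degree bookkeeping described above can be sketched as follows. This is a minimal illustration, not the implementation of [17]: the class name `WeightedDegree`, its method names, and the initial weight of 1 per constraint are assumptions made for the example.

```python
class WeightedDegree:
    """Tracks constraint weights and ranks variables by weighted degree."""

    def __init__(self, constraints):
        # constraints: mapping from a constraint name to the variables in its scope.
        self.scopes = {name: tuple(scope) for name, scope in constraints.items()}
        # Assumption: every constraint starts with weight 1.
        self.weights = {name: 1 for name in self.scopes}

    def record_wipeout(self, constraint_name):
        # Called when propagating this constraint empties some variable's domain.
        self.weights[constraint_name] += 1

    def weighted_degree(self, variable):
        # Sum of the weights of the constraints in which the variable participates.
        return sum(
            weight
            for name, weight in self.weights.items()
            if variable in self.scopes[name]
        )

    def select_variable(self, unassigned):
        # Pick the unassigned variable with the largest weighted degree.
        # Ties go to the earliest variable in `unassigned`.
        return max(unassigned, key=self.weighted_degree)
```

A solver would call `record_wipeout` from its propagation loop and `select_variable` at each enumeration step, so variables involved in frequently failing constraints are tried first.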

Preliminaries
In this section, we briefly survey the basic concepts associated with constraint programming.
Definition 1 (constraint). A constraint $c$ is a relation defined on a sequence of variables $X(c) = (x_{i_1}, \ldots, x_{i_{|X(c)|}})$, called the scheme of $c$. $c$ is the subset of $\mathbb{Z}^{|X(c)|}$ that contains the tuples of values that satisfy $c$.
Definition 2 (constraint network). A constraint network, also known as a constraint satisfaction problem, is defined by a triple $N = \langle X, D, C \rangle$, where $X$ is a sequence of variables, $D$ is the corresponding set of domains, and $C$ is a set of constraints.
Given an instantiation $I$, $I[x_i]$ denotes the value that $I$ assigns to $x_i$. A solution to a network $N$ is an instantiation $I$ on $X = (x_1, \ldots, x_n)$ that is valid and satisfies all the constraints. A solution is denoted as sol, and $\mathrm{SOL}(N)$ corresponds to the set of solutions of $N$.
Example 5 (the Sudoku puzzle). As an example, let us consider the Sudoku problem as a constraint network. The problem consists of filling a 9 × 9 grid with prefilled cells, divided into nine 3 × 3 regions, so that each column, row, and region contains different digits from 1 to 9. Let $N = \langle X, D, C \rangle$ be the constraint network, which is composed of the following:
(i) $X$ is the sequence of variables, one per cell of the grid;
(ii) $D$ is the corresponding set of domains, where $D(x_i) \in D$ is the domain of the variable $x_i$;
(iii) $C$ is the set of constraints, which comprises (a) constraints guaranteeing that values differ in rows and columns and (b) constraints guaranteeing that values differ in each subregion.
An instance and the corresponding solution to a Sudoku puzzle can be found in Figure 2.
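To make the Sudoku model concrete, the sketch below enumerates the scopes of the row, column, and subregion constraints and checks a filled grid against them. The function names `sudoku_scopes` and `is_solution` and the 0-based `(row, col)` cell indexing are assumptions made for this illustration, not the paper's encoding.

```python
def sudoku_scopes():
    """Return the 27 all-different scopes of a 9x9 Sudoku as lists of (row, col) cells."""
    rows = [[(r, c) for c in range(9)] for r in range(9)]
    cols = [[(r, c) for r in range(9)] for c in range(9)]
    regions = [
        [(br + dr, bc + dc) for dr in range(3) for dc in range(3)]
        for br in range(0, 9, 3)
        for bc in range(0, 9, 3)
    ]
    return rows + cols + regions


def is_solution(grid):
    """Check that a fully filled 9x9 grid satisfies every all-different scope."""
    digits = set(range(1, 10))
    return all(
        {grid[r][c] for r, c in scope} == digits for scope in sudoku_scopes()
    )
```

Each cell belongs to exactly three scopes (its row, its column, and its region), which is why a single assignment can trigger pruning in up to 20 other cells during propagation.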

Constraint Solving.
Constraint solving is usually handled by building a search tree composed of potential solutions. Those potential solutions are then checked to determine whether they can lead to a feasible solution. If not, the search tree is pruned by deleting the infeasible values from the domains. The search process is commonly divided into two phases: enumeration and propagation. The enumeration can be seen as a sorting process [22, 23] in which the potential solutions are incrementally constructed by assigning values to variables. The propagation is the phase responsible for pruning the tree.
A general procedure for constraint solving is illustrated in Algorithm 1. As input, it receives the constraint network $N$ and the enumeration strategy to be employed. The output is a solution $\mathrm{sol} \in \mathrm{SOL}(N)$ to the constraint network or a failure, meaning that no solution has been found. The process begins by initiating a while loop that iterates over a set of actions until a solution or a failure is found. The first action corresponds to selecting a variable from the set $X$ and then associating it with a value from its domain, according to the enumeration strategy. The propagation phase is called next in order to delete from the domains the values that will not lead to any solution given the current instantiation. This phase may empty the domain of a given variable, which is known as a domain wipe-out (DWO). If a DWO happens in a future variable, a shallow backtrack assigns the next available value to the current variable. Finally, if a DWO occurs in the current variable, the search backtracks to the most recently instantiated variable that is still able to reach a solution.
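The enumeration and propagation cycle can be sketched in a few lines. This is a simplified recursive variant, not Algorithm 1 itself: it assumes binary constraints, uses forward checking as the propagation step, and all function names (`solve`, `queens_consistent`, `min_domain_first`, `ascending`) are hypothetical.

```python
def solve(domains, consistent, select_variable, order_values):
    """Backtracking search with forward checking.

    domains: dict mapping each variable to a set of candidate values.
    consistent(x, a, y, b): True if x=a and y=b can hold together.
    select_variable(doms, unassigned): variable ordering heuristic.
    order_values(var, values): value ordering heuristic.
    Returns a dict variable -> value, or None on failure.
    """
    def search(doms, assignment):
        unassigned = [v for v in doms if v not in assignment]
        if not unassigned:
            return dict(assignment)
        var = select_variable(doms, unassigned)         # enumeration: pick a variable
        for value in order_values(var, doms[var]):      # enumeration: pick a value
            pruned = {v: set(doms[v]) for v in doms}
            pruned[var] = {value}
            wiped_out = False
            for other in unassigned:                    # propagation: forward checking
                if other == var:
                    continue
                pruned[other] = {
                    b for b in pruned[other] if consistent(var, value, other, b)
                }
                if not pruned[other]:                   # domain wipe-out (DWO)
                    wiped_out = True
                    break
            if not wiped_out:
                assignment[var] = value
                result = search(pruned, assignment)
                if result is not None:
                    return result
                del assignment[var]
            # Shallow backtrack: fall through to the next value of `var`.
        return None                                     # no value works: backtrack

    return search({v: set(d) for v, d in domains.items()}, {})


def queens_consistent(x, a, y, b):
    # Variables are column indices, values are row indices.
    return a != b and abs(a - b) != abs(x - y)


def min_domain_first(doms, unassigned):
    return min(unassigned, key=lambda v: len(doms[v]))


def ascending(var, values):
    return sorted(values)
```

For example, `solve({c: set(range(8)) for c in range(8)}, queens_consistent, min_domain_first, ascending)` returns a placement for the 8-queens problem. Swapping `min_domain_first` or `ascending` for other heuristics changes the enumeration strategy without touching the search procedure.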

Adaptive Enumeration via Top-𝑘
In this section, we present the framework for adaptive enumeration via the Top-$k$ approach. The idea is to evaluate a portfolio of enumeration strategies via performance indicators and then to rank it at run time, in order to use the most appropriate strategy at each step of the process (see Figure 3). Our goal here is to avoid the use of hyperheuristics and thus provide a faster process while maintaining the quality of solutions. In the following, we briefly describe the Top-$k$ approach and its incorporation into the adaptive enumeration framework.

Top-𝑘.
In database systems, a Top-$k$ query [24] aims at retrieving the $k$ tuples that best satisfy a given set of preferences. This is a widely used technique, as selecting data according to user requirements is required in many real-life domains such as the Web, multimedia search, distributed systems, and multicriteria decision making. A common way to identify the Top-$k$ objects is to assign a score to each object according to its relevance within the tuple and then to compute a scoring function for all involved objects. There exist different techniques for processing Top-$k$ queries, which are mainly related to database access, design, and implementation aspects (see [24] for a detailed presentation). We adopt the monotone ranking function approach, which is the most appropriate one for our purposes, as it can straightforwardly be mapped to a scoring function for evaluating performance. Another interesting feature is the ability to upper-bound object scores. This allows one to prune certain objects early without knowing their exact scores, which lightens the function computation and consequently accelerates the whole solving process.
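For reference, the usual notion of monotonicity in the Top-$k$ literature can be stated as below. The symbols $F$, $x_i$, and $x'_i$ are generic placeholders assumed here and are not necessarily the notation used in this paper.

```latex
\[
F(x_1, \ldots, x_n) \le F(x'_1, \ldots, x'_n)
\quad \text{whenever } x_i \le x'_i \text{ for every } i \in \{1, \ldots, n\}.
\]
```

Under this property, evaluating $F$ on upper bounds of the individual values yields an upper bound on the overall score, which is what makes early pruning possible.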
Then, assuming that an indicator $a$ of the search process can be seen as a predicate $p_i$, a scoring function $F$ for evaluating the performance of enumeration strategies by using $n$ indicators can be generically defined over $a_1, \ldots, a_n$, where $a_i$ with $i \in \{1, \ldots, n\}$ is an indicator that measures the performance of the strategy in a given amount of time. However, according to [12], indicators do not have the same relevance for evaluating a solving process; a weight is therefore associated with each indicator to balance its effect on the scoring function. Finally, the general scoring function for a given enumeration strategy $S_j \in S$ is defined in terms of the weighted indicators, where $w_i$ is the weight associated with indicator $a_i^{S_j}$, which in turn provides a score for the strategy $S_j$ with respect to a given performance criterion.
Definition 7 (Top-$k$ result set). Let $S$ be the set of enumeration strategies used in the adaptive enumeration framework. The Top-$k$ result, denoted by $\mathrm{Top}_k(S)$, is a set sorted on the score, in increasing order, such that (1) $\mathrm{Top}_k(S) \subseteq S$; (2) if $\|S\| < k$, then $\mathrm{Top}_k(S) = S$; otherwise $\|\mathrm{Top}_k(S)\| = k$; (3) for every $s \in \mathrm{Top}_k(S)$ and every $s' \in S \setminus \mathrm{Top}_k(S)$, the score of $s$ is at least as good as the score of $s'$.
Definition 8 (result's rank). Given a Top-$k$ result set $R$, the rank of a result $r \in R$ is the position of $r$ in the set $R$.
Because our approach deals neither with hard disk issues nor with the processing of large volumes of data in main memory, we use the most straightforward way to compute the Top-$k$ result set: all the strategies are sequentially scanned, and the score of each of them is calculated and placed in a sorted list. The adaptive enumeration framework then selects the best enumeration strategy in order to continue with the solving process.
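The sequential scan described above can be sketched as follows. Two points are assumptions made for the illustration rather than details confirmed by the text: the score is taken to be a weighted sum of indicator values, and higher scores are treated as better. The indicator names used in the usage note are hypothetical.

```python
def score(indicators, weights):
    """Weighted sum of indicator values (the weighted-sum form is an assumption)."""
    return sum(weights[name] * value for name, value in indicators.items())


def top_k(portfolio, weights, k):
    """Scan every strategy, score it, and return the k best as (name, score) pairs.

    portfolio: dict mapping a strategy name to its dict of indicator values.
    Higher scores are treated as better (an assumption); ties keep portfolio order.
    """
    ranked = sorted(
        ((name, score(ind, weights)) for name, ind in portfolio.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]
```

With weights `{"depth_gain": 2.0, "fixed_vars": 1.0}` and a three-strategy portfolio, `top_k(portfolio, weights, 1)` yields the single strategy the solver would continue with. Sorting the whole portfolio is affordable here because the portfolio is small; a larger one could instead keep a bounded heap of size `k`.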
Algorithm 2 depicts, from a high-level standpoint, the integration of the Top-$k$ approach into a classic constraint solving algorithm. The algorithm receives as input the constraint network $N$, the set of strategies $S$, and the set of indicators; the output is a solution $\mathrm{sol} \in \mathrm{SOL}(N)$ to the constraint network or a failure. The process begins by setting up the portfolio, the indicators of the solving process, and a cutoff value, which is usually a given number of steps (other cutoff criteria can be used, such as the percentage or number of fixed variables, the number of visited nodes, or the number of backtracks). Then, the solver attempts to solve the constraint network until the cutoff is reached, and the best strategy is selected by using Top-$k$. Once the best $S_j$ is selected, solving continues and, at every step, the indicators provide the corresponding values, which are used to compute the scoring function. In this way, the best $S_j$ is employed next during the resolution process. Finally, if a solution $\mathrm{sol} \in \mathrm{SOL}(N)$ exists, it is reported.
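The control flow just described can be sketched as a loop that alternates between a solving window and a re-ranking step. This is a simplified illustration, not Algorithm 2: the solver is abstracted behind a hypothetical callback `run_until_cutoff`, only the strategy that was just used has its score refreshed, and the `max_rounds` safeguard is an added assumption.

```python
def adaptive_solve(strategies, run_until_cutoff, score, max_rounds=100):
    """Alternate between solving for one cutoff window and re-ranking the portfolio.

    strategies: list of strategy names (the portfolio).
    run_until_cutoff(strategy): hypothetical callback that runs the solver with
        `strategy` until the cutoff and returns (solution_or_None, indicators).
    score(indicators): scoring function; higher is assumed to be better.
    Returns (solution, history of strategies used); solution is None on failure.
    """
    scores = {s: 0.0 for s in strategies}  # every strategy starts with the same score
    current = strategies[0]
    history = []
    for _ in range(max_rounds):
        history.append(current)
        solution, indicators = run_until_cutoff(current)
        if solution is not None:
            return solution, history
        # Refresh the score of the strategy that was just used.
        scores[current] = score(indicators)
        # Continue with the top-ranked strategy (ties go to the earliest one).
        current = max(strategies, key=lambda s: scores[s])
    return None, history
```

Keeping the solver behind a callback makes the ranking logic independent of the cutoff criterion, so a step count, a number of backtracks, or a number of visited nodes can be used without changing the loop.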

Experimental Evaluation
This section presents the experimental results of the proposed approach. The Top-$k$ adaptive enumeration has been implemented on the ECLiPSe Constraint Logic Programming Solver v5.10 interfaced with Java. The experiments have been run on a 3.3 GHz Intel Core i3 with 8 GB RAM running Ubuntu. We test our approach using different instances of the following classic benchmarks: (i) the $n$-queens problem with $n \in \{8, 10, 12, 15, 20, 50\}$, (ii) the magic squares with size $\in \{3, 4, 5\}$, (iii) the Sudoku puzzle, and (iv) the knight's tour with size $\in \{5, 6\}$.
We provide the encodings of the tested benchmarks in the Appendix. As in previous works [4], we use 65535 steps as the stop criterion; problems having no solution at this point are marked t.o. (time out). Once the problem is launched, the step counter starts at 0 and is incremented by 1 each time a variable is instantiated by enumeration. The adaptive enumeration uses a portfolio of eight strategies, which are described in Table 1. The indicators and weights employed are described in Table 2. The weights have been collected from experience and from tuning phases carried out in previous works [12, 13] in order to correctly represent the relevance that each indicator has in the scoring function.
We compare the adaptive enumeration based on Top-$k$ with the two most recently reported online methods based on hyperheuristics, one controlled by a genetic algorithm (HH-GA) [25] and the other by particle swarm optimization (HH-PSO) [13]. The portfolio used by the hyperheuristics is the same as the one used by the Top-$k$ approach. We also include in the comparison the results of using a single strategy during the complete solving process ($S_1$ to $S_8$). Tables 3 and 4 report the solving times needed to reach a solution. The results show that Top-$k$ is able to outperform both hyperheuristics and single strategies for small instances of the $n$-queens problem. Top-$k$ is about 20 times faster than HH-PSO and about 30 times faster than HH-GA for $n = 8$; this difference increases for $n = 10$ and $n = 12$, and Top-$k$ remains about 7 times faster than the hyperheuristics for the largest tested instance of $n$-queens.
For the Sudoku puzzle, Top-$k$ is not able to improve on previous performance, but for magic squares and the knight's tour, Top-$k$ performs notably better than the hyperheuristics while remaining competitive with single strategies. Indeed, Top-$k$ takes first place for the knight's tour (size = 6), where single strategies are not able to solve it before the stop criterion. These results provide evidence of the ability of Top-$k$ to dynamically adapt the search in order to use the best strategy for each part of the search tree, producing better results than most single strategies. Certainly, selecting the correct variable and value ordering for the different regions of the search space can have a dramatic effect on the performance of the backtracking algorithm, with huge variances in solving performance [12, 13, 17, 26-28]. This is explained by the fact that applying the correct strategy to the correct part of the search tree should lead to more promising instantiations (variable-value assignments leading to solutions) than failures (variable-value assignments not leading to any solution), consequently reducing the number of backtracks and the runtime.
We also consider in the experimental evaluation the number of backtracks needed to reach a solution as a performance indicator (see Tables 5 and 6), since it measures how many times the process fails in instantiating a variable. Here, we show that in adaptive enumeration the number of backtracks is not directly related to solving time. Indeed, the Top-$k$ approach may incur more failures before reaching a result and still obtain shorter solving times than the hyperheuristics (see, e.g., $n$-queens with $n = 20$ and $n = 50$; magic squares with size = 5). This is explained by the fact that Top-$k$ is a more lightweight component than the optimizer on top of the hyperheuristics, which is able to find good heuristic configurations but adds a noticeable overhead to the whole solving process. Indeed, the Top-$k$ algorithm used runs in $O(n^2)$, while the optimizers used in HH-GA and HH-PSO run in exponential time. A graphical comparison of the adaptive enumeration approaches in terms of runtime and backtracks can be seen in Figures 4 and 5, respectively.

Conclusions
In this paper, we have presented a new and more lightweight framework for adaptive enumeration. The idea is to reduce the cost of the hyperheuristics used in previous approaches so as to lighten the overall work and obtain faster solving processes. To this end, we have incorporated a simple and efficient classification technique named Top-$k$ in order to rapidly evaluate and rank the strategies during the resolution. We have performed a set of experiments on different instances of classic benchmarks in which the Top-$k$ based adaptive enumeration is able to outperform, in several cases, the most recent adaptive enumeration methods reported in the literature while keeping the quality of solutions. Indeed, considerable runtime reductions are achieved in several instances of the tested problems.
We envision different directions for future work; a straightforward one is the incorporation of new combinations of strategies to provide a bigger portfolio, while finding the balance needed to avoid increasing the cost of the rank computation. Another interesting idea is to experiment with new and possibly more lightweight optimizers in order to reduce the cost of the hyperheuristic approach. Finally, a similar adaptive framework could be used to interleave different propagation techniques.

Figure 1: General flowchart of the adaptive enumeration based on Top-$k$.

Figure 2: An instance and the corresponding solution to a Sudoku puzzle.
Figure 3: Ranking the portfolio of enumeration strategies ($S_1$ to $S_8$).

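Table 2: Search process indicators. Entries include: st($S_j$) (listed with the value 10), the number of steps since the last time that enumeration strategy $S_j$ was used, up to step st; previous depth, where a positive value means that the current node is deeper than the one explored at the previous step; and an indicator for the situation in which the solving process alternates enumerations and backtracks on a small number of variables without succeeding in having a strong orientation, calculated from $d_t$, the current depth in the search at a given time $t$.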

Table 3: Runtime in ms for different instances of the N-queens problem with different strategies.

Table 4: Runtime in ms for magic squares, Sudoku, and the knight problem with different strategies.

Table 5: Number of backtracks solving different instances of the N-queens problem with different strategies.

Table 6: Number of backtracks solving magic squares, Sudoku, and the knight problem with different strategies.