Column Generation for a Multitrip Vehicle Routing Problem with Time Windows , Driver Work Hours , and Heterogeneous Fleet

This study addresses a vehicle routing problem with time windows, accessibility restrictions on customers, and a fleet that is heterogeneous with regard to capacity and average speed. A vehicle can performmultiple routes per day, all starting and ending at a single depot, and it is assigned to a single driver whose total work hours are limited. A column generation algorithm is proposed.The column generation pricing subproblem requires a specific elementary shortest path problem with resource constraints algorithm to address the possibility for each vehicle performingmultiple routes per day and to address the need to set the workday’s start time within the planning horizon. A constructive heuristic and a metaheuristic based on tabu search are also developed to find good solutions.


Introduction
Transportation is an important area for logistics studies, and on average, it absorbs a higher percentage of the total logistics costs than other logistic activities [1].Vehicle routing problems belong to a broad category of operational research problems known as network optimisation problems.
A variant of the vehicle routing problem with time windows (VRPTW) is considered here, with the highlights that each vehicle is assigned to a single driver who can travel several routes during a workday and be within the limit of his/her work hours.This feature requires the optimal point in time to start each route to be identified.Some customers have accessibility restrictions, thereby requiring specific vehicles.Concerning the objective function, the purpose is to minimise the total routing cost, which depends on both the distance travelled and vehicles used.
Great advancements have been obtained for different VRPTW variants, such as those by Solomon and Desrosiers [2] and Desrochers et al. [3].Approaches with column generation for set partitioning formulations of the VRPTW and other variants have been presented as well.Desrochers et al. [4] proposed the use of the column generation technique to solve the linear relaxation of the VRPTW set partitioning formulation.Columns are added, when necessary, by means of the resolution of the shortest path problem with time windows and capacity constraints using a dynamic programming algorithm.The solutions obtained generally yielded excellent dual bounds which are used in a branchand-bound algorithm to solve the integer formulation.This same approach is used in the present study.
Obtaining solutions to the vehicle routing problem with multiple routes, but without time windows, has been approached by means of heuristics by Taillard et al. [5], Brandão and Mercer [6], Olivera and Viera [7], and Petch and Salhi [8,9], among others.In the study by Taillard et al. [5], different routes generated by tabu search are combined to produce workdays through the resolution of a bin-packing problem.In the present work, the tabu search acts on a space with multiple routes without the need to combine them to form workdays.This approach, which is more costly for seeking neighbour solutions within a larger space, eliminates the computational cost of analysing the feasibility of combining different routes and finding a good combination.
Azi et al. [10] proposed a method, which is either exact or heuristic, to solve the VRPTW with multiple routes.It is a generalization of a previous method (Azi et al. [11]) for the same problem, but with only one vehicle available.In their work, a preprocessing is conducted to generate the set of all feasible routes or a subset of these routes so that a branchand-price algorithm is employed to find the optimal solution over the set of routes generated during the preprocessing stage.In the branch-and-price algorithm of Azi et al. [10], the pricing subproblem, which can be cast as an elementary shortest path problem with resource constraints, consists of finding a sequence of feasible routes on a network where each vertex corresponds to such a route.Alternatively, in the present study, the subproblem consists of finding a sequence of customers on a network where a vertex corresponds to a customer or the depot, and no preprocessing is required to find the optimal solution over the set of all feasible routes.
Concerning the treatment typically given to a workday, it is considered that a number of working hours cannot be exceeded for any route, forcing the last customer to be served within a maximum time interval from when the route began [10,12].Cordeau and Laporte [13] proposed a tabu search heuristic for the static dial-a-ride problem with route duration constraints that evaluates the capacity to delay the departure time from the depot to help improve solution feasibility during the neighbourhood search.In the present problem, a similar control is necessary, but it is applied to multiple routes forming a workday in an exact, rather than heuristic, context.
Ceselli et al. [14] solved a real-world, multitrip vehicle routing problem with driver constraints that are similar to those in this study.In their work, each vehicle is described by a maximum daily duty length and, according to work rules, the waiting time and unloading time count as rest time or driving time.In the present problem, the waiting time must always be counted as driving time, which requires the ability to handle the freedom of a variable workday start time.
It is also worth stressing that classical dynamic programming algorithms developed to solve the shortest path problem with resource constraints, such as in Feillet et al. [15], do not include considerations about the determination of the beginning of each vehicle path in relation to the planning horizon.In the present study, the instant when the workday starts is a decision to be made together with the determination of the sequence of customers to be visited.
This characteristic, together with the possibility of the vehicle being able to carry out multiple routes per day, is treated by a specific algorithm to solve the subproblem in the ambit of column generation.This is the major contribution of this work, in addition to providing a framework that also contains a constructive heuristic and a very effective tabu search.
It should be stressed that the vehicle routing problem considered in this study generalizes the VRPTW and is therefore NP hard [16].
The paper is organised as follows: Section 2 provides a brief description of the characteristics of the problem studied and of the objective function to be optimised; Section 3 presents a mathematical formulation; Section 4 describes procedures and algorithms; Section 5 presents parameter settings, problem sets, and the results obtained.The conclusions are provided in Section 6.

Problem Description
The vehicle routing problem studied in this work consists of determining the workday of each vehicle in the fleet to minimise the total cost of the distribution operation from a single depot.
A vehicle workday is defined as a sequence of customers to be visited; for this purpose, the vehicle may return to the distribution centre to be reloaded and then start a new delivery route.That is, a workday route is defined as a sequence of customers who must be visited by the vehicle in a specified order after the vehicle is loaded at the distribution centre.Thus, a workday consists of a feasible sequence of routes throughout the day.For convenience purposes, the maximum number of routes in a workday is predetermined.Vehicles that are not used run a fictitious route in which a vehicle leaves the distribution centre and returns to it without visiting any customer at no cost.
The demand of each customer is known before the day starts, and every customer must be served only once by a vehicle within the planning horizon; that is, a delivery is not fragmented and is compulsory.
Each customer has a specific service time that depends on the vehicle used to perform the service and a strict time window that must be respected.If the vehicle reaches a customer before the opening of its time window, it must wait until the time window opens to begin the service.If the vehicle arrives after the end of the time window, the service is not provided.
Some customers can only be visited by specific vehicles due to accessibility restrictions at the delivery location.
The fleet is heterogeneous in terms of both capacity and average displacement speed.The cost of a vehicle is variable, depending only on the total distance travelled and vehicle type.
The total number of hours worked by a driver in a workday cannot exceed a specified maximum limit.The vehicle loading time at the distribution centre before the route begins and possible waiting times at customers' locations until the opening of their time windows are considered in the calculation of the worked hours, along with the travelling time between customers and the service time at each customer's location.The loading time of the vehicle at the distribution centre depends on the vehicle type.
The travelling times between customers and between the distribution centre and customers depend on both the distance travelled and the average speed of the vehicle used.

Mathematical Formulation
Let digraph  * = ( ∪ { 1 } ∪ { 2 },  * ) be given where  is the set of customers;  1 and  2 are fictitious origin and destination vertices, respectively, both corresponding to the depot;  * is the set of arcs joining two distinct vertices of  * ; and any feasible route corresponds to a path from  1 to  2 on  * .
Let  represent the maximum number of routes driven during a vehicle workday.To represent a workday with  routes, digraph  * is sequentially replicated  times, forming digraph , in which a path from  1 (origin of route 1) to  +1 (destination of route ) represents a vehicle workday with  routes (see Figure 1).
Computationally, only a reduced digraph of a single route  * is stored and used so that memory consumption is not significantly increased.
An instance of the problem is defined by the data listed below.
(viii)    -service time at customer  when the service is provided by vehicle .At the distribution centre, this corresponds to the average loading time of vehicle ,   DC , which is independent of the customers to be served along route, on the grounds that the pallets are ready to be loaded at the beginning of each route and also on the grounds that the vehicle type is more significant due to manoeuvring times, parking time, and need for equipment use such as forklifts.
(ix)    -cost of travelling the arc (, ) in route  with vehicle  ∈ .This value depends on the arc (, ) distance,   , and the variable cost of the vehicle , VC  ($/unit of distance).
(x) t  -travel time of arc (, ) with vehicle .
(xi)  max, -maximum driver working hours for vehicle , that is, the maximum duration of a vehicle  workday.
(xii) -maximum number of routes allowed per day to the vehicles.This value is an upper bound that is sufficiently large to hold all possible trips according to past daily routing operation experience.
The mathematical formulation has two decision variables:    , a binary variable that is equal to 1 if arc (, ) is travelled by vehicle  along route  and that is equal to 0 otherwise, and    , a variable that defines the instant in time at which vehicle  will serve customer  along route .
Below is the mathematical formulation of the initial problem, denoted as IP  6) imposes that time windows must be respected.The group of constraints ( 7) and ( 8) establishes the relationship between the vehicle departure time from a customer and that customer's immediate successor.Constraint (9) states that a vehicle can only be loaded up to its capacity.Constraint (10) guarantees that the workday duration of a vehicle does not exceed its limit  max, .The group of constraints ( 11) and ( 12) defines the solution space of the decision variables.

Multiple Route Cuts.
It is important to notice that the existence of the fictitious arc (  ,  +1 ) when  ≥ 2 is responsible for the existence of solutions that are equivalent from a physical point of view but distinct mathematically.Equivalent solutions of a current solution exist when  ≥ 2 and at least one workday of any vehicle travels at least one fictitious arc and there is, at least, one route serving any customer.In these cases, there is freedom in rearranging fictitious routes inside the workday without changing the sequence of routes serving customers, so as to generate equivalent workdays and, therefore, equivalent solutions.
In order to avoid the existence of equivalent solutions and reduce the expansion of the solution space when the value of  increases, valid inequalities (13) were derived: According to constraints (13), applicable only when  ≥ 2, if a vehicle  travels the fictitious arc (  ,  +1 ) along route , that is, if      +1  = 1, then the fictitious arcs of the following routes must be travelled.

Resolution Algorithm
To solve the problem described in Section 2 and formulated in Section 3, a heuristic procedure to find good valid integer solutions and a column generation procedure to find dual bounds are proposed.These procedures are described in the following subsections.

Constructive Heuristic.
In summary, the developed procedure inserts customers into vehicle workdays and builds an initial solution after inserting all of them according to the pseudocode Constructive Heuristic (See Algorithm 1).

Tabu Search.
Tabu search is a metaheuristic with a wide range of applications in vehicle routing problems [17] that combines intensification and diversification strategies using short-and long-term memories.For the present problem, the removal of a customer from the workday of a given vehicle and the insertion of that customer in the workday of another vehicle, as well as a move to reposition a customer within a vehicle's workday, are both used as moves to find neighbour solutions.In addition, rather than employing long-term memory to help with the diversification process, a predefined number of random moves are performed, enough to generate a solution distant from the current solution.The aspiration criterion adopted is the acceptance of a tabu solution when it globally improves the objective function and when its cost is lower than the best nontabu solution.
The major parameters for calibrating the implemented tabu search are size of the short-term memory, number of iterations without minimum local solution improvement that triggers diversification, number of random moves to diversify a solution, and maximum number of diversifications.
In each iteration, all possible moves are verified instead of stopping the neighbourhood search when a solution whose cost is lower than the current solution cost is found.The latter approach is widely used in the literature when many moves are considered and the computational cost of searching in the entire neighbourhood of a solution is high.However, in this study only two simple moves are used, and the entire neighbourhood is verified.This approach has been proven effective, as described in Section 5.
The tabu activation rule is based on the two move attributes: customer and destination vehicle.If there is a move Constructive Heuristic( ) (1) find customers' geometric centre (2) divide customers into 4 quadrants with their geometric centre as a reference (3) sort each quadrant list of customers in increasing order of time window opening (4) //Travel quadrants in anti-clockwise sequence starting from the upper right quadrant// (5) for each quadrant customer list do (6) for every customer with a list of authorised vehicles do (7) find the smallest insertion cost position among authorised vehicle workdays (8) if a position was found then (9) insert the customer and remove it from the quadrant list (10) sort the list of vehicles in decreasing order of cargo capacity (11) sort the list of vehicles in increasing order of cost with a stable sorting algorithm (12) for each vehicle in the sorted list of vehicles do (13) //Travel quadrants in an anti-clockwise sequence// (14) for each quadrant customer list do (15) for each customer do (16) find smallest insertion cost position in the current vehicle workday (17) if a position was found then (18) insert the customer and remove it from quadrant list (19) //Check if the current solution is infeasible// (20) if there are unserved customers then (21) if tabu search was never called then (22) //"Make room" for future insertions by cost reduction// (23) call tabu search of few iterations to improve the current solution (24) go to line 5 to insert unserved customers (25) else (26) call a commercial software to find a feasible solution Algorithm 1: Constructive Heuristic Algorithm-procedure to find a feasible solution.
that has removed the customer from the destination vehicle workday in the short-term memory, then the reinsertion of the customer is tabu.Otherwise, the insertion is not tabu.Furthermore, in the present implementation, when a move fails to improve the minimum local solution, it is verified whether the maximum number of iterations without minimum local solution improvement has been reached.In this case, a diversification will be executed.Diversification is carried out by randomly choosing customer reallocation moves between vehicles.These moves are executed until the specified number of random moves is reached.The tabu search ends when the maximum number of diversifications is reached.

Extensive Formulation and Column Generation.
It is possible to provide an alternative formulation to the routing problem after observing that (IP) has primal block diagonal structure.This structure is composed by linking constraint (1) and independent constraint blocks from (2) to (13) for each vehicle , corresponding to workdays.Let   be the set of all feasible workdays  of vehicle .The alternative mathematical formulation of the problem, denoted as master problem (MP), is as follows: where    is a binary decision variable equal to 1 if and only if workday  is chosen for vehicle  and 0 otherwise; is the number of times customer  is visited by vehicle  on workday ;    ∈ {0, 1} for  ∈ ,  = 1, . . ., , (, ) ∈ ,  ∈   is equal to 1 if vehicle  along route  travels the arc (, ) on workday ;    = 0 otherwise.Constraints (14) impose that each customer must be served only once, constraints (15) impose that exactly one duty is selected for each vehicle, and constraints (16) ensure that the solution is integer.
We remark that (MP) has an exponential number of variables.Therefore, we resort to column generation [18] to solve the linear relaxation of the master problem (LMP).We further remark that LMP is equivalent to the Dantzig Wolfe reformulation operated on constraints (2)-( 13) of (IP) where the resulting constraints ( 14) are the so-called linking constraints and constraints (15) are the convexity constraints.
In LMP, the set partitioning constraints ( 14) are replaced with set covering constraints: ∑ ∈ ∑ ∈     ⋅    ≥ 1 for  ∈ .When the triangular inequality holds on the cost matrix and load splitting is not allowed, no optimal solution visits any customer more than once.Consequently, the two formulations are equivalent.Set covering constraints are preferable to set partitioning constraints because the associated dual variables are restricted to assume nonnegative values.
To solve LMP, we apply column generation.Therefore, we consider a subset of workdays for each vehicle obtaining a restricted master problem (LRMP) so that a feasible solution to LMP exists.We solve the LMP and consider dual variables  and  associated with constraints ( 14) and (15), respectively.We solve a pricing subproblem for every vehicle  and iterate the algorithm until no column with negative reduced cost exists for every vehicle.
The pricing subproblem consists of finding a workday that potentially contributes to reducing the objective function of LRMP; that is, a workday whose reduced cost for the new variable    is negative.The pricing subproblem, denoted as SP, has the objective function Min where    = 0 for  = 1, . . ., , subject to constraints (2) to (13) regarding vehicle .

Initialisation.
The LRMP is initialised with a feasible solution by calling the constructive heuristic, followed by a tabu search of few iterations to obtain a good initial solution at a low computational cost.In most cases, the number of columns generated for LRMP optimality is significantly reduced with a good initial solution (good initial set of columns).

Lower Bounding and Anticipated Finalisation.
In some cases, it is possible to interrupt the column generation scheme before obtaining the optimal solution with the calculation of a lower bound, LB CG , according to the expression LB CG =  LRMP + ∑ || =1  SP, , where  SP, is the optimal solution or a lower bound of the subproblem of vehicle .A detailed deduction of the expression of the lower bound in the column generation technique can be obtained in the tutorial chapter of Desrosiers and Lübbecke [19].

Stabilisation.
A well-known column generation property is that dual variable values do not smoothly converge to their respective optima but significantly oscillate with seemingly no regular pattern.This behaviour is undesirable, as it represents a major efficiency issue.
Several stabilisation methods that attempt to accelerate convergence have been proposed in the column generation literature.
Two different and widely used stabilisation techniques, namely, the interior point technique described by Rousseau et al. [20] and the method commonly referred to as BoxPen stabilisation, proposed by du Merle et al. [21], were implemented and compared with regard to the present problem.
The interior point technique is a method that addresses degeneracy and convergence difficulties by selecting a dual solution inside the optimal space rather than retrieving an extreme point.To achieve the centralisation of dual values, several extreme points of the optimal dual polyhedron are generated, and the interior point is computed as a convex combination of these extreme points.
The BoxPen method, proposed by du Merle et al. [21], combines two stabilisation techniques.The first technique, described by Marsten et al. [22], consists of defining a box around the previous dual solution and modifying the master problem so that the feasible dual space is limited to the area defined by these boxes.The second technique, proposed by Kim et al. [23], is to adapt the master problem so that the distance separating a dual solution from the previous dual solution is linearly penalised.In summary, the BoxPen method imposes soft limits on the dual variables and a penalty in the objective when dual variables take values outside of these limits (box).
Computational results have shown that the BoxPen method typically requires fewer iterations and a lower overall computing time to solve the LMP when compared to the interior point stabilisation technique, and therefore, the stabilisation method was chosen for the present problem.

Pricing Subproblem.
The pricing subproblem SP can be cast as the computation of an elementary minimum cost path considering resource constraints (ESPPRC) in digraph , in which the costs in the arcs are modified according to the shadow prices of the constraints (14) of the LRMP.A common and widely employed technique for solving this problem is dynamic programming, as successfully used by Desrosiers et al. [24] and Feillet et al. [15].
It should be stressed that the real routing problem considered here has some particular constraints, such as the possibility of a vehicle travelling multiple routes per day and the limitation on vehicle driver working hours that require a particular treatment, that, to the best of our knowledge, have not been addressed to date in the literature by exact methods.
A shortest path problem where waiting time is undesired along the path, namely, the shortest path problem with waiting costs (SPWC), was introduced by Desaulniers et al. [25].In the SPWC, the elapsed time between the arrival and the departure time at a node is referred to as waiting time and there is linear cost penalty that is imposed for each unit of time spent waiting along the path.Deasaulniers and Villeneuve [26] propose two alternative formulations of the SPWC for which algorithms already exist, namely, the shortest path problem with linear node costs and a tworesource generalised shortest path problem with resource constraints.
Unlike the SPWC, in the present problem, there is only the cost of travelling arcs in the objective function, and idle time must be minimised only enough to ensure path feasibility with regard to the maximum workday duration along the minimum cost path.

Dynamic Programming Algorithm.
The algorithm developed herein was based on the bounded bidirectional dynamic programming algorithm proposed by Righini and Salani [27,28] that, in turn, is based on the algorithm developed by Desrochers and Soumis [29] to solve the resource constrained shortest path problem (RCSPP).
In the present problem, a particular resource called the workload resource was introduced.The workload resource is limited to  max, .This resource consists of the time elapsed since the service start time of  1 and its consumption depends on the arcs travelled, the sequence of vertices visited, and the service start time of  1 .
Idle time is generated whenever a vehicle arrives at  before the opening of its time window.In this case, the vehicle is forced to wait until   to start the service, and this waiting time is incorporated into the workload resource.In traditional dynamic programming algorithms for solving the ESPPRC [27], the  1 service start time is always fixed at 0 with no implication for the final result, even though the mathematical formulation allows this variable to assume any positive value.In the present problem, the workload resource depends on the value of this variable.In short, whenever idle time is created, the dynamic programming algorithm has to recursively recalculate the service start times along the path to minimise the workload resource consumption.
However, the simple existence of the workload resource in a label is not sufficient for finding the optimal solution because there are situations in which a dominated label may be part of the optimal path.This behaviour occurs because of the capacity to serve the origin later and thus eliminate idle time along the route is smaller in the first case.Because of this feature, it is necessary to create an additional resource, the value of which represents the potential to eliminate idle time such that the idle time generated during the extension of the labels is minimised or even fully eliminated.
In Figure 2, it is possible to visualise an example with four customers in which  max, = 5,   DC = 0, and [ DC ,  DC ] = [0, 10].Values  and  represent the time and cost of travelling an arc, respectively.Note that path DC-1-2-3 is apparently better than path DC-2-1-3 for reaching 3 at the same instant in time, with the same number of hours worked and with a lower cost by 1 unit.However, path DC-1-2-3 serves 4 with 5 units of workload consumed and is incapable of travelling arc 4-DC with cost  = −10.In contrast, path DC-2-1-3 is capable of leaving the origin at  = 1 and travelling path 3-4-DC, defining the optimal solution: DC-2-1-3-4-DC.The new resource proposed is initially (at the origin) equal to the planning horizon, and along the path, it is consumed progressively until it is exhausted or not at the final destination.
The vector of resources R is composed of the following components:  (consumed fraction of vehicle capacity),  (service start time in a forward label or departure time in a backward label),  (time consumed out of the working hours), Δ (idle time reducing capacity),  (visit vector), and vstQty (number of vertices served along the path).The label also has component , which indicates the route to which the label belongs, and the cost .
When a label (R, , ) associated with vertex  is extended to vertex  generating the label (R  ,   , ), the consumption of each resource is updated according to the rules presented in the next sections, except for , ,  and , whose rules are the same as those described by Righini and Salani [28] and are thus not presented.It is worth highlighting that the extension along the fictitious route arc (  ,  +1 ) is forbidden in both the forward and backward directions, and that the extension to  is only feasible if the vehicle is authorised to serve customer . ), in a backward extension.Based on the expressions above, the idle time is incorporated into the workday duration.To ensure future viability in relation to the hours worked on a path after adding new arcs, it is necessary to eliminate the generated idle time as much as possible.To do so, idle time,  idle , must first be calculated either when reaching  before   in a forward extension,  idle =   − ( +    + t  ), or when leaving from  after (  +    ) in a backward extension,  idle = [ DC − (  +    )] − ( +    + t  ).Eliminating idle time in the forward direction is only possible when there is a margin to serve the origin later, displacing the service start time of all vertices previous to  along the path.For the backward direction, the elimination of idle time is only possible when there is a margin to "leave earlier from the final destination, " displacing the departure time of all vertices previous to  along the path.This is verified in the forward direction by calculating the maximum displacement capacity of the origin service start time without making the path infeasible according to expression
For the backward direction, after calculating  idle and Δ  , in the case Δ  > 0 and Δ  ≥  idle , then   =  +    + t  and  V =  V +  idle , for all V ∈ path(R, , ).In the other case, when Δ  > 0 and Δ  <  idle , then The created forward or backward label is only feasible if   ≤  max, .

Idle-Time-Reducing Capacity
Resource.The workload resource is not sufficient to fully identify a label in relation to the working hours and requires the use of a new resource that is capable of identifying the capacity of the path to reduce idle time, Δ.This resource is crucial for calculating working hours during the extension of a label so that the path feasibility is not lost after the incorporation of new arcs.
As Δ is progressively consumed along the path, Δ can be considered a new resource and updated according to the procedure described in Section 4.4.2 when idle time is generated during the extension.If idle time is not generated and the service start time or departure time does not violate the time window of , then Δ  = min{Δ, (  −   )} for forward labels and Δ  = min{Δ, [ DC − (  +    )] −   } for backward labels.
Because 0 ≤ Δ  ≤  DC , there is no need to verify whether the extension feasibility is lost in relation to this resource.

Multiple Routes.
If only one route is considered, the reduced digraph  * = ( ∪  1 ∪  2 ,  * ) is sufficient for the formulation.When more than one route is allowed up to a limit of  routes, the natural strategy is to replicate digraph  *  times, obtaining digraph  in which a path from  1 to  +1 represents a workday.However, for computational implementation, it is not necessary to replicate digraph  *  times because arcs linking destination  2 to the origin  1 and vice versa (backward direction) are used, together with specific extension rules for these arcs.For route control, a route  indicator is added to each label.At first, an initial forward label is added to origin  1 with  = 1 and an initial backward label to destination  2 with  = .
Forward and backward extension procedures are executed until all of the forward and backward labels have been extended.At this moment, all of the forward and dominant labels in  2 are extended to  1 according to a special rule.The same occurs for the backward and dominant labels of  1 , which are extended to  2 according to a special second rule.Among other things, during this extension, the route indicator  of the forward labels is incremented as the route indicator of the backward labels is decremented; thus, a new cycle for the application of the extension procedures begins.
This process is repeated until  =  in the forward labels and  = 1 in the backward labels.The cost and time for travelling the arc to return to the beginning of a new route are equal to 0. The extension of a forward label from  2 to  1 is made according to the rules   = ,   = 0,   = ,   = ,   =  + 1, Δ  = Δ, and The extension of a backward label from  1 to  2 is made according to the rules   = ,   = 0,   =  − 1, Δ  = Δ, and If the backward label does not visit any customer along route , then   =  and   = .If the backward label visits at least one customer along route , then   = +  DC and   = +  DC .
4.4.5.Lexicographic Ordering of Labels.The dominant labels are kept organised in increasing lexicographic order along all of the iterations as described in Ceselli et al. [14].A label  1 is considered lexicographically smaller than a label  2 if  1 <  2 or  1 =  2 and vstQty 1 < vstQty 2 .As the workload resource is minimised in a label, it is a significant resource for the dominance rule, and for this reason, it is selected as the major ordering criterion, followed by the number of vertices served.In short, whenever a vertex is selected so that its labels not yet treated are extended, the selection and extension of labels occur in increasing lexicographic order.

Bounding on Resource.
Based on the strategy adopted by Righini and Salani [28] to prevent the proliferation of labels in both directions, a critical resource is selected such that labels consuming over 50% of the resource are eliminated from each of the two directions.In the present problem, the critical resource is time; thus, a label is discarded when   > 0.5 ⋅ ( DC −  DC ) during an extension.4.4.7.Dominance Rule.The efficiency of the dynamic programming algorithm depends mainly on its capacity to identify and to eliminate labels that will certainly not be a part of the optimal solution.This prevents these undesired labels from generating an unnecessary proliferation of labels that largely contribute to increasing the total computational cost.To conduct this identification and elimination of labels that are certainly not a part of the optimal solution, there are dominance tests that are always executed when a label is extended so that the labels at each vertex are nondominated.
According to the dominance rule, label  1 dominates label  2 if all of the following expressions are true:  1 ≤  2 ; Value of cells marked with * were not computed due to an overall time greater than 93000 s.
1 ≤  2 ;  1 ≤  2 ;  1 ≤  2 ;  1 −  1 ≥  2 −  2 ; Δ 1 ≥ Δ 2 ;  1 =  The exact labelling algorithm is typically made heuristic by removing from its dominance rule combinatorial nature resources, which reduces the number of labels generated, as described by Ceselli et al. [14].In the present study, the Heuristic Labelling Algorithm uses a relaxed dominance rule that does not have the visit vector, , the capacity to reduce idle time, Δ, and component ( − ).4.4.9.Forward-Backward Joint Feasibility Tests and Search for Optimal Solution.After the end of the label extension process (1)  1 ≤  2 ; (2) There is arc (, ); for all of the routes, the forward labels are combined with the backward labels to form paths from  1 to  +1 .The optimal solution is determined after all of the feasible combinations between a forward label and backward label at each possible pair of vertices (, ) are verified.For this purpose, there are seven tests that are applied to ensure that the combination of the two labels,  1 forward in  and  2 backward in , creates a feasible path from  1 to  +1 .The seven tests are described in Algorithm 2.
Cost  of the combination between labels  1 and  2 is calculated as  =  1 +  2 +   .
When searching for an optimal solution, a very simple but rather efficient approach to reduce the computational cost is to first compute the cost of the label combination and then test the joint feasibility, only if its cost is lower than the lowest cost already found.

State-Space Relaxation and Decremental State-Space
Relaxation.State-space relaxation was introduced by Christofides et al. [30].The state-space, Ψ, explored by the exact dynamic programming algorithm is projected onto a lower dimensional space, Γ, so that each state in Γ retains the minimum cost among those of its corresponding states in Ψ (assuming the objective function must be minimised).In space Γ, the number of states to be explored is drastically reduced; the drawback is that some original state corresponding to an infeasible solution in Ψ may be projected onto a state corresponding to a feasible solution in Γ; therefore, the search in the relaxed state-space does not guarantee finding an optimal solution but rather a lower bound.
The state-space relaxation algorithm considered in this study consists of mapping each state (R, , , ) onto a new state ( R, , , ) in which the visit vector  is removed from the resource vector R by being mapped as the existing resource vstQty (vstQty = ∑ ∪{ 1 , 2 } []).Except for the removal of the visit vector , there are no differences when compared to the exact dynamic programming algorithm dominance rule.
The considered mapping makes a large difference with respect to the exact dynamic programming algorithm in which the visit vector  yields an exponential number of possible states.Because the state (label) does no longer keeps information about the set of already visited vertices, cycles are allowed; therefore, the path is guaranteed to be feasible with respect to the resource constraints but it is not guaranteed to be elementary.
As opposed to the state-space relaxation, the decremental state-space relaxation pursues a compromise between forbidding and allowing multiple visits to the vertices, that is, a compromise between the exact dynamic programming algorithm and state-space relaxation algorithm.This compromise is accomplished by identifying some vertices as critical according to the solution obtained and by preventing multiple visits to critical vertices in subsequent runs while allowing multiple visits to noncritical vertices.
In summary, the decremental state-space relaxation algorithm runs iteratively the state-space relaxation procedure in which each label also has a binary vector Ŝ playing the same role as  in the exact dynamic programming algorithm: if Ŝ[] = 1, then  is critical, and Ŝ[] = 0 otherwise.Consequently, the extension of a label ( R, , , ) from vertex  to vertex  is feasible with regard to Ŝ only if Ŝ[] = 0 or if Ŝ[] = 1 and  is not visited along the path ( R, , , ).
Every time a solution with cycles is produced by the modified state-space relaxation procedure, the vertices visited more than once are marked as critical, and the algorithm restarts.The optimal solution, which is guaranteed to be feasible with respect to the resource constraints and to be elementary, is obtained when a solution with no cycles is produced.
It is worth highlighting that the decremental state-space relaxation is used only when optimality is required, that is, Mathematical Problems in Engineering only when the dominance rule is complete (includes the visit vector ).The heuristic labelling algorithm is based on the regular bidirectional dynamic programming algorithm.

Computational Results
Solomon's benchmark vehicle routing problem instances R1 and R2 were adapted to the context of the problem studied.
The maximum workday duration for any driver is ⌊(9.5/16)⋅ ( DC −  DC )⌋, where [ DC ,  DC ] is the depot time window, 9.5 hours is the maximum workday duration in the real context analysed and 16 hours is the real delivery operation planning horizon.
Based on Solomon's instance service time at each customer, defined as   , and considering a depot loading time  DC = 10, then the service/loading times are computed as    = 1.5  for a truck,    = 1.25  for a LCV, and    =   for a van.
In every instance, customer 2 can only be served by a specific truck, and customer 3 can only be served by another specific truck.
The tabu search executed is defined by the parameters as follows: (1) short-term memory size = 20; (2) number of iterations without improvement that triggers diversification = 100; (3) number of diversifications = 1000; (4) number of random moves to diversify a solution = 60.
Regarding the implemented BoxPen stabilisation method, the linear penalty for leaving the box is initially set to 0.001, and the box size around each dual variable is initially set to 0.2.If a subproblem is able to find a negative reduced cost path, then the linear penalty is increased by 10%.However, when there are no more such columns, the penalty is significantly decreased (divided by 1000), and every dual box is recentred on the corresponding last dual value.This procedure guarantees column generation convergence.Furthermore, the initial dual solution is defined with potentially good initial dual values based on some knowledge of the problem, and consequently, they are estimated and computed as   = min ∈ {  } and   =   ⋅ min ∈ {  1  }.
In the available fleet for R1 instances there are 25 vehicles: 15 trucks, 5 LCVs, and 5 vans.In the available fleet for R2 instances there are 15 vehicles: 5 trucks, 5 LCVs, and 5 vans.The capacities of a truck, LCV, and van are 200 (as in Solomon's instances), 120 and 40, respectively.The costs per unit of distance run by a truck, LCV, and van are 1, 0.75 and 0.5, respectively.The average speeds of a truck, LCV and van are 1, 1.333 and 1.666, respectively.
The adapted R1 and R2 instances with the first 50 customers were solved with 1, 2, and 3 routes in a workday.The final results obtained are presented in Tables 1 and 2 with the corresponding lower bound cells filled with the LRMP optimal solution for R1 instances and with the first valid lower bound, that is, after one iteration with guaranteed optimality on subproblems, for R2 instances.
In every instance, the Cplex was not required to find a feasible solution.
A workstation with an Intel Core i7 2.80 GHz processor and 16374-Mb RAM memory was used to run the program developed in Java.CPLEX 12.2 is used to solve each LRMP and find a feasible initial solution when the constructive heuristic fails to find one.

Conclusions
In this paper, we presented a mathematical formulation based on arc flow variables that is able to solve a complex vehicle routing and scheduling problem.Valid inequalities were also presented to strengthen the formulation when more than one route is allowed in a workday.
To find integer solutions, a constructive heuristic and a metaheuristic based on tabu search were developed that can generate good solutions with an average error of less than 2.1% for R1 problems and 4.7% for R2 problems.It should also be noted that the heuristic computational cost is low and that the standard deviation of heuristic resolution times for instances with one, two, or three routes in a workday is also very low, which makes the heuristic solution procedure robust to a daily routing operation.
A slight increase in the average heuristic solution cost was observed when the number of routes is incremented due to an enlargement of the solution space without an increase in the number of diversifications and/or use of more sophisticated tabu search intensification and diversification strategies.
To evaluate the performance of the heuristic, a column generation algorithm with state-of-the-art techniques was presented that was able to generate excellent dual bounds.To solve the column generation pricing subproblem, a particular dynamic programming algorithm for the elementary shortest path problem with resource constraints that can address multiple routes in a workday with limitation on driver work hours was presented.
Future work related to this study may include the development of a branch-and-price algorithm, based on the column generation procedure proposed, to solve the problem exactly.Valid inequalities may also be derived to develop a branch-and-cut-and-price algorithm.Moreover, a branchand-bound algorithm to solve to optimality the elementary shortest path problem with resource constraints may also be developed, by exploiting the lower bound given by the bounded bidirectional dynamic programming algorithm with state-space relaxation.

Figure 2 :
Figure 2: Example where the idle time reducing capacity resource is required.
1 ,  2 , . . .,   } ∪ { 1 ,  2 , . . .,   ,  +1 }, ) in which the set of vertices is composed of the set of customers  replicated  times by adding the  route origins and the vertex of the final destination ( +1 ); set  contains the arcs between any two vertices on the same route.There are no arcs reaching origin  1 , and there are no arcs leaving destination  +1 .Vehicles not used in route  during the workday travel the fictitious arc (  ,  +1 )   -maximum capacity of vehicle , in units.(vi) [  ,   ]-time window of customer .For the vertices { 1 , . . .,  +1 } representing the distribution centre (depot), the time windows correspond to the planning horizon; that is, [  1 ,   1 Resource.The calculation of   , the workload resource, for forward labels is initially given by   = max{ + (   + t  ),  + (  − )}.For backward labels, the value of   is   = max{ +    + t  ,  + {[ DC − (  +   )] − }}.However, idle time is generated when the vehicle reaches vertex  before the opening of its time window, +   + t  <   , in a forward extension or, when the vehicle departs from vertex  after the end of its departure time window,  +    + t  ≤  DC − (  +

Table 1 :
Results obtained for R1 instances with the first 50 customers.

Table 2 :
Results obtained for R2 instances with the first 50 customers.