A Study on Online Scheduling Problem of Integrated Order Picking and Delivery with Multizone Vehicle Routing Method for Online-to-Offline Supermarket

The online order fulfillment of an online-to-offline (O2O) supermarket faces the challenge of picking orders from thousands of products on the supermarket shelves and delivering them by vehicle routing to customers in different zones and locations at the lowest cost and in the shortest time. It is critical to integrate the order picking and delivery processes and schedule them jointly in a coordinated manner. Thus, in this paper, we study the online integrated order picking and delivery problem with multizone routing method (IOPDP-MR) to minimize both the maximum delivery completion time and the total delivery cost. An online algorithm $A$ is presented to solve the online problem and is proved to be $2(1 + Q_v)$-competitive, where $Q_v$ is the vehicle capacity. Since it is difficult to obtain a lower competitive ratio theoretically, numerical experiments are conducted to analyze the gaps between the values of algorithm $A$ and those of the offline optimal algorithm $A^*$ under different situations. The experiments indicate that the competitive ratio is less than 2.5 and the average flow time for customer orders is less than 30 minutes, which supports the good performance of algorithm $A$ in both computation efficiency and customer satisfaction.


Introduction
The online-to-offline (O2O) supermarket e-commerce business model, such as Alibaba's fresh [1] and JD's 7fresh [2], is currently very popular with middle-class consumers in China. The online orders are picked by pickers in the offline supermarket and then delivered by vehicles to customers within 30∼60 minutes. However, owing to the small lot sizes, high frequency, time sensitivity, and dynamic arrival of online customer orders, many O2O supermarkets face challenges in online order fulfillment [3]. The two key processes in online order fulfillment are order picking and delivery. Thus, how to pick ordered products from the supermarket shelves and deliver them to customers at the lowest cost and in the shortest time is a critical issue that needs to be solved.
In most O2O supermarkets, as online orders have small lot sizes, the order batching method is used in the order picking process. It is an efficient method that groups a set of orders into several subsets, i.e., batches, to minimize the maximum picking completion time [4]. To shorten the delivery completion time and provide better customer service, the delivery area of the O2O supermarket is usually divided into multiple communities (called zones in this paper) based on customers' geographical locations, and each zone is served by several capacitated vehicles. The vehicles deliver the ordered products to the customers by the vehicle routing method to minimize the delivery completion time. To achieve more efficient online order fulfillment, it is critical to integrate the order picking and delivery processes and schedule them jointly in a coordinated manner. Thus, in this paper, we study the online integrated order picking and delivery problem with multizone routing method (IOPDP-MR). An online schedule should be generated to minimize both the maximum delivery completion time and the total delivery cost. The online IOPDP-MR can be defined using the five-field notation $\alpha \mid \beta \mid \pi \mid \delta \mid \gamma$ [5] as follows: $P_m \mid \text{online}, B \le Q_b, \text{batch} \mid V(\infty, Q_v), \text{routing} \mid Z \mid D_{\max} + TC$.
The value of the integrated schedule has been shown by Zhang et al. [6], Moons et al. [7], and Zhang et al. [3]. However, Moons et al. [7] studied the offline IOPDP with a vehicle routing delivery method, whereas Zhang et al. [6] and Zhang et al. [3] studied the online IOPDP with no routing method. This paper closes the research gap by studying the online IOPDP with a vehicle routing delivery method.
The contribution of this paper is embodied in three aspects. First, to be more realistic, we formulate the O2O supermarket order fulfillment problem as the online IOPDP-MR. Second, although Moons et al. [7] studied the offline IOPDP with a vehicle routing method, the vehicle routing method has not been studied in the online environment of IOPDP. We close the research gap by studying the online IOPDP with a vehicle routing method. Third, to tackle the online IOPDP-MR, an online algorithm is proposed. The effectiveness and efficiency of the online algorithm are demonstrated by competitive analysis and numerical experiments.
The results of this paper are as follows. An online algorithm, called $A$, is presented and proved to be $2(1 + Q_v)$-competitive. Since it is difficult to obtain a lower competitive ratio theoretically, numerical experiments are conducted to further analyze the gaps between $A$ and the offline optimal algorithm ($A^*$) under different situations. The experiments show that the competitive ratio is less than 2.5, and the average flow time per order, which is the time interval from the arrival time to the delivery completion time, is less than 30 minutes. Several managerial implications are also drawn from the experiments.
The remainder of the paper is organized as follows. We first review related literature in Section 2. The online IOPDP-MR is formulated in Section 3, and the online algorithm $A$ is proposed in Section 4. To further verify the effectiveness and efficiency of $A$, a series of experiments are presented and the results are discussed in Section 5. The paper concludes with a summary and directions for further work in Section 6.

Literature Review
The online IOPDP-MR shares some features with the integrated production and delivery problem (IPDP). The order picking process in IOPDP-MR is a special production process in IPDP; thus, IOPDP-MR and IPDP both contain a production process and a delivery process, and IOPDP-MR can be classified as a special IPDP. The literature on IPDP is vast, and a comprehensive review can be found in Chen [5]. In this paper, there are multiple pickers in the picking process, and thus, we only review the online IPDP with a parallel-machine production environment.
Table 1 provides an overview of the online IPDP articles with a parallel-machine production environment. The production process, vehicle characteristics, delivery method, objective function, and competitive ratio are reviewed in Table 1 to distinguish those IPDP models. The production process includes the direct process and the batch process. The vehicle characteristics include the number of vehicles and the vehicle capacity. The delivery method includes individual and immediate delivery (idd), batch delivery by direct shipping (direct), and batch delivery by routing (routing). The objective function is of two types: one is the maximum delivery completion time ($D_{\max}$), and the other is the sum of the maximum delivery completion time and the total delivery cost ($D_{\max} + TC$).
For the online IPDP to minimize $D_{\max}$, Liu and Lu [8] studied the special IPDP with two parallel machines and a direct production process and proposed a $(\sqrt{5} + 1)/2$-competitive online algorithm. They later extended the problem to unbounded batch parallel-machine scheduling [9] and proposed an online algorithm with a competitive ratio of $1 + 2/\sqrt{m}$. Tian et al. [10] studied the unbounded drop-line batch machine scheduling problem and presented an online algorithm with a competitive ratio of $1 + \alpha_m$. In these three papers, the number of vehicles and the vehicle capacity were unlimited, and the delivery method was idd.
For the online IPDP to minimize $D_{\max} + TC$, Han et al. [11] investigated different IPDPs with parallel machines in terms of the number of vehicles and the capacity of vehicles. The competitive ratios differed across these IPDPs. Zhang et al. [3] studied the online IOPDP, in which the production process is a multiple-picker bounded batch process, the same as in our study. However, the delivery method was direct.
In our study, the routing method is considered, which is more realistic but more complex to solve. The production process is multiple-picker bounded batch processing. The objective function is $D_{\max} + TC$. A $2(1 + Q_v)$-competitive online algorithm is provided to solve the online IOPDP-MR.
We further discuss the literature on IOPDP. As far as we know, only Zhang et al. [6] and Zhang et al. [3] studied the online IOPDP. In [6], the picking method was order batching with one picker, and the delivery method was direct. In [3], the picking method was extended to order batching with multiple pickers, but the delivery method was still direct. Moons et al. [7] studied the offline IOPDP with a routing method. Although the routing method was considered, the order information was known beforehand, and the batching method was not considered.
In this study, more realistic factors in the delivery part are included, which makes the problem more complicated and harder to solve. To sum up, the online IOPDP-MR differs from the previous papers in the following three aspects. First, orders are batched by the online order batching method with multiple pickers. Second, several zones need to be served by the routing delivery method. Third, there are enough capacitated vehicles for each zone. The combination of these three aspects makes our paper unique in the literature.

Problem Description
The full process of the online IOPDP-MR is shown in Figure 1. First, the orders, which arrive dynamically, are batched and assigned to the pickers for processing. Then, the completed orders that belong to the same zone are assigned to vehicles and delivered by the routing method. The full process can be separated into the picking process and the delivery process. The notations in Table 2 will be used throughout the paper.
The picking process can be described as follows. Orders that have arrived are combined into batches and assigned to the pickers by the online order batching algorithm (see details in Section 4). The picking process includes the following decision issues: (1) which orders should be assigned to the same batch; (2) when and in what sequence the batches should be processed; (3) which pickers the batches should be assigned to.
In the delivery process, the completed orders are delivered to the customers by solving the online vehicle routing problem. There are two critical issues: (1) when the deliveries should depart from the supermarket; (2) which completed orders should be loaded in each delivery.
The meanings of several terms about time in Figure 1 are described as follows:
(i) The arrival time is the point in time when a customer order arrives in the order fulfillment system. It is also the time at which the order becomes available to be picked.
[Table 2: Notation. Surviving entries: $B$, set of batches, $j \in B$; $R_l$, set of delivery routes in zone $l$; capacity of the picking device.]
Each route incurs a fixed delivery cost and a variable delivery cost. The fixed delivery cost of route $r_l$ to zone $l$ is $D_l$, and the variable delivery cost of route $r_l$ to zone $l$ is identified as the delivery time $t^d_{r_l}$.
The assumptions and constraints are as follows:
(i) Each customer orders one item, and all items have the same size.
(ii) Each order can be assigned to one batch, and the size of each batch cannot exceed the capacity $Q_b$.
(iii) Each batch needs to be processed by one of the available pickers, and each picker can only serve one batch at a time.
(iv) The layout of the supermarket is shown in Figure 2, which is similar to the setting in De Koster et al. [4] and Zhang et al. [3]. The picker starts from the depot at the leftmost aisle, retrieves items from the left and right sides of the aisles, and finally returns to the depot to hand over the picked items. The picker can pick up several items in one tour with the help of the picking device.
(v) Once the picker starts a tour through the warehouse, interrupting the picker or rearranging the picking batch is not allowed.
(vi) In this paper, we focus on the online order batching algorithm, and thus, the storage assignment policy is random storage.
(vii) For the picking route strategy, there is an optimal routing algorithm [12], which offers a fast solution time and the shortest route for the single-block warehouse. However, optimal routes are often confusing in nature and may not work within the confines of an order picking operation [4]. In practice, the order picking routing problem is usually solved by heuristics. For example, the S-shape strategy [4,13] is a well-known heuristic in the literature that provides nonconfusing picking routes for pickers. In this study, the S-shape strategy is applied to determine the picking route.
(viii) Each delivery route starts and terminates at the supermarket.
(ix) Each customer can be served by one delivery route, and the capacity of each route is $Q_v$.
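Using the five-field notation $\alpha \mid \beta \mid \pi \mid \delta \mid \gamma$ [5], we can define the online IOPDP-MR as follows:
$$P_m \mid \text{online}, B \le Q_b, \text{batch} \mid V(\infty, Q_v), \text{routing} \mid Z \mid D_{\max} + TC \qquad (1)$$
Details of the five-field notation are presented in Table 3.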

Online Algorithm Design
The algorithm framework is discussed in Section 4.1, and the competitive analysis is presented in Section 4.2.

Algorithm Framework.
For the online order batching method with multiple pickers, Zhang et al. [3] proposed the online algorithm $A_{pick}$, whose competitive ratio is 2. The objective is to minimize the maximum delivery completion time. The algorithm assigns batch $j$ to the earliest available picker at time $t$ once $t \ge (1 + \theta)\,t^a_{i'} + \theta\,t^s_{i'} - t^s_j$, where $t$ is the earliest available time among the pickers, batch $j$ is the uncompleted batch with the longest service time $t^s_j$, order $i'$ is the order with the longest service time $t^s_{i'}$ in batch $j$, and $t^a_{i'}$ is the arrival time of order $i'$. The arrived orders are batched by applying the batching method C&W (ii) [14]. In this paper, we use the online algorithm $A_{pick}$ of Zhang et al. [3] to solve the order picking part.
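The start condition of $A_{pick}$ quoted above can be written as a one-line check. The following is only a minimal sketch: the function and argument names are hypothetical, and the value of $\theta$ is not given in this paper (it comes from [3]), so the value 0.5 used in the usage lines is purely illustrative.

```python
def can_start_batch(t, arrival_longest, service_longest, batch_service, theta):
    """Check the start condition t >= (1 + theta) * t^a_{i'} + theta * t^s_{i'} - t^s_j.

    t               -- earliest available time among the pickers
    arrival_longest -- arrival time t^a_{i'} of the order i' with the longest
                       service time in batch j
    service_longest -- service time t^s_{i'} of that order
    batch_service   -- service time t^s_j of batch j
    theta           -- parameter of the rule (value assumed, not from this paper)
    """
    threshold = (1 + theta) * arrival_longest + theta * service_longest - batch_service
    return t >= threshold


# Illustrative numbers only: threshold = 1.5 * 10 + 0.5 * 4 - 6 = 11.
picker_free_times = [12.0, 11.0, 15.0]
t = min(picker_free_times)  # earliest available picker
print(can_start_batch(t, arrival_longest=10.0, service_longest=4.0,
                      batch_service=6.0, theta=0.5))
```

The check only decides when a batch may start; forming the batches themselves (C&W (ii) in the text) is outside this sketch.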
For the online IOPDP, Zhang et al. [6] proposed a 4-competitive online algorithm. In their study, there were also multiple zones, but the number of vehicles in each zone was one. The round-trip travel time from the warehouse to zone $l$ was $T_l$, and the round-trip delivery cost from the warehouse to zone $l$ was $D_l$. The delivery decision was triggered at the end of each specific time interval. In Zhang et al. [3], the number of vehicles was unlimited, the vehicle capacity was $Q_v$, and the round-trip delivery cost from the warehouse to customers was a constant. However, the delivery method was direct, and the travel time was not considered. The delivery decision was triggered by the condition that the number of completed undelivered orders was at least $Q_v$. In this paper, some settings are identical to those of Zhang et al. [3], such as the number of vehicles, the vehicle capacity, and the fixed delivery cost, but the delivery method and the objective function are different. We consider the routing delivery method, which is more realistic and complex. The specific constraints are as follows:
(i) For the picking part, there are $m$ identical pickers. The orders are grouped into batches, and the size of each batch cannot exceed the capacity $Q_b$. Each picker can process one batch at a time.
(ii) For the delivery part, there are $|Z|$ delivery zones, the number of vehicles in each zone is unlimited, and the vehicle capacity is $Q_v$. Orders whose customer locations belong to the same zone can be delivered together by the routing method. Each delivery to zone $l$ incurs a fixed delivery cost $D_l$ and a variable delivery cost $t^d_{r_l}$.
The objective function contains a time-dependent function and a cost-dependent function. The time-dependent function is the maximum delivery completion time of all orders, that is, $D_{\max}$. The cost-dependent function is the total delivery cost, including the fixed and variable delivery costs, that is, $TC = FC + VC$. To minimize the fixed delivery cost, it is reasonable to trigger the delivery decision when the number of completed undelivered orders in zone $l$ reaches $Q_v$. When the number of completed undelivered orders in zone $l$ is less than $Q_v$, it might be beneficial to delay the delivery and wait for more orders, but the waiting time should not be too long. The variable delivery cost represents the time interval from the earliest picking completion time to the maximum delivery completion time, so waiting too long would increase the variable delivery cost. To limit the variable delivery cost, we therefore set an upper limit on the delivery time of each route, that is, $D_l$. The details of the online algorithm $A$ are shown in Algorithm 1.
Proof of Theorem 1. Let $C_{\max}$ be the maximum picking completion time. As we know, $C_{\max}(A) \le 2\,C_{\max}(A^*)$; the detailed proof can be found in [3]. Let $t^d_{r'}$ be the delivery time of the last delivery route $r'$. We get $Z_1(A) = C_{\max}(A) + t^d_{r'}$. As $Z_1(A^*) = C_{\max}(A^*)$ and $t^d_{r'}$ is generally much less than $C_{\max}(A^*)$, we have $Z_1(A)/Z_1(A^*) \le 3$.

□
Theorem 2. The cost-dependent function value of algorithm $A$ is not greater than $2(1 + Q_v)$ times that of algorithm $A^*$.
Proof of Theorem 2. A delivery route is called saturated if it contains $Q_v$ items; otherwise, it is called unsaturated. Let $q_l$ be the number of routes to zone $l$ in $A^*$ and $q^s_l$ ($q^u_l$) be the number of saturated (unsaturated) deliveries to zone $l$ in $A$. The fixed delivery costs in $A$ and $A^*$ are $\sum_{l \in Z} (q^s_l + q^u_l) D_l$ and $\sum_{l \in Z} q_l D_l$, respectively. As the delivery time of each route in $A$ is not greater than $D_l$, we have $Z_2(A) \le 2 \sum_{l \in Z} (q^s_l + q^u_l) D_l$. We know $Z_2(A^*) > \sum_{l \in Z} q_l D_l + \sum_{l \in Z} T^*_l$, where $T^*_l$ is the minimum total travel time needed to serve all customer orders with the capacitated vehicles in zone $l$. As $T^*_l$ is hard to quantify and compare with the other terms, we use $Z_2(A^*) > \sum_{l \in Z} q_l D_l$. As $q^u_l \le q_l Q_v$ and $q^s_l \le q_l$, we have
$$Z_2(A) \le 2 \sum_{l \in Z} (q^s_l + q^u_l) D_l \le 2 \sum_{l \in Z} (q_l + q_l Q_v) D_l = 2(1 + Q_v) \sum_{l \in Z} q_l D_l < 2(1 + Q_v)\, Z_2(A^*).$$
□
Table 3: The five-field notation.
$\alpha$, picker configuration (we regard "picker" as "machine" in this field):
  $P_m$: orders are processed by $m$ identical pickers.
$\beta$, restrictions and constraints on order parameters:
  online: orders arrive dynamically.
  $B \le Q_b$, batch: orders are processed in batches, and the capacity of a batch is no more than $Q_b$.
$\pi$, vehicle configuration:
  $V(\infty, Q_v)$: unlimited vehicles are available, and each vehicle can serve at most $Q_v$ customers.
  routing: batch delivery with routing method, i.e., orders going to different customers can be delivered together in the same route (where vehicle routing is a part of the decision).
$\delta$, number of customers:
  $Z$: the delivery area is separated into $|Z|$ zones.
$\gamma$, objective function:
  $w_1, w_2$: the weights of the time-dependent function and the cost-dependent function, respectively, where $w_1, w_2 > 0$.
  $D_{\max}$: the time-dependent function, that is, the maximum delivery completion time of all orders.
  $TC$: the cost-dependent function, that is, the total delivery cost, i.e., the sum of the fixed delivery cost ($FC$) and the variable delivery cost ($VC$), where $FC = \sum_{l \in Z} |R_l| \cdot D_l$ and $VC = \sum_{l \in Z} \sum_{r_l \in R_l} t^d_{r_l}$.
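The delivery rule described in this section (dispatch a vehicle to a zone when a full load of $Q_v$ completed orders is waiting, or when the delivery time of the route covering all waiting orders exceeds $D_l$) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, distances are Euclidean, the depot position and vehicle speed are taken from the experimental setting, the single-route TSP is solved by brute force (only practical for small loads), and the delivery time is equated with the tour time, ignoring any waiting component.

```python
from itertools import permutations
from math import dist

DEPOT = (2.5, 2.5)           # km; supermarket location used in the experiments
SPEED_KM_PER_MIN = 30 / 60   # 30 km/h expressed in km per minute


def tour_time(stops, depot=DEPOT, speed=SPEED_KM_PER_MIN):
    """Time in minutes of the shortest depot -> stops -> depot tour.

    Brute-force TSP over all visiting orders (assumption: small loads only).
    """
    if not stops:
        return 0.0
    best_length = float("inf")
    for order in permutations(stops):
        path = (depot, *order, depot)
        length = sum(dist(a, b) for a, b in zip(path, path[1:]))
        best_length = min(best_length, length)
    return best_length / speed


def should_dispatch(pending_stops, q_v, d_l):
    """Delivery trigger for one zone, given its completed undelivered orders.

    Rule 1: a full vehicle load is waiting.
    Rule 2: the route serving all waiting orders takes longer than d_l.
    """
    if len(pending_stops) >= q_v:
        return True
    return bool(pending_stops) and tour_time(pending_stops) > d_l


# One customer 1 km from the depot: a 2 km round trip takes 4 minutes.
print(tour_time([(2.5, 3.5)]))
print(should_dispatch([(2.5, 3.5)], q_v=5, d_l=3))
```

Because the full-load rule is checked first, the brute-force tour is only evaluated for fewer than $Q_v$ stops; a heuristic would be needed for anything larger.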

Numerical Experiments
It seems possible that the competitive ratio $2(1 + Q_v)$ of algorithm $A$ can be reduced, but it is hard to obtain such a result by competitive analysis. Thus, we present numerical experiments to further analyze the gaps between $Z(A)$ and $Z(A^*)$ under different situations.
The parameters of the O2O supermarket set in this paper, shown in Table 4, are frequently used in the literature [3,16]. The supermarket contains 1000 identical storage locations, and 100 storage locations are distributed on both sides of each aisle. Pickers retrieve items from the right and left sides of each aisle simultaneously. The depot is at the entrance of the leftmost aisle, as shown in Figure 2. The length of the aisle is 50 m, and the width between two adjacent aisles is 2 m. The service time includes the travel time, pick-up time, and setup time. The picker's travel speed is $v_{travel} = 48$ m/min, the pick-up speed is $v_{pick} = 6$ items/min, and the setup time is $t_{setup} = 3$ min. The storage policy is random storage. The number of pickers is 3, the number of delivery zones is 4, and the batch capacity is $Q_b = 10$.
From Theorems 1 and 2, we get $Z_1(A^*) \ge t^{arrive}_n$ and $Z_2(A^*) > \sum_{l \in Z} q_l D_l + \sum_{l \in Z} T^*_l$. We set $Z_1(A^*) \ge t^{arrive}_n$ and $Z_2(A^*) = \sum_{l \in Z} q_l D_l + \sum_{l \in Z} T^*_l$ in the experiments. $T^*_l$ can be calculated by solving the capacitated vehicle routing problem.
The interarrival times (the time between the arrivals of customer orders $i$ and $i + 1$) are exponentially distributed with arrival rate $\lambda = 4$. The total number of orders is $n = 100$. The x-coordinates and y-coordinates of the customer locations are randomly sampled from $U(0, 5\ \text{km})$. The locations are not known in advance but become available when the orders arrive. The supermarket is located at the middle of the square, i.e., (2.5 km, 2.5 km), which separates the delivery area into four zones, as shown in Figure 3. The average vehicle speed is 30 km/h.
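The order stream described above (exponential interarrival times with rate $\lambda = 4$, $n = 100$ orders, customer coordinates uniform on a 5 km square, four zones around the central supermarket) could be generated along the following lines. This is a sketch under assumptions: the function name, the dictionary layout, and the zone numbering 0 to 3 by quadrant are hypothetical (Figure 3 may number the zones differently), and the time unit of the arrival rate is left as in the text.

```python
import random


def generate_instance(n=100, rate=4.0, side_km=5.0, seed=0):
    """Sample one order stream with the parameters stated in the experiments."""
    rng = random.Random(seed)               # seeded for reproducibility
    depot = (side_km / 2, side_km / 2)      # supermarket at the square's centre
    orders, t = [], 0.0
    for i in range(n):
        t += rng.expovariate(rate)          # exponential interarrival time
        x = rng.uniform(0, side_km)
        y = rng.uniform(0, side_km)
        # Quadrant relative to the depot; numbering is an illustrative choice.
        zone = (1 if x >= depot[0] else 0) + (2 if y >= depot[1] else 0)
        orders.append({"id": i, "arrival": t, "x": x, "y": y, "zone": zone})
    return orders


orders = generate_instance()
print(len(orders), orders[0]["zone"], round(orders[-1]["arrival"], 2))
```

Customer locations are stored with each order but, as in the online setting of the paper, an algorithm should only read them once the order's arrival time has passed.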
To calculate $T^*_l$ in $Z_2(A^*)$ and $t^d_{r_l}$ in $Z_2(A)$, a genetic algorithm (GA) is used to obtain a near-optimal route for each delivery. The initial population size is 100, the number of iterations is 500, the crossover ratio is 0.8, and the mutation ratio is 0.4. The round-robin selection rule and the single-point crossover rule are used. The values of the vehicle capacity and the fixed delivery cost are varied based on a survey of O2O supermarkets in China. $Q_v$ is set to 5, 6, 7, 8, 9, or 10, and $D_l$ is set to 5, 10, 15, 25, 30, or 35. By choosing different values of $Q_v$ and $D_l$, 36 (6 × 6) problem classes are generated, and for each problem class, 50 instances are calculated. We report the average results of the 36 problem classes. The experiments are carried out on an Intel Core i7 processor with 8.0 GB RAM. The algorithm is implemented in MATLAB R2016a.
Table 5 shows the detailed results of $A$ and $A^*$ under the 36 problem classes, which include the number of deliveries in $A$ ($N(d)$ for short), $Z_1(A)$, $FC$, $VC$, $Z_2(A)$, the number of [...]
Algorithm 1 (online algorithm $A$):
(1) Picking part: schedule all arrived orders according to algorithm $A_{pick}$ [3] regardless of which zones they belong to. The completed orders are passed to the delivery part.
(2) Delivery part: for each zone $l$, a delivery decision is made once one of the following two situations happens: (a) the number of completed undelivered orders is $Q_v$; (b) the delivery time $t^d_{r_l}$ of the route is larger than $D_l$, and the route $r_l$ contains all the completed undelivered orders in zone $l$. As there is only one route at every delivery decision point, the delivery route can be calculated by solving the travelling salesman problem (TSP) [15].
We can see that the ratio $R$ ranges from 1.19 to 2.34. The smallest $R$ occurs when $Q_v = 5$ and $D_l = 35$, and the largest $R$ occurs when $Q_v = 1$ and $D_l = 10$. We can also see that $R$ decreases as $D_l$ increases.
This can be explained as follows: as $D_l$ increases, more completed orders can be delivered in full batches, so $N(d)$ decreases and $FC$ gets closer to $FC^*$. Moreover, when $D_l$ is large enough, $N(d)$ equals $N(d)^*$. For example, when $Q_v = 5$ and $D_l = 10$, the number of deliveries is 43. When $D_l$ increases to 25, the number of deliveries decreases to 21, which equals that of $A^*$. To sum up, the competitive ratio is less than 2.5, which supports the good performance of algorithm $A$.
Next, we discuss the values of the objective function with the weights $w_1 = w_2 = 1$. Figure 4 shows the objective function value (the total cost) under different $D_l$. We can see that the total cost decreases as $Q_v$ increases, which means that, for a fixed $D_l$, a larger vehicle capacity is better. We can also see that the reduction in total cost becomes smaller as $D_l$ decreases. When $D_l = 10$, the largest cost occurs at $Q_v = 5$, and the cost changes little as $Q_v$ increases (shown by the purple line in Figure 4). This means that if the fixed delivery cost per route is very small, increasing the vehicle capacity brings little benefit.
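For reference, the total cost discussed here follows from the definitions in Table 3, $FC = \sum_{l \in Z} |R_l| \cdot D_l$ and $VC = \sum_{l \in Z} \sum_{r_l \in R_l} t^d_{r_l}$. The sketch below evaluates it for a hand-made example. The function names and the input layout are hypothetical, and combining the two parts as the weighted sum $w_1 D_{\max} + w_2 TC$ is an assumption that reduces to $D_{\max} + TC$ for $w_1 = w_2 = 1$.

```python
def total_delivery_cost(route_times_by_zone, fixed_cost_by_zone):
    """Return (FC, VC, TC) for the routes actually driven.

    route_times_by_zone -- {zone: [delivery time of each route to that zone]}
    fixed_cost_by_zone  -- {zone: fixed delivery cost D_l per route}
    """
    fc = sum(len(times) * fixed_cost_by_zone[zone]
             for zone, times in route_times_by_zone.items())
    vc = sum(sum(times) for times in route_times_by_zone.values())
    return fc, vc, fc + vc


def objective(d_max, route_times_by_zone, fixed_cost_by_zone, w1=1.0, w2=1.0):
    """Weighted sum of D_max and TC (weighted-sum form is an assumption)."""
    _, _, tc = total_delivery_cost(route_times_by_zone, fixed_cost_by_zone)
    return w1 * d_max + w2 * tc


# Illustrative numbers, not taken from the paper's tables.
routes = {1: [8.0, 6.0], 2: [5.0]}   # three routes over two zones
fixed = {1: 10.0, 2: 10.0}           # D_l = 10 for both zones
print(total_delivery_cost(routes, fixed))   # FC = 30, VC = 19, TC = 49
print(objective(40.0, routes, fixed))       # 40 + 49 = 89
```

With this decomposition, the trade-off in the text is visible directly: fewer, fuller routes lower $FC$, while the route delivery times drive $VC$.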
For the value of $Z_1(A)$, i.e., the maximum delivery completion time, the change under different $Q_v$ and $D_l$ is not obvious. This is because the maximum delivery completion time is determined by the last delivery route of each zone, and in general, the last route contains fewer orders and is not fully loaded. Thus, the departure time of the last route is generally the picking completion time of the last order, which has little relationship with $Q_v$ and $D_l$ in Algorithm 1.
For an O2O supermarket, another important performance measure is the average flow time per order, which is the time interval between the arrival time and the delivery completion time. The average flow times are reported in Table 6.
We can see that the average flow time ranges from about 18 to 26 minutes, which is an acceptable flow time for an O2O supermarket. In general, the average flow time increases as $Q_v$ and $D_l$ increase, because larger $Q_v$ and $D_l$ make the orders wait longer after the picking process is completed. When $D_l$ is small (for example, 10∼25), an increase in $Q_v$ has little impact on the flow time. In contrast, when $D_l$ is large (for example, 30∼35), an increase in $Q_v$ leads to a large increase in flow time.
To sum up, the competitive ratio of Algorithm 1 is less than 2.5, and the average flow time for customer orders is less than 30 minutes, which supports the good performance of Algorithm 1 in both computation efficiency and customer satisfaction.

Conclusions
This paper is inspired by the online order fulfillment in O2O supermarkets, which includes two key processes: order picking and delivery. The order picking process is the online order batching problem with multiple pickers, and the delivery process is the capacitated vehicle routing problem. To provide an efficient online order fulfillment solution, we integrate the order picking process and the delivery process and formulate the joint problem as the online IOPDP-MR. The objective is to minimize the maximum delivery completion time and the total delivery cost. We propose the online Algorithm 1, whose competitive ratio is proved to be $2(Q_v + 1)$. As it is hard to obtain a lower competitive ratio theoretically, numerical experiments are conducted to analyze the gaps between $Z(A)$ and $Z(A^*)$ under different situations. The main contributions of this study are as follows: (1) modelling a specific IPDP, namely, the online IOPDP-MR for O2O supermarkets; (2) proposing a $2(Q_v + 1)$-competitive algorithm to solve the online IOPDP-MR; and (3) showing experimentally that the competitive ratio is less than 2.5, which implies a good computation efficiency of Algorithm 1. Meanwhile, several managerial suggestions can be drawn for implementing algorithm $A$. (1) A larger delivery cost per route generally leads to a larger total cost. (2) If the fixed delivery cost per route is very small, changing the vehicle capacity does little to decrease the total cost. (3) If the delivery cost per route is high, increasing the vehicle capacity can effectively reduce the total cost. However, the flow time for customers increases, which may reduce their satisfaction.
There are some limitations of this paper. The competitive ratio is proven to be $2(Q_v + 1)$, which is not small enough. Further research will focus on investigating the best possible competitive ratio for algorithm $A$. Moreover, the delivery route is calculated by a general GA, and the time windows of the customers are not considered. We will further improve the study by proposing an improved heuristic algorithm to solve the capacitated VRP with time windows. We are also interested in investigating the order picking and delivery problem under a multitemperature transportation mode.

Data Availability
No data are needed in the manuscript.

Conflicts of Interest
The authors declare that they have no conflicts of interest.