Solving a Closed-Loop Location-Inventory-Routing Problem with Mixed Quality Defects Returns in E-Commerce by Hybrid Ant Colony Optimization Algorithm

This paper presents a closed-loop location-inventory-routing problem model that considers both quality defect returns and nondefect returns in an e-commerce supply chain system. The objective is to minimize the total cost incurred in both the forward and reverse logistics networks. Because the model is NP-hard, we propose a hybrid ant colony optimization algorithm (HACO) to solve it. Our experimental results show that the proposed HACO is efficient and effective in solving this model.


Introduction
According to eMarketer, worldwide business-to-consumer (B2C) e-commerce sales reached $1.471 trillion in 2014, increasing by nearly 20% over 2013 [1]. Customers have grown accustomed to returning unwanted products to the store for any reason. It is reported that customer returns of online purchases range from 18% to 74% of original orders in the e-commerce environment [2,3]; thus enterprises take various measures to prevent quality defects. However, quality defects are inevitable. It is therefore necessary to take into account both quality defect returns and nondefect returns, which we call mixed quality defect returns (MQDR), when considering the closed-loop supply chain as a support system in the e-commerce environment.
As a classic discrete dynamics problem, the customer service level is determined by three important decisions: the facility location decision, the inventory decision, and the transportation decision [4]. Facility location, inventory control, and transportation optimization are highly related. For example, delivery in small lots at high frequency reduces the inventory cost but increases the transportation cost. In addition, the facility location decision needs to consider the inventory and distribution decisions. Perl and Sirisoponsilp [5] discuss the interdependence between the three key elements. Ballou and Masters [6] provide a schematic representation of the interrelationships among facility location, inventory control, and transportation optimization.
Few studies address the integrated optimization of the location-inventory-routing problem (LIRP). Some researchers have attempted to study the LIRP [17]. Liu and Lee [18] first studied this problem; they proposed a two-phase heuristic method to solve the multidepot location-routing problem (MDLRP) considering inventory optimization. To avoid local optima, Liu and Lin [19] designed a global optimizing heuristic method to find solutions for the LIRP. Shen and Qi [20] presented an algorithm based on Lagrangian relaxation to minimize the inventory and routing costs in strategic location models. They focused on the layout phase and used continuous approximation to estimate the optimal routing cost, but the vehicle routing was not optimized in their models. Javid and Azad [21] presented a novel LIRP model and proposed a heuristic method containing two stages: a constructive stage and an improvement stage. Ahmadi-Javid and Seddighi [22] presented a mixed-integer programming model and a three-phase heuristic to solve the LIRP with a multisource distribution logistics network. Guerrero et al. [23] studied the LIRP with deterministic demand and provided a hybrid algorithm to solve the problem. Zhang et al. [24] proposed a hybrid metaheuristic solution to the LIRP considering multiple depots and geographically dispersed customers. Nekooghadirli et al. [25] presented a novel biobjective LIRP model considering a multiperiod and multiproduct system. Guerrero et al. [26] developed a relax-and-price heuristic to solve the ILRP; to handle two dependent constraint sets of exponential size, they combined Lagrangian relaxation with a column generation technique.
However, little research has been conducted on the LIRP considering returns. Li et al. [27] presented the HGSAA algorithm to solve a LIRP model considering returns in an e-supply chain environment. To be more consistent with reality, Liu et al. [28] introduced stochastic demand into the LIRP considering returns in e-commerce and proposed a PPGASA algorithm as the solution approach.
These two studies focus mainly on returns without quality defects and do not consider MQDR. In this paper, we propose a model of the closed-loop LIRP with MQDR. To the best of our knowledge, this is the first work to introduce MQDR into the LIRP in e-commerce. An effective hybrid algorithm named hybrid ant colony optimization (HACO) is provided to solve this model. Results on numerical instances indicate that HACO outperforms ant colony optimization (ACO) in terms of optimal solution, iterations, and computing stability.
The remainder of this paper is organized as follows. Section 2 presents the mathematical model of the LIRP with MQDR. Section 3 proposes the solution approach named HACO. Section 4 analyses the parameters of HACO and shows the results of different experiments. Section 5 gives the conclusion and future research directions.

Model Formulation
As we all know, the return rate in e-commerce is higher than in traditional commerce. Some returns, caused by personal dissatisfaction or a mistaken purchase, have no quality defects. These returns can reenter the market after a simple repackaging process without being recovered [29]. The other returns result from quality defects and need to be sent back to the plant to be recovered.
In order to meet the needs of MQDR, a merchandise center (MC) is needed to deliver normal merchandise to the downstream demand points (DPs) and to collect the returned merchandise. An MC integrates the functions of a distribution center and a recycling center and provides quality inspection and repackaging services. The returned merchandise is collected at the MCs. Returned merchandise without quality defects becomes resalable normal items after repackaging at the MCs. The plant recovers the returns with quality defects and brings them to the market again.
The operation mode of the system is shown in Figure 1. The closed-loop supply chain in this paper consists of one plant, multiple MCs, multiple DPs, and a single type of product with a continuous inventory policy under the e-commerce environment.
The goal of this study is to decide the number and locations of MCs, arrange the vehicle routes, and determine the ordering times on each route. To minimize the total cost of logistics operations, this problem involves the following three decisions: (1) location decisions: obtain the optimal number of MCs and their locations; (2) inventory management: determine the ordering times on each route; (3) routing optimization: arrange the vehicles to deliver merchandise and collect returns.
To account for MQDR, we adopt assumptions (1)-(8) from Li et al. [27]: since a single-product system is studied in this paper, assumption (1) is necessary; in the capacitated vehicle routing problem, assumption (2) should be satisfied [30]; assumption (3) eliminates the indeterminacy arising from different types of vehicles; assumption (4) means that each DP is served by exactly one vehicle route [31]; assumption (5) ensures that each route returns to the same MC after traversing its DPs; assumption (6) follows earlier published papers that consider uncapacitated MCs [32]; assumption (7) treats MCs as both distribution centers and recycling centers; assumption (8) is a simplification of reality [33].
The returned merchandise without quality defects is processed and repackaged at the MCs, while the rest is shipped back to the plant for reprocessing once a predetermined quantity has accumulated at the MCs. Assume that the demand at each retailer is known and let $R$ be the set of candidate MCs. Let $S$ be the set of DPs and let $n$ be the number of DPs. Let $V$ be the set of vehicles from the MCs to DPs. Let $U = R \cup S$. The decisions of the firm include the ordering times $N_{vr}$ of MC $r$ on route $v$. According to the aforementioned assumptions, the inventory levels depend on both demand and the quantity of MQDR. So, during each replenishment cycle, the holding cost of MCs is $h \sum_{v \in V} \sum_{r \in R} \sum_{j \in S} \frac{d_j + q_j + p_j}{2 N_{vr}} Y_{rj}^{v}$, where $h$ is the annual inventory holding cost per unit of merchandise, $d_j$ is the mean (daily) demand of DP $j$, $q_j$ and $p_j$ are the quantities of merchandise without and with quality defects returned by DP $j$ per day, and $Y_{rj}^{v}$ indicates the assignment of DP $j$ to MC $r$ on route $v$.
To describe the logistics distribution costs exactly, we use the following parameters: the transportation cost per unit product from the plant to MC $r$, the delivering cost per unit distance, the distance from node $i$ to node $j$, and the number of working days per year. The total transportation costs from the plant to the DPs through the MCs are expressed in terms of these parameters together with the fixed cost of dispatching vehicles per time at MC $r$, the fixed (annual) administrative and construction cost of MC $r$, and the ordering cost per unit product from the plant to MC $r$. Given the returning cost per unit of merchandise from DPs to MCs, the total reverse transportation costs from the DPs back to the MCs are obtained in the same way. The cost of dealing with mixed quality defects depends on the inspecting cost per unit of returned product and on the repackaging cost per unit of returned merchandise without quality problems at the MCs.
In summary, the model is formulated as objective function (4). It is easy to see that (4) is convex in the ordering times $N_{vr}$, so the objective can be simplified by solving for $N_{vr}$: the optimal value $N_{vr}^{*}$ is obtained by setting the derivative of the objective with respect to $N_{vr}$ to zero, which gives (5). Given a known $N_{vr}^{*}$, the optimization problem (4) can be rewritten as (6), subject to constraints (7)-(17). The objective function (6) minimizes the total cost. Constraint (7) ensures that each selected MC is not empty; (8) ensures that each DP is visited by a unique vehicle that belongs to a certain MC; (9) ensures that the amount delivered from an MC on each route is within the vehicle capacity; (10) ensures that each route has only one vehicle; (11) ensures that each DP is followed by exactly one node; (12) ensures that every DP node of the system is serviced before service moves on to the others; (13) eliminates subtours. Constraint (14) ensures that each DP is assigned to an MC when there exists a route that starts from the MC and passes through the DP. Constraints (15)-(17) ensure the nonnegativity and integrality of the decision variables.
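The convexity argument can be illustrated with a small worked derivation. This is only a sketch under an assumption: suppose the terms of the objective that involve a single ordering-times variable $N$ consist of a cost that grows linearly in $N$ and a holding cost that falls as $1/N$. The positive constants $A$ and $B$ are hypothetical placeholders and are not the coefficients of objective (4).

```latex
\begin{align*}
f(N)   &= A N + \frac{B}{N}, \qquad A > 0,\ B > 0,\ N > 0, \\
f'(N)  &= A - \frac{B}{N^{2}}, \\
f''(N) &= \frac{2B}{N^{3}} > 0 \quad \text{(so $f$ is convex for $N > 0$)}, \\
f'(N^{*}) = 0 \;&\Longrightarrow\; N^{*} = \sqrt{\frac{B}{A}}, \qquad
f(N^{*}) = 2\sqrt{A B}.
\end{align*}
```

Under this assumed form, substituting $N^{*}$ back removes the ordering-times variable from the objective, which mirrors the step from (4) to (6).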

Solution Approach
Like the VRP, the closed-loop LIRP is an NP-hard problem, since it contains the VRP and is more complex. Generally speaking, no complete, efficient, and accurate analytic algorithm exists for NP-hard problems; ant colony optimization (ACO), an evolutionary computation (EC) algorithm, has proved very successful and has been widely applied to static and dynamic problems [34]. However, ACO does not distinguish between the results of ant behavior: pheromone is distributed in every direction during an iteration, which leads to low search efficiency, and the algorithm may get trapped in a local optimum unless preventive measures are taken. On the other hand, the artificial bee colony (ABC) algorithm provides an effective mechanism for escaping local optima and finding the global optimal solution [35]. So, in this study, we present a hybrid ant colony optimization algorithm that combines ACO and ABC to solve the above LIRP model.

Initialize Solution.
Since natural-number coding is efficient for these problems, a solution sequence is composed of the candidate MCs, numbered $(1, 2, \ldots, m)$, and the DPs, numbered $(m + 1, \ldots, m + n)$, where $m$ is the number of candidate MCs and $n$ is the number of DPs. A candidate solution of our proposed model is described by such a natural-number sequence. As an example, Figure 1 illustrates the encoding with the individual feasible solution {1 8 13 5 3 9 14 15 10 11 6 4 7 16 17 12}.
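One way such a sequence could be read is sketched below. The decoding rule is an assumption made for illustration, not taken from the paper: IDs up to `num_mcs` are treated as candidate MCs, larger IDs as DPs, and each DP is attached to the most recent MC that precedes it. The value `num_mcs=5` is also an assumption, chosen because it makes IDs 6 to 17 in the example exactly twelve distinct DPs. The function name `decode_sequence` is hypothetical.

```python
from typing import Dict, List, Optional


def decode_sequence(sequence: List[int], num_mcs: int) -> Dict[int, List[int]]:
    """Group the DPs of a natural-number sequence under the MC that precedes them.

    Assumed rule: IDs 1..num_mcs are candidate MCs, larger IDs are DPs.
    """
    assignment: Dict[int, List[int]] = {}
    current_mc: Optional[int] = None
    for node in sequence:
        if node <= num_mcs:
            # An MC id opens a new group.
            current_mc = node
            assignment.setdefault(current_mc, [])
        else:
            # A DP id joins the group of the current MC.
            if current_mc is None:
                raise ValueError("sequence must start with an MC id")
            assignment[current_mc].append(node)
    return assignment


example = [1, 8, 13, 5, 3, 9, 14, 15, 10, 11, 6, 4, 7, 16, 17, 12]
groups = decode_sequence(example, num_mcs=5)
# MCs that received no DPs are treated here as not opened (an assumption).
opened = {mc: dps for mc, dps in groups.items() if dps}
```

Under these assumptions the example opens MCs 1, 3, and 4. Splitting each MC's DP list into capacity-feasible vehicle routes is a separate step that this sketch does not attempt.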
In the HACO, the moving strategy of an ant at node $i$ depends on the pseudorandom proportional rule. The rule gives the ant both exploitation and exploration ability, which means the ant is guided by the pheromone trails as well as by the heuristic information. In this case, the ant has a higher degree of exploring unknown knowledge. The rule combines the pheromone values $\tau_{ij}$ and the heuristic values $\eta_{ij}$, where $\tau_{ij}$ is the density of pheromone remaining on the edge $(i, j)$, $\eta_{ij}$ is the inverse of the distance between node $i$ and node $j$, $\alpha$ and $\beta$ are user-defined parameters weighting the pheromone concentration and the heuristic information, respectively, and allow($k$) is the set of nodes still to be visited by ant $k$.
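The quantities defined above match the standard random-proportional rule of ant colony methods, in which the probability of moving from $i$ to $j$ is proportional to $\tau_{ij}^{\alpha}\eta_{ij}^{\beta}$ over the allowed nodes. The sketch below assumes that standard form. The threshold `q0` for choosing between greedy exploitation and probabilistic exploration is a hypothetical parameter name, and the small distance matrix is invented for the example.

```python
import random
from typing import Dict, List


def transition_probabilities(i: int, allowed: List[int], tau: List[List[float]],
                             eta: List[List[float]], alpha: float,
                             beta: float) -> Dict[int, float]:
    """Random-proportional probabilities over the nodes still allowed from node i."""
    weights = {j: (tau[i][j] ** alpha) * (eta[i][j] ** beta) for j in allowed}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def choose_next(i: int, allowed: List[int], tau: List[List[float]],
                eta: List[List[float]], alpha: float, beta: float,
                q0: float, rng: random.Random) -> int:
    """Pseudorandom proportional choice: exploit with probability q0, else explore."""
    probs = transition_probabilities(i, allowed, tau, eta, alpha, beta)
    if rng.random() <= q0:
        # Exploitation: take the most attractive node.
        return max(probs, key=probs.get)
    # Exploration: sample a node according to the probabilities.
    return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]


# Toy data (invented): symmetric distances between four nodes.
dist = [[0, 1, 2, 4], [1, 0, 1, 2], [2, 1, 0, 1], [4, 2, 1, 0]]
eta = [[0.0 if a == b else 1.0 / dist[a][b] for b in range(4)] for a in range(4)]
tau = [[1.0] * 4 for _ in range(4)]
probs = transition_probabilities(0, [1, 2, 3], tau, eta, alpha=1.0, beta=2.0)
greedy = choose_next(0, [1, 2, 3], tau, eta, 1.0, 2.0, q0=1.0, rng=random.Random(0))
```

With uniform pheromone, the nearest node gets the largest probability, and setting `q0` to 1.0 makes the choice purely greedy.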

The ABC Phase.
In order to improve the global search performance of our algorithm, we integrate the scout bee search phase into the ACO. Scout bees are free bees that look for a new, better solution in the neighborhood of a known solution. As soon as a scout bee finds a new solution, she turns into an employed bee. If there is no improvement in the quality of the solution, the bee abandons that source and continues to search for another new solution.
To meet the requirements of the coding sequence type, the scout bee search is carried out by two operations, namely, random array reverse (RAR) and random swap (RS).
Step 1. Set the initial number of scout bees $K$ and the probability $p_0$.
Step 2. Randomly generate two positions $i$ and $j$ with $i < j$.
Step 3. Generate a random number $r \in [0, 1]$. If $r > p_0$, go to Step 4; otherwise, go to Step 5.
Step 4. Reverse the array between positions $i$ and $j$ as a new solution.
Step 5. Swap the elements at positions $i$ and $j$ as a new solution.
Step 6. Calculate the cost of the new solution.
Step 7. Keep the best solution for the next iteration and return to Step 2.
The pseudocode of ABC is shown in Pseudocode 1.
Procedure: ABC
Input: the initial sequence, the number of scout bees K, and p_0
Output: the better sequence
Begin
  Take K and p_0
  while k < K
    r = rand[0, 1]
    take i < j
    if r > p_0
      Reverse the array between positions i and j
    else
      Swap the elements at positions i and j
  Output: the better sequence
End
Pseudocode 1: Pseudocode of an ABC framework.
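The scout bee steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names, the `cost` callback, and the toy cost function are hypothetical, and the sketch ignores LIRP feasibility checks (for instance, whether a sequence still starts with an MC).

```python
import random
from typing import Callable, List


def random_array_reverse(seq: List[int], i: int, j: int) -> List[int]:
    """RAR: return a copy of seq with the segment seq[i..j] reversed."""
    return seq[:i] + seq[i:j + 1][::-1] + seq[j + 1:]


def random_swap(seq: List[int], i: int, j: int) -> List[int]:
    """RS: return a copy of seq with the elements at positions i and j exchanged."""
    new_seq = list(seq)
    new_seq[i], new_seq[j] = new_seq[j], new_seq[i]
    return new_seq


def scout_bee_search(seq: List[int], cost: Callable[[List[int]], float],
                     num_scouts: int, p0: float, rng: random.Random) -> List[int]:
    """Try num_scouts random neighbours and keep the best sequence found."""
    best, best_cost = list(seq), cost(seq)
    for _ in range(num_scouts):
        i, j = sorted(rng.sample(range(len(best)), 2))  # two positions, i < j
        if rng.random() > p0:
            candidate = random_array_reverse(best, i, j)
        else:
            candidate = random_swap(best, i, j)
        candidate_cost = cost(candidate)
        if candidate_cost < best_cost:  # keep only improvements
            best, best_cost = candidate, candidate_cost
    return best


def toy_cost(seq: List[int]) -> float:
    """Invented cost: total jump size between neighbouring values."""
    return float(sum(abs(a - b) for a, b in zip(seq, seq[1:])))


start = [3, 1, 4, 2, 5]
improved = scout_bee_search(start, toy_cost, num_scouts=50, p0=0.5,
                            rng=random.Random(1))
```

Because only improving candidates are accepted, the returned sequence never costs more than the input, and both operators preserve the set of nodes in the sequence.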

Global Pheromone Trail Update.
The global pheromone updating rule is triggered at the end of each iteration to reward tours that are in line with the objective of impedance minimization. This strategy reinforces the pheromone density on the edges belonging to the rewarded tour and increases the likelihood that this tour will also be selected by other ant agents. The global updating rule depends on three quantities: a constant initial pheromone; the cost of the best of all the tours produced by all ant agents from the beginning of the iteration; and the pheromone evaporation coefficient $\rho \in (0, 1]$. To improve the pheromone trail quality, a part of the worst results is removed.
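A common form of global update that is consistent with this description is sketched below: every edge evaporates by the factor $(1 - \rho)$, and the edges of the best tour receive a deposit inversely proportional to the best cost. This specific form is an assumption, since several variants exist in the ACO literature. The constant `q` and the function name are hypothetical.

```python
from typing import List


def global_update(tau: List[List[float]], best_tour: List[int],
                  best_cost: float, rho: float, q: float = 1.0) -> List[List[float]]:
    """Return a new pheromone matrix after evaporation and best-tour reinforcement."""
    n = len(tau)
    # Evaporation on every edge.
    new_tau = [[(1.0 - rho) * tau[i][j] for j in range(n)] for i in range(n)]
    # Deposit on the edges of the best tour, treated as a closed tour.
    deposit = q / best_cost
    closed = best_tour + best_tour[:1]
    for a, b in zip(closed, closed[1:]):
        new_tau[a][b] += deposit
    return new_tau


tau = [[1.0] * 3 for _ in range(3)]
updated = global_update(tau, [0, 1, 2], best_cost=10.0, rho=0.1)
```

In this toy case the best-tour edges end at 0.9 + 0.1 = 1.0, while all other edges decay to 0.9, so the best tour becomes relatively more attractive.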

Local Pheromone Trail Update.
In addition to the global pheromone trail updating rule, the selected ants update the local pheromone trail while passing an arc $(i, j)$. In contrast to the normal updating rule, which increases the pheromone density when ants cross an arc, the local rule lowers it. The purpose of the local pheromone trail update is to prevent stagnation, because the arc becomes less desirable for the following ants. The local updating rule uses $\tau_0$, the constant initial value of the pheromone trails, and a user-defined coefficient that lowers the pheromone density of the arcs traversed by the ants.
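The local rule described here resembles the local update of the Ant Colony System, which pulls the pheromone of a traversed arc back toward its initial value. The sketch below assumes that form; the coefficient name `xi` and the function name are hypothetical.

```python
from typing import List


def local_update(tau: List[List[float]], i: int, j: int,
                 xi: float, tau0: float) -> float:
    """Move tau[i][j] toward tau0 by the fraction xi and return the new value."""
    tau[i][j] = (1.0 - xi) * tau[i][j] + xi * tau0
    return tau[i][j]


trail = [[1.0, 2.0], [2.0, 1.0]]
new_value = local_update(trail, 0, 1, xi=0.1, tau0=1.0)
```

When the current value exceeds `tau0`, each traversal lowers it, which makes the arc less attractive to the ants that follow within the same iteration.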

Algorithm Flow
Step 1. Derive the formula for the optimal ordering times $N_{vr}^{*}$.
Step 2. Set the initial parameters for the model: the set of candidate MCs $R$, the set of DPs $S$, the set of vehicles $V$, the inspecting cost, the ordering cost, the transportation costs, the daily demand, the vehicle dispatching cost, the fixed (annual) administrative and construction cost, the vehicle capacity, the holding cost $h$, and the returning cost.
Step 3. Set the parameters for HACO: the number of ants, the terminating number of iterations, the pheromone concentration impact factor $\alpha$, the heuristic information impact factor $\beta$, the pheromone evaporation rate $\rho$, the constant initial pheromone, and the array-reverse mutation probability $p_0$.
Step 4. Using the unit pheromone matrix $\tau_{ij}$, calculate the transition probability.
Step 5. Ant solutions generation module: each ant generates a feasible solution after traversing the DPs.
Step 7. Scout bee module: generate a random number $0 < r < 1$; if $r > p_0$, perform the random array reverse operation. Otherwise, perform the random swap operation.
Step 8. Pheromone updating module: update the pheromone trails according to the global and local updating rules.
Step 9. Termination module: if the parent optimal solution and the offspring optimal solution remain equal over a set number of consecutive generations, stop the algorithm. Otherwise, increment the iteration counter and return to Step 3.
The pseudocode of HACO is shown in Pseudocode 2.
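How the modules fit together can be sketched on a toy problem. The code below is not the authors' HACO: it applies the same loop (ant construction, scout bee improvement, global pheromone update, stagnation-based termination) to a plain closed tour over points in the plane instead of the LIRP. All function names, parameter names, and default values are hypothetical; the local update and the ordering-times step are omitted, and points are assumed to be distinct.

```python
import math
import random
from typing import List, Tuple

Point = Tuple[float, float]


def tour_cost(tour: List[int], pts: List[Point]) -> float:
    """Length of the closed tour that visits pts in the given order."""
    return sum(math.dist(pts[a], pts[b]) for a, b in zip(tour, tour[1:] + tour[:1]))


def build_tour(n: int, tau, eta, alpha: float, beta: float, rng) -> List[int]:
    """One ant builds a tour from node 0 with the random-proportional rule."""
    tour, allowed = [0], list(range(1, n))
    while allowed:
        i = tour[-1]
        weights = [(tau[i][j] ** alpha) * (eta[i][j] ** beta) for j in allowed]
        nxt = rng.choices(allowed, weights=weights, k=1)[0]
        tour.append(nxt)
        allowed.remove(nxt)
    return tour


def scout(tour: List[int], pts: List[Point], num_scouts: int, p0: float, rng):
    """Scout bee phase: random reversals or swaps, keeping only improvements."""
    best, best_c = tour, tour_cost(tour, pts)
    for _ in range(num_scouts):
        i, j = sorted(rng.sample(range(1, len(best)), 2))  # start node stays fixed
        cand = list(best)
        if rng.random() > p0:
            cand[i:j + 1] = reversed(cand[i:j + 1])   # random array reverse
        else:
            cand[i], cand[j] = cand[j], cand[i]       # random swap
        c = tour_cost(cand, pts)
        if c < best_c:
            best, best_c = cand, c
    return best, best_c


def haco(pts: List[Point], ants: int = 10, max_iter: int = 200, stall_limit: int = 30,
         alpha: float = 1.0, beta: float = 2.0, rho: float = 0.1, p0: float = 0.5,
         num_scouts: int = 20, seed: int = 0):
    rng = random.Random(seed)
    n = len(pts)
    eta = [[0.0 if i == j else 1.0 / math.dist(pts[i], pts[j]) for j in range(n)]
           for i in range(n)]
    tau = [[1.0] * n for _ in range(n)]               # unit pheromone matrix
    best, best_c, stall = None, math.inf, 0
    for _ in range(max_iter):
        improved = False
        for _ in range(ants):
            tour, c = scout(build_tour(n, tau, eta, alpha, beta, rng),
                            pts, num_scouts, p0, rng)
            if c < best_c - 1e-12:
                best, best_c, improved = tour, c, True
        for row in tau:                               # evaporation
            for j in range(n):
                row[j] *= 1.0 - rho
        for a, b in zip(best, best[1:] + best[:1]):   # reinforce the best tour
            tau[a][b] += 1.0 / best_c
            tau[b][a] += 1.0 / best_c
        stall = 0 if improved else stall + 1
        if stall >= stall_limit:                      # best unchanged for too long
            break
    return best, best_c


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
best_tour, best_len = haco(square)
```

The termination test mirrors Step 9: the loop stops once the best solution has not changed for `stall_limit` consecutive iterations.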

Computational Experiments and Results Analysis
In this section, numerical simulations illustrate the performance of HACO compared with the traditional ACO. Both algorithms are implemented in Matlab R2014a and run on a computer with 8 GB of main memory and a 3.6 GHz CPU. All instances come from the LRP database of the University of Aveiro [36]. The randomly generated instance parameters follow the uniform distributions U(16, 20), U(6, 10), U(21, 25), U(12, 25), and U(2, 5).
We run the program 50 times on the same computer. The performance of ACO and HACO varies with the parameter values, as shown in Tables 1-6. In these tables, C.V. denotes the coefficient of variation.
Tables 1-6 show the effect of the parameters on the objective function values. The data are normalized along two dimensions, cost and iterations, and summarized by three indicators, namely, mean, standard deviation, and coefficient of variation. To find the minimal cost, we choose the parameter values for which the cost is lower and more stable.
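The three indicators can be computed from a list of run results as sketched below. The sample values are invented for illustration and are not results from the paper. The sketch uses the sample standard deviation; whether the tables use the sample or the population form is not stated, so that choice is an assumption.

```python
import statistics
from typing import Dict, List


def summarize(values: List[float]) -> Dict[str, float]:
    """Mean, sample standard deviation, and coefficient of variation of the runs."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)          # sample standard deviation (n - 1)
    return {"mean": mean, "std": std, "cv": std / mean}


# Invented example values standing in for the costs of repeated runs.
runs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
stats = summarize(runs)
```

A lower coefficient of variation means the results are more tightly clustered relative to their mean, which is how stability is compared between parameter settings.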

Computational Experiment.
To get a reliable conclusion, we ran the program another 50 times on the same computer with the best parameter values on Gaskell 67-22 × 5. One of the best objective function values in the 50 runs of HACO is 30.2 million CNY. Table 7 shows the solution. MCs were established at MC1, MC2, and MC5 with five vehicle distribution routes. Figure 2 shows the topological structure of the closed-loop supply chain. Figure 3 shows the trend of the optimal objective function values over the iterations. The fluctuation curves of the optimal objective function value differ between the algorithms, as shown in Figure 4. As shown in Figure 3, the cost and the number of iterations of HACO are lower than those of ACO; in Figure 4, the range and mean value of the minimum cost of HACO are also lower than those of ACO. Both observations imply that HACO is more efficient than ACO in solving the LIRP.

Extended Experiments.
In this section, a series of instances is used to show that HACO is more efficient and stable than classical software and ACO. To ensure that the demands of the DPs do not exceed the vehicle capacity, some instances need to be adjusted: in this paper, the daily demands are set to 1/10 of the corresponding demands in the database.
As we know, Lingo is a representative classical optimization software tool. Thus we used Lingo 11.0 to solve a small-sized instance named Peal 183-12 × 2 and two medium-sized instances named Gaskell 67-22 × 5 and Gaskell 67-36 × 5; the results are shown in Table 8.
Each instance was run 50 times by HACO and ACO with their respective optimized parameter values; the results are shown in Tables 9 and 10.

Result Analysis.
According to Table 8, we found that (1) for the small-sized instance, HACO obtains a better result than Lingo in less time and (2) for the medium-sized instances, Lingo cannot reach the global optimum within 1 hour, while HACO solves the problem in a short time.
Tables 9 and 10 show that HACO is more efficient than ACO for the following reasons: (1) the cost of HACO is significantly lower than that of ACO ($p < 0.05$); (2) the difference in the number of iterations between HACO and ACO is not significant ($p > 0.05$); (3) HACO is more stable than ACO, as its coefficient of variation (C.V.) is lower. To sum up, our algorithm reduces the cost with the same number of iterations compared with ACO.

By improving the pheromone updates and adding bee colony searching, we improve the solution quality of the algorithm and provide better guidance for the ant searching process. The results of the numerical simulations show that HACO can obtain better results with fewer iterations. Hence, compared with ACO, HACO is a better approach for solving this LIRP with MQDR.

Conclusion and Future Research
With the development of e-commerce, the customer return rate remains high, and the returns are MQDR, which can reenter the market after being repackaged or recovered. In this research, we built a closed-loop LIRP model considering both quality defect returns and nondefect returns, which we call MQDR in this paper. We performed an extensive computational study and observed the following results.
(1) Considering MQDR is beneficial for the formulation presented; the MQDR and the closed-loop pattern with returns are features of the proposed e-commerce problem that have not been considered in previous work.
(2) Since evolutionary computation algorithms have proved successful in tackling NP-hard problems, a hybrid algorithm is proposed that combines the ACO algorithm and the ABC algorithm to solve the LIRP. HACO integrates the scout bee searching phase into the ACO to improve the global searching ability.
(3) The performance of HACO is evaluated on instances from the LRP database, and HACO outperforms ACO in terms of convergence, optimal solution, and computing stability. This numerical study shows the efficiency and effectiveness of the solution method.
However, incorporating other elements into the LIRP opens further research directions. Analyzing the model under dynamic and time-varying customer demand would be a valuable subject. Experimental design and verification by discrete dynamics simulation should also be established. The fruit fly optimization algorithm (FOA), another EC algorithm, has attracted the attention of various researchers [37]. It is important to apply these models and algorithms to the operation and management of enterprises to improve the decision-making efficiency of e-commerce logistics systems.

Figure 1: Closed-loop supply chain for a single product.

Parameters Discussion.
Parameter value selection is crucial to the efficiency of algorithms. An instance named Gaskell 67-22 × 5 from the database, which contains the node coordinates and the DP demands, is used to determine the optimal parameters. Gaskell 67 is the instance's name and 22 × 5 means 5 candidate MCs for 22 DPs. The inventory holding cost is $h = 2$, the vehicle capacity is 500, the transportation cost is 2, the returning cost is 2, the number of working days is 300, and the delivering cost per unit distance is 0.7. The other parameters of the instance are drawn from uniform distributions, the first of which is U(16, 20).

Figure 2: Topological structure of the network.

Figure 4: The fluctuation curve of optimal objective function value.
$N_{vr}$: ordering times of MC $r$ on route $v$.

Table 8: Comparisons between HACO and Lingo.

Table 9: Optimal objective function values of two algorithms (CNY).