A Competitive Bilevel Programming Model for Green, CLSCs in Light of Government Incentives

. Te growth of world population has fueled environmental, legal, and social concerns, making governments and companies attempt to mitigate the environmental and social implications stemming from supply chain operations. Te state-run Environmental Protection Agency has initially ofered fnancial incentives (subsidies) meant to encourage supply chain managers to use cleaner technologies in order to minimize pollution. In today’s competitive markets, using green technologies remains vital. In the present project, we have examined a class of closed-loop supply chain competitive facility location-routing problems. According to the framework of the competition, one of the players, called the Leader, opens its facilities frst. Te second player, called the Follower, makes its decision when Leader’s location is known. Afterwards, each customer chooses an open facility based on some preference huf rules before returning the benefts to one of the two companies. Te follower, under the infuence of the leader’s decisions, performs the best reaction in order to obtain the maximum capture of the market. So, a bilevel mixed-integer linear programming model is formulated. Te objective function at both levels includes market capture proft, fxed and operating costs, and fnancial incentives. A metaheuristic quantum binary particle swarm optimization (PSO) is developed via Benders decomposition algorithm to solve the proposed model. To evaluate the convergence rate and solution quality, the method is applied to some random test instances generated in the literature. Te computational results indicate that the proposed method is capable of efciently solving the model.


Introduction
In facility location problem-a well-known discrete optimization-a company decides on plants and facilities to be opened from a given set, open plants to be assigned to each open facilities, and open facilities to be allocated to each customer to maximize proft or market capture.For decades, logistic network design problems that consider facility locations and product fow have been extensively studied.Recently, given the increasing pressures from environmental and social requirements, many manufacturers have adopted second-hand products and their recovery activities in the production process.Consequently, the design of reverse logistics networks have caused concern pertaining to not only economic aspects but also how such networks can afect other aspects of human life, such as environment and the sustainability of natural resources.Implementing reverse logistics operations requires setting up additional appropriate logistics infrastructure for the arising fows of used and recovered products.Physical location, plants, facilities, and transportation links need to be chosen to transfer a new product from manufacturers to end customers and convey used reverse products from customers to manufacturers or suppliers for recovery or safe disposal [1].
Furthermore, global concerns about sustainable issues have grown rapidly, leading all governmental and nongovernmental environmental ofcials and activists into trying to encourage companies to take into account sustainability and environmental issues in their strategic decision-making [2].Sustainable development is responsive to the needs of current generations without being able to meet future needs [3].As stated by Zhang and Li, [4], Mondal and Giri [5], and Zhang et al. [6], governments use the idea of encouragement or pressure to motivate companies into aiming for lower emissions through utilizing innovative technologies when manufacturing products.Such state intervention could be carried out via a federal agency.
For instance, Temur and Bolat [7] refer to two common regulatory approaches considered by the US Environmental Protection Agency.Te frst, standardization in the technology used in production processes, is a general rule for all manufacturers.Te second is an optional mode, with a performance-based mechanism in which the manufacturers can use the technology they want.Evidently, in addition to the two frameworks mentioned, the US Environmental Protection Agency has recently used subsidies or fnancial incentives for manufacturers to bring new technologies into action [1].Indeed, the government is increasing its environmental concerns by providing such fnancial incentives.Te idea, in other words, acts the opposite way that taxation does.Jung and Feng have investigated the government's subsidy design problem for using green technologies in an evolving industry and the subsidy's impact on the environment and social welfare [8].Tese ideas are not unique to a specifc country or continent.Tus, besides the United States, some Asian countries, such as South Korea, China, Taiwan, Philippines, and Tailand, subsidize using electric vehicles and solar panels in supply chains [9].Aldieri and Vinci, in [10], have provided an overview of pollution control incentives for developing countries.Moreover, Jia et al. [11] have also elaborated on the importance and potential impacts of pollution control incentives on the supply chain.In Zare et al. [12] and Zare et al. [13], there is a detailed explanation of new technologies, including environmentally friendly technologies that governments can support companies in using.
Te fundamentals of the present research are based upon the work carried out by Chalmardi and Camacho-Vallejo [1], and Mondal and Giri [5].Tey assume that the government subsidizes suppliers, who use green technologies to manufacture products.Chalmardi and Camacho-Vallejo [1] propose a bilevel model with the frst level representing government incentives, while the second represents the producer frms.End consumers are involved in the second level of the model.Tere are multiple signifcant diferences [1,5] in the present research.Te frst is how to consider government incentives.It is assumed that the government ofers subsidies to both competing frms at two decision levels.Additionally, the type of supply chain is taken into account; a closed-loop-supply chain (CLSC) includes all the components involved.Te third is the competition over obtaining market share by two rival companies, present in our study but not in previous works.Yet the third diference makes it possible to strike a balance among subsidies for clean technology and profts via customer attraction.Terefore, this paper examines a bilevel leader-follower model, which looks into government-ofered fnancial incentives to the CLSC managers at both levels.Te maximum objective capture should be used to incorporate location decisions into leader-follower decisions.Closed-Loop Supply Chain Management (CLSCM) motivates the actors involved to integrate cleaner technologies.
Te key contributions are highlighted as follows: (i) Proposing a new mathematical competitive bilevel model based in green sustainable CLSCM.(ii) Incorporating government-ofered fnancial incentives meant to encourage two competitive frms to use green technology in the proposed model.(iii) Presenting a hybrid model based on quantum binary PSO and Benders decomposition method (iv) Generalizing the example of new test examples based on similar cases in the subject literature to evaluating the algorithm

Literature Review
Based on the existing literature, most studies conducted in the feld of supply chain management (SCM) have dealt with multiobjective models, single-level, and supply chains without considering all layers involved.Besides, works related to competitive bilevel programming models that include sustainability issues have received comparatively less attention.Tree fundamental aspects should generally be considered in sustainable decisions: economy, environment, and community.Tat is to say that a sustainable company is one capable of doing business with the purposes of community, environmental, and economic well-being in its operations [14].Growing environmental concerns and their consequent pressures lead to the development and implementation of sustainable approaches during corporate decision-making.Terefore, much of recent research focuses on reducing pollution caused by SCM; such reductions can be made possible by a greater tendency toward cleaner production and technologies.As a whole, in SCM issues, measures are taken based on the location, quantity, and capacity of facilities, while also taking into account the fow of material between locations.In supply chain issues, with the main approach of sustainability (SSCM), the objective is commonly to optimize long-term economic proftability, environmental performance, and social aspects [15].Nowadays, in light of such aspects as environmental conditions, the management of the complete consumption cycle of a product is the responsibility of the primary service provider, meaning it is not possible to fully manage a supply chain from the point of view of environmental conditions without considering how to manage the products consumed.A CLSC manager is responsible for the production and product management both.Listes and Dekker [16] are pioneers in designing Closed-Loop Supply Chain Management, followed by too much efort.A mixed-integer (MIP) biobjective problem is proposed in [17].In this model, the objective functions maximize the responsiveness of the proposed CLSCM network while minimizing the total costs.Furthermore, a memetic algorithm is applied to solve the model.Keyvanshokooh et al. [18] have introduced an MIP model for the multiechelon, multiperiod, and multicommodity SCM, in which a dynamic pricing approach is used to solve the model.In [19], a mixed-integer nonlinear programming (MINLP) model is proposed to optimize crucial strategic and tactical decisions in the CLSCM network.A nondeterministic biobjective model is developed in [20].Te interactive fuzzy solution method is used in the newly introduced model.Te same model is addressed in [21,22].A comprehensive review of CLSCM has been done in [23], in which a robust model is proposed for tackling uncertainties in some cost data, supply parameters, or demand elements.Tey propose a more complex stochastic CLSCM for a multiechelon multiproduct model.Under such a scenario, the return rate is determined by a fnite (limited) set of feasible possibilities, with the demands holding unpredictable values.Te ability of the multiperiod, multiproduct, and multilevel supply chain model to handle the risks involved in a buildto-order strategy is confrmed by Keyvanshokooh et al. [24].Tey have proposed a proft maximization MILP CLSCM model by considering fexibility in satisfying demand and collected returns based on a specifc procedure.A hybrid uncertainty CLSCM is addressed in [25].Soleimani and Kannan [26] have studied the deterministic multilevel multiproduct periodic CLSCM for large-scale problems.A hybrid method based on genetic algorithm (GA) and particle swarm optimization (PSO) has been applied to their proposed model.Yadegari et al. [27] have considered three transportation modes and propose an integrated logistics network model.Te memetic algorithm is used to solve the presented model.
Te green supply chain is a slightly novel concept in CLSCM.Talaei et al. [28] have defned a problem that looks into environmental issues, such as reduced carbon dioxide emission rates.A green sustainable location-routing inventory model in CLSCM is presented in [29].In this model, economic, social, and environmental conditions as well as efects have been reviewed.A robust nonlinear programming model has been extended by [30] which works on price, advertisement, and uncertain demands, solved by a Lagrangian relaxation algorithm.
A review of the green reverse logistic and environmental aspects has been completed in the survey paper on reverse logistic models [31].More environmental factors in a CLSCM have been focused upon in a model by Giri et al. [32].A disruption risk using the transshipment strategy in CLSCM has been proposed in [33].
Asghari et al. [34] plan an operational and tactical framework to promote advertising programs in a direct-sales CLSCM taking into consideration diferent elasticity efects.In order to impact the proftability of manufacturers positively, they propose an optimization model for pricing similar products.Soleimani et al. have developed one of the most comprehensive mathematical models on the sustainability of CLSCM [35].In theirs, which involves three choices of remanufacturing, recycling, and disposing of the returned items, the distribution facilities act as warehouses and collection centers.
Te competition is an integral part of supply chain issues.Recently, the approach was partially included in the CLSCM literature.For instance, a competitive bilevel approach to CLSCM design model is presented by Rezapour et al. [36].A nondeterministic competitive CLSCM is designed in [37].It considers Stackelberg's competition between two supply chains based on their returns, profts, and demands.Such competition between manufacturers and retailers in the CLSCM was proposed in [38].Te purpose of this study is to examine the impact of manufacturer fairness concerns on CLSC decisions and profts in the green economy.Fathollahi-Fard et al. [39] address a static Stackelberg game between nurses, and patients within the HHCS framework by proposing a bilevel programming model [40] develops two BiLevel Stackelberg Models (BLSMs) under special conditions in the presence of strategic customers.Optimal production and order quantities and prices are determined by a sequential game at both levels.Fathollahi-Fard et al. [41] have developed a multilevel static Stackelberg game programming model to design the location allocation of the real case study of the CLSC.Inspired by the performed work, this paper aims to defne a competitive, sustainable CLSC model.In addition to environmental and sustainability issues, the competition problem between two companies is covered.Indeed, when a company intends to produce, distribute, or provide a particular service in the market, it cannot ignore other companies with similar activities.Tere is, therefore, defnite competition among these companies as diferent decision-makers.Such hierarchy distinguishes bilevel programming problems from those of biobjective.In competitive SCM problems, one of the main factors to yield a company's success in gaining market share is the correct location of centers.Te basic duty of competitive location models is to fnd the location at which the captured market share is maximized under the existing conditions.A Stackelberg game leaderfollower model is considered, proposing that location decisions should be incorporated in leader and follower decisions by the maximum objective capture in a green CLSC.In general, in the leader-follower model, a new competing facility known as the follower is expected to join the market at some point in the future, trying to maximize its market share by establishing its facility in the location.Te leader's location decision afects the competitor's location decision.On the other hand, it is expected that a future competitive entrance would impact the leader's choice of site in some way.Te leader's goal is to choose a site that will maximize its market share across the time horizon, including the market share that will be captured when the rival enters the market [42].Most Journal of Mathematics competitive issues lead to the appearance of bilevel or multilevel models.Tese problems are entirely diferent from single-level ones due mostly to structure and solution methods applied.
NP-hardness of bilevel programming problems (even in the "simplest case linear") has been established in [43] and was strongly proved in [44].Tus, solving such problems is challenging, which is why many algorithms have been proposed in the literature.Some, including [45], have provided global optimization algorithms by relaxing the inner issue in a convex manner before representing the corresponding result using required and sufcient optimality criteria [46] which suggests a global optimization method that uses bilevel nonlinear optimization to determine the fexibility test.Multiparametric programming, which can transform bilevel problem into a family of single-level optimization problems, was proposed in [47].Tus, the authors in [45] verify some new other methods; [48,49] and [50] particularly consider algorithms for bilevel linear programming.Te cuttingplane approach to approximate the lower-level feasible region has been proposed in [51]; hence, this algorithm is applicable when the upper-level problem includes continuous variables while the lower-level problem encompasses integer variables.On the other hand, linear programming duality was used for the opposite case [52], where the lower-level problem is linear, and the upperlevel problem is an integer.In [53], the authors have developed a basic enumeration scheme to identify feasible solutions in the discrete case.Mathieu, has in [54], proposed a two-level GA-based (GABBA) method to solve bilevel programming problems (BLPPs).Te same GAbased method is proposed by Hejazi et al. [55], as the results are put together for comparison with those of Gendreau et al. [56].A developed version of GA has been proposed by Oduguwa and Roy [57], an elitist algorithm created to encourage limited asymmetric collaboration between two players to solve diferent classes of BLPPs within a single framework.
An applicable Lagrangian-based algorithm [58] and the metaheuristic method using quantum binary particle swarm optimization-(QBPSO-) based method [59] were used to solve a particular class of bilevel problems.Considering the overall similarity of the model structure presented in [59] and the computational results of the PSO method in [59], the method can be implemented on P 1 .However, given the difculty in solving the generated subproblems in the PSO method, Benders heuristic decomposition algorithm is effciently used for some of the following problems.Terefore, one of the key goals of the present study is to demonstrate whether it is possible to modify and adapt the hybrid method of QBPSO and Benders decomposition to successfully solve the proposed BLPP.
Summarizing the fndings of the study, data presented in Table 1 indicate that there is no mathematical model for a competitive CLSC when government incentives are in place.
Te remainder of the present paper is organized to cover the following: Section 3, where the proposed problem and its related mathematical model are described.Section 4 explains a hybrid method.Section 5 incorporates the introduction of the sample test problem and computational results.

Problem Definition
Te considered CLSC to be designed needs fve echelons, including suppliers, plants, facilities, customers, and recycling centers.Te basic assumptions considered in the model formulation are as follows: (i) Unit material costs are the same for all suppliers in supply chain.(ii) Recycling centers and facilities buy returned products from customers with the diferent and variable price ofered by the two competing companies.(iii) All customer demands must be satisfed, and all returned products must be purchased.(iv) A single period and single product supply chain is taken into account.
In this system (Figure 1), the priority of distribution and planning is as follows: a raw material is delivered from the supplier to the plant, a primary product from the plant to the facility, a primary product from the facility to the customer, a reverse product from the customer to the facility, a usable reverse product from the facility to the plant, a useless reverse product from the facility to the recycling center, and ultimately a renewed material from the recycling center to the supplier.All purchased raw materials are made in the factories, and all reverted goods are frst sent back to the facilities for product recovery, where some are recovered and sent back to the factories as a main good.Other checked reverse products are then sent back from the facilities to the recycling centers, where they are renewed as useless raw materials.Afterwards, they are shipped back to the suppliers.Forward and recovered products, indicating that the renewed from the facilities are the same as the primary in terms of customer satisfaction, could meet the demand.
In supply chains, the environmental efect of all factors is measured against harmful greenhouse gas (GHG) emissions, such as CO 2 , CFCs, and NOx.However, the opened plants, facilities, and recycling centers have environmental efects, which are taken into account.In this regard, governments with their responsibilities for environmental protection, encourage supply chain managers to fght of ecological damage.Transporting products throughout the network, opening plants, facilities, and recycling centers, as well as manufacturing at plants and recycling at recycling centers, cause and contribute to these contaminants.Against such a backdrop, the present study looks into governmentgranted fnancial incentives to persuade frms into using cleaner technologies across plants and recycling centers.In order to promote sustainable networks, the governments 4 Journal of Mathematics  Journal of Mathematics make those ofers to supply chain managers.As an incentive, there is a limited budget available.Within such a competitive market, all players get to have a say, including the leader company, the follower company, and the customers.It is convenient to summarize this decision-making into the following process: In the frst stage, the leader company decides on the location of plants, facilities, and recycling centers with types of technologies (ofers to protect the environment), and fxes the prices for the reverse product.In the second, taking the leader company's decision into account, the follower company decides on the location of the plants, facilities, and recycling centers (with technologies) to be opened for new and reverse products.Government incentives often lead both leaders and followers to act in the direction of reducing environmental pollution risks.Finally, in the third stage, each consumer enjoys the option to meet his or her demands using the goods of both businesses, purchasing new items and selling old ones to generate revenue for both companies.Te decision-making problem for the leader company pertains to choices that yield the highest proft, with the knowledge that the follower company is in a push to maximize proft and attract customers.
As a rule of thumb, it is assumed that when the leader company makes decision, it does have precise information on the types of facilities the follower company could ofer, i.e., a detailed forecast is provided by the follower or for each decision from the leader which generated the best response.Moreover, it is assumed that both companies recognize the fact that the preferences of all customers are based on the distance, price, and the facilities' attractiveness (e.g., good services, spacious centers, restaurants, and accessible parking lots).Indeed, changes in those varying preferences could afect the companies' capture.

Attractiveness and Distance.
At any given time, customers do prefer to walk into facilities with higher attractiveness and shorter distance.

Price.
Customers or sellers usually prefer to sell their products to the highest bids.Tis preference, indeed, is related to the frms, rather than facilities; however, the previous preference has to do with the facilities.Let i, i ′ ∈ I be the index of potential facilities, with j ∈ J standing as the index for demand points; d ij is the distance from facility i to demand point j, σ 2 is a related distance function, qr L ij is the price at which the leader purchases the products from customers, and σ 3 is a related price function.Incorporating the attractiveness of facility to capture demand point, A i is the attractiveness of potential facility i, σ 1 is a related attractiveness function of facility i, and ρ L ij (ρ ′ L ij ) is the attractiveness of facility i for demand point j for new (reverse) products defned as Te rule could then be afected by the following linear order notation.Indeed, the preferences are determined by a linear order ≺ j on I; relation i ′ ≺ j i for any i, i ′ ∈ I means that if two facilities of i and i ′ are opened, customer j prefers facility i.Moreover, for all i, i ′ ∈ I, relation i ′ ≺ j i means that either i ′ ≺ j i or i ′ � i.It is assumed that σ 1 and σ 3 are linear functions, while σ 2 is a quadratic function.All other symbols used are given in Appendix A. Te mathematical model of the competitive green CLSC with government incentive is addressed in the following (refer to P 1 ):

Suppliers Production Centers Distribution Centers Customers
Collection/Inspection Centers Recycling Centers Disposal Centers 6 Journal of Mathematics Equation ( 1) presents the leader's objective function, with the frst statement showing the leader's market capture amount.Te second phrase demonstrates the cost of purchased returned products from the customers by facilities, while the third, fourth, and ffth terms represent fxed opening costs of plants, facilities, and recycling centers, respectively.Te sixth one deals with long-term contract costs with suppliers.Te transportation costs from suppliers to plants, plants to facilities, facilities to customers, and facilities to recycling centers, facilities to plants, and recycling centers to suppliers are exhibited in terms seven to twelve, respectively.Tirteen to sixteen are the manufacturing operating costs, as well as those of separating, recovering, and recycling.Te seventeenth term shows the profts of returning the reverse products from facilities to plants, and the eighteenth pertains to the achieved proft by selling reverse products to suppliers from recycling centers.Te nineteen concerns the costs of purchasing raw materials from suppliers by plants.Te next seven summations measure the environmental efect linked to the shipment of material from suppliers to plants, plants to facilities (or facilities to plants), facilities to customers, facilities to recycling centers, and recycling centers to suppliers beside the environmental efect related to the manufacturing of the products in plants and recycling the reverse products in the recycling centers, respectively.Te twenty-seventh and twenty-eighth phrases indicate the costs of contamination caused by manufacturing and remanufacturing of the products at plants and recycling centers, respectively.Finally, the last two terms represent the fnancial incentives obtained by the supply chain manager thanks to using the installed cleaner technologies at plants and recycling centers, respectively.
Constraint (2) ensures all raw materials shipped from suppliers, and reverse products from facilities to plants to be used during the manufacturing process.Constraint (3) guarantees the shipment of all manufactured products from plants to facilities.Constraint (4) shows that every single 8 Journal of Mathematics customer demand must be satisfed.Constraint (5) expresses the maximum limit of products purchased from suppliers by opened plants.Constraint (6) shows the limitation of the manufacturing capacity at an opened plant using a certain type of technology.Constraints ( 7) and (8) express the capacity limit of open facilities for products shipped from open plants and reverse products from customers, respectively.Constraints ( 9) and ( 10) guarantee a unique type of technology to be used in an opened plant and recycling center, respectively.Constraint (11) asserts that each customer can only select an installed facility.Inequality (12) means that only one open facility can be chosen based upon selection rule ≺ j on I. Constraints (13) stipulate that each customer is assigned to exactly one leader's facility at frst.Constraints ( 14) and ( 15) are the same as constraints ( 10) and ( 11) for reverse products.Constraint ( 16) guarantees all purchased reverse products to be shipped from facilities to plants and recycling centers.Constraints ( 17) and (18) show the reverse shipped products from facilities to plants and recycling centers, respectively.Te recycling capacity cap for a recycling facility that has been operational and is using a certain technology is shown in constraint (19).All reverse products must be transferred from facilities to recycling centers in order to be employed in the recycling process, according to constraint (20).Constraint (21) guarantees the shipment of all recycled products from recycling centers to suppliers.Constraints (22)(23)(24) state a maximum number of opened facilities, plants, and recycling centers, respectively.Constraint (25) is the budget at the government's disposal to provide fnancial incentives to the plants and recycling centers in exchange for their use of cleaner technologies.Constraint (26) indicates the type of variables linked to the follower's decision, which the followers' problem is stated as follows: max Journal of Mathematics Te maximum total utility of the follower is stated as function (27).Constraints ( 28)-( 52) can be considered in a fashion similar to the leader for the follower.Constraints ( 38)-( 41) guarantee the selection of an open facility by a customer according to the given rule; moreover, it shows that if a facility is opened by the leader, it cannot be opened by the follower.
Evidently, the above bilevel model involves the following hierarchy: where x and y are the upper-level and lower-level variables, respectively, c 1 x + d 1 y, c 2 x + d 2 y are upper-level and lowerlevel objective functions, and A 1 x + B 1 y ≤ g 1 and A 2 x + B 2 y ≤ g 2 are known as upper-level and lower-level constraints, respectively.In the section to follow, a hybrid metaheuristic algorithm is applied to solve the model.

Solution Method
To solve bilevel problems and their extensions, many algorithms have been discussed, including exact and heuristic algorithms.Te metaheuristic hybrid algorithm presented here functions better than other ones in the literature.It can also be easily implemented by varying software.To properly implement the algorithm, the structure of model, and its features need to be clearly understood.Tis model is expected to be characterized by the following properties: Such features pave the way for the QBPSO method to be used, details of which will be discussed further.
QBPSO is comprehensively applied to solve IP problems.Te key contribution of the present paper is the way it adapts the algorithm with the Benders decomposition algorithm to solve the efciency of the bilevel problem in question.An overview of both methods is given below and how to implement it will be explained later.

Preliminaries. Te particle swarm optimization (PSO)
method is a population-based algorithm initially proposed by Kennedy and Eberhart [60], in which the social behavior of a fock of birds is simulated, as the birds (particles) show the candidate solutions to the problem, fying via the search space to fnd the optimal solution.In each iteration, the particles move to the desired level by following their best solutions before the best global position is discovered by each particle in the swarm.
Should K be assumed as the dimension of the search space, then for the i th particle, the current position and velocity vectors are represented as X i � (x show the best position of the i th particle and g best � (g best 1 , ..., g best j , ..., g best K ) be the group's best position recorded so far.Te following equations will demonstrate how to update the velocity and position respectively of i th particle at t th iteration to t + 1 th iteration: where w is the inertia weight and c 1 and c 2 are acceleration coefcients representing the degree of belief in the particle's own experience and the whole swarm experience, respectively.Ten, r 1 and r 2 are random values within a range of [0, 1].[60] have proposed the binary version of PSO (BPSO).In theirs, each particle is represented by a string of 0 and 1. Particle velocity is related to the probability with which a slight fip can occur.

Binary PSO (BPSO). Kennedy and Eberhart
Remarkably, the updating equation for velocity remains unchanged while the equation of updating position changes as shown in the following: where u j i (t) is a random number belonging to [0,1], and S(v j i (t)) is the velocity value sigmoid function (1/(1 + exp(−v j i (t)))) that specifes the probability of j th bit of the i th particle, i.e., x j i (t) being 0 or 1 at the t th iteration.

Journal of Mathematics 11
4.1.2.Quantum Binary PSO (QBPSO).An example of a successful theory is QBPSO proposed by Mirhassani et al. as a "mechanism of biological evolution" and "simulation of things [59].One of the major drawbacks of BPSO is being unintelligent S(v j i ), making it unable to lead the particles into the promising region of the search space.Furthermore, parameter selection is difcult.QBPSO, by contrast, overcomes such challenges.In QBPSO, instead of a sigmoid function, a population of quantum particle vector Q(t), which is based on the quantum bit, is used.Remarkably, the smallest unit that carries information is known as a quantum bit or "qubit," which can be either "1" or "0" or in any extraordinary position.Hence, the quantum particle population is expressed as where Q i (t) � [q 1 i (t), ..., q j i (t), ..., q K i (t)] and 0 ≤ q j i (t) ≤ 1, i � 1, 2, ..., N, j � 1, 2, ..., K represent the probability of the j th bit for the i th particle to take zero value at the t th iteration.
In QBPSO, a quantum particle vector Q i (t) � [q 1 i (t), ..., q j i (t), ..., q K i (t)] updates the position of particles by the following rule: Tus, the following rules are used to help update the quantum particle vector: in which α, β are control parameters satisfed in relations α + β � 1, 0 ≤ α, β ≤ 1.Additionally, these parameters are tuned to control the degree of Q i .w, c 1 , c 2 are PSO parameters and satisfed in relations Te parameters represent the degree of belief in oneself, local maximum, and global maximum, respectively.Te general framework of QBPSO for the proposed bilevel model is described as follows.

Benders Decomposition.
To verify the Benders decomposition, consider the following problem: Te above problem is decomposed in the following Benders subproblem (SP): where u is the dual value corresponding to constraint x ≥ 0. Based on the dual SP mentioned above, the Master Benders (MP) problem is where Ρ and Β are subsets of extreme points and extreme rays of ℧, respectively, and ℧ is defned by constraint uA ≤ c 1 .Benders showed that this algorithm below fnds an optimal solution after a fnite number of steps, or proves that none exists Algorithm 1.
In the given algorithm, UB stands for the upper bound and LB stands for the lower bound.Te constant ε represents the user-defned optimality gap.

Benders Decomposition with Pareto-Optimal.
In this section, we will try to show how to create the strongest possible cut.
A cut (b − By)u ≤ α corresponding to u 1 ⊆℧, dominates that corresponding to u 2 ⊆℧, if We say that cut is Pareto-optimal cut if it is not dominated by any other cut.Also, the point u corresponding to this intersection is called Pareto-optimal.Now, let ri(S) and S c be the relative interior and the convex hull of the set S ⊆ R k , then every point y ∈ ri(Y c ) is called the principal or core point of Y. Te following theorems help us to create a Pareto-optimal cut.Theorem 1.Let y 0 ∈ ri(Y c ) and  u be the optimal solution of the dual SP, then the optimal solution u 0 of the following problem is Pareto-optimal.
Proof.Conversely, suppose that u 0 is not Pareto-optimal; that is, there is a u 1 , such that ∀y ∈ Y: And so, y �  y that u 1 must be the optimal solution of the Z(u), i.e., However, this inequality contradicts the hypothesis and shows that our assumption that u 0 is not Pareto-optimal was wrong.Tis completes the argument.

□ Theorem 2. With the same hypothesis as in of Teorem 1, the following problem is also Pareto-optimal.
Proof.Te proof is almost identical to Teorem 1 proof.
Te above theorems help the Benders master problem make progress towards the best solution from the frst iteration.Tis is shown by the following steps Algorithm 2: □ 4.2.QBPSO for Solving the Proposed BLPP.Te QBPSO algorithm [61] is adopted to solve mixed-integer linear BLPPs, where calculating upper and lower levels iteratively makes it possible to get closer to the optimal solution step by step.Representation of each solution is done through location variables, and other variables can be determined from these binary variables by solving some of the following subproblems.Terefore, X and Y are used for the upper and lower level location variables, respectively (Algorithm 3).
Te pseudocode of the algorithm is given by Figure 2 in Appendix B. Based on properties (a) and (b), the follower's optimal solution always exists for each fxed X r .Furthermore, feasible solutions are generated for the leader in the initialization phase, a feature which is preserved during the algorithm implementation.) be the optimal solution to the follower problem.Ten, the leader problem is solved by fxing X r and optimal values of the lower level (regarding X r ) in the upper-level problem, while the ftness value of the current position of the r th particle (X r , Y * r ) is calculated using F(X r , Y * r ), where F is the leader's objective function.Consider the personal best position of the r th particle Pbest r � (Xbest r , Ybest r ) equal to (X r , Y * r ) and set the global best position Gbest � (Xbest, Ybest).

Benders Decomposition
Step 2: (Updating X r ) Change Q r and update X r according to the QBPSO algorithm.
Step 3: (Obtaining the follower's reaction and determining the particle's ftness) by fxing new X r in the lower-level problem, obtain the follower's optimal solution Y * r .Afterwards, in the upper-level problem, the position and ftness value of F(X r , Y * r ) is calculated by fxing (X r , Y * r ).
Step 4: (Updating the personal best position) If the ftness value of (X r , Y * r ) is better than that of Pbest i � (Xbest i , Ybest i ), update Pbest i .
Step 5: (Updating the global best position) after comparing the ftness value of Gbest with that of all personal best positions, Gbest is updated with the global best position Gbest � (Xgbest, Ygbest).
Step 6: (Termination) Go to step 2 until the stopping criterion is met.ALGORITHM 3: Adopted QBPSO algorithm.14 Journal of Mathematics

Benders Subproblem. For a given variable
, the Benders subproblem SP(1) is written as follows:

Journal of Mathematics
Given the structure of SP(1), the model certainly serves as a feasible solution (since SP(1) is a minimization problem, and the included variables are bounded from below).Te SP(1) subproblem can be solved and broken down into a linear subproblem based on two sets of variables Assuming that θ 1 , ..., θ 17 are dual variables of constraints 27, 28, 29, 30, 31, 32, 33, 36, 37, 39, 40, 41, 42, 43, 44, 45, and 46, respectively, then the following model is the dual of the Bands subproblem shown in SPD(θ |  e,  y,  x,  x ′ ,  z): Te master problem is now transformed into a problem with binary variables e F s , y F pt , x F ij , x ′F ij , z F kt ′ and a continuous variable π.Using double values obtained from the Benders subproblem, a generated cut ( 51) is incorporated into the master problem in each iteration.As the Benders subproblem is always bounded and feasible, the dual subproblem is similarly feasible and bounded as well based on the strong dual theorem.Terefore, the added constraint in each iteration is an optimal cut constraint and merely eliminates the nonoptimal ones.In order to examine the constraints added to the MP model based on extreme points D(SPD), the branch-and-cut idea is proposed.Te MP problem is solved before solving the subproblem to reach the optimal one in the classical implementation of the Benders method, meaning that in the MP problem, similar solutions are unnecessarily remet.Terefore, the problem is solved in a branch and boundary tree as [62].In each potential solution of the search tree, the subproblem is solved, and the optimal generated cut is added if needed.

Pareto-Optimal Cut Generation.
Reducing the number of iterations is of signifcant value in speeding up the implementation of the Benders method.In this section, a cutting Pareto-optimum solution method, introduced by Magnanti and Wang [63], is presented.Let Z be a set of feasible solutions to the master problem, MP, and opt(P) be the optimal solution to a problem P, then defne the function (71) Magnanti and Wang introduced the concept of dominant cut within the Benders method.Te cut generated by the dual solution ) is said to be dominant over the cut generated by the dual solution (  θ 1 , ...,  θ 17 ) if and only if Furthermore, inequality, at least at one point, is strict.Tis is called an optimal-Pareto cut if no more dominant cut is available.Additionally, inequality, at least at one point, is strict.Tis is called an optimal-Pareto cut if no more dominant cut is available.To obtain the inferred optimal-Pareto cut, solving the following scheduling problem is needed: where the polyhedron ∆ is generated from the solution space of dual problem SPD; also, (e ) is an interior point of the convex hull of Z, named a central point.
For the frst value of the central point, one can employ a heuristic method, leading in turn to a feasible solution.Besides, for the iterations to come, the mean of the certain current point and the active solution to the MP problem is used.Finally, the following steps are taken to perform the Benders technique: Step 1: Consider the start point Step 2: Solve the MW model based on ), and generate the optimal-Pareto cut.
Step 3: Add the Pareto cut produced in step 2 to the MW, before solving it with the newly added cut.
Step 4: If the diference in the solution of the main Benders model in two consecutive iterations (current and previous) is below a given value ε or the number of iterations is equal to or greater than a defned boundary, then the algorithm ends.Otherwise, go to the second stage.
Step 5: Te last optimal solution of the subproblem determines the value of continuous variables, and it is the last solution to the MW which determines the value of integer variables.Te fowchart (Figure 3) of the proposed method is given below.Also, the pseudocode of the algorithms is given by Figure 4 in Appendix B.

Computational Results
Tis section presents the computational experiments used to assess the performance of the proposed hybrid algorithm in solving the model, while also ofering some sensitive analyses.To evaluate the efciency of the method in terms of convergence rate and solution quality, the method is applied to random-test instances generated to assess the algorithms' performance.Computational experimentation is conducted over a set of 30 instances with varying scales.
Tese instances are generated based on the real-case instances presented in [2,64].Te size of each instance is reported in Table 2. Te instances are denoted as Ins L (S, P, T, T ′ , I, J, K), in which L accounts for the instance number, and S, P, T, T ′ , I, J, and K represent the number of suppliers, plants, producing technology, recycling 18 Journal of Mathematics technology, distribution centers, and customers, as well as the number of recycling centers, respectively.

Parameter Setting. One of the most practical techniques
to determine the values of parameters in metaheuristic algorithms for the purpose of retaining optimum performance is the Taguchi method.Hence, to reach the or nearoptimal solution in this step, the efect of three parameters c 1 , c 2 and α on the QBPSO efciency and capability is provided.Remarkably enough, based on the relations α + β � 1 and w + c 1 + c 2 � 1, setting the values of parameters c 1 , c 2 , and α sufces.For these factors, fve levels are considered in Table 3.As such, the Taguchi L 25 orthogonal array is selected (due to the number of parameters and their selected levels).Terefore, 25 experiments should be performed using a combination of levels for each parameter based on L 25 .
Te medium size of the instance (5, 10, 10, 10, 12, 30, 10) is used to calibrate the parameters of the proposed hybrid-based heuristic algorithm.Twenty-fve particles are regarded as the population size, and the maximum iteration number of 100 is used for the stopping criterion.For each set of parameters, ten runs of algorithm are considered.Te average value of proft for the leader as the signal-to-noise (S/N) ratio is performed with the bestknown value (S/N) being used when it comes to more effcient performance.
Consequently, to identify the best optimal combination of the levels of parameters, the mean value of S/N is computed for each single level.Te results are presented in Figures 5-7.   4 reports the analysis of variance to determine the most efective parameters in QBPSO performance.Te column percentage contribution shows the signifcance of parameter efects on the PSO-based method performance.It can be clearly deduced that α is an important and unique parameter in this method, with a 98.3% share in yield.

Journal of Mathematics
Finally, based on the above analysis, a proposed hybrid algorithm with an optimal combination of parameters is used to solve the problem being studied.

Evaluation of the Proposed Algorithm.
All generated instances are carried out by the proposed hybrid algorithm on a PC Pentium Intel Core i5 with 4 GB of RAM.Te algorithms are coded in AIMMS 3.12 optimization software with solver CPLEX12.For the most efcient performance of the hybrid method, diferent swarm sizes are brought in from N � 10 to 100 to be allocated to each group of instances.Te most efcient performance for the algorithm (based on the highest objective function quality) is within 20 to 55.Since the samples are randomly generated, the optimal solution remains unknown.Tus, the bilevel parametric optimization toolbox (B-POP) [65], Lagrangean relaxationbased method [58], and hybrid method based on the genetic algorithm [58] are applied to show the efciency of the proposed algorithm.Moreover, for each data test, 10 independent runs are performed due to the randomness involved in the hybrid algorithm, from which the best and average Gaps of the leader's objective function are selected and presented.For readability, a relative percentile Gap is applied as the following formula: where F H , F LR , F G denotes the best values of the solution yielded by the proposed hybrid, Lagrangean relaxation (upper bound), and genetic-based algorithms, respectively, for the leader's proft.F denotes the best lower bound value solution achieved by the B-POP toolbox.For the termination condition, the maximum CPU time allowed (in seconds) is used.Te results obtained for small and medium scales are given in Tables 5 and 6.Te frst column of the tables demonstrates the swarm size N (also applied to a number of the genes in the genetic-based method).Te third and fourth columns show the best and average gaps between the proposed method and the B-POP toolbox, respectively.Te ffth and sixth columns illustrate the best and average gaps between the Lagrangean relation algorithm and the B-POP toolbox, respectively.Te seventh and eighth columns show the best and average gaps between the genetic-based method and the B-POP toolbox, respectively.Te maximum CPU time allowed (rounded in seconds) for termination of the algorithms is reported in the next adjacent columns.Te time is raised to 7200 sec to catch up with the growing size of the problem.
For small-size tests, the B-POP is able to fnd the optimal solution in a reasonable time span.Tis, however, is not possible for medium-sized tests.Terefore, the most workable solution is reported with a time limit of 7200 seconds.
Based on Table 5, the proposed hybrid method can obtain the optimal solution for all small-size cases, except for Ins7 and 10, and the best optimality gap ranges are between 0 and 0.5, proof of the efciency of the proposed algorithm for an optimal solution for those instances.
For the same medium cases in particular, e.g., Ins17 to Ins20, the optimal solution is not available by the B-POP within the given time constraints.In most of these cases, the proposed algorithm either introduces a better solution or at least fnds a solution equal to that of the feasible bound provided by the B-POP.Tis is evidently confrmed by nonpositive gaps.However, the gap between the solution obtained by the B-POP and the objective value of the hybrid solution remains below 2%.
Based on a general analogy between the proposed method and Lagrangian and genetic-based methods, the QBPSO-Benders works relatively faster, and the geneticbased algorithm is speedier than the QBPSO-Benders.Hence, the solutions obtained from the proposed method work better than those from the other two.Generally, the time required increases as expected but is still reasonable for this category of problems.For large-scale tests, the B-POP is incapable of fnding any feasible solutions in a reasonable time of 7200 seconds.Terefore, only a comparison (Table 7) is made between the three methods.Te average gap (Gap � (F H − F LR (F G ))/F LR (F G ) × 100) between these three methods and the execution time of the methods have also been incorporated and reported.
Te results in Table 7 underline the superiority and quality of the solution obtained from QBPSO-based method.In all instances, the proposed algorithm stops earlier than the specifed time, while the Lagrangian-based method fails to yield better solutions despite spending longer time.Te genetic algorithm, however, comes to an end highly quickly.Tis is explained by the fact that the suggested solution employs the Benders decomposition methodology, which takes longer to execute.Instead, the outcome is a superior, more intelligent answer.Figure 8 compares the time solutions of the methods.
In this certain instance, 443 seconds are totally needed to obtain an upper-bound value of 1.83 × 10 − 8 by the Lagrangian-based algorithm, and in 187 seconds, the genetic-based algorithm is stopped with the objective value 1.63 × 10 − 8 , while the proposed QBPSO-based algorithm is able to obtain 1.7 × 10 − 8 in 202 seconds, a manifest indication that the QBPSO-based algorithm is a relatively fast, reliable, and efcient method.

Sensitive Analysis.
Let us now examine the efects of fnancial incentives in an example, for which Ins10 is taken.Te instance is solved by the hybrid algorithm, with the results showing that the leader uses technology 4 in plant 2 and technology 3 at recycling center 2 for the purpose of securing fnancial incentives.Te related environmental impact and fnancial incentives of the technologies in the plant and recycling centers are given in Tables 8-11, respectively: Although opening and manufacturing costs remain too high, plant 2 with technology 4 and recycling center 2 with technology 3 (which the relevant values highlighted in Tables   Journal of Mathematics 8-11) bring along less environmental impacts.Tese cleaner technologies stand a higher chance of receiving government incentives.Now, to show the efectiveness of fnancial incentives, the leader's advantages and the follower's disadvantages are quantifed when the fnancial incentives are fxed for the leader.To this end, it is demonstrated how the benefts to the leader and follower vary if the fnancial incentives to the leader are halved, i.e., λ pt /2 and η kt ′ /2, respectively.Furthermore, Table 12 reported the percentages of follower losses of market share stemming from the changes in the fnancial incentives.
Table 12 shows that followers bag more market share when government incentives to the leader are lowered.In fact, with the reduction of fnancial incentives, the leader, as the frst decision-maker, will enjoy less power in choosing the facility.Rather than incentives, it is the costs that remain the key deciding factor in choosing facilities with technologies.Terefore, some plants, facilities, and technologies which are cost efective with government incentives, may no longer stay so after incentives are slashed.
Yet the same conclusion cannot be stated for incentives HC i and LT i , as in the test instances, the efect of HC i proves greater than LT i .

Conclusions
Te present study models bilevel programming on a competitive, CLSC network design problem.In the proposed model, the environmental impacts brought about by establishing or opening plants, recycling centers, and facilities, as well as the emissions for which product manufacturing is responsible, recycling (in diferent centers), and transporting, are all looked into.Assume that two rival companies produce a commodity simultaneously and they decide to open plants, recycling centers, and facilities.Te best open facilities will satisfy customer demand based on each customer's huf rule preferences.It is presumed that with its fnancial incentives, the government aims to encourage competitive frms to bring in cleaner technologies.Tis new model, which represents one of the main innovations of the article, can be used in many areas that beneft from government incentives.Another innovation of the article refers to the problem solution method, which is a heuristic method.
Since bilevel problems are inherently Np-hard, to solve the proposed model, a metaheuristic algorithm based on the quantum PSO method is adopted, for which several singlelevel subproblems, some still NP-hard, are solved.Based on the structure of these subproblems, a Benders decomposition method is used, while a new rule for cut generation is implemented at Benders decomposition steps.
To evaluate the algorithm, we try to generalize the example of new test examples based on similar cases in the subject literature.So, thirty instances with varying sizes are considered.
Computational experiments demonstrate that the proposed method is a fast and efcient algorithm to help solve instances when the appropriate parameters are employed.Te results of the computational experiments show that fnancial incentives positively afect the use of cleaner technologies in the design of supply chain networks, mitigating environmental damage.
Terefore, the proposed model can be used as a fresh approach to CLSC sustainability, promising signifcant contribution to tackling environmental implications.
As a novel and innovative strategy, fnancial incentives can be variable.A three-level model is needed where the government considers fnancial incentives at the frst level, and delivers them to the competing companies, at the other two.Moreover, uncertainty could arise with regard to some parameters, such as the demand of retailers or the amount of fnancial incentives required to deploy technologies in a factory, leading to a more serious issue.Lastly, the Lagrangian method or even the branch and bound method can be brought forward to implement metaheuristic methods, such as the PSO.
λ pt Te fnancial incentives in the case of opening plant p with technology t c Te percentage at which the reverse products can be recovered at a facility for shipment to the plant pr L ij (pr F ij ) Te cost of purchasing the reverse products from customer j with facility i by the leader (follower) frm pc L ip (pc F ip ) Te price realized by the leader (follower) frm for selling reverse products by facility i to plant l cz ip Te cost of conveying reverse products from facility i to plant p cz ik Te cost of conveying reverse products from facility i to recycling center k Ecz ik Environmental

B. The Pseudo-Code of QBPSO and Benders Decomposition Algorithms
Pseudocode of the algorithms given in Figures 2 and 4.
(a) Te constraints of the upper-level problem do not have any lower-level variables.(b) Any feasible solution to the upper-level problem leads to a feasible solution to the lower-level one.(c) Te lower-level problem includes only the binary location variables of the upper-level problem.

Figure 9 :
Figure 9: Convergence characteristics of the hybrid method.

Table 1 :
A comprehensive review of the sustainable CLSC literature.
Lr kt ′ ) for any particle; next, in the lower level, the follower's problem is solved by fxing X r .
Step1: (Initialization) 1.1.Set the value parameters of the QBPSO algorithm and population sizes N, Q i .1.2.By equation x j i (t), randomly generate the initialized value of the leader's decision variables X r � (e Lr s , y Lr pt , x Lr i , z * (also (x Fr ij , x ′ Fr ij , z Fr kt ′ , u Fr sp , g Fr pt , g ′Fr kt , v Fr pi , qr Fr ij , o Fr ks , w Fr ip , w ′Fr ik ) * z) can be used to generate the feasible cut constraint (which is the optimal cut) in the master problem.4.3.2.Benders Master Subproblem.D(SPD)set can be viewed as the extreme point of the dual problem SPD(θ 1 , ..., θ17|  e,  y,  x,  x ′ ,  z).Te Benders master problem (MP) could, as a result, be written in the following form: θ 1 , ..., θ 17   ∈∆,

Table 3 :
Te levels of parameters.

Table 4 :
Te analysis of variance.

Table 5 :
Results for the small size.

Table 6 :
Results for the medium size.

Table 8 :
Parameter Ecy pt related to technologies.

Table 12 :
Te follower's market share loss due to halved fnancial incentives.
impact linked to conveying reverse products from facility i to recycling center k CEz kt ′ Te fxed operating cost of technology t ′ at recycling center k cz kt ′ Te fxed operating cost of recycling center k with technology t ′ (cz kt ′ � F k + CEz kt ′ ) Ecz kt ′ Environmental impact related to technology t ′ at recycling center k η kt ′ Te fnancial incentives in the case of opening recycling center k with technology t ′ cc kt ′ Te per-unit recovering cost of reverse products at recycling center k with technology t ′ p ′ c L ks (p ′ c F ks ) Te price realized by the leader (follower) frm for selling reverse products to supplier s by recycling center k qr L ij (qr F ij ) Te per-unit price realized by the leader for purchasing reverse products from customer j at facility i by the leader (follower) cs ks Te cost of conveying reverse products from recycling center k to supplier s Ecs ks Environmental impact related to the distribution of reverse products from recycling center k to supplier s EM pt Te per-unit cost of emission caused by manufacturing at plant p with technology t EM kt ′ Te per-unit cost of emission caused by remanufacturing at recycling center k with technology t ′ Cap s Te capacity of supplier s to handle primary products Cap pt Te capacity of plant p with technology t to handle primary products Ca i Te capacity of facility i to handle primary products Cad dt ′ Te capacity of recycling center d with technology t ′ to handle primary products cbu sp Te cost of purchasing each unit from supplier s by plant p B L (B F ) Te maximum fnancial incentives realized by the government for the leader (follower) A i Te attractiveness parameter of facility i d ij Te distance of facility i to customer j X max (Y max , Z max ) F s ) 1 if supplier s is selected for ofering the services by the leader (follower); otherwise, 0 u L sp (u F sp ) Quantity of the forward products shipped from supplier s to plant p by the leader (follower) frm g L pt (g F pt ) Amount of products manufactured at plant p with technology t by the leader (follower) g ′L kt (g ′F kt ) Amount of products recycled at recycling center k with technology t ′ by the leader (follower) y L pt (y F pt ) 1 if plant p is used with technology t by the leader (follower); otherwise, 0 v L pi (v F pi ) Quantity of the forward products shipped from plant p to facility i by the leader (follower) x L i (x F i ) 1 if the location i is open by the leader (follower) as a facility; otherwise, 0 x L ij (x F ij ) 1 if customer j chooses facility i used by the leader (follower); otherwise, 0 x ′L ij (x ′F ij ) 1 if customer j chooses facility i (for reverse products) used by the leader (follower); otherwise, 0 z L kt ′ (z F kt ′ ) 1 if recycling center k is used with technology t ′ by the leader (follower); otherwise, 0 o L ks (o F ks ) Quantity of the reverse products shipped from recycling center k to supplier s by the leader (follower) w L ip (w F ip ) Quantity of the reverse products shipped from facility i to plant p by the leader (follower) w L ij (w F ij ) Quantity of the reverse products shipped from customer j to plant i by the leader (follower) w ′L ik (w ′F ik ) Quantity of the reverse products shipped from facility i to recycling center k by the leader (follower)