A Dynamic Programming-Based Sustainable Inventory-Allocation Planning Problem with Carbon Emissions and Defective Item Disposal under a Fuzzy Random Environment

There is growing concern that business enterprises focus primarily on their economic activities and ignore the impact of these activities on the environment and society. This paper investigates a novel sustainable inventory-allocation planning model with carbon emissions and defective item disposal over multiple periods under a fuzzy random environment. A carbon credit price and a carbon cap are introduced to demonstrate the effect of carbon emissions costs on the inventory-allocation network costs. The percentage of poor quality products from manufacturers that need to be rejected is assumed to be fuzzy random. Because of the complexity of the model, a dynamic programming-based particle swarm optimization with multiple social learning structures (DP-based GLNPSO) and a fuzzy random simulation are proposed to solve the model. A case is then given to demonstrate the efficiency and effectiveness of the proposed model and the DP-based GLNPSO algorithm. The results show that total costs across the inventory-allocation network varied with changes in the carbon cap and that carbon emissions reductions could be utilized to gain greater profits.


Introduction
The need for environmental awareness has affected several aspects of the global economy, including supply chain management. Traditionally, supply chain network design problems have tended to be analyzed from a fixed and variable cost perspective without any consideration of the carbon footprint [1,2]. However, the focus has now shifted toward more environmentally conscious supply chain planning optimization models in which economic aspects (profit maximization and cost minimization) are integrated with clear environmental goals such as carbon footprint reductions [3][4][5]. There has been increasing research interest in sustainable supply chain network design, with most studies suggesting that environmental sustainability be viewed as an opportunity rather than a risk [6][7][8]. Recently, many companies have realized that sustainability is a bottom-line requirement that can no longer be ignored. Despite all these studies, there is still an urgent need to develop quantitative models that address these sustainability issues.
Current global efforts to minimize environmental impacts have encouraged companies to change their practices to increase efficiency and reduce negative externalities [9], which has led to a greater focus on sustainable practices such as recycling and waste management [10,11]. Shaw et al. [12] designed a sustainable location-allocation model that considered the consumers' environmental behavior, which affected consumer demand for low carbon emissions products. Torabi et al. [13] proposed a generic model for a sustainable wine manufacturer-distribution network that encompassed economic, environmental, and social objectives. Diabat and Al-Salem [14] developed a nonlinear mixed integer program that minimized the cost of a stochastic inventory-allocation network and included a carbon emissions cost term to account for environmental concerns. They introduced the concept of an emission cap, under which the company pays for the amount of carbon emissions that exceeds the cap. Sustainable supply chains can be achieved by developing a supply chain that either incorporates environmental concerns or incorporates reverse logistics such as recycling. The most notable international framework for minimizing greenhouse gas (GHG) emissions was the Kyoto Protocol, an international agreement adopted under the United Nations Framework Convention on Climate Change, in which emissions trading schemes and a carbon credit market were outlined so that countries that had not exceeded their nominated carbon emissions targets could sell the excess to other countries, thereby giving GHGs the status of an international commodity [15].
The uncertain competitive environment means that inventory-allocation management needs to be more flexible and efficient, as enterprises must not only reduce their storage and distribution costs but also ensure that the downstream retailers are not unduly affected by out-of-stock items at a critical time. Without proper inventory control, a retailer's loss can directly affect interests across the whole supply chain; therefore, supply chain inventory and distribution management has become an important element of supply chain efficiency in the past few years [16]. Time has also become a very important factor when managing products, especially when a number of time periods are involved. Pal et al. [17] proposed a model to determine order quantities between suppliers at the initial stage and the optimal inventory levels over multiple periods for all stages in the inventory-allocation network. Radhi and Zhang [18] extended multiobjective nonlinear mixed integer models for multiperiod allocation planning problems that involved multiple suppliers and multiple products. In addition, because production systems are not perfect, defective products appear randomly, and their occurrence follows a probability distribution. Kennedy and Eberhart [19] considered a single-vendor, single-buyer inventory model that considered the impact of varying percentages for defective goods, storage costs, and disposal schemes. There has also been significant research interest in different aspects of imperfect production inventory models [20]. However, as most of these studies have tended to focus on economic order quantities [21] or economic production quantities [22], defective item disposal has not been applied across the whole inventory-allocation network. Further, in most studies, the defective item rate has been assumed to be a constant [23], which does not accurately reflect reality. Product defect rates are characterized by both fuzzy uncertainty and randomness, the so-called twofold uncertainty. Therefore, an inventory-allocation management dynamic programming model with a fuzzy random defect rate and fuzzy annual demand is proposed in this paper.
In recent years, higher levels of uncertainty within inventory-allocation management have been shown to be extremely costly for manufacturers and the total supply chain [24,25]; therefore, inventory-allocation models that can reduce or eliminate uncertainty to avoid incorrect and costly decisions are needed. As a general theoretical framework to model practical problems with unknown parameters, uncertain random programming was introduced by Lin [26] and was then extended to uncertain random multiobjective programming [27] and uncertain random multilevel programming [28]. More recent studies on the application of fuzzy set theory to inventory-allocation problems can be found in [29,30].
In this paper, an inventory-allocation planning model with carbon emissions and defective item disposal under a fuzzy random environment is considered, with annual demand, transportation costs, inventory conversion factors, and product defect percentages being fuzzy random variables. Of the many heuristic and metaheuristic algorithms, the global best, local best, and near neighbor best particle swarm optimization (GLNPSO) [31] has been proven to be a powerful competitor in the field of nondeterministic polynomial-time (NP)-hard optimization problems. Because of the relationships between the state equation, the constraint conditions, and the objective functions, a dynamic programming-based GLNPSO (DP-based GLNPSO) algorithm was developed [32,33], which reduced the particle dimensions using the state equation. In this paper, a DP-based GLNPSO algorithm is developed to solve the research problem model, in which initialization and adjustment methods are developed to avoid infeasible solutions.
The main contributions of this paper are as follows. A sustainable inventory-allocation model with carbon emissions and defective item disposal is developed, for which several constraints are considered to make the model more applicable to reality. Then, a modified version of the particle swarm optimization algorithm called the DP-based GLNPSO is constructed to solve the developed model. Finally, a representative example is applied to tune the parameters of the DP-based GLNPSO.
The remainder of this paper is organized as follows. The problem statement for the inventory-allocation planning model with carbon emissions and defective item disposal (IAPCEDID) under an uncertain environment is introduced in Section 2. In Section 3, the suggested model and its formulations are described. Section 4 describes the development of the DP-based GLNPSO to solve the IAPCEDID, and the efficiency of the proposed model is illustrated by a representative example in Section 5. Finally, in Section 6, the conclusions and limitations are discussed and future research directions elaborated.

Key Problem Statement
In the supply chain, inventory-allocation management with effective quality and carbon emissions controls is essential for an efficient manufacturer-retailer network. As retailers order products from different manufacturers at specified time periods, the planning horizon is a multiple stage problem, with replenishment taking place at the beginning of each of these stages [34,35]. With government regulations on carbon emissions (a carbon cap), transport needs to keep carbon emissions below a certain level. This sustainable manufacturer-retailer network is based on the allocation of carbon units in line with established carbon emissions reduction targets. At the end of each period, the emissions values of the company are verified, and each emitter must then offset its carbon emissions against the target established by the government. The discrepancy between the imposed target and the actual emissions may be offset by the company purchasing carbon units in the domestic market [36]. Alternatively, for each ton of CO2 emissions avoided, the company receives a carbon emissions certificate that can be sold on the futures market.
In many inventory-allocation problems, all products are deemed to be of suitable quality; however, in the real world, there is a probability that some items will be defective, and the percentage of defective items is uncertain. Items are classified as being of suitable quality or as being defective, with all defective items found during the screening process being returned to the manufacturer. For the sake of convenience, the manufacturers take back the defective items as a batch in the next shipment [28,37,38]. However, if there are defective items, shortages may be difficult to avoid. Therefore, a penalty cost is considered to reduce the losses caused by possible shortages. Because of uncertain constraints, manufacturers cannot produce more than a specified quantity and must also provide the products to the retailers under all-unit and incremental quantity discount policies [39]. From the above, a sustainable inventory-allocation planning model with carbon emissions and defective item disposal (IAPCEDID) is considered based on dynamic programming under a fuzzy random environment [40,41]. The flow of items in the proposed supply chain network is shown in Figure 1.
The proposed IAPCEDID problem can be described as follows. There are  manufacturers,  warehouses, and  retailers. The manager purchases the required items from specific manufacturers at the beginning of each stage. On receipt from the manufacturer, items are classified as suitable or defective. Defective items are returned to the manufacturer, while suitable items are transported to the corresponding warehouses and allocated according to retailer demands.
The following assumptions are made in this study:
(1) Item demand, transportation costs, and the percentage of defective items in each stage are regarded as fuzzy random variables.
(2) The span of each stage is identical.
(3) Shortages are allowed and a penalty cost is applied to reduce losses because of shortages.
(4) The manufacturer is liable for the costs incurred for returned defective items.
(5) Every item type has a corresponding warehouse with a maximum storage capacity. The items are first transported from the manufacturers to the warehouses and then from the warehouses to the retailer stores.
(6) The order lead time is negligible. At the beginning of each stage, all purchased items arrive at the corresponding warehouses.
(7) The retailer product demands are independent of one another and are fixed in a stage.

Modeling
In this section, a dynamic programming model for the IAPCEDID that considers fuzziness and randomness is constructed.

Notations.
The following notations are adopted.

Objective Functions.
The objective function defines the total cost of the complete manufacturer-retailer network. The aim of the project manager is to determine the order quantity and inventory level for each item in each stage so that total manufacturer-retailer network costs are minimized. The total costs are made up of purchasing costs, transport costs, inventory costs, penalty costs, and carbon emissions costs.
In the proposed inventory-allocation model, the retailer orders products under several discount policies. In this paper, an incremental quantity discount is considered, for which the products are delivered in known packets containing a certain number of items. In the incremental quantity discount policy, the purchase cost of Item  in the ( + 1)th stage depends on the ordered quantity. Each price discount-point is obtained by (1).
Therefore, the purchase cost under this policy ( PC ) is as follows:
Let   be the unit inventory cost of Item . However, as not all items are stored in the warehouse over the whole stage, the actual inventory cost is less than     . To deal with this, an inventory conversion factor is introduced to balance the difference between the actual inventory quantity and   in the ( + 1)th stage.   () is the function for the current inventory for Item  across the whole manufacturer-retailer network, in which a unit of Item  is one stage and   () =   ; therefore, the inventory conversion factor X can be defined as follows:
Let   be the inspection fee for Item . Before being transported to the warehouse, each item is inspected, after which all defective items are returned to the manufacturer and a return price is requested. As the purchase quantity is   , the total inspection fee should naturally be     . Let q be the percentage of defective Item  in the ( + 1)th stage. Let   be the total inventory price, which is as follows:
As the transportation distances between the manufacturers, warehouses, and retail stores are all different, the transportation vehicles are also different, making total transportation costs difficult to determine. Let γ be the transportation price of Item  per kilometer,   be the transportation distance between the manufacturers and the corresponding warehouses, and   be the transportation distance between the warehouses and the retail stores. Hence,   is the transportation quantity for Item  from the manufacturer to the corresponding warehouse in the ( + 1)th stage, and μ is the demand at each retail store in the ( + 1)th stage; therefore,   is the total transportation cost for Item  over the manufacturer-retailer network, as follows:
A penalty cost is applied when the demand for Item  cannot be met. Let   be the penalty if the demand for Item  cannot be met in the ( + 1)th stage. Let  PeC be the penalty cost for Item , which can be determined as follows:
The carbon emissions costs are the penalties/rewards in a carbon constrained scenario. These two terms represent the transport emissions from the manufacturers to the warehouses and from the warehouses to the retail stores. Let   be the fuel consumption for a transportation vehicle and   be the CO2 emissions from the gasoline; therefore, the vehicle's carbon emissions per kilometer are     .
Let  be the carbon cap during transport and let the same carbon price () be applied to both the purchase and the sale of carbon credits [7]; therefore,  EC is the carbon emissions cost for Item  over the complete manufacturer-retailer network, as follows:
As it is very difficult to deal with objective functions that have fuzzy random factors, Khan et al. [42] developed a method to convert fuzzy random variables in both the objective function and the constraints into fuzzy variables similar to trapezoidal fuzzy numbers. Based on the theory proposed by Heilpern [43], without loss of generality, the expected value operator is used to convert the uncertain model into a deterministic model, which can then be used to transform the fuzzy random objective functions and constraints into crisp equivalences.
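To make the conversion step concrete, the following minimal Python sketch computes the expected value of a trapezoidal fuzzy number as the midpoint of its expected interval, which is the form usually attributed to Heilpern [43]; a triangular number is treated as the special case in which the two middle points coincide. The function names and the sample defect-rate triple are illustrative assumptions and are not taken from the paper. For a fuzzy random parameter, the sketch also assumes that the random component has already been replaced by its mean.

```python
def expected_value_trapezoidal(a, b, c, d):
    """Expected value of the trapezoidal fuzzy number (a, b, c, d).

    Assumes a <= b <= c <= d. The expected interval is
    [(a + b) / 2, (c + d) / 2]; the expected value is its midpoint.
    """
    if not a <= b <= c <= d:
        raise ValueError("parameters must satisfy a <= b <= c <= d")
    lower = (a + b) / 2.0
    upper = (c + d) / 2.0
    return (lower + upper) / 2.0


def expected_value_triangular(a, b, c):
    """Triangular fuzzy number (a, b, c) as a degenerate trapezoid."""
    return expected_value_trapezoidal(a, b, b, c)


# Hypothetical defect rate "about 3%, between 2% and 5%".
crisp_defect_rate = expected_value_triangular(0.02, 0.03, 0.05)
```

The crisp value obtained in this way can then stand in for the fuzzy parameter in the cost terms and constraints.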

State Equation.
The state equation describes the relationship between the th stage and the ( + 1)th stage. Let   be the inventory level and ũ be the demand. If the item is deemed suitable after inspection, it is then transported to the warehouse; therefore, the inventory level of Item  in the corresponding warehouse at the beginning of the ( + 1)th stage,  (+1) , is   +   (1−[ q ])− ũ when this quantity is positive, and zero otherwise. The relationship between the inventory level, purchase quantity, and demand can be modeled as follows:
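A minimal Python sketch of this recursion is given below. It assumes that the fuzzy random defect rate and demand have already been replaced by their expected values, and all names (`roll_inventory`, `orders`, and so on) are hypothetical rather than the paper's notation.

```python
def roll_inventory(initial_level, orders, expected_defect_rates, expected_demands):
    """Return the inventory level at the start of every stage for one item.

    Next level = current level + accepted (non-defective) order quantity
    - expected demand, floored at zero when demand cannot be met.
    """
    levels = [float(initial_level)]
    for order, defect_rate, demand in zip(orders, expected_defect_rates, expected_demands):
        accepted = order * (1.0 - defect_rate)  # defective items go back to the manufacturer
        levels.append(max(levels[-1] + accepted - demand, 0.0))
    return levels


# Hypothetical three-stage example with a 5% expected defect rate.
levels = roll_inventory(0.0, [100, 80, 120], [0.05, 0.05, 0.05], [90, 90, 100])
```

In the second stage of this example the accepted quantity plus the carried-over stock falls short of demand, so the level is set to zero and a shortage (penalized separately in the objective) occurs.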

Initial and Terminal Conditions.
The initial conditions describe the storage level for Item  before the first stage. The terminal conditions describe the storage level for Item  at the end of the manufacturer-retailer network. Let   and   be the initial and terminal inventory levels for Item .
Generally, in a practical condition, the two conditions above can be settled as   = 0 and  = 0, ∀ ∈ Ψ.The initial condition and terminal condition can be presented mathematically as follows: =   , ∀ ∈ Ψ (10)

Constraint Conditions.
If a manager decides to purchase Item  in the ( + 1)th stage, let  min  and  max  be the minimum and maximum purchase quantities for Item  in that stage; the purchase quantity for Item  in each stage must be within this specified range:
The retailer has financial constraints. Let  be the total purchase budget; therefore,  PC should be within the budget.
As maximum storage levels must be taken into consideration, the inventory level of each item in each stage cannot exceed the maximum storage level. Let  max  be the maximum storage for Item . The storage level   should satisfy the following condition:

3.6. Global Model.
The IAPCEDID determines the quantity of Item  that needs to be purchased from the manufacturer and distributed to the retailers in stage  to minimize the total expected cost function under the considered constraints, with a carbon emissions cost added to account for environmental considerations. The model proposed here is based on dynamic programming over a planning horizon that has multiple periods with initial and terminal conditions and state equation constraints. The objective function is made up of the purchase costs ( PC ), transportation costs ( TC ), inventory costs ( IC ), penalty costs ( PeC ), and carbon costs ( EC ). As the items are classified as suitable or defective, the processes for both item inspection and defective item disposal are included. In summary, the global model is as follows:

Dynamic Programming-Based GLNPSO
4.1. General Mechanism of DP-Based GLNPSO.
Based on the particle swarm optimization (PSO) proposed by Kennedy [31], the main PSO algorithm is developed based on a GLNPSO with multiple social structures [44]. In this study, based on an iterative dynamic programming model, a DP-based GLNPSO algorithm is developed to solve the problem. The proposed DP-based GLNPSO is a variant of the GLNPSO, with the main difference being the dimensionality reduction of the variables. With appropriate model transformations, a dynamic programming-based particle swarm optimization with multiple social learning structures (DP-based GLNPSO) algorithm is developed to solve the IAPCEDID. The goal is to search for satisfactory solutions to (14) by constantly moving the particles in the direction of the optimum. The notations needed are as follows:
: iteration index,  = 1, 2, . . ., .
V  (): velocity of the th particle at the th dimension in the th iteration.
(): position of the th particle at the th dimension in the th iteration.
best  : personal best position of the th particle at the th dimension.

𝑝 best : local best position of the th particle at the th dimension.
𝑝 best : near neighbor best position of the th particle at the th dimension.
The local best ( best  ) is the best position among several adjacent particles, and the near neighbor best ( best  ) is a social learning behavior that is determined based on the fitness-distance-ratio (FDR). Each particle is represented by its position in a - space, where  is the problem dimension. Unlike the GLNPSO, using the state equation in the dynamic programming model [32], the DP-based GLNPSO can reduce the particle dimensions, the details for which are shown in Figure 2. In this problem, the problem dimension contains decision variables (  ) and state variables (  ), which are, respectively, related to the objectives and constraints. It should be noted that if the decision variables are known, then the state variables can be determined using the state equation.
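The near neighbor best can be illustrated with the following Python sketch. It assumes the commonly used definition of the fitness-distance-ratio for a minimization problem, in which, for each dimension, the particle picks the personal-best coordinate of the neighbor that maximizes (own fitness minus neighbor's best fitness) divided by the coordinate distance. The function and argument names are hypothetical, and the paper's exact FDR variant may differ.

```python
def near_neighbor_best(i, positions, fitness, pbest_positions, pbest_fitness):
    """Build the near neighbor best vector of particle i, one dimension at a time."""
    result = []
    for d in range(len(positions[i])):
        best_ratio = None
        best_coord = pbest_positions[i][d]  # fallback if no neighbor qualifies
        for j in range(len(positions)):
            if j == i:
                continue
            distance = abs(pbest_positions[j][d] - positions[i][d])
            if distance == 0.0:
                continue  # ratio undefined for coincident coordinates
            ratio = (fitness[i] - pbest_fitness[j]) / distance
            if best_ratio is None or ratio > best_ratio:
                best_ratio, best_coord = ratio, pbest_positions[j][d]
        result.append(best_coord)
    return result


# Hypothetical swarm of three particles in two dimensions (lower fitness is better).
positions = [[0.0, 0.0], [1.0, 4.0], [2.0, 1.0]]
fitness = [10.0, 4.0, 6.0]
nbest_0 = near_neighbor_best(0, positions, fitness, positions, fitness)
```

Because the choice is made per dimension, the resulting vector can mix coordinates from different neighbors, which is what distinguishes it from the local best.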

The essential difference between the DP-based GLNPSO and the GLNPSO is that the DP-based GLNPSO takes advantage of the iterative mechanism in the dynamic programming model to reduce the dimensions of the particles, thereby significantly reducing the solution search space.It should be noted that if a GLNPSO were used in this study, the particle dimensions would be 2× compared to × for a DP-based GLNPSO particle.
where    () can be the th part of the th particle in the th generation. Note that every part of a particle is a - vector, which can be denoted as
where   (+1) () is the ( + 1)th dimension of    () for the th particle in the th generation;  = 0, 1, . . .,  − 1. In order to be in line with the expression   () = [ 1 (),  2 (), . . .,   ()],  
Step 1. Set  = 1,  = 0.
Step 4. If the stopping criterion is met, that is,  =  and  =  − 1, then the initialization for the th particle is completed. Otherwise,  =  + 1 and return to Step 2.

Adjusting Strategy.
An adjustment strategy is used to generate the particle and adjust it to the feasible region.After updating to avoid an infeasible position, the particle is adjusted as follows.
Step 6. If the stopping criterion is met, that is,  =  and  =  − 1, then the adjustment for the th particle is completed.

Updating Strategy and Decoding Strategy.
Throughout the DP-based GLNPSO optimization process, the social learning behavior component includes the global best, the local best, and the near neighbor best. The search benefits from the sharing of information with the whole population about the particles' discoveries and past experiences. In each generation, the global best ( best  ) is calculated as the best position the swarm reaches; the local best ( best  ) is calculated as the best position from several adjacent particles; the near neighbor best ( best  ) is a social learning behavior that is determined based on the fitness-distance-ratio (FDR) [45]; and () is the inertia weight used to control the impact of the previous velocities on the current velocity, which influences the trade-off between the global and the local exploration abilities during the search. The particle then updates its position using the new velocity, after which each particle updates its velocity   () to approach the new  best  ,  best  ,  best  , and  best  :
The DP-based GLNPSO decoding strategy transforms the particle 

4.5. Overall Procedure.
Based on the above sections, the overall procedure for the DP-based GLNPSO algorithm can be given. The algorithm is shown in Figure 3, the details for which are as follows.
Step 1. Initialize the particle   and   using the initialization strategy.
Step 2. Check the constraints based on the DP-based GLNPSO, and avoid an infeasible position.
Step 3. Calculate the initial particles to generate the fitness value,  best  ,  best  , and the  best  .
Step 5. Adjust the particles to the feasible region using the adjustment strategy.
Step 6. If the stopping criterion is met, go to Step 7; otherwise,  =  + 1 and return to Step 4.
Step 7. Determine the fitness value and global best position.
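The velocity and position update applied in each iteration of this procedure can be sketched as follows. The sketch assumes the usual GLNPSO form, in which the velocity combines an inertia term with four attraction terms (personal, global, local, and near neighbor best), each scaled by an acceleration constant and a uniform random number. The function name, argument names, and default constants are illustrative assumptions; feasibility adjustment and decoding are deliberately left out.

```python
import random


def update_particle(x, v, pbest, gbest, lbest, nbest, w,
                    cp=2.0, cg=2.0, cl=2.0, cn=2.0, rng=random):
    """Return (new_position, new_velocity) for one particle.

    x, v and the four best vectors are lists of equal length; w is the
    inertia weight for the current iteration.
    """
    new_x, new_v = [], []
    for d in range(len(x)):
        velocity = (w * v[d]
                    + cp * rng.random() * (pbest[d] - x[d])
                    + cg * rng.random() * (gbest[d] - x[d])
                    + cl * rng.random() * (lbest[d] - x[d])
                    + cn * rng.random() * (nbest[d] - x[d]))
        new_v.append(velocity)
        new_x.append(x[d] + velocity)  # move using the freshly updated velocity
    return new_x, new_v


# Hypothetical particle that already sits on all four best positions:
# only the inertia term contributes to the move.
here = [1.0, 2.0]
moved, speed = update_particle(here, [0.5, -1.0], here, here, here, here, w=0.5)
```

In the DP-based variant only the decision variables would be carried in `x`; the state variables are recovered afterwards from the state equation.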

Case Study
To illustrate the performance of the proposed DP-based method and to show the effect of a carbon cap on the optimization results, the method was applied to a particular case.
A sustainable logistics item structure made up of five main parts is considered, as shown in Figure 1, in which each stage is one month, and four periods, five retail stores, and four items with corresponding warehouses are considered.After the items are inspected, suitable items are transported to the warehouses and defective items returned to the corresponding manufacturers.In this case, a strategy is generated to minimize the inventory, allocation, and carbon emissions costs.
The carbon emissions can be converted into a carbon credit cost, which has the same dimensions as the economic costs [9]. This case has four items ( 1 ,  2 ,  3 ,  4 ) and five retail stores. Each retail store's demand for each item for each month is shown in Table 1; the purchase information and item inventory information are shown in Tables 2 and 3; and the distribution information is shown in Tables 4 and 5. All fuzzy random variables are represented by triangular fuzzy numbers, with the parameters obeying a normal distribution. The fuel consumption (  ) is 0.245 L/km, the CO2 emissions for a unit of gasoline (  ) are 2.63 kg/L, and the carbon credit price () is 189.29 CNY/ton. These emissions parameters were taken from the Environmental Data for International Cargo Transport & Road Transport [46].
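As a rough numerical illustration of these parameters, the Python sketch below computes the emissions per kilometer and a carbon cost of the cap-and-trade form "price times (emissions minus cap)". Treating the cap as a quantity in tons and emissions as proportional to the total distance driven are assumptions made here for illustration; the names are hypothetical, and the sample distance and caps are not taken from the case data.

```python
FUEL_CONSUMPTION_L_PER_KM = 0.245   # from the case study
CO2_KG_PER_LITRE = 2.63             # from the case study
CARBON_PRICE_CNY_PER_TON = 189.29   # from the case study


def emissions_kg_per_km():
    """CO2 emitted by one vehicle per kilometer driven."""
    return FUEL_CONSUMPTION_L_PER_KM * CO2_KG_PER_LITRE


def carbon_cost_cny(total_distance_km, carbon_cap_tons):
    """Positive: credits must be bought; negative: surplus credits can be sold."""
    emissions_tons = total_distance_km * emissions_kg_per_km() / 1000.0
    return CARBON_PRICE_CNY_PER_TON * (emissions_tons - carbon_cap_tons)


# Hypothetical example: one million vehicle-kilometers against two caps.
cost_tight_cap = carbon_cost_cny(1_000_000, 500)
cost_loose_cap = carbon_cost_cny(1_000_000, 1000)
```

With the case-study parameters a vehicle emits roughly 0.64 kg of CO2 per kilometer, so the sign of the carbon cost depends only on whether the accumulated emissions exceed the cap.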

DP-Based GLNPSO Parameter Selection.
The IAPCEDID parameters were determined based on practical situations and past studies to observe the behavior of the algorithm at different parameter settings. From a comparison of several parameter sets including the acceleration constants   ,   ,   , and   and the inertia weight (), the most reasonable parameters were identified. Through further experiments, (1) = 1.0 and () = 0.1 were found to be the most suitable to control the impact of the previous velocities on the current velocity and to influence the trade-off between the global and local experiences. The other parameters were selected by comparing the results with the observations from the dynamic search swarm behavior. The selection of the acceleration coefficients   ,   ,   , and   affects both the convergence speed and the ability to escape from local minima. In this paper,   =   =   =   = 2 were chosen as the most suitable.
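The text gives only the first and last values of the inertia weight. A common choice in the GLNPSO literature, assumed here rather than confirmed by the paper, is to decrease the weight linearly between these two values; the function below is a hypothetical sketch of that schedule.

```python
def inertia_weight(iteration, max_iterations, w_first=1.0, w_last=0.1):
    """Linearly interpolate the inertia weight from w_first (iteration 1)
    to w_last (iteration max_iterations)."""
    if not 1 <= iteration <= max_iterations:
        raise ValueError("iteration must lie in 1..max_iterations")
    if max_iterations == 1:
        return w_first
    remaining = (max_iterations - iteration) / (max_iterations - 1)
    return w_last + remaining * (w_first - w_last)


# Schedule for a run of 500 iterations, the value selected later in this section.
weights = [inertia_weight(t, 500) for t in range(1, 501)]
```

A large weight early on favors global exploration, and the small final weight favors local refinement around the best positions found.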
For the maximum generation  and population size , the influence of the maximum iteration on the IAPCEDID performance was tested to determine suitable parameters. In the test, the population size  was varied from 10 to 30 with a step-length of 5 and the stopping criterion  was varied from 400 to 600 with a step-length of 50; therefore, there were 25 maximum iteration groups. Figures 4(a) and 4(b) show the average results and computing times. The horizontal axis TN indexes the  and  groups; for example, "1∼5" represents five different groups in which  = 10 and  increases from 400 to 600 with a step-length of 50, with the remainder following the same pattern. The IAPCEDID was run 30 times for each group and the specific optimal results are presented in Figure 4(b). From Figure 4(a), when  was from 10 to 20, the maximum iteration had a marked impact on the results, and the results varied within a relatively tight range. The best (lowest) result was obtained when  = 500 and  = 20. From Figure 4(b), it can be seen that the maximum iteration significantly influenced computing time. Further, a significant positive correlation between the average computing time and the maximum generation was observed when the population size was fixed. Accordingly, the best values for maximum generation and population size were identified as  = 500 and  = 20, respectively.
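The 25 test groups described above can be enumerated with a few lines of Python. The variable names are hypothetical; the ordering follows the description of the TN axis, with the population size as the outer loop and the maximum generation as the inner loop.

```python
from itertools import product

population_sizes = range(10, 31, 5)    # 10, 15, 20, 25, 30
max_generations = range(400, 601, 50)  # 400, 450, 500, 550, 600

# Each group is a (population size, maximum generation) pair; the group
# number on the TN axis is the list index plus one.
groups = list(product(population_sizes, max_generations))
selected_group_number = groups.index((20, 500)) + 1
```

Under this ordering, groups 1 to 5 share a population size of 10, and the selected setting (population size 20, 500 generations) is the thirteenth group.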

Result Analysis.
The DP-based GLNPSO approach was implemented in MATLAB. Using the data in Section 5.2, the performance of the method was tested with MATLAB 7.10.0 (R2010a) on a Core i5-5200U machine with a 2.19 GHz clock and 3.88 GB of memory.
Inventory and allocation decisions with rejected items over multiple stages are very important to the IAPCEDID. The specific purchase and inventory strategies for the four different items over the five stages without carbon cap consideration are shown in Table 6. The optimal results for the purchase, inventory, transport, and penalty costs are shown in Table 7. The total optimization cost was 6.24 × 10^6 CNY and the losses caused by defective items were 3.30 × 10^4 CNY.
The specific purchase and inventory strategies that consider carbon emissions in an inventory-allocation network are almost the same as those that do not. However, the total costs and carbon costs summarized in Table 8 indicate that a carbon cap can have a serious effect on total costs. As the carbon cap increased from 1000 to 10000, the total costs gradually decreased from 7.128 × 10^6 CNY to 5.663 × 10^6 CNY. With an increase in the carbon cap, economic costs followed a downward trend at the same carbon credit price. When the carbon cap increased from 1000 to 5000, the carbon costs fell from 9.015 × 10^5 CNY to 1.402 × 10^5 CNY. This indicates that there is a substantial increase in the manufacturer-retailer network costs if carbon emissions are considered. This calculation should convince decision makers to acknowledge the influence of the carbon emissions costs and persuade them to take measures to reduce their emissions as much as possible. When the carbon cap increased from 6000 to 10000, the carbon costs continued to decrease and became negative. According to the Kyoto Protocol, companies that do not reach their carbon emissions limits can sell the excess as carbon credits to other companies. In these cases, the company's carbon costs would decline towards a negative value, at which point the company would be earning credits; in other words, a negative cost means that the firm is actually making money by reducing carbon emissions. The sensitivity of the total inventory-allocation network costs to a change in the carbon cap in the mathematical model is shown in Figure 5(a). Total costs were found to decrease with an increase in the carbon cap, and a linear relationship was found between the carbon cap and overall costs. It can be seen from Figure 5(b) that as the carbon cap decreased, the carbon costs gradually took a larger share of the total costs. Further, it demonstrated that the carbon cap was conducive to economic growth; however, as the environment becomes increasingly damaged, the carbon cap would tighten, eventually causing a negative economic effect. In fact, from the enterprise point of view, as the carbon cap has strong externality, it is better to select the best decision based on the decision makers' preferences under different carbon caps.
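The reported carbon costs can be cross-checked with a short back-of-the-envelope calculation. The Python sketch below assumes that the carbon cost has the form "price times (emissions minus cap)" with the cap expressed in tons, which is an interpretation made here for illustration and is not stated explicitly in the text; the function name is hypothetical.

```python
CARBON_PRICE_CNY_PER_TON = 189.29  # carbon credit price from the case study


def implied_emissions_tons(carbon_cost_cny, carbon_cap_tons):
    """Invert cost = price * (emissions - cap) to recover the emissions level."""
    return carbon_cap_tons + carbon_cost_cny / CARBON_PRICE_CNY_PER_TON


# Carbon costs reported for caps of 1000 and 5000.
emissions_at_cap_1000 = implied_emissions_tons(9.015e5, 1000)
emissions_at_cap_5000 = implied_emissions_tons(1.402e5, 5000)
```

Under this assumption both reported values imply an emissions level between 5000 and 6000, which agrees with the observation that the carbon costs turn negative once the cap reaches 6000.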
Environmental considerations can also have a substantial impact on the inventory-allocation system, especially when the network dimension is large or when more stringent restrictions are placed on carbon emissions. In this network, when the variety of items increased, the purchasing quantities became extremely large and the distribution network became more complex. This indicates that it is necessary to construct a manufacturer-retailer model that considers the carbon cap so that an appropriate strategy can be quickly chosen when there are changes in the external environment and so that different strategies can be provided for decision makers who have different preferences. Therefore, it is recommended that companies begin working towards sustainable inventory-allocation networks given that there is a global movement for a reduction in carbon emissions.

Algorithm Comparison.
With the development of technology, dynamic programming has become a counterpart to PSO when dealing with different optimization problems. The average optimal result obtained by the DP-based GLNPSO is shown in Figure 6(a). To demonstrate the feasibility and effectiveness of the proposed DP-based GLNPSO, it was compared with a standard PSO. To conduct the comparison under similar conditions, the parameters selected for the DP-based GLNPSO were also adopted for the standard PSO: carbon cap = 5000, population size = 20, iterations = 500, all four acceleration constants set to 2, and an inertia weight of 1.0 at the first iteration and 0.1 at the final iteration. The performance of the iterative process for each algorithm is shown in Figure 6(b), and the comparison details are shown in Table 9. It can be concluded that (1) both the DP-based GLNPSO and the PSO were able to obtain optimal solutions, but the DP-based GLNPSO required less computation time than the PSO; (2) the DP-based GLNPSO converged faster, indicating that it needs fewer iterations to find the optimal solutions; and (3) the DP-based GLNPSO was more stable than the standard PSO when searching for the optima, while the standard PSO occasionally fell into a local optimum. From these comparisons, it can be concluded that the DP-based GLNPSO is able to produce sufficiently good feasible solutions for the IAPCEDID.
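For reference, the baseline in this comparison can be sketched as follows. This is a minimal standard PSO, not the DP-based GLNPSO, using the settings quoted above (population size 20, 500 iterations, acceleration constants of 2, inertia weight moving from 1.0 to 0.1). The linear decrease of the inertia weight, the velocity clamp, the search bounds, the random seed, and the sphere test function are all assumptions made for illustration; the actual objective is the IAPCEDID cost, which is not reproduced here.

```python
import random


def sphere(x):
    """Toy objective (assumption): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def standard_pso(objective, dim=2, pop=20, iters=500, c1=2.0, c2=2.0,
                 w_start=1.0, w_end=0.1, lo=-5.0, hi=5.0, seed=0):
    rng = random.Random(seed)
    v_max = 0.5 * (hi - lo)  # velocity clamp (assumption): limits step size
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop)]
    vel = [[0.0] * dim for _ in range(pop)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(pop), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = []  # best objective value found so far, one entry per iteration

    for t in range(iters):
        # Inertia weight decreases linearly from w_start to w_end (assumption).
        w = w_start + (w_end - w_start) * t / (iters - 1)
        for i in range(pop):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))
                # Keep the particle inside the search bounds.
                pos[i][d] = max(lo, min(hi, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


best_x, best_val, history = standard_pso(sphere)
print(best_val)
```

The `history` list plays the role of an iterative-process curve of the kind compared in Figure 6(b); a GLNPSO variant would add local-best and near-neighbor-best terms to the velocity update, which this sketch deliberately omits.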

Conclusion
In this paper, a sustainable inventory-allocation planning model with carbon emissions and defective item disposal (IAPCEDID) under a fuzzy random environment was presented. The aim of the model was to find the optimal purchase quantities so as to minimize total network costs, which were made up of the purchase costs, inventory costs, transport costs, penalty costs, and carbon costs. In our model, inventory and distribution planning under a fuzzy random environment was considered, with annual demand, transportation costs, inventory conversion factors, and the percentage of defective items being fuzzy random variables. The findings in this research extend those of previous studies, most of which assumed no defective items in the purchase process. Given the price of carbon credits, carbon emissions can be converted into carbon costs, which have the same dimensions as the economic costs. Carbon costs were added to the model to analyze the impact of a carbon cap on total costs. Such an extension is necessary for decision makers to balance operational costs on the one hand and environmental impact on the other. Considering the complexity of the model, a heuristic solution algorithm was proposed to solve this problem: a dynamic programming-based particle swarm optimization with multiple social learning structures (DP-based GLNPSO), combined with fuzzy random simulation. A brief comparison was made between the DP-based GLNPSO and the classic PSO to further illustrate the merits of the algorithm. This study expands existing research on sustainability in supply chains and paves the way for the development and implementation of sustainable inventory-allocation networks, which can guide managers to better evaluate the sustainable practices in their manufacturer-retailer networks.

Figure 1: The flow of items in the manufacturer-retailer network.
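Decoding converts the particle position into a corresponding purchase quantity for each item at the beginning of each stage. Based on the state equation, in which the state at the next stage equals the current state plus the purchase quantity multiplied by (1 − E[q̃]), minus E[ũ], the particle position in the corresponding dimension is decoded into the purchase quantity of the item at the beginning of that stage. The decoded result is taken as the purchase quantity.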

The function of current inventory for item t in the whole process.
D_t: the distance between the manufacturer and the corresponding warehouse.
D_tn: the distance between warehouse t and store n.
The inspection price of the item.
The return price of a defective item.
The stock-out penalty price of a defective item.
The unit cost of the item from the manufacturer at a given price break point.
The price break point for the item in a given stage.
Total purchase budget of the retailer for the planning horizon.
Fuel consumption per kilometer for the transportation vehicle.
CO2 emission per unit of gasoline fuel for the transportation vehicle.
Carbon cap over the network.
Carbon credit price per ton.

Table 1: Demand information of items.

Table 2: Purchasing information of items.

Table 3: Inventory information of items.

Table 4: Distribution cost from manufacturer to warehouse.

Table 5: The distance among manufacturers, warehouses, and retail stores (km).

Table 6: Results for purchase and inventory management.

Table 7: Costs of manufacturer-retailer network without carbon emissions.

Table 8: Total cost and carbon cost according to different carbon caps.
Figure 6(b): Iterative process of DP-based GLNPSO and PSO.

Table 9: Comparison between the PSO and DP-based GLNPSO.

This framework can assist managers in simultaneously achieving economic growth and environmental protection. The suggested model can be extended by considering different carbon credit pricing schemes as well as very strict carbon footprint control scenarios.