Modeling the City Distribution System Reliability with Bayesian Networks to Identify Influence Factors

In an increasingly uncertain economic environment, research on the reliability of urban distribution systems has great practical significance for the integration of logistics and supply chain resources. This paper summarizes the factors that affect the city logistics distribution system. Starting from these factors, a model of the influences on city distribution system reliability is built based on Bayesian networks. The complex problem is simplified by using sub-Bayesian networks, and an example is analyzed. In the calculation process, we combine the traditional Bayesian algorithm with the Expectation Maximization (EM) algorithm, which gives the Bayesian model a more accurate foundation. The results show that the Bayesian network can accurately reflect the dynamic relationships among the factors affecting the reliability of the urban distribution system. Moreover, by changing the prior probability of a cause node, the correlation degree between the variables that affect successful distribution can be calculated. The results have practical significance for improving the quality of distribution, the level of distribution, and the efficiency of enterprises.


Introduction
With the ups and downs of economic development in recent years, urbanization has accelerated, cities have expanded, and the urban population has increased dramatically, which has made the business environment increasingly complex, with a more diverse, uncertain, and unstable market. In this situation, research on the reliability of urban distribution systems is becoming important and has practical significance for integrating logistics and supply chain resources and for supporting the sustained, healthy, and rapid development of the national economy.
The level of reliability of urban distribution reflects not only the capacity of the various components of the distribution system to connect and coordinate with each other, but also the ability to guarantee a rapid, stable, and sustainable development environment for society and the economy. Urban distribution is sensitive to both internal and external developments, which makes it easily disturbed and affects its reliability.
A Bayesian network is a graphical model from the field of artificial intelligence that combines graph theory and probability theory. It can represent the causal relationships among different subjects in the real world and can combine prior knowledge with observed data. Bayesian networks are an important tool for dealing with uncertain information and have been successfully used in areas such as medical diagnosis, statistical decision, and learning prediction. In this paper, we use Bayesian networks to analyze the factors that influence the reliability of the city distribution system. Parameter learning of a Bayesian network is mainly based on existing training data, but in the real world complete data are rarely available. The EM algorithm is a parameter estimation algorithm that can improve the accuracy of network parameters when data are incomplete. We therefore combine the Bayesian network with the EM algorithm to make the calculated values more accurate.
The remainder of this paper is organized as follows. Section 2 provides the relevant literature on urban distribution, urban distribution system reliability, and Bayesian networks. Section 3 describes the factors affecting the reliability of

Literature Review
Reliability research is widely used in many fields, but research on logistics system reliability has emerged only gradually in recent years. At present, a number of domestic and foreign scholars have studied the reliability of logistics systems. Results from supply chain reliability research can inform the research routes, methods, and tools used for logistics system reliability.
Thomas [1] first applied reliability engineering to the study of supply chains and gave a probability model of supply chain reliability in an emergency environment, but the model does not involve the details of logistics operation. In order to ensure the normal operation of the supply chain, Chen and Xue [2] built a comprehensive evaluation model of supply chain reliability based on MAS (Mobile Agent Server). The city logistics system is a complex network system, and some scholars have studied the reliability of city logistics networks from the perspective of the reliability of complex networks. Tran and Zhiya [3] measured client shortage in the logistics system and defined, from the viewpoint of probability, the supply reliability of a logistics network point, arc reliability, and logistics network reliability. They constructed a single-layer network reliability model, then put forward the logistics network reliability optimization problem and constructed a double-objective chance-constrained programming model based on service reliability and cost. To measure logistics system reliability, Lin [4] proposed a model and method for measuring the operational performance of the network and the reliability of the network structure when a logistics network node malfunctions. Schuëller et al. [5] qualitatively discussed the reliability of the logistics system and the metrics and methods for some other logistics properties that influence it, but did not carry out rigorous quantitative research. Wang et al. [6] defined the reliability of logistics service as a function of distance and put forward a modeling method for logistics service reliability. Križman [7] studied a measurement method and a convergent validity algorithm for logistics structural reliability. Xu et al. [8] proposed a bilevel programming model that can reduce or eliminate the adverse effects of supply uncertainty in the resource matching process of a collaborative logistics network (CLN). By adding robust constraints, this model reduces both the frequency and the cost of resource planning changes, which increases the stability of CLN operation.
In the field of logistics system reliability, the existing literature focuses on the measurement, evaluation, and prediction of reliability, mainly using Monte Carlo, fault tree, and probability analysis methods to establish assessment models, and has obtained certain research results. Judging from the research trend, research on the reliability of logistics systems will become more specific, addressing problems such as reliability level and influence factors. Moreover, it will be more closely combined with the key technical problems that urgently need to be solved in national economic and social development, and its application value will become more and more obvious.
Bayesian networks are a methodology that has been widely used in fault diagnosis, reliability analysis, economic forecasting, and so forth. Since 2001, the application of Bayesian networks has expanded to risk analysis, as noted by Weber et al. [9]. For example, Norrington et al. [10] studied network construction procedures intensively to improve the reliability of risk analysis. By applying probabilistic inference to the network topology, Hänninen and Kujala [11] used Bayesian networks to identify the most influential factors. Bayesian networks have now been applied in research on logistics systems. Tian-kui et al. [12] put forward a fuzzy comprehensive evaluation of supply chain risks based on Bayesian networks. Taking the supply chain risks of an enterprise as an example, they calculated the logarithmic probability of risk events through linear deduction on Bayesian networks and then identified the main risk events and their ranks by fuzzy comprehensive evaluation. Li et al. [13] established a local risk analysis model of the food supply chain based on the Bayesian network, which can predict risk. Wen-Fang [14] analyzed the key factors affecting the evaluation of logistics performance and their relationships with each other by examining the connotation of logistics performance evaluation, and constructed an evaluation index system of logistics performance from logistics service quality performance, logistics strategy and system, and operation service and supporting. Guo et al. [15] made the abstract problem of cold-chain logistics system faults concrete and constructed a fault tree according to the operation features of the various functional parts and the causal relationships between events in the cold-chain logistics system. By converting the fault tree to a Bayesian network, it is convenient to draw a comprehensive evaluation of the reliability of the cold-chain logistics system and to reveal the main causes of system failure, which provides a quantitative foundation for improving system reliability. As can be seen from these results, Bayesian networks are used extremely broadly, but their use is not yet widespread in the field of logistics system reliability. This paper applies the method to the factors affecting urban logistics distribution system reliability in order to analyze their impact on system reliability.
The EM algorithm, proposed in 1977 by Dempster et al. [16], is an iterative optimization strategy for finding maximum likelihood estimates of parameters. It can compute maximum likelihood estimates of parameters from incomplete data. The method can be widely applied to missing data, incomplete data, and censored data with noise. Some scholars have combined the EM algorithm with the Bayesian network. Liu and He [17] applied the Naive Bayesian method and the EM algorithm to Chinese web-page classification, improving the final convergent result by continually changing the initial conditions of convergence. Wang et al. [18] proposed a method that combines data revising with the Bayesian structural EM algorithm, which is effective in learning Bayesian networks from small data sets. Wei and Huijin [19] used the result of Naive Bayes as the initial range of EM, then refined the values iteratively, and finally obtained the expected maximized value. Cao et al. [20] proposed a learning algorithm based on the cloud model and the EM algorithm to address the problem that the concept of a discretized node in a Bayesian network is fuzzy and random. These papers focus on theoretical research. In this paper, the EM algorithm and the Bayesian network are combined to study the influence factors of urban distribution system reliability.

Analysis of the Key Influence Factors of City Logistics Distribution System
In this paper, four influence factors are discussed: hardware configuration, personnel operation, external factors, and the change of customer demand.
3.1. Hardware Configuration. Hardware configuration covers facilities, logistics technology, and information systems.
Facilities are one of the most important elements of logistics systems: they undertake the various tasks of each link of logistics operation and hold a very important position in the logistics system. Logistics facilities include facilities for packaging, warehousing, storage, transportation, loading and unloading, container utilization, port logistics, and distribution processing. Advanced logistics facilities and equipment are a guarantee of efficient, high-quality, and low-cost logistics operation. Without adequate logistics equipment, the efficiency of the logistics distribution system will be extremely low.
Logistics technology includes basic technologies and methods such as logistics prediction, supplier selection, inventory control, and logistics packaging. The content of the logistics industry changes with each passing day, so new logistics concepts and technologies play a vital role in the development of the industry. In the twenty-first century, logistics theories and technologies such as "frontier logistics," "6 Sigma Logistics," "closed loop logistics and reverse logistics," and "RFID" have greatly promoted the development of the logistics industry. At the same time, they have brought more challenges and uncertainties to the city distribution system.
The flow of goods is accompanied by the flow of information, and information plays an important role in the city logistics system. In China, there is no unified logistics information platform, which directly leads to a low degree of information sharing between enterprises. On the one hand, because information is difficult to communicate, erroneous information is difficult to correct once a problem arises, which significantly affects the reliability of the logistics distribution system. On the other hand, due to the lack of a unified system of standards, information is difficult to transmit between different enterprises, which eventually weakens the function of the information system.

Personnel Operating Factors.
The personnel operating factor refers to the uncertainty of the system caused by specific operational problems. It includes the stocking process, the storage process, and the quality of delivery staff.
Stocking is the basic work in distribution and mainly includes ordering, purchasing, goods collection, and other related processes. Stocking is a key link in the success of distribution and one of the important factors that affect the reliability of the city logistics distribution system. Any error in information input by personnel, wrongly posted label, or scanning error would delay the delivery of goods.
Storage is a guarantee for meeting customer demand and dealing with emergencies, and it is also an important part of ensuring the reliability of the urban logistics distribution system. Storage should be planned on the basis of past experience and customer orders. Hoarding goods not only extends the cash cycle but may also create risks when products are replaced quickly.
The quality of delivery staff directly affects the goods delivery rate and customer satisfaction. Distribution industry management has some requirements. (1) Timeliness: is the delivery made within the stipulated time? (2) Safety: are the goods damaged or lost? Some products need unpacking inspection of their quality and quantity by customers (service quality). The attitude of delivery staff may significantly determine customer satisfaction. Operation mainly includes filling in the waybill and completing the packaging.
In China, government policy is of great significance for the development of logistics, and a change of government policy has a significant influence on the development of the logistics industry. It cannot be denied that there are still some defects in the functions of the Chinese government in the process of city logistics development, mainly defects in the government's logistics management system, the government's lack of guidance for modern logistics, and the weakness of government regulation of the logistics market. Hence, the development of city logistics needs the establishment of an urban infrastructure system, the improvement of enterprise logistics networks, the construction of the logistics system and logistics standardization, and a good environment for enterprise development. The unpredictability of government policy brings a series of uncertainties, which directly affect the reliability of the urban logistics system.
The State Council issued the "overall national public emergency contingency plan," which mentions that public emergency events may cause heavy casualties, property damage, environmental damage, serious social harm, and public safety problems. Public emergencies have the characteristics of suddenness, complexity, destructiveness, and continuity. Many sudden natural disasters can interrupt the logistics operation process and impair the normal functioning of urban logistics flows. Therefore, public emergencies are an important source of uncertainty for the city logistics distribution system.
The city logistics distribution system is a complex economic system in terms of mutual influence and mutual restriction within the whole national economy. City logistics is an important component of the national economic system. On the one hand, its development promotes the national economy. On the other hand, the development of the national economy has a significant influence on the development of city logistics. Macroeconomic fluctuations have a great impact on the city logistics system. When the macroeconomy is overheated, the logistics economy booms and even tends to expand. When the macroeconomy declines, city logistics demand decreases and investment is greatly reduced as well.
With the development of economic globalization, the connection between the Chinese economy and the world economy has gradually strengthened, especially after China's accession to the WTO (World Trade Organization), which brings opportunities and challenges simultaneously. Enterprises face pressure from both domestic and overseas competition. Foreign logistics enterprises moving into China are a challenge for the development of domestic enterprises. At the same time, the connection between countries has become closer than ever before. One country's economic fluctuation also has an impact on other countries' economic development, which brings a series of uncertainties to the city economy.

Customer Demand.
From the perspective of customer requirements, frequent changes of orders and irregular orders have an impact on demand forecasting. For enterprises with shorter production cycles, a change in the demand forecast leads to corresponding changes in the production plan, procurement, and scheduling. For enterprises with longer production cycles, the production plan cannot be changed in time, so the final demand of the customers cannot be met. The uncertainty of demand mainly comes from incorrect forecasting methods, decision errors, and the variability of customer demand.

Basic Principle of the Bayesian Network. A Bayesian network is a graphical model that is established on the probabilistic relationships between random variables and is represented by a directed acyclic graph (DAG). It has two elements: nodes and arcs. Nodes are the basis of the Bayesian network structure and represent the basic elements of the events being modeled. A directed arc between two nodes indicates a conditional dependency running from the cause node to the result node. A Bayesian network can be expressed mathematically as $\mathrm{DAG} = (V, E)$, with $V = (V_1, V_2, \ldots, V_n)$ and $E = (E_1, E_2, \ldots, E_m)$, where $V$ and $E$ represent the nodes and directed arcs of the Bayesian network. A simple Bayesian network is shown in Figure 1.
In Figure 1, $A$, $B$, and $C$ represent individual nodes, and the values taken by each node are expressed by lower-case letters. For example, the values of node $A$ are expressed as $a_1, a_2, \ldots, a_n$ and satisfy $P(a_1) + P(a_2) + \cdots + P(a_n) = 1$; the values of nodes $B$ and $C$ are represented in the same way.
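The node-and-arc description above can be illustrated with a minimal Python sketch. The three-node structure (arcs A -> C and B -> C) is an assumption made for illustration and is not taken from Figure 1; the helper names `parents_of` and `is_acyclic` are likewise hypothetical. The sketch stores the arcs, derives each node's parents, and checks that the graph contains no directed cycle, which is the property that makes it a DAG.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical three-node network with arcs A -> C and B -> C.
nodes = ["A", "B", "C"]
arcs = [("A", "C"), ("B", "C")]

def parents_of(nodes, arcs):
    """Map each node to the set of its parent (cause) nodes."""
    parents = {n: set() for n in nodes}
    for cause, effect in arcs:
        parents[effect].add(cause)
    return parents

def is_acyclic(parents):
    """Return True if the directed graph has no cycle."""
    try:
        # TopologicalSorter expects {node: predecessors} and raises
        # CycleError while ordering if a directed cycle exists.
        list(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False

parents = parents_of(nodes, arcs)
# Nodes without parents are the root nodes of the network.
roots = [n for n in nodes if not parents[n]]
```

Storing parents rather than children is a convenient choice here because the conditional probability of a node is always stated given its parents.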
Assume that nodes $X_1, X_2, \ldots, X_n$ exist in a Bayesian network. If the $n$ nodes are independent, then
$$P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P(X_i). \quad (1)$$
According to formula (1) and Bayes' theorem, the joint and conditional probability is
$$P(X_1, X_2, \ldots, X_n) = P(X_1)\, P(X_2 \mid X_1) \cdots P(X_n \mid X_1, X_2, \ldots, X_{n-1}). \quad (2)$$
Formula (2) can be further expressed as
$$P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid \Pi(X_i)). \quad (3)$$
Here $\Pi(X_i)$ denotes the parent nodes of $X_i$. In the Bayesian network, if a directed arc connects nodes $X$ and $Y$ and points from $X$ to $Y$, we call $X$ a parent node of $Y$ and $Y$ a child node of $X$. If $X_i$ has no parent node, we call it a root node. The total probability formula is as follows. Let $S$ be the sample space of an experiment $E$, let $A$ be an event of $E$, and let $B_1, B_2, \ldots, B_n$ be a partition of $S$ with $P(B_i) > 0$ ($i = 1, 2, \ldots, n$). Then
$$P(A) = \sum_{i=1}^{n} P(A \mid B_i)\, P(B_i). \quad (4)$$
Bayesian network inference is a kind of probabilistic reasoning based on the relationships between the node variables in the network. In practical applications, the structure of a Bayesian network is rather complex, and probabilistic reasoning over the whole network can be difficult and tedious, because adjacent nodes are assumed to be dependent. We can therefore define a node as an intermediate node, which forms a sub-Bayesian network together with its child nodes and parent nodes. In this way, a complex Bayesian network can be decomposed into a number of simple sub-Bayesian networks, and probabilistic reasoning over the sub-Bayesian networks yields the conditional probabilities of the whole Bayesian network. We first set the prior probabilities of the Bayesian network, with directed arcs representing the relationships between the nodes. Using the basic principles of probability theory, probabilistic reasoning between the network nodes then reflects the degree of impact of a cause node on a result node. Therefore, we can calculate the probabilities of related nodes in different structures according to different needs and thus reflect the relationships between the nodes.
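As a small illustration of the parent-based factorization and the total probability formula, the following Python sketch computes the joint distribution of a hypothetical network A -> C <- B and marginalizes it to obtain the probability of C. All probability values and names are assumptions chosen for illustration; none of them come from the paper's data.

```python
from itertools import product

# Hypothetical prior probabilities of the two root nodes (binary states).
p_a = {True: 0.7, False: 0.3}
p_b = {True: 0.6, False: 0.4}
# Hypothetical conditional table P(C = True | A, B), keyed by (a, b).
p_c_given = {(True, True): 0.9, (True, False): 0.6,
             (False, True): 0.5, (False, False): 0.1}

def joint(a, b, c):
    """P(A=a, B=b, C=c) as the product of each node given its parents."""
    p_true = p_c_given[(a, b)]
    return p_a[a] * p_b[b] * (p_true if c else 1.0 - p_true)

# Total probability: sum over every configuration of the parent nodes.
p_c_true = sum(joint(a, b, True) for a, b in product([True, False], repeat=2))
```

Because A and B are root nodes in this sketch, their factors are plain priors; only C carries a conditional table, which is how the factorization keeps the number of parameters small.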

Parameter Learning.
Parameter learning of a Bayesian network is mainly based on existing training data, but in the real world complete data are rarely available. Therefore, to improve the accuracy of the network parameters, adjustments based on a mathematical method are essential. Here, the Expectation Maximization algorithm is applied (Kjaerulff and Madsen [21]). As an algorithm for parameter adjustment, it can be used to obtain maximum likelihood estimates from incomplete data. By iterating the expectation step and the maximization step, the EM algorithm fills in missing data and refines the estimate gradually until the data likelihood reaches a local optimum. In the expectation step, the expectation of the likelihood function under the current parameter $\theta$ is computed given the sample set $D$; in the maximization step, the expected likelihood obtained in the expectation step is maximized. The maximized result then replaces the parameter used in the next expectation step, and the expectation is computed again. The EM algorithm repeats steps E and M until the change in the estimate is less than a fixed limit. The EM algorithm thus provides a reliable method for parameter estimation and is considered the most feasible current approach to handling missing data. The specific algorithm steps are as follows.
Step 1 (expectation step). The E-step of the EM algorithm calculates the expected sufficient statistics for a complete database.
Step 2 (maximization step). Let $\theta'$ denote the new parameter. The maximization step computes the maximum likelihood estimate of $\theta'$ from the expected sufficient statistics, expressed by $\hat{\theta}$.
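The two steps can be sketched in Python for the simplest possible case: a hypothetical two-node network X -> Y with binary states, in which some values of X are missing. This is a generic illustration of EM with expected counts, not a reproduction of the paper's formulas; the function name `em_fit`, the starting values, and the records are all assumptions. The stopping limit of 0.005 follows the value used in the worked example of this paper.

```python
def em_fit(records, p_x=0.5, p_y_given_x=(0.4, 0.6), tol=0.005, max_iter=200):
    """Estimate P(X=1) and P(Y=1 | X) for X -> Y when some X values are None.

    Assumes the data are not degenerate (both states of X receive weight).
    """
    p_y0, p_y1 = p_y_given_x  # P(Y=1 | X=0), P(Y=1 | X=1)
    n = len(records)
    for _ in range(max_iter):
        # E-step: expected counts; a missing X contributes its posterior weight.
        n_x1 = n_x1_y1 = n_x0_y1 = 0.0
        for x, y in records:
            if x is None:
                like1 = p_x * (p_y1 if y else 1 - p_y1)
                like0 = (1 - p_x) * (p_y0 if y else 1 - p_y0)
                w = like1 / (like1 + like0)  # P(X=1 | Y=y) under current parameters
            else:
                w = float(x)
            n_x1 += w
            n_x1_y1 += w * y
            n_x0_y1 += (1 - w) * y
        # M-step: maximum likelihood estimates from the expected counts.
        new = (n_x1 / n, n_x0_y1 / (n - n_x1), n_x1_y1 / n_x1)
        change = max(abs(a - b) for a, b in zip(new, (p_x, p_y0, p_y1)))
        p_x, p_y0, p_y1 = new
        if change < tol:  # stop once successive estimates agree closely
            break
    return p_x, p_y0, p_y1

# Hypothetical records (x, y); None marks a missing observation of X.
records = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 0),
           (0, 1), (None, 1), (None, 0), (1, 1), (0, 0)]
p_x_hat, p_y0_hat, p_y1_hat = em_fit(records)
```

When no value is missing, the expected counts equal the observed counts and the loop returns the ordinary frequency estimates, which is a useful sanity check on any EM implementation.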

Model of Reliability Factors in Urban Distribution System.
According to the analysis of the reliability factors in the urban distribution system in Section 3, we give the model of reliability factors in the urban distribution system in this section, as shown in Figure 2.

Sub-Bayesian Network Inference.
In order to facilitate the analysis, we decompose the Bayesian network of urban distribution system reliability factors into several simple "sub-Bayesian networks" for reasoning. We select one "sub-Bayesian network" in Figure 2 for calculation. In this "sub-Bayesian network," hardware configuration is the intermediate node; logistics technology, facilities, and information system are the parent nodes; and successful distribution is the child node, as shown in Figure 3.
For the Bayesian network, we obtained the initial measurement data from a logistics company. We conducted a sample survey with 50 samples. Table 1 shows the survey results.
We then analyze the data and obtain the prior probabilities of the cause variables. For factors such as human and organizational factors, the statistical data are not sufficient, so the EM algorithm and expert experience are essential. When some data are missing from the statistical results, the parameters are first calculated from the existing data.
Then the estimated value of the first iteration is calculated according to formulas (5) and (6). The expectation step and maximization step are repeated until the error between the previous and the latest iteration values is less than 0.005, at which point the iteration stops and more accurate results are obtained. With the EM algorithm, the Bayesian model gains a more accurate foundation for the relations and dependencies between events.
In the example, we first set the prior probabilities of the parent nodes according to the data, as shown in Table 2. Here $A$ means that the logistics technology is suitable for the operation of logistics enterprises and $\bar{A}$ that it is not; $B$ means that equipment is in good condition and $\bar{B}$ that it is not; $C$ means that the use of information systems will improve operational efficiency and $\bar{C}$ that it will not; $D$ means that the hardware configuration system functions correctly and $\bar{D}$ that it cannot; $E$ means that the entire distribution process is successful and $\bar{E}$ that it is unsuccessful.
A combination such as $ABCD$ represents that all four conditions hold, and so on. Tables 3 and 4 show the joint probability between the various factors.
According to formula (3) and the total probability formula (4), we can calculate the probability of $D$. The correlation degree between a parent node and a child node is measured by the ratio of their relative changes in probability; for $A$ and $D$,
$$R_{AD} = \frac{\Delta P(D) / P(D)}{\Delta P(A) / P(A)}, \quad (9)$$
where $\Delta P(A)$ is the variation of the probability of parent node $A$ and $\Delta P(D)$ is the variation of the probability of child node $D$. We then change the prior probability of parent node $A$ to $P'(A) = 0.6$ and keep the initial states of $B$ and $C$. Similarly, we can deduce the probabilities of nodes $D$ and $E$: $P'(D) = 0.77928$, $P'(E) = 0.66329$.
According to formula (9), we can calculate the correlation between logistics technology and hardware configuration, and also the correlation between logistics technology and successful distribution. Similarly, we can calculate the case in which the prior probability of parent node $B$ is changed to $P'(B) = 0.6$. When the prior probability of parent node $C$ is changed to $P'(C) = 0.6$, the correlation between the information system and hardware configuration and the correlation between the information system and successful distribution are
$$R_{CD} = \frac{\Delta P(D) / P(D)}{\Delta P(C) / P(C)} = 0.3586, \qquad R_{CE} = \frac{\Delta P(E) / P(E)}{\Delta P(C) / P(C)} = 0.3037.$$
The result is given in Figure 4. It can be seen from Figure 4 that the correlation among the information system, hardware configuration, and successful distribution is the largest. The information system has a great effect on hardware configuration, and it also plays an important role in successful distribution. Therefore, we propose that, in daily operation, distribution enterprises should pay attention to the routine maintenance of the information system and hire professionals to ensure its normal operation. The correlation among the facilities, hardware configuration, and successful distribution is also large, so enterprises should increase investment in equipment. In order to promote the development of the logistics industry, the government should take responsibility for its management functions, strengthen investment in infrastructure, and create a good environment, which will provide a solid foundation for the development of the distribution business. The correlation among logistics technology, hardware configuration, and successful distribution is the smallest, although it cannot be ignored.
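The sensitivity calculation described above can be sketched in Python as follows. The sub-network has three independent parents feeding an intermediate node, which in turn feeds the child node; the letters A, B, C, D, E, the prior values, and both conditional tables are hypothetical placeholders, since the paper's Tables 2 to 4 are not reproduced here. The functions `p_success` and `correlation` are illustrative names. The sketch applies the total probability formula to obtain the two downstream probabilities, changes one parent prior to 0.6, and reports the ratio of relative changes.

```python
from itertools import product

def p_success(priors, p_d_given, p_e_given_d):
    """Return (P(D), P(E)) for independent parents A, B, C -> D -> E."""
    p_d = 0.0
    for a, b, c in product([True, False], repeat=3):
        weight = 1.0
        for name, state in zip("ABC", (a, b, c)):
            weight *= priors[name] if state else 1.0 - priors[name]
        p_d += weight * p_d_given[(a, b, c)]  # total probability over parents
    p_e = p_e_given_d[True] * p_d + p_e_given_d[False] * (1.0 - p_d)
    return p_d, p_e

def correlation(priors, parent, new_prior, p_d_given, p_e_given_d):
    """Relative change of P(D) and of P(E) per relative change of one parent prior."""
    p_d, p_e = p_success(priors, p_d_given, p_e_given_d)
    changed = dict(priors, **{parent: new_prior})
    p_d2, p_e2 = p_success(changed, p_d_given, p_e_given_d)
    rel_parent = (new_prior - priors[parent]) / priors[parent]
    return ((p_d2 - p_d) / p_d) / rel_parent, ((p_e2 - p_e) / p_e) / rel_parent

# Hypothetical inputs chosen only to make the sketch runnable.
priors = {"A": 0.8, "B": 0.7, "C": 0.9}
# Assumed P(D = True | A, B, C): rises with the number of favourable parents.
p_d_given = {abc: 0.1 + 0.25 * sum(abc) for abc in product([True, False], repeat=3)}
p_e_given_d = {True: 0.9, False: 0.2}

r_ad, r_ae = correlation(priors, "A", 0.6, p_d_given, p_e_given_d)
```

Repeating the call with "B" and "C" as the changed parent gives the remaining correlation degrees, which can then be ranked to identify the most influential factor.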

3.3. External Factors. External factors refer to the factors outside the city logistics distribution system, including government policy, public emergencies, macroeconomic fluctuations, and international factors.

Figure 2: Urban distribution system reliability factors model.

Figure 3: Child Bayesian network of urban distribution system reliability factors model topology.

Table 1: The measurement data.

Table 3: Joint probability between the hardware configuration and its parent.

Table 4: Joint probability between hardware configuration and successful distribution.