Regional Risk Assessment for Urban Major Hazards Using Hybrid Method of Information Diffusion Theory and Entropy



Introduction
Urban safety is a complex systemic issue involving all components of society and its citizens [1][2][3]. At present, urbanization continues to accelerate in China. The population, functions, and scale of cities have continually expanded [4]. From 1978 to 2021, the urban permanent population increased from 170 million to 901 million, and the urbanization rate rose from 17.9% to 63.89%. Both the number of cities and the urban population are growing rapidly in China. Meanwhile, a high concentration of urban elements, including population, buildings, gas, transportation, and production parks, has formed with this rapid development [5][6][7]. The urban operation system has become increasingly complex. When an urban hazard occurs, it readily triggers a chain effect of derivative damage and public crisis. Therefore, it is imperative to conduct urban regional risk assessment and strengthen process safety management.
Many studies have explored the possible regional risks caused by a single hazard in the fields of fire protection, construction, the chemical industry, and the environment [1][8][9][10][11]. By superposing single risks, the regional risk involving multiple hazards has been described in a simplified way [12]. For instance, Chen et al. [13] constructed a disaster risk evaluation index system in accordance with Chinese conditions and presented the urban risk ranking and risk map of 31 provinces. Zhang et al. [14] explored the relationship between urban spatial risk and the distribution of COVID-19-infected populations. Altindal et al. [12] carried out a seismic risk assessment of earthquake-prone old urban centers. Hoyos and Hernández [15] pointed out that hazard-consistent record selection is extremely important when deriving vulnerability models for use at a local scale, for sites with contributions from different tectonic regimes. Yang et al. [16] proposed a multiscale environmental accident risk assessment model to comprehensively evaluate the characteristics and impacts of environmental risk at different scales. Research on regional security, a major carrier of social development, has gradually evolved from qualitative description to quantitative analysis, deepened from certainty to uncertainty, and developed from random uncertainty to fuzzy uncertainty [11][17][18][19]. For example, Zhou et al. [20] argued that uncertainties exist in slope stability analysis and proposed a probabilistic method for landslide prediction. Zhao et al. [21] developed R scripts to implement the K-means algorithm and the gap statistic validity index for clustering regional risk. Yang et al. [22] used a multilevel risk characterization method to show the evolutionary process of risk and to provide a scientific basis for managing the ecological risk of urban agglomerations. Xu et al. [23] developed a land-use-based urban growth scenario for the temporally and spatially explicit simulation of future urban growth in terms of buildings, roads, and electrical facilities while considering dynamic information. However, these publications mainly focus on a certain "point" or "face" of urban risks, such as drought, flood, and earthquake, and analyze risk superposition under limited conditions. Few comprehensive studies of the overall regional risk have been conducted.
As a complex nonlinear fuzzy system, urban regional risk involves many factors and varies rapidly in reality [24][25][26][27]. Owing to spatiotemporal limitations, data collection and feedback are commonly difficult. In the process of urban regional risk assessment, information asymmetry and insufficiency are often encountered [28][29][30]. This is commonly called a small sample size problem. Information diffusion theory is a fuzzy mathematical method for the centralized processing of incomplete information systems [31,32]. With the aid of fuzzy mathematics, a single sample point can be processed to establish an information diffusion model. The relationship among variables is constructed via the diffusion function, and the incomplete information is then appropriately expanded to compensate for information insufficiency in small sample size problems. Recently, information diffusion has gradually been introduced into specific disaster risk assessments such as flood, meteorology, fire, and earthquake [33][34][35]. For instance, Bai et al. [36] employed a genetic algorithm to improve the information diffusion algorithm and applied it to the interpolation and forecasting of monthly river discharge time series. Hao et al. [37] argued that the incompleteness, lack of clarity, and uncertainty of the data must be addressed in risk assessment and applied the information diffusion algorithm to the probabilistic analysis of grassland biological disaster risk. Sun et al. [38] proposed a diffusive foot-and-mouth disease model with nonlocal infections.
However, overall regional risk assessment by the information diffusion method has rarely been addressed. Moreover, the above studies mainly relied on a simplified model derived from molecular diffusion theory. A reasonable diffusion coefficient is crucial to system risk assessment. As mentioned earlier, uncertainty is a significant factor in urban regional risk assessment, and it also affects the determination of the diffusion coefficient. The optimization of the diffusion coefficient should therefore be considered when implementing the information diffusion model.
To address these drawbacks, this study proposes a regional risk assessment method for urban safety based on information diffusion theory. Furthermore, entropy theory is incorporated into the information diffusion model to reduce the uncertainty. The remainder of the paper is organized as follows. Section 2 gives a brief introduction to information diffusion theory; Section 3 adopts entropy to optimize the diffusion coefficient; Section 4 illustrates the overall procedure of the regional risk assessment model for urban public safety; Section 5 presents a case study to validate the proposed method; finally, Section 6 provides the conclusions and suggestions for future work.

Information Diffusion Theory
Information diffusion theory is a fuzzy mathematical method for the centralized processing of incomplete information systems [31]. In this method, each information sample point is assumed to tend to develop into multiple information points as the original incomplete system transitions toward completeness [19]. Accordingly, the incomplete system can be expanded by mathematical methods to make up for insufficient information. At the same time, the explicit calculation of a membership function can be avoided, and the information carried by the original data can be preserved to the greatest extent possible. Therefore, even with incomplete information, this method can predict the relationship between variables through a suitable diffusion function.
To understand the principle of information diffusion, the following definitions should first be reviewed [37].
Definition 1. For a nonlinear relationship, any sample with a size smaller than the population is regarded as incomplete.
Definition 2. For a known sample set $X$, let $W$ be the true relation of the object. $X$ is called a correct data set for $W$ if and only if there exists a model through which $X$ is processed to obtain an estimate $W_X^c$ such that $W_X^c = W$.
Based on these two definitions, the information diffusion principle can be stated. Let $X$ be a given sample, $V$ a subset of the universe $U$, and $\mu$ a mapping defined on $X$ and $V$; $\mu$ is called a diffusion function, and $V$ is called a monitor space. If and only if $X$ is incomplete, there must exist a diffusion function $\mu(x)$ that diffuses the quantitative information obtained at point $x$ to $v$.
Based on molecular diffusion theory, Professor Huang [19] proposed a one-dimensional normal diffusion function $f_i(u_j)$, given as equation (1), in which $h$ is the diffusion coefficient that governs the domain of information diffusion; $x_i$ is a sample of $X$, which contains $n$ samples, $i = 1, 2, \ldots, n$; and $u_j$ is an element of the universe $U$, which contains $m$ elements, $j = 1, 2, \ldots, m$. Based on the "average distance model" and the "two-point proximity principle," $h$ can be calculated by the empirical equation (2), in which $a = \min_{1 \le i \le n} x_i$ and $b = \max_{1 \le i \le n} x_i$.
From (2), it can be concluded that the value of h is mainly determined by the minimum a, the maximum b, and sample size n.
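For reference, equations (1) and (2) are usually written in the information diffusion literature in the form sketched below. This is an assumed reconstruction in the notation of this section, not a verbatim copy: the numerical constants in (2) are the commonly quoted empirical values and should be checked against [19].

```latex
% Assumed standard forms of equations (1) and (2); requires amsmath.
\begin{equation}
f_i(u_j) = \frac{1}{h\sqrt{2\pi}}
           \exp\!\left[-\frac{(x_i - u_j)^2}{2h^2}\right]
\tag{1}
\end{equation}

\begin{equation}
h =
\begin{cases}
0.8146\,(b-a), & n = 5,\\
0.5690\,(b-a), & n = 6,\\
0.4560\,(b-a), & n = 7,\\
0.3860\,(b-a), & n = 8,\\
0.3362\,(b-a), & n = 9,\\
0.2986\,(b-a), & n = 10,\\
2.6851\,(b-a)/(n-1), & n \ge 11.
\end{cases}
\tag{2}
\end{equation}
```

Only the last branch matters for samples of 11 or more observations, where $h$ scales with the average spacing $(b-a)/(n-1)$.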
To equalize the numerical status of each sample, the diffusion function $f_i(u_j)$ is commonly normalized, as expressed in equation (3). The normalized information distribution $\mu_{x_i}(u_j)$ is then defined by equation (4). Through this information diffusion, the single-valued sample point $x_i$ is transformed into a fuzzy subset with the membership function $\mu_{x_i}(u_j)$.
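The normalization in equations (3) and (4) is commonly written as below. This is a sketch under the assumption that each sample's diffused information is normalized by its sum over the whole universe, which is the usual convention; the symbol $C_i$ is introduced here for illustration.

```latex
% Assumed standard forms of equations (3) and (4); requires amsmath.
\begin{equation}
C_i = \sum_{j=1}^{m} f_i(u_j)
\tag{3}
\end{equation}

\begin{equation}
\mu_{x_i}(u_j) = \frac{f_i(u_j)}{C_i}
\tag{4}
\end{equation}
```

With this choice, $\sum_{j} \mu_{x_i}(u_j) = 1$ for every sample, so each observation contributes the same total amount of information regardless of where it lies in the universe.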
Next, let $p(u_j)$ be the estimate of the probability associated with the sample points $x_i$ at $u_j$, as expressed in equation (5). The exceeding probability $P(u_j)$, which is the estimate used to assess disaster risk, is then computed by equation (6).
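The chain from the diffusion function to the exceeding probability can be sketched in a few lines of Python. The sketch assumes the usual forms of equations (5) and (6), namely $q(u_j) = \sum_i \mu_{x_i}(u_j)$, $p(u_j) = q(u_j) / \sum_j q(u_j)$, and $P(u_j) = \sum_{k \ge j} p(u_k)$. The function names, the sample values, and the value of `h` are hypothetical and chosen only for illustration.

```python
import math


def normal_diffusion(x, u, h):
    """One-dimensional normal diffusion function f_i(u_j), assumed form of eq. (1)."""
    return math.exp(-((x - u) ** 2) / (2.0 * h * h)) / (h * math.sqrt(2.0 * math.pi))


def information_diffusion(samples, universe, h):
    """Return (p, P): probability and exceeding probability at each u_j.

    Assumes the universe covers the samples well enough that every
    per-sample normalizer C_i is strictly positive.
    """
    q = [0.0] * len(universe)
    for x in samples:
        f = [normal_diffusion(x, u, h) for u in universe]
        c = sum(f)                      # C_i, eq. (3)
        for j, fj in enumerate(f):
            q[j] += fj / c              # accumulate mu_{x_i}(u_j), eq. (4)
    total = sum(q)                      # equals the sample size n
    p = [qj / total for qj in q]        # eq. (5)

    # Exceeding probability: sum of p from u_j to the top of the universe, eq. (6).
    P = []
    running = 0.0
    for pj in reversed(p):
        running += pj
        P.append(running)
    P.reverse()
    return p, P


# Hypothetical annual death counts and risk levels, for illustration only.
samples = [62.0, 75.0, 88.0, 95.0, 110.0, 130.0]
universe = [20.0 * k for k in range(11)]   # 0, 20, ..., 200
p, P = information_diffusion(samples, universe, h=15.0)
```

By construction `p` sums to one and `P` is non-increasing, starting from one at the lowest risk level.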

Optimization of Diffusion Coefficient Based on Entropy
From (1) and (2), it is clear that the diffusion coefficient $h$ significantly affects the expansion of the incomplete sample $X$. However, the traditional empirical calculation method, namely (2), carries considerable uncertainty and lacks a sufficient theoretical basis. Information entropy is capable of measuring the uncertainty and randomness of a system. It is defined on the probability distribution of the system and quantitatively describes its information capacity. The larger the entropy, the higher the uncertainty of the system. In other words, the more information we know about a system, the less uncertain it is.
According to Shannon's theorem [39], the information entropy function is expressed by equation (7). With the aid of the maximum entropy principle, the maximum entropy $H$ of the one-dimensional normal information diffusion function is obtained from equation (8). For a given sample, each random sampling event can be considered an event with equal probability, in which case the entropy reaches its maximum, as expressed in equation (9). Combining (8) and (9) gives $\sigma$, equation (10). Since $h = \sigma \cdot \Delta_n$, where the average width is $\Delta_n = (b - a)/(n - 1)$, the diffusion coefficient $h$ is finally modified as in equation (11).
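The derivation described above can be written out as follows. This is a plausible reconstruction rather than a verbatim copy of equations (7) to (11): it assumes the discrete Shannon entropy with natural logarithms, the standard differential entropy of a normal density with standard deviation $\sigma$, and $n$ equiprobable sampling events.

```latex
% Assumed reconstruction of equations (7)-(11); requires amsmath.
\begin{align}
H &= -\sum_{i=1}^{n} p_i \ln p_i \tag{7}\\
H &= \ln\!\left(\sigma\sqrt{2\pi e}\right) \tag{8}\\
H &= -\sum_{i=1}^{n} \frac{1}{n}\ln\frac{1}{n} = \ln n \tag{9}\\
\sigma &= \frac{n}{\sqrt{2\pi e}} \tag{10}\\
h &= \sigma\,\Delta_n = \frac{n\,(b-a)}{\sqrt{2\pi e}\,(n-1)} \tag{11}
\end{align}
```

Under these assumptions, equating (8) and (9) gives (10) directly, and multiplying by $\Delta_n = (b-a)/(n-1)$ gives (11). For $n = 17$ this yields $h \approx 0.257\,(b-a)$, compared with about $0.168\,(b-a)$ from the last branch of the empirical formula.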

Urban Regional Risk Assessment Model Based on Information Diffusion
Urban regional risk assessment is a complicated issue, commonly characterized by incomplete information and numerous uncertainties [2,28]. Information diffusion and information entropy are well suited to such problems. Therefore, in this study, the two methods are combined to assess regional risk in relation to urban public safety.
In the process of urban regional risk assessment, information asymmetry and insufficiency are often encountered. It is therefore difficult to raise urban risk assessment from the "point" level of individual hazards to the "face" of the region. As a set-valued fuzzy mathematical method, information diffusion theory is commonly used for the risk assessment of small sample systems. In fuzzy set theory, the probability distribution can be regarded as a mapping from events to probability values. Accordingly, a single sample point can be processed by fuzzy mathematics to establish an information diffusion model. The relationship among variables is constructed via the diffusion function, and the incomplete information is then appropriately expanded to compensate for information insufficiency in small sample size problems. The resulting probability is employed as a risk measure to evaluate the risk level or vulnerability of the hazard. Generally, information diffusion can be implemented in two ways: (1) multisource information is distributed at different control points; (2) multiple information universes are expanded to obtain the fuzzy relationships of the system. In this study, the former is adopted to construct a risk assessment model of urban hazards. The samples of urban risk indicators are regarded as incomplete sample sets. By establishing the universe of each single-valued hazard sample, the information distributions at different risk levels are computed using information diffusion theory. In this way, an urban regional risk assessment model based on information diffusion and maximum entropy can be established, as shown in Figure 1.
Step 1. Determine the regional risk assessment index system. Let $U = \{u_1, u_2, \ldots, u_m\}$ and $X = \{x_1, x_2, \ldots, x_n\}$ be the universe and sample set of urban risk indicators, respectively.
Step 2. Compute the diffusion coefficient $h$ with the aid of entropy. From the sample set collected in Step 1, the minimum $a$, the maximum $b$, and the sample size $n$ of each indicator $X$ are determined. Then, the entropy $H$ can be derived from equation (9). Via equations (10) and (11), $h$ can be computed.
Step 3. Construct the normalized information distribution formula $\mu_{x_i}(u_j)$. Substitute $h$ into equation (1) to establish a one-dimensional normal diffusion function. With the aid of equations (3) and (4), $\mu_{x_i}(u_j)$ can be defined, which transforms the single-valued sample point $x_i$ into a fuzzy subset.
Step 4. Assess the regional risk for urban public safety. Estimate the probability $p(u_j)$ and the exceeding probability $P(u_j)$ associated with all risk indicators via equations (5) and (6).
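Step 2 is the only step that differs from the traditional procedure, so it is worth sketching on its own. The Python fragment below computes both coefficients for one indicator. It assumes the commonly quoted empirical constants for the traditional coefficient and the closed form $h = n(b-a)/(\sqrt{2\pi e}\,(n-1))$ for the entropy-modified one. The function names and the sample values are hypothetical.

```python
import math

# Empirical multipliers of (b - a) for small samples.
# Assumed values as commonly quoted in the literature; verify against [19].
_SMALL_SAMPLE_FACTORS = {
    5: 0.8146, 6: 0.5690, 7: 0.4560,
    8: 0.3860, 9: 0.3362, 10: 0.2986,
}


def h_traditional(a, b, n):
    """Traditional empirical diffusion coefficient h_0 (assumed form of eq. (2))."""
    if n < 5:
        raise ValueError("the empirical formula is tabulated for n >= 5 only")
    if n in _SMALL_SAMPLE_FACTORS:
        return _SMALL_SAMPLE_FACTORS[n] * (b - a)
    return 2.6851 * (b - a) / (n - 1)


def h_entropy(a, b, n):
    """Entropy-modified diffusion coefficient h_e (assumed form of eq. (11))."""
    if n < 2:
        raise ValueError("at least two samples are needed")
    sigma = n / math.sqrt(2.0 * math.pi * math.e)   # from ln(sigma*sqrt(2*pi*e)) = ln(n)
    delta_n = (b - a) / (n - 1)                     # average width
    return sigma * delta_n


# Hypothetical indicator with 17 annual observations, for illustration only.
sample = [150, 140, 120, 130, 110, 100, 105, 95, 90,
          100, 85, 80, 90, 70, 75, 60, 65]
a, b, n = min(sample), max(sample), len(sample)
h_0 = h_traditional(a, b, n)
h_e = h_entropy(a, b, n)
```

For samples of this size the entropy-modified coefficient is larger than the traditional one, so each observation is diffused over a wider range of risk levels.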

Case Study
5.1. Database. Urban public safety risk assessment deals with a large and complex system and should cover the elements of urban security as fully as possible. To validate the performance of the proposed method, the statistical data of fatal accidents in Hangzhou city, China, were selected as test samples [40]. Hangzhou is an international tourist city and a famous national historic and cultural city. It has a total area of 16850 square kilometers and a permanent population of 12.204 million. In 2022, Hangzhou achieved a GDP of 1875.3 billion Yuan. Recently, more and more international conferences and events have been held there, such as the G20 summit and the Asian Games. It is becoming a megalopolis of China and faces public safety risks far beyond those of small and medium-sized cities. Owing to the rapid growth of the economy, city size, population, and traffic density, the urban safety risks in Hangzhou are becoming increasingly severe. As commonly used indicators in China, the death numbers of industrial and mining accidents (IMA), road traffic accidents (RTA), water traffic accidents (WTA), and fishing vessel transportation and fishing accidents (FVTFA), together with the mortality per hundred million GDP (MHM-GDP), from 2005 to 2021 were studied to assess the safety production risk in the Hangzhou region. The original data for each sample are shown in Table 1.
Based on the proposed urban risk assessment model, the established sample $X$ could be appropriately expanded to the discrete domain $U$. With the aid of MATLAB, the fuzzy membership functions corresponding to each indicator could be obtained as follows: ...

Figure 1: Framework of the regional risk assessment model for urban safety based on information diffusion combined with entropy.
According to sample $X_1$, the probability and exceeding probability of IMA at different risk levels were calculated by using (5) and (6), as shown in Table 3. Similarly, the corresponding results for RTA, WTA, FVTFA, and MHM-GDP were determined from the samples $X_2$, $X_3$, $X_4$, and $X_5$, as shown in Tables 4-7.

5.2. Discussion.
In the past 20 years, the regional risk situation of Hangzhou city has improved significantly, as can be clearly observed in Table 1. The overall risk level of Hangzhou city showed a downward trend over time. However, the proportions of different hazards differed significantly from each other, as illustrated in Figure 2. Of the five indicators, the risk related to IMA was medium but showed a fluctuating trend. The risk related to RTA was high but showed a downward trend. IMA and RTA were still the areas with high incidences of casualties and property losses, accounting for 11.15% and 87.64%, respectively. In contrast, the risk related to WTA and FVTFA was lower and showed a declining trend. Hazard prevention for these two categories was remarkably effective: annual deaths had fallen to single digits or zero. They are expected to reach the ultimate goal of zero risk through targeted policies and technical safety measures in the future.
It is generally recognized that the diffusion coefficient is crucial to the performance of information diffusion. In this study, entropy theory was employed to modify the traditional diffusion coefficient. Let $H_e$ and $H_0$ be the entropy values of the information diffusion estimate under $h_e$ and $h_0$, respectively. Table 8 shows the results of $H_e$ and $H_0$ derived from the two strategies. For every indicator $X$, $H_0 < H_e$. This indicated that the information diffusion result computed with $h_e$ was more probable than that calculated with $h_0$. The entropy-modified coefficient was therefore more reliable and reasonable.
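The comparison of $H_e$ and $H_0$ can be reproduced in outline as follows. The sketch assumes that the entropy in question is the Shannon entropy of the estimated probabilities $p(u_j)$, and it uses hypothetical data rather than the values in Table 1. The helper names are illustrative, and the two coefficient formulas are the assumed forms of equations (2) and (11).

```python
import math


def diffusion_probabilities(samples, universe, h):
    """Estimated probabilities p(u_j) from normal information diffusion.

    The factor 1 / (h * sqrt(2 * pi)) is omitted because it cancels
    when each sample's diffused information is normalized.
    """
    q = [0.0] * len(universe)
    for x in samples:
        f = [math.exp(-((x - u) ** 2) / (2.0 * h * h)) for u in universe]
        c = sum(f)
        for j, fj in enumerate(f):
            q[j] += fj / c
    total = sum(q)
    return [qj / total for qj in q]


def shannon_entropy(p):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(pj * math.log(pj) for pj in p if pj > 0.0)


# Hypothetical indicator with 17 annual observations, for illustration only.
samples = [150, 140, 120, 130, 110, 100, 105, 95, 90,
           100, 85, 80, 90, 70, 75, 60, 65]
universe = [20.0 * k for k in range(11)]          # risk levels 0, 20, ..., 200
a, b, n = min(samples), max(samples), len(samples)

h_0 = 2.6851 * (b - a) / (n - 1)                                   # traditional, n >= 11
h_e = n * (b - a) / (math.sqrt(2.0 * math.pi * math.e) * (n - 1))  # entropy-modified

H_0 = shannon_entropy(diffusion_probabilities(samples, universe, h_0))
H_e = shannon_entropy(diffusion_probabilities(samples, universe, h_e))
```

Because $h_e > h_0$ for this sample size, the estimate under $h_e$ is spread over more risk levels and its entropy is larger, which is the ordering $H_0 < H_e$ discussed above.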
The proposed information diffusion method combined with entropy could successfully estimate the urban regional risk. It can be seen from Tables 3-7 that the exceeding probability $P(u)$ associated with the different hazards decreased as the risk level increased. However, the corresponding risk probability $p(u)$ first increased and then decreased, indicating that there is a concentration area of risk levels for each hazard. Figure 3 shows the risk probability distribution curves of the five hazards, plotted using MATLAB 7.0. For comparison, the estimation results derived from the traditional diffusion coefficient are also given. The two sets of regional risk assessment results clearly differed. For IMA and FVTFA, the highest risk levels associated with $h_e$ and $h_0$ were consistent, but there was a slight deviation in the probability $p$. For RTA, WTA, and MHM-GDP, both the highest risk levels and their probabilities $p$ were different. This again demonstrated the importance of a correct diffusion coefficient. An improperly selected diffusion coefficient would lead to a large deviation in the estimated urban regional risk, which would in turn affect the formulation of policies and measures for urban risk prevention and control. Therefore, it was necessary to optimize the diffusion coefficient in this study.
As shown in Figure 3(a), the risk level of 100 deaths associated with IMA in Hangzhou city had the highest $p$ value, namely 9.23%. As the risk level rose beyond 100, the corresponding probabilities showed a significant downward tendency. The risk level of more than 80 deaths occurred with high probability (P > 64.64%), about every 1.5 years. This also meant that 80 deaths would occur almost every 3 years in the future. By contrast, the probability of more than 160 deaths was small (P < 14.35%), occurring about every 7 years. The peak value of the death number associated with RTA was 700 (p = 5.47%), as shown in Figure 3(b). The risk level of more than 400 deaths occurred with high probability (P > 76.06%), about every 1.3 years. By comparison, the probability of more than 1000 deaths was small (P < 17.09%); this level was reached only in 2005. Figure 3(c) shows that the peak value of the death number associated with WTA was 2 (p = 12.18%). The risk level of more than 1 death occurred with high probability (P > 89.42%), about every 1.1 years, whereas that of more than 8 deaths was small (P < 19.02%), occurring about every 5.3 years. FVTFA showed the lowest risk, with the probability of zero deaths reaching 70.59%, as displayed in Figure 3(d). More than 1 death per year occurred about every 2-4 years. As for MHM-GDP, the peak value was 0.08 (p = 6.91%). Figure 3(e) shows that a value of more than 0.32 per year occurred with low probability (P < 14.37%), about every 7 years. From 2014 onward, MHM-GDP was less than 0.08 and declined sharply. Overall, compared with the original data of urban hazards in Table 1, the conclusions drawn by the proposed information diffusion method were basically consistent with reality.
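The recurrence intervals quoted in this discussion are consistent with taking the reciprocal of the annual exceeding probability, $T = 1/P$. The text does not state this relation explicitly, so the short check below treats it as an assumption. It uses only the $P$ bounds quoted above, and the function and dictionary names are illustrative.

```python
def return_period(exceeding_probability):
    """Average recurrence interval in years, assuming T = 1 / P for an annual P."""
    if not 0.0 < exceeding_probability <= 1.0:
        raise ValueError("exceeding probability must lie in (0, 1]")
    return 1.0 / exceeding_probability


# Exceeding probabilities quoted in the discussion (as fractions).
quoted = {
    "IMA, more than 80 deaths": 0.6464,
    "IMA, more than 160 deaths": 0.1435,
    "RTA, more than 400 deaths": 0.7606,
    "WTA, more than 1 death": 0.8942,
    "WTA, more than 8 deaths": 0.1902,
    "MHM-GDP, more than 0.32": 0.1437,
}

# Recurrence intervals rounded to one decimal place.
periods = {label: round(return_period(P), 1) for label, P in quoted.items()}
```

Under this assumption the computed intervals are about 1.5, 7.0, 1.3, 1.1, 5.3, and 7.0 years, matching the "about every 1.5 years", "about every 7 years", and similar figures given in the text.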
Based on the above analysis, a centralized distribution interval of the risk level frequency was found for each urban hazard, denoted by S in this study. The corresponding results for the 5 urban hazards are shown in Table 9. Urban risks in S were more likely to happen and had higher vulnerability. In line with the analysis above, IMA and RTA were the main urban hazards with relatively higher S values. This also illustrated that urban regional risks are inevitable during the rapid development of society. However, effective countermeasures can be adopted not only to reduce the likelihood of hazards but also to prevent dangerous events. Therefore, based on the risk situation and the risk development trend, appropriate measures can be taken to reduce the risk. It is suggested that Hangzhou should strengthen the safety supervision of IMA and RTA in the future. Overall, the information diffusion method was easy to carry out and capable of dealing with incomplete information events with high accuracy. It can provide guidance for the government's urban safety management and policymaking. For different hazards and risk levels, safety management measures can be formulated according to the actual state by interpreting the estimated probabilities, so as to continuously improve the level of urban safety management.
Discrete Dynamics in Nature and Society

Conclusion
In this study, information diffusion theory was introduced to assess regional risk for urban public safety. Meanwhile, entropy theory was utilized to modify the diffusion coefficient to reduce the uncertainty. A framework for an urban regional risk assessment model based on information diffusion and entropy was established. The regional risk of urban public safety in Hangzhou city was studied by using the proposed method. The main conclusions can be drawn as follows.
(1) The diffusion coefficient was crucial to the performance of information diffusion. The information diffusion results derived from the entropy-modified diffusion coefficient showed less uncertainty and randomness than those of the traditional method. This could reduce the estimation bias of urban regional risk and contribute to the formulation of policies and measures for risk prevention and control. With the aid of the modified method, the urban regional risk of Hangzhou city in China was successfully estimated.
(2) In Hangzhou city, the peak risk levels of IMA, RTA, WTA, FVTFA, and MHM-GDP were 100 deaths, 700 deaths, 2 deaths, 0 deaths, and 0.08 deaths, respectively, which were basically consistent with reality. By comparison, the hazards with respect to IMA and RTA were extremely serious. More than 80 deaths of IMA would occur almost every 3 years, and more than 400 deaths of RTA would occur almost every 2.6 years.
(3) Centralized intervals of the risk level associated with the five hazards in Hangzhou city were found. Urban risks in such intervals were more likely to happen and had higher vulnerability, occurring almost every 1-2 years. Effective countermeasures could be formulated according to the actual state by interpreting the estimated probabilities, so as to continuously improve the level of urban safety management.

Figure 3: Death risk estimated value of urban major hazards: $p_0$ and $P_0$ represent the probability and exceeding probability computed by $h_0$; $p_e$ and $P_e$ are the probability and exceeding probability computed by $h_e$. (a) IMA. (b) RTA. (c) WTA. (d) FVTFA. (e) MHM-GDP.

Table 1: Safety production risk indices of Hangzhou from 2005 to 2021.

Table 2: Diffusion coefficients $h_e$ and $h_0$ obtained by using different methods.

Table 3: Risk assessment of annual fatalities of IMA.

Table 4: Risk assessment of annual fatalities of RTA.

Table 5: Risk assessment of annual fatalities of WTA.

Table 6: Risk assessment of annual fatalities of FVTFA.

Table 7: Risk assessment of annual fatalities of MHM-GDP.

Table 8: Entropies $H_e$ and $H_0$ obtained by using different diffusion coefficients.

Table 9: Concentrated distribution area of regional risk.