Robustness in Weighted Networks with Cluster Structure

The vulnerability of complex systems to cascading failures reveals the close interaction between dynamics and network structure. We investigated the effect of cluster structure on cascading failures in three kinds of networks (small-world, scale-free, and modular) whose clustering coefficient is controllable by the random walk method. After analyzing the load-shifting process, we found that betweenness centrality and cluster structure play an important role in the cascading model. Focusing on this point, we studied the properties of cascading failures on model networks with adjustable clustering coefficient and fixed degree distribution. In the proposed weighting strategy, the path length of an edge is defined as the product of the clustering coefficients of its end nodes; the modified betweenness centrality of the edge is then calculated on these path lengths and used as the edge weight in the cascade model. The optimal region of the weighting scheme and the size of the surviving components were investigated by simulating edge-removal attacks under a local redistribution rule based on edge weights. We found that the weighting scheme based on the modified betweenness centrality makes all three networks more robust against edge attacks than the scheme based on the original betweenness centrality.


Introduction
Cascading failure, whether caused by broken components or by congestion, is a pervasive topic in research fields such as power grids, wireless sensor networks (WSN), and the Internet of Things (IoT). Accordingly, networks should be carefully designed to alleviate the losses induced by random failure or intentional attack. Otherwise, the depth of the breakdown can catch us off guard, as in the large blackouts in the history of the United States [1]. All of the above, together with the increasing complexity of future interconnected networks, makes cascading failure a classical topic in the network community.
Eliminating cascading failures in networks is hard work because of the gap between theoretical research and practice. Previous research has produced methods that deal with some problems in the corresponding fields. The CASCADE models based on load redistribution [2][3][4], the deterministic models [5,6], and the probabilistic analytical models [7][8][9] are the three major methodologies in the study of cascading failures. CASCADE models are widely used in many fields. Using the local weighted flow redistribution rule, a novel edge weighting scheme based on betweenness centrality has been proposed to enhance the robustness of the North American power grid, the Internet at the autonomous system level, the railway network of Europe, and the United States airport network [4]. Starting from a percolation model, the authors of [10] investigated the first-order and second-order transition dynamics of the shifting load once a cascading failure takes place. Cascades caused by congestion in vehicular ad hoc networks have also received wide attention. A failure-aware framework has been proposed for handling cascading failures that break out among vertices, which relies on cascade suppression to reduce communication cost [11].
The vulnerability of a power grid is inherent in the properties of its underlying network structure. An analysis of network structure shows that the ultravoltage power transmission network of Iran displays small-world characteristics with an exponential degree distribution, which makes it vulnerable to both random and targeted attacks [12]. The relationship between critical load and topological structure has been studied on scale-free wireless sensor networks with the one-node random attack model. The conclusion drawn is that the robust stability of wireless sensor networks is closely bound up with the degree distribution, the clustering coefficient, and the level of the scale-free property of the network [13,14].
Related studies have shown that cluster structure is a common phenomenon in many real-world networks. Previous work on cluster structure in cascade models has mainly focused on the clustering coefficient of vertices, while the role of intercluster links and the connectivity between clusters have not been widely explored. This motivated the present work. We consider these two ingredients as follows: first, the nodal clustering coefficient is used to capture the cluster structure and to distinguish intracluster links from intercluster links; second, betweenness centrality is used to rank the importance of edges, which captures the nature of the load shifting between a failed edge and its neighbors. Our cascade model achieves these two goals in two steps: first, edges are weighted according to the product of the clustering coefficients of their end nodes, and the betweenness centrality of each edge is calculated on these weights and applied as the final weight; second, the local redistribution rule based on that final weight drives the cascade dynamics. In this way, the load and the capacity of the intercluster edges are scaled up, and the load and the capacity of the intracluster edges are scaled down so that the weights still sum to one. Moreover, by using the clustering coefficient instead of the nodal degree, our weighting scheme detects not only the trap regions built by nodes of large degree but also the traps created by dense arrays of triangles. The details of the cascading failures triggered by the latter kind of trap are presented in the next section.
There are two main attack modes in cascade dynamics: node removal and edge removal. Research on node attacks has explored node weighting schemes based on the degree, clustering coefficient, and betweenness centrality of nodes [15,16]. Edge weighting strategies based on the betweenness centrality of the edge, or on the product of the degrees or of the betweenness centralities of its end nodes, have also made important contributions to the study of network robustness under the edge attack model [2][3][4]. In our scheme, an edge is weighted by its modified betweenness centrality, which is obtained by counting shortest paths in a weighted matrix whose elements are the products of the clustering coefficients of the end nodes. To the best of our knowledge, the same weighting strategy has not appeared in previous research.

Method
In this section, we first discuss the importance ranking, in the load-shifting process, of the edges neighboring a broken edge. Starting from an analysis of electric resistance networks, we conclude that the betweenness centrality of an edge is a reasonable and practicable measure of the priority ranking of shifting flows in complex networks. We then focus on the trap area built by triangle structures and show that cluster structure is an important ingredient to consider in a cascade model. Finally, we propose a weighting scheme that combines betweenness centrality with cluster structure.
Data and research reports on power grid blackouts and congestion cascades suggest that edge-based weighting schemes and flow redistribution rules are a reasonable and practical model of cascading failures [3,8]. We consider a weighted network $G(V, E, A)$ with $N$ nodes, where $V$ is the set of nodes, $E$ is the set of edges, and $A$ is the adjacency matrix. Based on the capacity model, we denote the weight on edge $(i, j)$ by $w_{ij}$; the load factor on each node and the distribution matrix are then given by (1). Considering the Laplacian matrix of the network then yields (2). Since (2) has the same form as Kirchhoff's circuit law, we can conclude that edge weights in our weighting scheme have the same meaning as conductances in electric resistance networks. In this analogy, the nodal quantities in (2) correspond to the charge and the voltage at each node.
From the above analysis, we can extend the rich results on electric resistance networks to our edge-weighted cascade model. Consider the situation where one edge of the graph fails. By the substitution theorem for circuits, the load shifted to the other edges can be introduced as a source term in the Kirchhoff equation at each node, (3), whose terms are the self-conductance of the node, its mutual conductances with the other nodes, and the current sources injected into or extracted from the node. Combining this with (2) gives (4). From (4), we can conclude that when a link fails and another edge lies on a path between the end nodes of the failed link, the change in the voltage difference across that edge, and hence the load shifted to it, is determined only by the network topology and the edge weights; that is, the distribution law of the load depends on the effective resistance of each path. The effective resistance, an important parameter in the study of power grid cascading failures, is defined as the potential difference between two nodes when a unit current flows between them. Although the effective resistance computed from the Moore-Penrose pseudoinverse of the Laplacian is successfully applied to voltage stability in electrical networks, an approximate form combined with other physical factors is better suited to capturing the nature of the load-shifting process in other complex networks, such as WSN, vehicular ad hoc networks, and IoT networks. For example, in a transportation network, a shift in passenger traffic occurs mainly on the first and second shortest alternative routes and rarely on any of the others. Likewise, in a power grid, safety devices and the protective disconnections made by the safety supervisor confine the shifting flows to a limited set of lines rather than the global area. It is therefore necessary for designers to find the most suitable edges for bearing the shifted loads. Following Jorgensen and Pearse [17] and Xiao and Gutman [18], we used some results on electric resistance networks to guide our work on cascade models. Jorgensen and Pearse found that the effective resistance between two nodes is bounded above by the shortest-path distance between them, and Xiao and Gutman found that the effective resistance is proportional to the commute time of a random walk between the two nodes. The definition of the random walk coincides with one assumption of our cascade models, namely, that directed and distributed currents flow between each pair of nodes in the graph, with a unit current injected at the source node and a unit current extracted at the sink node. Together, these conclusions point to the betweenness centrality of graph theory.
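For reference, the standard relations behind this argument can be written out explicitly. The notation is an assumption introduced here for illustration and is not taken from the original equations: $L$ is the weighted Laplacian, $L^{+}$ its Moore-Penrose pseudoinverse, $w_{uv}$ the conductance of edge $(u, v)$, and $d_{ij}$ the shortest-path distance when each edge has length $1/w_{uv}$.

```latex
% Effective resistance from the Laplacian pseudoinverse
R_{ij} = (\mathbf{e}_i - \mathbf{e}_j)^{\top} L^{+} (\mathbf{e}_i - \mathbf{e}_j)
       = L^{+}_{ii} + L^{+}_{jj} - 2 L^{+}_{ij}

% Upper bound: a single shortest path is a series circuit of resistance d_{ij}
R_{ij} \le d_{ij}

% Commute time of the random walk with transition probabilities w_{uv} / \sum_{x} w_{ux}
\kappa_{ij} = \Bigl( \sum_{u} \sum_{v} w_{uv} \Bigr) R_{ij}
```

The commute time and the effective resistance therefore agree up to a constant factor (the total weighted degree of the graph), which is why shortest-path-based measures such as betweenness centrality are a natural approximation of how shifted load spreads.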
The betweenness centrality of an edge indicates how central the edge is in the system. It is, to some degree, more accurate than connectivity, in the sense that it captures more of the dynamic characteristics of the load-shifting process. Edge betweenness centrality, which is applied in many research fields including transport, biology, and social networks, is defined as the number of shortest paths between all pairs of nodes that pass through the given edge. From the viewpoint of the cascade model, the betweenness centrality reveals the priority of an edge for bearing shifted load. The question, then, is which kind of network structure should be used to modify the original betweenness centrality in a general cascade model.
In graph theory, the clustering coefficient quantifies the degree of clustering of the vertices in a network. A vertex's clustering coefficient is the ratio of the number of edges that exist among its neighbors to the maximum number of edges that could possibly exist among them.
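The definition above can be illustrated with a minimal sketch. The function name `local_clustering`, the adjacency-dictionary representation, and the convention of returning zero for nodes with fewer than two neighbors are assumptions made for this example.

```python
from itertools import combinations


def local_clustering(adj, v):
    """Clustering coefficient of node v.

    adj maps each node to the set of its neighbors (undirected, no self-loops).
    """
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumed convention when fewer than two neighbors exist
    # Count the edges that actually exist among the neighbors of v.
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    # Divide by the maximum possible number of such edges, k(k-1)/2.
    return 2.0 * links / (k * (k - 1))


# Triangle 0-1-2 with a pendant node 3 attached to node 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
coeffs = {v: local_clustering(adj, v) for v in adj}
print(coeffs)
```

In this toy graph, nodes 0 and 1 sit entirely inside the triangle, while node 2 also has a neighbor outside it, so its coefficient is lower.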
Figure 1 shows a simple example of a trap region built by a triangle structure. Figures 1(a) and 1(b) have almost the same structure, differing only in the connection made by edge 4. In Figure 1(a), edges 2, 3, and 4 form a triangle. We start the sandbox analogy by letting edges 1 and 5 fail under attack at the beginning. Their load then shifts to their neighboring edges; these flows are represented by dashed lines with arrows. At this moment, edge 2 is at high risk of being overloaded, and another three edges are at low risk. Assume that edge 2 fails under the overload; its load then shifts to the neighboring edges of its two end nodes. These shifting flows are represented by dash-dotted lines with arrows. Edges 3 and 4 are now at the same high risk level. Up to this point, Figures 1(a) and 1(b) do not differ in their risk of global cascading failure. However, if edge 3 or 4 fails under the overload at this moment, the dynamics of the two graphs differ greatly. In Figure 1(a), the edges in the top left area are at high risk of overload, and the possibility of a global cascading failure is high, while in Figure 1(b) the edges in the top left area, such as 6, 7, 8, and even 4 (if the broken edge is 3), remain at low risk.
From the above analysis, we can conclude that edges whose end nodes have a high product of clustering coefficients tend to form a trap area. Once an edge in this area fails, its load is hardly redistributed to the neighboring area but accumulates on some edges within the area, just like heat trapped in a confined space rather than diffusing outwards. The accumulation increases the risk of cascading failures over the whole area. To improve the robustness of the network while capturing the importance ranking of edges in the load-shifting process, which depends on the betweenness centrality, our weighting scheme combines the clustering coefficient and betweenness centrality as follows: we calculate the betweenness centrality using the product of the clustering coefficients as the path length of each edge. This calculation decreases the betweenness centrality of the edges located in a trap area, because of their larger path lengths, and increases the betweenness centrality of the edges outside the trap area. Note that the original betweenness centrality in graph theory is the number of shortest paths passing through the edge with all edge lengths set to one. Our weighting scheme therefore enables the edges that connect a trap area with its neighboring areas to bear more of the redistributed load, thus reducing the risk of cascading failures occurring in a closed structure.
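A minimal sketch of this calculation is given below, using a Brandes-style accumulation with Dijkstra searches. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, nodes are assumed to be integers, and a small offset `eps` is added to every path length so that edges whose end nodes have zero clustering coefficient keep a positive length (the text does not say how such edges are handled).

```python
import heapq
from itertools import combinations


def clustering(adj, v):
    """Local clustering coefficient (0.0 for fewer than two neighbors)."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def edge_betweenness(adj, length):
    """Edge betweenness for positive edge lengths.

    length maps frozenset({u, v}) to the path length of edge (u, v).
    Each unordered node pair contributes once.
    """
    bc = {e: 0.0 for e in length}
    for s in adj:
        dist = {s: 0.0}
        sigma = {v: 0.0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1.0
        preds = {v: [] for v in adj}
        order, done = [], set()
        heap = [(0.0, s)]
        while heap:
            d, v = heapq.heappop(heap)
            if v in done:
                continue                # stale heap entry
            done.add(v)
            order.append(v)
            for w in adj[v]:
                nd = d + length[frozenset((v, w))]
                if w not in dist or nd < dist[w] - 1e-12:
                    dist[w], sigma[w], preds[w] = nd, sigma[v], [v]
                    heapq.heappush(heap, (nd, w))
                elif abs(nd - dist[w]) <= 1e-12 and w not in done:
                    sigma[w] += sigma[v]  # another shortest path of equal length
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):       # accumulate dependencies backwards
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                bc[frozenset((v, w))] += c
                delta[v] += c
    return {e: b / 2.0 for e, b in bc.items()}


def modified_edge_betweenness(adj, eps=0.01):
    """Betweenness with path length = product of end-node clustering + eps."""
    cc = {v: clustering(adj, v) for v in adj}
    length = {frozenset((u, v)): cc[u] * cc[v] + eps for u in adj for v in adj[u]}
    return edge_betweenness(adj, length)


# Triangle 0-1-4 attached to a 4-cycle 0-1-3-2-0.
adj = {0: {1, 2, 4}, 1: {0, 3, 4}, 2: {0, 3}, 3: {1, 2}, 4: {0, 1}}
unit = {frozenset((u, v)): 1.0 for u in adj for v in adj[u]}
original = edge_betweenness(adj, unit)
modified = modified_edge_betweenness(adj)
```

In this toy graph, edge (0, 1) belongs to the triangle and loses its betweenness under the modified lengths, while edge (2, 3) outside the triangle gains, which is the shift described in the text.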
Cluster structure is pervasive in both real-world networks and model networks, and many of them, such as small-world, BA, and modular networks, possess a considerable clustering coefficient. Table 1 shows the correlation between the edge betweenness and the product of the clustering coefficients of the end nodes for the Western States Power Grid of North America, the neural network of C. elegans, and Facebook, whose data can be found in [19,20]. As shown in Table 1, there is a positive correlation between the two quantities as the degree increases. However, the dependence is nonlinear, suggesting that the two parameters carry different meanings in real-world networks.
Our strategy discerns edges with different roles through the product of the clustering coefficients of the end nodes instead of the product of their degrees, for the following reasons. A larger product of the clustering coefficients of the end nodes indicates that the edge belongs to a highly clustered area. Compared with a method that distinguishes edges via the product of the degrees of the end nodes, the proposed strategy can identify links that play critical roles in the network even when their end nodes do not have large degree.
The load redistribution in our model obeys the rule introduced by Wang and Chen [3]. They stipulated that the flow passing through a failed edge is redistributed to the links connected to its end nodes in proportion to their weights; that is, $\Delta F_{im} = F_{ij}\, w_{im} / \bigl(\sum_{a \in \Gamma_i} w_{ia} + \sum_{b \in \Gamma_j} w_{jb}\bigr)$, where $e_{ij}$ denotes the failed link, $F_{ij}$ denotes the load of $e_{ij}$, $w_{im}$ denotes the weight of $e_{im}$, and $\Gamma_i$ and $\Gamma_j$ denote the sets of neighbors of nodes $i$ and $j$, respectively. Each edge also has a maximum capacity $C_{ij}$ that is proportional to its weight (i.e., $C_{ij} = T w_{ij}$). If the load of an edge exceeds its capacity, the edge fails. The network then has to shift the flow to the surviving edges again and again until no more edges fail. We focus on finding the minimum $T$ that ensures that a slight perturbation does not trigger a cascading failure. This minimum is called the critical threshold ($T_c$). The lower the value of $T_c$, the more robust the network is against cascading failures. To quantify the extent of a cascade, we compute the average number of links broken as successive failures occur, following [4]: the global cascade degree $S_N$ is obtained from the failure size caused by each edge and the total number of edges in the network.
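The redistribution rule and the capacity condition can be sketched as a small simulation. This is an illustration under stated assumptions rather than the procedure used for the reported results: the function names are hypothetical, the initial load of an edge is taken to equal its weight, load is redistributed only to surviving edges, and `mean_cascade_size` is a plain average of the failure sizes over all single-edge attacks, which may differ from the normalization in [4].

```python
def cascade_size(edges, weight, T, attacked):
    """Number of edges that fail after `attacked` is removed (attack excluded).

    edges: list of (u, v) tuples; weight: dict edge -> positive weight;
    T: tolerance parameter, so that capacity = T * weight.
    """
    load = dict(weight)                      # assumed: initial load = weight
    cap = {e: T * weight[e] for e in edges}  # capacity proportional to weight
    alive = set(edges)
    alive.discard(attacked)
    queue = [attacked]
    failed = 0
    while queue:
        e = queue.pop(0)
        u, v = e
        # Surviving edges that share an end node with the failed edge.
        nbrs = [f for f in edges if f in alive and (u in f or v in f)]
        total = sum(weight[f] for f in nbrs)
        if total == 0:
            continue                         # nowhere to shift the load
        for f in nbrs:
            load[f] += load[e] * weight[f] / total
        for f in nbrs:
            if load[f] > cap[f]:             # overloaded edges fail in turn
                alive.discard(f)
                queue.append(f)
                failed += 1
    return failed


def mean_cascade_size(edges, weight, T):
    """Average failure size over all single-edge attacks."""
    return sum(cascade_size(edges, weight, T, e) for e in edges) / len(edges)


# Path graph 0-1-2-3 with unit weights.
edges = [(0, 1), (1, 2), (2, 3)]
weight = {e: 1.0 for e in edges}
fragile = mean_cascade_size(edges, weight, T=1.2)
robust = mean_cascade_size(edges, weight, T=2.5)
print(fragile, robust)
```

On this path graph, a tolerance of 1.2 lets every single-edge attack bring down the two remaining edges, whereas a tolerance of 2.5 absorbs every attack. The critical threshold lies between these two values.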

Results and Discussion
In this section, we investigate two different weighting strategies on SW and BA networks with population $N = 1000$ and on modular networks with population $N = 3000$ and five communities.
(1) $w_{ij} = (B^{*}_{ij})^{\theta}$, where $B^{*}_{ij}$ denotes the modified betweenness centrality of edge $e_{ij}$, which is obtained in two steps. First, each edge is assigned an initial weight equal to the product of the clustering coefficients of its end nodes. Then the betweenness centrality of each edge is calculated on these weights and taken as its final weight. The other strategy, used for comparison, is based on the original betweenness centrality.
A graph with small-world properties is produced using the method proposed by Newman and Watts. Initially, a regular ring graph with $N$ nodes is generated, in which each node is attached to $k$ neighbors, $k/2$ on each side. Then remote shortcuts are added by randomly connecting pairs of remote nodes with a given probability. At most one edge is added between any pair of nodes.
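The construction just described can be sketched as follows. The function name and parameters (`newman_watts`, `n`, `k`, `p0`, `seed`) are assumptions for this example, and the loop tests every non-adjacent pair once, which is a direct reading of the description rather than a verified reproduction of the generator used in the experiments.

```python
import random


def newman_watts(n, k, p0, seed=None):
    """Ring lattice with k neighbors per node plus random shortcuts.

    Returns an adjacency dict mapping each node to the set of its neighbors.
    Assumes k is even and k < n.
    """
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Regular ring: each node is linked to k/2 neighbors on each side.
    for v in range(n):
        for d in range(1, k // 2 + 1):
            u = (v + d) % n
            adj[v].add(u)
            adj[u].add(v)
    # Shortcuts: each non-adjacent pair is connected with probability p0.
    # No existing edge is removed, and a pair receives at most one edge.
    for u in range(n):
        for v in range(u + 1, n):
            if v not in adj[u] and rng.random() < p0:
                adj[u].add(v)
                adj[v].add(u)
    return adj


ring = newman_watts(20, 4, 0.0, seed=1)     # no shortcuts: pure ring lattice
dense = newman_watts(20, 4, 1.0, seed=1)    # every shortcut added
sw = newman_watts(20, 4, 0.05, seed=1)      # a few shortcuts
```

Because shortcuts are only added and local edges are never rewired, raising the shortcut probability increases connectivity while leaving the ring lattice intact.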
To generate scale-free and modular networks with adjustable clustering coefficient, we use a random walk method inspired by previous research [21,22]. First, a completely connected core of initial nodes is constructed. Each node is assigned a jump factor drawn from a given distribution. A node is then randomly selected as the starting point of the random walk, and the node reached after a given number of steps is marked. Starting from the latest marked node, if the jump factor is 1, the walk takes one random step to reach a node, and the newly added node is connected to the marked node and to the chosen neighbor. If the jump factor is 0, the walk takes more than one step to arrive at a node, and the newly added node is connected to the marked node and to the arrival node. Throughout the process, when the walk has one step, the marked node and the arrival node are linked, which generates a triangle and thus changes the clustering coefficient of the network, since it is closely related to the number of triangles. When the walk has more than one step, the two nodes are not linked. The growth process and preferential attachment ensure that the resulting network is scale-free. For the modular network, the model begins with several isolated cores, the nodes of each core being completely connected. For each core, one new node is added at each time step. Each new node connects to a number of nodes within the same core and to a number of nodes in other cores; these two numbers depend on the community strength parameter. The end node of each newly added edge is chosen by the random walk method described above in order to control the clustering coefficient.
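A simplified sketch of the random walk growth for the scale-free case is shown below. Several details are assumptions made for illustration and are not confirmed by the text: each node's jump factor is drawn as 1 with probability `p`, a walk of more than one step is fixed at two steps, each new node adds exactly two edges, and the arrival node becomes the next marked node. The modular variant is not covered.

```python
import random


def random_walk_growth(n, m0=3, p=0.5, seed=None):
    """Grow a network of n nodes from a complete core of m0 >= 3 nodes.

    Returns an adjacency dict mapping each node to the set of its neighbors.
    """
    rng = random.Random(seed)
    # Completely connected core.
    adj = {v: {u for u in range(m0) if u != v} for v in range(m0)}
    # Assumed: jump factor is 1 (True) with probability p, else 0 (False).
    jump = {v: rng.random() < p for v in range(m0)}
    marked = rng.randrange(m0)
    for new in range(m0, n):
        first = rng.choice(sorted(adj[marked]))
        if jump[marked]:
            # One step: the arrival node is adjacent to the marked node,
            # so the new node closes a triangle.
            target = first
        else:
            # Two steps, never returning to the marked node itself.
            target = rng.choice(sorted(adj[first] - {marked}))
        adj[new] = {marked, target}
        adj[marked].add(new)
        adj[target].add(new)
        jump[new] = rng.random() < p
        marked = target
    return adj


def count_triangles(adj):
    """Total number of triangles in the graph."""
    return sum(1 for v in adj for a in adj[v] for b in adj[v]
               if a < b and b in adj[a]) // 3


adj_tri = random_walk_growth(200, m0=3, p=1.0, seed=1)   # always one step
adj_walk = random_walk_growth(200, m0=3, p=0.0, seed=1)  # always two steps
print(count_triangles(adj_tri), count_triangles(adj_walk))
```

With `p = 1.0` every new node closes a triangle, so the triangle count grows with the network; lowering `p` reduces the number of triangles without changing the number of edges added per node, which is how the clustering coefficient is tuned independently of the mean degree in this sketch.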
In this way, we obtain networks with adjustable clustering coefficient without changing their other properties, as shown in Figure 2 for population $N = 3000$, mean degree $\langle k \rangle = 6$, and five communities. The clustering coefficient is linearly correlated with the control factor on BA and modular networks, and with the community strength on the modular network as well, while the clustering coefficient of the small-world network has a negative linear correlation with the control factor.
First, we discuss the optimal value of $\theta$ by analyzing how the critical threshold $T_c$ changes; this value is then used in the simulations that follow. Figures 3, 4, and 5 display the simulated relationship between $T_c$ and $\theta$ under our proposed weighting scheme on the three networks, respectively. We found that $\theta \in (0.6, 1)$ yields the optimal $T_c$ across the parameter settings tested on all three networks. With the control factor fixed, the curves for a larger value of the other network parameter lie lower overall. With that parameter fixed, $T_c$ is larger for a larger control factor, especially in the regions where $\theta \in [-1, 0] \cup [1.5, 2]$, on BA and modular networks but not on the SW network. Since capacity minus load is the redundancy of an edge and the load times $T$ is the capacity, the redundancy of an edge depends on its weight. Of the regions $\theta \in [-1, 0]$ and $\theta \in [1.5, 2]$, the shifting load is more likely to be trapped in the cluster structure in the former, and the lower redundancy increases the risk of overload on intracluster links in the latter. In the optimal region, however, the curves change only slightly between settings, especially on the small-world network. We conclude that, by taking advantage of the clustering coefficient and betweenness centrality, our weighting scheme performs well on networks with cluster structure and a heterogeneous distribution of edge betweenness centrality when $\theta \in [0.6, 1]$.
Since the small-world network used in our test is constructed by the NW model, increasing the control factor means that more shortcuts are added to the graph without rewiring the local edges. Increasing it in the SW network therefore raises the connectivity of the whole network and decreases the clustering coefficient. It is thus logical to deduce that boosting network connectivity yields a lower $T_c$, which means stronger robustness against cascading failure. We investigate the size of the cascading failure on three networks with adjustable clustering coefficient: small-world, scale-free, and modular. Figures 6, 7, and 8 compare our weighting scheme based on the modified betweenness centrality with the original one, for varied control factor and fixed $\theta = 1$. These figures clearly show that, on every kind of network, the networks weighted by the modified betweenness centrality are more robust against cascading failure. They also show a common property of the networks weighted by our strategy: the larger the clustering coefficient, the more rapidly $S_N$ decreases at a given $T$. By giving specific attention to the fragile area of the cluster structure, our weighting strategy improves the robustness of the system. In other words, for a given $T$, a network with stronger cluster structure gains more in resistance to random edge attack when weighted by our strategy. There is another interesting phenomenon, however: compared with the scale-free and modular networks, the small-world network weighted by the modified betweenness centrality shows no distinct advantage. The following points explain it. First, for the small-world network, increasing the control factor only adds remote shortcuts and does little to change the clustering coefficient, which reduces the effectiveness of our scheme. More importantly, the growth of the network's connectivity is mainly responsible for the discrepancy between the curves for different control factors. Second, unlike the small-world network, the scale-free network displays great heterogeneity both in nodal degree distribution and in clustering coefficient, which contributes to a steeper decline of $S_N$. The gap between the modified and the original weighting schemes in suppressing cascades widens quickly as $T$ increases within $[1.04, 1.06]$ and vanishes when $T \geq 1.08$. The curve of the cascade size shifts to the right as the control factor increases, under both the original and the modified weighting schemes. The robustness of the modular network weighted by the modified betweenness centrality changes rapidly, as reflected in the increased steepness of the curves in Figure 7. Compared with the scheme based on the original betweenness centrality, the curve of the scheme based on the modified betweenness centrality declines suddenly when $T \in [1.04, 1.05]$.

Conclusions
In this paper, we investigated the vulnerability induced by cascading failures on networks with cluster structure. A number of artificial networks with adjustable clustering coefficients were considered, and two weighting schemes were compared on these networks. Under the local weighted flow redistribution rule, we investigated the optimal region and the size of the removed component under edge-removal attack. We demonstrated that the clustering coefficient of the network is an important property in risk analysis. We found that the weighting scheme based on the modified betweenness centrality gives better robustness than the one based on the original betweenness centrality. We also found that our weighting scheme becomes more effective as the heterogeneity of the graph increases. The curves of $S_N$ imply that the modified weighting scheme can suppress the cascade dynamics at smaller $T$ by taking advantage of the information in the clustering coefficient. This mechanism, combined with special conditions such as a large control factor (e.g., 0.4) on scale-free networks, induces a smoother slope in the curve of the modified weighting scheme than in that of the original weighting scheme. The results presented here focus on three widely studied model networks with small-world or scale-free properties; further topological structures and the role of inter- and intracluster links in modular networks remain to be explored in future work.

Figure 1: A simple example of a trap area built by a triangle structure.

Figure 2: The linear correlation between the clustering coefficient and the control factor.

Figure 3: The critical threshold $T_c$ varies with $\theta$ on the small-world network.

Figure 4: The critical threshold $T_c$ varies with $\theta$ on the BA network.

Figure 5: The critical threshold $T_c$ varies with $\theta$ on the modular network.

Figure 7: Average size of the removed edges $S_N$ as a function of the threshold $T$ on the scale-free network with control factor 0.2 (dashed line) and 0.4 (solid line), $\theta = 1$.

Table 1: Correlation between the edge betweenness and the product of the clustering coefficients of the end nodes in some real-world networks.