A Weighted Multi-Local-World Network Evolving Model and Its Application in Software Network Modeling

The phenomenon of local worlds (also known as communities) exists in numerous real-life networks, for example, computer networks and social networks. We propose the Weighted Multi-Local-World (WMLW) network evolving model, taking into account (1) the dense links between nodes in a local world, (2) the sparse links between nodes from different local worlds, and (3) the different importance of intra-local-world links and inter-local-world links. In terms of topology evolution, new local worlds and new links between existing local worlds are added to the network, while new nodes and links are added to existing local worlds. In terms of the weighting mechanism, the weights of links within a local world and the weights of links between different local worlds are given different meanings. It is theoretically proven that the strength distribution of the network generated by the WMLW model follows a power-law distribution. Simulations confirm the theoretical results. Meanwhile, the degree distribution also follows a power-law distribution. Analysis and simulation results show that the proposed WMLW model can be used to model the evolution of class diagrams of software systems.


Introduction
Networked structures appear in a wide range of complex systems, such as the Internet, traffic systems, and scientific collaboration relationships. Since the small-world and scale-free network models were proposed [1,2], researchers have made tremendous progress in network modeling [3][4][5].
In 2003, Li and Chen proposed a local-world evolving network model, which represents a transition between power-law and exponential scaling [6]. The local-world structure exists in various real-life complex networks and has been adopted in many studies [7,8], allowing more real-life complex networks to be described and understood better. Based on the aforementioned local-world evolving network model, Pan et al. proposed two generalized local-world models for unweighted and weighted complex networks, respectively [8]. Li et al. proposed a weighted local-world evolving network model based on an edge-weight preferential selection mechanism [9]. However, in the aforementioned models, the nodes composing a local world were selected randomly from the network and the number of nodes in a local world was fixed, which did not comply well with real-life complex networks.
In 2005, Chen et al. proposed a Multi-Local-World (MLW) model [10], which aims at modeling the Internet and emphasizes the multi-local-world property of its structure. A comparison demonstrated that the MLW model described the Internet topology better than other models, such as the BA model, in the sense that its basic network properties (e.g., mean degree and clustering coefficient) are closer to those of the real Internet topology. In the MLW model, local worlds are relatively stable in the sense that a local world does not consist of nodes selected at random from the network when new nodes are added to it.
The aforementioned network models did not take into account the difference between a link within a local world and a link across two local worlds. In real-life cases, an inter-local-world link tends to play a more important role than an intra-local-world link. For instance, in a computer network, a link between two nodes from two subnetworks is responsible for the communication between the two subnetworks. In addition, nodes differ in their capacity, which can be represented by their strengths.
Inspired by the MLW model, we extend it and propose a Weighted Multi-Local-World (WMLW) model in which a weight mechanism is introduced. To our knowledge, there has been no study of MLW that takes into account the edge weight and node strength. In fact, real-life complex systems would be better reflected if the edge weight and node strength were taken into consideration. The proposed WMLW model does not aim at modeling the Internet, but it should be feasible to use it to model more real-life complex networks, for example, software systems.
There have been some studies on diverse aspects of software systems from the perspective of complex networks. Qu et al. explored the community structure of software call graphs and its applications in class cohesion measurement [11]. Concas et al. investigated software quality and community structure in Java software networks [12]. A complex network approach was used to study software dependency network evolution [13,14]. Chong and Lee used weighted complex networks with graph theory analysis to automate the derivation of clustering constraints from object-oriented software [15]. Joblin et al. investigated the evolutionary trends of developer coordination using a network approach [16]. Ding et al. identified key classes in weighted software networks [17]. In this paper, we apply the WMLW model to modeling the evolution of object-oriented software networks, in which a node is a class and an edge between two nodes is a dependency link between two classes.
The remainder of this paper is organized as follows. We propose the WMLW model in Section 2. In Section 3, we analytically study the strength distribution of the WMLW model. We report numerical simulations in Section 4 and the application of the WMLW model to software systems in Section 5. Finally, we conclude our work and outline future work in Section 6.

Weighted Multi-Local-World Model
We first introduce several key concepts on the WMLW model and then propose two preferential attachment rules to construct the WMLW model and finally describe the WMLW model in detail.

Key Concepts on the WMLW Model.
A weighted network is characterized by an adjacency matrix $W$, whose element $w_{ij}$ denotes the weight of the edge between nodes $i$ and $j$, where $i, j = 1, 2, \ldots, N$, and $N$ is the total number of nodes of the network.
Here, we restrict our interest to undirected networks in which weights of edges are symmetric ($w_{ij} = w_{ji}$) and assume that $w_{ii} = 0$. In our model, when a new edge is added between nodes $i$ and $j$ that belong to the same local world, $w_{ij}$ is set to the intra-local-world weight; otherwise, when a new edge is added between nodes $i$ and $j$ from different local worlds, $w_{ij}$ is set to the inter-local-world weight.
Naturally, as the generalization of the degree $k_i$ of node $i$, we define the strength of node $i$ as $s_i = \sum_{j \in \Gamma(i)} w_{ij}$, where $\Gamma(i)$ denotes the set of neighbors of node $i$. The strength of a node integrates the information about its connectivity and the weights of its links [18,19]. It reflects the power and importance of the node: a larger strength indicates a more important node.
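The definition of node strength can be illustrated with a minimal sketch. The function name `node_strengths`, the edge-list layout, and the example weights (1.0 inside a local world, 5.0 across local worlds) are assumptions chosen for illustration, not values from the paper.

```python
from collections import defaultdict


def node_strengths(weighted_edges):
    """Return {node: strength} for an undirected weighted edge list.

    Each edge is a tuple (i, j, w_ij); the strength of a node is the
    sum of the weights of the edges incident to it.
    """
    strength = defaultdict(float)
    for i, j, w in weighted_edges:
        if i == j:
            continue  # w_ii = 0: self-loops contribute nothing
        strength[i] += w
        strength[j] += w
    return dict(strength)


# Hypothetical example: a-b and b-c lie inside one local world (weight 1.0),
# while c-d crosses two local worlds (weight 5.0).
edges = [("a", "b", 1.0), ("b", "c", 1.0), ("c", "d", 5.0)]
strengths = node_strengths(edges)
```

In this example node `c` has degree 2 but a larger strength than node `b`, which also has degree 2, because one of its links crosses local worlds.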

Probability of Selecting a Local World following the Scale Preferential Rule.
The scale of local world $\Omega$ is denoted by $S_\Omega$; the total number of local worlds is denoted by $m$. $\alpha$ is a constant that represents the "attractiveness" of local world $\Omega$, and it is used to govern the probability for "young" local worlds to get new links [10]. In the WMLW model, the probability with which local world $\Omega$ is selected is described by (1). If the constant $\alpha = 0$, (1) reduces to
$$\Pi(\Omega) = \frac{S_\Omega}{\sum_{j=1}^{m} S_j}. \tag{2}$$
For the sake of simplicity, we consider $\alpha = 0$ (i.e., (2)) in the rest of this paper.
In (2), $\sum_{j=1}^{m} S_j$ denotes the sum of the scales of the local worlds in the network, that is, the total number of nodes in the network.
In fact, it is more reasonable to select a local world following the scale preferential rule than to select a local world randomly. In many real cases, for example, software systems, the Economic Trade Web, and the Internet, the scales of different local worlds are not equal. The larger the scale of a local world is, the more attractive it is; that is, it also obeys the rule of "the rich get richer."
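The scale preferential rule with zero attractiveness can be sketched as a weighted random choice in which each local world is drawn with probability proportional to its number of nodes. The function names and the example scales below are assumptions for illustration.

```python
import random


def selection_probabilities(scales):
    """Probability of picking each local world: its scale over the total scale."""
    total = sum(scales)
    return [s / total for s in scales]


def select_local_world(scales, rng):
    """Draw one local-world index with probability proportional to its scale."""
    return rng.choices(range(len(scales)), weights=scales, k=1)[0]


# Hypothetical scales of three local worlds (numbers of nodes).
scales = [10, 30, 60]
probs = selection_probabilities(scales)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
counts = [0, 0, 0]
for _ in range(10000):
    counts[select_local_world(scales, rng)] += 1
```

Over many draws the largest local world is selected most often, which is the "rich get richer" behavior described above.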

Algorithm of the WMLW Model
Step 1 (initial condition). Start with an initial network containing $m_1$ local worlds, each of which is supposed to have $n_1$ nodes and $e_1$ links, and every edge is assigned the intra-local-world weight.
Step 2. At each step, perform one of the following four operations, chosen at random with the given probabilities:
(i) With probability $p_1$, a new local world is created, which contains $n_1$ nodes and $e_1$ links.
(ii) With probability $p_2$, a new node is added to an existing local world, with $m_2$ links connecting it to nodes within the same local world. The local world is selected with the probability given by (2), and then a node in the local world is chosen with the probability given by (3). Every newly added edge is assigned the intra-local-world weight. This process is repeated $m_2$ times.
(iii) With probability $p_3$, $m_3$ new links are added to a chosen local world. A local world is selected with the probability given by (2), and then one end of a link is selected with the probability given by (3), while the other end of the link is chosen randomly. Every newly added edge is assigned the intra-local-world weight. This process is repeated $m_3$ times.
(iv) With probability $p_4$, a local world is selected and $m_4$ new links are added from it to other existing local worlds. A local world ($\Omega_1$) is selected with the probability given by (2). Then, each link is added according to the following process: a node is selected from $\Omega_1$ with the probability given by (3), and this node acts as one end of the link; the other end of the link is a node selected with the probability given by (3) from another local world ($\Omega_2$) chosen at random; and every newly added edge is assigned the inter-local-world weight. This process is repeated $m_4$ times.
Step 3. Repeat Step 2 until the total number of the nodes reaches the number given in advance.
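The steps above can be sketched as a small simulation. This is a simplified illustration under stated assumptions, not the authors' implementation: the names `wmlw`, `w_in`, `w_out`, `n_init`, and `n_worlds` are hypothetical, each initial local world is wired as a simple path, repeated edges between the same pair of nodes are allowed, and self-loops are skipped.

```python
import random


def wmlw(n_target, p, m_links, n_worlds=3, n_init=3,
         w_in=1.0, w_out=5.0, seed=0):
    """Grow a WMLW-style network until it has at least n_target nodes.

    p = (p1, p2, p3, p4) are the operation probabilities and
    m_links = (m2, m3, m4) the link counts of operations (ii)-(iv).
    Assumes n_init >= 2, m2 >= 1, and p1 + p2 > 0 so that growth terminates.
    Returns (worlds, strength, edges).
    """
    rng = random.Random(seed)
    p1, p2, p3, _p4 = p
    m2, m3, m4 = m_links
    worlds = []    # worlds[k] = node ids in local world k
    strength = []  # strength[i] = sum of edge weights at node i
    edges = []     # (i, j, weight)

    def add_edge(i, j, w):
        edges.append((i, j, w))
        strength[i] += w
        strength[j] += w

    def new_world():
        nodes = list(range(len(strength), len(strength) + n_init))
        strength.extend([0.0] * n_init)
        worlds.append(nodes)
        for a, b in zip(nodes, nodes[1:]):  # assumed initial wiring: a path
            add_edge(a, b, w_in)

    def pick_world():  # scale preferential rule
        sizes = [len(w) for w in worlds]
        return rng.choices(range(len(worlds)), weights=sizes, k=1)[0]

    def pick_node(k):  # strength preferential rule inside world k
        nodes = worlds[k]
        return rng.choices(nodes, weights=[strength[i] for i in nodes], k=1)[0]

    for _ in range(n_worlds):
        new_world()
    while len(strength) < n_target:
        r = rng.random()
        if r < p1:                      # (i) create a new local world
            new_world()
        elif r < p1 + p2:               # (ii) add a node with m2 links
            k = pick_world()
            targets = [pick_node(k) for _ in range(m2)]
            new = len(strength)
            strength.append(0.0)
            worlds[k].append(new)
            for t in targets:
                add_edge(new, t, w_in)
        elif r < p1 + p2 + p3:          # (iii) add m3 links inside a world
            k = pick_world()
            for _ in range(m3):
                a, b = pick_node(k), rng.choice(worlds[k])
                if a != b:
                    add_edge(a, b, w_in)
        elif len(worlds) > 1:           # (iv) add m4 links across worlds
            k1 = pick_world()
            others = [k for k in range(len(worlds)) if k != k1]
            for _ in range(m4):
                k2 = rng.choice(others)
                add_edge(pick_node(k1), pick_node(k2), w_out)
    return worlds, strength, edges


worlds, strength, edges = wmlw(200, p=(0.0, 0.5, 0.3, 0.2), m_links=(2, 2, 1))
```

Setting the first probability to zero, as in this call, keeps the number of local worlds fixed at its initial value.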

Strength Distribution of the WMLW Model
Using the mean-field theory [21,22], one can analyze how the strength $s_i$ of node $i$ in local world $\Omega$ evolves, treating the four operations separately as follows.
(i) With probability $p_1$, create a new local world. In this case, the strength $s_i$ of node $i$ in an existing local world $\Omega$ does not change over time, since the original nodes in the newly created local world have no links with any nodes in existing local worlds. Thus, the contribution of this operation to $\partial s_i / \partial t$ is zero, which is (4).
(ii) With probability $p_2$, a new node is added to an existing local world $\Omega$, and this node has $m_2$ links connected to nodes within the same local world. Every newly added edge is assigned the intra-local-world weight. In the resulting contribution (5), $S_\Omega / \sum_{j=1}^{m} S_j$ is the probability that local world $\Omega$ is selected according to (2), and $s_i / W_\Omega$ is the probability that node $i$ is selected according to the strength preferential attachment given by (3), where $W_\Omega$ represents the total strength of all nodes in local world $\Omega$. There are $m_2$ links between the newly added node and existing nodes in local world $\Omega$, each carrying the intra-local-world weight, and thus the coefficient is $p_2 m_2$ multiplied by that weight.
(iii) With probability $p_3$, $m_3$ links are added to a chosen local world $\Omega$. Each newly added edge is assigned the intra-local-world weight. On the right-hand side of the resulting contribution (6), local world $\Omega$ is selected according to the probability given by (2), $s_i / W_\Omega$ represents the preferential selection within local world $\Omega$, and $1/S_\Omega$ represents the random selection of node $i$ within the same local world $\Omega$.
(iv) With probability $p_4$, a selected local world ($\Omega$) has $m_4$ links connected to other existing local worlds. Each newly added edge is assigned the inter-local-world weight. In the first term on the right-hand side of the resulting contribution (7), $S_\Omega / \sum_{j=1}^{m} S_j$ is the probability given by (2) with which we select local world $\Omega$. In the second term on the right-hand side of (7), another local world is chosen at random (i.e., with the probability of $1/(m_1 + p_1 t)$, where $m_1$ is the initial number of local worlds), and $W_{\text{random-local}}$ represents the total strength of all nodes in the randomly chosen local world. Both ends of a link are selected by (3) in their own local worlds.
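The four contributions described in (i)-(iv) can be written out as a sketch. The notation is assumed for illustration rather than taken from the source: $w_{\mathrm{in}}$ and $w_{\mathrm{out}}$ stand for the intra- and inter-local-world weights, $S_\Omega$ for the scale of local world $\Omega$, $W_\Omega$ for its total strength, $m$ for the current number of local worlds, and $m_1$ for their initial number. The second term of the last line is one reading of the description in (iv), in which the randomly chosen local world is the one containing node $i$.

```latex
\begin{align}
\left.\frac{\partial s_i}{\partial t}\right|_{\mathrm{(i)}} &= 0, \\
\left.\frac{\partial s_i}{\partial t}\right|_{\mathrm{(ii)}} &=
  p_2 m_2 w_{\mathrm{in}}\,
  \frac{S_\Omega}{\sum_{j=1}^{m} S_j}\,\frac{s_i}{W_\Omega}, \\
\left.\frac{\partial s_i}{\partial t}\right|_{\mathrm{(iii)}} &=
  p_3 m_3 w_{\mathrm{in}}\,
  \frac{S_\Omega}{\sum_{j=1}^{m} S_j}
  \left(\frac{s_i}{W_\Omega} + \frac{1}{S_\Omega}\right), \\
\left.\frac{\partial s_i}{\partial t}\right|_{\mathrm{(iv)}} &=
  p_4 m_4 w_{\mathrm{out}}
  \left(\frac{S_\Omega}{\sum_{j=1}^{m} S_j}\,\frac{s_i}{W_\Omega}
  + \frac{1}{m_1 + p_1 t}\,\frac{s_i}{W_{\text{random-local}}}\right).
\end{align}
```

Summing the four lines gives a combined rate equation with a part proportional to $s_i$ (the preferential terms) and a part independent of $s_i$ (the random-selection term of (iii)).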
Combining (4)-(7), we obtain the overall rate equation (8). In the following text, $\langle s \rangle_{\Omega\text{-local}}$ denotes the average strength of the nodes in local world $\Omega$, and $\langle s \rangle$ represents the average strength of all nodes in the network. Hence, we have $W_\Omega = S_\Omega \langle s \rangle_{\Omega\text{-local}}$. Since $W_{\text{random-local}}$ denotes the sum of the strengths of all nodes in a randomly selected local world, it can be replaced by its average value, which simplifies (8); a further approximation holds for large $t$. In the following text, we deduce the strength distribution of the network when the probability $p_1 = 0$ (i.e., in the case when the number of local worlds is unchanged). For the case $p_1 \neq 0$, we only give a simulation result in Section 4 for simplicity.
When the probability $p_1 = 0$, the number of local worlds remains at its initial value $m_1$. All local worlds undergo the same operations (ii), (iii), and (iv), though with different probabilities. The different probabilities merely lead to different scales of the local worlds.
We can see from works [6][7][8] that the average degree of the network is independent of the scale of the network. In our model, we can likewise consider that the average strength of every local world is equal for large $t$; that is,
$$\langle s \rangle = \frac{\text{total strength of all nodes}}{\text{total number of nodes}}.$$
Also, $\langle s \rangle_{\Omega\text{-local}}$ is equal to $\langle s \rangle$. Both of them are independent of $t$.
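In the process of the growth of the network, nodes are added at regular time intervals. From the probability density of the time $t_i$ at which node $i$ enters the network, the distribution of $s_i$ can then be derived. For large $t$, the strength distribution is approximately the power law (20), where the two constants, as defined before, are $(p_2 m_2 + p_3 m_3 + 2 p_4 m_4)/(2(p_2 m_2 + p_3 m_3 + p_4 m_4))$ and $p_3 m_3 / p_2$.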
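As a worked illustration of the mean-field step, suppose the combined rate equation takes the generic form below, with constants $A$ and $B$ and initial strength $s_0$ at the entry time $t_i$ (all three names are assumed here, and $t_i$ is taken to be uniformly distributed). The derivation is the standard mean-field argument and is not claimed to reproduce the exact form of (20).

```latex
\begin{align}
\frac{\partial s_i}{\partial t} &= \frac{A s_i + B}{t},
  \qquad s_i(t_i) = s_0, \\
s_i(t) &= \left(s_0 + \frac{B}{A}\right)\left(\frac{t}{t_i}\right)^{A}
  - \frac{B}{A}, \\
P\bigl(s_i(t) < s\bigr) &=
  P\!\left(t_i > t\left(\frac{s_0 + B/A}{s + B/A}\right)^{1/A}\right), \\
P(s) &\propto \left(s + \frac{B}{A}\right)^{-\gamma},
  \qquad \gamma = 1 + \frac{1}{A}.
\end{align}
```

With $A = 1/2$ and $B = 0$ this gives $\gamma = 3$, the exponent of the BA model. If $A$ is identified with the first constant quoted in the text (an assumption), then $A$ lies between $1/2$ and $1$, so $\gamma$ lies between $2$ and $3$.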

Numerical Simulations
We design several numerical simulations to verify our theoretical analysis results. We consider two cases, that is, a given number of local worlds ($p_1 = 0$) and an increasing number of local worlds ($p_1 \neq 0$).

A Given Number of Local Worlds ($p_1 = 0$).
To verify the theoretical result for the strength distribution of the WMLW model, we conducted a numerical simulation in which a network was generated by our WMLW model and the simulation and theoretical results were compared. Figure 1 shows the simulation result. In this simulation, the size of the generated network was set to one million. The details of the parameter settings of the WMLW model for the simulation are described in Figure 1. In Figure 1(a), the simulation and theoretical results are shown together to facilitate the comparison between them. The theoretical data are calculated by (20), and the simulation data were calculated from the generated network. The comparison shows that the simulation data and the theoretical data fit well, which supports the correctness of the theoretical result for the strength distribution of the WMLW model. In addition, Figure 1(b) shows that the degree distribution of the WMLW model also follows a power-law distribution. Hence, the WMLW network is a scale-free network.
There is a deviation between the numerical values and theoretical ones for large nodal strengths in Figure 1(a).This is the so-called "fat tail" phenomenon.The theoretical values are calculated when the number of nodes of the network is infinite.In contrast, for a numerical simulation, the number of nodes of the generated network is finite (one million in this simulation), and thus the phenomenon of "fat tail" happens.
Next, we changed the parameters of the WMLW model and calculated the cumulative strength and degree distributions of the generated networks. Figure 2 shows the cumulative strength and degree distributions when the size of the generated networks is set to 100,000 and one model parameter is set to 5, 10, and 20, respectively. The results show that both the cumulative strength and degree distributions follow power laws.
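A cumulative distribution of the kind discussed above can be computed from a list of node strengths (or degrees) with a short sketch. The function names are assumptions, and the least-squares slope on log-log axes is only a rough estimate of a power-law exponent, used here for illustration.

```python
import math


def ccdf(values):
    """Return [(x, P(X >= x))] for each distinct x in ascending order."""
    xs = sorted(values)
    n = len(xs)
    points = []
    for idx, x in enumerate(xs):
        if idx == 0 or x != xs[idx - 1]:
            points.append((x, (n - idx) / n))
    return points


def loglog_slope(points):
    """Least-squares slope of log P(X >= x) against log x."""
    logs = [(math.log(x), math.log(p)) for x, p in points if x > 0 and p > 0]
    mean_x = sum(lx for lx, _ in logs) / len(logs)
    mean_y = sum(ly for _, ly in logs) / len(logs)
    sxx = sum((lx - mean_x) ** 2 for lx, _ in logs)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in logs)
    return sxy / sxx


# Small hand-checkable example.
example = ccdf([1, 1, 2, 3, 3, 3])

# Synthetic points lying exactly on a power law with cumulative exponent 2.
synthetic = [(x, x ** -2.0) for x in range(1, 51)]
slope = loglog_slope(synthetic)
```

A straight line on log-log axes, with a slope like the one estimated here, is what "follows a power law" means for the cumulative distributions in the figures.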

Application of the WMLW Model in Software Network Modeling
Object-oriented software systems can be described using class diagrams that depict the relationship between classes.
Valverde and Solé found that the small-world and scale-free characteristics also exist in class diagrams of software systems when class diagrams are considered as undirected networks [20].A class diagram can be considered as a network, in which a node is a class and relationships between classes are links between nodes.In this section, we will employ the WMLW model to model the class diagrams of a set of software systems.
In the design of object-oriented software systems, a software system can be decomposed into components. Classes in the same component tend to be closely associated, whereas different components are loosely coupled because they communicate only through a few interfaces. This is the rule of "high cohesion, low coupling" in the software engineering domain. The WMLW model reflects this rule through its dense intra-local-world links and sparse inter-local-world links. Thus, the WMLW model is suitable for modeling the evolution of class diagrams.
For a software system of a relatively small size, let $p_1 = 0$, which means that the total number of local worlds is constant; that is, the architecture is designed in advance and no important function is overlooked before implementation. Thus, the number of components (e.g., packages) will not increase. For a software system of a relatively large size (e.g., more than 1,000 classes), $p_1$ is assigned a small value, which represents the fact that the total number of local worlds of the software system increases slowly. This is reasonable since adding a component is a big change to the software system at the architecture level, and such changes tend not to happen frequently in the development lifecycle of the software system.
We apply the WMLW model to generate networks with network properties similar to those of real-life software systems. By tuning the parameters of the WMLW model to model different software systems, we obtain the results shown in Table 1. Table 1 shows that the modeled network properties are close to those of the real software systems, which indicates that the WMLW model can be applied successfully to class diagram modeling. Table 2 shows the parameters used by the WMLW model in the modeling of each software system. Figure 4 shows the degree distributions and the cumulative degree distributions of the software systems modeled with the WMLW model. Figure 5 shows the strength distributions and the cumulative strength distributions of the software systems modeled with the WMLW model.
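The two structural properties compared between the real and modeled systems, average path length and clustering coefficient, can be computed for a small undirected network as sketched below. The function names are assumptions, and the averaging conventions (mean local clustering with low-degree nodes counted as zero, path lengths averaged over connected pairs) are one common choice that may differ from the one used for Table 1.

```python
from collections import deque


def clustering_coefficient(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        nb = list(nbrs)
        k = len(nb)
        if k < 2:
            continue
        links = sum(1 for a in range(k) for b in range(a + 1, k)
                    if nb[b] in adj[nb[a]])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over connected node pairs, using BFS."""
    dist_sum = 0
    pairs = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        dist_sum += sum(dist.values())
        pairs += len(dist) - 1
    return dist_sum / pairs


# Hypothetical toy class diagram: a triangle of classes plus one pendant class.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
C = clustering_coefficient(adj)
L = average_path_length(adj)
```

For a generated network, the adjacency dictionary can be built from the edge list by ignoring weights and duplicate edges.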

Figure 4: Degree distributions and cumulative degree distributions of the modeled software systems. (a) and (b) show the degree distributions and cumulative degree distributions, respectively.

Table 1: Network properties of the software systems in real and modeling cases. Note. $N$ is the total number of nodes, $E$ is the number of edges, $L$ is the average path length, and $C$ is the clustering coefficient. A subscript on each quantity distinguishes the real case from the modeling case. The real data were reported in [20].

Table 2: Parameters of the WMLW model for the modeling of the software systems.