The phenomenon of local worlds (also known as communities) exists in numerous real-life networks, for example, computer networks and social networks. We propose the Weighted Multi-Local-World (WMLW) network evolving model, which takes into account (1) the dense links between nodes within a local world, (2) the sparse links between nodes from different local worlds, and (3) the different importance of intra-local-world links and inter-local-world links. For topology evolution, new local worlds and new links between existing local worlds are added to the network, while new nodes and links are added to existing local worlds. For the weighting mechanism, the weights of links within a local world and the weights of links between different local worlds are given different meanings. It is theoretically proven that the strength distribution of a network generated by the WMLW model follows a power law, and simulations agree with the theoretical results. The degree distribution also follows a power law. Analysis and simulation results show that the proposed WMLW model can be used to model the evolution of class diagrams of software systems.
1. Introduction
Networked structures appear in a wide range of complex systems, such as the Internet, traffic systems, and scientific collaboration relationships. Since the small-world and scale-free network models were proposed [1, 2], researchers have made tremendous progress in network modeling [3–5].
In 2003, Li and Chen proposed a local-world evolving network model, which represents a transition between power-law and exponential scaling [6]. Local-world structure exists in various real-life complex networks, and the concept has been adopted in many studies [7, 8] because it allows real-life complex networks to be described and understood better. Based on the aforementioned local-world evolving network model, Pan et al. proposed two generalized local-world models for unweighted and weighted complex networks, respectively [8]. Li et al. proposed a weighted local-world evolving network model based on the edge weights preferential selection mechanism [9]. However, in the aforementioned models, the nodes composing a local world are selected randomly from the network and the number of nodes in a local world is fixed, which does not agree well with real-life complex networks.
In 2005, Chen et al. proposed a Multi-Local-World (MLW) model [10], which aims at modeling the Internet and emphasizes the multi-local-world property of the Internet structure. A comparison demonstrated that the MLW model described the Internet topology best, in the sense that its basic network properties (e.g., mean degree and clustering coefficient) are closer to those of the real Internet topology than those of other models, such as the BA model. In the MLW model, local worlds are relatively stable: when new nodes are added to a local world, that local world is not composed of nodes selected randomly from the network.
The aforementioned network models do not take into account the difference between a link within a local world and a link across two local worlds. In real-life cases, an inter-local-world link tends to play a more important role than an intra-local-world link. For instance, in a computer network, a link between two nodes from two subnetworks is responsible for the communication between the two subnetworks. In addition, nodes differ in their capacity, which can be represented by their strengths.
Inspired by the MLW model, we try to extend the MLW model and propose a Weighted Multi-Local-World (WMLW) model in which a weight mechanism is introduced. To our knowledge, there has been no study about MLW taking into account the edge weight and node strength. In fact, the real-life complex systems would be better reflected if the edge weight and node strength were taken into consideration. The proposed WMLW model does not aim at modeling the Internet, but it should be feasible to model more real-life complex networks, for example, software systems.
There have been some studies on diverse aspects of software systems from the perspective of complex networks. Qu et al. explored the community structure of software call graphs and its applications in class cohesion measurement [11]. Concas et al. investigated software quality and community structure in Java software networks [12]. A complex network approach was used to study software dependency network evolution [13, 14]. Chong and Lee used a weighted complex network with graph theory analysis to automate the derivation of clustering constraints from object-oriented software [15]. Joblin et al. investigated the evolutionary trends of developer coordination using a network approach [16]. Ding et al. identified key classes in weighted software networks [17]. In this paper, we apply the WMLW model to modeling the evolution of object-oriented software networks, in which a node is a class and an edge between two nodes is a dependency link between two classes.
The remainder of this paper is organized as follows. We propose the WMLW model in Section 2. In Section 3, we analytically study the strength distribution of the WMLW model. We report numerical simulations in Section 4 and the application of the WMLW model to software systems in Section 5. Finally, we conclude and outline future work in Section 6.
2. Weighted Multi-Local-World Model
We first introduce several key concepts of the WMLW model, then propose two preferential attachment rules used to construct it, and finally describe the model in detail.
2.1. Key Concepts on the WMLW Model
A weighted network is characterized by an adjacency matrix $W$, whose element $w_{ij}$ denotes the weight of the edge between nodes $i$ and $j$, where $i, j = 1, 2, \ldots, N$, and $N$ is the total number of nodes of the network.
Here, we restrict our interest to undirected networks in which the weights of edges are symmetric ($w_{ij} = w_{ji}$) and assume that $w_{ii} = 0$. In our model, when a new edge is added between nodes $i$ and $j$ that belong to the same local world, we set $w_{ij} = a$; when a new edge is added between nodes $i$ and $j$ from different local worlds, we set $w_{ij} = b$.
Naturally, as the generalization of the degree $k_i$ of node $i$, we define the strength of node $i$ as $s_i = \sum_{j \in \Gamma(i)} w_{ij}$, where $\Gamma(i)$ denotes the set of neighbors of node $i$. The strength of a node integrates the information about its connectivity and the weights of its links [18, 19]. It reflects the power and importance of the node: a larger strength indicates a more important node.
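The definition of node strength can be illustrated with a minimal sketch. The edge-list representation, the function name `node_strengths`, and the toy network below are assumptions made for illustration only; they are not part of the model description.

```python
from collections import defaultdict

def node_strengths(weighted_edges):
    """Return {node: strength} for an undirected weighted edge list.

    Each edge (i, j, w) contributes w to the strength of both endpoints,
    which matches s_i = sum of w_ij over the neighbors j of node i.
    """
    strength = defaultdict(float)
    for i, j, w in weighted_edges:
        strength[i] += w
        strength[j] += w
    return dict(strength)

# Hypothetical toy network: nodes 0, 1, 2 form one local world (weight a),
# and node 3 sits in another local world, linked to node 2 with weight b.
a, b = 1.0, 10.0
edges = [(0, 1, a), (1, 2, a), (0, 2, a), (2, 3, b)]
strengths = node_strengths(edges)
print(strengths)
```

In this toy case node 2 has degree 3 but a strength dominated by its single inter-local-world link, which is the distinction between degree and strength that the model relies on.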
2.2. Preferential Attachment Rules
2.2.1. Probability of Selecting a Local World following the Scale Preferential Rule
The scale of local world $\Omega$ is denoted by $M_\Omega$; the total number of local worlds is denoted by $N_c$. $K$ is a constant that represents the "attractiveness" of local world $\Omega$, and it is used to govern the probability for "young" local worlds to get new links [10]. In the WMLW model, the probability with which local world $\Omega$ is selected is
$$\Pi_{\text{local}}(\Omega) = \frac{M_\Omega + K}{\sum_{j=1}^{N_c} (M_j + K)}. \tag{1}$$
If the constant $K = 0$, we have
$$\Pi_{\text{local}}(\Omega) = \frac{M_\Omega}{\sum_{j=1}^{N_c} M_j}. \tag{2}$$
For the sake of simplicity, we consider $K = 0$ (i.e., (2)) in the rest of this paper.
In (2), $\sum_{j=1}^{N_c} M_j$ denotes the sum of the scales of the local worlds in the network, that is, the total number of nodes in the network.
In fact, it is more reasonable to select a local world following the scale preferential rule than to select one uniformly at random. In many real cases, for example, software systems, the Economic Trade Web, and the Internet, the scales of different local worlds are not equal. The larger the scale of a local world is, the more attractive it is; that is, local worlds also obey the rule of "the rich get richer."
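The scale preferential rule can be sketched in a few lines. This is a minimal illustration, assuming that the scales are held in a plain Python list; the function names are hypothetical, and `K` plays the role of the attractiveness constant from (1).

```python
import random

def local_world_probabilities(scales, K=0.0):
    """Selection probability of each local world, following (1); K = 0 gives (2)."""
    total = sum(m + K for m in scales)
    return [(m + K) / total for m in scales]

def select_local_world(scales, K=0.0, rng=random):
    """Draw the index of one local world with probability proportional to M + K."""
    weights = [m + K for m in scales]
    return rng.choices(range(len(scales)), weights=weights, k=1)[0]

# Hypothetical scales of three local worlds.
scales = [10, 30, 60]
print(local_world_probabilities(scales))          # rule (2)
print(local_world_probabilities(scales, K=20.0))  # rule (1): small worlds gain weight
print(select_local_world(scales, rng=random.Random(0)))
```

A positive `K` flattens the distribution, which is how (1) gives small ("young") local worlds a better chance of receiving new links.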
2.2.2. Probability of Selecting a Node within a Local World following the Strength Preferential Rule
The probability that node $i$ in local world $\Omega$ is chosen is
$$\Pi_{\text{in}}^{\Omega}(i) = \frac{s_i}{\sum_{j \in \Omega} s_j}. \tag{3}$$
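As a quick illustration of rule (3), the following sketch draws nodes from a single local world and counts how often each is selected. The function name, the dictionary-based strength table, and the toy values are assumptions for illustration; the draw requires that at least one node in the local world has positive strength.

```python
import random
from collections import Counter

def select_node_by_strength(world, strength, rng=random):
    """Pick one node of `world` with probability s_i / (sum of s_j over the world)."""
    weights = [strength[v] for v in world]
    return rng.choices(world, weights=weights, k=1)[0]

# Hypothetical local world in which node 2 holds half of the total strength.
world = [0, 1, 2]
strength = {0: 1.0, 1: 2.0, 2: 3.0}
rng = random.Random(0)
draws = Counter(select_node_by_strength(world, strength, rng) for _ in range(6000))
for node in world:
    print(node, draws[node] / 6000)
```

The empirical frequencies should approach 1/6, 1/3, and 1/2 as the number of draws grows.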
2.3. Algorithm of the WMLW Model
Step 1 (initial condition).
Start with an initial network containing $n_1$ local worlds, each of which has $m_1$ nodes and $e_1$ links; every edge is assigned a weight $a$.
Step 2.
At each time step, perform one of the following four operations, chosen with the stated probabilities:
(i) With probability $p_1$, a new local world is created, which contains $m_1$ nodes and $e_1$ links.
(ii) With probability $p_2$, a new node with $m_2$ links is added to an existing local world, and all of its links connect to nodes within that local world. The local world is selected with the probability given by (2), and then a node in the local world is chosen with the probability given by (3) as the other end of a link. Every newly added edge is assigned a weight $a$. The node selection is repeated $m_2$ times.
(iii) With probability $p_3$, $m_3$ new links are added within a chosen local world. The local world is selected with the probability given by (2), and then one end of a link is selected with the probability given by (3), while the other end of the link is chosen uniformly at random from the same local world. Every newly added edge is assigned a weight $a$. This process is repeated $m_3$ times.
(iv) With probability $p_4$, a local world is selected and receives $m_4$ new links to other existing local worlds. The local world ($\Omega_1$) is selected with the probability given by (2). Then, each link is added according to the following process: a node is selected from $\Omega_1$ with the probability given by (3), and this node acts as one end of the link; the other end of the link is a node selected with the probability given by (3) from another local world ($\Omega_2$) chosen at random; and every newly added edge is assigned a weight $b$. This process is repeated $m_4$ times.
Step 3.
Repeat Step 2 until the total number of nodes reaches a number given in advance.
In this model, self-loops and repeated links between the same pair of nodes are not allowed. In addition, the probabilities have to satisfy $0 \le p_1, p_2, p_3, p_4 \le 1$ and $p_1 + p_2 + p_3 + p_4 = 1$.
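The three steps above can be sketched as a short generator. This is an illustrative sketch under stated assumptions, not the implementation used for the simulations: the initial topology of a local world is not specified in the text, so a ring plus uniformly placed extra links is assumed; rejected self-loops and repeated links are redrawn a bounded number of times; and $\Omega_2$ is redrawn uniformly for every inter-local-world link. The function name and data layout are hypothetical.

```python
import random

def generate_wmlw(n_target, n1, m1, e1, m2, m3, m4, p, a=1.0, b=10.0, seed=None):
    """Grow a WMLW network until it has at least n_target nodes.

    p = (p1, p2, p3, p4). Returns (worlds, weights, strength): worlds is a
    list of node lists, weights maps frozenset({i, j}) to the edge weight,
    and strength maps each node to s_i.
    """
    assert abs(sum(p) - 1.0) < 1e-9 and p[0] + p[1] > 0
    assert 3 <= m1 <= e1 <= m1 * (m1 - 1) // 2 and m2 <= m1 and n1 >= 2
    rng = random.Random(seed)
    worlds, weights, strength = [], {}, {}

    def add_edge(i, j, w):
        key = frozenset((i, j))
        if i == j or key in weights:          # no self-loops, no repeated links
            return False
        weights[key] = w
        strength[i] += w
        strength[j] += w
        return True

    def try_link(pick_i, pick_j, w, tries=50):
        # Assumption: a rejected pair is redrawn a bounded number of times.
        for _ in range(tries):
            if add_edge(pick_i(), pick_j(), w):
                return

    def by_strength(world, exclude=()):       # rule (3)
        cand = [v for v in world if v not in exclude]
        return rng.choices(cand, weights=[strength[v] for v in cand])[0]

    def by_scale():                           # rule (2), i.e. K = 0
        return rng.choices(worlds, weights=[len(w) for w in worlds])[0]

    def new_world():
        nodes = list(range(len(strength), len(strength) + m1))
        strength.update((v, 0.0) for v in nodes)
        worlds.append(nodes)
        for k in range(m1):                   # assumed ring backbone (m1 links)
            add_edge(nodes[k], nodes[(k + 1) % m1], a)
        for _ in range(e1 - m1):              # remaining links placed uniformly
            try_link(lambda: rng.choice(nodes), lambda: rng.choice(nodes), a)

    for _ in range(n1):                       # Step 1
        new_world()

    while len(strength) < n_target:           # Steps 2 and 3
        op = rng.choices((1, 2, 3, 4), weights=p)[0]
        if op == 1:                           # (i) new local world
            new_world()
        elif op == 2:                         # (ii) new node with m2 links
            world = by_scale()
            v, targets = len(strength), []
            strength[v] = 0.0
            for _ in range(m2):
                targets.append(by_strength(world, exclude=targets))
                add_edge(v, targets[-1], a)
            world.append(v)
        elif op == 3:                         # (iii) m3 links inside one world
            world = by_scale()
            for _ in range(m3):
                try_link(lambda: by_strength(world), lambda: rng.choice(world), a)
        else:                                 # (iv) m4 links to other worlds
            w1 = by_scale()
            for _ in range(m4):
                w2 = rng.choice([w for w in worlds if w is not w1])
                try_link(lambda: by_strength(w1), lambda: by_strength(w2), b)
    return worlds, weights, strength

worlds, weights, strength = generate_wmlw(
    200, n1=5, m1=6, e1=8, m2=2, m3=2, m4=2, p=(0.0, 0.7, 0.28, 0.02), seed=1)
print(len(strength), "nodes,", len(weights), "edges,", len(worlds), "local worlds")
```

Keeping a running `strength` table makes both preferential rules a single weighted draw, at the cost of rebuilding the weight list on every draw; for networks of the size used in the simulations a more efficient sampling structure would probably be needed.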
3. Strength Distribution of the WMLW Model
Using the mean-field theory [21, 22], one can obtain the strength distribution by analyzing how the strength $s_i$ of node $i$ in local world $\Omega$ changes under each of the four operations, as follows.
(i) With probability $p_1$, a new local world is created. In this case, the strength of node $i$ in an existing local world $\Omega$ does not change, since the nodes in the newly created local world have no links with any nodes in existing local worlds. Thus,
$$\left(\frac{\partial s_i}{\partial t}\right)_{\text{(i)}} = 0. \tag{4}$$
(ii) With probability $p_2$, a new node is added to an existing local world $\Omega$, and this node has $m_2$ links connected to nodes within the same local world. Every newly added edge is assigned a weight $a$:
$$\left(\frac{\partial s_i}{\partial t}\right)_{\text{(ii)}} = a p_2 m_2 \cdot \frac{M_\Omega}{\sum_{j=1}^{N_c} M_j} \cdot \frac{s_i}{s_\Omega}. \tag{5}$$
Here, $M_\Omega / \sum_{j=1}^{N_c} M_j$ is the probability that local world $\Omega$ is selected according to (2), and $s_i / s_\Omega$ is the probability that node $i$ is selected by strength preferential attachment according to (3), where $s_\Omega$ represents the total strength of all nodes in local world $\Omega$. There are $m_2$ links between the newly added node and existing nodes in local world $\Omega$, and the weight of each newly added edge is $a$; thus the coefficient is equal to $a p_2 m_2$.
(iii) With probability $p_3$, $m_3$ links are added to a chosen local world $\Omega$. Each newly added edge is assigned a weight $a$. We have
$$\left(\frac{\partial s_i}{\partial t}\right)_{\text{(iii)}} = a p_3 m_3 \cdot \frac{M_\Omega}{\sum_{j=1}^{N_c} M_j} \cdot \left(\frac{s_i}{s_\Omega} + \frac{1}{M_\Omega}\right). \tag{6}$$
On the right-hand side of (6), local world $\Omega$ is selected with the probability given by (2), $s_i / s_\Omega$ represents the preferential selection within local world $\Omega$, and $1 / M_\Omega$ represents the random selection of node $i$ within the same local world $\Omega$.
(iv) With probability $p_4$, a selected local world ($\Omega$) receives $m_4$ links connected to other existing local worlds. Each newly added edge is assigned a weight $b$. Hence,
$$\left(\frac{\partial s_i}{\partial t}\right)_{\text{(iv)}} = b p_4 m_4 \cdot \frac{M_\Omega}{\sum_{j=1}^{N_c} M_j} \cdot \frac{s_i}{s_\Omega} + b p_4 m_4 \cdot \frac{1}{n_1 + p_1 t} \cdot \frac{s_i}{s_{\text{random-local}}}. \tag{7}$$
In the first term on the right-hand side of (7), $M_\Omega / \sum_{j=1}^{N_c} M_j$ is the probability given by (2) with which local world $\Omega$ is selected. In the second term on the right-hand side of (7), another local world is chosen at random (i.e., with probability $1/(n_1 + p_1 t)$, since the number of local worlds at time $t$ is $n_1 + p_1 t$), and $s_{\text{random-local}}$ represents the total strength of all nodes in the randomly chosen local world. Both ends of a link are selected by (3) in their own local worlds.
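Combining (4)–(7), we have
$$\frac{\partial s_i}{\partial t} = \left(a p_2 m_2 + a p_3 m_3 + b p_4 m_4\right) \cdot \frac{M_\Omega}{\sum_{j=1}^{N_c} M_j} \cdot \frac{s_i}{s_\Omega} + a p_3 m_3 \cdot \frac{M_\Omega}{\sum_{j=1}^{N_c} M_j} \cdot \frac{1}{M_\Omega} + b p_4 m_4 \cdot \frac{1}{n_1 + p_1 t} \cdot \frac{s_i}{s_{\text{random-local}}}. \tag{8}$$
In the following text, $\bar{s}_{\Omega\text{-local}}$ denotes the average strength of the nodes in local world $\Omega$, and $\bar{s}$ represents the average strength of all nodes in the network. Hence, we have $s_\Omega = M_\Omega \bar{s}_{\Omega\text{-local}}$. Since $s_{\text{random-local}}$ denotes the total strength of all nodes in a randomly selected local world, in the average sense we have
$$s_{\text{random-local}} = \frac{\left(n_1 m_1 + (p_1 m_1 + p_2) t\right) \bar{s}}{n_1 + p_1 t}. \tag{9}$$
Each time step adds $m_1$ nodes with probability $p_1$ and one node with probability $p_2$, so the total number of nodes at time $t$ is $\sum_{j=1}^{N_c} M_j = n_1 m_1 + (p_1 m_1 + p_2) t$. Thus, (8) becomes
$$\frac{\partial s_i}{\partial t} = \left(a p_2 m_2 + a p_3 m_3 + b p_4 m_4\right) \cdot \frac{M_\Omega}{n_1 m_1 + (p_1 m_1 + p_2) t} \cdot \frac{s_i}{M_\Omega \bar{s}_{\Omega\text{-local}}} + a p_3 m_3 \cdot \frac{1}{n_1 m_1 + (p_1 m_1 + p_2) t} + b p_4 m_4 \cdot \frac{s_i}{\left(n_1 m_1 + (p_1 m_1 + p_2) t\right) \bar{s}}. \tag{10}$$
For large $t$, we have $n_1 m_1 + (p_1 m_1 + p_2) t \approx (p_1 m_1 + p_2) t$; therefore,
$$\frac{\partial s_i}{\partial t} = \left(a p_2 m_2 + a p_3 m_3 + b p_4 m_4\right) \cdot \frac{M_\Omega}{(p_1 m_1 + p_2) t} \cdot \frac{s_i}{M_\Omega \bar{s}_{\Omega\text{-local}}} + a p_3 m_3 \cdot \frac{1}{(p_1 m_1 + p_2) t} + b p_4 m_4 \cdot \frac{s_i}{(p_1 m_1 + p_2) t \, \bar{s}}. \tag{11}$$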
In the following text, we deduce the strength distribution of the network when the probability p1=0 (i.e., in the case when the number of the local worlds is unchanged). In the case when the probability p1≠0, we only give a simulation result for simplicity in Section 4.
When the probability p1=0, the number of the local worlds is the initial n1. All local worlds perform the same steps (ii), (iii), and (iv), though with different probabilities. Different probabilities just bring on different scales of the local worlds.
We can see from works [6–8] that the average degree of the network is independent of the scale of the network. In our model, we can likewise consider that the average strength of every local world is equal for large $t$; that is,
$$\bar{s} = \frac{\text{total strength of all nodes}}{\text{total number of nodes}} = \frac{\text{const} + 2 a p_1 m_1 t + 2 a p_2 m_2 t + 2 a p_3 m_3 t + 2 b p_4 m_4 t}{n_1 m_1 + (p_1 m_1 + p_2) t} \approx \frac{2 a p_1 m_1 + 2 a p_2 m_2 + 2 a p_3 m_3 + 2 b p_4 m_4}{p_1 m_1 + p_2} \quad \text{for large } t. \tag{12}$$
Also, $\bar{s}_{\Omega\text{-local}}$ is equal to $\bar{s}$. Both of them are independent of $t$.
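Considering (12), when the probability $p_1 = 0$, (11) becomes
$$\frac{\partial s_i}{\partial t} = \frac{a p_2 m_2 + a p_3 m_3 + 2 b p_4 m_4}{2 \left(a p_2 m_2 + a p_3 m_3 + b p_4 m_4\right)} \cdot \frac{s_i}{t} + \frac{a p_3 m_3}{p_2} \cdot \frac{1}{t}. \tag{13}$$
Define $A = (a p_2 m_2 + a p_3 m_3 + 2 b p_4 m_4) / \left(2 (a p_2 m_2 + a p_3 m_3 + b p_4 m_4)\right)$ and $B = a p_3 m_3 / p_2$.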
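The step from (11) to (13) can be written out explicitly. The shorthand $x_2 = a p_2 m_2$, $x_3 = a p_3 m_3$, and $x_4 = b p_4 m_4$ is introduced here only for brevity and is not part of the original notation; the average strength written as $\bar{s}$ is the quantity given by (12).

```latex
% Setting p_1 = 0 in (12) gives
%   \bar{s} = \bar{s}_{\Omega\text{-local}} = 2 (x_2 + x_3 + x_4) / p_2 .
% Substituting into (11) with p_1 = 0:
\begin{align*}
\frac{\partial s_i}{\partial t}
  &= (x_2 + x_3 + x_4)\,\frac{s_i}{p_2 t\,\bar{s}}
     + \frac{x_3}{p_2 t}
     + x_4\,\frac{s_i}{p_2 t\,\bar{s}} \\
  &= \frac{x_2 + x_3 + 2 x_4}{p_2\,\bar{s}} \cdot \frac{s_i}{t}
     + \frac{x_3}{p_2} \cdot \frac{1}{t} \\
  &= \frac{x_2 + x_3 + 2 x_4}{2\,(x_2 + x_3 + x_4)} \cdot \frac{s_i}{t}
     + \frac{x_3}{p_2} \cdot \frac{1}{t}.
\end{align*}
```

The inter-local-world term $x_4$ appears twice in the numerator because a node can gain strength $b$ either as the end chosen in the selected local world or as the end chosen in the randomly picked one.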
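Therefore,
$$\frac{\partial s_i}{\partial t} = A \frac{s_i}{t} + \frac{B}{t}. \tag{14}$$
By integrating, we have $s_i(t) = \lambda t^{A} - B/A$, where $\lambda$ is an arbitrary constant.
Using the initial condition $s_i(t_i) = m_2$, one obtains
$$s_i(t) = \left(m_2 + \frac{B}{A}\right) \left(\frac{t}{t_i}\right)^{A} - \frac{B}{A}. \tag{15}$$
Thus,
$$P\left(s_i(t) < s\right) = P\left(t_i > t \left(\frac{m_2 + B/A}{s + B/A}\right)^{1/A}\right). \tag{16}$$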
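For completeness, the integration and the inversion used above can be spelled out as follows. This is a routine separation-of-variables argument; it assumes $A > 0$, and the integration constant $c$ is introduced here for illustration.

```latex
\begin{align*}
\frac{\mathrm{d} s_i}{A s_i + B} = \frac{\mathrm{d} t}{t}
  \;&\Longrightarrow\;
  \frac{1}{A} \ln\!\left(s_i + \frac{B}{A}\right) = \ln t + c
  \;\Longrightarrow\;
  s_i(t) + \frac{B}{A} = \lambda t^{A}, \\
s_i(t_i) = m_2
  \;&\Longrightarrow\;
  \lambda = \left(m_2 + \frac{B}{A}\right) t_i^{-A}, \\
s_i(t) < s
  \;&\Longleftrightarrow\;
  \left(\frac{t}{t_i}\right)^{A} < \frac{s + B/A}{m_2 + B/A}
  \;\Longleftrightarrow\;
  t_i > t \left(\frac{m_2 + B/A}{s + B/A}\right)^{1/A}.
\end{align*}
```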
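As the network grows, nodes are added at equal time intervals. Then, the probability density of $t_i$ is
$$p(t_i) = \frac{1}{n_1 m_1 + t}. \tag{17}$$
So
$$P\left(t_i > t \left(\frac{m_2 + B/A}{s + B/A}\right)^{1/A}\right) = 1 - \frac{t}{n_1 m_1 + t} \left(\frac{m_2 + B/A}{s + B/A}\right)^{1/A}. \tag{18}$$
Therefore,
$$p(s) = \frac{\partial P\left(s_i(t) < s\right)}{\partial s} = \frac{t}{n_1 m_1 + t} \cdot \frac{\left(m_2 + B/A\right)^{1/A}}{A \left(s + B/A\right)^{1 + 1/A}}. \tag{19}$$
For large $t$, the strength distribution is approximately
$$p(s) \approx \frac{\left(m_2 + B/A\right)^{1/A}}{A \left(s + B/A\right)^{1 + 1/A}}, \tag{20}$$
where $A = (a p_2 m_2 + a p_3 m_3 + 2 b p_4 m_4) / \left(2 (a p_2 m_2 + a p_3 m_3 + b p_4 m_4)\right)$ and $B = a p_3 m_3 / p_2$, as defined before. For large $s$, (20) decays as a power law with exponent $1 + 1/A$.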
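To make the result concrete, the constants $A$ and $B$ and the density in (20) can be evaluated numerically. The sketch below is illustrative only: the function names are hypothetical, and the parameter values are the ones reported for the one-million-node simulation in Section 4.1.

```python
def strength_constants(a, b, m2, m3, m4, p2, p3, p4):
    """Return (A, B) as defined after (13), for the case p1 = 0."""
    x2, x3, x4 = a * p2 * m2, a * p3 * m3, b * p4 * m4
    A = (x2 + x3 + 2 * x4) / (2 * (x2 + x3 + x4))
    B = x3 / p2
    return A, B

def strength_pdf(s, A, B, m2):
    """Large-t strength density p(s) from (20)."""
    c = B / A
    return (m2 + c) ** (1 / A) / (A * (s + c) ** (1 + 1 / A))

def strength_cdf(s, A, B, m2):
    """P(s_i < s) obtained by integrating (20) from m2 to s."""
    c = B / A
    return 1.0 - ((m2 + c) / (s + c)) ** (1 / A)

A, B = strength_constants(a=1, b=10, m2=2, m3=2, m4=2, p2=0.70, p3=0.29, p4=0.01)
gamma = 1 + 1 / A   # power-law exponent of the tail of p(s)
print(f"A = {A:.4f}, B = {B:.4f}, exponent = {gamma:.4f}")
print(strength_pdf(10.0, A, B, m2=2), strength_cdf(10.0, A, B, m2=2))
```

Because $A < 1$ whenever intra-local-world links are present, the exponent $1 + 1/A$ exceeds 2, and increasing $b$ or $p_4$ pushes $A$ toward 1 and therefore makes the tail heavier.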
4. Numerical Simulations
We design several numerical simulations to verify our theoretical analysis results. We consider two cases, that is, a given number of local worlds (p1=0) and an increasing number of local worlds (p1≠0).
4.1. A Given Number of Local Worlds (p1=0)
To verify the theoretical result for the strength distribution of the WMLW model, we conducted a numerical simulation in which a network was generated by the WMLW model and the simulation and theoretical results were compared. Figure 1 shows the simulation result. In this simulation, the size of the generated network was set to one million nodes. The parameter settings of the WMLW model for the simulation are given in the caption of Figure 1. In Figure 1(a), the simulation and theoretical results are shown together to facilitate the comparison between them. The theoretical data are calculated by (20), and the simulation data are calculated from the generated network. The comparison shows that the simulation data and the theoretical data fit well, which supports the theoretical result for the strength distribution of the WMLW model. In addition, Figure 1(b) shows that the degree distribution of the WMLW model also follows a power-law distribution. Hence, the WMLW network is a scale-free network.
Figure 1: Strength and degree distributions of a network generated by the WMLW model. (a) and (b) show the strength distribution and degree distribution of the generated network, respectively. The numerical simulation was carried out under the following settings: N=1,000,000, n1=100, m1=10, m2=2, m3=2, m4=2, p1=0.000, p2=0.700, p3=0.290, p4=0.010, a=1, and b=10.
There is a deviation between the numerical values and theoretical ones for large nodal strengths in Figure 1(a). This is the so-called “fat tail” phenomenon. The theoretical values are calculated when the number of nodes of the network is infinite. In contrast, for a numerical simulation, the number of nodes of the generated network is finite (one million in this simulation), and thus the phenomenon of “fat tail” happens.
Next, we changed the parameters of the WMLW model and calculated the cumulative strength and degree distributions of generated networks. Figure 2 shows the cumulative strength and degree distributions when the size of the generated networks is set to 100,000 and b is set to 5, 10, and 20, respectively. The results show that both the cumulative strength and degree distributions follow power laws.
Figure 2: Cumulative strength and degree distributions of networks generated by the WMLW model when p1=0. (a) and (b) show the cumulative strength distributions and cumulative degree distributions of the generated networks, respectively. The numerical simulations were carried out under the following settings: N=100,000, n1=30, m1=6, m2=2, m3=2, m4=2, p1=0.000, p2=0.700, p3=0.280, p4=0.020, and a=1, when b is set to 5, 10, and 20.
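A cumulative distribution of the kind plotted above can be computed from a list of node strengths or degrees with a short routine. This sketch assumes the complementary convention $P(X \ge x)$; the text does not state which convention the figures use, and the function name is hypothetical.

```python
from collections import Counter

def cumulative_distribution(values):
    """Return sorted (x, P(X >= x)) pairs for the observed values."""
    n = len(values)
    counts = Counter(values)
    pairs, remaining = [], n
    for x in sorted(counts):
        pairs.append((x, remaining / n))   # fraction of observations >= x
        remaining -= counts[x]
    return pairs

# Hypothetical degree sequence of a tiny network.
print(cumulative_distribution([1, 1, 2, 3]))
```

On a log-log plot, a power-law density with exponent $\gamma$ shows up in this cumulative form as a straight line with slope $-(\gamma - 1)$, which is less noisy in the tail than the raw histogram.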
4.2. An Increasing Number of Local Worlds (p1≠0)
We carried out simulations for $p_1 \ne 0$, and the results are shown in Figure 3. The cumulative strength and degree distributions follow power laws. Comparing Figure 3 (the case $p_1 \ne 0$) with Figure 2 (the case $p_1 = 0$), both the cumulative strength distributions and the cumulative degree distributions are very similar. Hence, when the parameter $p_1$ is small (approximately equal to 0), we consider that $\bar{s}_{\Omega\text{-local}}$ is close to $\bar{s}$. Therefore, by the same method as before, we have
$$p(s) \approx \frac{\left(m_2 + B/A\right)^{1/A}}{A \left(s + B/A\right)^{1 + 1/A}}, \tag{21}$$
where $A = (a p_2 m_2 + a p_3 m_3 + 2 b p_4 m_4) / \left(2 (a p_2 m_2 + a p_3 m_3 + b p_4 m_4)\right)$ and $B = a p_3 m_3 / p_2$.
Figure 3: Cumulative strength and degree distributions of networks generated by the WMLW model when p1≠0. (a) and (b) show the cumulative strength distributions and cumulative degree distributions of the generated networks, respectively. The numerical simulations were carried out under the following settings: N=100,000, n1=30, m1=6, m2=2, m3=2, m4=2, p1=0.005, p2=0.695, p3=0.280, p4=0.020, and a=1, when b is set to 5, 10, and 20.
5. Application of the WMLW Model in Software Network Modeling
Object-oriented software systems can be described using class diagrams that depict the relationship between classes. Valverde and Solé found that the small-world and scale-free characteristics also exist in class diagrams of software systems when class diagrams are considered as undirected networks [20]. A class diagram can be considered as a network, in which a node is a class and relationships between classes are links between nodes. In this section, we will employ the WMLW model to model the class diagrams of a set of software systems.
In the design of object-oriented software systems, a software system can be decomposed into components. Classes in the same component tend to be closely associated, whereas different components are loosely coupled because they communicate only through a few interfaces. This is the rule of "high cohesion, low coupling" in the software engineering domain. The WMLW model reflects this rule directly: links are dense within a local world and sparse between local worlds. Thus, the WMLW model is suitable for modeling the evolution of class diagrams.
For a software system of relatively small size, we let p1=0, which means that the total number of local worlds is constant; that is, the architecture is designed in advance and no important function is overlooked before implementation. Thus, the number of components (e.g., packages) will not increase. For a software system of relatively large size (e.g., more than 1,000 classes), p1 is assigned a small value, which represents the fact that the total number of local worlds of the software system increases slowly. This is reasonable, since adding a component is a big change to the software system at the architecture level, and such changes tend not to happen frequently in the development lifecycle of the software system.
We apply the WMLW model to generate networks whose properties are similar to those of real-life software systems. The results in Table 1 were obtained by tuning the parameters of the WMLW model for each software system. Table 1 shows that the number of edges, average path length, and clustering coefficient of the modeled networks are close to those of the real software systems, which indicates that the model can be applied successfully to class diagram modeling. Table 2 shows the parameters used by the WMLW model in the modeling of each software system. Figure 4 shows the degree distributions and the cumulative degree distributions of the software systems modeled with the WMLW model. Figure 5 shows the strength distributions and the cumulative strength distributions of the software systems modeled with the WMLW model.
Table 1: Network properties of the software systems in real and modeling cases.

| Software | Nr/Nm | Lr | Lm | Pr | Pm | Cr | Cm |
|---|---|---|---|---|---|---|---|
| Prospectus | 99 | 168 | 159 | 3.80 | 3.92 | 0.140 | 0.145 |
| eMule | 129 | 218 | 223 | 3.87 | 4.07 | 0.237 | 0.225 |
| Blender | 495 | 834 | 856 | 6.54 | 6.39 | 0.155 | 0.157 |
| OIV | 1214 | 3903 | 4008 | 3.99 | 3.97 | 0.122 | 0.130 |
| JDK-A | 1376 | 2162 | 2314 | 5.40 | 5.73 | 0.159 | 0.156 |
| CS | 1488 | 3526 | 3746 | 3.92 | 3.94 | 0.135 | 0.131 |

Note. N is the total number of nodes, L is the number of edges, P is the average path length, and C is the clustering coefficient. The subscript r represents the real case, and the subscript m represents the modeling case. The real data were reported in [20].
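One way to quantify how closely the modeled networks match the real ones is the relative deviation of each property. The sketch below is only an illustration of that comparison: the values are copied from Table 1, while the function names and the choice of relative deviation as the measure are assumptions and are not part of the original analysis.

```python
# (software, L_r, L_m, P_r, P_m, C_r, C_m), copied from Table 1
TABLE_1 = [
    ("Prospectus", 168, 159, 3.80, 3.92, 0.140, 0.145),
    ("eMule", 218, 223, 3.87, 4.07, 0.237, 0.225),
    ("Blender", 834, 856, 6.54, 6.39, 0.155, 0.157),
    ("OIV", 3903, 4008, 3.99, 3.97, 0.122, 0.130),
    ("JDK-A", 2162, 2314, 5.40, 5.73, 0.159, 0.156),
    ("CS", 3526, 3746, 3.92, 3.94, 0.135, 0.131),
]

def relative_deviation(real, model):
    """Absolute difference between model and real value, relative to the real value."""
    return abs(model - real) / real

def deviations(table):
    """Map each software system to its relative deviations in L, P, and C."""
    result = {}
    for name, lr, lm, pr, pm, cr, cm in table:
        result[name] = {
            "L": relative_deviation(lr, lm),
            "P": relative_deviation(pr, pm),
            "C": relative_deviation(cr, cm),
        }
    return result

for name, dev in deviations(TABLE_1).items():
    print(name, {k: f"{100 * v:.1f}%" for k, v in dev.items()})
```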
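Table 2: Parameters of the WMLW model for the modeling of the software systems.

| Software | n1 | m1 | m2 | m3 | m4 | p1 | p2 | p3 | p4 | a | b |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Prospectus | 3 | 6 | 1 | 2 | 1 | 0.000 | 0.700 | 0.250 | 0.050 | 1 | 10 |
| eMule | 4 | 6 | 1 | 3 | 1 | 0.000 | 0.720 | 0.230 | 0.050 | 1 | 10 |
| Blender | 15 | 6 | 1 | 2 | 1 | 0.000 | 0.700 | 0.260 | 0.040 | 1 | 10 |
| OIV | 6 | 15 | 3 | 2 | 1 | 0.004 | 0.696 | 0.150 | 0.150 | 1 | 10 |
| JDK-A | 8 | 6 | 1 | 2 | 1 | 0.022 | 0.680 | 0.240 | 0.058 | 1 | 10 |
| CS | 8 | 6 | 1 | 3 | 2 | 0.005 | 0.595 | 0.200 | 0.200 | 1 | 10 |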
Figure 4: Degree distributions and cumulative degree distributions of the modeled software systems. (a) and (b) show the degree distributions and cumulative degree distributions, respectively.
Figure 5: Strength distributions and cumulative strength distributions of the modeled software systems. (a) and (b) show the strength distributions and cumulative strength distributions, respectively.
6. Conclusions
In this paper, we propose the Weighted Multi-Local-World (WMLW) model, in which the weights of links and the strengths of nodes are considered. By analyzing the weights of links and the strengths of nodes, we prove theoretically that the strength distribution follows a power law, and simulations agree with the theoretical result. In addition, the degree distribution also follows a power law. The modeling of class diagrams of real software systems is found to be a good application of the WMLW model. In future work, we will improve the model and look for further practical applications.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This work is supported by the National Key Research Technology and Development Program of China (Grant no. 2016YFB0800404), the National Natural Science Foundation of China (Grants nos. 61702377, 61773175, 61572371, and 61403154), the China Postdoctoral Science Foundation (Grant no. 2015M582272), and the Natural Science Foundation of Hubei Province of China (Grants nos. 2016CFB158 and 2017CFB426).
References
[1] A. Barabasi and R. Albert, "Emergence of scaling in random networks," 1999, 286(5439), 509–512, doi:10.1126/science.286.5439.509.
[2] D. J. Watts and S. H. Strogatz, "Collective dynamics of 'small-world' networks," 1998, 393(6684), 440–442, doi:10.1038/30918.
[3] M. Newman, Oxford University Press, 2010.
[4] M. E. J. Newman, "The structure and function of complex networks," 2003, 45(2), 167–256, doi:10.1137/S003614450342480.
[5] S. Boccaletti, V. Latora, Y. Moreno, M. Chavez, and D. W. Hwang, "Complex networks: structure and dynamics," 2006, 424(4-5), 175–308, doi:10.1016/j.physrep.2005.10.009.
[6] X. Li and G. Chen, "A local-world evolving network model," 2003, 328(1-2), 274–286, doi:10.1016/S0378-4371(03)00604-6.
[7] B. Wang, H. Tang, Z. Zhang, and Z. Xiu, "Evolving scale-free network model with tunable clustering," 2005, 19(26), 3951–3959, doi:10.1142/S0217979205032437.
[8] Z. Pan, X. Li, and X. Wang, "Generalized local-world models for weighted networks," 2006, 73(5), 056109, doi:10.1103/PhysRevE.73.056109.
[9] P. Li, Q. Zhao, and H. Wang, "A weighted local-world evolving network model based on the edge weights preferential selection," 2013, 27(12), 1350039, doi:10.1142/S0217979213500392.
[10] G. Chen, Z. Fan, and X. Li, "Modelling the complex internet topology," L. Kocarev and G. Vattay, Eds., Springer Berlin Heidelberg, 2005, 213–234.
[11] Y. Qu, X. Guan, Q. Zheng, T. Liu, L. Wang, Y. Hou, and Z. Yang, "Exploring community structure of software Call Graph and its applications in class cohesion measurement," 2015, 108, 193–210, doi:10.1016/j.jss.2015.06.015.
[12] G. Concas, M. Marchesi, C. Monni, M. Orrù, and R. Tonelli, "Software quality and community structure in java software networks," 2017, 27(7), 1063–1096, doi:10.1142/S0218194017500401.
[13] A. Decan, T. Mens, and P. Grosjean, "An empirical comparison of dependency network evolution in seven software packaging ecosystems," 2018, doi:10.1007/s10664-017-9589-y.
[14] R. Kikas, G. Gousios, M. Dumas, and D. Pfahl, "Structure and evolution of package dependency networks," Proceedings of the 14th IEEE/ACM International Conference on Mining Software Repositories (MSR '17), May 2017, 102–112, doi:10.1109/MSR.2017.55.
[15] C. Y. Chong and S. P. Lee, "Automatic clustering constraints derivation from object-oriented software using weighted complex network with graph theory analysis," 2017, 133, 28–53, doi:10.1016/j.jss.2017.08.017.
[16] M. Joblin, S. Apel, and W. Mauerer, "Evolutionary trends of developer coordination: a network approach," 2017, 22(4), 2050–2094, doi:10.1007/s10664-016-9478-9.
[17] Y. Ding, B. Li, and P. He, "An improved approach to identifying key classes in weighted software network," 2016, vol. 2016, Article ID 3858637, 9 pages, doi:10.1155/2016/3858637.
[18] A. Barrat, M. Barthélemy, and A. Vespignani, "Weighted evolving networks: coupling topology and weight dynamics," 2004, 92(22), 228701, doi:10.1103/PhysRevLett.92.228701.
[19] B.-H. Wang, W.-X. Wang, and T. Zhou, "A weighted complex network model driven by traffic flow," 2006, 40, 12.
[20] S. Valverde and R. V. Solé, "Hierarchical small worlds in software architecture," 2007, https://arxiv.org/abs/cond-mat/0307278.
[21] R. Albert and A.-L. Barabási, "Topology of evolving networks: local events and universality," 2000, 85(24), 5234–5237, doi:10.1103/PhysRevLett.85.5234.
[22] A. L. Barabási, R. Albert, and H. Jeong, "Mean-field theory for scale-free random networks," 1999, 272(1), 173–187, doi:10.1016/S0378-4371(99)00291-5.