Complex Networks: Statistical Properties, Community Structure, and Evolution

We investigate the function of different networks based on complex network theory. In this paper, we choose five data sets from various areas to study. In the study of the Chinese character network, scale-free and hierarchical structure features are found in this complex system. These results indicate that the discovered features of Chinese character structure reflect the combinatorial nature of Chinese characters. In addition, we study the community structure of the Chinese character network. Community structure is widely considered one of the most significant features of complex networks, and it plays an important role in the topology and function of networks. Furthermore, we cut the nodes of the different networks from low degree to high degree and thereby obtain many networks of different scales. Two interesting results have been obtained. First, the relationship between the number of nodes in the maximum communities and the number of communities in the corresponding networks is linear. Second, as the number of nodes in the maximum communities increases, the growth of the number of their edges slows down; we therefore predict that complex networks are sparse. The study helps explain the characteristics and the evolution of community structure in different networks.


Introduction
In recent years, complex networks have had a profound effect on research in many disciplines [1], such as systems science, statistical physics, social sciences, and biology [2][3][4]. Network structure theory may help us to understand the properties, function, and evolution mechanism of complex networks [5]. Chinese is one of the most widely used languages in the world, and Chinese characters play an important role in its well-known civilization. Chinese character structure analysis based on radicals is therefore a challenging, interesting, and very important problem. It will help to study the characteristics and the evolution of Chinese characters and the universal rules by which Chinese characters are combined.
Recent research has found that any language in the world, including Chinese, is built by connecting basic units according to complex grammar, syntax, and semantics [6]. With the rapid development of complex network studies, Chinese complex networks are being actively studied [7][8][9]. For example, word co-occurrence networks [10] show two important characteristics: the small-world effect and a scale-free distribution. What is more, the study of the Wordnet lexicon [11] demonstrates that Wordnet has global properties common to many self-organized systems, and that polysemous links have a profound impact on the organization of the semantic graph, which may be crucial for metaphoric thinking, imagery, and generalization. Similarly, the network of free word associations [12] serves as a proxy for the way in which our mind stores and organizes words and their related meanings.
Radicals of Chinese characters are treated as the basic units of Chinese. They are monosyllabic, square-shaped, and primitive, having some relationship to iconicity and combination [13]. They combine into Chinese characters based on certain rules.
In this paper, we mainly investigate the statistical properties and community structure of the Chinese character network. In addition, we study the evolution of community structure in different networks. The paper is organized as follows. First, we focus on the graph features of the Chinese character network based on complex network theory. Second, we present an analysis of community structure in the Chinese character network. Third, we study the community structure evolution of five different networks; community structure can help us to understand their global structure [14]. Finally, we draw conclusions in the last section.

Data and Network
In order to study the features of different networks, we choose data sets from various areas. Table 1 lists the data sets we have used.
(1) MATLAB Help Document. The nodes are key terms in the MATLAB help document. Two nodes are connected if there is a hyperlink between them.
(2) Chinese Characters. The nodes are radicals of Chinese characters, and two radicals are connected if they can co-occur in a word.
(3) Yeast. The nodes are proteins in yeast, and two proteins are connected if they react chemically with each other [15].
(4) Electronic Collaboration Networks. The nodes are authors who work on general relativity and quantum cosmology. Two authors are connected if they have coauthored a paper [16].
(5) Peer-to-Peer File Sharing Networks. The nodes are hosts in a peer-to-peer file sharing network topology, and two hosts are connected if they connect to each other [17].
In the following we attempt to uncover the features of different networks based on graph theory. Let us consider the undirected graph $G = (V, E)$, where $V = \{v_i\}$ ($i = 1, 2, \ldots, N$) is the set of nodes and $E = \{\{v_i, v_j\}\}$ is the set of connections. Here, $e = \{v_i, v_j\}$ indicates that there is an edge between $v_i$ and $v_j$.
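The undirected graph defined above can be sketched in code as an adjacency map. This is only a minimal illustration: the function name `build_graph` and the toy edge list are hypothetical and are not taken from any of the five data sets.

```python
from collections import defaultdict


def build_graph(edges):
    """Return an adjacency map {node: set of neighbors} for an undirected graph."""
    adj = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # assumption of this sketch: self-loops are ignored
        # An undirected edge {u, v} is stored in both directions.
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


# Hypothetical toy edge list, used only for illustration.
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
toy_graph = build_graph(toy_edges)

# The degree of a node is the size of its neighbor set.
toy_degrees = {node: len(nbrs) for node, nbrs in toy_graph.items()}
```

Storing neighbors as sets makes duplicate edges harmless and keeps the edge test `v in adj[u]` cheap, which is convenient for the statistics discussed in the following sections.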

Scale-Free and Hierarchical Features of Chinese Character Network
In this section, we study the characteristics of the Chinese character network. First, we examine the degree distribution $P(k)$. The degree of a given radical is the number of edges that connect it with other radicals. The degree distribution is defined as the probability that a node has degree $k$. In the Chinese character network, the degree distribution reflects the word-formation ability of radicals.
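As a rough illustration of this definition, the empirical degree distribution can be computed as the fraction of nodes having each degree. The adjacency map `toy_adj` below is a hypothetical toy example, not the Chinese character network, and `degree_distribution` is a name chosen for this sketch.

```python
from collections import Counter


def degree_distribution(adj):
    """Return {k: fraction of nodes with degree k} for an adjacency map."""
    degrees = [len(nbrs) for nbrs in adj.values()]
    counts = Counter(degrees)
    n = len(degrees)
    return {k: c / n for k, c in sorted(counts.items())}


# Hypothetical toy network: a triangle a-b-c with a pendant node d attached to c.
toy_adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
p_k = degree_distribution(toy_adj)
```

On a real network, plotting these fractions against the degree on logarithmic axes is the usual way to inspect whether the tail looks like a power law.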
Figure 1 shows that the degree distribution of the Chinese character network follows a power-law distribution $P(k) \sim k^{-\gamma}$. The exponent $\gamma$ is 2.07. A network that exhibits a power-law degree distribution is called a scale-free network. In a scale-free network, the majority of nodes have only a small number of links, while a few nodes, called hubs, are linked to a great many nodes in the network. For example, in the Chinese character network, the radicals "口" (mouth) and "木" (wood) are highly connected nodes because they are familiar to us and they can form a great many Chinese characters.
Another feature is the clustering coefficient [18], the probability that two neighbors of a node are also neighbors of each other (nodes $v_i$ and $v_j$ are neighbors if there is a link between $v_i$ and $v_j$). For a node $i$ with $k_i$ neighbors, the local clustering coefficient $C_i$ is defined as the ratio between the number of links among the $k_i$ neighbors and the maximum possible number of links among these neighbors. This can be expressed as follows:
$$C_i = \frac{2E_i}{k_i(k_i - 1)},$$
where $E_i$ is the number of existing links between the $k_i$ neighbors. The clustering spectrum $C(k)$ is defined as the average clustering coefficient of nodes with degree $k$. As Figure 2 shows, the clustering coefficient decreases with the degree. This implies that low-degree nodes are part of highly cohesive, densely interlinked clusters, while the hubs are not, as their neighbors have a small chance of linking to each other. The power-law clustering spectrum $C(k) \sim k^{-\beta}$ shows that the network has a hierarchical feature [19].
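The local clustering coefficient and the clustering spectrum described above can be sketched as follows. This is a minimal illustration on a hypothetical toy network; the function names are chosen for this sketch, and assigning a coefficient of zero to nodes with fewer than two neighbors is an assumption, since the ratio is undefined in that case.

```python
from collections import defaultdict
from itertools import combinations


def local_clustering(adj, node):
    """Ratio of existing links among a node's neighbors to the maximum possible."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumption: undefined case is treated as zero
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def clustering_spectrum(adj):
    """Average local clustering coefficient of the nodes of each degree."""
    by_degree = defaultdict(list)
    for node, nbrs in adj.items():
        by_degree[len(nbrs)].append(local_clustering(adj, node))
    return {k: sum(vals) / len(vals) for k, vals in sorted(by_degree.items())}


# Hypothetical toy network: a triangle a-b-c with a pendant node d attached to c.
toy_adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
spectrum = clustering_spectrum(toy_adj)
```

Even in this tiny example the highest-degree node has a lower coefficient than the degree-2 nodes, which is the qualitative pattern that a decreasing clustering spectrum expresses.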
The hierarchical feature of the Chinese character network is consistent with hierarchical network models of lexical networks [20]. As shown in Figure 3, the most common and important radicals, such as "木" (wood), "氵" (water), "目" (eye), "亻" (person), and "艹" (grass), should be stored at a higher level so that people can learn Chinese characters conveniently and efficiently.

Community Structure in Chinese Character Network
Chinese has many hundreds of thousands of words, most of which are created by combining just a few thousand radicals. Although tens of thousands of Chinese characters have been created in history, only a few thousand are in common use. According to character structure, there are a few hundred traditional radicals, which are mainly used to index and look up characters in the dictionary. In this section, we study the community structure of the Chinese character network. Community structure [21][22][23][24][25][26] refers to a high density of links between nodes of the same group and a comparatively low density of links between nodes of different groups. Community structure analysis has a wide range of applications in biology, physics, computer graphics, and sociology [27,28]. For example, in social groups, people with the same hobbies or beliefs tend to be drawn to each other. In molecular response networks, we can distinguish the roles or features of molecules from aggregated functional module nodes.
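The notion of community structure used here, dense links within groups and sparse links between them, can be made concrete with a short sketch that compares the two densities for a given partition. The partition is assumed to be supplied by some community detection method, which this sketch does not implement; the function name `link_densities` and the toy network are hypothetical.

```python
from itertools import combinations


def link_densities(adj, communities):
    """Return (intra_density, inter_density) for a partition given as a list of node sets."""
    label = {node: idx for idx, group in enumerate(communities) for node in group}
    intra_links = 0
    inter_links = 0
    # Visit every unordered pair of nodes once and classify existing links.
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            if label[u] == label[v]:
                intra_links += 1
            else:
                inter_links += 1
    n = len(adj)
    intra_pairs = sum(len(g) * (len(g) - 1) // 2 for g in communities)
    inter_pairs = n * (n - 1) // 2 - intra_pairs
    intra = intra_links / intra_pairs if intra_pairs else 0.0
    inter = inter_links / inter_pairs if inter_pairs else 0.0
    return intra, inter


# Hypothetical toy network: two triangles joined by the single link 3-4.
toy_adj = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
}
intra_density, inter_density = link_densities(toy_adj, [{1, 2, 3}, {4, 5, 6}])
```

A partition reflects community structure in the sense of the definition above when the first value is clearly larger than the second. The pairwise loop is quadratic in the number of nodes, so it is meant for illustration rather than for large networks.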
In order to detect communities in the Chinese character network, we calculate the degree of each node in the network, and then we cut the nodes in the network from low degree to high degree and obtain many networks of different scales.
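One possible reading of this cutting procedure is sketched below: for each degree threshold, only the nodes whose degree in the full network exceeds the threshold are kept, together with the links among them. Whether degrees are taken from the full network or recomputed after each cut is not stated in the text, so the former is an assumption of this sketch, and the function names and toy data are hypothetical.

```python
def high_degree_subnetwork(adj, min_degree):
    """Keep nodes whose degree in the full network exceeds min_degree, and the links among them."""
    kept = {node for node, nbrs in adj.items() if len(nbrs) > min_degree}
    return {node: adj[node] & kept for node in kept}


def subnetworks_by_threshold(adj, thresholds):
    """Build one reduced network per degree threshold."""
    return {t: high_degree_subnetwork(adj, t) for t in thresholds}


# Hypothetical toy network: hub h linked to a, b, c, d, plus the link a-b.
toy_adj = {
    "h": {"a", "b", "c", "d"},
    "a": {"h", "b"},
    "b": {"h", "a"},
    "c": {"h"},
    "d": {"h"},
}
reduced = subnetworks_by_threshold(toy_adj, [1, 2, 4])
```

Raising the threshold yields progressively smaller networks made of higher-degree nodes, and a community detection method can then be applied to each of them.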
In this paper, we only analyze the nodes with high degree. As Figures 4, 5, and 6 show, the most common and important radicals, such as "钅" (metal), "木" (wood), "氵" (water), "火" (fire), and "土" (earth), should be stored at a higher level so that people can learn Chinese characters conveniently and efficiently. Furthermore, we found an interesting phenomenon: "钅" (metal), "木" (wood), "氵" (water), "火" (fire), and "土" (earth) correspond to the Yin-Yang and the five elements. This shows the relationship between Chinese traditional culture and Chinese characters, and it reveals a natural internal relationship between the Yin-Yang and five elements of Chinese philosophy and Chinese characters [29]. Research on the radicals of Chinese characters could help us to understand Chinese culture better.

Evolution of Community Structure in Five Different Networks
In this section, we explore the relationship between the number of nodes in the maximum communities and the number of communities in the corresponding networks in order to investigate how community structure evolves, as shown in Figure 7.
Figure 7 displays this relationship: the number of communities gradually declines as the number of nodes increases. However, the rate of decline differs between data sets, and these different trends reflect the structure of the different complex networks. For example, the linear fit for the MATLAB help document is steep, which reveals that its communities are densely connected inside and sparsely connected outside. Meanwhile, the linear fit for yeast is relatively flat, which means that its communities are sparsely connected inside and densely connected outside. This linear correlation helps us to study the characteristics of the community structure of complex networks.
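The slope of such a linear relationship can be estimated with an ordinary least-squares fit, sketched below. The fitting method actually used for Figure 7 is not stated, so least squares is an assumption, as is the choice of which quantity goes on which axis; the numbers are made up purely for illustration and are not results from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two points and equal-length inputs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical measurements, one pair per reduced network:
# number of nodes in the maximum community, and number of communities.
max_community_nodes = [10, 20, 30, 40]
community_counts = [50, 40, 30, 20]
slope, intercept = fit_line(max_community_nodes, community_counts)
```

Fitting each data set separately in this way gives one slope per network, which is the kind of quantity that can be compared across the five data sets.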
We also explore the relationship between the number of nodes in the maximum communities and the number of edges in the corresponding maximum communities. Figure 8 displays this relationship for the five data sets. At the beginning, the number of edges increases rapidly as the number of nodes increases. However, once the number of nodes reaches a certain extent, the growth of the number of edges in the corresponding communities levels off. Figure 8 also shows that the five data sets differ: some curves become steep, such as that of the electronic collaboration networks, and others become flat, such as that of the peer-to-peer file sharing networks. Based on the above analysis, we conclude that complex networks have a sparsity feature.
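One way to quantify this sparsity, offered here only as an illustrative assumption since the text does not define a measure, is the link density of a community: the number of its internal edges divided by the maximum possible number for its size. If edges grow more slowly than the number of node pairs, this density falls as the community grows. The function name and the toy network below are hypothetical.

```python
def community_size_and_density(adj, community):
    """Return (nodes, internal edges, density) for one community of an undirected network."""
    members = set(community)
    n = len(members)
    # Each internal edge is seen from both endpoints, hence the division by two.
    m = sum(len(adj[u] & members) for u in members) // 2
    max_edges = n * (n - 1) // 2
    density = m / max_edges if max_edges else 0.0
    return n, m, density


# Hypothetical toy network: a triangle 1-2-3 followed by the chain 3-4-5.
toy_adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}

small = community_size_and_density(toy_adj, {1, 2, 3})
large = community_size_and_density(toy_adj, {1, 2, 3, 4, 5})
```

In this toy case the larger group has more edges in absolute terms but a lower density, which mirrors the flattening trend described for Figure 8.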

Conclusion
In this paper, we investigate the function of five different networks based on complex network theory.
First, we have presented the results of the analysis performed on the Chinese character network. The Chinese character network displays scale-free and hierarchical structure features, which are responsible for robustness. A group of Chinese characters tends to share the same radicals in order to communicate efficiently with the least effort. The most interesting phenomenon, that antonymous nodes emerge in the community structure, should be further investigated. Chinese characters have their special organizing principles; the radicals based on semantics show that they have intimate relations to nature, Chinese culture, and our life.
Second, the appearance of Chinese character radicals has not only enriched human language but also has an important effect on word association. It is well known that information in our brain is associative and is retrieved by connecting similar concepts. Our experiments draw attention to community structure as a valuable tool for understanding basic cognitive mechanisms and information retrieval processes. The features of community structure may be related to improving memory retention and recall, which is probably necessary for the brain to store and associate information.
Third, we study the evolution of community structure in five different networks. After fitting the number of nodes in the maximum communities against the number of communities in the corresponding networks, we observed a linear correlation between them. This linear correlation could be extended to other kinds of networks and used to predict their scale. In addition, the study of the relationship between the number of nodes in the maximum communities and the number of edges in the corresponding maximum communities revealed the sparse feature of community structure in networks and its generality across complex networks.
Although the resulting Chinese character networks display statistical features different from those of random networks, this does not rule out random factors. As in any complex adaptive system, random factors exist in the combination of Chinese characters. Even in scale-free networks, random attachment still plays an important role alongside preferential attachment. This problem will be considered in the future.
There is no shared definition of community, which is justified by the nature of the problem itself. What is more, networks in the real world are always dynamic. Most research on complex networks focuses on uncovering the hidden relations and features in real networks such as social networks. Therefore, extending our research to ever-changing dynamic networks will be an innovative and challenging topic in our future work. In addition, more complex and more realistic models should be considered.

Figure 5: The degree of each node is more than 83 in the community structure. The meanings of the nodes are listed in Table 4.

Figure 6: The degree of each node is more than 63 in the community structure. The meanings of the nodes are listed in Table 5.

Figure 7: The relationship between the number of nodes in the maximum communities and the number of communities in the corresponding networks is linear, but the slopes of the lines for the five data sets are different. Here k1, k2, k3, k4, and k5 represent the slopes of the five linear fits for the MATLAB help document, Chinese characters, yeast, electronic collaboration networks, and peer-to-peer file sharing networks, respectively.

Table 1: The five data sets, with the nodes and the existing edges in each data set.

Figure 3: Hierarchical structure of Chinese character network. The meanings of the nodes are listed in Table 2.

Table 2: Meanings of the Chinese characters that are nodes of Figure 3.

Figure 4: The degree of each node is more than 100 in the community structure. The meanings of the nodes are listed in Table 3.

Table 3: Meanings of the Chinese characters that are nodes of Figure 4.

Table 4: Meanings of the Chinese characters that are nodes of Figure 5.

Table 5: Meanings of the Chinese characters that are nodes of Figure 6.