On Measuring the Complexity of Networks: Kolmogorov Complexity versus Entropy

One of the most popular methods of estimating the complexity of networks is to measure the entropy of network invariants, such as adjacency matrices or degree sequences. Unfortunately, entropy and all entropy-based information-theoretic measures have several vulnerabilities. These measures are not independent of a particular representation of the network, and they cannot capture the properties of the generative process which produces the network. Instead, we advocate the use of algorithmic entropy as the basis for defining the complexity of networks. Algorithmic entropy (also known as Kolmogorov complexity, or K-complexity for short) evaluates the complexity of the description required for a lossless recreation of the network. This measure is not affected by a particular choice of network features and it does not depend on the method of network representation. We perform experiments on Shannon entropy and K-complexity for gradually evolving networks. The results of these experiments point to K-complexity as the more robust and reliable measure of network complexity. The original contribution of the paper includes the introduction of several new entropy-deceiving networks and the empirical comparison of entropy and K-complexity as fundamental quantities for constructing complexity measures for networks.


Introduction
Networks are becoming increasingly important in contemporary information science because they provide a holistic model for representing many real-world phenomena. The abundance of data on interactions within complex systems allows network science to describe, model, simulate, and predict behaviors and states of such complex systems. It is thus important to characterize networks in terms of their complexity, in order to adjust analytical methods to particular networks. The measure of network complexity is essential for numerous applications. For instance, the level of network complexity can determine the course of various processes happening within the network, such as information diffusion, failure propagation, actions related to control, or resilience preservation. Network complexity has been successfully used to investigate the structure of software libraries [1], to compute the properties of chemical structures [2], to assess the quality of business processes [3][4][5], and to provide general characterizations of networks [6,7].
Complex networks are ubiquitous in many areas of science, such as mathematics, biology, chemistry, systems engineering, physics, sociology, and computer science, to name a few. Yet the very notion of network complexity lacks a strict and agreed-upon definition. In general, a network is considered "complex" if it exhibits many features such as small diameter, high clustering coefficient, anticorrelation of node degrees, presence of network motifs, and modularity structures [8]. These features are common in real-world networks, but they rarely appear in artificial random networks. Finding a good metric with which one can estimate the complexity of a network is not a trivial task. A good complexity measure should not depend solely on the number of vertices and edges, but it must take into consideration topological characteristics of the network. In addition, complexity is not synonymous with randomness or unexpectedness. As has been pointed out [8], within the spectrum of possible networks, from the most ordered (cliques, paths, and stars) to the most disordered (random networks), complex networks occupy the very center of this spectrum. Finally, a good complexity measure should not depend on a particular network representation and should yield consistent results for various representations of the same network (adjacency matrix, Laplacian matrix, and degree sequence). Unfortunately, as current research suggests, finding a good complexity measure applicable to a wide variety of networks is very challenging [9][10][11].
Among many possible measures which can be used to define the complexity of networks, the entropy of various network invariants has been by far the most popular choice. Network invariants considered for defining entropy-based complexity measures include number of vertices, number of neighbors, number of neighbors at a given distance [12], distance between vertices [13], energy of network matrices such as Randić matrix [14] or Laplacian matrix [15], and degree sequences. There are multiple definitions of entropies, usually broadly categorized into three families: thermodynamic entropies, statistical entropies, and information-theoretic entropies. In the field of computer science, information-theoretic measures are the most prevalent, and they include Shannon entropy [16], Kolmogorov-Sinai entropy [17], and Rényi entropy [18]. These entropies are based on the concept of the information content of a system and they measure the amount of information required to transmit the description of an object. The underlying assumption of using information-theoretic definitions of entropy is that uncertainty (as measured by entropy) is a nondecreasing function of the amount of available information. In other words, systems in which little information is available are characterized by low entropy and therefore are considered to be "simple." The first idea to use entropy to quantify the complexity of networks comes from Mowshowitz [19].
Despite the ubiquity of general-purpose entropy definitions, many researchers have developed specialized entropy definitions aimed at describing the structure of networks [10]. Notable examples of such definitions include the proposal by Ji et al. to measure the unexpectedness of a particular network by comparing it to the number of possible network configurations available for a given set of parameters [20]. This concept is clearly inspired by algorithmic entropy, which defines the complexity of a system not in terms of its information content, but in terms of its generative process. A different approach to measuring the entropy of networks has been introduced by Dehmer in the form of the information functional [21]. The information functional can also be used to quantify network entropy in terms of $k$-neighborhoods of vertices [12,13] or independent sets of vertices [22]. Yet another approach to network entropy has been proposed by Körner, who advocates the use of stable sets of vertices as the basis to compute network entropy [23]. Several comprehensive surveys of network entropy applications are also available [9,11].
Within the realm of information science, the complexity of a system is most often associated with the number of possible interactions between elements of the system. Complex systems evolve over time, are sensitive to even minor perturbations at the initial steps of development, and often involve nontrivial relationships between constituent elements. Systems exhibiting a high degree of interconnectedness in their structure and/or behavior are commonly thought to be difficult to describe and predict, and, as a consequence, such systems are considered to be "complex." Another possible interpretation of the term "complex" relates to the size of the system. In the case of networks, one might consider using the number of vertices and edges to estimate the complexity of a network. However, the size of the network is not a good indicator of its complexity, because networks which have well-defined structures and behaviors are, in general, computationally simple.
In this work, we do not introduce a new complexity measure, nor do we propose new information functionals or network invariants on which an entropy-based complexity measure could be defined. Rather, we follow the observations formulated in [24] and we present a criticism of entropy as the guiding principle of complexity measure construction. Thus, we do not use any specific formal definition of complexity, but we provide additional arguments why entropy may be easily deceived when trying to evaluate the complexity of a network. Our main hypothesis is that algorithmic entropy, also known as Kolmogorov complexity, is superior to traditional Shannon entropy because algorithmic entropy is more robust, less dependent on the network representation, and better aligned with intuitive human understanding of complexity.
The organization of the paper is the following. In Section 2, we introduce basic definitions related to entropy and we formulate arguments against the use of entropy as the complexity measure of networks. Section 2.3 presents several examples of entropy-deceiving networks, which provide both motivation and anecdotal evidence for our hypothesis. In Section 3, we introduce Kolmogorov complexity and we show how this measure can be applied to networks, despite its high computational cost. The results of the experimental comparison of entropy and Kolmogorov complexity are presented in Section 4. The paper concludes in Section 5 with a brief summary and future work agenda.

Entropy as the Measure of Network Complexity
Basic Definitions.
Let us introduce basic definitions and notation used throughout the remainder of this paper. A network is an ordered pair $G = \langle V, E \rangle$, where $V = \{v_1, \ldots, v_N\}$ is the set of vertices and $E = \{(v_i, v_j) \in V \times V\}$ is the set of edges. The degree $d(v_i)$ of the vertex $v_i$ is the number of vertices adjacent to it, $d(v_i) = |\{v_j : (v_i, v_j) \in E\}|$. A given network can be represented in many ways, for instance, using an adjacency matrix defined as
$$A_{N \times N}[i, j] = \begin{cases} 1 & \text{if } (v_i, v_j) \in E, \\ 0 & \text{otherwise.} \end{cases}$$
An alternative to the adjacency matrix is the Laplacian matrix of the network defined as
$$L_{N \times N}[i, j] = \begin{cases} d(v_i) & \text{if } i = j, \\ -1 & \text{if } i \neq j \text{ and } (v_i, v_j) \in E, \\ 0 & \text{otherwise.} \end{cases}$$
Other popular representations of networks include the degree list defined as $D = \langle d(v_1), d(v_2), \ldots, d(v_N) \rangle$ and the degree distribution defined as
$$P(d_i) = p(d(v_j) = d_i) = \frac{|\{v_j \in V : d(v_j) = d_i\}|}{N}.$$
Although there are numerous different definitions of entropy, in this work we focus on the definition most commonly used in information sciences, the Shannon entropy [16]. This measure represents the amount of information required to provide the statistical description of the network. Given any discrete random variable $X$ with $n$ possible outcomes, the Shannon entropy $H(X)$ of the variable $X$ is defined as a function of the probabilities $p(x_i)$ of all outcomes of $X$:
$$H(X) = -\sum_{i=1}^{n} p(x_i) \log_b p(x_i).$$
Depending on the selected base of the logarithm, the entropy is expressed in bits ($b = 2$), nats ($b = e$), or dits ($b = 10$) (bits are also known as shannons, and dits are also known as hartleys). The above definition applies to discrete random variables; for random variables with continuous probability distributions differential entropy is used, usually along with the limiting density of discrete points. Given a variable $X$ with $n$ possible discrete outcomes such that in the limit $n \to \infty$ the density of $X$ approaches the invariant measure $m(x)$, the continuous entropy is given by
$$\lim_{n \to \infty} H(X) = -\int p(x) \log \frac{p(x)}{m(x)} \, dx.$$
In this work, we are interested in measuring the entropy of various network invariants. These invariants can be regarded as discrete random variables with the number of possible outcomes bounded by the size of the available alphabet, either binary (in the case of adjacency matrices) or decimal (in the case of other invariants). Consider the 3-regular graph presented in Figure 1. This graph can be described by its adjacency matrix. This matrix, in turn, can be flattened to a vector (either row-wise or column-wise), and this vector can be treated as a random variable with two possible outcomes, 0 and 1. Counting the number of occurrences of these outcomes, we arrive at the random variable $X = \{x_0 = 0.7, x_1 = 0.3\}$ and its entropy $H(X) = 0.88$. Alternatively, this graph can be described using the degree list $D = \langle 3, 3, 3, 3, 3, 3, 3, 3, 3, 3 \rangle$, which can be treated as a random variable with the entropy $H(D) = 0$. Yet another possible random variable that can be derived from this graph is the degree distribution $P = \{d_0 = 0, d_1 = 0, d_2 = 0, d_3 = 1\}$ with the entropy $H(P) = 0$. In summary, any network invariant can be used to extract a random variable and compute its entropy.
Thus, in the remainder of the paper, whenever mentioning entropy, we will refer to the entropy of a discrete random variable. In general, the higher the randomness, the greater the entropy. The value of entropy is maximal for a random variable with the uniform distribution and the minimum value of entropy is attained by a constant random variable. This kind of entropy will be further explored in this paper in order to reveal its weaknesses.
As an alternative to Shannon entropy, we advocate the use of Kolmogorov complexity. We postpone the discussion of Kolmogorov complexity to Section 3, where we provide both its definition and the method to approximate this incomputable measure. For the sake of brevity, in the remainder of this paper, we will use the term "entropy" to refer to Shannon entropy and the term "K-complexity" to refer to Kolmogorov complexity.
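The entropy values quoted for the 3-regular example can be checked with a few lines of code. The sketch below is only an illustration: it assumes the Petersen graph as a stand-in for a 3-regular graph on 10 vertices (the graph of Figure 1 may differ, but any 3-regular graph on 10 vertices yields the same frequencies), and the helper name `shannon_entropy` is hypothetical.

```python
from collections import Counter
from math import log2

def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return sum(-(c / total) * log2(c / total) for c in counts.values())

# Assumption: the Petersen graph stands in for a 3-regular graph on 10 vertices.
edges = [(i, (i + 1) % 5) for i in range(5)]            # outer 5-cycle
edges += [(i, i + 5) for i in range(5)]                 # spokes
edges += [(5 + i, 5 + (i + 2) % 5) for i in range(5)]   # inner pentagram

n = 10
adjacency = [[0] * n for _ in range(n)]
for u, v in edges:
    adjacency[u][v] = adjacency[v][u] = 1

flattened = [cell for row in adjacency for cell in row]   # row-wise flattening
degree_list = [sum(row) for row in adjacency]

h_adjacency = shannon_entropy(flattened)    # 30 ones among 100 cells
h_degrees = shannon_entropy(degree_list)    # every vertex has degree 3
print(round(h_adjacency, 2), h_degrees)
```

The same network thus yields roughly 0.88 bits from one representation and 0 bits from another, which is the representation dependence discussed in the following sections.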

Why Is Entropy a Bad Measure of Network Complexity.
Zenil et al. [24] argue that entropy is not appropriate to measure the true complexity of a network and they present several examples of networks which should not qualify as complex (using the colloquial understanding of the term), yet which attain maximum entropy of various network invariants. We follow the line of argumentation of Zenil et al., and we present more examples of entropy-deceiving networks. Our main aim is to show that it is relatively easy to construct a network which achieves high values of entropy of various network invariants. Examples presented in this section outline the main problem with using entropy as the basis for complexity measure construction: namely, that entropy is not aligned with intuitive human understanding of complexity. Statistical randomness, as measured by entropy, does not imply complexity in a useful, operational way.
The main reason why entropy and other entropy-related information-theoretic measures fail to correctly describe the complexity of a network is the fact that these measures are not independent of the network representation. As a matter of fact, this remark applies equally to all computable measures of network complexity. It is quite easy to present examples of two equivalent lossless descriptions of the same network having very different entropy values, as we will show in Section 2.3. In this paper, we experiment with four different representations of networks: adjacency matrices, Laplacian matrices, degree lists, and degree distributions. We show empirically that the choice of a particular representation of the network strongly influences the resulting entropy estimation.
Another property which makes entropy a questionable measure of network complexity is the fact that entropy cannot be applied to several network features at the same time, but it operates on a single feature, for example, degree or betweenness. In theory, one could devise a function which would be a composition of individual features, but high complexity of the composition does not imply high complexity of all its components and vice versa. This requirement to select a particular feature and compute its probability distribution disqualifies entropy as a universal and independent measure of complexity.
In addition, an often forgotten aspect of entropy is the fact that measuring entropy requires making an arbitrary choice regarding the aggregation level of the variable for which entropy is computed. Consider the network presented in Figure 2. At first glance, this network seems to be fairly random. The density of the network is 0.35 and its entropy computed over the adjacency matrix is 0.92 bits. However, this network has been generated using a very simple procedure. We begin with an initial $3 \times 3$ binary matrix. Next, we create 64 copies of this matrix, and each of these copies is randomly transposed. Finally, we bind all these matrices together to form a square matrix $M_{24 \times 24}$ and we use it as the adjacency matrix to create the network. So, if we were to coalesce the adjacency matrix into $3 \times 3$ blocks, the entropy of the adjacency matrix would be 0, since all constituent blocks are the same. It would mean that the network is actually deterministic and its complexity is minimal. On the other hand, it should be noted that this shortcoming of entropy can be circumvented by using the entropy rate ($n$-gram entropy) instead, because entropy rate calculates the entropy for all possible levels of granularity of a variable. Given a random variable $X = \langle x_1, x_2, \ldots, x_k \rangle$, let $p(x_i, x_{i+1}, \ldots, x_{i+n})$ denote the joint probability over $n$ consecutive values of $X$. Entropy rate $H_n(X)$ of a sequence of $n$ consecutive values of $X$ is defined as
$$H_n(X) = -\frac{1}{n} \sum p(x_i, \ldots, x_{i+n}) \log_2 p(x_i, \ldots, x_{i+n}).$$
Entropy rate of the variable $X$ is simply the limit of the above estimation for $n \to \infty$.
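The effect of the aggregation level can be reproduced on a small scale. The following sketch is an illustration under stated assumptions rather than the construction behind Figure 2: the seed block is hypothetical (the initial matrix is not available here) and is chosen symmetric, so that transposing a copy leaves it unchanged; the helper names `entropy` and `blocks` are likewise hypothetical.

```python
from collections import Counter
from math import log2

def entropy(items):
    """Shannon entropy (bits) of the empirical distribution of hashable items."""
    counts = Counter(items)
    total = sum(counts.values())
    return sum(-(c / total) * log2(c / total) for c in counts.values())

def blocks(matrix, size):
    """Cut a square matrix into non-overlapping size x size blocks (as tuples)."""
    n = len(matrix)
    return [
        tuple(tuple(matrix[r + i][c:c + size]) for i in range(size))
        for r in range(0, n, size)
        for c in range(0, n, size)
    ]

# Hypothetical symmetric 3x3 seed; transposition is a no-op for this block.
seed = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]

# Tile 8 x 8 = 64 copies of the seed into a 24 x 24 matrix.
matrix = [[seed[r % 3][c % 3] for c in range(24)] for r in range(24)]

cell_entropy = entropy(cell for row in matrix for cell in row)   # cell-level view
block_entropy = entropy(blocks(matrix, 3))                       # 3x3-block view
print(round(cell_entropy, 3), block_entropy)
```

At the level of single cells the matrix looks close to maximally uncertain, while at the level of $3 \times 3$ blocks the entropy collapses to zero because only one block type occurs.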

Entropy-Deceiving Networks.
In this section, we present four different examples of entropy-deceiving networks, similar to the idea coined in [24]. Each of these networks has a simple generative procedure and should not (intuitively) be treated as complex. However, if entropy were used to construct a complexity measure, these networks would qualify as complex. The examples given in this section disregard any specific definition of complexity; their aim is to outline the main shortcomings of entropy as the basis for any complexity measure construction.

Degree Sequence Network.
The degree sequence network is an example of a network which has an interesting property: there are exactly two vertices for each degree value $1, 2, \ldots, N/2$, where $N = |V|$.
The procedure to generate the degree sequence network is very simple. First, we create a linked list of all $N$ vertices, for which ( [...] The resulting network is presented in Figure 3. It is very regular, with a uniform distribution of vertex degrees, due to its straightforward generation procedure. However, if one examines the entropy of the degree sequence, this entropy is maximal for a given number $N$ of vertices, suggesting far greater randomness of such a network. This example shows that the entropy of the degree sequence (and the entropy of the degree distribution) can be very misleading when trying to evaluate the true complexity of a network.
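As a short worked example, assume that entropy is computed over the empirical distribution of degree values and that $N$ is even. Each of the $N/2$ degree values is then taken by exactly two of the $N$ vertices, so every value has probability $2/N$ and the entropy equals that of a uniform distribution over $N/2$ outcomes:

```latex
H(D) = -\sum_{i=1}^{N/2} \frac{2}{N} \log_2 \frac{2}{N}
     = -\log_2 \frac{2}{N}
     = \log_2 \frac{N}{2}
```

For $N = 100$ this gives $\log_2 50 \approx 5.64$ bits, the largest value attainable by any distribution over 50 distinct degree values.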

Copeland-Erdös Network.
The Copeland-Erdös network is a network which seems to be completely random, despite the fact that the procedure of its generation is deterministic. The Copeland-Erdös constant is a constant which is produced by concatenating "0." with the sequence of consecutive prime numbers [25]. When prime numbers are expressed in base 10, the Copeland-Erdös constant is a normal number; that is, its infinite sequence of digits is uniformly distributed (the normality of the Copeland-Erdös constant in bases other than 10 is not proven). This fact allows us to devise the following simple generative procedure for a network. Given the number of vertices $N$, take the first $N^2$ digits of the Copeland-Erdös constant and represent them as a matrix of size $N \times N$. Next, binarize each value in the matrix using the function $f(x) = x \operatorname{div} 5$ (integer division) and use it as the adjacency matrix to create a network. Since each digit in the matrix is approximately equally likely, the resulting binary matrix will have approximately the same number of 0's and 1's. An example of the Copeland-Erdös network is presented in Figure 4. The entropy of the adjacency matrix is maximal for a given number $N$ of vertices; furthermore, the network may seem to be random and complex, but its generative procedure, as we can see, is very simple.
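The generative procedure described above can be sketched in a few lines. This is a minimal illustration, not the code used for Figure 4: the function names are hypothetical, primes are produced by simple trial division, and the matrix is used exactly as the digits fall (no symmetrization or clearing of the diagonal is attempted, since the text does not specify either).

```python
from itertools import count

def primes():
    """Yield primes in increasing order by trial division (fine for small inputs)."""
    found = []
    for candidate in count(2):
        if all(candidate % p for p in found if p * p <= candidate):
            found.append(candidate)
            yield candidate

def copeland_erdos_digits(how_many):
    """First `how_many` decimal digits after the point of 0.235711131719..."""
    digits = []
    for prime in primes():
        digits.extend(int(ch) for ch in str(prime))
        if len(digits) >= how_many:
            return digits[:how_many]

def copeland_erdos_adjacency(n):
    """n x n binary matrix: digits 0-4 map to 0, digits 5-9 map to 1."""
    bits = [d // 5 for d in copeland_erdos_digits(n * n)]
    return [bits[row * n:(row + 1) * n] for row in range(n)]

matrix = copeland_erdos_adjacency(30)
ones = sum(map(sum, matrix))
print(f"fraction of ones: {ones / 900:.3f}")
```

The whole network is determined by the single parameter `n`, which is the point of the example: a description of a few lines suffices to regenerate a matrix that looks statistically random.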

2-Clique Network.
The 2-clique network is an artificial example of a network in which the entropy of the adjacency matrix is maximal. The procedure to generate this network is as follows. We begin with two connected vertices labeled red and blue. We add red and blue vertices alternately, each time connecting the newly added vertex with all other vertices of the same color. As a result, two cliques appear (see Figure 5). Since there are as many red vertices as there are blue vertices, the adjacency matrix contains the same number of 0's and 1's (not considering the 1 representing the bridge edge between cliques). So, the entropy of the adjacency matrix is close to maximal, although the structure of the network is trivial.
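A minimal sketch of this construction follows, under the assumption that vertices are added in the order red, blue, red, blue, so that even indices are red and odd indices are blue; the function names are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(items):
    """Shannon entropy (bits) of the empirical distribution of items."""
    counts = Counter(items)
    total = sum(counts.values())
    return sum(-(c / total) * log2(c / total) for c in counts.values())

def two_clique_adjacency(per_color):
    """Adjacency matrix of two equally sized cliques joined by one bridge edge."""
    n = 2 * per_color
    adjacency = [[0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            if u != v and u % 2 == v % 2:   # same color: even = red, odd = blue
                adjacency[u][v] = 1
    adjacency[0][1] = adjacency[1][0] = 1   # bridge between the two initial vertices
    return adjacency

matrix = two_clique_adjacency(50)
cells = [cell for row in matrix for cell in row]
ones = sum(cells)
h = entropy(cells)
print(ones, round(h, 4))
```

With 50 vertices per color, 4902 of the 10000 cells are ones, so the entropy of the flattened matrix is within a fraction of a percent of 1 bit even though the network is fully described by a single number.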

Ouroboros Network.
The Ouroboros network (Ouroboros is an ancient symbol of a serpent eating its own tail, appearing first in Egyptian iconography and then gaining notoriety in later magical traditions) is another example of an entropy-deceiving network. The procedure to generate this network is very simple: for a given number $N$ of vertices, we create two closed rings, each consisting of $N/2$ vertices, and we connect corresponding vertices of the two rings. Finally, we break a single edge in one ring and we put a single vertex at the end of the broken edge. The result of this procedure can be seen in Figure 6. Interestingly, even though almost all vertices in this network have an equal degree of 3, each vertex has a different betweenness. Thus, the entropy of the betweenness sequence is maximal, suggesting a very complex pattern of communication pathways through the network. Obviously, this network is very simple from the communication point of view and should not be considered complex.

K-Complexity as the Measure of Network Complexity
We strongly believe that Kolmogorov complexity (K-complexity) is a much more reliable and robust basis for constructing the complexity measure for compound objects, such as networks. Although inherently incomputable, K-complexity can be approximated to a degree which allows for its practical use in real-world applications, for instance, in machine learning [26,27], computer network management [28], and general computation theory (proving lower bounds of various Turing machines, combinatorics, formal languages, and inductive inference) [29].
Let us now introduce the formal framework for K-complexity and its approximation. Note that entropy is defined for any random variable, whereas K-complexity is defined for strings of characters only. K-complexity $K_T(s)$ of a string $s$ is formally defined as
$$K_T(s) = \min \{ |P| : T(P) = s \},$$
where $P$ is a program which produces the string $s$ when run on a universal Turing machine $T$ and $|P|$ is the length of the program $P$, that is, the number of bits required to represent $P$. Unfortunately, K-complexity is incomputable [30], or more precisely, it is upper semicomputable (only the upper bound of the value of K-complexity can be computed for a given string $s$). One way of approximating the true value of $K_T(s)$ is to use the notion of algorithmic probability introduced by Solomonoff and Levin [31,32]. Algorithmic probability $m_T(s)$ of a string $s$ is defined as the expected probability that a random program $P$ running on a universal Turing machine $T$ with the binary alphabet produces the string $s$ upon halting:
$$m_T(s) = \sum_{P : T(P) = s} 2^{-|P|}.$$
Of course there are $2^{|P|}$ possible programs of the length $|P|$, and the summation is performed over all possible programs without limiting their length, which makes algorithmic probability $m_T(s)$ a semimeasure which itself is incomputable. Nevertheless, algorithmic probability can be used to calculate K-complexity using the Coding Theorem [31], which states that algorithmic probability approximates K-complexity up to a constant $c$:
$$\left| -\log_2 m_T(s) - K_T(s) \right| \le c.$$
The consequence of the Coding Theorem is that it associates the frequency of occurrence of the string $s$ with its complexity. In other words, if a particular string $s$ can be generated by many different programs, it is considered "simple." On the other hand, if a very specific program is required to produce the given string $s$, this string can be regarded as "complex." The Coding Theorem also implies that K-complexity of a string $s$ can be approximated from its frequency using the formula
$$K_T(s) \approx -\log_2 m_T(s).$$
This formula has inspired the Algorithmic Nature Lab group (https://www.algorithmicnaturelab.org) to develop the CTM (Coding Theorem Method), a method to approximate K-complexity by counting output frequencies of small Turing machines. Clearly, algorithmic probability of the string $s$ cannot be computed exactly, because the formula for algorithmic probability requires finding all possible programs that produce the string $s$. Nonetheless, for a limited subset of Turing machines it is possible to count the number of machines that produce the given string $s$, and this is the trick behind the CTM. In broad terms, the CTM for a string $s$ consists in computing the following function:
$$D(n, m, s) = \frac{|\{ T \in \mathcal{T}(n, m) : T \text{ produces } s \}|}{|\{ T \in \mathcal{T}(n, m) : T \text{ halts} \}|},$$
where $\mathcal{T}(n, m)$ is the space of all Turing machines with $n$ states and $m$ symbols. Function $D(n, m, s)$ computes the ratio of all halting machines with $n$ states and $m$ symbols which produce the string $s$, and its value is determined with the help of known values of the famous Busy Beaver function [33]. The Algorithmic Nature Lab group has gathered statistics on almost 5 million short strings (maximum length is 12 characters) produced by Turing machines with alphabets ranging from 2 to 9 symbols, and based on these statistics the CTM can approximate the algorithmic probability of a given string. A detailed description of the CTM can be found in [34]. Since the function $D(n, m, s)$ is an approximation of the true algorithmic probability $m_T(s)$, it can also be used to approximate K-complexity of the string $s$.
The CTM can be applied only to short strings consisting of 12 characters or less. For larger strings and matrices, the BDM (Block Decomposition Method) should be used. The BDM requires the decomposition of the string $s$ into (possibly overlapping) blocks $\{b_1, b_2, \ldots, b_k\}$. Given a long string $s$, the BDM computes its complexity estimate as
$$\mathrm{BDM}(s) = \sum_{i=1}^{k} \left( \mathrm{CTM}(b_i) + \log_2 |b_i| \right),$$
where $\mathrm{CTM}(b_i)$ is the algorithmic complexity of the block $b_i$ and $|b_i|$ denotes the number of times the block $b_i$ appears in $s$.
A detailed description of the BDM can be found in [35]. Obviously, any representation of a nontrivial network requires far more than 12 characters. Consider once again the 3-regular graph presented in Figure 1 and its Laplacian matrix representation. If we treat each row of the Laplacian matrix as a separate block, the string representation of the Laplacian matrix becomes $s = \{b_1 = 3010100010, b_2 = 0300100110, \ldots, b_{10} = 0000101013\}$ (for the sake of simplicity, we have replaced the symbol "−1" with the symbol "1"). This input can be fed into the BDM, producing the final estimation of the algorithmic probability (and, consequently, the estimation of the K-complexity) of the string representation of the Laplacian matrix. In our experiments, whenever reporting the values of K-complexity of the string $s$, we actually report the value of $\mathrm{BDM}(s)$ as the approximation of the true K-complexity.
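The counting idea behind the CTM can be illustrated in miniature. The sketch below is written in the spirit of the method and is not the procedure of [34]: it enumerates all two-state, two-symbol machines under an assumed formalism (each table entry either writes, moves, and changes state, or writes and halts), runs each on a blank tape with an assumed step cap instead of Busy Beaver values, and converts output frequencies into complexity estimates via the Coding Theorem. All names are hypothetical.

```python
from collections import Counter
from itertools import product
from math import log2

STATES = 2       # non-halting states, numbered 1..STATES; 0 is the halting state
STEP_CAP = 50    # assumed cutoff: machines still running are counted as non-halting

def actions():
    """All possible table entries, each a triple (write, move, next_state)."""
    moving = [(w, m, s) for w in (0, 1) for m in (-1, 1)
              for s in range(1, STATES + 1)]
    halting = [(w, 0, 0) for w in (0, 1)]
    return moving + halting

def run(table):
    """Run one machine on a blank tape; return the visited tape, or None."""
    tape, head, state = {}, 0, 1
    for _ in range(STEP_CAP):
        write, move, state = table[(state, tape.get(head, 0))]
        tape[head] = write
        head += move
        if state == 0:
            lo, hi = min(tape), max(tape)
            return "".join(str(tape.get(i, 0)) for i in range(lo, hi + 1))
    return None

keys = [(s, r) for s in range(1, STATES + 1) for r in (0, 1)]
outputs = Counter()
for entries in product(actions(), repeat=len(keys)):
    result = run(dict(zip(keys, entries)))
    if result is not None:
        outputs[result] += 1

halting_total = sum(outputs.values())
D = {s: c / halting_total for s, c in outputs.items()}   # output frequencies
K_estimate = {s: -log2(p) for s, p in D.items()}         # Coding Theorem estimate

for s, k in sorted(K_estimate.items(), key=lambda item: item[1])[:5]:
    print(s, round(k, 2))
```

Strings produced by many machines receive low estimates and rarely produced strings receive high ones, which is the qualitative behavior the Coding Theorem predicts; the numerical values depend on the assumed formalism and cap and should not be compared with published CTM tables.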
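The aggregation step of the BDM can be sketched as follows. The sketch assumes that the sum runs over distinct blocks, with each block's CTM value counted once and its multiplicity entering through the logarithm. The function `toy_ctm` is a placeholder invented purely for illustration; it does not return real CTM values, which in practice would come from precomputed tables such as those shipped with the acss package.

```python
from collections import Counter
from math import log2

def bdm(blocks, ctm):
    """Sum CTM(b) + log2(multiplicity of b) over the distinct blocks b."""
    multiplicity = Counter(blocks)
    return sum(ctm(block) + log2(times) for block, times in multiplicity.items())

def toy_ctm(block):
    """Placeholder, NOT a real CTM value: counts runs of identical symbols."""
    return 1 + sum(a != b for a, b in zip(block, block[1:]))

# The three Laplacian rows quoted in the text, used as blocks.
rows = ["3010100010", "0300100110", "0000101013"]
estimate = bdm(rows, toy_ctm)
print(estimate)

# A repeated block contributes its CTM value once plus log2 of its count.
repeated = bdm(["ab", "ab", "cd"], lambda block: 1.0)
print(repeated)
```

The logarithmic term is what keeps a string made of many copies of one block cheap: repeating a block a thousand times adds only about ten bits to the estimate.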

Gradual Change of Networks.
As we have stated before, the aim of this research is not to propose a new complexity measure for networks, but to compare the usefulness and robustness of entropy versus K-complexity as the underlying foundations for complexity measures. Let us recall what properties are expected from a good and reliable complexity measure for networks. Firstly, the measure should not depend on the particular network representation but should yield more or less consistent results for all possible lossless representations of a network. Secondly, the measure should not equate complexity with randomness. Thirdly, the measure should take into consideration topological properties of a network and not be limited to simple counting of the number of vertices and edges. Of course, statistical properties of a given network will vary significantly between different network invariants, but at the base level of network representation the quantity used to define the complexity measure should fulfill the above requirements. The main question that we are aiming to answer in this study is whether there are qualitative differences between entropy and K-complexity with regard to the above-mentioned requirements when measuring various types of networks.
In order to answer this question we have to measure how a change in the underlying network structure affects the observed values of entropy and K-complexity. To this end, we have devised two scenarios. In the first scenario, the network gradually transforms from the perfectly ordered state to a completely random state. The second transformation brings the network from the perfectly ordered state to a state which can be understood as semiordered, albeit in a different way. The following sections present both scenarios in detail.

From Watts-Strogatz Small-World Model to Erdös-Rényi Random Network Model.
A small-world network model introduced by Watts and Strogatz [36] is based on a process which transforms a fully ordered network with no random edge rewiring into a random network. According to the small-world model, vertices of the network are placed on a regular $k$-dimensional grid and each vertex is connected to exactly $m$ of its nearest neighbors, producing a regular lattice of vertices with equal degrees. Then, with a small probability $p$, each edge is randomly rewired. If $p = 0$, no rewiring occurs and the network is fully ordered. All vertices have the same degree and the same betweenness, and the entropy of the adjacency matrix depends only on the density of edges. When $p > 0$, edge rewiring is applied to edges and this process distorts the degree distribution of vertices.
On the other end of the network spectrum is the Erdös-Rényi random network model [37], in which there is no inherent pattern of connectivity between vertices. The random network emerges by selecting all possible pairs of vertices and creating, for each pair, an edge with the probability $p$. Alternatively, one can generate all possible networks consisting of $N$ vertices and $M$ edges and then randomly pick one of these networks. The construction of the random network implies the highest degree of randomness, and there is no other way of describing a particular instance of such a network than by explicitly providing its adjacency matrix or its Laplacian matrix.
In our first experiment, we observe the behavior of entropy and K-complexity when applied to gradually changing networks. We begin with a regular small-world network generated for $p = 0$. Next, we iteratively increase the value of $p$ by 0.01 in each step, until $p = 1$. We retain the network between iterations, so conceptually it is one network undergoing the transition. Also, we only consider rewiring of edges which have not been rewired during preceding iterations, so every edge is rewired at most once. For $p = 0$, the network forms a regular lattice of vertices, and for $p = 1$ the network is fully random with all edges rewired. While randomly rewiring edges, we do not impose any preference on the selection of the target vertex of the edge being currently rewired; that is, each vertex has a uniform probability of being selected as the target vertex of rewiring.

From Watts-Strogatz Small-World Model to Barabási-Albert Preferential Attachment Model.
Another popular model of artificial network generation has been introduced by Barabási and Albert [38]. This network model is based on the phenomenon of preferential attachment, according to which vertices appear consecutively in the network and tend to join existing vertices with a strong preference for high-degree vertices. The probability of selecting vertex $v_i$ as the target of a newly created edge is proportional to $v_i$'s degree $d(v_i)$. Scale-free networks have many interesting properties [39,40], but from our point of view the most interesting aspect of scale-free networks is the fact that they represent a particular type of semiorder. The behavior of low-degree vertices is chaotic and random, and individual vertices are difficult to distinguish, but the structure of high-degree vertices (so-called hubs) imposes a well-defined topology on the network. High-degree vertices serve as bridges which facilitate communication between remote parts of the network, and their degrees are highly predictable. In other words, although a vast majority of vertices behave randomly, order appears as soon as high-degree vertices emerge in the network.
In our second experiment, we start from a small-world network and we increment the edge rewiring probability $p$ in each step. This time, however, we do not select the new target vertex randomly, but we use the preferential attachment principle. In the early steps, this process is still random as the differences in vertex degrees are relatively small, but at a certain point the scale-free structure emerges and as more rewiring occurs (for $p \to 1$), the network starts organizing around a subset of high-degree hubs. The intuition is that a good measure of network complexity should be able to distinguish between the initial phase of increasing the randomness of the network and the second phase where the semiorder appears.
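Both scenarios can be sketched with one routine that differs only in how the new target vertex is chosen. The sketch below is an illustration under explicit assumptions, not the code used in the experiments: the lattice is a one-dimensional ring, the rewiring probability is interpreted as the fraction of original edges rewired so far (so every edge is rewired at most once and all edges are rewired at the final step), and the function names and parameters are hypothetical.

```python
import random

def edge(u, v):
    """Canonical (sorted) representation of an undirected edge."""
    return (u, v) if u < v else (v, u)

def ring_lattice(n, k):
    """Ring of n vertices, each linked to its k nearest neighbours (k even)."""
    return {edge(v, (v + offset) % n)
            for v in range(n) for offset in range(1, k // 2 + 1)}

def gradual_rewiring(n, k, steps=100, preferential=False, seed=0):
    """Yield (p, edges) while one network moves from a lattice to disorder."""
    rng = random.Random(seed)
    edges = ring_lattice(n, k)
    degree = [k] * n
    pending = sorted(edges)          # original edges not rewired yet
    rng.shuffle(pending)
    total, done = len(pending), 0
    for step in range(steps + 1):
        quota = round(total * step / steps)   # assumption: fraction p rewired
        while done < quota:
            u, v = pending.pop()
            done += 1
            candidates = [w for w in range(n)
                          if w != u and edge(u, w) not in edges]
            if not candidates:
                continue                      # vertex u is saturated; keep edge
            edges.remove((u, v))
            degree[v] -= 1
            weights = [degree[w] for w in candidates] if preferential else None
            if weights is not None and sum(weights) == 0:
                weights = None                # fall back to a uniform choice
            w = rng.choices(candidates, weights=weights)[0]
            edges.add(edge(u, w))
            degree[w] += 1
        yield step / steps, set(edges)

uniform_run = list(gradual_rewiring(20, 4, steps=10))
hub_run = list(gradual_rewiring(20, 4, steps=10, preferential=True))
print(len(uniform_run), len(uniform_run[-1][1]), len(hub_run[-1][1]))
```

Each yielded snapshot can then be converted to an adjacency matrix, Laplacian matrix, degree list, or degree distribution before entropy and K-complexity are computed.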

Results and Discussion
We experiment only on artificially generated networks, using three popular network models: the Erdős-Rényi random network model, the Watts-Strogatz small-world network model, and the Barabási-Albert scale-free network model. We have purposefully left out empirical networks from consideration, due to a possible bias which might have been introduced. Unfortunately, for empirical networks, we do not have a good method of approximating the algorithmic probability of a network. All we could do is compare empirical distributions of network properties (such as degree, betweenness, and local clustering coefficient) with distributions from known generative models. In our previous work [41], we have shown that this approach can lead to severe approximation errors, as distributions of network properties strongly depend on values of model parameters (such as the edge rewiring probability in the small-world model, or the power-law coefficient in the scale-free model). Without a universal method of estimating the algorithmic probability of empirical networks, it is pointless to compare entropy and K-complexity of such networks, since no baseline can be established and the results would not lend themselves to interpretation.
In our experiments, we have used the acss R package [42], which implements the Coding Theorem Method [34,43] and the Block Decomposition Method [35].
Let us now present the results of the first experiment. In this experiment, the edge rewiring probability p changes from 0 to 1 by 0.01 in each iteration. In each iteration, we generate 50 instances of the network consisting of n = 100 vertices, and for each generated network instance, we compute the following measures:
(i) Entropy and K-complexity of the adjacency matrix
(ii) Entropy and K-complexity of the Laplacian matrix
(iii) Entropy and K-complexity of the degree list
(iv) Entropy and K-complexity of the degree distribution
We repeat the experiments described in Section 4.1 for each of the 50 networks, performing the gradual change of each of these networks, and for each value of the edge rewiring probability p we average the results over all 50 networks. Since entropy and K-complexity are expressed in different units, we normalize both measures to allow for side-by-side comparison. The normalization procedure works as follows. For a given string of characters s with the length l = |s|, we generate two strings. The first string s_min consists of l repeated 0's and it represents the least complex string of the length l. The second string s_max is a concatenation of l uniformly selected digits and it represents the most complex string of the length l. Each value of entropy and K-complexity is normalized with respect to the minimum and maximum value of entropy and K-complexity possible for a string of equal length. This allows us not only to compare entropy and complexity between different representations of networks, but also to compare entropy to K-complexity directly. The results of our experiments are presented in Figure 7.
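The normalization step can be sketched in Python as follows. The text states only that values are normalized with respect to the minimum and maximum attainable for a string of equal length; the min-max scaling formula used here, as well as the names `shannon_entropy` and `normalize`, are assumptions. The sketch uses Shannon entropy as the measure, but any function mapping a string to a number (for instance, a K-complexity estimate) could be passed in its place.

```python
import math
import random
from collections import Counter

def shannon_entropy(s):
    """Shannon entropy (in bits) of the symbol frequencies of string s."""
    counts = Counter(s)
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def normalize(measure, s, alphabet="0123456789", seed=None):
    """Rescale measure(s) using the least and most complex reference strings
    of the same length (assumed min-max scaling)."""
    rng = random.Random(seed)
    s_min = "0" * len(s)                                        # least complex string
    s_max = "".join(rng.choice(alphabet) for _ in range(len(s)))  # uniformly selected digits
    lo, hi = measure(s_min), measure(s_max)
    if hi == lo:
        return 0.0  # degenerate case, e.g. a very short string
    return (measure(s) - lo) / (hi - lo)
```

Since s_max is drawn at random, the normalized value depends slightly on the draw; fixing `seed` makes the reference string reproducible.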
We observe that traditional entropy of the adjacency matrix remains constant. This is to be expected: the rewiring of edges does not change the density of the network (the number of edges in the original small-world network and the final random network or scale-free network is exactly the same), so entropy of the adjacency matrix is the same for each value of the edge rewiring probability p. On the other hand, K-complexity of the adjacency matrix slowly increases. It should be noted that the change of K-complexity is small when analyzed in absolute values. Nevertheless, complexity consistently increases as networks diverge from the order of the small-world model toward the chaos of the random network model. A very similar result can be observed for networks represented using Laplacian matrices. Again, entropy fails to signal any change in the network's complexity because the density of networks remains constant throughout the transition, and the very slight change of entropy for p ∈ ⟨0, 0.25⟩ is caused by the change of the degree list, which forms the main diagonal of the Laplacian matrix.
The result for the degree list is more surprising. K-complexity of the degree list slightly increases as networks lose their ordering but remains close to 0.4. At the same time, entropy increases quickly as the edge rewiring probability p approaches 1. The pattern of entropy growth is very similar for both the transition to the random network and the transition to the scale-free network, with the latter characterized, counterintuitively, by larger entropy. In addition, the absolute value of entropy for the degree list is several times larger than for the remaining network representations (the adjacency matrix and the Laplacian matrix).
Finally, both entropy and K-complexity behave similarly for networks described using degree distributions. We note that both measures correctly identify the decrease of apparent complexity as networks approach the scale-free model (when semiorder emerges) and signal increasing complexity as networks become more and more random. It is tempting to conclude from the results of the last experiment that the degree distribution is the best representation when network complexity is concerned. However, one should not forget that the degree distribution and the degree list are not lossless representations of networks, so the algorithmic complexity of the degree distribution only estimates how difficult it is to recreate that distribution and not the entire network.
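The observation that entropy of the adjacency matrix cannot change under rewiring can be checked with a short Python sketch. It assumes that entropy is computed from the frequencies of the symbols 0 and 1 among the matrix entries; the helper names `entry_entropy` and `adjacency` are hypothetical. Under that assumption, entropy depends only on the number of edges, so an ordered lattice and a random graph with the same edge count are indistinguishable.

```python
import math
import random
from collections import Counter

def entry_entropy(matrix):
    """Shannon entropy (in bits) of the symbol frequencies among all matrix entries."""
    counts = Counter(x for row in matrix for x in row)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix of an undirected graph on n vertices."""
    a = [[0] * n for _ in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1
    return a

n = 100
# Ordered network: ring lattice with links to the two nearest neighbours on each side.
lattice = [(v, (v + j) % n) for v in range(n) for j in (1, 2)]
# Disordered network: the same number of edges placed uniformly at random.
rng = random.Random(0)
all_pairs = [(u, v) for u in range(n) for v in range(u + 1, n)]
random_edges = rng.sample(all_pairs, len(lattice))

h_lattice = entry_entropy(adjacency(n, lattice))
h_random = entry_entropy(adjacency(n, random_edges))
print(h_lattice, h_random)  # the two values coincide because the density is the same
```

Both matrices contain the same number of ones, so the two symbol distributions are identical even though the underlying structures differ completely.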
Given the requirements formulated at the beginning of this section and the results of the experimental evaluation, we conclude that K-complexity is a more suitable basis for constructing intuitive complexity definitions. K-complexity captures small topological changes in the evolving networks, whereas entropy cannot detect these changes because network density remains constant. Also, K-complexity produces less variance in absolute values across different network representations, whereas entropy returns drastically different estimates depending on the particular network representation.

Conclusions
Entropy has been commonly used as the basis for modeling the complexity of networks. In this paper, we show why entropy may be a wrong choice for measuring network complexity. Entropy equates complexity with randomness and requires preselecting the network feature of interest. As we have shown, it is relatively easy to construct a simple network which maximizes entropy of the adjacency matrix, the degree sequence, or the betweenness distribution. On the other hand, K-complexity equates the complexity with the length of the computational description of the network. This measure is much harder to deceive and it provides a more robust and reliable description of the network. When networks gradually transform from highly ordered to highly disordered states, K-complexity captures this transition, at least with respect to adjacency matrices and Laplacian matrices.
In this paper, we have used traditional methods to describe a network: the adjacency matrix, the Laplacian matrix, the degree list, and the degree distribution. We have limited the scope of experiments to the three most popular generative network models: random networks, small-world networks, and scale-free networks. However, it is possible to describe networks more succinctly, using universal network generators. In the near future, we plan to present a new method of computing algorithmic complexity of networks without having to estimate K-complexity, but rather following the minimum description length principle. Also, extending the experiments to the realm of empirical networks could prove to be informative and interesting. We also intend to investigate network representations based on various energies (Randić energy, Laplacian energy, and adjacency matrix energy) and to research the relationships between network energy and K-complexity.

Figure 2: Block network composed of eight of the same 3-node blocks.