Weighted Complex Network Analysis of Pakistan Highways

The structure and properties of public transportation networks have important implications for urban planning, public policy, and infectious disease control. This study contributes a weighted complex network analysis of travel routes on the national highway network of Pakistan. The network is responsible for handling 75 percent of the road traffic yet is largely inadequate, poor, and unreliable. The highway network displays small world properties and is assortative in nature. Based on the betweenness centrality of the nodes, the most important cities are identified, which could help in locating potential congestion points in the network. Keeping in view the strategic location of Pakistan, such a study is of practical importance and could provide opportunities for policy makers to improve the performance of the highway network.


Introduction
With the expansion of the world economy and international trade, transportation infrastructure plays a crucial role in the development of a country and is an important indicator of its economic growth. Land, water, and air transportation form the backbone of an economy by supporting the movement of information, people, and goods. Previously, econometric models were applied to estimate the effect of infrastructure on international trade; however, such studies did not provide sufficient insight into the infrastructure itself. Only in recent years have different transportation systems been studied in much more detail using complex network analysis. Key studies include but are not limited to the US, China, India, and worldwide air networks [1][2][3][4], Chinese and Indian railway networks [5,6], public transportation in Singapore [7] and Poland [8], the underground transportation in Boston [9], and the Indian highway network [10]. These studies revealed many new topological features of the networks. The small world property is one such feature displayed by most transportation networks. However, not all networks are similar when it comes to degree distribution. The airport networks of the US, China, and India show a power law degree distribution; an exponential degree distribution is observed in the Chinese and Indian railway networks and in Polish public transportation, whereas the degree distribution of the Indian highway network is neither power law nor normal.
Pakistan lies at the crossroads between South Asia, Central Asia, and Western Asia. The location provides Pakistan with a valuable opportunity to enhance its economy by providing logistic routes and services to the landlocked Central Asian countries and by acting as a bridge between regions. However, the logistic network of Pakistan is largely inadequate, with the total length of the national highways standing at roughly 8,780 kilometers, accounting for only 3 percent of the entire road network but handling 75 percent of the road traffic in the country [11]. Pakistan's logistics rely mostly on the road network, which carries 96 percent of the national freight traffic. This large percentage of road freight is a result of the failure of Pakistan Railways freight operations, which have been largely at a halt since 2011, apart from a brief resumption in 2012. Poor and unreliable road infrastructure combined with traffic congestion in major cities results in freight journeys that take 2-4 times longer than they would in Europe. These factors hinder Pakistan's ability to integrate into the global supply chain, which requires just-in-time delivery. According to World Bank statistics, the poor performance of the logistic sector of Pakistan is estimated to cost the economy 4-6 percent of the national GDP (US$ 8-12 billion) every year. In this scenario, it is of utmost importance to study

Network Construction
Data for the movement of passengers were provided by the National Transport Research Center (NTRC) and the Provincial Public Transport Authorities of Pakistan. The data were transformed, and travel for each day was represented as a weighted graph $G$ with $N$ nodes and $E$ edges, an associated adjacency matrix $A = [a_{ij}]$, and a weight matrix $W = [w_{ij}]$ representing the number of passengers travelling between locations $i$ and $j$ in a single day. It is important to note that the objective of the study is not to examine the underlying physical structure of the network but the movement of people between the different nodes. As such, when we say two nodes $i$ and $j$ are connected, $a_{ij} = 1$, we mean that there is at least one passenger travelling between these two nodes during the day. Such a representation has already been used to represent public transportation in Singapore [7].
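The construction described above can be sketched in a few lines of Python. The trip records below are invented purely for illustration (they are not NTRC data), the helper name `build_network` is hypothetical, and the sketch assumes that travel is treated as undirected, so passengers in both directions between two cities are pooled into one edge weight.

```python
from collections import defaultdict

# Illustrative daily trip records: (origin, destination, passengers).
# These counts are made up for the example and are NOT from the NTRC data.
trips = [
    ("Karachi", "Hyderabad", 900),
    ("Hyderabad", "Karachi", 200),
    ("Lahore", "Islamabad", 700),
    ("Lahore", "Faisalabad", 400),
    ("Islamabad", "Peshawar", 300),
]

def build_network(records):
    """Return (nodes, A, W) for an undirected weighted travel graph.

    A[i][j] = 1 if at least one passenger travelled between i and j,
    W[i][j] = total passengers travelling between i and j that day.
    """
    totals = defaultdict(int)
    for origin, dest, passengers in records:
        if origin == dest or passengers <= 0:
            continue  # ignore self-loops and empty records
        totals[frozenset((origin, dest))] += passengers

    nodes = sorted({city for pair in totals for city in pair})
    index = {city: i for i, city in enumerate(nodes)}
    n = len(nodes)
    A = [[0] * n for _ in range(n)]
    W = [[0] * n for _ in range(n)]
    for pair, w in totals.items():
        i, j = (index[c] for c in pair)
        A[i][j] = A[j][i] = 1
        W[i][j] = W[j][i] = w
    return nodes, A, W

nodes, A, W = build_network(trips)
```

Keeping `A` and `W` as separate matrices mirrors the notation of the text: topological measures read only `A`, while weighted measures read `W`.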

Topological Properties
Table 1 provides all computed network statistics, from basic network properties such as the number of nodes and edges to the more complex metrics such as weighted clustering and assortativity.

Basic Properties.
The network has 266 nodes with 4802 edges. The average shortest path length (the mean, over all pairs of nodes, of the minimum number of edges passed through to get from one node to the other) is calculated using the following equation:
$$ L = \frac{1}{N(N-1)} \sum_{i, j \in V,\; i \neq j} d(i, j), $$
where $V$ is the set of nodes in the network, $d(i, j)$ is the length of the shortest path from $i$ to $j$, and $N$ is the total number of nodes in the network. A small average path length ($L = 2.49$) indicates that most pairs of cities in Pakistan are linked by travel through only a few intermediate stops, regardless of the geographical distance.
The network also features a small diameter (the maximum shortest path length in the network), $D = 5$. These observed path lengths are similar to those found for the transportation routes for cities in Poland [8] and Singapore [7]. The edge weight distribution is plotted in Figure 1. The range of travel between cities varied greatly, $w_{ij} \in (135, 1100)$. From the data, it is evident that the flow of passengers is directed from nodes of low degree towards nodes of high degree. Karachi, Hyderabad, Lahore, Islamabad, Rawalpindi, Faisalabad, and Peshawar are examples of high-degree nodes that also handle most of the flow.
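As a minimal sketch of how the average shortest path length and the diameter can be obtained, the following Python code runs a breadth-first search from every node of an unweighted, connected graph. The toy graph and the function names are assumptions made for illustration; they do not reproduce the highway data.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def path_length_and_diameter(adj):
    """Average shortest path length and diameter of a connected graph."""
    n = len(adj)
    total, diameter = 0, 0
    for s in adj:
        dist = bfs_distances(adj, s)
        if len(dist) != n:
            raise ValueError("graph must be connected")
        total += sum(dist.values())          # adds d(s, t) for every t
        diameter = max(diameter, max(dist.values()))
    return total / (n * (n - 1)), diameter

# Toy path graph a - b - c - d (not the highway data).
toy = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
L, D = path_length_and_diameter(toy)
```

For the toy path graph the ordered-pair distances sum to 20, so `L` is 20/12 and `D` is 3.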

Degree and Strength Distributions.
The degree of a node is a measure of its connectivity, and the degree distribution $P(k)$ is defined as the fraction of nodes with degree $k$ in the network. In our case, the degree is defined as the number of cities that can be reached from a given city via a single route. For a given node $i$, the degree can be represented using
$$ k_i = \sum_{j} a_{ij}. $$
The average degree of the whole graph can be obtained using the following equation:
$$ \langle k \rangle = \frac{1}{N} \sum_{i} k_i. $$
Subsequently, a node's strength is simply the sum of the weights on the edges incident upon it and is given by
$$ s_i = \sum_{j} a_{ij} w_{ij}. $$
The average strength of the whole graph can then be obtained using the following equation:
$$ \langle s \rangle = \frac{1}{N} \sum_{i} s_i. $$
The network possesses a high average degree of 36.1, indicating high connectivity among the nodes. The degree distribution of the network is presented in Figure 2. The distribution of node connectivity is neither normal nor power law. It is interesting to note that even though many other transportation networks reported different distribution fits, the degree distribution of the highway network of Pakistan's close neighbor India was also neither normal nor power law [10]. The strengths reveal that although nodes with similar degree exist in the network, the traffic handled by each node is significantly different. We can average the strengths over all nodes with a given degree to get the strength spectrum (Figure 3). The spectrum illustrates a positive relationship between the degree and strength of nodes, which means that the more connected a node is, the more traffic it handles.
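The degree, strength, and strength spectrum defined above can be illustrated with a small Python sketch. The 4-node weight matrix is a toy example with invented values; only the last line uses figures from the text (266 nodes, 4802 edges), to check that the reported average degree agrees with 2E/N for an undirected graph.

```python
from collections import defaultdict

# Toy symmetric weight matrix for 4 nodes (illustrative values only).
W = [
    [0, 5, 3, 0],
    [5, 0, 2, 0],
    [3, 2, 0, 4],
    [0, 0, 4, 0],
]
A = [[1 if w > 0 else 0 for w in row] for row in W]
N = len(W)

degree = [sum(row) for row in A]      # k_i = sum_j a_ij
strength = [sum(row) for row in W]    # s_i = sum_j w_ij (w_ij = 0 where a_ij = 0)
avg_degree = sum(degree) / N
avg_strength = sum(strength) / N

# Strength spectrum: mean strength over all nodes sharing the same degree.
by_degree = defaultdict(list)
for k, s in zip(degree, strength):
    by_degree[k].append(s)
spectrum = {k: sum(v) / len(v) for k, v in sorted(by_degree.items())}

# Consistency check on the reported figures: <k> = 2E / N.
reported_avg_degree = 2 * 4802 / 266
```

In the toy matrix the two degree-2 nodes have different strengths (8 and 7), which is the same effect the text describes: similar degree does not imply similar traffic.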

Topological Clustering.
The clustering coefficient of a node $i$ is defined as the ratio of the number of links shared by its neighboring nodes to the maximum number of possible links among them. Simply put, the clustering coefficient is a measure of cohesiveness around a given node $i$, and it is defined by the equation
$$ C_i = \frac{2 E_i}{k_i (k_i - 1)}, $$
where $E_i$ is the number of edges between node $i$'s neighbors, $k_i$ is the degree of node $i$, and $2/[k_i (k_i - 1)]$ is a normalization factor, the reciprocal of the maximum number of possible edges among the neighbors, $k_i (k_i - 1)/2$. Because of this normalization, $C_i$ lies in the interval $[0, 1]$, where 0 and 1 indicate that none or all of node $i$'s neighbors are linked, respectively. The average clustering coefficient can thus be represented by the following mathematical expression:
$$ C = \frac{1}{N} \sum_{i} C_i, $$
where $N$ is the number of nodes. Using the above equation, the average clustering coefficient ($C$) of the network is calculated to be 0.81, indicating that the Pakistan national highway network (PNHN) is a highly clustered network. This result is substantially higher than the value for an equivalent Erdős-Rényi random graph, $C_{ER} = 0.1$. The clustering coefficient together with the small average path length (Section 3.1) indicates that the PNHN is a small world. The computed clustering coefficient of the network is greater than that of the Indian highway ($C = 0.78$) and railway ($C = 0.69$) networks [6,10] and that of the location network of Portland ($C = 0.06$) [12], and it is almost the same as that of public transportation in Poland ($C = 0.85$) and the Chinese railways ($C = 0.83$) [5,8].
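The local and average clustering coefficients can be computed directly from the definition. The following Python sketch assumes an undirected graph stored as a dictionary of neighbor sets; the toy graph and the convention of assigning 0 to nodes with fewer than two neighbors are illustrative choices, not taken from the study.

```python
from itertools import combinations

def clustering(adj):
    """Local clustering coefficient C_i for every node of an undirected graph."""
    coeffs = {}
    for i, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            coeffs[i] = 0.0  # assumed convention when no neighbor pair exists
            continue
        # E_i: number of edges among the neighbors of i.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        coeffs[i] = 2 * links / (k * (k - 1))
    return coeffs

# Toy graph: triangle a-b-c plus a pendant node d attached to c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
C_i = clustering(toy)
C = sum(C_i.values()) / len(C_i)
```

Node c has three neighbors but only one link among them, so its coefficient drops to 1/3 while a and b stay at 1.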
Unlike the clustering coefficient, the weighted clustering coefficient $C_i^w$ also takes into account the weights of edges; that is,
$$ C_i^w = \frac{1}{s_i (k_i - 1)} \sum_{j, h} \frac{w_{ij} + w_{ih}}{2}\, a_{ij} a_{ih} a_{jh}. $$
The average weighted clustering coefficient can thus be represented by the following mathematical expression:
$$ C^w = \frac{1}{N} \sum_{i} C_i^w. $$
If the weighted clustering coefficient is equal to the clustering coefficient of the network ($C^w = C$), it means that the weights are completely uncorrelated. However, when the weighted clustering coefficient is greater or smaller than the clustering coefficient, the result implies that clustering is formed by edges with larger or smaller weights, respectively. In our case, $C^w > C$ implies that the clustering is formed by edges with larger weights (Figure 4). In this figure, the clustering and weighted clustering coefficients are averaged over all nodes with a certain degree to get the clustering spectrum.
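A minimal Python sketch of a weighted clustering coefficient follows. It assumes the form introduced by Barrat and co-authors, in which each triangle around node i contributes the mean weight of the two edges that join i to the triangle; whether the study used exactly this variant is an assumption. The weight matrix is a toy example with invented values.

```python
def weighted_clustering(W):
    """Barrat-style weighted clustering coefficient for each node.

    W is a symmetric weight matrix; w_ij > 0 means that an edge exists.
    """
    n = len(W)
    result = []
    for i in range(n):
        nbrs = [j for j in range(n) if W[i][j] > 0]
        k, s = len(nbrs), sum(W[i])
        if k < 2:
            result.append(0.0)  # assumed convention for nodes without neighbor pairs
            continue
        total = 0.0
        for j in nbrs:
            for h in nbrs:
                if j != h and W[j][h] > 0:  # j and h close a triangle with i
                    total += (W[i][j] + W[i][h]) / 2
        result.append(total / (s * (k - 1)))
    return result

# Toy network: triangle 0-1-2 plus node 3 attached to node 2 by a heavy edge.
W = [
    [0, 5, 3, 0],
    [5, 0, 2, 0],
    [3, 2, 0, 4],
    [0, 0, 4, 0],
]
A = [[1 if w > 0 else 0 for w in row] for row in W]
cw = weighted_clustering(W)
c_unweighted = weighted_clustering(A)  # unit weights recover the topological value
```

For node 2 the weighted value (5/18) is below the topological value (1/3) because its heaviest edge, to node 3, is not part of any triangle. This is the opposite of the situation reported for the highway network, where the weighted coefficient exceeds the topological one.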
Another important topological characteristic of a network that is examined is the degree-degree correlation between connected nodes. A given network is said to be assortative if the high-degree nodes have a tendency to connect to other high-degree nodes. Conversely, a network is disassortative if low-degree nodes tend to connect to high-degree nodes. Newman introduced a summary statistic for assortativity ($r$) in 2002 [13], defined as the Pearson correlation coefficient of the degrees at either end of an edge. Mathematically, this expression can be represented by the following equation:
$$ r = \frac{1}{\sigma_q^2} \sum_{j, k} jk \left( e_{jk} - q_j q_k \right), $$
where
$$ \sigma_q^2 = \sum_{k} k^2 q_k - \Big[ \sum_{k} k q_k \Big]^2 . $$
Here $e_{jk}$ is the fraction of edges joining nodes of remaining degrees $j$ and $k$, and $q_k$ is the distribution of the remaining degree. This statistic lies in the range $[-1, 1]$, where $-1$ indicates a completely disassortative network and 1 indicates a completely assortative network. Examples of $r$ greater than 0 include the Indian highway network [10] and the ship transport network of China [14], whereas examples of $r$ less than 0 include the Indian railway network [6], public transportation of Singapore [7] and Poland [8], and the airport networks of China [2] and India [3]. For the PNHN, the observed topological assortativity is 0.48 (similar to the Indian highway network). A closer inspection of the degree correlations can be done using another measure, the average degree of the nearest neighbors, $k_{nn}(k)$, for nodes of degree $k$. Consider
$$ k_{nn,i} = \frac{1}{k_i} \sum_{j} a_{ij} k_j, $$
the average degree of the neighbors of node $i$; $k_{nn}(k)$ is the average of $k_{nn,i}$ over all nodes of degree $k$. If $k_{nn}(k)$ increases with $k$, the network is assortative. If $k_{nn}(k)$ decreases with $k$, the network is disassortative. The weighted version is given by
$$ k^w_{nn,i} = \frac{1}{s_i} \sum_{j} a_{ij} w_{ij} k_j. $$
The case $k^w_{nn,i} \approx k_{nn,i}$ implies that the edge weights are uncorrelated with the degree of $i$'s neighbors. If the resultant weighted neighbor degree is greater than the simple neighbor degree ($k^w_{nn,i} > k_{nn,i}$), then heavily weighted edges connect to neighbors with larger degree, while the opposite occurs when $k^w_{nn,i} < k_{nn,i}$. The average degree of the nearest neighbors is represented in Figure 5, where $k_{nn}(k)$ increases with $k$ and $k^w_{nn,i} > k_{nn,i}$, meaning that the heavily weighted edges connect to neighbors with larger degree.
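Because the assortativity statistic is a Pearson correlation over edge endpoints, it can be sketched without constructing the joint degree distribution explicitly. The Python code below is an illustrative sketch under that equivalence; the function names and the star-shaped toy graph are assumptions, and the correlation is undefined (division by zero) for graphs in which every node has the same degree.

```python
def degrees_from_edges(edges):
    """Degree of every node in an undirected edge list."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return degree

def assortativity(edges):
    """Pearson correlation of the degrees at the two ends of each edge."""
    degree = degrees_from_edges(edges)
    # Count each undirected edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    m = len(xs)
    mean = sum(xs) / m
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / m
    var = sum((x - mean) ** 2 for x in xs) / m
    return cov / var

def neighbor_degree(edges):
    """Average degree of the nearest neighbors of each node."""
    degree = degrees_from_edges(edges)
    sums = {node: 0 for node in degree}
    for u, v in edges:
        sums[u] += degree[v]
        sums[v] += degree[u]
    return {node: sums[node] / degree[node] for node in degree}

# Toy star graph: hub "c" linked to three leaves (not the highway data).
star = [("c", 1), ("c", 2), ("c", 3)]
r = assortativity(star)
knn = neighbor_degree(star)
```

A star is the extreme disassortative case: every edge joins the highest-degree node to a lowest-degree node, so `r` is -1 and the neighbor degree falls as the node degree rises.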

Betweenness Centrality.
The betweenness centrality measure is used to identify the nodes with high congestion [15,16]. The betweenness centrality of a node $v$ is defined as the sum, over all pairs of nodes, of the fraction of shortest paths between the pair that pass through $v$. Mathematically,
$$ c_B(v) = \sum_{s, t \in V} \frac{\sigma(s, t \mid v)}{\sigma(s, t)}, $$
where $V$ is the set of nodes, $\sigma(s, t)$ is the total number of shortest paths between $s$ and $t$, and $\sigma(s, t \mid v)$ is the number of those shortest paths passing through $v$ [17]. Betweenness centrality is presented in Figure 6. The top ten cities based on the betweenness centrality are listed in Table 2. Quetta and Peshawar, the trade corridor border cities, lead the list and handle maximum traffic, followed by the port city of Karachi.
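One common way to evaluate this sum is Brandes' accumulation scheme, sketched below in Python for an unweighted, undirected graph. Treating paths as hop counts, leaving the result unnormalized, and counting each unordered pair once are assumptions of this sketch; the text does not state which variant the study used, and the toy graph is not the highway data.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness centrality via Brandes' accumulation scheme."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:              # first visit to w
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:   # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies from the farthest nodes inward.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was visited from both ends, so halve the totals.
    return {v: b / 2 for v, b in bc.items()}

# Toy path graph a - b - c - d.
toy = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
bc = betweenness(toy)
```

In the toy path graph, b lies on the shortest paths for the pairs (a, c) and (a, d), and c on those for (a, d) and (b, d), so both interior nodes score 2 while the end nodes score 0.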

Conclusion and Future Work
Transportation networks, whether by land, air, or sea, reflect the development level of a country and can rightly be described as forming the backbone of economic development. Along with other tools, complex network methodologies have been extensively used to analyze transportation networks. As an addition to the theory and application of complex networks, the weighted national highway network of Pakistan was analyzed using complex network theory.
The PNHN is a highly clustered network where the degree distribution is neither normal nor power law. The small world properties and assortative mixing of the highway network are evident from the calculated properties. It is interesting to note that the topological properties of the PNHN are largely similar to those of the Indian highway network. Furthermore, using betweenness centrality, the cities with potential traffic congestion are also identified. Complex network theory is a versatile tool with a wide range of possible applications. Although the analysis was performed taking the daily average number of passengers as edge weights, it would also be interesting to conduct a much larger study using data for weeks or months. Similarly, subject to availability of data, weighted network analysis using the movement of traffic or, better yet, the flow of goods in terms of TEU (twenty-foot equivalent unit) from one city or district to another could provide useful insight into the logistics aspect of the network. Such a study would highlight the network and its topological features in much more detail and help policy makers to further enhance the infrastructure to achieve efficient flow.

Figure 3: Strength as a function of degree.

Figure 5: Average degree of nearest neighbors of nodes with degree $k$.

Figure 6: Betweenness centrality.

Table 1: Statistical properties of Pakistan national highway network.

Table 2: Top ten cities identified based on betweenness centrality.