Recent Progress about Flight Delay under Complex Network

Flight delay is one of the most challenging threats to the operation of the air transportation network system. Complex network theory was introduced into research on flight delays because of its low complexity, high flexibility in model building, and accurate explanation of the real world. In this paper, we survey recent progress on flight delay that makes extensive use of complex network theory. We review analyses of static networks and temporal evolution, together with the identification of topologically important nodes/edges. We also clarify the relations among robustness, vulnerability, and resilience in air transportation networks. Then, we investigate studies on causal relations, propagation modelling, and the identification of best spreaders in flight delay. Finally, future improvements are summarized in four points. (1) Under complex network theory, flight operation relevant subsystems or sublayers are discarded by the majority of available network models. Hierarchical modelling approaches may be able to improve this and provide more capable network models for flight delay. (2) Traffic information is the key to narrowing the gap between topology and functionality in the current situation. Flight schedules and flight plans could be employed to detect flight delay causalities and to model flight delay propagation more accurately. Real flight data may be utilized to validate and revise the detection and prediction models. (3) It is of great importance to explore how to predict flight delay propagation and identify best spreaders at low computational complexity. This may be achieved by analyzing flight delay in the frequency domain instead of the time domain. (4) The collection of the individually most critical nodes/edges may not be the most crucial group for network resilience or flight delay propagation. An effective algorithm for finding the most influential sequence is yet to be developed.


Introduction
Commercial flights in mainland China grew by about 176% in roughly one decade, from 4.22 million in 2008 [1] to 11.66 million in 2019 [2]. Besides having to satisfy strong demand, the air transportation network system may be heavily disturbed by severe weather and airspace restrictions, which cause most flight delays [2]. In addition, this critical infrastructure must be resilient or robust in case of earthquakes, system failures, and other unexpected situations. Microcomputer simulation is able to model the air transportation network system at high resolution and accuracy via professional software [3,4]. It was once regarded as the most efficient approach for studying airspace capacity, network resilience, and flight delay. However, this method did not become the first choice for national, intercontinental, worldwide, and other large-scale air transport network systems, mainly because of its high complexity in model building and its inflexibility when the model structure has to be modified.
Network science can simplify a complex system so that we can better understand its function as a whole [5]. Although less than two decades old, complex network theory [6,7] has grown tremendously. Many real systems [8][9][10], composed of a large set of interacting elements, can thus be accurately interpreted. It is not surprising that this methodology was introduced into research on flight delay. The Airport Network is built by connecting pairs of airports that have a direct flight between them, as illustrated by the dotted lines in Figure 1. Because flights may be operated by different airline companies, there are different types of dotted lines in Figure 1. Several airline companies may also form an alliance to share their flights and to provide more convenient transport choices. If we distinguish flights by their airline company [48] or by their airline alliance, we obtain the Airline Network or the Air Alliance Network. In some airport networks or airline networks, there is more than one airport in the same city. These airports may be merged into one node. Traffic to and from outside the city is accumulated into the new node, while traffic inside the city is neglected [49]. Note that the airline network differs from the airport network only in the airline company information. If this information is not counted, the airline network is the same as the airport network, for example, [49,50].

Basic Concepts
Nevertheless, in real air transportation operations, a flight does not fly to its destination airport along a straight line. For safety and airspace capacity reasons, it has to follow a predefined nominal track, named an air route or airway. Usually, an air route looks like a polyline rather than a straight line. An air route/airway consists of several intermediate waypoints, airports, and the segments among them, i.e., W1, W2, and W3 and the solid lines among them, as shown in Figure 1. The Air Route Network is created once these air route waypoints, airports, and predefined nominal tracks are considered. Besides, air traffic controllers are responsible for flight safety and efficiency within pre-established airspaces, named air traffic control (ATC) sectors. Usually, an ATC sector covers several waypoints of different air routes and zero or more airports, as illustrated in Figure 1. The ATC Sector Network is constructed by linking ATC sectors if an air route segment connects them. In the Country Network, all the airports, waypoints, or ATC sectors within the same country or region are merged into one vertex, and the connections inside the country or region are discarded. Since the air route network and the ATC sector network are the most relevant to flight operations, studies conducted on them benefit research on flight delay the most. A link in the above networks can use the traffic information between its nodes as a weight. Frequently adopted weights include the number of flights, passengers, and available seats. Without a weight, the link is binary. Besides, directional traffic between the same two vertices can also be modelled separately. In the airport network in Figure 1, the edge A ⟶ C may describe the traffic from A to C, while the link C ⟶ A may represent the traffic in the opposite direction; this is the directed network [51].
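The construction rules described above (one link per airport pair with a direct flight, the number of flights as weight, optional direction, and merging of airports in the same city) can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the function name `build_airport_network`, the `city_of` mapping, and the toy airport codes are hypothetical and do not come from the surveyed studies.

```python
from collections import defaultdict

def build_airport_network(flights, city_of=None, directed=True):
    """Aggregate flight records into a weighted edge map.

    flights: iterable of (origin, destination) airport codes, one per flight.
    city_of: optional dict airport -> city, used to merge airports of the
             same city into one node (traffic inside the city is dropped).
    Returns {(u, v): number_of_flights}.
    """
    city_of = city_of or {}
    weights = defaultdict(int)
    for origin, dest in flights:
        u = city_of.get(origin, origin)
        v = city_of.get(dest, dest)
        if u == v:
            continue  # traffic inside a merged city node is neglected
        # Undirected networks store each pair once, in sorted order.
        key = (u, v) if directed else tuple(sorted((u, v)))
        weights[key] += 1
    return dict(weights)

# Hypothetical toy data: the airport codes are illustrative only.
flights = [("A", "C"), ("A", "C"), ("C", "A"), ("A", "B"), ("B1", "B2")]
directed_net = build_airport_network(flights)
undirected_net = build_airport_network(flights, directed=False)
merged_net = build_airport_network(flights, city_of={"B1": "B", "B2": "B"})
```

In the directed result, A ⟶ C and C ⟶ A keep separate weights, whereas the undirected result sums them into a single link. Replacing the flight count with passengers or available seats would only change the increment.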

Temporal Network.
A temporal network consists of a sequence of static networks over multiple time snapshots. In a temporal air transportation network, connections, weights, and architecture evolve with time t, as shown in Figure 2. Figure 2 is an example of a temporal airport network, in which each node denotes an airport. A link is created when there is a direct flight between two airports. If we take the number of flights as the weight, both the weights and the connections of the airport network may evolve with time t. Besides structural evolution, the temporal air transport network is also filled with dynamics, such as flight delay. Table 1 lists the most frequently used fundamental metrics of air transportation networks under complex network theory. These fundamental metrics apply to both undirected and directed networks. The H-index [56,57], weighted metrics [58], and other sophisticated methods [17,20] are developed based on them. All of these indicators describe a single vertex or the whole network directly, whereas eigenvectors [59] and k-core [60] evaluate nodes via their neighborhoods or the layers they belong to.
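As a rough illustration of the snapshot idea, the sketch below splits timestamped flights into consecutive time windows and builds one flight-weighted static network per window. The record layout `(departure_time, origin, destination)` and the name `temporal_snapshots` are assumptions made for this example only.

```python
from collections import defaultdict

def temporal_snapshots(flights, window):
    """Split timestamped flights into consecutive static networks.

    flights: iterable of (departure_time, origin, destination); the time is
             any non-negative number (e.g. hours since the start of the data).
    window:  snapshot length in the same time unit.
    Returns {snapshot_index: {(origin, destination): number_of_flights}}.
    """
    snapshots = defaultdict(lambda: defaultdict(int))
    for t, origin, dest in flights:
        snapshots[int(t // window)][(origin, dest)] += 1
    return {k: dict(v) for k, v in sorted(snapshots.items())}

# Hypothetical toy data with a 2-hour snapshot window.
flights = [(0.5, "A", "B"), (1.0, "A", "B"), (2.5, "B", "C"), (3.0, "A", "B")]
snaps = temporal_snapshots(flights, window=2)
```

Here the link A ⟶ B has weight 2 in the first snapshot and weight 1 in the second, while B ⟶ C only exists in the second one, so both weights and connections evolve with t.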

Network Character Analysis
Network characters underlie the characterization, reproduction, and even prediction of the network system. Modelling and analyzing the air transportation network is the fundamental step for research on flight delay. This section investigates analyses of air transport network characters under complex network theory in the context of flight delay. It focuses on static and temporal networks, together with the identification of topologically important nodes/edges.

Analysis on Static Network.
A static air transportation network comes from one snapshot or from the accumulation of network system states within a certain time duration. In complex network theory, the general procedure to analyze a static air transport network is to build the network model based on collected data, calculate the topological metrics of the network model, and analyze these indicators.
Traffic information characterizes the connections in the air transportation infrastructure and is fundamental for a comprehensive system description. Hence, this section pays special attention to the characterization of traffic information in established network models. For this purpose, researchers usually employ weighted metrics and even develop new indicators. Table 2 compares some high-impact analyses of static air transport networks under complex network theory.
Table 2 prompts two points of discussion. The first concerns the air transport network model. Although the topology of different air transport network models has been analyzed by scholars worldwide, the airport network model has received the most attention, mainly because it is the least difficult to build. However, the airport network model is also the one that differs most from real air transport operation: it discards plenty of intermediate waypoints and contacts.
Thus, factors affecting flights, such as diverging, converging, and crossing traffic in the air, airspace restrictions, and severe weather, cannot be effectively considered. This problem is further confirmed by Tables 3-6. Another problem with the air transport network model lies in the category of nodes. Although Verma et al. [50], Lordan and Sallan [62], and Du et al. [49] encapsulate airport networks into multilayer infrastructures via the "k-core decomposition" method, the vertices in each of the five air transport network models denote only one of airport, region/country, air route waypoint, or ATC sector. In essence, nodes in all available models only represent points of a flight path or flight route. Airports, air route waypoints, and ATC sectors never appear in the same network model. This simple modelling method may work well for network-level description. However, various factors may affect flight operation, such as plane rotation, flight crew, air traffic controllers, and "communication, navigation, and surveillance facilities." Any of them may produce large-scale disruptions to air transport networks. Without these flight operation relevant subsystems and sublayers, the current modelling scheme can hardly be effective for flight delay.
The second point pertains to the characterization of the traffic of the air transport network system. The number of flights and the number of available seats are the most frequently employed weights to represent network traffic in almost all relevant literature, including the studies in Tables 2-6. However, a comprehensive description of the air transport network system requires much more information, such as airspace capacity and the geographical length, direction, and altitude restriction of an air route/airway segment. To the best of our knowledge, these are discarded by current investigations.

Table 1: Fundamental metrics of air transportation networks under complex network theory.

| Metrics | Equation | Interpretation |
| --- | --- | --- |
| Degree [52] | $k_i = \sum_{j \in N} m_{ij}$ | Where $m_{ij}$ is the connection between node $i$ and node $j$: $m_{ij} = 1$ if a connection exists and $m_{ij} = 0$ otherwise; this metric refers to the number of connections with other nodes in the network |
| Betweenness | | Where $\sigma_{jk}$ is the number of shortest paths going from node $j$ to node $k$; $\sigma_{jk}(i)$ is the number of shortest paths going from node $j$ to node $k$ and passing through node $i$ |
| Average shortest path length [52] | $L = \frac{1}{n(n-1)} \sum_{i,j \in N, i \neq j} l_{ij} = \frac{1}{n} \sum_{i \in N} l_i$ | Where $N$ is the set of all nodes in the network and $n$ is the number of nodes; $l_{ij}$ [54] is the length of the geodesic from node $i$ to node $j$, i.e., the minimum number of edges connecting node $i$ to node $j$; $l_i = \frac{1}{n-1} \sum_{j \in N, j \neq i} l_{ij}$ |
| Efficiency [21] | $E = \frac{1}{n(n-1)} \sum_{i,j \in N, i \neq j} \frac{1}{l_{ij}}$ | |
| Clustering coefficient | | Where $K_i$ is the set of all the neighbor nodes of node $i$ and $k_i$ is the number of nodes in $K_i$; $m_{ik}$ is the connection between node $i$ and node $k$; this metric gives an overall indication of how nodes are embedded in their neighborhoods |

Moreover, present research studies are unable to employ more than one weight. The information considered is so limited that it is hard to give a comprehensive explanation of the air transport network system. Besides weights, scholars have also employed topological metrics and the correlations among them to reveal features of the air transport network. Based on the conventional metrics in Table 2, Barrat et al. [58] proposed the weighted clustering coefficient and the weighted average nearest-neighbors degree. Correlations between indexes, for instance, the relations between clustering coefficient and degree [61,63] and between strength and degree [58], have been further evaluated. Guimera et al. [65] and Bianconi et al. [66] also investigated the connections between community structure and air transportation network topology. Bianconi et al. [66] proposed an entropy measure-based indicator to quantify the dependence of the USA Airport Network structure on community structure and on the degree of individual vertices.
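To make the fundamental metrics of Table 1 concrete, the following sketch computes degree, average shortest path length, efficiency, and a local clustering coefficient on an unweighted, undirected toy graph. It is a minimal illustration rather than the implementation used in any surveyed study. Two assumptions are made: the graph is connected (otherwise the average shortest path length is undefined), and the clustering coefficient uses the common textbook form (links among the neighbors of a node divided by the maximum possible number of such links).

```python
from collections import deque

def bfs_lengths(adj, source):
    """Geodesic lengths l_ij (number of edges) from source to reachable nodes."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def degree(adj, i):
    """k_i: number of connections of node i."""
    return len(adj[i])

def average_shortest_path_length(adj):
    """L = sum of l_ij over ordered pairs / (n (n - 1)); assumes a connected graph."""
    n = len(adj)
    total = sum(d for i in adj for d in bfs_lengths(adj, i).values())
    return total / (n * (n - 1))

def efficiency(adj):
    """E = sum of 1 / l_ij over ordered pairs / (n (n - 1)); unreachable pairs add 0."""
    n = len(adj)
    total = sum(1.0 / d for i in adj for d in bfs_lengths(adj, i).values() if d > 0)
    return total / (n * (n - 1))

def clustering(adj, i):
    """Assumed textbook form: links among neighbors / possible links among neighbors."""
    nbrs = sorted(adj[i])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a in range(k) for b in range(a + 1, k) if nbrs[b] in adj[nbrs[a]])
    return 2.0 * links / (k * (k - 1))

# Hypothetical toy network: triangle A-B-C with a pendant node D attached to C.
toy = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
L = average_shortest_path_length(toy)
E = efficiency(toy)
```

On this toy graph, node C has the highest degree (3), while its clustering coefficient is lower than that of A or B because its neighbor D is not linked to the others.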
Much progress has been achieved by the metric-based analyses. However, the majority of available metrics are topological ones, including all the conventional indices and their weighted versions in Tables 2-6. They characterize the topology of the air transport network rather than its functionality, although the air transport network is a physical network system built for traffic. There is an inherent gap between topology and functionality in current studies, and this gap has not yet been narrowed.

Temporal Evolution Analysis.
During strong fluctuations at various time scales in the air transportation network, topological properties evolve and flight delay propagates. As a temporal network consists of a sequence of static networks over multiple time snapshots, analyses of temporal evolution are mainly conducted through statistics of network topological metrics and their relations at different moments.
Besides the focuses on static networks in Section 3.1, the primary concern of this section is how temporal evolution is analyzed. Table 3 surveys some high-impact analyses of temporal evolution. Table 3 confirms the conclusions of Section 3.1. It also reveals that most investigations in this field are performed via basic statistics of network indices and their relationships.
Wandelt and Sun [15] analyzed the yearly and monthly evolution of the worldwide country network through degree and density (a degree-related indicator). Relationships between degree and betweenness, as well as topologically critical nodes and links, were further displayed. Besides, they calculated yearly correlations between unweighted degree and passenger-weighted betweenness and between functionally critical nodes and weighted links. Gautreau et al. [16] discovered that the topological characters of the USA airport network were stable via statistics of degree, passenger-weighted degree, and passenger flows. Moreover, they explored the [...] Cai et al. [19] analyzed the temporal evolution of the Chinese Air Route Network via the early distribution of flight-weighted degree, yearly traffic flows, and traffic flow growth rates. Based on degree, weighted degree, clustering coefficient, betweenness, weighted betweenness, closeness (a shortest path length-based metric), and weighted closeness, Sun et al. [20] studied the temporal evolution of the European Air Route Network and the European Airport Network. The adopted weights are the number of flights and the number of available seats.
Temporal tendencies of metrics and their correlations are mainly presented and discussed via time-figures. Principle-level analysis is quite rare. Weight growth and lifetime of links [16], the entropy of the degree distribution [17], and the variations and coefficient of variation (CoV) of conventional indicators [20] have been employed to improve traditional methods and to narrow the gap between the topology and the functionality of the air transport network.

Identification of Topologically Important Nodes/Edges.
It has been recognized that the identification of influential vertices or links [42][43][44] is crucial. In essence, the identification in this section evaluates nodes/edges with indicators in static networks. However, no single index can perform this duty. The general detection idea is based on multiple topological attributes, combined with methods such as AHP [67]. Thus, we are interested in the idea behind each identification method, the metrics it fuses, and the approach used to test its performance. Table 4 is organized around these concerns.
Li and Xu [41] employed both a functional index and topological metrics in the evaluation of node importance. These indicators are fused by a method based on fuzzy soft set theory, which is able to integrate several indices over different time intervals. The performance of the proposed method is evaluated through the change in airport network efficiency after an airport is removed intentionally. Ren et al. [42] proposed a node sorting algorithm based on VCM and four conventional topology metrics. The VCM assumes that an index with a larger difference has a larger impact on the physical properties of the network and is more important. They adopt the SIR (susceptible-infected-recovered) [38] model to obtain each node's infection ability, and the performance of VCM and of the four topology indices is compared through the SIR model. The IEM [43] assumes that an indicator with small entropy provides more information and is more crucial.
Depending on topological indices, these approaches [41][42][43] calculate the importance of a node directly. Instead, Ren et al. [44] measured the influence of a waypoint through the change in the network after the node is removed. The change is defined as the relative entropy of network agglomeration. Network agglomeration is determined by the average path length, which has been introduced in Table 1. The identification of influential nodes/edges has been stimulated greatly by these two kinds of approaches. However, the gap between the topology and the functionality of the air transport network system has still not been narrowed. The most frequently utilized indicators are topological ones, and the network changes are also quantified via the change in network topology. Last but not least, it is encouraging that the air route network and the ATC sector network have attracted attention from researchers, since they are the networks most relevant to real flight operations.
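The removal-based idea can be illustrated with a short sketch that scores each node by the relative drop in network efficiency after the node is removed, which is how the performance of the method in [41] is evaluated. This is not the relative entropy of network agglomeration used in [44]; efficiency is swapped in here only because it is defined in Table 1. The normalization by the original number of node pairs (which cancels in the relative drop) and the toy graph are assumptions of this example.

```python
from collections import deque

def pair_efficiency_sum(adj, removed=None):
    """Sum of 1 / l_ij over ordered pairs of surviving nodes.

    Dividing by n (n - 1) of the original network would give the efficiency;
    the constant factor cancels in the relative drop computed below.
    """
    removed = removed or set()
    total = 0.0
    for source in adj:
        if source in removed:
            continue
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in removed and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(1.0 / d for d in dist.values() if d > 0)
    return total

def importance_by_removal(adj):
    """Relative efficiency drop caused by removing each node on its own."""
    base = pair_efficiency_sum(adj)
    return {node: (base - pair_efficiency_sum(adj, {node})) / base for node in adj}

# Hypothetical toy network: triangle A-B-C with a pendant node D attached to C.
toy = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
importance = importance_by_removal(toy)
most_important = max(importance, key=importance.get)
```

Because the score relies only on shortest paths, it remains a purely topological indicator and inherits the gap between topology and functionality discussed above.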

Summary.
Beyond the analyses of network characters under complex network theory in the context of flight delay, we are much concerned with the network models, which are fundamental to subsequent research since the air transportation network is a physical network built for traffic. Future improvements lie in bridging theoretical approaches and air transportation network functionality.
(1) Various factors and systems may disturb a flight, such as weather, plane rotation, flight crew, air traffic controllers, "communication, navigation, and surveillance facilities," diverging, converging, and crossing traffic in the air, and airspace restrictions. Any of them may produce large-scale disruptions to air transportation networks. However, as concluded in Section 3.1 and confirmed by other sections, the five current network models are unable to take these factors into account because they lack the intermediate points of the flight path and use a single category of nodes. This prevents these network models from becoming the most effective modelling schemes for flight delay. With flight operation relevant subsystems or sublayers embedded, such as in [68], a hierarchical modelling approach may be the most appropriate framework for flight delay.
(2) Since flight delay is concerned with the functionality/operation of the air transport system, the gap between topology and functionality should be taken seriously when flight delay is the target. More approaches or indices that account for both the topology and the operation of the air transport network are to be developed. Moreover, comprehensive descriptions and evaluations of the air transport network would benefit from weights beyond traffic information, such as geographical length, capacity, and altitude restriction, or from network models with more than one kind of weight.

Resilience
Flight delay will occur frequently if the air transportation network is less robust to disturbances. The word resilience originates from the Latin word "resiliere," which means "bounce back." Resilience implies the ability of an entity or system to return to its normal condition when disrupted. Resilience in an engineering system can be clarified via the three phases of system response in Figure 3 [69]. Performance stands for the network system's ability to perform the required task. There are various performance indicators in complex networks, which will be discussed later. In the original steady phase, network performance maintains its target level $p_0$. In the disruptive phase, system performance drops to the lowest level $p_r$. In the recovery phase, network system performance recovers to a new steady level $p_{ns}$. Inspired by the definitions of absorptive capability and restorative capability [70], we argue that, in the air transport network system, (1) robustness refers to the network performance loss ($PL = p_0 - p_r$ in Figure 3) when perturbed. A disturbance will produce much less performance loss in a robust network. Thus, resilience is composed of vulnerability and restorative capability. Relationships among resilience, vulnerability, and robustness are illustrated in Figure 4.
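The three response phases can be read off a performance time series in a few lines. The sketch below is only an illustration: taking the first sample as $p_0$, the minimum as $p_r$, and the last sample as $p_{ns}$ is an assumption, and the `recovered_share` ratio is a hypothetical summary number, not an index defined in the surveyed literature.

```python
def resilience_phases(performance):
    """Extract p_0, p_r, p_ns and the performance loss PL = p_0 - p_r.

    performance: list of network performance values over time, assumed to
    start in the original steady phase and to end in the new steady phase.
    """
    p0 = performance[0]    # target level in the original steady phase
    pr = min(performance)  # lowest level reached in the disruptive phase
    pns = performance[-1]  # new steady level after the recovery phase
    loss = p0 - pr
    recovered = pns - pr
    # Hypothetical summary: share of the lost performance that was restored.
    recovered_share = recovered / loss if loss > 0 else 1.0
    return {"p0": p0, "pr": pr, "pns": pns, "PL": loss,
            "recovered_share": recovered_share}

# Hypothetical performance series (e.g. normalized network efficiency).
series = [1.0, 1.0, 0.7, 0.4, 0.6, 0.9, 0.9]
phases = resilience_phases(series)
```

For this series the performance loss is about 0.6 and roughly five sixths of it is restored in the recovery phase.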
Following the tagged numbers in Figures 3 and 4, the relations among robustness, vulnerability, and resilience are easy to clarify. This clarification gives a clear view of the numerous research studies on the resilience of the air transportation network system. Current investigations can be reviewed based on the following four questions.
(1) How much is the network performance loss when it is interrupted? How to measure this robustness?
(2) What is the most effective and efficient attack strategy or what can be done to maximize the performance loss? Whose removal decreases network performance most?
(3) What is the best strategy to recover the network system's performance? Whose recovery increases network performance most?
(4) How to make a comprehensive description about the resilience of the air transportation network system?

Robustness Measurement.
In early studies about error and tolerance in complex networks, Albert et al. [71] regarded the average length of the shortest paths between any two nodes in the network as the performance of a complex network. Motter and Lai [22] quantified network performance in terms of the relative node size of the largest connected subgraph.
Our concerns in this section are question (1), the performance indicator, and how to measure the robustness of the air transport network. Table 5 presents recent high-impact literature about robustness. These studies remove or recover a vertex and its connected edges one by one based on different selection criteria. The network robustness is measured and analyzed by the adopted performance indicator. Table 5 reveals that network efficiency [21] and the size of the largest connected subgraph are the most frequently utilized performance indicators of the air transportation network. Network efficiency closely relates to the average length of the shortest path, as illustrated in Table 1. Both network efficiency and the size of the largest connected subgraph represent network performance from the perspective of network connectivity rather than network functionality. Nevertheless, the air transportation network is a practical infrastructure built for air traffic. Network robustness is concerned with system functionality, as discussed previously. Without fundamental traffic information, these two topological metrics are less convincing as measures of the robustness of the air transportation network system.
In the context of air transport network functionality, Janić [72] relied on the number of flights/passengers to calculate the robustness of existing airports, and the robustness of removed airports is measured by their weights. Then, he regarded the summed robustness of all remaining vertices as the network robustness. Wandelt et al. [73] and Sun et al. [64] employed the unaffected passengers with rerouting as a baseline index in the evaluation of air transportation system robustness. Besides, Sun et al. [64] and Dunn and Wilkinson [74] defined the surviving links as the robustness of the air transportation network. All these summed weight metrics [64,72,73] assess the robustness of the air transportation network from the aspect of traffic demand. Another kind of solution comes from the perspective of traffic supply and is based on airspace capacity [75][76][77]. Pien et al. [75,76] proposed a new robustness index called the relative area index (RAI). The index quantifies the influence of an individual node on the performance of the entire network when it suffers a capacity reduction at a local scale. The model estimates the maximum flow of the network as its new capacity when the air transport network is influenced. They [75] also conducted a comparative analysis between RAI-based robustness and betweenness-based robustness. Yoo and Yeo [77] regarded the redundant capability to replace the damaged node as the adaptive capacity. Furthermore, they evaluated air transportation network robustness based on the adaptive capacity concept.
Much progress has been achieved in describing air transport network performance from the perspective of network functionality. These studies rely either on traffic demand or on traffic supply. Thus, the disadvantages of topological character-based methods are overcome and the gap between the functionality and the topology of the air transport network is narrowed. Nevertheless, few studies take traffic demand and traffic supply into their description simultaneously.
This runs against what COVID-19 has indicated. In this rare scenario, traffic demand is heavily suppressed, while traffic supply (the capacity of the air transport network and airspace) remains almost the same as before. The demand is so weak that any random errors or intentional attacks on the network system would produce much less disturbance than before. Traffic demand and traffic supply should therefore be considered simultaneously in the measurement of air transport network robustness.
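A demand-based robustness indicator in the spirit of the "unaffected passengers with rerouting" idea can be sketched as the share of passenger demand whose origin and destination are still connected after some nodes are removed. This is a deliberate simplification and not the index of [64,73]: it assumes that any surviving path can absorb rerouted passengers, so traffic supply (capacity) is ignored entirely, which is exactly the limitation discussed above. The demand figures are invented for illustration.

```python
def reachable(adj, source, removed):
    """Nodes reachable from source when the removed nodes are unavailable."""
    seen = {source}
    stack = [source]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def unaffected_passenger_share(adj, demand, removed):
    """Share of passengers whose origin-destination pair stays connected.

    demand: {(origin, destination): passengers}. Capacity is ignored
    (assumption), so every surviving path is treated as a valid reroute.
    """
    removed = set(removed)
    total = sum(demand.values())
    served = 0
    for (origin, dest), pax in demand.items():
        if origin in removed or dest in removed:
            continue  # passengers of a closed airport are affected
        if dest in reachable(adj, origin, removed):
            served += pax
    return served / total

# Hypothetical toy network and demand.
toy = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
demand = {("A", "B"): 300, ("A", "D"): 100, ("B", "D"): 50, ("C", "D"): 50}
share_without_c = unaffected_passenger_share(toy, demand, {"C"})
share_without_b = unaffected_passenger_share(toy, demand, {"B"})
```

In this toy case, removing the topologically central node C leaves 60% of the passengers unaffected, while removing the less central but busier node B leaves only 30%, which illustrates why topological and demand-based indicators can disagree.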

Attack and Recover.
This section discusses questions (2) and (3) from the beginning of Section 4: it searches for the most effective and efficient strategies to reduce or restore network performance and locates the nodes or edges that are influential to network performance. Table 5 reveals that the most adopted strategy is to remove or recover nodes one by one based on their values of degree, betweenness, clustering coefficient, closeness, or other topological metrics. The effectiveness of a strategy is evaluated via the estimated network robustness. However, these node selection criteria are also topological ones, since they rely completely on topological indices and robustness indicators. At the birth of this functionality-oriented research, there were no better solutions, and researchers had to go on with topological methods. Now that much progress has been achieved, more effective and efficient strategies should be developed.
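The conventional attack procedure described above can be sketched as follows: nodes are removed one by one in order of their current degree, and the relative size of the largest connected subgraph is recorded after every removal. The adaptive recomputation of degree after each removal, the alphabetical tie-breaking, and the toy graph are assumptions of this minimal example.

```python
def largest_component_size(adj, alive):
    """Size of the largest connected subgraph among the surviving nodes."""
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best

def degree_attack_curve(adj):
    """Remove the highest-degree surviving node repeatedly.

    Returns (removal_order, curve), where curve[k] is the relative size of
    the largest connected subgraph after k removals.
    """
    alive = set(adj)
    n = len(adj)
    order = []
    curve = [largest_component_size(adj, alive) / n]
    while alive:
        # Degree is recomputed on the surviving network (adaptive attack);
        # ties are broken alphabetically because max keeps the first maximum.
        target = max(sorted(alive), key=lambda u: sum(v in alive for v in adj[u]))
        alive.remove(target)
        order.append(target)
        curve.append(largest_component_size(adj, alive) / n)
    return order, curve

# Hypothetical toy network: triangle A-B-C with a pendant node D attached to C.
toy = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
order, curve = degree_attack_curve(toy)
```

A recovery strategy could be sketched in the same way by adding nodes back in a chosen order and recording the same indicator. Swapping the degree criterion for betweenness or closeness changes only the `key` function.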
Attack Strategy. Inspired by the general game playing process, Wandelt et al. [73] proposed a new exploration/exploitation search technique to find attacking strategies efficiently. It is based on a Monte Carlo Tree Search algorithm rather than on network metrics. Dunn and Wilkinson [74] assumed that there may be unused airport capacity to accept previously unscheduled flights. This permits adaptive saving and rerouting of edges. This adaptive strategy is demonstrated to be effective in raising robustness. Thompson and Tran [78] utilized a three-stage optimization model (defender-attacker-defender) to analyze the robustness of the USA air transportation network. The model approximated the dynamics among three opposing agents: an operator that seeks to minimize the network's operational cost via optimal passenger rerouting, an attacker that aims to maximize that cost by disrupting a number of network routes, and a defender that is tasked with mitigating the actions of the attacker by protecting key routes.
Recover Strategy. Since recover strategies in the air transportation network system have not attracted much interest, current recover strategies mostly rely on topological characters. Clark et al. [25] evaluated the performance of three recover strategies in the USA airport network. They are based on traffic volume, random selection, and network centrality (eigenvector, closeness, betweenness, and degree), respectively. Wang et al. [28] compared the effects of random, degree-based, betweenness-based, and removal sequence-based recover strategies in the China ATC sector network. However, to the best of our knowledge, there has not been much progress in recover strategies. The current relevant literature neither takes traffic information into account nor studies recover strategies through the modelling of network system evolution, as is done in [73,74,78].
Influential nodes or edges for network robustness can be identified by the attack and recover strategies. However, these diagnosed nodes/edges are identified individually, and their collection may not be the most crucial set. Du et al. [45] employed a memetic algorithm to seek the minimum network robustness after removing certain edges in the Chinese air route network. The attacking performance of the algorithm is superior to that of the highest edge-betweenness adaptive strategy, and the solution of the algorithm is the vital set of edges. Soria et al. [24] proposed two customized methods to search for the most effective sequence of edges for reducing network robustness. They are based on betweenness and damage, respectively. They also compared the performance of a genetic algorithm, simulated annealing, the two customized algorithms, and a combination of the two proposed methods in the worldwide airport network. Simulation experiments demonstrated that the combined one is the most effective.
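The claim that the collection of individually critical nodes need not be the most crucial set can be checked on a small example. The sketch below compares the two nodes that are most damaging on their own with the most damaging pair found by exhaustive search, using the size of the largest connected subgraph as the robustness indicator. The toy graph (a six-node ring with a small star attached) is invented for illustration, and the brute-force search is only feasible for tiny networks; the surveyed studies rely on memetic, genetic, or customized heuristics instead.

```python
from itertools import combinations

def make_graph(edges):
    """Build a symmetric adjacency map from an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def largest_component_size(adj, removed):
    """Size of the largest connected subgraph after removing some nodes."""
    alive = set(adj) - set(removed)
    seen, best = set(), 0
    for start in alive:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v in alive and v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best

def top_k_individual(adj, k):
    """The k nodes whose single removal shrinks the largest subgraph most."""
    ranked = sorted(adj, key=lambda u: largest_component_size(adj, [u]))
    return ranked[:k]

def best_set(adj, k):
    """Exhaustive search for the most damaging set of k nodes (toy sizes only)."""
    return min(combinations(adj, k), key=lambda s: largest_component_size(adj, s))

# Hypothetical toy network: ring C1..C6, hub H attached to C1, leaves L1 and L2.
graph = make_graph([("C1", "C2"), ("C2", "C3"), ("C3", "C4"), ("C4", "C5"),
                    ("C5", "C6"), ("C6", "C1"), ("C1", "H"), ("H", "L1"),
                    ("H", "L2")])
top2 = top_k_individual(graph, 2)
best = best_set(graph, 2)
size_after_top2 = largest_component_size(graph, top2)
size_after_best = largest_component_size(graph, best)
```

Removing the two individually most damaging nodes (C1 and H) still leaves a connected subgraph of five nodes, whereas the best pair combines C1 with a ring node that looks harmless on its own and leaves a largest subgraph of only three nodes.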

Resilience Description.
Literature answering question (4) is quite rare. Xu [27] and Wang et al. [79] defined a general resilience index for the air transportation network in which absorptive and restorative capabilities (illustrated in Figure 3) are counted. Through the evaluation of system performance, the comprehensive resilience of the China Airport Network was analyzed.

It is valuable that this integration of robustness, resistance to attack, and restorative capability covers all response phases to perturbations. However, the air transportation network is a practical/physical system, like the electric power supply system [70]. Its operation is filled with interactions among subsystems or sublayers, i.e., airports, "communication, navigation, and surveillance facilities," air traffic controllers, etc. The operation of each subsystem or sublayer is so sophisticated that any study on it is worthwhile. These complicated subsystems or sublayers should be further accounted for in resilience descriptions.

Summary.
We clarified the relations among robustness, vulnerability, and resilience at the beginning of this section. Under this guidance, we surveyed the literature about the resilience of the air transportation network under complex network theory. Future efforts are outlined as follows:
(1) The air transportation system is a practical infrastructure, whose robustness should be connected with system performance. Traffic information is fundamental for describing robustness. Besides topological indicators, scholars have developed numerous indexes to characterize air transportation network robustness from the perspective of either traffic demand or traffic supply. However, the situation during COVID-19 suggests that traffic demand and traffic supply should be counted simultaneously in the measurement of air transport network robustness. In this rare scenario, traffic demand is heavily suppressed, while traffic supply (the capacity of the air transport network and airspace) remains almost the same as before. The demand is so weak that any random errors or intentional attacks on the system would produce much less disturbance than before.
(2) Under the complex network framework, the conventional attack and recover strategies remove and recover nodes/edges one by one based on the values of topological metrics. The Monte Carlo Tree Search algorithm [73], the adaptive saving and rerouting edge strategy [74], and the defender-attacker-defender optimization model [78] have been employed for more effective attack strategies. To the best of our knowledge, however, investigations of recover strategies still rely on topological indices.
(3) It is encouraging that the resilience description of the air transportation network system has proceeded to cover all response phases, although this work has only just begun. There are various subsystems or sublayers in air transport networks, and any of them may produce large-scale disruptions. These factors should be further considered in resilience descriptions. We believe that only in this way can the resilience of the air transport network system be accurately evaluated.

Flight Delay
Flight delay occurs when an airplane cannot take off or land on time. This abnormality may spread to downstream flights through airplane rotation. Furthermore, flight delay will propagate heavily in some extreme situations, such as severe weather and airspace restrictions. Under the complex network framework, flight delay diffusion can be portrayed as network dynamics, such as cascade failure [22,80] or an epidemic process [81].
Current investigations of flight delay in complex network theory focus on empirical analysis, the discovery of delay causal relations, the prediction or modelling of delay propagation, and the identification of best spreaders. Empirical analyses [27,82] share similar assumptions, approaches, and limitations with the network character analysis in Section 3. Therefore, this section does not discuss them further. Articles [83–87] are also outside our scope since they go beyond the complex network framework, even though they are valuable contributions to the prediction or modelling of delay propagation.

Flight Delay Causal Relations.
Building on character analyses of flight delay in complex networks, scholars have attempted to discover the causal relations of flight delay. The causal relations among vertices may be mined via statistical significance tests. A flight delay causality network is built by connecting pairs of vertices if one directly induces flight delay at the other. The characteristics of the flight delay network are then analyzed under complex network theory. Table 6 surveys recent high-impact studies on flight delay causal relations under the complex network framework. At present, the properties of established flight delay networks are evaluated via topological metrics from complex network theory. Network metrics-based analyses in this field share similar limitations with the network character evaluation in Section 3. Therefore, our primary concern is the fundamental data, especially the network type and the time series of flight delay used for the significance test. The approaches used to mine causal relations among flight delays are also of interest.
The analysis of Table 6 is organized according to these focuses. The first one concerns the fundamental data. Although research on flight delay causality under complex network theory emerged only six years ago, related studies have covered major nations/regions around the world. The significance test method cannot work without time series of flight delay. The data in these time series vary from landing delay to departure delay. Real flight operation prefers ground delay to airborne delay for reasons of safety and fuel saving. This makes delay occur more often at takeoff than at landing. Cancelled flights should also be counted, since a considerable number of flights may be cancelled in large flight delay events.
However, all flight delay causal relations are studied in the airport network, whereas real flights operate in the air route network or ATC sector network. There are multiple intermediate nodes between the origin airport and destination airport in real flight operation, as illustrated in Figure 1. Flights frequently diverge, converge, and cross at intermediate nodes, such as W2 for A-C and B-D in Figure 1. Flight delay will diffuse heavily once such an intermediate node is blocked or covered by severe weather. Another frequent situation is that, when there are too many flights between A-C, W2 might be allocated to these flights with priority. Thus, flights between B-D will be delayed even though there are no direct links between A-C and B-D. In this case, flight delay should be studied in the air route network or ATC sector network.
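The W2 situation can be made concrete with a small sketch. The route lists below are hypothetical (only W2 is named in the text, so W1 and W3 are assumed waypoints), and `shared_waypoints` and `share_airport` are illustrative helpers rather than methods from the surveyed works. The sketch shows that two airport pairs with no common airport can still be coupled through an intermediate node, which an airport-only network cannot represent.

```python
# Hypothetical air routes: each airport pair maps to its ordered node list.
# Only W2 is named in the text; W1 and W3 are assumed for illustration.
routes = {
    ("A", "C"): ["A", "W1", "W2", "C"],
    ("B", "D"): ["B", "W2", "W3", "D"],
}

def shared_waypoints(routes, pair_1, pair_2):
    """Return the intermediate nodes used by both airport pairs."""
    inner_1 = set(routes[pair_1][1:-1])  # drop origin and destination
    inner_2 = set(routes[pair_2][1:-1])
    return inner_1 & inner_2

def share_airport(pair_1, pair_2):
    """True if the two pairs have an airport in common."""
    return bool(set(pair_1) & set(pair_2))

coupled = shared_waypoints(routes, ("A", "C"), ("B", "D"))
adjacent = share_airport(("A", "C"), ("B", "D"))
print(coupled, adjacent)
```

Here the two pairs share no airport, yet they compete for W2, so a delay on A-C can reach B-D only through the route-level node.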
The second concern is the significance test method adopted to detect flight delay causal relations. The most frequently employed approach is the Granger causality test, which is linear and relies on smoothed and averaged data. However, flight delay usually propagates in a nonlinear fashion. In large delay events, flight delay might diffuse heavily within one specific time window, and such heavy propagations are likely to be masked by the smoothed and averaged data used in the classical Granger causality test. To overcome these limitations, Belkoura and Zanin [29] developed a causality test for extreme events that identifies higher-than-expected delays. It relies on the abnormal values and the statistics of how such values are propagated. They also compared the performance of the new approach with conventional Granger causality in terms of causal links, transitivity, efficiency, largest shortest path length, greatest distance between any pair of vertices, and assortativity. Besides the use of averaged data, weighting small and large delays equally also makes Granger causality unable to detect large delays. Small delays are easily absorbed during the airborne trip and by flight schedule buffers [34]. For this reason, Mazzarisi et al. [34] proposed an extension of the Granger causality test, namely, Granger causality in tail, which considers only extreme events and large delays. They also compared the performance of the proposed method with the traditional Granger causality test through causal links, average path length, and average clustering coefficient.
Besides the Granger causality test, Xiao et al. [32] proposed a low-dimensional approximation of conditional mutual information for the transfer entropy test to mine causality among nonlinear flight delays. Then, they designed a simulation experiment based on artificial nonlinear time series. Using indicators based on true positives, false negatives, true negatives, and false positives, the performance of the proposed method was compared with that of the Granger causality and transfer entropy approaches. Simulations demonstrated that the estimation accuracy increased when transfer entropy was used to quantify and validate delay propagation. Wang et al. [33] applied the Pearson correlation coefficient to capture the correlations between flight delays at different airports. They also established flight delay networks from operational data.
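As a rough illustration of the correlation-based idea, the sketch below links two airports when the Pearson coefficient of their delay series reaches a cutoff. The delay values, airport codes, and the 0.8 threshold are all assumptions made for the example, and the construction is not claimed to reproduce the procedure of Wang et al. [33]. Because correlation is symmetric, the resulting network is undirected and does not by itself indicate the direction of delay propagation.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def correlation_network(delays, threshold):
    """Link two airports when |r| of their delay series reaches threshold."""
    airports = sorted(delays)
    edges = []
    for i, a in enumerate(airports):
        for b in airports[i + 1:]:
            r = pearson(delays[a], delays[b])
            if abs(r) >= threshold:
                edges.append((a, b, round(r, 3)))
    return edges

# Made-up hourly average delays (minutes) for three hypothetical airports.
delays = {
    "AAA": [5, 10, 20, 40, 30, 10],
    "BBB": [6, 12, 22, 38, 33, 9],
    "CCC": [30, 5, 25, 8, 12, 28],
}
edges = correlation_network(delays, threshold=0.8)
print(edges)
```

With these made-up series only the AAA-BBB pair passes the cutoff; in practice the threshold would itself need a significance test.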
Detecting delay propagation is fundamental for subsequent flight delay research. Under complex network theory, the Granger causality test, transfer entropy test, and Pearson correlation test have emerged in current studies of flight delay causal relations. Scholars have recognized their limitations and attempted to improve them. However, this problem has not attracted much attention in general, and we have not seen many efforts. Furthermore, the evaluation of detection accuracy deserves more attention. Belkoura and Zanin [29] and Mazzarisi et al. [34] compared the accuracy of their proposed methods with that of the traditional method. Xiao et al. [32] evaluated the accuracy of the proposed scheme through a flight delay propagation simulation experiment. None of these methods has been verified, validated, or evaluated with real flight delay data. More effective flight delay propagation test methods with high accuracy remain to be developed.
Ultimately, the idea behind the above flight delay causality detection methods is to mine delay causality from flight delay time series via statistical significance tests. In essence, these schemes rely entirely on flight delay data. They do not make full use of the flight schedule and flight plan. In real flight operation, flights are arranged by flight schedules, and most aircraft are scheduled for several flights within one day. Moreover, every commercial flight has to submit its flight plan in advance. The flight plan contains the origin airport, destination airport, planned air route, expected departure/arrival time, and tail number of the aircraft. Through the tail number, we are able to track the flight path of an aircraft and, of course, the delay propagation path. We believe that the accuracy of flight delay causality detection mechanisms will increase once the flight schedule and flight plan are taken into account.
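The tail-number idea can be sketched as follows. The record layout, the tail numbers, and the 15-minute cutoff are assumptions for illustration only; `rotation_delay_links` is a hypothetical helper, not a method from the surveyed papers. It groups legs by tail number, orders them by scheduled departure, and records a candidate delay link whenever two consecutive legs of the same aircraft both depart late.

```python
from collections import defaultdict

# Hypothetical records: (tail, origin, destination, sched_dep_min, dep_delay_min)
flights = [
    ("B-1001", "A", "B", 480, 45),
    ("B-1001", "B", "C", 620, 30),
    ("B-1001", "C", "A", 800, 0),
    ("B-2002", "A", "C", 500, 0),
    ("B-2002", "C", "D", 650, 5),
]

def rotation_delay_links(flights, min_delay=15):
    """Return candidate airport-to-airport delay links along aircraft rotations.

    A link (tail, u, v) is recorded when one aircraft departs u late and its
    next leg also departs v late; min_delay (minutes) is an assumed cutoff.
    """
    by_tail = defaultdict(list)
    for record in flights:
        by_tail[record[0]].append(record)

    links = []
    for tail, legs in by_tail.items():
        legs.sort(key=lambda leg: leg[3])  # order legs by scheduled departure
        for prev, nxt in zip(legs, legs[1:]):
            if prev[4] >= min_delay and nxt[4] >= min_delay:
                links.append((tail, prev[1], nxt[1]))
    return links

links = rotation_delay_links(flights)
print(links)
```

Such links are only candidates: a late second leg may have other causes, so they would complement rather than replace the statistical tests discussed above.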

Flight Delay Propagation Modelling.
Subsequently, scholars began modelling or predicting flight delay diffusion for delay mitigation and reduction. Under the complex network framework, flight delay diffusion can be portrayed as network dynamics, such as cascade failure [22,80] or an epidemic process [81].
Our primary concern is how to model the spread of flight delay. The most frequently used models are the SIR model [38] from epidemiology and the cascade failure model [22,80]. The SIR model divides all the individuals in the system into several groups with different states. Each individual is in exactly one state at each time step. Delayed flights or delayed airports are in the infected state, those that have recovered from flight delay are in the recovered state, and undelayed ones are in the susceptible state. Flight delay propagation is simulated through state transitions.
Baspinar and Koyuncu [35] and Mou et al. [36] assumed that recovered vertices are not immune to flight delay and can be delayed with the same probability as susceptible ones. They employed the simplest variant of the SIR model, in which there are only two kinds of states: susceptible and infected. Until an infected node returns to the normal/susceptible state, it remains infected and continuously spreads flight delay to its neighbors. The SIR model directly labels a node as delayed or undelayed. In real flight operation, however, an airport, waypoint, ATC sector, or other component contains more than one flight. Hence, an accurate transmission rate (infection rate or recovery rate) is of great importance for prediction or modelling, since the flights at one airport are rarely all delayed or all undelayed at any given moment.
The cascade failure model [22] reproduces flight delay propagation based on the movement of flight flow in the air transport network system. It assumes that failure or flight delay occurs once the load of a node exceeds its capacity. Failed nodes are removed from the network, and their loads are reallocated to other connected nodes in the air transport network. This reallocation may produce cascade failures. Thus, flight delay diffusion in the air transport network is simulated. Wu et al. [37] assumed that delayed airports are still able to handle a certain amount of their traffic load and kept delayed airports in the network rather than closing them. This idea is closer to real flight operation and makes the proposed model more reasonable. They also explored the association between cascade failure [22] and SIR. The allocation strategy of cascade failure remains a challenging issue since there is more than one flight at a single vertex.
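The remove-and-reallocate rule of the cascade failure model can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the model of [22] or the variant of Wu et al. [37]: the four-node network, the loads and capacities, and the equal split of a failed node's load among its surviving neighbors are all made up for the example.

```python
def cascade(neighbors, load, capacity, initial_failure):
    """Simulate a load-redistribution cascade; return failed nodes in order."""
    load = dict(load)  # work on a copy so the caller's data is untouched
    failed = []
    queue = [initial_failure]
    while queue:
        node = queue.pop(0)
        if node in failed:
            continue
        failed.append(node)
        alive = [n for n in neighbors[node] if n not in failed]
        if not alive:
            continue  # no surviving neighbor can take over the load
        share = load[node] / len(alive)  # assumed rule: equal split
        load[node] = 0.0
        for n in alive:
            load[n] += share
            if load[n] > capacity[n] and n not in queue:
                queue.append(n)  # overloaded node fails in a later round
    return failed

# Hypothetical network: hub H linked to P, Q, R; P and Q also linked.
neighbors = {"H": ["P", "Q", "R"], "P": ["H", "Q"], "Q": ["H", "P"], "R": ["H"]}
load = {"H": 90.0, "P": 40.0, "Q": 40.0, "R": 20.0}
capacity = {"H": 100.0, "P": 60.0, "Q": 80.0, "R": 60.0}

failed = cascade(neighbors, load, capacity, initial_failure="H")
print(failed)
```

In this toy case the hub failure overloads P, whose load then overloads Q, while R survives. Swapping the equal split for a degree-based or priority-based rule only changes how `share` is computed.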
In addition, executing both the individual-based SIR [35,36] and the improved cascade failure [37] models requires computer simulation. This requirement not only increases computational complexity but also reduces flexibility when the network architecture is adjusted. In contrast, in the traditional non-individual-based SIR model, scientists are able to describe the characteristics of the network system and its dynamic process via mathematical equations. This kind of method assumes that all individuals are in equal contact with each other, i.e., that the network is fully mixed. With the graph Laplacian operator [39], Estrada [40] derived and proved properties of the propagation process. He obtained the rate of convergence and the best spreaders through theoretical analysis. In this way, the spread of flight delay can be modelled at a low modelling complexity.
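For the fully mixed case, the standard SIR equations are ds/dt = -βsi, di/dt = βsi - γi, and dr/dt = γi, where s, i, and r are the susceptible, delayed (infected), and recovered fractions. The sketch below integrates them with the forward Euler method. The rates β = 0.5 and γ = 0.2, the initial fractions, and the step size are illustrative assumptions and are not taken from any surveyed study.

```python
def simulate_sir(beta, gamma, s0, i0, r0, dt, steps):
    """Integrate the fully mixed SIR equations with the forward Euler method."""
    s, i, r = s0, i0, r0
    history = [(s, i, r)]
    for _ in range(steps):
        new_infections = beta * s * i * dt   # susceptible -> delayed
        new_recoveries = gamma * i * dt      # delayed -> recovered
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history

# Assumed parameters: 5% of airports delayed at the start.
history = simulate_sir(beta=0.5, gamma=0.2, s0=0.95, i0=0.05, r0=0.0,
                       dt=0.1, steps=600)
peak_delayed = max(i for _, i, _ in history)
final_s, final_i, final_r = history[-1]
print(round(peak_delayed, 3), round(final_r, 3))
```

No individual node is simulated here: the whole trajectory follows from three equations, which is the source of the low modelling cost, and also of the loss of network structure implied by the fully mixed assumption.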
The second concern is the infection/recovery rate and the reallocation strategy. Resolving them is of great significance since they are the key parameters for modelling flight delay propagation.
Baspinar and Koyuncu [35] estimated the infection rate for the airport-based SIR model through data-driven statistical analysis of flight plan information and actual flight flow data. The ground-waiting time of flights was used to determine the infection rate in the flight-based model. Moreover, they calculated theoretical recovery rates of the airport-based model and flight-based model via equations in the SIR framework. Flight time and ground-waiting time were also considered when running the simulation. Furthermore, the infection rate and recovery rate were extracted from actual flight flow data by the Euler method. The accuracy of the established models in capturing delay spreading was demonstrated by comparing the estimated rates with the real rates. Mou et al. [36] introduced two airport-based SI models to evaluate the effects of flight time and ground-waiting time on the spread of flight delay. In one of them, the infection of flight delay is completed at the moment the flight lands. In the other, it is completed once the flight is airborne. Infection rates in both models are directly set to 0.5, while recovery rates are 0.
Wu et al. [37] did not define or calculate the infection rate and recovery rate directly. They reallocated the excess load of an infected airport to its connected susceptible airports according to their weights. There are multiple available reallocation strategies, such as average, degree-based, and priority-based. As introduced in Section 5.1, we are able to track the flight path of an aircraft through the flight schedule and flight plan. The flight path also enables us to follow the spread of flight delay in the air transport network. However, the flight schedule and flight plan should be explored more extensively to obtain a more accurate infection rate, recovery rate, and reallocation strategy.
From the modelling approach to the solution of key parameters, considerable efforts have been made to complete the framework for flight delay propagation under complex network theory. The pursuit of high accuracy is a fundamental objective. In general, however, the accuracy of flight delay propagation models remains unvalidated by real flight operation data. Accuracy testing with real flight data also suggests a new idea for predicting the spread. Since current SIR and cascade failure models rely entirely on prebuilt models to simulate or predict flight delay propagation in the network, real flight data may be used to revise the modelling or prediction. We believe that a revision based on the difference between the predictions of prebuilt models and real flight delays will increase the accuracy of the prediction or modelling.

Best Spreaders in Flight Delay.
With the progress in delay propagation modelling, a new hot issue has emerged: the identification of best spreaders, which is indispensable for delay mitigation and reduction. The general method is to remove nodes or edges one by one and measure the resulting flight delay. The more severe the flight delay is, the more critical the removed node or edge is.
Besides the strategy employed to identify best spreaders, our primary concern is how to measure the influence on flight delay once nodes or links are removed. Cardillo et al. [68] simulated the rescheduling behavior of passengers when their flights are cancelled or links between airports are removed. Through the quantification of rescheduled or delayed passengers, they explored the effects of link deletion in the European Airline Network. The number of rescheduled or delayed passengers was also used to indicate network robustness. Voltes-Dorta et al. [46] proposed an algorithm to quantify the passengers disrupted by the closure of a given airport. The approach also reallocated the disrupted passengers to alternative travel itineraries based on shortest path length. The proportion of reallocated passengers was employed to measure the robustness of the European Airport Network. The accumulated delay experienced by the disrupted passengers was used to identify the critical airports.
Mazzarisi et al. [34] found that existing centrality metrics neither respect the flight schedule nor consider the airline to which each flight belongs. Hence, these indices are not capable of characterizing the effect of delays in the air transportation network. They therefore employed a new centrality to identify critical airports. It differs from conventional approaches in that the importance of an airport is measured through its influence on flight delay.
It is commendable that flight delay is quantified via operational influences rather than the topological changes adopted in robustness indicators. However, the above works share the weak point raised in the first concern of Section 4.2. They identify influential nodes or links through computer simulation, which raises the computational complexity. In contrast, Estrada [40] was able to locate best spreaders through theoretical analysis with the graph Laplacian operator [39], at a low computational cost.
Our second concern lies in the strategy employed to identify best spreaders of flight delay. To the best of our knowledge, in current studies on best-spreader identification, nodes and edges are selected at random or by the values of topological characteristics. The identified nodes/edges may individually be the most critical to flight delay, but their combination may not be the most influential group for flight delay at the network level. This conclusion is the same as that for the attack and recovery strategies of network resilience in Section 4.2.
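A toy example makes the gap between individual and group influence visible. The data below are hypothetical: each airport is mapped to the set of flights delayed when it is removed, standing in for the outcome of a propagation simulation. The helper names are illustrative and the exhaustive search is used only because the example is tiny.

```python
from itertools import combinations

# Hypothetical data: flights delayed when each airport is removed alone.
delayed_by = {
    "X": {1, 2, 3, 4},
    "Y": {1, 2, 3, 5},
    "Z": {6, 7, 8},
}

def group_impact(group):
    """Number of distinct flights delayed when the whole group is removed."""
    delayed = set()
    for node in group:
        delayed |= delayed_by[node]
    return len(delayed)

def top_individuals(k):
    """Rank nodes by single-removal impact and keep the first k."""
    ranked = sorted(delayed_by, key=lambda n: (-len(delayed_by[n]), n))
    return tuple(ranked[:k])

def best_group(k):
    """Exhaustively search all k-node groups for the largest joint impact."""
    return max(combinations(sorted(delayed_by), k), key=group_impact)

naive = top_individuals(2)
best = best_group(2)
print(naive, group_impact(naive), best, group_impact(best))
```

The two individually strongest airports, X and Y, overlap heavily and delay only five flights together, whereas X and Z delay seven. Exhaustive search grows combinatorially with network size, which is why more effective group-search algorithms are called for.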

Summary.
Building on character analyses of flight delay in complex networks, scholars have made great efforts to detect flight delay causalities, to model or predict delay propagation, and to identify nodes/edges influential to flight delay. Section 5 surveyed recent progress on these challenging and pragmatic concerns. The outlook for the future is summarized as follows:
(1) Under the complex network framework, almost all the current research studies in Section 5 are conducted in the airport network. Nevertheless, real flights operate in the air route network or ATC sector network rather than the airport network. In these two networks, there are multiple intermediate nodes between the origin airport and destination airport. The airport network ignores these intermediate waypoints and prevents us from tracing some frequent flight delays. Investigations in this area should pay more attention to the air route network or ATC sector network for better accuracy.
(3) Most investigations of flight delay propagation modelling and best-spreader identification require computer simulation. This not only raises the computational complexity but also reduces the flexibility to adjust the network architecture. With the graph Laplacian operator [39], Estrada [40] revealed and proved properties of the propagation process through theoretical analysis. He opened a new course toward approaches with low computational complexity. It further suggests that the Laplace transform, Z-transform, Fourier transform, or wavelet transform from signal processing may be adopted for that purpose. By transforming from the time domain into the frequency domain, we can analyze flight delay at low computational complexity.
(4) Much effort has been made to reveal flight delay causalities and model flight delay propagation. However, the accuracy of causality detection and diffusion modelling has not been validated with real flight data. Accuracy testing with real flight data also suggests a new idea for predicting the spread. Since current SIR and cascade failure models rely entirely on prebuilt models to simulate or predict propagation, real flight data may be used to revise the modelling or prediction. If the difference between the predictions of prebuilt models and real flight delays is used to revise the next time-step prediction, the prediction accuracy of flight delay diffusion in the next time step is expected to increase.
(5) The general strategy for identifying best spreaders of flight delay is to measure the flight delay after nodes or edges are removed one by one. The more severe the flight delay is, the more important the node or edge is. In this way, the identified nodes/edges may individually be the most influential to flight delay. However, the interactions among the identified individuals are ignored, and their combination may not be the most critical group for the whole network. For flight delay mitigation and reduction, the most crucial group of nodes/edges is more valuable. More effective algorithms should be applied to search for the most crucial group of nodes/edges for flight delay.

Conclusion
Flight delay is one of the most important concerns in the air transportation system. From the perspective of complex networks, we reviewed research studies on network character analysis, network resilience, and flight delay. All of them bear directly or indirectly on the flight delay phenomenon in the air transport network system. We have mapped the state of the art to identify promising approaches and to make their limitations and assumptions clear. With detailed conclusions given in the main text, future directions are summarized in four points:
(1) In the available network models for air transportation under the complex network framework, all nodes or edges are modelled with the same properties. These models disregard numerous subsystems and factors behind flight operation. This modelling method prevents us from tracing and analyzing frequent flight delays. Furthermore, real flight operation may be affected by plane rotation, flight crews, air traffic controllers, "communication, navigation, and surveillance facilities," and other subsystems or sublayers. It is impossible to describe the vulnerability and resilience of the air transport network system comprehensively while these factors remain outside the current complex network framework. It is necessary to use hierarchical modelling approaches to bring flight-operation-relevant subsystems, air route waypoints, and ATC sectors into network models. Hierarchical modelling approaches will make network models in the complex network framework more capable of describing flight delays.
(2) Flight delay is a matter of the functionality or operation of the air transportation system. In current studies of flight delay under complex network theory, there is an obvious gap between topology and functionality. Most available network metrics and robustness indicators are topological. Transmission rates of flight delay propagation are mainly quantified via topological connections. Traffic information is the key to narrowing the gap. More indices that incorporate traffic information and reflect air transport network operations are to be developed. Moreover, the measurement of network system robustness should take traffic demand and traffic supply into consideration simultaneously, as indicated by the situation during COVID-19. Furthermore, the flight schedule and flight plan are indispensable for tracing aircraft flight paths. The flight path is valuable for flight delay causality detection and for estimating flight delay transmission rates. Besides, real flight operation data can be used to test the accuracy of flight delay causality detection and flight delay diffusion models. We also believe that real flight data can be employed to revise the modelling or prediction of flight delay propagation, which may raise the accuracy of flight delay propagation models.
(3) Most investigations of flight delay propagation modelling and best-spreader identification require computer simulation. This requirement not only increases computational complexity but also decreases the flexibility to adjust the network architecture. With the graph Laplacian operator [39], Estrada [40] opened a new course toward approaches with low computational complexity through theoretical analysis. It further suggests that the Laplace transform, Z-transform, Fourier transform, or wavelet transform from signal processing may be applied for that purpose. By transforming from the time domain into the frequency domain, we can analyze flight delay at low computational complexity.
(4) In research on attack/recovery strategies, as well as in investigations that identify nodes/edges influential to network resilience and flight delay propagation, the most frequently adopted selection criteria are to choose nodes/edges at random or by the values of topological metrics. Sophisticated algorithms, such as the Monte Carlo Tree Search algorithm [73], adaptive saving and rerouting edge strategy [74], and defender-attacker-defender optimization model [78], have been applied to obtain more effective attack strategies. However, to the best of our knowledge, recovery strategies still rely on topological indices. Furthermore, the identified nodes/edges may individually be the most influential to network resilience or flight delay propagation. However, most current studies ignore the interactions among them, and their combination may not be the most critical group. Similar to the memetic algorithm [45] and custom methods [24], effective algorithms should be developed to search for the most influential sequence of nodes/edges for network resilience or flight delay propagation.

Conflicts of Interest
The authors declare that they have no conflicts of interest.