Modeling Congestion Propagation in Multistage Schedule within an Airport Network

To alleviate flight delay, it is important to understand how air traffic congestion evolves and propagates. This paper focuses on the aggravation of airport congestion by the accumulation of delayed departure flights. We start by applying a heterogeneous network model that takes the congestion connection/degree into consideration to predict departure congestion clusters. This rests on the fact that, from a micro perspective, the connection between congestion and discrete clusters can be embodied in models. However, the results show that the prediction is highly accurate but time consuming, owing to the complexity of capturing the connections among congested flights. The problem of time consumption is resolved in this paper by improving the models by stages. Stage partitioning based on the variation of delay clusters is similar to the typical infectious cycle. For heterogeneous networks, the model can describe the congestion propagation and its causes at the different stages of operation. If the connection between flights is homogeneous, the model can describe a more indicative process or trend of congestion propagation. In particular, for single-source congestion, the simplified multistage models enable fast short-term prediction. Furthermore, for controllers, the accuracy of prediction using the simplified models can be acceptable, and the speed of prediction is significantly increased. The simplified models can help controllers to understand congestion propagation characteristics at different stages of operation, make fast short-term predictions of congestion clusters, and formulate traffic control strategies.


Introduction
Airport congestion is an inherent problem in civil aviation, often resulting in substantial departure delays, reroutings, and even cancelations. Operating the aviation network is a complex task in which many factors need to be considered, especially disturbances. Congestion at airports is caused by an imbalance between the demand for flights and the capacity of operation units. The flight schedule is in turn limited by both market demand and traffic capacity [1]. Therefore, the flights delayed per hour because the hourly operational capacity is exceeded can be regarded as congested flights, and their temporal variation reveals how congestion evolves. Other indices of congestion are queuing and queuing time [2][3][4][5][6].
Since the global air transportation network is a scale-free small-world network [7,8], it provides a suitable framework to characterize air traffic. A robust analysis can therefore be undertaken using flight performance data and the topological structure of the network to reveal the distribution of delays [9]. Metrics (delay time, centrality, degree distribution, and so on) have been defined to quantify the level of network congestion, and various models have been introduced to describe or predict delay/congestion propagation. Some of the models are derived from observed data [10,11]. Models combined with economic approaches have been proposed to estimate propagation delay [1,12,13] through repeated chain effects in aircraft rotations [13]. In computing the delays due to local congestion [14], the network delay models assume that every node corresponds to a given airport, that two nodes are connected by flight routes, that each node is weighted by its throughput capacity, and that links are weighted by the Euclidean distance [14,15]. At the same time, some research on delay/congestion propagation focuses on the Bayesian network structure learning algorithm, combining genetic algorithms [16][17][18][19][20][21] with timed colored Petri nets [22].
Based on these models, simulation tools can be constructed, for example, for the final approach phase to reduce airport arrival delays [23] and for detailed policy formulation and assessment. Such tools have been used, for example, to reveal that local delays depend on the capacity-to-demand ratio of departures and arrivals [24].
In particular, some characteristics of delay/congestion have been found. For example, objective delay statistics are sensitive measures of the effect of capacity improvements at airports [24], and the capacity and delay of different airports show different "spectral" characteristics, which can be used to examine airport performance [25]. At the same time, cyclic variations in air travel demand and weather at airports have been shown to have an impact on flight delay [26]. These characteristics are the foundation of this paper.
Our previous papers described daily congestion propagation and modeled the evolution of congestion clusters in airports [27] and at the intersection of sectors [28] using some classic epidemic models [29][30][31], based on the similarity between congestion propagation and disease transmission. Predicting congestion propagation is complex because the operational environment is highly variable, so a model of congestion propagation in different stages should take varied factors into account. In this paper, we focus on the congestion propagation of departure aircraft from an airport using multistage and multievent models [32,33]. The assumption is that propagation characteristics resulting from a particular "event" at a particular "stage" manifest a distinct "spectrum" (defined as the evolution of congestion clusters with time in a given airport). An "event" can be equipment failure, extreme weather, luggage off-loading, and so on. Here, we focus on extreme weather, which is the main event causing disturbance and for which data are relatively easy to acquire, and we compare the congestion propagation in different meteorological conditions to reveal the "spectrum". The case study is the ATL airport, one of the busiest airports in the world. Analyzing the spectrum hour by hour should reveal its cyclic nature and enable the relationship between the spectrum and the departure schedule to be determined. Predicting congestion clusters cycle by cycle should improve the accuracy of prediction. Compared with the models in our previous research, modeling congestion propagation in a multistage schedule can enable air traffic controllers to better understand the characteristics of congestion and its propagation, and it provides an accurate and fast way to predict congestion size. Accurate and timely prediction, which controllers need for strategic and tactical choices, benefits both congestion management and efficiency.
The rest of the paper is structured as follows. In Section 2 we describe congestion propagation and analyze its cyclic nature, the schedule variation, and the relationship between delay and congestion. Section 3 establishes multievent and multistage congestion propagation models, with a particular focus on simplicity for fast short-term prediction. Section 4 summarizes the conclusions and the future direction of our work.

Congestion and Congestion Propagation
Traffic congestion results from demand exceeding capacity, and its most visible manifestation is delay at airports. This is partly due to congestion in the terminal or airspace. Relative to the requirements, the main cause of delay is the reduced capacity of air traffic units as a result of disturbance by one or more incidents. Although a suboptimal distribution of scheduled flights also causes congestion at the departure airport, delay clusters give rise to additional, unexpected congestion. Hence, congestion from departure flights can be divided into two parts, one due to the schedule and one due to delay clusters. This paper focuses on the unexpected part: congestion, and its propagation, resulting from delayed flights.

Congestion and Delay Cluster.
Focusing on the congestion caused by departure delay clusters, both the variation of departure capacity and the variation of delay clusters with time are used to define the degree of congestion caused by delay at time t, in terms of the capacity for departure at the airport at time t and the number of delayed flights at time t. The degree of congestion is directly proportional to delay size and inversely proportional to departure capacity. If there is no incident, or its effect is negligible, the capacity is constant.
The evolution of congestion therefore depends mainly on the variation of delay clusters with time. It is well known that delay can propagate on air traffic networks; hence, congestion has the same characteristic.
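As a minimal sketch of this definition, assume that the congestion degree in each hour is simply the ratio of delayed flights to departure capacity (the text states only the proportionality, so the unit constant is an assumption). The function names and the sample numbers below are hypothetical and serve only to illustrate how the degree and its hour-to-hour change could be computed.

```python
def congestion_degree(delayed, capacity):
    """Hourly congestion degree: delayed flights divided by departure capacity."""
    if len(delayed) != len(capacity):
        raise ValueError("delayed and capacity must have the same length")
    return [d / c for d, c in zip(delayed, capacity)]


def hourly_change(series):
    """Difference between consecutive hours; positive values mean growth."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Hypothetical data: delayed departures and departure capacity for four hours.
delayed = [5, 12, 18, 9]
capacity = [60, 60, 45, 60]  # capacity drops in the third hour (e.g., low visibility)

degree = congestion_degree(delayed, capacity)
change = hourly_change(degree)
```

In this made-up series the degree rises while delays accumulate and capacity falls, then drops once capacity recovers, which is the kind of evolution the delay clusters are meant to capture.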

Delay Cluster and Schedule.
Based on the relationship between congestion degree and delay clusters, research on the evolution of delay clusters is the key to revealing the mechanism of congestion transmission. Analysis of 744 pairs of data (delayed flights and scheduled flights) reveals the direct relationship between them, as shown in Figure 1. The probability of delay deterioration is proportional to the number of scheduled flights. That is, when the number of scheduled flights is large, the probability of widespread flight delays tends to be high when the system is disturbed by an "event". The red lines enveloping almost all the dots show the diffusion trend.
In the fitted linear regression, the independent variable is the number of scheduled flights and the dependent variable is the number of delayed flights. The correlation between the delay clusters and scheduled flights is reported in Table 1 (in the Appendix) and is significant at the 1% level; that is, the flight schedule is the main factor that influences flight delay. Therefore, research on congestion propagation must take the variation in flight schedule into account when describing the congestion degree. Because the departure schedule is heterogeneously distributed, we first need to find its temporal distribution. Analyzing scheduled flights, delay clusters, and departure flights along the timeline t, t + 1, t + 2, ... helps to understand the evolution of the delay cluster as well as its capacity for diffusion or dissipation. Each labeled group of aircraft in Figure 2 is assigned a symbol for its aircraft count, and the delay clusters, scheduled flights, and departure flights in time period t are denoted likewise. Taking the most complex scenario (Figure 2(c)) as an example, expressions for the delay clusters, scheduled flights, and departure flights are obtained by combining the data structure at each time period, as shown in expressions (5a)-(5c). Furthermore, expressions (6a) and (6b) reveal the relationships among these three quantities for different time periods.
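The regression and correlation analysis described above can be illustrated with a short sketch. It fits an ordinary least-squares line of delayed flights on scheduled flights and computes the Pearson correlation coefficient. The five data pairs are invented for illustration and are not the 744 observations analyzed in the paper; the helper names are likewise hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical hourly pairs: scheduled departures and delayed departures.
scheduled = [10, 20, 30, 40, 50]
delayed = [2, 4, 7, 8, 11]

slope, intercept = fit_line(scheduled, delayed)
r = pearson_r(scheduled, delayed)
```

A significance test such as the one summarized in Table 1 would additionally require a p-value for the correlation, which this sketch does not compute.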

If severe weather is the main cause of departure delay, congestion accumulates in the departure airport network. The resulting congestion degree is captured in expression (7). Severe weather not only decreases airport capacity but also, in the absence of an intervention, increases the number of delayed flights. Congestion here can describe the density of delayed departure aircraft in an airport. At the same time, it can describe the number of delayed flights. Although these flights are scheduled, they do not always appear in the form of individual aircraft, which makes part of the congestion invisible. The connection between flights at congested airports or in busy periods is not easy to quantify. However, research on the delay clusters provides many useful clues. We can use delay data to measure congestion clusters, and the evolution of delay clusters can be regarded as congestion propagation along the timeline.
A disturbance or incident such as the mechanical failure of an aircraft or a taxiway jam may cause local congestion, which may in turn propagate into regional or global congestion, according to the correlation between flights or the thresholds of spread. Congestion resulting from severe weather can be viewed as the disturbance from a series of incidents or flights. Our models take an unexpected incident (disturbance) as the root of congestion. Furthermore, they are developed to describe congestion propagation in severe weather (details in Section 3).

Multistage Congestion Propagation Model
We model the propagation course as a complex undirected dynamic network. In the network, each node is a flight, and two nodes are linked if congestion exists between them. Each node is weighted equally, and each link is weighted according to the connection intensity, as shown in Figure 3. As far as we know, congestion conflict arises from resource competition. The connection intensity can be described as the sum of shared resources. If two flights share three resources, such as departure time, runway, and taxiway, the intensity should be stronger than for flights sharing only one or two resources. We model the congestion propagation in heterogeneous and homogeneous networks, respectively, based on the distribution of connections. The models capture the propagation process and the factors that influence the delay clusters. The models can then be simplified for fast short-term prediction, if required.
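The link weighting described above can be sketched as follows, assuming that the connection intensity is the plain count of resources two flights share (an unweighted sum; the relative weights of individual resources are not specified here). The flight identifiers and resource labels are hypothetical.

```python
from itertools import combinations


def connection_intensity(resources_a, resources_b):
    """Number of resources shared by two flights."""
    return len(resources_a & resources_b)


def build_network(flights):
    """Return the weighted undirected links of the congestion network.

    flights maps a flight identifier to the set of resources it uses.
    Only pairs that share at least one resource are linked.
    """
    links = {}
    for a, b in combinations(sorted(flights), 2):
        weight = connection_intensity(flights[a], flights[b])
        if weight > 0:
            links[(a, b)] = weight
    return links


# Hypothetical departures and the resources each one needs.
flights = {
    "F1": {"slot_08:00", "runway_A", "taxiway_E"},
    "F2": {"slot_08:00", "runway_A", "taxiway_E"},
    "F3": {"slot_08:00", "runway_B", "taxiway_E"},
    "F4": {"slot_09:00", "runway_B", "taxiway_F"},
}

links = build_network(flights)
degree = {f: sum(1 for pair in links if f in pair) for f in flights}
```

Here F1 and F2 compete for three resources and get the strongest link, while F1 and F4 share nothing and stay unconnected. Unequal degrees across flights are what make the network heterogeneous.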

Model Specification
3.1.1. Heterogeneous Network Model.
Aircraft in airport activity areas share the same resources, for example, taxiways, runways, aprons, flight crews, and vehicles. Delay and/or congestion usually derive from the scarcity of these resources. At the same time, the congestion connection between departure flights varies, and its intensity can be analyzed based on the weighted sum of the shared resources. The congestion connection/degree between departure flights is illustrated in Figure 4. Because the connections are inhomogeneously distributed, the congestion propagation of departure flights is in fact described by a heterogeneous network model.
According to the congestion status and its evolution, the departure flights in a given airport are divided into three clusters: congestion clusters, discrete clusters, and removed (from congestion) clusters. The number/ratio of congestion clusters is indexed by the congestion event and by the stage, and the congestion connection (degree) is written as a superscript to distinguish it from these two subscripts.
At the same time, not all discrete clusters are affected by the congestion clusters; some of them may be transformed directly into removed clusters at a given transformation rate.
The dynamic equations in the heterogeneous network are given in expression (9). Supposing Θ is constant for a given event and stage, the remaining parameters are also constant, and auxiliary expressions couple each stage to the preceding one. The study focuses on the flights in congestion clusters, so the key is to observe the evolution of the congestion clusters. Figure 5 shows the prediction process of congestion clusters based on expression (9), and the specific process is as follows.
(B) Congestion Source. The event that causes congestion at the departure airport, for example, low visibility or a thunderstorm, should be identified. This depends on the prediction of the congestion event.
(C) Propagation Duration. Each case of congestion clusters can develop over several stages, which together constitute the propagation duration. The propagation duration can be determined from the propagation capability of the congestion source.
(D) Parameter Fixing. A similar day can be found by searching historical information in the database, and the main parameters can be fixed based on the data of that day.
(E) Prediction. Using the results of the above steps, we can predict the congestion clusters.
If we assume that every flight on the airport network has the same connection intensity, the congestion propagates on a homogeneous network.

Homogeneous Network Model.
In a homogeneous network, we suppose that every departure flight has a homogeneous distribution of congestion connection/degree, which simplifies the operational environment. The other parameter definitions are the same as those for the heterogeneous network model. The process of congestion propagation of departure flow in a homogeneous network is shown in Figure 6. At first, discrete clusters may become infected through competing for shared resources, so that the congestion clusters enter stage I and pass through the next stages at different rates. Finally, some of the congestion clusters may depart from the departure airport or break out of the congestion connection, turning into removed clusters (either removed from the airport activity area or becoming discrete clusters again). Here we do not take super-infection into consideration; that is, flights do not disturb earlier flights and may affect only later ones. The dynamic equations for a homogeneous network are given in expression (12). For a representative setting (three parameters equal to 3 and three further parameters equal to 100), the time spent on prediction based on expression (9) is about 300 s. For the controller, both accuracy and time are important in short-term prediction. We can simplify the model based on the time distribution of congestion propagation and expression (12). The congestion propagation stages are introduced to reveal this time distribution.

Congestion Propagation Stages.
Congestion propagation exhibits a cyclical fluctuation due to the daily schedules. Hence, its propagation and dissipation rates vary between stages. Similar to the spread of disease, the process of propagation can be divided into four stages: latent, prodromal, maturation, and convalescent. These are shown in Figure 7.

Latent Stage (Stage I).
The period of this stage is [0, 6] on the time axis, which represents the 24 hours ([0, 23]) of a given day. The waveform at this stage is nearly flat, and planned departures are few. If a congestion incident occurs suddenly, only a few flights will be disturbed, and the congestion connection between flights is weak. This is designated in this paper as stage I of congestion propagation. The characteristic of congestion propagation here is "latent", and this stage is therefore not the subject of this paper.

Prodromal Stage (Stage II).
The fluctuation curve presents a high amplitude of variation, due to a sharp growth in planned flights. This phenomenon is well known at most busy airports and is referred to as the "morning rush" or "early rush". Compared to the latent stage, the demand for departures is significantly higher, magnifying the congestion connection between departure flights. When a disturbance occurs, this early stage of congestion propagation might aggravate the traffic jam and increase delay time. The stage, referred to as prodromal, occupies the period (6, 11] on the time axis and is designated as stage II.

Maturation Stage (Stage III).
Accumulated through the early stages, congestion traffic may reach the peak value of the whole day or a steady state (Figure 7). The period in which traffic flow is saturated is referred to as the maturation stage. Compared to the previous stage, there is a significant change in amplitude. The demand for planned departures is basically stable, and the oscillation of the wave is mainly affected by random incident disturbances and not by the schedule. The stage occupies the period (11, 18] on the time axis and is designated as stage III.

Convalescent Stage (Stage IV).
In the convalescent stage, the delayed flights exhibit a slight increase in the initial phase and then decline sharply until reaching a stable state. The propagation of congestion is mainly affected by the schedule variation. Under disturbance by congestion events, congestion dissipates faster than in stages II and III. The stage occupies the period (18, 23] and is designated as stage IV.
Based on the stages described above, we can model congestion propagation in a multistage schedule for short-term prediction.
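The stage boundaries given above translate directly into a lookup from hour of day to stage. The sketch below uses the stated intervals [0, 6], (6, 11], (11, 18], and (18, 23]; the function name and the sample delay values are hypothetical.

```python
def propagation_stage(hour):
    """Map an hour of the day (0-23) to its congestion propagation stage."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must lie in [0, 23]")
    if hour <= 6:
        return "I (latent)"
    if hour <= 11:
        return "II (prodromal)"
    if hour <= 18:
        return "III (maturation)"
    return "IV (convalescent)"


def group_by_stage(hourly_values):
    """Group a 24-value hourly series by propagation stage."""
    grouped = {}
    for hour, value in enumerate(hourly_values):
        grouped.setdefault(propagation_stage(hour), []).append(value)
    return grouped


# Hypothetical hourly delayed-flight counts for one day (hours 0 to 23).
delays = [0, 0, 0, 0, 1, 1, 2, 6, 10, 14, 15, 17,
          18, 19, 18, 20, 19, 18, 17, 18, 12, 7, 4, 2]
by_stage = group_by_stage(delays)
```

Splitting a day this way makes it straightforward to fit or apply a separate simplified model to the data of each stage.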

Multistage Congestion Propagation Model.
Often, congestion is caused by just one "event"; for example, lost luggage may delay a flight and create local congestion. A thunderstorm, on the other hand, generally results in a wider range of congestion. Ignoring the coupling of multiple events, we can specify the simplified model in expression (13). The model variables are the number/ratio of departure congestion clusters and of discrete clusters in each stage. Four transformation rates are defined for each stage: from discrete clusters to congestion clusters, from congestion clusters in one stage to congestion clusters in the next stage, from congestion clusters to removed clusters, and from discrete clusters to removed clusters.
Compared to the complex model (expression (9)), the model in expression (14) is simplified; its accuracy may decrease, but the time spent (about 10 s) is sharply reduced. This benefits fast prediction and fast decision-making for controllers.
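To make the structure of a multistage compartment model concrete, the sketch below advances per-stage discrete and congestion proportions with the four kinds of transition named above. It is not the paper's expression (13) or (14): the mass-action form of the discrete-to-congestion term, the forward-Euler step, and all names and rate values are assumptions chosen for illustration.

```python
def multistage_step(discrete, congested, removed,
                    infect, advance, remove, direct, dt=1.0):
    """Advance all stages by one forward-Euler step.

    discrete, congested -- per-stage proportions of discrete and congestion clusters
    removed             -- proportion of removed clusters (shared by all stages)
    infect  -- per-stage rate, discrete -> congestion (assumed mass-action)
    advance -- per-stage rate, congestion in stage j -> congestion in stage j + 1
    remove  -- per-stage rate, congestion -> removed
    direct  -- per-stage rate, discrete -> removed
    """
    n = len(discrete)
    new_d, new_c, new_r = list(discrete), list(congested), removed
    for j in range(n):
        to_congestion = infect[j] * discrete[j] * congested[j] * dt
        discrete_out = direct[j] * discrete[j] * dt
        congested_out = remove[j] * congested[j] * dt
        # The last stage has no successor, so nothing advances out of it.
        to_next = advance[j] * congested[j] * dt if j < n - 1 else 0.0

        new_d[j] -= to_congestion + discrete_out
        new_c[j] += to_congestion - to_next - congested_out
        if j < n - 1:
            new_c[j + 1] += to_next
        new_r += discrete_out + congested_out
    return new_d, new_c, new_r


# Hypothetical initial proportions and rates for three stages.
discrete = [0.6, 0.2, 0.1]
congested = [0.1, 0.0, 0.0]
removed = 0.0
rates = dict(
    infect=[0.5, 0.4, 0.3],
    advance=[0.2, 0.2, 0.0],
    remove=[0.1, 0.1, 0.2],
    direct=[0.05, 0.05, 0.05],
)

history = [(discrete, congested, removed)]
for _ in range(10):
    discrete, congested, removed = multistage_step(discrete, congested, removed, **rates)
    history.append((discrete, congested, removed))
```

Every transition moves mass from one compartment to another, so the proportions keep summing to one. A step of this kind costs only a handful of multiplications per stage, which is consistent with the short run time reported for the simplified model.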
In the next section, typical congestion days are used as a case study, and the application of the simplified model is introduced stage by stage. Comparisons between scheduled flights on different typical days demonstrate the similarity of the schedule distribution. Figure 8 shows the flight schedules on 22 and 23 December 2014, which are very similar. The immediate cause of traffic congestion in the period 22 to 24 December was severe weather, specifically poor visibility on runways. A comparison of the visibility on these three days is shown in Figure 9. The visibility on runways is key to flight departure and therefore affects airport capacity.

Case Study
In Figure 10, the variation in delay clusters Δd reveals the evolution of congested flights as influenced by visibility. In particular, on 22 December visibility increased gradually after 15:00, increasing the rate of congestion dissipation. In contrast, the visibility on 23 December was low for most of the day, from 7:00 to 22:00, and that day therefore had the highest level of congestion of the three. The gap between the two days is obvious in the typical rush hours, such as 15:00-16:00, 18:00-19:00, and 21:00-22:00. At the same time, there are common rush hours, such as 9:00-10:00 and 12:00-13:00. The differences in congestion clusters are determined by the significant differences in visibility on these typical congestion days.
The multistage models are applied to describe the congestion evolution and predict the congestion trend. Ignoring the complexity and probability of "event" coupling, prediction focuses on the congestion propagation resulting from a single "event". We again take the data for 22-24 December 2014 at ATL airport as our case and predict the congestion clusters with the simplified models.
Stage II. In stage II, the congestion proportion and the discrete proportion are obtained by removing the effect of scheduled flights, using the numbers of scheduled flights, congestion flights, and discrete flights in each period. The congestion propagation model for stage II is given in expression (16). Applying expression (16) to stage II, the comparison between historical data and the prediction result is shown in Figure 11.
Table 2 describes the stage II data. In stage III, the congestion clusters are sensitive to capacity [24]. The model is given in expression (18); applying it to stage III, the comparison between the prediction result and historical data is shown in Figure 12.
Stage IV. In this stage, scheduled flights decline sharply and the visibility differs between the typical days (Figure 9). Hence, we still use the congestion proportion with the effect of scheduled flights removed and find that the curves representing the evolution of the congestion proportion are relatively flat, as shown in Figure 13. However, the curve for 23 December shows a higher congestion proportion because the visibility is much lower than on the other two days.
Comparing the complex model (expression (9)) with the simplified model (expressions (16) and (18)), the time spent on prediction is the dominant difference. Table 3 shows the comparison results under the same initial conditions. Let V denote the average size of a congestion cluster. If V = 20, the mean prediction errors using the complex model and the simplified model are 0.182 and 0.416, respectively; if V = 100, they are 0.91 and 2.08, respectively. The differences between them are small, and the prediction result using the simplified model is acceptable for controllers. Time is critical in congestion prediction, and control and management are more effective in the early stages. Compared with the time spent using the simplified model, the time spent using the complex model is too long to be acceptable. Especially for fast short-term prediction, time cannot be neglected.
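The reported figures can be put on a common scale with a few lines of arithmetic. The sketch below divides each mean prediction error by the corresponding average cluster size, on the assumption that this ratio is a meaningful relative error, and compares the approximate run times quoted earlier in the paper (about 300 s for the complex model and about 10 s for the simplified one). The variable names are hypothetical.

```python
# Mean prediction errors reported in the text, keyed by model and average cluster size.
reported_errors = {
    "complex": {20: 0.182, 100: 0.91},
    "simplified": {20: 0.416, 100: 2.08},
}

# Assumption: dividing by the average cluster size gives a relative error.
relative_errors = {
    model: {size: error / size for size, error in by_size.items()}
    for model, by_size in reported_errors.items()
}

# Approximate run times quoted in the text, in seconds.
complex_seconds = 300
simplified_seconds = 10
speedup = complex_seconds / simplified_seconds

for model, by_size in relative_errors.items():
    for size, rel in by_size.items():
        print(f"{model:>10}  size={size:<3}  relative error={rel:.4f}")
print(f"speedup of the simplified model: about {speedup:.0f}x")
```

Under this reading the relative error is about 0.9% for the complex model and about 2.1% for the simplified model at both cluster sizes, while the simplified model is roughly 30 times faster.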

Conclusions and Further Work
This paper builds on our previous work [27,28] to develop a new congestion propagation model of departure aircraft in a multievent and multistage schedule. The departure flights are divided into three clusters according to the connection between the flights, and the evolution of these three groups in the multistage schedule can reveal the propagation mechanism.

Figure 3: Congestion propagation in departure airport network.

Figure 4: Congestion connection between departure flights in the airport activity area.

Figure 5: Prediction process of congestion clusters based on heterogeneous network model.
In Figure 7, the daily congestion propagation period is divided into four stages based on the characteristics of the propagation curves. We suppose that the departure flights in every stage comprise three clusters (congestion, discrete, and removed) and that the evolution of congestion happens within the same stage. The simplified expressions are given in expression (14).

Figure 8: Schedule flights on typical congestion days.

Figure 9: Comparison between the visibilities on typical congestion days.
[Figure 10 legend: Δd on Dec. 22; Δd on Dec. 23; Δd on Dec. 24; peaks of Δd in typical rush hours; gap between Δd on Dec. 22 and Dec. 23.]

Figure 11: Comparison between the prediction result and historic data for stage II.

In particular, the curves for 22 and 24 December in Figure 13 are almost the same owing to similar levels of visibility.

Figure 13: Prediction result on stage IV in different days.

Table 1: Correlation between delay clusters and departure schedule flights.
**. Correlation is significant at the 0.01 level (2-tailed). b. Repeated sampling is based on 1000 samples.

The scheduled flights vary with airports and time periods. We take one of the most congested airports, ATL, as our example. ATL is the busiest and most efficient airport in the world and, by some accounts, the best in North America. It also holds the distinction of being the first airport in the world to serve more than 100 million passengers in a single year. From January 2016 to December 2017, the proportion of delayed gate departures was 15.56%, and the average delay per delayed gate departure was 46.25 minutes [34].
Table 2 supplies descriptions of the historical data and the prediction result for comparing the two data sets, such as the maximum, minimum, standard deviation, variance, and standard error of the mean. An accuracy measure, computed from the historical data and the corresponding prediction at each time, can be used to estimate the accuracy of our prediction result in a given stage.
Stage III. Because scheduled flights are less volatile in stage III, their effect can be ignored. As any disturbance impacts airport capacity, we use capacity to describe the disturbance intensity, because the visibility (capacity) changes significantly on typical congestion days.

Table 2: Descriptions of historical data and prediction result on stage II.
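The kind of descriptive statistics listed for Table 2 (maximum, minimum, standard deviation, variance, and standard error of the mean) can be computed with the Python standard library, as sketched below. The two series are invented for illustration and are not the values in Table 2; the function name is hypothetical, and sample (n - 1) statistics are assumed.

```python
import math
import statistics


def describe(values):
    """Descriptive statistics of a sample, using sample (n - 1) variance."""
    n = len(values)
    std_dev = statistics.stdev(values)
    return {
        "n": n,
        "minimum": min(values),
        "maximum": max(values),
        "mean": statistics.mean(values),
        "std_dev": std_dev,
        "variance": statistics.variance(values),
        "std_error_mean": std_dev / math.sqrt(n),
    }


# Hypothetical hourly congestion counts: observed history and model prediction.
historical = [4, 8, 15, 16, 23]
predicted = [5, 9, 13, 17, 21]

summary = {
    "historical": describe(historical),
    "predicted": describe(predicted),
}
```

Placing the two summaries side by side shows whether the prediction reproduces the level and spread of the observed data, which complements a point-by-point accuracy measure.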

Table 3: Comparison between complex model and simplified model.