A Multilayer Perceptron Neural Network with Selective-Data Training for Flight Arrival Delay Prediction

Flight delay is the most common concern of aviation stakeholders around the world. Airlines, which suffer monetary and customer-loyalty losses, are the most affected. Various studies have attempted to analyze and solve flight delays using machine learning algorithms. This research aims to predict flight arrival delay using an Artificial Neural Network (ANN). We applied a MultiLayer Perceptron (MLP) to train and test our data. Two approaches were adopted. In the first, we used historical flight data extracted from the Bureau of Transportation Statistics (BTS). The second approach improves the efficiency of the model by applying selective-data training, which consists of selecting only the most relevant instances from the training dataset, namely delayed flights. According to BTS, a flight whose difference between scheduled and actual arrival times is 15 minutes or greater is considered delayed. Departure delay and flight distance proved to contribute strongly to arrival delays. Tuning the hyperparameters with the grid search technique helped us choose the right network architecture and obtain better accuracy and lower error than the existing literature. The results, efficiency, and time complexity of selective training were compared against those of the traditional training procedure. The MLP was able to predict flight arrival delay with a coefficient of determination R² of 0.9048, and the selective procedure achieved a time saving and a better R² score of 0.9560. To assess the reliability of the proposed method, the performance of the MLP was compared with that of Gradient Boosting (GB) and Decision Trees (DT). The MLP outperformed all the benchmark methods considered.


Introduction
In the last few years, air transport has experienced high growth and demand, mainly because of its comfort, speed, safety, and efficiency. The massive increase in air traffic has resulted in congestion in the airspace and at airports, leading to traffic delays. Flight delays are an inconvenience to airlines, airports, passengers, and aviation authorities. Delays occur due to mechanical and technical issues, slot restrictions, personnel strikes, lack of runway, airport, or airspace capacity, the hub status of the airports, or poor weather. The authors in [1] concluded that thunderstorms are responsible for flight delay duration. As a consequence, a delayed flight can be costly to passengers, who arrive late or cancel their scheduled appointments, and to airlines, which suffer large economic and customer-loyalty losses. It also affects the environment by increasing fuel consumption and gas emissions [2]. Hence, delay prediction is very important. The International Civil Aviation Organization (ICAO) has established a program called Air Traffic Flow Management (ATFM) with the objective of ensuring that the traffic volume is compatible with the capacities declared by aviation authorities, in order to reduce ground and en-route delays. Another application of traffic management is the Free Route Airspace (FRA) concept, which consists of using the shortest routes possible in order to reduce flight time, CO2 emissions, and fuel waste. Moreover, several other models have been developed to address the delay problem based on probability, statistics, graph and network representations, operational research studies, and so on. Ours is based on machine learning algorithms. Recently, the use of artificial neural networks (ANN) has become widely recommended in different fields: medical applications, pharmaceutical sciences, engineering, banking, social media, and so on.
One of the greatest advantages of an ANN is its capability to learn rapidly from its environment (data, tasks, and so on). It is also able to identify redundant and noisy variables during training [3]. To predict arrival delays using ANN, we chose the multilayer perceptron (MLP) because of its reliability and performance. Unlike other statistical techniques, the MLP can model highly nonlinear functions and has been shown to be effective when presented with new unseen data. The MLP has been applied to a wide variety of tasks such as prediction, function approximation, and pattern classification [4].
Only a few studies on flight delay prediction using MLP have been conducted and, to the best of our knowledge, none has adopted MLP-based selective-data training. The proposed method was applied to predict the arrival delay of United States domestic flights in the year 2018. Open data from the Bureau of Transportation Statistics were used [5]. To enhance the performance of the proposed prediction model, a selective procedure, which consists of keeping only the delayed flight data, was employed separately and compared with the traditional procedure. Hyperparameter optimization has been shown to yield better performance for machine learning algorithms than manual search, which tends to be tedious and time-consuming [6]. To find the best parameters, we used the grid search technique, which examines the entire search space. Each dimension of the grid represents a continuous or discrete variable to be optimized after several trial-and-error processes [7]. In order to assess the reliability of the proposed method, we evaluated the model against two other regression algorithms, namely, gradient boosting (GB) and decision trees (DT). The main contributions of the proposed work can be summarized as follows:
(i) The proposed model can be utilized to estimate and predict flight arrival delays.
(ii) It can be extended to other applications simply by adapting or changing the data. It can be used to predict departure delays instead of arrival delays. Air transport can be replaced by maritime or railway transport by predicting ship or train departure or arrival delays, and so on.
(iii) We introduce a novel technique, namely, selective training, to help the system focus on relevant data and avoid overfitting. Data were refined, transformed, and prepared for learning using preprocessing and cleaning techniques.
(iv) To better determine the architecture of the network and tune the hyperparameters, the grid search technique was adopted. As a consequence, the best configuration of our model was generated.
(v) To ensure the reliability and efficacy of the proposed method, results are evaluated and compared with existing systems from the literature. To further demonstrate the effectiveness of the study, other machine learning models, namely gradient boosting and decision trees, were applied and compared with the multilayer perceptron. The complexity of the traditional training and of the novel technique was calculated and compared.
The rest of the paper is structured as follows: Section 2 presents an overview of the artificial neural network. Section 3 gives a brief review of previous research studies related to flight delay prediction with machine learning algorithms. The methodology proposed in this study is described in Section 4, followed by the experimental results of the predictive model in Section 5. Section 6 derives the computational complexity of both the traditional and selective training models. Section 7 provides a conclusion and suggests future work.

Artificial Neural Network (ANN)
As a simulation and imitation of the brain's neural network, an ANN is a mathematical structure that requires less formal statistical training to develop. It has the advantage of being able to detect complex nonlinear relationships between independent and dependent variables and every possible interaction between predictor variables. It can be built using multiple different training algorithms [8]. ANN performs better with large datasets, unlike SVM and random forest, which show high accuracy and precision with smaller datasets [9]. It can be adjusted to lower its error, although there is no guarantee that the error could not be lowered further [10]. The authors in [11] considered ANN among the most powerful machine learning algorithms for time-series predictions. A large-scale empirical comparison between ten supervised learning methods demonstrated that neural networks are more competitive and efficient than boosting, random forests, bagging, and support vector machines [12].
ANN applications are numerous. In classification, ANN is used for image and speech recognition, abnormal event detection, customer purchasing patterns, and so on. In regression, it is applied to stock market predictions, forecasting applications, real-time optimization, model predictive control, and so on. The authors in [13] used ANN to predict fatigue crack growth and propagation in both short and long crack regimes. The authors in [14] applied ANN to detect and classify COVID-19 disease from X-ray images using capsule networks. The authors in [15] chose a multilayer perceptron (MLP) neural network to assess earthquake hazard safety and identify vulnerable buildings. Other applications of ANN are highlighted further in the paper.

Literature Review
Researchers have carried out flight delay prediction studies from different perspectives and using several modeling methods: statistical analysis, probabilities, queuing modeling, simulation techniques, network representations, and machine learning. Unlike the traditional methods, which have generally proven to be weak, slow, and limited, machine learning algorithms have become more popular because they lead to higher accuracy and can deal with huge amounts of data. Several research studies have been conducted on flight delay prediction using machine learning (ML) algorithms.

Neural Network Models.
The authors in [16] presented a comprehensive study of traffic delays using several machine learning models. The MLP model achieved 89.07% accuracy, compared with the convolutional neural network (CNN), which showed a slightly better prediction accuracy of 89.32%. The authors in [17] applied the ANN to predict airborne delay due to air traffic control using actual operation data observed by radars. The performance of the ANN was compared with that of the queuing analysis method. The ANN was able to predict the average delay but, unlike the queuing method, was unable to learn the propagation of the delay. The authors in [18] selected ANN to create a prediction model of air delays on the route linking São Paulo to Rio de Janeiro. The random search technique was used to tune the hyperparameters of the network. The results showed an accuracy above 90%. The researchers in [19] introduced a new type of multilevel input layer ANN capable of handling nominal variables in order to predict the delay of incoming flights at JFK airport. The authors in [20] applied decision trees, random forest, multilayer perceptron, and different sampling techniques to predict flight delays. The best model was the MLP, with 85% accuracy. The authors in [21] implemented a supervised machine learning model that predicts the delay deviation time of new flights at Lithuanian airports. They used the grid search technique with seven algorithms: probabilistic neural network, multilayer perceptron, decision trees, random forest, tree ensemble, gradient boosted trees, and support vector machines. The MLP showed an accuracy of 96.02% for departures but performed poorly on arrivals, with 47% accuracy.

Other ML Models.
The authors in [22] provided a cost-sensitive delay prediction model using supervised machine learning algorithms such as decision trees, random forest, AdaBoost, and kNN. The authors in [23] developed a flight delay predictive model by combining multilabel random forest classification with an approximated delay propagation model. They demonstrated that feature selection performed better than using all the features from the dataset. The study in [24] aimed to predict departure and arrival flight delays at an individual airport using the gradient boosted decision tree algorithm. It showed better performance than other methods. To estimate airline delays, the authors in [25] utilized binary supervised and unsupervised machine learning classification algorithms. Paper [26] proposes a two-stage predictive model (classification and regression) employing supervised machine learning algorithms in order to predict the on-time performance of flights. The authors in [27] utilized decision tree, logistic regression, and neural network classifiers to predict flight arrival delays for the year 2015 in the United States. An accuracy of 91% was achieved by all three classifiers. The authors in [28] applied multiple linear regression, decision trees, and random forest algorithms using RStudio to predict and identify the critical parameters responsible for flight delay. The authors in [29] created a Bayesian network model to analyze flight departure delay at a large hub airport. The model proved to have high convergence and accuracy rates. The authors concluded that parameter learning can reflect departure delay. The authors in [30] used an improved SVM model, kNN, and random forest to predict flight delays, with R² values of 0.71, 0.14, and 0.09, respectively. The authors in [31] compared the results of a deep belief network combined with support vector regression (DBN-SVR) with those of k-nearest neighbors (kNN), support vector machine (SVM), and linear regression (LR). The coefficient of determination R² was 0.93, 0.87, 0.87, and 0.82, respectively. Table 1 presents a summary of prior studies on flight delay prediction.
The present work stands out from the previous academic literature by applying two different approaches to predict flight arrival delay in the United States context. Although the MLP is reliable and performs well, only a few studies have employed it to predict flight delays, unlike other machine learning methods, which are applied far more often. This gap in the existing literature encouraged us to use the MLP in the same context but in a different way.
This study is, to the best of our knowledge, the first attempt to apply selective training in the flight delay prediction field. Instead of blindly and manually testing MLP parameters, we adopted the grid search technique to find the best architecture of the model.

Motivation of the Proposed Method.
Flight delay prediction is crucial not only for passengers and airlines but for every player in the aviation and transportation systems. The proposed model is able to predict flight arrival delays in the United States context. With a model that predicts air traffic delays well, airlines will be able to inform their passengers of a delay in advance. Moreover, the proposed model, if utilized appropriately, can enable aviation stakeholders to analyze, study, and reduce delays by finding the best course of action during the decision-making process, without necessarily having to invest in airport infrastructure and installations.
A survey (survey link: https://t.ly/VO4Q) was conducted before carrying out this study in order to identify the importance of predicting and minimizing flight delays from the point of view of pilots, air traffic controllers, airport personnel, and passengers from different countries: Morocco, Egypt, Sweden, and the United States. All respondents confirmed having experienced flight delays that had consequences such as frustration, delay propagation, missing the next flight in case of a stopover, and not being able to check in on time at destination hotels without 24-hour reception. They all agreed that predicting air traffic delay is important. Pilots claimed that flight delays lead to work under pressure, stress, and [...]
Several organizations emphasize that safety is a major concern in aviation. Air Traffic Management (ATM) rests on three guiding principles, which the International Civil Aviation Organization (ICAO) lists in its ATM book in order: safety, regularity, and efficacy [32]. The Federal Aviation Administration (FAA) services (Federal Aviation Administration-Air Traffic Services, https://www.faa.gov) provide safe, secure, and efficient management for the National Airspace System and the international airspace assigned to U.S. control. The National Aeronautics and Space Administration (NASA), in collaboration with the Office of Safety and Mission Assurance (OSMA), created a culture and an environment that keep safety in the forefront and as a priority. Since aviation is a sensitive and delicate field, we decided to support safety by establishing a simple, smooth, low-risk but efficient predictive system. Also, using an excessive number of features may lead to overfitting. For this reason, we chose to keep our model simple and understandable.
To implement the predictive system, we chose the MLP regressor because it is effective and performs well compared with other traditional statistical techniques. It is a universal approximator that presents better efficiency for function approximation in high-dimensional spaces. Unlike conventional linear regression methods, which suffer from the curse of dimensionality, the error convergence rate of the MLP is independent of the input dimensionality [33]. The MLP is also known for being easy to implement and for providing high-quality, solid models with a low training time compared with more complex methods [34], which meets our need for a safe and secure predictive model in the aviation field.
As an alternative to traditional statistical modeling techniques, the MLP has been applied in many scientific disciplines, for tasks categorized as prediction, function approximation, or pattern classification. In prediction, researchers chose the MLP for wind speed forecasting [35], financial predictions [36], forecasting of stock prices [37], bacteria type prediction [38], temperature forecasting [39], short-term rainfall forecasting [40], and so on. In our case, the purpose of the proposed MLP-based model is to estimate flight arrival delays. However, it can be extended to other applications simply by adapting or changing the data. It can be utilized to predict departure delays instead of arrival delays. Air transport can be replaced by maritime or railway transport by predicting ship or train departure or arrival delays, and so on.

Description of the Proposed Method.
Data are the core of any machine learning algorithm. In the first stage, we collect historical flight data from the Bureau of Transportation Statistics (BTS). In order to extract appropriate information, feature selection, which consists of considering only the relevant features, is applied. Collected data are generally noisy, incomplete, and redundant [41]. Since ours contain undesired information, cleaning and preprocessing techniques, which involve correcting, transforming, replacing, and deleting data, need to be performed at this stage. Then, we apply to the modified dataset either the traditional technique of selecting all data records or the selective training, which consists of keeping only the delayed flights. Next, we split the dataset into 70% for training and 30% for testing. The test set is used for the prediction. The model is trained with the multilayer perceptron regressor. Grid search is applied to generate the best hyperparameters and an optimized model. To evaluate the performance of the proposed model, accuracy, computational complexity, and error metrics are calculated and compared for both traditional and selective training. Gradient boosting (GB) and decision trees (DT) are used as benchmark methods to assess the efficiency of the MLP. The choice of GB and DT for the comparison is justified in Section 5 of the paper. The flowchart of the proposed methodology is illustrated in Figure 1.
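The data-preparation steps described above (cleaning, optional selective filtering, and the 70/30 split) can be sketched in plain Python. This is only an illustrative sketch under stated assumptions, not the code used in the study: the record layout, the field names `dep_delay`, `distance`, and `arr_delay`, the helper names, and the fixed shuffle seed are all hypothetical.

```python
import random

DELAY_THRESHOLD_MIN = 15  # BTS definition of a delayed flight (minutes)


def clean(records):
    """Drop records with missing values and exact duplicates."""
    seen, cleaned = set(), []
    for rec in records:
        if any(value is None for value in rec.values()):
            continue  # record with a missing value
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue  # duplicate record
        seen.add(key)
        cleaned.append(rec)
    return cleaned


def select_delayed(records, threshold=DELAY_THRESHOLD_MIN):
    """Selective-data training: keep only flights delayed on arrival."""
    return [rec for rec in records if rec["arr_delay"] >= threshold]


def split_70_30(records, seed=0):
    """Shuffle, then split into 70% training and 30% test records."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical toy records (not taken from the BTS dataset).
flights = [
    {"dep_delay": 20, "distance": 900, "arr_delay": 25},
    {"dep_delay": 20, "distance": 900, "arr_delay": 25},    # duplicate
    {"dep_delay": 0, "distance": 400, "arr_delay": -3},
    {"dep_delay": None, "distance": 650, "arr_delay": 10},  # missing value
    {"dep_delay": 45, "distance": 1200, "arr_delay": 40},
    {"dep_delay": 5, "distance": 300, "arr_delay": 16},
]

cleaned = clean(flights)
selected = select_delayed(cleaned)   # skip this step for traditional training
train, test = split_70_30(selected)
```

For the traditional procedure, `split_70_30(cleaned)` would be called directly, so the only difference between the two procedures in this sketch is the filtering step.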

Data Source.
The dataset used for the study is extracted from the Bureau of Transportation Statistics (BTS). It contains more than 760 thousand samples of historical flight data from the 1st of January to the 31st of December 2018, in the United States. The BTS database has proven to be reliable and holds statistical data saved since 1987. All flights whose difference between scheduled and actual arrival times is 15 minutes or greater are considered delayed.
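The 15-minute rule can be written down as a small check. The sketch below assumes that scheduled and actual arrival times are available as `datetime` objects; the function names and the sample timestamps are hypothetical and serve only to illustrate the threshold.

```python
from datetime import datetime

DELAY_THRESHOLD_MIN = 15  # threshold quoted from BTS in the text


def arrival_delay_minutes(scheduled, actual):
    """Arrival delay in minutes (negative if the flight arrived early)."""
    return (actual - scheduled).total_seconds() / 60.0


def is_delayed(scheduled, actual, threshold=DELAY_THRESHOLD_MIN):
    """A flight counts as delayed when the delay is 15 minutes or greater."""
    return arrival_delay_minutes(scheduled, actual) >= threshold


# Hypothetical timestamps for illustration.
scheduled = datetime(2018, 1, 1, 14, 30)
almost_late = datetime(2018, 1, 1, 14, 44)   # 14 minutes late: not delayed
late = datetime(2018, 1, 1, 14, 45)          # 15 minutes late: delayed
```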

Features Selection.
The selected data contained several features, of which we kept only the relevant ones that contribute strongly to traffic delay. The features used in this project are summarized as follows:
(i) Date of flight: the date on which the flight was performed
(ii) Carrier: the airline/company code
(iii) Origin: airport of departure
(iv) Destination: airport of arrival
(v) CRS DEP: the scheduled departure time
(vi) Actual DEP: the actual departure time
(vii) DEP delay: the departure delay in minutes
(viii) CRS ARR: the scheduled arrival time
(ix) Distance: distance of the flight in miles
(x) ARR delay: the arrival delay in minutes, which is our dependent variable
Logically, arrival and departure delays are highly correlated. A flight that is delayed on departure is very likely to be delayed on arrival. This is consistent with [42], according to which congestion at the destination airport originates to a great extent at the departure airport. For this reason, we considered DEP delay as a contributing feature to arrival delays. Long flights are more likely to have delays due to possible bad weather, travel across multiple time zones, crew stress and fatigue, and so on. Distance is therefore a relevant feature for our study.

Preprocessing and Cleaning.
In the preprocessing stage, we eliminate unnecessary information and keep only the relevant information to ensure coherence. This process consists of preparing and formatting the data before training by the following:
(i) Removing data records with missing values
(ii) Eliminating errors and null values
(iii) Removing duplicate data
(iv) Converting categorical data to numeric

MLP Modeling and Hyperparameterization.
Being the most common and popular network architecture in use today, the MLP is a feedforward neural network composed of more than one perceptron [10]. It is generally applied in prediction, function approximation, or pattern classification. The authors in [15] used an optimized MLP model with three hidden layers (25, 15, 10) to predict the damage state of reinforced concrete buildings from the Duzce earthquake in Turkey. Results showed that the MLP model has a high accuracy in detecting the most vulnerable buildings. The authors in [43] developed a set of multiple MLP neural networks with the backpropagation learning algorithm using an adaptive learning rate. The study was applied to thyroid disease diagnosis in the Internet of medical things, where an accuracy rate of 99% was achieved.
The MLP, compared with other traditional statistical techniques, has been shown to be effective with unseen data and nonlinear systems, especially in prediction applications [4]. Although it is reliable and performs well, only a few studies have employed the MLP to predict flight delays. This gap in the existing literature encouraged us to use the MLP in our research together with parameter optimization. Besides, to the best of our knowledge, no one has adopted MLP-based selective training.
Generally, most machine learning algorithms achieve optimal results only if their parameters are tuned and adjusted properly [44]. To save search time and effort, we adopted the grid search technique with k-fold cross-validation; in our case, k was equal to three (k = 3). For the construction of the neural network in both the traditional and selective procedures, we utilized Scikit-learn, a well-maintained, comprehensive, and open-source machine learning package in Python [15]. For model training and testing, we applied the recommended 70-30 split. For the hyperparameterization, the following parameters were chosen: (a) hidden layer sizes: (50), (50, 50), (100); (b) activation function: tanh and ReLU; (c) solvers: stochastic gradient descent (SGD) and Adam; and (d) L2 penalty (regularization term) parameter: 0.001, 0.05, and 1e-10. To obtain the prediction results, we followed the ANN process of multiplication, summation, and activation using the following equation:
$$y(t) = F\left(\sum_{i=0}^{n} w_i(t)\, x_i(t) + b\right)$$
where $x_i(t)$ is the input value (the independent variable/feature used to predict the output) in discrete time $t$, where $i$ goes from 0 to $n$ inputs; $w_i(t)$ is the weight value representing the influence of input nodes on the output in discrete time $t$, where $i$ goes from 0 to $n$ inputs; $b$ is the bias that shifts the result of the activation function towards the positive or negative side; $F$ is the transfer or activation function, which decides whether a neuron should be activated or not; and $y(t)$ is the output value (the dependent variable) in discrete time $t$.
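The hyperparameter grid listed above maps directly onto scikit-learn's `GridSearchCV` and `MLPRegressor`. The following is a minimal sketch, not the study's code: the synthetic data stand in for the BTS records, and the feature standardization, the `random_state` values, the `r2` scoring choice, and `max_iter=200` passed at this point are assumptions made for illustration.

```python
import warnings

import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 9 input features, 1 target.
rng = np.random.default_rng(0)
X = rng.normal(size=(90, 9))
y = X[:, 0] + 0.1 * X[:, 1] + rng.normal(scale=0.05, size=90)

# 70-30 train/test split, as in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=0)

# Standardizing inputs is an assumption; it is not described in the text.
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Grid taken from the text: 3 x 2 x 2 x 3 = 36 candidate configurations.
param_grid = {
    "hidden_layer_sizes": [(50,), (50, 50), (100,)],
    "activation": ["tanh", "relu"],
    "solver": ["sgd", "adam"],
    "alpha": [0.001, 0.05, 1e-10],   # L2 penalty
}

search = GridSearchCV(
    MLPRegressor(max_iter=200, random_state=0),
    param_grid,
    cv=3,            # k = 3 folds
    scoring="r2",
)

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence convergence warnings on toy data
    search.fit(X_train, y_train)

best_params = search.best_params_
test_r2 = search.score(X_test, y_test)  # R^2 of the refitted best model
```

With 36 candidates and 3 folds, the search fits 108 models plus one final refit, which is why an exhaustive grid stays practical only for small grids such as this one.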

Experiments and Results
As an experimental dataset, we used more than 760 thousand samples containing US flight records from the 1st of January to the 31st of December, 2018. As mentioned in Section 4, we applied the recommended 70-30 split for model training and testing. We used Scikit-learn and Python to implement our program. North American commercial airlines, such as American Airlines and United Airlines, define short haul flights as flights whose route length is shorter than 700 miles, long haul flights as those longer than 3000 miles, and medium haul flights as those in between [45]. We classified the flights into three categories, namely, short haul, medium haul, and long haul, depending on the route distance. In Figure 2, we plot the arrival delay against the flight category. We notice that the long haul category is empty in the figure. Since our data contain records for domestic flights only, the distance is shorter than 3000 miles because the flights were performed within the United States and not abroad. Hence, the flights in our data are either short or medium haul. From the results, we conclude that most of the flights delayed on arrival were classified as medium haul based on the distance. So, the longer the distance is, the higher the delay is (Figure 2). In 25% of the records, flights delayed on arrival were also delayed on departure. Furthermore, in 40% of the records, flights that were not delayed on arrival were not delayed on departure either, which supports the conclusion that arrival and departure delays are highly correlated. We plot the high correlation between the two features in Figure 3.
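The haul classification and the correlation check described above can be sketched as follows. The 700-mile and 3000-mile thresholds come from the text; the function names and the toy delay values are hypothetical, and the Pearson coefficient is used here as one common way to quantify correlation (the text does not state which measure underlies Figure 3).

```python
from math import sqrt

SHORT_HAUL_MAX_MILES = 700    # shorter than this: short haul
LONG_HAUL_MIN_MILES = 3000    # longer than this: long haul


def haul_category(distance_miles):
    """Classify a flight by route distance, following the thresholds in the text."""
    if distance_miles < SHORT_HAUL_MAX_MILES:
        return "short haul"
    if distance_miles > LONG_HAUL_MIN_MILES:
        return "long haul"
    return "medium haul"


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical departure/arrival delays in minutes (illustration only).
dep_delay = [0, 10, 20, 30, 40]
arr_delay = [5, 12, 28, 33, 47]
r = pearson(dep_delay, arr_delay)
```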
Two approaches have been adopted in our research. The best configuration of the ANN for both the selective training and the traditional procedure was the one with 2 hidden layers of 50 nodes each (50, 50), ReLU as the activation function, Adam as the solver, and an L2 penalty (alpha) equal to 1e-10. With this optimized configuration, a multilayer perceptron network of 4 layers was trained in our study. The final layer is the output layer, with a single neuron giving the arrival delay. The final architecture of our MLP is shown in Figure 4. The classic evaluation metric, the coefficient of determination R², which measures the deviation between the regression line and the observed points [46], was applied to measure the performance of both the traditional and selective procedures. In Table 2, we compare the accuracy of our model with that of other studies from the literature review. Our model with selective training achieved the best score of 95.60%, followed by paper [18], in which a mean correct predictive capacity of 91.3% was obtained. Among the several machine learning algorithms applied in [16, 20], the best model was the MLP, with an accuracy of 89.07% and 85%, respectively.
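For reference, the coefficient of determination is conventionally defined as below. The symbols are introduced here only for illustration and do not appear in the original text: $y_j$ denotes an observed arrival delay, $\hat{y}_j$ the corresponding predicted delay, $\bar{y}$ the mean observed delay, and $N$ the number of test samples.

```latex
R^2 = 1 - \frac{\sum_{j=1}^{N} \left(y_j - \hat{y}_j\right)^2}
               {\sum_{j=1}^{N} \left(y_j - \bar{y}\right)^2}
```

Under this definition, a score of 0.9560 means that roughly 95.6% of the variance of the observed delays is accounted for by the predictions, and a score of 1 would correspond to perfect predictions.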
In order to verify the reliability of the proposed method, we compare the performance of the MLP with that of other benchmark methods, namely, gradient boosting (GB) and decision trees (DT). The gradient boosting model benefits from being popular in many research areas and from its ability to handle overfitting [47]. Decision tree models are quick to build and easy to interpret and understand. The predictions based on decision trees are efficient [48]. Mean absolute error (MAE), root mean squared error (RMSE), and median absolute error (MdAE) were used as evaluation metrics. MAE measures how close forecasts or predictions are to the eventual outcome [9]. RMSE is the square root of the mean of the squared errors, while MdAE is the median of all absolute differences between the target and the prediction [49]. Table 3 shows the benchmark performance of the proposed MLP model against gradient boosting and decision trees. The results show that the MLP outperformed the other algorithms in both the training and testing processes. The evaluation metrics were quite similar for the training and testing steps, which indicates that our model fits well. To examine the accuracy of the predictive model, we plot the high correlation between actual and predicted delay values in Figure 5.
From the results and evaluations above, the following can be deduced:
(i) Arrival and departure delays are strongly correlated
(ii) The distance and length of the flight contribute to traffic delays
(iii) Tuning the hyperparameters with the grid search technique helped us choose the right architecture of the network
(iv) The ANN-based MLP gave a high arrival delay prediction performance of 90.48%
(v) The selective-data training with the MLP brought an increase in efficiency and a better performance of 95.60%
(vi) The proposed model outperformed all existing systems in Table 2
(vii) The MLP achieved a better performance than gradient boosting and decision trees
(viii) The evaluation metrics for testing are similar to those for training, which suggests that the model generalizes well
(ix) The high correlation between actual and predicted values indicates that our model fits well
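The three error metrics defined above are straightforward to compute. The following sketch uses only the Python standard library; the function names and the sample values are hypothetical and are not results from the study.

```python
from math import sqrt
from statistics import median


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mdae(actual, predicted):
    """Median absolute error."""
    return median(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical arrival delays in minutes (illustration only).
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 40.0]

errors = {
    "MAE": mae(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "MdAE": mdae(actual, predicted),
}
```

Because RMSE squares the errors before averaging, it penalizes large misses more heavily than MAE, while MdAE is insensitive to a few extreme errors; reporting all three gives a fuller picture of the error distribution.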

Computational Complexity
Time complexity is the measure of how fast or slow an algorithm will perform depending on the input size. In order to compute the complexity of our model, the famous big O notation was applied and compared for both traditional and selective trainings.

Feedforward Pass.
Following the ANN process of multiplication, summation, and activation, going from layer m to layer n we have
$$S_n = W_{nm} X_m$$
where $S_n$ is the weighted sum of the weights $W_{nm}$ and the inputs $X_m$. Applying the activation function gives
$$Y_n = F(S_n)$$
where $Y_n$ is the output obtained after the activation function $F$ is applied to the computed value $S_n$. This process runs N-1 times for a network with N layers (input and output layers included). For the case of 4 layers, we need 3 matrices to represent the weights: $W_{nm}$, $W_{on}$, and $W_{lo}$.
$W_{nm}$ is a matrix with n rows and m columns containing the weights from layer m to layer n.
For t training examples, we have
$$S_{nt} = W_{nm} X_{mt}$$
The time complexity of this operation is O(n * m * t). The activation function is then applied element-wise,
$$Y_{nt} = F(S_{nt})$$
with time complexity O(n * t). In total, we have the following complexity: O(n * m * t + n * t) = O(n * t * (m + 1)) = O(n * m * t).
Following the same logic, the same process applies when going from n to o, with O(o * n * t), and from o to l, with O(l * o * t). So, in total, the time complexity of the feedforward propagation process is O(n * m * t + o * n * t + l * o * t), which is bounded above by O(t * m * n * o * l).
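To make the operation count of the feedforward pass concrete, the sketch below runs one sample through a network with layer sizes 9, 50, 50, and 1 and counts the multiplications. It is a plain-Python illustration under assumptions: the random weights, the ReLU hidden activation with an identity output, and all function names are hypothetical.

```python
import random


def relu(values):
    """ReLU activation applied element-wise."""
    return [max(0.0, v) for v in values]


def init_layers(sizes, seed=0):
    """Create one weight matrix (n rows, m columns) and bias vector per layer pair."""
    rng = random.Random(seed)
    weights = [
        [[rng.uniform(-0.1, 0.1) for _ in range(m)] for _ in range(n)]
        for m, n in zip(sizes, sizes[1:])
    ]
    biases = [[0.0] * n for n in sizes[1:]]
    return weights, biases


def forward(x, weights, biases):
    """Return (output, number of multiplications) for a single sample."""
    mults = 0
    activation = x
    last = len(weights) - 1
    for idx, (W, b) in enumerate(zip(weights, biases)):
        # Weighted sum: one multiplication per (row, input) pair, i.e. n * m.
        s = [sum(w * a for w, a in zip(row, activation)) + b_j
             for row, b_j in zip(W, b)]
        mults += len(W) * len(activation)
        # Identity on the output layer (assumed, as is usual for regression).
        activation = s if idx == last else relu(s)
    return activation, mults


sizes = [9, 50, 50, 1]            # m, n, o, l
weights, biases = init_layers(sizes)
output, mults_per_sample = forward([1.0] * 9, weights, biases)
```

For these sizes the count per sample is 9 * 50 + 50 * 50 + 50 * 1 = 3000 multiplications, so t samples cost 3000 * t, in line with the sum n * m * t + o * n * t + l * o * t.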

Backpropagation Pass.
Following the same reasoning, the time complexity of the backpropagation pass is the same as that of the feedforward pass.
For one epoch, it is O(t * m * n * o * l).
Multiplying by the number of iterations i (epochs) gives O(i * t * m * n * o * l). Another form of the neural network time complexity, given in the scikit-learn documentation on supervised neural network models (MLP regressor), yields the same result and can also be used: O(i * t * m * h^k * l).
Application.
We consider t as the number of training samples, m as the number of features (input layer neurons), h as the number of nodes per hidden layer, k as the number of hidden layers, and l as the number of output layer neurons.
(1) Traditional Training. In this case, we have t = 579880, m = 9, h = 50, k = 2, and l = 1. So, the computational complexity of the traditional procedure is O(i * 1304730 * 10^4), where the maximum number of iterations (epochs) i is equal to 200.
(2) Selective Training. In this case, we have t = 127340, m = 9, h = 50, k = 2, and l = 1. So, the computational complexity of the selective training is O(i * 286515 * 10^4), where the maximum number of iterations (epochs) i is equal to 200.
We notice that the computational complexity of the selective training is considerably lower than that of the traditional procedure, since it scales with the number of training samples t.
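The two figures above follow from plugging the stated values into t * m * h^k * l. The short check below reproduces the arithmetic; the function name `mlp_cost` is a hypothetical helper, and the result is an operation-count estimate per epoch, not a measured running time.

```python
def mlp_cost(t, m, h, k, l):
    """Per-epoch cost term t * m * h^k * l from the complexity expression."""
    return t * m * h**k * l


traditional = mlp_cost(t=579880, m=9, h=50, k=2, l=1)
selective = mlp_cost(t=127340, m=9, h=50, k=2, l=1)

# Both procedures share m, h, k, and l, so the ratio reduces to the
# ratio of the training-set sizes.
ratio = traditional / selective
```

Under these values the selective procedure needs roughly 4.5 times fewer operations per epoch, entirely because it trains on fewer samples.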

Conclusion
Reducing flight delays has been a major concern for airlines, airports, passengers, and aviation stakeholders in general. However, minimizing delay time is not an easy task. Hence, traffic delay prediction is useful. Several researchers have tried to develop new models in order to increase the precision and accuracy of flight delay prediction. In this study, we proposed an artificial neural network (ANN) model based on supervised learning. After tuning the hyperparameters of the network with the grid search technique, a multilayer perceptron (MLP) with an input layer, two hidden layers of 50 nodes each (50, 50), and an output layer was built. Departure delay and flight distance proved to contribute strongly to flight delays. The experimental investigation showed that the highest R² score of 0.9560 was obtained when the selective-data training was applied.
By comparison, the traditional training procedure reached an R² score of only 0.9048. The time complexity of the two methods was computed using big O notation and then compared. In order to assess the reliability and efficacy of the proposed model, MLP results were compared with those of gradient boosting and decision trees.
The MLP regressor produced better predictions in both the training and testing processes and deserves more attention in future studies.
Our model addresses flight arrival delays, but it can also be applied to study flight departure delays. Our proposed architecture can also be used to predict arrival or departure delays of trains, high-speed trains (TGV), metros, buses, or ships. In this case, the airline can be replaced by the transport company, the airport by the station or the port, and the distance of the flight by the distance of the trip.
This work has some limitations that can be a subject for further research. Only domestic US flight information was used to predict arrival delay. International flight records were not used due to a lack of reliable relevant data. Another limitation is that meteorological data were not considered; including them may be a challenging direction for future research.
In the future, we will consider flight delay prediction using real-time flight data. Furthermore, a more complex but better-performing deep learning model would be very interesting. Finally, the remaining gaps in this research will be addressed.

Data Availability
The data used to support this study can be found at the Bureau of Transportation Statistics, United States Department of Transportation, http://www.transtats.bts.gov/.