LiDAR-Based Windshear Detection via Statistical Features

,


Introduction
Windshear refers to a sudden and sustained change of the headwind/tailwind encountered by aircrafts in direction and/or speed [1,2]. A signifcant windshear occurring might change the lift force that the aircraft would experience and subsequently make the aircraft fy below or above the intended fght path. Tis might cause difculties for the pilot to control the aircraft safely. Accurate windshear alerts can help pilots take timely and appropriate corrective actions to ensure the safety of the aircraft. Hence, windshear detection for airports are crucial for efcient and safe air trafc control (ATC) [3].
Te occurrence of windshear strongly corresponds to convection, frontal systems, thermal instabilities, and microbursts [3]. Generally, windshear could occur in both rainy days and non-rainy days for diferent reasons [4]. In rainy weather conditions, microbursts and gust fronts associated with severe convection can result in windshear, which could be detected by anemometers and rain-detecting Doppler weather radar. In non-rainy weather conditions, windshear occurs mainly due to terrain efects, sea breezes, low-level jets, dry microbursts, etc. However, since there are few hydrometeors as refectors for microwaves in clear air [2], it is difcult to obtain adequate wind velocity data by Doppler weather radar on non-rainy days. In addition, anemometers can only detect windshear within the lowest couple of hundred meters [4,5]. Terefore, it is inefective to capture dry windshear events at higher altitudes only by anemometers and Doppler weather radar during non-rainy weather conditions. Although wind proflers can be used as complementary tools to detect windshear to some extent, they are unstable for windshear detection along glide and takeof paths [4].
With the development of Light Detection and Ranging (LiDAR) techniques [6,7], a number of LiDAR-based windshear detection methods have been proposed, which are efective for dry windshear detection. Most of current LiDAR windshear detection algorithms are based upon some indicators corresponding to the fight condition and mathematical statistics of wind velocity. In reference [1], Chan et al. introduced the operational LiDAR-based windshear detection system for the Hong Kong International Airport (HKIA) in which a glide path scan method is proposed to measure the headwind along the individual glide paths and a wind ramp detection method is used to detect windshear. Wind ramp refers to the change of wind speed along a certain distance (ramp length) [8,9]. In practice, windshear ramps with diferent detected ramp lengths (400, 800, . . . , 6400 m) will be detected. Some of them will be picked out by ramp prioritization based on a severity factor that corresponds to the ratio between wind velocity change and the inverse cube root of the related ramp length. If any one of chosen windshear ramps reaches or exceeds 14 knots (the predetermined operational threshold at HKIA), windshear alert will be generated. However, this method neglects the internal features of the ramp waveform, which may refect the true feelings of the pilot. To address this issue, Li et al. [7] proposed a novel ramp algorithm based on a weighted smoothing approach and a secondary windshear recognition scheme, which can detect the fat areas for windshear ramp. In reference [10], Hon et al. introduced a "gentle ramp" removal method for the automatic windshear detection algorithm at HKIA in which the root-mean-square (RMS) of the original headwind profle from its running mean will be additionally checked for a selected windshear ramp whose intensity reaches or exceeds 14 knots and ramp length is greater than 3000 m (about 39 s for aircraft fight). Based on the consultation by aviation users, if the RMS measure of velocity fuctuations falls below 1.2 knots, the windshear alert would be withheld. Due to the transient and sporadic nature of terrain-disrupted airfow disturbances, it is difcult for the pilots to clearly diferentiate between windshear and turbulence at times. In such cases, wind ramp-based methods may not work and it is advantageous to consider the gradient of headwind. F-factor is such an indicator whose frst term is corresponding to the headwind gradient and second term is about the vertical acceleration term. Chan et al. investigated the performance of LiDAR-based F-factor for windshear detection as well as its statistics in reference [11]. However, they did not study the aircraft response to the headwind change and the vertical acceleration term of F-factor. In reference [12], Chan took the aircraft response into account by preprocessing the headwind profle using commercial fight simulator software and established a new threshold (−0.05) of F-factor that is less than the conventional threshold of F-factor for windshear detection (−0.105) based on the dataset collected at HKIA. In reference [13], the corresponding research is further in-depth. Te center-averaging postprocessed F-factor was investigated besides the consideration of fight simulator-based F-factor. By selecting an optimal alerting threshold specifc for each runway corridor, the successful alerting rate of F-factor could reach 86%. In addition to the wind ramp and F-factor-associated windshear detection methods, there are some other methods to detect windshear. For instance, Wu and Hon [14] applied the Fourier transformation to decompose the headwind profle and proposed to detect windshear by threshold based single/ multiple channel method. A regional divergence algorithm based on LiDAR data was proposed by Li et al. in reference [15] for windshear detection at Lanzhou Zhongchuan International Airport which mainly concentrated on the regional wind change size along the runway.
In the past few decades, due to the development of machine learning, several machine learning methods for windshear detection by LiDAR data are proposed. For instance, Ma et al. [16] extracted the windshear features from the partial LiDAR scanning image by an invariant moment method and gray-gradient co-occurrence matrix. Ten, the support vector machine (SVM) method, which can generate a line or hyperplane to separate data into two classes based on the input features, is used to validate the efectiveness of the extracted features. However, the accuracy of this method is too low (lower than 50%) to use in practice. In reference [17], Huang et al. proposed a statistical indicator based on the headwind profles which measures the maximal diference in wind velocities along the range of the measurement LiDAR beam for diferent azimuth ranges. By employing a one-side normal distribution based decision rule, the threshold of this indicator is determined for windshear and non-windshear distinguishing. However, this method only considers the wind profle obtained at the nearest timestamp to the reported time spot, which may miss some windshearrelated data.
In this paper, we will introduce two statistical indicators of windshear from LiDAR observational data. One is derived from the observed wind velocity data, and the other is constructed by the texture feature of the LiDAR plan position illustrate (PPI) scan images. Based upon the idea of multiple instance learning [18], we can construct several feature vectors based on the proposed indicators. Tis research is valuable in a number of aspects. First, the proposed indicators are based on the LiDAR data obtained from PPI scan whose requirements for elevation and azimuth angle change are lower than the frequently used glide path scan. Second, wind profles collected in 4 minutes near the reported timestamp are considered, which could reduce the possibility of windshear features loss. Tird, the application of the image texture extraction method can improve the identifcation efciency of sporadic windshear. Moreover, numerical experiments based on real data collected at HKIA validate the efectiveness of the proposed indicators.
Te outline of this paper is given as follows: in the second section, we will introduce the information of the LiDAR observational data we studied in this paper. Te construction of statistical indicators and related feature vectors are presented in Section 3. In Section 4, numerical results and discussion will be shown. Finally, we will give some concluding remarks in Section 5.

LiDAR Observational Data
Te Hong Kong International Airport (HKIA) is located in the northern part of Lantau Island, which is mountainous between 300 m and 900 m above sea level. Windshear occurs frequently since the terrain near the airport is quite complex. In 2002, the Hong Kong Observatory devised a Doppler LiDAR system to HKIA for timely windshear detection and alerting [5,19]. Te wind velocity data collected in PPI mode with elevation angles of 3°(landing) and 6°(takeof) are provided for windshear detection. 20-30 seconds are required to get one PPI scan. In Figure 1, we show two examples of the LiDAR radial velocity data of PPI scan, where the radius and the polar angle of the scan refer to the slant range and the azimuth angle of LiDAR beam, respectively. Tere are mainly two sectors that cover the airport approach/departure corridors including the touchdown zones, with the eastern sector (10°-150°) for tailwind of the aircraft and western sector (220°-340°) for headwind of the aircraft. Although the slant ranges of the LiDAR data are from 350 m to 10, 000 m, the LiDAR data points with slant ranges over 4950 m are neglected since there are lots of missing Doppler velocities caused by the complex terrain. Hence, in this paper, we focus on the data collected from 10°to 150°and 220°to 340°, respectively, with slant ranges less than 4950 m.

Proposed Method
In this section, we will introduce the proposed indicators of windshear and the corresponding feature vectors. First, we will give a brief review of the physical property of windshear and its characteristics on LiDAR PPI scans. Ten, we will introduce the proposed indicators and the corresponding feature vectors. Here, we use the LiDAR PPI data collected at HKIA as an example. It is interesting to note that the proposed methods can be widely applied to LiDAR PPI data obtained at any airport.

Properties of Windshear.
Windshear can be caused by a wide variety of meteorological phenomena, such as winds blowing across terrain, sea breeze, microburst, gust front, and low-level jet. Generally, most windshear cases are terrain-induced. When the wind blows over rough terrain, wind speed and direction might change in the lee side of high obstacle, which may lead to alternating high-speed and low-speed air streams. Aircraft traversing through the alternating high-speed and low-speed air streams may experience a large headwind gain/loss that afects the aircraft lift. In addition to alternative high and low wind speed, the cross mountain fow can also cause some more complicated localized fows such as vortices or jumps, which could bring signifcant windshear to the aircraft. Due to the complicated characteristics of terrain-induced airfow disturbances, the corresponding windshear is transient and sporadic. Tis property can be properly illustrated by the LiDAR PPI scan of wind velocity for small-scaled areas that have reversed fow embedded in the background wind. Please refer to reference [1] and Figure 1 for more details. In this paper, we mainly derive the indicators based on the sustained change property of wind velocity and the transient and sporadic properties showed in the PPI scan images. Figure 1 shows two LiDAR PPI scans of Doppler velocities obtained at HKIA. Tere are mainly two sectors in each LiDAR scan which can cover the area around the whole corridors including the touch down zones. Te sector with positive values denotes the Doppler velocities of wind blow away from the LiDAR which is actually the headwind encountered by aircraft, and the other sector denotes the Doppler velocities of tailwind. Here, we take both two sectors into account. To ensure the quality of velocity data, the data points obtained over a predetermined range value would be fltered out (4950 m for HKIA data). In references [20,21], 2D wind feld has been verifed for windshear detection. However, it is generally derived quantity based on some assumptions of the nature of the fow, which may not be totally accurate in reality. On the other hand, using radial velocity, it is the direct measurement of the atmosphere and must be accurate. Hence, we mainly concentrate on the features corresponding to the radial velocity data obtained by LiDAR PPI scan in this paper. In addition to wind velocity, coherent Doppler Li-DAR can retrieve multiple parameters, such as spectrum width, spectrum skewness, EDR, windshear intensity, and so on, which can also refect the features of windshear. Nonetheless, future study may include 2D wind feld and some other parameters retrieved by coherent Doppler Li-DAR, to test the robustness of the algorithm and see if there is any improvement in windshear feature detection.

Proposed Indicators.
First, we will introduce several mathematical notations used in this paper.
For one LiDAR PPI scan, let x i ∈ R n be the LiDAR data observed at azimuth angle θ i . For simplicity, we assume that there are m 1 azimuth angles (θ (1) to be recorded in the tailwind sector and m 2 azimuth angles (θ (2) 1 < θ (2) 2 < · · · < θ (2) m 2 ) to be recorded in the headwind sector, and also, there are n range values to be recorded along fxed azimuth angle, i.e., (1) Note that the locations of such range values are not necessary to be uniform, and the distance between the Li-DAR center and the observed value x i,j is equal to r j .

Indicator Based on Wind Velocity Data.
Since the main property of windshear is the change of wind speed and direction, we can evaluate the characteristics of windshear by the maximum variation of wind velocities along the LiDAR beam, which can refect the wind velocity change in the investigated area. For fxed azimuth angle θ (1) i (θ (2) i ), the maximum variation of headwind (tailwind) velocities can be evaluated by indicator max 1≤j≤n (x (1) i,j ) − min 1≤j≤n (x (1) i,j ) (max 1≤j≤n (x (2) i,j ) − min 1≤j≤n (x (2) i,j )). Te higher the value is, the more likely windshear will occur. Note that we just take the diference of the velocity values instead of the diference of velocity magnitudes so that the sudden change of wind direction can also be evaluated by this indicator. More specifcally, for the case where there are areas of reversed Advances in Meteorology fow embedded in the background wind, the sign of Doppler wind velocity values would be diferent, which could make the diference between velocity values be large. Here, we extract the top k maximum variation values of each sector to represent the change of wind velocity in the corresponding sector (the determination of k will be illustrated in Section . . , m 1 } denote the set of maximal velocity variation values obtained in the tailwind sector and V (2) � max 1≤j≤n (x (2) i,j ) − min 1≤j≤n (x (2) i,j ), i � 1, . . . , m 2 } denote the set of maximal velocity variation values obtained in the headwind sector, the associated indicator of each sector is calculated as follows: where V (1) i denotes the i-th maximum value in set V (1) and V (2) i denotes the i-th maximum value in set V (2) . Since the windshear property mainly based on the great change of wind velocity, to fnd the largest change of wind profle, we select the one with larger L 2 norm in between f (1) p and f (2) p as the indicator f p for the corresponding wind velocity data collected by one LiDAR PPI scan, i.e.,

Indicator Based on the LiDAR PPI Scan Image of Wind
Velocity. By the LiDAR PPI scan image of wind velocity (e.g., Figure 1), windshear can be detected visually based on the image texture. Generally, there would be more textures on the velocity images collected in windshear case. Hence, we propose to use the image texture extraction method to evaluate the property of windshear from the LiDAR PPI scan image of wind velocity. Gray level co-occurrence matrix (GLCM) [22] is a kind of image texture feature extraction method, which is widely used in remote sensing, biometric issues, and pattern recognition. It mainly describes the relative frequencies of pixel pairs with gray-tone i and j, respectively, which can be separated by distance d under a specifed angle occurring on the image. One can get several diferent co-occurrence matrices with diferent distances and angles. Each element p with position (i, j) in a co-occurrence matrix denotes the relative frequency where the pixel with grey level i is adjacent to a pixel with grey level j horizontally, vertically, or diagonally with a specifed distance and angle.
where I denotes the image, p1, p2 denote two pixels, and # denotes count function. Here, we set the distance to be one and angles to be 0°, 45°, 90°, 135°. Ten, we can get four diferent co-occurrence matrices. Tere are several texture features, which are calculated by GLCM, such as energy, contrast, dissimilarity, entropy, correlation, homogeneity, and so on.
Here, we calculate the contrast, dissimilarity, and correlation of the four co-occurrence matrices as the statistical indicator f im for LiDAR PPI scans, whose equations are given as follows: (ii) Dissimilarity: where μ i , μ j and σ i , σ j denote the mean values and standard deviation values, respectively.

Feature Vectors Based on the Proposed Indicators.
From the proposed two indicators, we can readily get three feature vectors for windshear detection: Since the time stamps reported by pilots are not accurate, one cannot fnd the LiDAR scan where windshear occurs precisely. It is better for us to consider the wind velocity data obtained by LiDAR scans within four minutes near the windshear reported time stamp, which is also the same for non-windshear cases. Inspired by the idea of multiple instance learning in which several instances are arranged in sets (bags) and a label is provided for the entire bag, we consider treating the LiDAR scans collected in one time period as a bag and constructing one feature vector for this bag. Specifcally, after extracting the feature vectors from the data collected by each single LiDAR scan in the corresponding period, we propose to select the one that can represent the most signifcant wind velocity variation (i.e., the one with the greatest L 2 -norm) as the feature vector of the whole time period (bag). F � f 1 , f 2 , . . . , f d be the set of feature vectors obtained in one time period, i.e., one bag. Te feature vector of this bag is the one with the maximum L 2 -norm: f � f j and ‖f j ‖ 2 ≥ ‖f i ‖ 2 , ∀i � 1, 2, . . . , d.

Numerical Results and Discussion
In this section, we will do several numerical experiments to evaluate the efectiveness of the proposed feature vectors, i.e., their separability for windshear and non-windshear distinction. For comparison, the corresponding numerical results of feature vector proposed by Huang et al. [17] are also shown, which is also for windshear detection. First, we will introduce the dataset that we used for numerical experiments. Ten, we will calculate the distances of the proposed feature vectors between the test windshear and non-windshear cases and then plot the histogram of these distances. Note that efective feature vectors that can be used for machine learning-based windshear detection methods should distinguish windshear cases and non-windshear cases properly by distance (dissimilarity), namely, the distance between features of same kind of cases should be small and the distance between the features of diferent kind of cases should be large. Finally, we apply several clustering methods to do data clustering based on the proposed feature vectors. Compared with the supervised classifcation methods, clustering methods only focus on the information given by the proposed feature vectors of the wind velocity data, which can adequately evaluate how good the proposed feature vectors are.

Dataset.
Te set of wind velocity data used in this study was collected at HKIA in the frst three months from 2017 to 2019. All of them are preprocessed by the quality control method currently using at HKIA [1] to flter out the noisy velocities. Pilot reports are referred as the ground truth of windshear occurrence. Te non-windshear cases are collected on days where there was no windshear. According to the pilot reports from 2017 to 2019, there were overall 369 windshear cases in the frst three months. Correspondingly, 369 non-windshear cases are randomly selected on the days with no windshear under the suggestion from the Hong Kong Observatory. For each selected time stamp, LiDAR observational Doppler velocity data collected in 4 minutes with an elevation angle of 3°are used for feature vector construction. Note that there would be 3-6 LiDAR PPI scans collected in most of investigated periods, which makes it adequate for us to capture the statistical indicators proposed in this paper.

Histogram of Distances.
In this part, we will show the histograms of distances between the proposed feature vectors. Tree frequently used distance assessments are considered here. Te equations of them are given as follows: (i) Euclidean distance: ‖u − v‖ 2 ; (ii) City block distance: ‖u − v‖ 1 ; (iii) Bray-Curtis dissimilarity: where u and v are vectors.
In each histogram, the corresponding distance values between arbitrary two feature vectors of the investigated cases are calculated and the frequencies of the corresponding distance values are shown by the heights of blue bar (y-label). Tere are three kinds of distance histograms, including the distance histogram of feature vectors between two windshear cases, the distance histogram of feature vectors between two non-windshear cases, and the distance histogram between one windshear case and one non-windshear case. For efective feature vectors, the frequencies of small distance values should be large in the histograms of the distances between feature vectors of two windshear/nonwindshear and be small in the histograms of distances between feature vectors of one windshear case and one non-windshear case.
In the construction of indicator f p , there is a variable k which indicates the number of top maximum variation indicators we take into account. Here, we plot several distance histograms of this indicator with k ∈ 3, 5, 8, 10, 12, 15, 20 { }. Here, due to limit space, we only show the histogram with k � 12. Te histograms for other cases are almost the same as it. Figures 2-4 show the histograms of the distances corresponding to the feature vectors f 1 � f p , f 2 � f im , and f 3 � [f p , f im ], respectively. Figure 5 shows the histogram of the distances corresponding to the feature vector proposed in Advances in Meteorology reference [17] f h . From these results, one can readily fnd that most of distance values between feature vectors of two windshear/non-windshear cases are small and most of distance values between feature vectors of one windshear case and one non-windshear case are large. For the feature vector proposed in reference [17], this phenomenon is relatively trivial except for the result of the Bray-Curtis distance. Terefore, the proposed feature vectors are valid for learning-based windshear detection.

Clustering Results.
To further evaluate the efectiveness of the proposed features, we apply four clustering methods to them, including the k-means clustering method, hierarchical clustering method, DBSCAN clustering method with the Euclidean distance, and DBSCAN clustering method with the city block distance. If the proposed features can be properly clustered into two categories where windshear cases or non-windshear cases are the majority, we can consider further using them in other learning-based  Advances in Meteorology windshear detection methods. Since we clearly know whether a feature vector corresponds to a windshear case or not, it is easy for us to calculate the accuracy of the clustering methods and evaluate the efects of the proposed feature vectors. First, we need to determine the best value of k for the physical property based indicator f p . Here, we apply the clustering methods to feature vector f 1 � f p obtained with k ∈ 3, 5, 8,10,12,15,20 { } and calculate the average accuracy. Figure 6 shows the average accuracy curve corresponding to diferent k. From Figure 6, we can fnd that the best average accuracy will be reached when k � 12. In the following section, we will set k � 12 for the indicator f p .  Table 4 for comparison. To further show the clustering results intuitively, the scatter plots of them are shown in Figures 7-10.
From these results, we can readily fnd the following: (i) the physical property-based feature vector f 1 has the best clustering results for most clustering methods except the k-means clustering method. (ii) Te clustering accuracy of image-based feature vector f 2 is greater than 91% for most clustering methods, but the hierarchical clustering method does not work well on this feature vector whose accuracy is only 74.12%. In

Conclusion
In this paper, we propose two statistical indicators of windshear from the LiDAR PPI scan observational wind velocity data for windshear features construction. Both indicators based on the physical properties of windshear and the image processing method are researched in this paper.
To evaluate the efectiveness of the proposed indicators, we construct 3 feature vectors based on them. Diferent from the feature vectors used in previous learning-based windshear detection methods, which only takes the wind profles obtained at the nearest time spot to the windshear reported timestamp, we consider the wind profles collected within 4 minutes of the reported timestamp. Based upon the idea of multiple instance learning, a feature vector of the scans in one bag (time period) is constructed. Te distance histograms for three frequently used distances are given to preliminarily validate the separability induced by the proposed feature vectors for windshear and non-windshear classifcation. Ten, we apply four clustering methods to further evaluate the efects of the proposed feature vectors by the accuracy of these methods and the corresponding scatter plots. Te corresponding numerical results of the feature vector proposed by Huang et al. are also shown in this paper for comparison. Numerical results show the following: (i) by the histograms of the distances between the test cases, we can readily fnd that most of distance values between feature vectors of two windshear/non-windshear cases are small and most of distance values between feature vectors of one windshear case and one non-windshear case are large, which indicate that one can distinguish windshear and nonwindshear properly by distances of the proposed feature vectors. (ii) Feature vector based on the physical property performs better than other feature vectors for most clustering methods. (iii) One can get pretty high clustering accuracy by using the feature vector extracted by the image processing method except the hierarchical clustering method. (iv) Te combined feature vector is quite stable to diferent clustering methods whose accuracy are all greater than 90%. (v) Our proposed feature vectors perform better than the feature vectors proposed by Huang et al. [17]. In the future, one can consider to use the proposed physical property based feature vector or the combined feature vector in other learning-based windshear and non-windshear classifcation to achieve better detection results. Moreover, we can also explore using some image processing methods

Data Availability
Te request for data used to support the fndings of this study could be addressed to the Hong Kong Observatory. Te data provision would be considered on a case-by-case basis.

Conflicts of Interest
Te authors declare that there are no conficts of interest regarding the publication of this paper.