A High-Precision Fatigue Detecting Method for Air Traffic Controllers Based on Revised Fractal Dimension Feature

As air traffic volume increases, air traffic controller (ATC) fatigue has become a major cause of air traffic accidents. However, conventional speech-based fatigue-detecting methods are neither effective nor accurate because speech signals are nonlinear and complicated. In this paper, an ATC fatigue-detecting method based on fractal dimension (FD) is proposed. Firstly, a special speech database of ATC radiotelephony communications is constructed. These radiotelephony communications were obtained from the Air Traffic Management Shandong Bureau of China. Then, the speech signals undergo wavelet decomposition and FD calculation. The calculation results show a significant difference between the FD of the speech signal before and after fatigue. Furthermore, a novel fatigue feature of the ATC based on the FD of speech is built. A series of experiments are conducted to detect ATC fatigue with the fatigue feature comparison process and a support vector machine (SVM). The results show that the accuracy in detecting ATC fatigue based on FD was 92.82%, which is higher than that of the state-of-the-art methods. The research provides theoretical guidance for the Air Traffic Management Authority on detecting ATC fatigue, and it may provide a reference for fatigue assessment in other professional fields of civil aviation.


Introduction
The rapid development of civil aviation has resulted in continuing increases in the volume of air traffic in China. Despite the rapid growth in the number of flights, the number of air traffic controllers (ATCs) has grown relatively slowly [1]. The associated increase in working pressure is making ATCs more vulnerable to fatigue. This situation has led to frequent air traffic accidents in recent years [2]. Methods for accurately detecting the fatigued state of ATCs have gradually attracted the attention of experts and scholars in the field of civil aviation. The fatigue-detecting methods for ATCs can be divided into subjective and objective methods [3]. A popular subjective method is the fatigue scale. The fatigued state is detected according to the score on the scale [4]. For example, Chalder et al. reported the Chalder fatigue scale [5]. Subjects were asked to fill out the scale before and after work. Although this method is easy to implement, it cannot detect the fatigued state either rapidly or accurately. Therefore, objective methods have received a considerable amount of research interest. A popular objective method uses instruments, equipment, and other auxiliary tools to determine the fatigued state. This kind of method records changes in certain indicators of human physiology, biochemistry, behavior, or other characteristics. Objective methods can also be subdivided into contact and noncontact detection methods. This classification is based on whether the detection tool needs to be in contact with the human body when attempting to detect fatigue. Contact detection methods mainly detect the fatigued state by recording changes in physiological indicators of the tested person (such as the EEG, ECG, or heart rate) [6-8]. Some European researchers have proposed obtaining the fatigued state of ATCs by analysing five physiological indices. These indices are skin voltage, skin conductivity, skin blood flow, body temperature, and instantaneous heart rhythm [9].
Chen et al. reported the subjective symptoms and physiological measures of fatigue in air traffic controllers [3]. This study was carried out on 102 ATCs in Taiwan. The tests collected physiological information on flicker fusion threshold, thumb/index finger strength, and systolic and diastolic pressure. The detection process requires physical contact with the tested person. Although this detection method is highly accurate, its applicability is poor. Physical contact potentially interferes with the work the ATCs are performing.
That also makes it difficult to implement. Noncontact detection methods mainly employ facial- or vocal-feature-based detection. The facial-feature-based detection method involves collecting images of the face. These images are used to detect the fatigued state based on facial features (such as movements of the eyes and mouth) [10,11]. Nie et al. found significant differences in several indicators before and after the experimental subjects became fatigued [12]. These indicators include eye closure time, percentage of eyelid closure (PERCLOS) value, blink frequency, and number of blinks. Fatigue is evident when the eye closure time is 3.5 s/min, the PERCLOS value is 6%, and the blink frequency is 0.4 times/s. Di Stasi et al. improved the accuracy of fatigue test results by studying how the characteristics of eye movement reflect a fatigued state [13]. Here, 12 subjects were asked to perform a prolonged and demanding visual search task under fixation conditions. A major disadvantage of this method is that the image acquisition equipment needs to remain in front of the subject's face while collecting facial images. This situation induces a certain psychological pressure. In contrast, the vocal-feature-based detection method is currently receiving a considerable amount of research interest due to its high accuracy and applicability [14]. Krajewski et al. introduced a general framework for detecting accident-prone fatigue states based on prosody, articulation, and speech-quality-related speech characteristics [15]. The advantage of this measurement method is that the speech data used for detecting fatigue can be easily obtained without needing sensors or calibration. Krajewski et al. subsequently proposed a framework for detecting a fatigued pattern that combines nonlinear dynamics analysis of speech with a machine-learning classification algorithm [16]. The utilization of speech nonlinearity greatly improves the accuracy of fatigue detection. Deng et al.
proposed an internal model-based neural network control for unknown nonaffine discrete-time multi-input multi-output (MIMO) processes in the nonlinear state space form under model mismatch and disturbances [17]. Palo et al. proposed a time-frequency source feature for emotional speech classification [18]. The feature spans four dimensions. This method improves the detection speed while maintaining the detection accuracy. The remainder of this paper is organized as follows. In Section 2, a time-frequency vocal source feature is introduced.
Then, the relationships between radiotelephony communication and the chaos in a speech signal are reported. In Section 3, a revised feature called the speech wavelet fractal feature (SWFF) is proposed. The process of constructing this feature is described in detail. Then, a novel fatigue-detection method for ATCs based on SWFF is proposed. In Section 4, a series of experiments are conducted. In Section 5, conclusions and future research directions are presented.

Radiotelephony Communication.
The radiotelephony communication recorded by an ATC-speech recording system in each time period contains all the voices of ATCs and aircrews. Therefore, it is first necessary to extract the speech data of the ATCs. According to the characteristics of radiotelephony communication, semantic recognition is used to achieve this goal. Radiotelephony communication refers to a method of issuing and executing instructions between ATCs and aircrews. This communication has the following characteristics. When the ATC talks to an aircrew, they hold down the button on the headset cable. They release the button after the end of the call. When the aircrew first establishes contact with the ATC, the communication structure adopted by the ATC is aircraft call sign + control-unit code + content, while that of the aircrew is control-unit code + aircraft call sign + content. After the first call, the call structure of the ATC is aircraft call sign + content and that of the aircrew is content + aircraft call sign.
This reveals that the ATC always states the aircraft call sign first when replying to the aircrew. The aircrew uses the aircraft call sign as the closing remark at all times except when first establishing contact with the ATC. Therefore, the aircraft call sign marks the beginning of a speech transmission from the ATC, and the aircraft call sign in the received transmission is used as the sign that the call has ended. These features make it relatively easy to distinguish the voice of ATCs promptly. Audio editing software (GoldWave) was used to extract the voice of ATCs from the radiotelephony communication at different times. In addition, fatigue is more likely to lead ATCs to pause frequently, hesitate, or even issue incorrect instructions when they give control instructions. This is also consistent with the known physiological mechanisms of fatigue. Therefore, in this paper, an ATC is regarded as fatigued when such problems appear in the instructions.
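The call structures described above suggest a simple rule for telling the two speakers apart once a transmission has been transcribed. The sketch below is only an illustration under stated assumptions: it works on text rather than audio, and the function name `speaker_role` as well as the call sign and unit code in the usage lines are hypothetical placeholders, not values from the database.

```python
def speaker_role(transcript, call_sign, unit_code):
    """Guess the speaker from where the call sign sits in a transcribed call."""
    text = " ".join(transcript.lower().split())  # normalise case and spacing
    call_sign = call_sign.lower()
    unit_code = unit_code.lower()
    if text.startswith(call_sign):
        # ATC: call sign (+ control-unit code on first contact) + content
        return "ATC"
    if text.startswith(unit_code) or text.endswith(call_sign):
        # Aircrew: unit code + call sign + content on first contact,
        # content + call sign afterwards
        return "aircrew"
    return "unknown"


# Hypothetical transmissions used only to exercise the rule.
first_atc = speaker_role("ABC123 Tower descend to 3000", "ABC123", "Tower")
first_crew = speaker_role("Tower ABC123 request descent", "ABC123", "Tower")
later_crew = speaker_role("descending to 3000 ABC123", "ABC123", "Tower")
```

A rule of this kind only mirrors the phraseology; real recordings would still need a transcription step and handling of non-standard calls.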

Hurst Vocal Source Feature.
Vocal source features extracted from speech signals contain important information about the distribution of harmonics [19]. In addition, the characteristics of the excitation source affect the spectrum envelope of short-term speech. Because different sounds have different characteristics, vocal features have previously been studied for automatic detection of emotion in speech [20].
The Hurst Vocal Source Feature (HVSF) introduced in this paper is a time-frequency feature used in speaker recognition and verification systems. This feature consists of a vector containing the Hurst index. It is closely related to the excitation source. The Hurst index $H$ ($0 < H < 1$) indicates the time correlation or scaling of the speech signal $\{X(i)\}_{i=1}^{N}$. Its autocorrelation coefficient function (ACF) $\rho(\ell)$ at lag $\ell$ decreases gradually in the following form:

$$\rho(\ell) \sim H(2H-1)\,\ell^{2H-2}, \quad \ell \to \infty. \tag{1}$$

The value of $H$ can be associated with the spectral characteristics of $\{X(i)\}_{i=1}^{N}$. The proposed HVSF can represent the emotional state of the person speaking. In the process of extracting the HVSF, time-frequency multiresolution analysis captures the high-order correlation of speech samples. This correlation is also found when source features are extracted from linear prediction residuals. Therefore, the HVSF is closely related to the characteristics of the excitation source. This relationship can be utilized for recognizing emotions [21]. The process of HVSF extraction based on wavelet analysis [18] is as follows:
Step 1: speech signals are decomposed into approximation coefficients $a(j, k)$ and detail coefficients $d(j, k)$ using the discrete wavelet transform. Here, $j$ is the decomposition scale ($j = 1, 2, \ldots, J$) and $k$ is the coefficient index at each scale.
Step 2: for each scale $j$, the variance $\sigma_j^2 = (1/n_j)\sum_k d(j, k)^2$ is derived from the detail coefficients, where $n_j$ is the number of coefficients available at scale $j$.
Step 3: weighted linear regression is used to obtain the slope $\alpha$ of the plot of $Y_j = \log_2(\sigma_j^2)$ versus $j$. The value of $H$ is obtained as $H = (1 + \alpha)/2$.

The speech signals here were randomly selected from the speech instructions of ATCs. The fatigued and normal states are distinguished by whether the ATCs can continue to perform their duties normally (i.e., whether the speech given by the ATC has the problems mentioned in Section 2.1). The subjective feelings of the ATC and the Chalder fatigue scale are also considered. In this work, the Daubechies wavelet filter was used for the discrete wavelet transform. It can be seen that there is no obvious difference in the distribution of the $H$ index between the fatigued and normal states. This problem is due to three limitations of applying the HVSF to detect fatigue in ATCs. First, the $H$ index cannot adequately indicate the changes in chaos of radiotelephony communication. Second, unlike the speech data in the Berlin Database of Emotional Speech (EMO database) or other databases of emotional voices, the fatigue detection of ATCs is based on the speech characteristics of different voices (different speech contents). Radiotelephony communication itself is also distinctive, as discussed above. The speech data in the EMO database and similar databases are derived from speech produced by the same person in different emotional states with the same semantic content.
This is obviously not practical for the present application. Third, the speech data of ATCs are contaminated by noise from the equipment and environment associated with the data recording process. A revised vocal feature for detecting ATC fatigue is proposed as follows.
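The three extraction steps for the Hurst index can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: it assumes the Haar (db1) filter as the Daubechies wavelet, weights the regression by the number of coefficients $n_j$ at each scale, and uses hypothetical function names. Uncorrelated noise has roughly the same detail variance at every scale, so its estimate should come out near $H = 0.5$.

```python
import math
import random


def haar_details(signal, levels):
    """Detail coefficients d(j, k), j = 1..levels, from a Haar (db1) DWT."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        half = len(approx) // 2
        pairs = [(approx[2 * k], approx[2 * k + 1]) for k in range(half)]
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return details


def hurst_from_wavelet(signal, levels=4):
    """Estimate H from the slope of log2(detail variance) against scale j."""
    scales, log_vars, weights = [], [], []
    for j, d in enumerate(haar_details(signal, levels), start=1):
        variance = sum(c * c for c in d) / len(d)  # Step 2
        scales.append(j)
        log_vars.append(math.log2(variance))
        weights.append(len(d))  # assumed regression weight: n_j
    w_total = sum(weights)
    j_bar = sum(w * j for w, j in zip(weights, scales)) / w_total
    y_bar = sum(w * y for w, y in zip(weights, log_vars)) / w_total
    num = sum(w * (j - j_bar) * (y - y_bar)
              for w, j, y in zip(weights, scales, log_vars))
    den = sum(w * (j - j_bar) ** 2 for w, j in zip(weights, scales))
    alpha = num / den  # Step 3: weighted least-squares slope
    return (1 + alpha) / 2


random.seed(0)
white_noise = [random.gauss(0.0, 1.0) for _ in range(4096)]
h_noise = hurst_from_wavelet(white_noise)
```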

Chaos in a Speech Signal.
The Hurst index $H$ can be used as an index to judge whether time-series data follow a random walk or a biased random walk [22]. Because this index cannot adequately indicate the changes in chaos of radiotelephony communication, this paper examines from a different angle how the chaos in a speech signal changes in the presence of fatigue. Based on Takens' embedding theorem, the chaotic nonlinear dynamic model of speech signals is reconstructed using a delay phase diagram method [23]. The reconstruction is carried out in the phase space of the discrete time series of a speech signal. The model describes the phase-space topology of the strange attractor of speech [24]. When reconstructing the speech sequence $\{X(i)\}_{i=1}^{N}$ sampled in discrete time, the vector point set $P(i)$ in $m$-dimensional space with delay $\tau$ is obtained:

$$P(i) = \{X(i), X(i+\tau), \ldots, X(i+(m-1)\tau)\}. \tag{2}$$

The velocity of the airflow during speech decreases when a person is in a state of fatigue. The friction and viscous force of the airflow increase due to the softening and cooling of the vocal-tract wall. This physiological change reduces the energy of the airflow turbulence in the vocal-tract boundary layer [16]. Turbulence forms the basis of a chaotic speech signal. Any change in the turbulence directly affects the chaotic characteristics of a speech signal. Figures 3-6 depict the phase-space trajectories of different speech states when the same person utters the same speech. The right and left subgraphs of all four figures show the phase-space trajectories of speech signals produced in the normal and fatigued states, respectively. The four utterances (i.e., speed, height, 182, and 134375) are words and numbers that are highly representative of the content of radiotelephony communication. The figures clearly indicate that the degree of speech fluctuation is significantly lower in a fatigued state than in a nonfatigued state.
Therefore, the changes of chaos in speech can be quantitatively evaluated by using a reconstruction model of speech signals.
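The delay reconstruction used in this section can be illustrated with a short sketch. It assumes the usual delay-coordinate construction, in which each point collects $m$ samples spaced $\tau$ apart; the function name `delay_embed` and the toy sequence are hypothetical.

```python
def delay_embed(x, m, tau):
    """Return the delay vectors (x[i], x[i + tau], ..., x[i + (m - 1) * tau])."""
    n_points = len(x) - (m - 1) * tau
    if n_points <= 0:
        raise ValueError("sequence too short for this m and tau")
    return [tuple(x[i + d * tau] for d in range(m)) for i in range(n_points)]


# Toy sequence: with m = 3 and tau = 2 only two complete vectors fit.
points = delay_embed([0, 1, 2, 3, 4, 5], m=3, tau=2)
```

Plotting such points for $m = 2$ or $m = 3$ gives phase-space trajectories of the kind compared in Figures 3-6.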

Fractal Dimension and Fatigue.
The chaos in a speech signal is related to fractal theory. The trajectory change of a nonlinear system in the process of chaotic evolution has some universality. Aerodynamics shows that the generation of a speech signal is a nonlinear process. Furthermore, the production of a speech sound (in particular, breath sounds such as fricatives and plosives) involves the generation of eddies in the boundary layer of the vocal tract that eventually turn into turbulence, which has been shown to be a kind of chaos. This qualitative result forms the basis of applying chaos and fractal theory to the analysis of speech signals. A fractal is a complex system whose complexity can be described by a noninteger dimension called the fractal dimension (FD).
The fractal dimension is the main parameter used to describe a fractal. The FD indicates the complexity of a fractal set. It is not a simple extension of the Euclidean dimension but instead has many new connotations [25]. Generally, in Euclidean geometry, a line or curve is one dimensional, a plane or the surface of a sphere is two dimensional, and a geometry with length, width, and height is three dimensional. However, complex fractals (such as a coastline, the Koch curve, and the Sierpinski sponge) cannot be described by an integer dimension. The FD goes beyond the integer dimension of a general topological set. The importance of the FD is that it can be defined from data and calculated approximately by experiment. It is related to $H$ as follows [26]:

$$D = 2 - H, \tag{3}$$

where $D$ is the FD. On this basis, a revised fatigue feature of the ATC is proposed.
The formula for calculating the fractal dimension is as follows:

$$D = \lim_{\varepsilon \to 0} \frac{\ln N(\varepsilon)}{\ln(1/\varepsilon)}, \tag{4}$$

where $\varepsilon$ is the side length of a small cube and $N(\varepsilon)$ is the number of small cubes needed to cover the measured geometry. The formula determines the fractal dimension by covering the measured geometry with small cubes. For a random fractal, different approximate methods can be used to calculate or measure the dimension.
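The covering definition can be tried out numerically. The sketch below is an illustration under stated assumptions: it works in two dimensions, so the covering elements are squares rather than cubes, and it replaces the limit by a least-squares slope over a handful of box sizes. The function names are hypothetical. A straight segment is used as the test shape because its dimension is known to be 1.

```python
import math


def box_count(points, eps):
    """Number of eps-sized boxes that contain at least one point."""
    return len({(math.floor(x / eps), math.floor(y / eps)) for x, y in points})


def box_counting_dimension(points, eps_values):
    """Least-squares slope of ln N(eps) against ln(1 / eps)."""
    xs = [math.log(1.0 / eps) for eps in eps_values]
    ys = [math.log(box_count(points, eps)) for eps in eps_values]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


# Densely sampled diagonal of the unit square, boxes of side 1/2 ... 1/64.
diagonal = [(i / 4096, i / 4096) for i in range(4096)]
dim_line = box_counting_dimension(diagonal, [2.0 ** -k for k in range(1, 7)])
```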
In order to obtain qualitative information about dynamic systems, it is often necessary to have sufficient information about the state evolution. However, in many practical engineering applications, data acquisition equipment can only obtain 1-D vectors containing system information, namely, time-series data. Therefore, it is necessary to extract the qualitative information of the system from the experimental time series, in which reconstructing the phase space is the first step to detecting weak signals. The FD of the time series in this paper is calculated directly in the time domain, which not only reduces the computational complexity but also achieves the same effect as phase-space reconstruction. A time series $\{X(i)\}_{i=1}^{N}$ with length $N$ is set up. For a given delay $k$, $k$ new time series $X_k^m$ ($m = 1, 2, \ldots, k$) are obtained by reconstructing the time series with a delay method. The new time series $X_k^m$ has the following form:

$$X_k^m = \{X(m), X(m+k), X(m+2k), \ldots, X(m + \lfloor (N-m)/k \rfloor k)\}, \quad m = 1, 2, \ldots, k. \tag{5}$$

The curve length $L_m(k)$ of each $X_k^m$ can be calculated using

$$L_m(k) = \frac{1}{k}\left[\left(\sum_{i=1}^{\lfloor (N-m)/k \rfloor} \left|X(m+ik) - X(m+(i-1)k)\right|\right)\frac{N-1}{\lfloor (N-m)/k \rfloor k}\right]. \tag{6}$$

The length of the total sequence can be approximated as the average of the lengths of the sequence curves generated by the $k$ delays:

$$L(k) = \frac{1}{k}\sum_{m=1}^{k} L_m(k). \tag{7}$$

For different values of $k$, a set of data relating $k$ and $L(k)$ can be obtained, and the curve $\mathrm{lb}(L(k)) \sim \mathrm{lb}(1/k)$ can be drawn. If it is a straight line, the relationship between $k$ and $L(k)$ is as follows:

$$L(k) \propto k^{-D}. \tag{8}$$

Linear fitting is used to obtain the straight line:

$$\mathrm{lb}(L(k)) = D\,\mathrm{lb}(1/k) + b, \quad k = 1, 2, \ldots, k_{\max}, \tag{9}$$

where $b$ is the intercept and the slope $D$ is the FD of the time series.

Figure 1: Extraction processing for the HVSF assuming that the multidimensional estimator based on wavelet analysis has three steps of decomposition (block labels: low-pass filter, band-pass filter, low-pass filter). "HC" represents the calculation method of the H index described above.
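The delay-based curve-length procedure described above matches what is commonly called Higuchi's method, and a compact version is sketched here. It is a sketch under stated assumptions rather than the authors' code: the function name `higuchi_fd` is hypothetical, the binary logarithm stands in for lb, and constant signals (zero curve length) are not handled. Two sanity checks are used: a straight ramp should give a dimension of 1, and uncorrelated noise should give a value near 2.

```python
import math
import random


def higuchi_fd(x, k_max=10):
    """Slope of lb(L(k)) against lb(1/k) for k = 1..k_max."""
    if k_max < 2:
        raise ValueError("at least two delays are needed to fit a line")
    n = len(x)
    xs, ys = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(1, k + 1):  # m is the 1-based starting sample
            steps = (n - m) // k
            if steps < 1:
                continue
            total = sum(abs(x[m - 1 + i * k] - x[m - 1 + (i - 1) * k])
                        for i in range(1, steps + 1))
            # normalise by the number of steps, then divide by k
            lengths.append(total * (n - 1) / (steps * k) / k)
        mean_length = sum(lengths) / len(lengths)  # average over the k delays
        xs.append(math.log2(1.0 / k))
        ys.append(math.log2(mean_length))
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    num = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    den = sum((a - x_bar) ** 2 for a in xs)
    return num / den


random.seed(1)
fd_ramp = higuchi_fd([float(i) for i in range(1000)])
fd_noise = higuchi_fd([random.gauss(0.0, 1.0) for _ in range(2000)])
```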

Mathematical Problems in Engineering
A method for determining $k_{\max}$ is proposed, which involves increasing the value of $k_{\max}$ from $k_{\max} = 1$ when the abovementioned method is used to calculate the FD of a time series. When the value of the FD no longer clearly changes, the corresponding $k_{\max}$ value is the most suitable for calculating the FD of this kind of time series with the abovementioned algorithm. The specific calculation method of the FD is shown in Figure 7.
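The stopping rule for $k_{\max}$ can be written as a small scan over FD values that have already been computed. The sketch below makes several assumptions that the text does not fix: "no longer clearly changes" is read as a change smaller than a tolerance `tol`, the returned value is the first $k_{\max}$ at which that happens, and the numbers in the example are invented for illustration. The example scan starts at 2 because a line cannot be fitted through a single point.

```python
def choose_k_max(fd_by_k_max, tol=0.01):
    """Return the first k_max whose FD differs from the previous one by < tol."""
    candidates = sorted(fd_by_k_max)
    for previous, current in zip(candidates, candidates[1:]):
        if abs(fd_by_k_max[current] - fd_by_k_max[previous]) < tol:
            return current
    return candidates[-1]  # no plateau found: fall back to the largest tried


# Hypothetical FD values for increasing k_max, flattening out after 4.
example_scan = {2: 1.20, 3: 1.40, 4: 1.48, 5: 1.485, 6: 1.486}
selected = choose_k_max(example_scan)
```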
The FD is calculated for the speech data that were used to calculate the HVSF in Section 2, as shown in Figure 8.
It is clear from the graph that the FD is markedly smaller for speech produced in a fatigued state than in a normal state. Furthermore, analysis shows that this holds for different ATCs in the two states. Table 1 lists the FDs for some marked voice instructions from different ATCs recorded during the same time period (07:00 to 10:00 on April 26, 2018).

Speech Wavelet Fractal Feature.
The speech data of an ATC are contaminated by noise due to the influence of data-acquisition equipment and the environment. Considering this problem, an improved vocal feature of ATC fatigue is proposed. The noise in radiotelephony communication contains more energy at low frequencies. In this paper, wavelet decomposition is used to extract the detail coefficients of the ATC speech signal to reduce the influence of noise. In wavelet decomposition, it is very important to choose an appropriate decomposition scale and wavelet basis function. The decomposition scale ($j$) is closely related to the frequency range of speech signals and the frequency distribution of the wavelet decomposition. The frequency distribution of speech signals on each scale after wavelet decomposition is shown in Figure 9.
If the signal is decomposed into four layers, the frequency range of the fourth-level low-frequency coefficients is 0-500 Hz. If the signal is further decomposed into five layers, the frequency range of the fifth-layer low-frequency signal is 0-250 Hz. The energy and information in a speech signal are generally present between 300 and 3400 Hz. Therefore, decomposing the speech signal to the fifth level adds little. For a speech signal with a sampling frequency of 8 kHz, a wavelet decomposition scale ($j$) of 4 can therefore be chosen. Different wavelet basis functions have different impacts on noise reduction. Generally, the wavelet basis that produces the most coefficients close to zero is chosen. When the wavelet transform is applied to signals, the selected wavelet basis should preferably satisfy the properties of symmetry or antisymmetry, compact support, and orthogonality simultaneously. The main properties of common wavelet bases are listed in Table 2. It is clear that the Daubechies wavelet is highly consistent with the abovementioned requirements. Therefore, the Daubechies wavelet was chosen as the wavelet basis function.
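The halving of the low-frequency band at each level can be tabulated with a few lines of code. This is a sketch with one important assumption: `top_hz`, the upper edge of the band being split, is a hypothetical parameter. The figures quoted above (0-500 Hz at level four, 0-250 Hz at level five) are reproduced when `top_hz` is taken as 8000; if the band being split were instead the 4 kHz Nyquist band of an 8 kHz recording, every edge would be halved.

```python
def dwt_bands(top_hz, levels):
    """List (level, approximation band, detail band) for a dyadic DWT."""
    bands = []
    upper = float(top_hz)
    for level in range(1, levels + 1):
        middle = upper / 2.0
        # the low half is carried to the next level, the high half is kept
        bands.append((level, (0.0, middle), (middle, upper)))
        upper = middle
    return bands


bands_8000 = dwt_bands(8000, 5)
level_4_approx = bands_8000[3][1]
level_5_approx = bands_8000[4][1]
```

Comparing each approximation band with the 300-3400 Hz speech band shows at which level further splitting only subdivides the region below the speech range.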
When the wavelet decomposition scale and wavelet basis are determined, the speech signal can be decomposed by using the wavelet transform. Then, the detail coefficients can be extracted. The FD of the detail coefficients $d_i$ of each layer is also calculated:

$$D(d_i) = \mathrm{FD}(d_i), \quad i = 1, 2, 3, 4, \tag{10}$$

where FD is the FD calculation method described in this paper with $k_{\max}(d_i) = 10$, and $D(d_i)$ represents the FD of the detail coefficients of the $d_i$ layer. The FD comparison of the detail coefficients of each layer is shown in Figure 10. Finally, the SWFF of the ATC fatigued speech is built, which is composed of the FD of the speech signal and the FDs of its detail coefficients. In the following formula, $D(x)$ represents the fractal dimension of the speech signal:

$$\mathrm{SWFF} = [D(x), D(d_1), D(d_2), D(d_3), D(d_4)]. \tag{11}$$

A novel fatigue-detection method for ATCs can now be proposed. This method takes the SWFF as the fatigue-detection feature. Then, a support vector machine (SVM) is used to detect fatigue, as shown in Figure 11. The first step of the method is to construct a speech database of control instructions corresponding to the individual ATC. The speech signals in the database are marked as normal or fatigued. The second step is to decompose the voices in the database using four-layer wavelet decomposition. Then, the detail coefficients of each layer are extracted. The third step is to calculate the FD of the sequences from the second step according to the method described in this paper. The FDs of the sequences are used to obtain the SWFF of the ATC. Finally, an SVM is applied to detect ATC fatigue. The implementation of the method is reported in detail through experiments.
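Putting the pieces together, the feature for one utterance is a five-element vector. The sketch below is an assumption-laden illustration, not the authors' code: it uses the Haar (db1) filter, a compact Higuchi-style FD estimate with $k_{\max} = 10$, hypothetical function names, and random noise in place of real speech. The ordering of the vector (signal first, then levels 1 to 4) is also an assumption.

```python
import math
import random


def haar_details(signal, levels=4):
    """Detail coefficients of each level from a Haar (db1) DWT."""
    approx, details = list(signal), []
    for _ in range(levels):
        pairs = [(approx[2 * k], approx[2 * k + 1])
                 for k in range(len(approx) // 2)]
        details.append([(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return details


def higuchi_fd(x, k_max=10):
    """FD as the slope of lb(L(k)) against lb(1/k)."""
    n, xs, ys = len(x), [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(1, k + 1):
            steps = (n - m) // k
            total = sum(abs(x[m - 1 + i * k] - x[m - 1 + (i - 1) * k])
                        for i in range(1, steps + 1))
            lengths.append(total * (n - 1) / (steps * k) / k)
        xs.append(math.log2(1.0 / k))
        ys.append(math.log2(sum(lengths) / len(lengths)))
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    return num / sum((a - x_bar) ** 2 for a in xs)


def swff(signal, levels=4, k_max=10):
    """[D(x), D(d_1), ..., D(d_levels)] for one speech signal."""
    feature = [higuchi_fd(signal, k_max)]
    feature += [higuchi_fd(d, k_max) for d in haar_details(signal, levels)]
    return feature


random.seed(2)
stand_in_signal = [random.gauss(0.0, 1.0) for _ in range(4096)]
feature_vector = swff(stand_in_signal)
```

Each labelled utterance in the database would contribute one such vector as the input to the classifier.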

ATC Speech Database.
The fatigue experienced by ATCs mainly results from a poor working environment and inadequate rest due to an excessive workload. The factors affecting fatigue can be roughly divided into aspects such as personality characteristics, the available facilities and equipment, the operating environment, duty scheduling, and organizational management.
In order to address the aims of the present study, each speech signal was numbered according to certain rules, as presented in Table 3. These rules were worked out by taking factors such as working time, age, position, and the individual feelings of ATCs into consideration. Finally, an ATC speech database was constructed. This database could support future research into the fatigue of ATCs. The numbering scheme consists mainly of numbers and English letters that indicate certain factors.

Support Vector Machine Settings.
The SWFFs of 696 speech signals in the ATC speech database were selected. Then, an SVM was used to detect fatigue status. The selected speech signals were recorded from 02:00 to 17:00 on April 26, 2018. They were sampled at 8000 Hz. Among all samples, there are fewer negative (fatigued) samples than positive (normal) samples. Therefore, in the simulation experiment, all the fatigued samples were selected, and the same number of positive samples was then selected according to the order in which the speech samples were recorded. It should be noted that these voices differed from one another in semantic content and signal length. The SVM is a popular method in machine learning. The method is robust: a few support vectors determine the final result, so it is not sensitive to outliers and a large number of redundant samples can be discarded. Based on these advantages, the process of training and detecting sample data using an SVM is shown in Figure 12.
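The balancing step described above can be sketched as follows. The details are assumptions: each sample is represented as a hypothetical `(timestamp, label)` pair, and "according to the time sequence" is read as taking the earliest normal samples until their number matches the fatigued ones.

```python
def balance_by_time(samples):
    """Keep every fatigued sample plus equally many of the earliest normal ones."""
    fatigued = [s for s in samples if s[1] == "fatigued"]
    normal = sorted((s for s in samples if s[1] == "normal"),
                    key=lambda s: s[0])
    selected = fatigued + normal[:len(fatigued)]
    return sorted(selected, key=lambda s: s[0])  # restore recording order


# Hypothetical (timestamp, label) pairs, timestamps in arbitrary units.
recordings = [(1, "normal"), (2, "fatigued"), (3, "normal"),
              (4, "normal"), (5, "fatigued")]
balanced = balance_by_time(recordings)
```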
$\{x_j, y_j\}_{j=1}^{N}$ is a given set, where $x_j \in \mathbb{R}^n$ is the $j$th input vector and $y_j \in \mathbb{R}$ is the corresponding output. The overall model by weighted LS-SVM is formulated as [27]

$$y_v = \frac{\sum_{i=1}^{C} P_{iv}\, y_{iv}}{\sum_{i=1}^{C} P_{iv}}. \tag{12}$$

The $i$th weight coefficient $P_{iv}$ of $x_v$ is calculated from the center $\theta_i$ and the width $\beta_i$, where $\theta_i^t$ is the $t$th component of the center $\theta_i$, $\beta_i^t$ is the $t$th component of the width $\beta_i$, and $\lambda$ is the parameter that controls the overlap ratio. The reconstructive set in the $i$th subspace can be expressed by

$$x_{iv} = [A_{iv}^{1} x_{iv}^{1}, A_{iv}^{2} x_{iv}^{2}, \ldots, A_{iv}^{n} x_{iv}^{n}],$$

where $A_{iv}^{t}$ is the membership grade, $t = 1, 2, \ldots, n$. Weighted LS-SVM employs fuzzy c-means clustering to decide the number of rules, which is based on the following formula:

$$J = \sum_{i=1}^{C}\sum_{j=1}^{N} \mu_{ij}^{m}\,\|x_j - z_i\|^2,$$

where $m \in (1, \infty)$ is a fuzzy exponent, $\mu_{ij}$ denotes the degree to which $x_j$ belongs to rule $i$, $\mu_{ij} \in U$, and $z_i$ is the $i$th cluster center. The novel LS-SVM treats general errors, which include noise in both the input and output variables, as empirical errors [28].
Considering that the radial basis function (RBF), namely, the Gaussian kernel function, has better immunity to noise in data, the classical robust RBF is chosen as the kernel function of the SVM in the ATC fatigue-detection method proposed in this paper. The RBF kernel in this research is the same as the activation function used by Mu and Zhang [29]. The mathematical model of the kernel function is as follows:

$$K(x_i, x_j) = \exp\left(-\gamma\,\|x_i - x_j\|^2\right),$$

where $\gamma$ is the parameter of the kernel function. A K-fold cross-validation (K-CV) method is used to obtain the values of the penalty factor $c = 9.7656 \times 10^{-4}$ and the gamma parameter $\gamma = 0.5$. In the experiment, the 696 original data were divided into $K = 6$ groups (generally of equal size). Each subset is used once as a verification set, and the remaining $K - 1$ subsets are used as the training set, so that $K$ models are obtained. The average classification accuracy on the verification sets of these $K$ models is used as the performance index of the classifier under this K-CV. Except where stated otherwise below, the above parameters were used for the SVM in the experiments.

Figure 14: Test results for support vector machines with SWFF and RBF kernels (legend: true category, prediction category).
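The kernel and the fold layout can be made concrete with a short sketch. It assumes the common form of the Gaussian kernel, $\exp(-\gamma\|u - v\|^2)$, and contiguous folds without shuffling; neither choice is confirmed by the text, and the function names are hypothetical. With 696 samples and $K = 6$, each verification fold holds 116 samples.

```python
import math


def rbf_kernel(u, v, gamma=0.5):
    """Gaussian (RBF) kernel value for two feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


def k_fold_indices(n_samples, k=6):
    """Split range(n_samples) into k contiguous (train, validation) index pairs."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, validation))
        start += size
    return folds


folds = k_fold_indices(696, k=6)
same_point = rbf_kernel([1.0, 2.0], [1.0, 2.0])
far_point = rbf_kernel([0.0], [2.0])
```

Averaging a classifier's accuracy over the six validation folds gives the performance index described above.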

Result and Analysis.
In order to fully demonstrate the detection accuracy of the proposed method, different combinations of fatigue features and classifiers were simulated in this study. First, the HVSF was used as the fatigue feature. The SVM was used as the classifier to detect the fatigued state of ATCs. The test results are shown in Figure 13. The kernel function of the SVM is the RBF kernel function. Subsequently, the SWFF was used in place of the HVSF as the fatigue feature. The experimental results in Figure 14 show that fatigued-state detection based on the SWFF is more accurate than that based on the HVSF.
When the predicted and real categories are the same, the detection result is considered correct. Detection accuracy is the ratio of the number of correct detections to the number of samples in the detection set. Detection accuracy $A_{\mathrm{det}}$ is calculated as follows:

$$A_{\mathrm{det}} = \frac{N_{\mathrm{corr}}}{N_{\mathrm{det}}} \times 100\%,$$

where $N_{\mathrm{corr}}$ is the number of correct detections and $N_{\mathrm{det}}$ is the number of samples in the detection set. Second, considering that the use of different kernel functions for the SVM may affect the detection results, the detection results were also analysed when the fatigue feature was the SWFF and the kernel functions of the SVM differed, as listed in Table 4. That table also gives the parameter settings, fatigue-detection rate, and total detection accuracy of the different methods. It is clear from the table that when the kernel function is a polynomial, the accuracy of the test results was reduced to 85.63%. The results obtained when the classifier for detecting a fatigued state was changed to a backpropagation (BP) neural network are also presented in the table.
The accuracy when using the BP neural network as a classifier was 8% lower than when using the SVM. The test results for the 348 speech instructions of the ATCs indicate that the best performance in detecting a fatigued state was obtained when using the SWFF. The classifier used here for detection was an SVM with an RBF kernel function, a wavelet decomposition scale ($j$) of 4, and a db1 wavelet basis function. This method produced a high detection accuracy of 92.82%. In particular, the fatigue-detection rate of the proposed method is 96.55%. This is very important for aviation security. In short, the method for detecting ATC fatigue proposed in this study outperformed the other fatigue-detection methods compared here.
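The accuracy measure used in this section is a plain ratio and can be checked with a few lines. The label vectors below are hypothetical: the paper does not list raw counts, and 323 correct detections out of 348 is used only because it is one count that rounds to the reported 92.82%.

```python
def detection_accuracy(predicted, actual):
    """Share of samples whose predicted category equals the real category."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    n_corr = sum(1 for p, a in zip(predicted, actual) if p == a)
    return n_corr / len(actual)


# Hypothetical labels: 348 samples, 323 of them predicted correctly.
actual_labels = [1] * 348
predicted_labels = [1] * 323 + [0] * 25
accuracy_percent = round(100 * detection_accuracy(predicted_labels, actual_labels), 2)
```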

Conclusion
In this paper, the radiotelephony communications of ATCs have been analysed. The chaos in radiotelephony communications has also been discussed. The chaos can be used to accurately judge the fatigued state of ATCs. The phase-space trajectories fluctuate significantly less in a fatigued state than in a normal state. This study has introduced the HVSF for fatigue detection. Because of the specific characteristics of radiotelephony communications, the HVSF cannot represent the fatigued state of ATCs well. There is no significant difference in the HVSF between fatigued and normal speech signals. Therefore, a revised vocal feature of ATCs called the SWFF was proposed based on the FD. This feature changes considerably when an ATC is fatigued. The FD is markedly smaller for speech in a fatigued state than in a normal state. Furthermore, analysis shows that this holds for different ATCs in the two states.
A fatigued speech database of ATCs has been constructed. The file name of each voice in the database represents the state information related to that voice. This database could support future research related to the fatigue of ATCs. A method for detecting ATCs in a fatigued state is proposed based on SVM technology and the SWFF of ATCs. This method is robust to noise contamination. The experimental results obtained for different fatigued-state detection methods demonstrate the superiority of the proposed method.
The accuracy of the proposed method was 92.82%. That is higher than the accuracy of the other fatigued-state detection methods analysed. In particular, the fatigue-detection rate of the proposed method is 96.55%.
This is very important for aviation security. The research provides theoretical guidance for the Air Traffic Management Authority on detecting ATC fatigue, and it may provide a reference for fatigue assessment in other professional fields of civil aviation.
Data Availability
The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available because they contain information that could compromise the privacy of the Air Traffic Management Shandong Bureau of China.

Conflicts of Interest
The authors declare that they have no conflicts of interest.