A Deep Neural Network-Based Fault Detection Scheme for Aircraft IMU Sensors

A new fault detection scheme for aircraft Inertial Measurement Unit (IMU) sensors is developed in this paper. The scheme adopts a deep neural network with a CNN-LSTM-fusion architecture (CNN: convolutional neural network; LSTM: long short-term memory). The fault detection network (FDN) developed in this paper is independent of the aircraft model and flight condition. Flight data are reformed into a 2D format for FDN input and are mapped by the network directly to fault cases. We simulate different aircraft under various flight conditions and separate the data into training and testing sets. Some of the aircraft and flight conditions appear only in the testing set to validate the robustness and scalability of the FDN. Different FDN architectures are studied, and an optimized architecture is obtained via ablation studies. An average detection accuracy of 94.5% on 20 different cases is achieved.


Introduction
The Inertial Measurement Unit (IMU) is a key sensor of an aircraft control system, as it measures angular speeds and accelerations in flight. Faults in the IMU may have serious consequences because flight control algorithms depend heavily on feedback of angular speeds and accelerations. Hence, IMU fault detection is of great significance for improving the fault tolerance of control systems [1].
Hardware redundancy (HR) is a commonly adopted approach for sensor fault detection on aircraft [2,3]. An HR scheme adopts redundant sensors and a voting system to eliminate erroneous data measured by faulty sensors. Although HR schemes usually improve the fault tolerance of flight control systems, drawbacks including high cost and vulnerability to generic faults are not negligible [4].
Analytical redundancy (AR) is another kind of approach, which can be divided into two categories: model-based AR and data-driven AR.
Model-based AR is the more traditional technique. Sensors are modeled mathematically, and an output estimator is proposed [5]. The real output of a given sensor is monitored and compared with the estimated output, yielding a residual that is used to identify sensor faults. The majority of studies on model-based analytical redundancy for fault detection apply Kalman filtering (KF), e.g., extended Kalman filtering (EKF) [6-8], unscented Kalman filtering (UKF) [9,10], two-step EKF [6], fuzzy logic KF [11], and hidden-Markov-based KF [12]. Other model-based methods use H∞ synthesis [13-15], set-valued observers [16-19], and moving horizon estimators [20]. Model-based methods are highly dependent on aircraft dynamics and kinematics, which differ significantly among different aircraft at diverse flight conditions. Thus, scalability is not guaranteed by model-based schemes.
Data-driven AR is an alternative to model-based AR. As data-driven AR maps sensor outputs to fault cases directly, so that aircraft dynamics are not directly referred to in the fault detection process, it is potentially a scalable scheme for different aircraft and flight conditions. Neural networks (NNs) have been widely used in recent years for data-driven AR, as they are powerful nonlinear fitting tools.

Problem Definition
Aerodynamics are written as equation (1), wherein S_* = sin(*) and C_* = cos(*). {V, α, β} are airspeed, angle of attack, and sideslip angle, respectively. {ϕ, θ, ψ} and {p, q, r} are Euler angles and angular speeds, respectively. {a_x, a_y, a_z} denote accelerations along the x, y, z axes in the body frame of the aircraft. External forces along the body axes are defined as {F_k = a_k m}, k = x, y, z, which are functions of flight states, control inputs, and geometric parameters, wherein {δ_th, δ_e, δ_a, δ_r} are control inputs of throttle, elevator, aileron, and rudder. {S, b, c} denote wing area, wing span, and mean aerodynamic chord, respectively. Kinematics in the rotational channel are described by equation (3). Dynamics in the rotational channel follow [33], wherein {M_x, M_y, M_z} are moments exerted on the aircraft and c_1 to c_9 are inertial parameters of the aircraft (refer to [33] for detailed definitions). External moments are likewise functions of flight states, control inputs, and geometric parameters.
We intended to cover the whole flight envelope (e.g., LTO and cruise) by introducing the 4 configurations mentioned above. More detailed information on the configurations is summarized in Table 1.
High-altitude cruise of aircraft Y and low-altitude LTO of aircraft D are simulated for network training; high-altitude cruise and low-altitude free flight of aircraft B are simulated for network testing.
To simulate atmospheric turbulence, the Dryden model is adopted, which injects perturbations into "clean" flight states. A speed model of Dryden wind can be defined by spectrum functions, wherein {L_u, L_v, L_w} represent the turbulence scale lengths and {σ_u, σ_v, σ_w} represent the turbulence intensities. Measurement noise is considered by introducing Gaussian noise. See Table 2 for the noise configuration in each channel of the measured flight states.
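As an illustration of the perturbation step, the following is a minimal sketch, not the simulation used in this work. It covers only the longitudinal gust component, relying on the fact that the longitudinal Dryden spectrum corresponds to a first-order Gauss-Markov process with time constant L_u/V and standard deviation σ_u. The function names, the sample time, and all numerical values (airspeed, scale length, intensity, noise standard deviation) are hypothetical and are not taken from Table 2.

```python
import numpy as np

def dryden_longitudinal_gust(n_steps, dt, airspeed, scale_length, sigma, rng):
    """Longitudinal gust as a discretized first-order Gauss-Markov process.

    The correlation time is scale_length / airspeed and the stationary
    standard deviation is sigma (assumed values, for illustration only).
    """
    a = np.exp(-airspeed * dt / scale_length)   # one-step correlation
    drive_std = sigma * np.sqrt(1.0 - a ** 2)   # keeps the variance stationary
    gust = np.empty(n_steps)
    gust[0] = sigma * rng.standard_normal()
    for k in range(1, n_steps):
        gust[k] = a * gust[k - 1] + drive_std * rng.standard_normal()
    return gust

def add_measurement_noise(signal, noise_std, rng):
    """Add zero-mean Gaussian measurement noise to a signal."""
    return signal + noise_std * rng.standard_normal(signal.shape)

rng = np.random.default_rng(0)
clean_airspeed = np.full(50000, 150.0)          # hypothetical "clean" airspeed, m/s
gust = dryden_longitudinal_gust(50000, 0.1, 150.0, 533.0, 1.5, rng)
measured = add_measurement_noise(clean_airspeed + gust, 0.5, rng)
```

The lateral and vertical components have second-order spectra and would need a different shaping filter; they are omitted here to keep the sketch short.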
See Figure 1 for simulated flight trajectories of the 4 configurations.

Flight Data with IMU Fault Cases Injected.
Typical faults of an IMU include drift, noise, and scale factor error. As the scale factor error is produced during manufacturing and can be calibrated beforehand, we focus on noise and drift, i.e., angle random walk (ARW) and rate random walk (RRW), which behave randomly and are more difficult to detect in flight.
We introduced 4 cases with IMU faults by adding randomized faulty values to the basic flight data described in Section 3.1. To be specific, a 6-DoF IMU composed of a triaxial accelerometer and a triaxial gyroscope is studied. Although the accelerometer and the gyroscope have 3 channels each, they function as 2 measurement units. Thus, we are concerned with whether there are faults in the accelerometer or the gyroscope, rather than in a specific acceleration or angular speed channel, to provide information for deciding whether a redundant measurement unit should take over. The four fault cases are drift in the angular speed channels, drift in the acceleration channels, noise in the angular speed channels, and noise in the acceleration channels. One and only one type of fault occurs simultaneously or individually in acceleration or angular speed channels.
We have 5 different flight cases (including the case without IMU faults, Case 0) to be classified for each flight configuration. Fault cases happen at a random time and last for a random period (but less than 60 s) in every 60 s of flight. Figure 2 illustrates how fault cases are injected, where red dashed lines represent basic flight data and black solid lines represent flight data with an IMU fault. We repeat this randomized fault-injecting procedure multiple times on each flight configuration depicted in Figure 1.
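The randomized injection procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to generate the data sets: drift is modeled as a random walk (a cumulative sum of white increments), noise as additive white Gaussian noise, and the function name `inject_fault`, the `magnitude` parameter, and its value are hypothetical.

```python
import numpy as np

def inject_fault(segment, fault_type, rng, dt=1.0, magnitude=0.05):
    """Add a drift or noise fault to one measurement unit.

    segment: float array of shape (3, n), the three channels of the
    accelerometer or gyroscope over one 60 s block.
    The fault starts at a random sample and lasts a random number of
    samples, always shorter than the block itself.
    """
    n = segment.shape[1]
    faulty = segment.copy()
    start = int(rng.integers(0, n - 1))          # start in [0, n - 2]
    length = int(rng.integers(1, n - start))     # length in [1, n - start - 1]
    stop = start + length
    shape = (segment.shape[0], length)
    if fault_type == "drift":
        # Random walk: accumulate white increments over the fault period.
        increments = magnitude * np.sqrt(dt) * rng.standard_normal(shape)
        fault = np.cumsum(increments, axis=1)
    elif fault_type == "noise":
        # Additional white noise during the fault period.
        fault = magnitude * rng.standard_normal(shape)
    else:
        raise ValueError("fault_type must be 'drift' or 'noise'")
    faulty[:, start:stop] += fault
    return faulty, (start, stop)

rng = np.random.default_rng(1)
gyro_block = np.zeros((3, 60))                   # placeholder for clean gyro data
faulty_block, (start, stop) = inject_fault(gyro_block, "drift", rng)
```

Calling the function repeatedly with fresh random draws mirrors the repetition of the procedure on each flight configuration.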

Data Structure for Fault Detection Scheme.
In real flights, IMU faults last for a certain period of time with some "pattern," in the same way as we insert them in Section 3.2. Therefore, it is more reasonable to observe flight data fragments rather than individual flight data points. A time window of length ΔT slides over each flight data fragment, and data sets composed of flight data matrices are formed. Each column of a data matrix is a vector of measured flight states at a certain time t. To be specific, we downsample data fragments to a sample rate of 1 Hz and use a time window of ΔT = 30 s to form data matrices, yielding inputs with a dimension of 31 columns and 12 rows (representing all 12 variables measured by sensors: 3 ADS states {V, α, β}, 3 Euler angles {ψ, θ, ϕ}, 3 angular speeds {p, q, r}, and 3 accelerations {a_x, a_y, a_z}). Figure 3 illustrates how a flight data matrix is formed in the angular speed and acceleration channels, wherein the left plot is part of the flight data sequences with a fault inserted, and the right plot is the flight data matrix extracted from the sequence in the left plot by a time window of length ΔT at time t.
Data sets are divided into training and testing sets. To be specific, 2/3 of the data of aircraft Y and D are extracted for training. 2/3 of Y and D and all data of aircraft B are used for testing. Fault cases are randomly added. See Table 3 for the detailed distribution of the data sets.
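The window extraction can be sketched in a few lines. This is a minimal illustration that assumes the data have already been downsampled to 1 Hz and that the window advances by one sample at a time; the stride and the function name `make_data_matrices` are assumptions.

```python
import numpy as np

def make_data_matrices(flight_data, window_s=30, sample_rate_hz=1):
    """Slice a (12, T) flight record into overlapping data matrices.

    Returns an array of shape (num_windows, 12, width), where
    width = window_s * sample_rate_hz + 1 because both window endpoints
    are included (31 columns for a 30 s window at 1 Hz).
    """
    width = window_s * sample_rate_hz + 1
    n_samples = flight_data.shape[1]
    if n_samples < width:
        raise ValueError("record is shorter than one time window")
    windows = [flight_data[:, s:s + width] for s in range(n_samples - width + 1)]
    return np.stack(windows)

# 12 measured variables over 60 s at 1 Hz (dummy values for illustration).
flight_data = np.arange(12 * 60, dtype=float).reshape(12, 60)
matrices = make_data_matrices(flight_data)
```

Each entry of `matrices` is one 12-row, 31-column input of the kind described above.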

The IMU fault detection problem discussed in this paper can be regarded as a classification problem on sequential data composed of 2D flight-data matrices. A deep neural network with both CNN and LSTM modules is designed to deal with this kind of problem.
With the CNN, we intend to extract semantic information from the 2D data matrices. Feature maps are generated each time an input passes through a convolutional layer. As the number of network layers increases, deeper semantic information is expected to be extracted. We use leaky ReLU as the activation function, which reduces the occurrence of silent (dead) neurons.
With the LSTM, we aim to extract temporal information. Gate operations are used in the LSTM module to improve its performance. For network training, cross-entropy is used as the training loss and Adam is adopted as the optimizer. An exponentially decaying learning rate (lr) is adopted: lr = 0.001 × e^(−epoch/200), wherein one epoch means that all training data are used once.
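For reference, the leaky ReLU activation can be written as a short NumPy function. This is a minimal sketch; the negative slope of 0.01 is an assumed value, since the slope is not stated here.

```python
import numpy as np

def leaky_relu(x, negative_slope=0.01):
    """Leaky ReLU: identity for x >= 0, a small linear slope for x < 0.

    Because the output (and gradient) is nonzero for negative inputs,
    a neuron cannot become permanently inactive as it can with plain ReLU.
    """
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, x, negative_slope * x)

activations = leaky_relu([-2.0, 0.0, 3.0])
```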
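The learning-rate schedule stated above can be checked numerically with a small function. This is a minimal sketch; the function name and argument names are illustrative.

```python
import math

def learning_rate(epoch, initial_lr=1e-3, decay_epochs=200.0):
    """Exponentially decaying learning rate: initial_lr * exp(-epoch / decay_epochs)."""
    return initial_lr * math.exp(-epoch / decay_epochs)

lr_start = learning_rate(0)       # initial value, 0.001
lr_200 = learning_rate(200)       # reduced by a factor of e after 200 epochs
lr_1000 = learning_rate(1000)     # reduced by a factor of e**5 after 1000 epochs
```

After 1000 epochs the rate has fallen to roughly 6.7e-6, so late training steps are much smaller than early ones.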

Evaluation of Fault Detection Performance.
To evaluate the classification performance of the fault detection network, a confusion matrix is computed each time the network is tested with test sets. Each diagonal element of a confusion matrix is the percentage of cases correctly predicted; i.e., the closer the diagonal elements are to 100, the better the network performs.
An example of a confusion matrix is presented in Table 4. As described in Section 3.2, we consider both drift and noise in the angular speed and acceleration channels, forming 4 fault cases and 1 clean case with no fault. Test sets (for example, aircraft Y with flight condition "low altitude, LTO, manual," see Table 1) are sent into a fault detection network with a certain structure. Every element of the confusion matrix is a statistical result of predictions; e.g., the 2nd row of the matrix indicates that 6% of the "angular speed drift" cases were predicted as "clean," 93% were correctly predicted as "angular speed drift," 0% as "acceleration drift," 1% as "angular speed noise," and 0% as "acceleration noise." In this case, the prediction accuracy for the case "angular speed drift" is 93% for aircraft Y with flight condition "low altitude, LTO, manual."
Figure 3: Formation of flight data matrix.
With the diagonal vectors of the confusion matrices discussed above, we obtain a tool to evaluate the prediction performance of a given FDN architecture. For each network architecture discussed in the next section and each aircraft/flight condition in Table 1, we send all test sets into the FDN to obtain the diagonal vector of the confusion matrix, generating Table 5 for comparison.
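The row-normalized confusion matrix and its diagonal vector can be computed as sketched below. This is a minimal illustration; the function name is hypothetical, and the class indices (0 for clean, 1 for angular speed drift, 2 for acceleration drift, 3 for angular speed noise, 4 for acceleration noise) are assumed from the order in which the predictions are listed above. The dummy labels reproduce only the 2nd row of the example.

```python
import numpy as np

def confusion_matrix_percent(true_labels, predicted_labels, num_classes=5):
    """Confusion matrix in percent, normalized per true class (per row)."""
    true_labels = np.asarray(true_labels)
    predicted_labels = np.asarray(predicted_labels)
    counts = np.zeros((num_classes, num_classes), dtype=int)
    # Accumulate one count per (true, predicted) pair.
    np.add.at(counts, (true_labels, predicted_labels), 1)
    row_totals = counts.sum(axis=1, keepdims=True)
    # Rows of classes that never occur stay at zero instead of dividing by zero.
    return 100.0 * counts / np.maximum(row_totals, 1)

# 100 "angular speed drift" samples: 6 predicted clean, 93 correct, 1 as noise.
true_labels = [1] * 100
predicted_labels = [0] * 6 + [1] * 93 + [3] * 1
matrix = confusion_matrix_percent(true_labels, predicted_labels)
diagonal = np.diag(matrix)   # per-class percentage of correct predictions
```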

Architecture Studies of the IMU FDN.
As the inputs we send into the network are not images but data sequences of aircraft dynamics, general rules for architecture optimization in image recognition networks may not apply. We adopt ablation studies, beginning with a full architecture, to find a "best" network for the fault detection problem. The structure of the full network is depicted in Figure 4.
In the full network, namely, FDN-FULL, we send {V, α, β} (air data sensors, "ADS"), {ψ, θ, ϕ} (Euler angles, "ANGLES"), {p, q, r} (angular speeds, "AS"), and {a_x, a_y, a_z} (accelerations, "ACC") into the input layer, which means that all the measured variables are used for fault detection. For all channels, both CNN and LSTM layers are adopted.
FDN-FULL/ADS. The FDN is essentially a nonlinear function that maps input (flight states) to output (fault cases). Notice that equation (1) indicates that accelerations (in which we are interested) are mapped to ADS data explicitly. Sending both ADS and ACC into the FDN may lead to data redundancy and overfitting. So, we cut the ADS channel out, so that the size of the network blocks before the Concatenate layer is reduced to 3/4 of that of FDN-FULL. Prediction performance is slightly improved as expected; see Table 5.

FDN-FULL/ANGLES. Equation (3) indicates that the Euler angles may be redundant due to the explicit mapping relationship between them and the angular speeds. So, we cut the ANGLES channel out, just as we did with ADS in Section 4.4.1. Prediction performance is also slightly improved while the size is reduced, as shown in Table 5.
The training history of different numbers of CNN filters is depicted in Figure 5. Validation accuracy increases and loss decreases with the number of filters, which means better performance. We chose 64 filters as a balance between performance and network size. Then, studies on hyperparameters 2 and 3 (the CNN kernel size and the number of LSTM nodes) are carried out with a fixed number of 64 CNN filters.
The training history of different sizes of CNN kernels is depicted in Figure 6. As is commonly understood for CNNs, both the size of the receptive field and the computational cost grow with the size of the CNN kernels. Figure 6 shows that FDN-OPT with 2 × 2 or 3 × 3 kernels performs better on validation sets than with 1 × 1 or 1 × 1 + 2 × 2 kernels. The model size of FDN-OPT is 55.8 MB with 2 × 2 CNN kernels and 59.1 MB with 3 × 3 kernels. So the 2 × 2 kernel is chosen to balance model size and performance.
The training history of different numbers of LSTM nodes is depicted in Figure 7. The FDN performs worst in training when the number of LSTM nodes is equal to the number of CNN kernels (64) and best when the number of LSTM nodes is half (32) or twice (128) the number of CNN kernels. As 128 nodes is a commonly used configuration in the majority of LSTM practice, it is finally adopted in our work. This result suggests that the number of LSTM nodes should not be equal to the number of CNN kernels when composing a DNN with both CNN and LSTM.
The ablation studies show that both CNN and RNN are necessary. The prediction results of FDN-OPT confirm these two conclusions: for the IMU fault detection problem, we need only AS and ACC information, and a CNN-LSTM fusion deep network provides satisfying performance.
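To make the dependence on kernel size concrete, the following minimal sketch counts the parameters of a single 2D convolutional layer and the receptive field of a stack of such layers. The single input channel, the three-layer stack, and the stride-1, undilated convolutions are assumptions chosen for illustration; they do not describe the actual FDN-OPT layers, so the numbers are not comparable with the model sizes quoted above.

```python
def conv2d_parameter_count(kernel_size, in_channels, out_channels):
    """Weights plus biases of one 2D convolutional layer with a square kernel."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels

def receptive_field(kernel_size, num_layers):
    """Receptive field along one axis of stacked stride-1, undilated conv layers."""
    return 1 + num_layers * (kernel_size - 1)

for kernel_size in (1, 2, 3):
    params = conv2d_parameter_count(kernel_size, in_channels=1, out_channels=64)
    field = receptive_field(kernel_size, num_layers=3)
    print(f"{kernel_size} x {kernel_size}: {params} parameters, receptive field {field}")
```

The parameter count grows with the kernel area, while a 1 × 1 kernel never sees neighboring entries of the data matrix, whatever the number of layers.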
Starting from FDN-FULL, we obtained a deep network for the IMU fault detection problem that is equivalent in performance but smaller in size. We used both manual and AP flight data of 2 aircraft (Y and D) to train the network and another aircraft (B) to test it. Results show that the prediction accuracy is over 84% for all aircraft and flight configurations, which means that the fault detection scheme is robust and potentially effective for other aircraft and flight configurations.
To eliminate stochastic effects during NN training, we trained each architecture 20 times until the loss converged (3000 epochs for FDN-RNN, 1000 epochs for the other architectures). Figure 8 depicts the training history of the different architectures discussed in the previous sections. In Figure 8, FDN-OPT (black lines) has the lowest training loss on average, as shown in (b). FDN-FULL/ADS (blue lines) and FDN-FULL/ANGLES (magenta lines) perform alike, as does FDN-FULL. The loss curves of those 4 architectures are similar in Figure 8, which strengthens the argument that their prediction performances are almost of the same order, as shown in Table 5. The average loss of FDN-CNN is greater than that of those 4 architectures but clearly much lower than that of FDN-RNN. The reason the performance of FDN-CNN is close to that of FDN-FULL might be that the input data were arranged in a form in which temporal information had already been included (see Section 3.3). The loss of FDN-RNN decreases much more slowly than that of any other architecture, as expected, since its prediction performance is the worst in Table 5.
Both Figure 8 and Table 5 show that FDN-OPT offers the best balance of size and performance among all architectures studied in this work; we adopt this architecture for the IMU fault detection task. The architecture of the optimized FDN-OPT is depicted in Figure 9.

Conclusion
A CNN-LSTM-fusion fault detection network (FDN) is proposed for aircraft IMU fault detection. Flight data measured by the Inertial Measurement Unit (IMU), including angular speed (AS) and acceleration (ACC), are used as inputs to the FDN, and fault cases are the output. We simulated different aircraft under various flight conditions to guarantee data diversity, and the testing sets were extracted from flight data that were not used in the training process. Testing results were satisfying for 3 different aircraft (trained and validated with large cargo Y and general aviation D, tested with large passenger B) simulated in different flight conditions (low-altitude LTO, high-altitude AP cruise, and low-altitude manual free flight), which means that the FDN developed in this paper is robust to flight conditions and potentially scalable to different types of aircraft.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.