An Adaptive Operational Modal Analysis Method Using Encoder LSTM with Random Decrement Technique

A new parameter identification method under non-white noise excitation, which combines a transformer encoder with long short-term memory networks (LSTMs), is proposed in this paper. In this work, processing the data with the random decrement technique (RDT) is equivalent to removing noise from the raw data. In general, the gates in the LSTM allow the network to store data selectively, which mitigates gradient vanishing and gradient explosion to a certain extent. The encoder learns the essential features of the data, which reduces the burden on the LSTM. More specifically, the LSTM structure is kept as simple as possible and is trained on these essential features to achieve the best training effect. Finally, the proposed method is verified by simulation and experiment, and the results show that it has the advantages of high recognition accuracy, strong anti-noise ability, and a fast convergence rate. In particular, the results indicate that combining deep learning with a traditional method yields accurate parameter identification.


Introduction
Operational modal analysis (OMA) requires only the measured vibration response of a structure; the input excitation need not be measured, which reduces the measurement cost. In addition, the modal parameters can be applied directly to on-line health monitoring and damage diagnosis of the structure. Moreover, for some complex and large structures, such as aerospace vehicles, offshore platforms, and bridges, it is difficult to measure the excitation under actual working conditions, so identifying the modal parameters directly from the time-domain response signals of the structure is of great engineering significance [1][2][3][4]. Conventional modal analysis methods usually assume that the excitation of the structure is white noise, but in the working state of the structure the ambient excitation is mostly non-white noise. Therefore, research on structural modal parameter identification under non-white noise excitation benefits the further development of structural dynamic analysis technology and its application in engineering.
The RDT is a time-domain method for identifying modal parameters that was proposed by Cole [5]. Subsequently, Ibrahim extended the RDT to multichannel signals and formed the Ibrahim time-domain method, which was successfully applied to modal parameter identification of a spacecraft model structure [6]. The RDT was originally applied to linear single-degree-of-freedom systems with constant damping ratios and was later used to extract aerodynamic damping from random crosswind responses [7]. Moreover, Kordestani et al. proposed a two-stage time-domain output-only damage detection method with a new energy-based damage index [8]. The RDT is used for monitoring and determining structural performance and is able to predict damage and handle sudden failures during operation of the structure. In brief, the RDT is regarded as a distinctive nondestructive testing method that is widely used in aerospace, civil engineering, and mechanical engineering [9]. In addition, other time-domain methods, such as the natural excitation technique (NExT), the eigensystem realization algorithm (ERA), and stochastic subspace identification (SSI), have also been applied in engineering [10][11][12][13][14][15]. In general, time-domain methods use the measured response signal to identify the modal parameters of the system directly without a Fourier transform, which reduces the data transformation error, but their anti-noise ability is poor.
In contrast, deep learning methods show attractive advantages in resisting noise interference and can be used in damage assessment, health monitoring, modal identification, and so on [16,17]. Hopfield invented a single-layer feedback neural network, the Hopfield network, to solve combinatorial optimization problems, which is the earliest prototype of the RNN [18]. Nevertheless, conventional RNNs usually suffer from a dilemma between long-range dependence and gradient vanishing. As a remedy, Hochreiter and Schmidhuber proposed the LSTM [19], which greatly alleviated the training problems of early RNNs by using gating units and a memory mechanism. Subsequently, Gers et al. [20] introduced the forget gate mechanism on the basis of [19], so that the LSTM can reset its own state. Greff et al. [21] reviewed the development of the LSTM, compared the abilities of eight LSTM variants in speech recognition, handwriting recognition, and polyphonic music modeling, and showed that the forget gate and the output activation function are the key components of the LSTM. Neural networks are among the most successful identification technologies used in the modeling of dynamic systems and have a distinct advantage in resisting noise interference; scholars have therefore begun to study parameter identification methods based on neural networks, aiming at better application in practical engineering [22,23]. Many attempts have already been made in this field. For example, Xu and Wang proposed an RNN-based approach for modal parameter identification of structure-unknown systems [24]. The work in [25] presented a structural identification method based on an RNN and the autoregressive moving average (ARMA) model. Zhang et al. studied modal parameter identification based on a neural network with ARMA [26].
RNNs have unique advantages in processing time series data, and the time-domain method for modal parameter identification based on RNNs has great development potential.
Generally speaking, the restriction that conventional OMA methods place on the type of input greatly reduces their adaptability in practical engineering applications. It is therefore worth studying how to combine the advantages of traditional methods and neural networks in a new method. For this purpose, an adaptive operational modal analysis method using an encoder LSTM with RDT is proposed in this paper. First, the data are processed by RDT, so that the recognition accuracy is as high as possible while the model is kept as simple as possible. Second, with the addition of the encoder, the LSTM can be regarded as the decoder in an autoencoder. Then, the network structure is kept as simple as possible to achieve the best performance. Finally, the results indicate that the encoder LSTM identifies the parameters accurately and that the proposed method performs well. The rest of this paper is organized as follows. The RDT and the architecture of LSTM are described in Section 2. The proposed method and its simulation are described in Section 3. Experimental verification is described in Section 4. Finally, conclusions are given in Section 5.

Background
2.1. RDT. RDT extracts the free-decay vibration response from the response to ambient excitation by means of averaging and mathematical statistics [5][6][7]. In a linear multi-degree-of-freedom system, the forced vibration response of a measuring point under arbitrary excitation can be expressed as

$$y(t) = y(0)\,D(t) + \dot{y}(0)\,V(t) + \int_{0}^{t} m(t-\tau)\,u(\tau)\,d\tau,$$

where $D(t)$ is the free vibration response of the system with an initial displacement of 1 and an initial velocity of 0; $V(t)$ is the free vibration response of the system with an initial displacement of 0 and an initial velocity of 1; $y(0)$ and $\dot{y}(0)$ are the initial displacement and initial velocity of the system vibration, respectively; $m(t)$ is the unit impulse response function of the system; and $u(t)$ is the external excitation.
A suitable constant $A$ is selected to intercept the in situ random vibration response $y(t)$ of the structure, which gives a series of intersection times $t_i$ ($i = 1, 2, \cdots, N$) at which $y(t_i) = A$. The response from time $t_i$ can be expressed as

$$y(t - t_i) = A\,D(t - t_i) + \dot{y}(t_i)\,V(t - t_i) + \int_{t_i}^{t} m(t-\tau)\,u(\tau)\,d\tau.$$

Since $u(t)$ is stationary, the starting point in time does not affect its randomness. The starting point $t_i$ of the time series $y(t - t_i)$ is moved to the origin of coordinates, and the corresponding subsample function can be expressed as

$$x_i(t) = A\,D(t) + \dot{y}(t_i)\,V(t) + \int_{0}^{t} m(t-\tau)\,u(\tau + t_i)\,d\tau.$$

Taking the statistical average of $x_i(t)$ gives

$$x(t) = \frac{1}{N}\sum_{i=1}^{N} x_i(t) = A\,D(t) + \frac{1}{N}\sum_{i=1}^{N}\dot{y}(t_i)\,V(t) + \frac{1}{N}\sum_{i=1}^{N}\int_{0}^{t} m(t-\tau)\,u(\tau + t_i)\,d\tau.$$

The excitation $u(t)$ is a random vibration with mean value 0, and the system responses $y(t)$ and $\dot{y}(t)$ are also stationary random vibrations with mean value 0, so the last two terms vanish as $N$ increases and

$$x(t) \approx A\,D(t).$$

After RDT processing, the free vibration response with initial displacement $A$ and initial velocity 0 is obtained [5][6][7][8][9]. RDT is simple and has a clear physical meaning, so it is used in the preprocessing of the dataset.
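As an illustration of the averaging procedure described in this subsection, the following is a minimal sketch of a level-crossing RDT in pure Python. It is not the authors' implementation: the function name `random_decrement`, the use of both upward and downward crossings, and the simulated single-degree-of-freedom response (sampling rate, natural frequency, and damping ratio) are all assumptions chosen for illustration.

```python
import math
import random


def random_decrement(y, threshold, seg_len):
    """Average the segments of y that begin where y crosses `threshold`.

    Both upward and downward crossings are used, so the initial velocities
    tend to cancel in the average. Returns (signature, number_of_segments).
    """
    starts = [
        i for i in range(1, len(y) - seg_len + 1)
        if (y[i - 1] < threshold <= y[i]) or (y[i - 1] > threshold >= y[i])
    ]
    if not starts:
        raise ValueError("no threshold crossings found")
    signature = [0.0] * seg_len
    for s in starts:
        for k in range(seg_len):
            signature[k] += y[s + k]
    return [v / len(starts) for v in signature], len(starts)


# Hypothetical test signal: a lightly damped discrete resonator driven by
# white noise (all numerical values below are assumptions).
random.seed(0)
fs, f_n, zeta = 200.0, 5.0, 0.02
w = 2.0 * math.pi * f_n / fs                 # natural frequency in rad/sample
r = math.exp(-zeta * w)                      # pole radius
a1 = 2.0 * r * math.cos(w * math.sqrt(1.0 - zeta ** 2))
a2 = -r * r
y = [0.0, 0.0]
for _ in range(20000):
    y.append(a1 * y[-1] + a2 * y[-2] + random.gauss(0.0, 1.0))

sigma = math.sqrt(sum(v * v for v in y) / len(y))   # RMS level used as A
signature, n_segments = random_decrement(y, threshold=sigma, seg_len=200)
```

Under these assumptions, `signature` approximates the free-decay response scaled by the trigger level, and it becomes smoother as the number of averaged segments grows.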

LSTM.
RNN is very effective for data with sequence characteristics, and it can mine temporal information in data [16]. However, because of gradient vanishing and gradient exploding, the training of RNN is very difficult and its application is very limited. Compared with RNN, LSTM has gating units and a memory mechanism and can store information selectively, so it alleviates the problems of gradient vanishing and gradient explosion.
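In the LSTM, for each element in the input sequence, each layer computes the following function:

$$
\begin{aligned}
i_t &= \sigma(W_{ii} x_t + b_{ii} + W_{hi} h_{t-1} + b_{hi}),\\
f_t &= \sigma(W_{if} x_t + b_{if} + W_{hf} h_{t-1} + b_{hf}),\\
g_t &= \tanh(W_{ig} x_t + b_{ig} + W_{hg} h_{t-1} + b_{hg}),\\
o_t &= \sigma(W_{io} x_t + b_{io} + W_{ho} h_{t-1} + b_{ho}),\\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t,\\
h_t &= o_t \odot \tanh(c_t),
\end{aligned}
$$

where $h_t$ is the hidden state at time $t$, $c_t$ is the cell state, $x_t$ is the input, $h_{t-1}$ is the hidden state of the layer at time $t-1$ or the initial hidden state at time 0, and $i_t$, $f_t$, $g_t$, and $o_t$ are the input, forget, cell, and output gates, respectively. $W$ and $b$ denote the weights and biases of each gate, $\sigma$ is the sigmoid function, and $\odot$ is the Hadamard product. The Adam (adaptive moment estimation) method [27] is used for optimization; it is simple to implement, efficient, consumes little memory, and is suitable for problems with large gradient noise. The loss function is the mean square error (MSE):

$$\mathrm{MSE} = \frac{1}{M}\sum_{i=1}^{M}\left(y_i - \hat{y}_i\right)^2,$$

where $M$ is the total number of samples, $y_i$ is the actual output value, and $\hat{y}_i$ is the predicted output value.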
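To make the gate computations and the loss concrete, the following is a minimal sketch of a single LSTM time step and the MSE in pure Python. It only illustrates the standard LSTM cell under stated assumptions; the parameter layout (`params[gate] = (W_x, W_h, b)`, with the two bias vectors merged into one) and all function names are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step. params[gate] = (W_x, W_h, b) for gate in 'ifgo'."""
    pre = {}
    for gate in "ifgo":
        W_x, W_h, b = params[gate]
        pre[gate] = [
            a + r + bias
            for a, r, bias in zip(matvec(W_x, x), matvec(W_h, h_prev), b)
        ]
    i = [sigmoid(z) for z in pre["i"]]      # input gate
    f = [sigmoid(z) for z in pre["f"]]      # forget gate
    g = [math.tanh(z) for z in pre["g"]]    # cell candidate
    o = [sigmoid(z) for z in pre["o"]]      # output gate
    c = [ft * cp + it * gt for ft, cp, it, gt in zip(f, c_prev, i, g)]
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c


def mse(y_true, y_pred):
    """Mean square error over M samples."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


# Hypothetical example: input size 2, hidden size 1, all parameters zero.
zero_params = {gate: ([[0.0, 0.0]], [[0.0]], [0.0]) for gate in "ifgo"}
h1, c1 = lstm_step([1.0, 2.0], [0.0], [1.0], zero_params)
```

With all parameters zero, every sigmoid gate equals 0.5 and the cell candidate is 0, so the new cell state is half of the previous one; this shows how the forget gate scales the stored memory.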

The Proposed Method
Generally speaking, the rocket works in stages during launch, and the length is changing. Therefore, it is necessary to study the dynamic characteristics of beams with varying length. A cantilever beam with different length is taken as an example to verify the proposed method. The flowchart of the proposed method is shown in Figure 1.

Dataset Processing.
Here, cantilever beams with 11 different lengths are used for numerical simulation, and each beam is divided into 10 elements (as shown in Figure 2 and Table 1). The construction of the dataset is the first and foremost step of network training, and data preprocessing is particularly important before the data enter the model. The acceleration response signal preprocessed by RDT, written RDT[·] as described in Section 2.1, and the analytical solution are regarded as the input and output data of the network, respectively. More specifically, the dataset is composed of 11000 samples, and each sample is a two-dimensional matrix. The ratio of training to testing data is 8 : 2.
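Since the analytical solution serves as the network output, a short sketch of how the natural frequencies of a uniform cantilever beam can be computed may be helpful. It assumes Euler-Bernoulli beam theory, for which the frequency equation is cos(βL)·cosh(βL) + 1 = 0, and solves it by bisection. The material and cross-section values are hypothetical, because the beam parameters of Table 1 are not reproduced here; only the 0.8 m length appears in the text.

```python
import math


def cantilever_roots(n_modes, tol=1e-12):
    """Roots beta*L of cos(x)*cosh(x) + 1 = 0, found by bisection."""
    def f(x):
        return math.cos(x) * math.cosh(x) + 1.0

    roots = []
    for n in range(1, n_modes + 1):
        # The n-th root lies between (n - 1)*pi and n*pi.
        lo, hi = (n - 1) * math.pi, n * math.pi
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if f(lo) * f(mid) <= 0.0:
                hi = mid
            else:
                lo = mid
        roots.append(0.5 * (lo + hi))
    return roots


def natural_frequencies_hz(length, youngs_modulus, second_moment,
                           density, area, n_modes=4):
    """f_n = (beta_n L)^2 / (2 pi L^2) * sqrt(E I / (rho A))."""
    stiffness_term = math.sqrt(youngs_modulus * second_moment / (density * area))
    return [
        (root ** 2) / (2.0 * math.pi * length ** 2) * stiffness_term
        for root in cantilever_roots(n_modes)
    ]


# Hypothetical aluminium beam with a rectangular cross-section.
E, rho = 70e9, 2700.0            # Pa, kg/m^3 (assumed)
width, thickness = 0.03, 0.003   # m (assumed)
I = width * thickness ** 3 / 12.0
A = width * thickness
freqs = natural_frequencies_hz(0.8, E, I, rho, A)
```

Because the frequencies scale with 1/L², halving the beam length multiplies every natural frequency by four, which is why beams of different lengths give clearly separated training targets.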

The Encoder LSTM Model.
A transformer encoder layer is made up of a self-attention block and a feedforward network. The encoder extracts the essential features of the raw data, so only a small neural network is needed to learn them, which both reduces the burden on the network and achieves good results. After being processed by the transformer encoder layer, the dataset is written as $g = [g_1, g_2, \cdots, g_t, \cdots]$. Then, $g$ is substituted into the LSTM layer for calculation.
Here, LSTM[·] denotes the LSTM network calculation, which is detailed in Section 2.2. Finally, in the fully connected layer, $L$ and $Y$ are the input and output data, respectively. The PReLU function is selected in the fully connected layer, which is characterized by fast convergence and simple gradient calculation. In brief, the proposed encoder LSTM model consists of the transformer encoder layer, the LSTM layers, and the fully connected layer (Figure 1). For convenience, we use E, L, and F to represent the transformer encoder layer, the LSTM layer, and the fully connected layer, respectively. Given the data, the model first uses the transformer encoder layer E1 to learn features, where the number of expected features in the input is 512 and the number of heads in the multihead attention model is 8. Then, the features in E1 are input into the LSTM layer.
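The remark about the simple gradient of PReLU can be illustrated with a minimal sketch. It assumes the standard definition PReLU(x) = max(0, x) + a·min(0, x) with a learnable slope a; the initial slope of 0.25 and the form of the fully connected layer, Y = PReLU(W·L + b), are assumptions for illustration and are not taken from the paper.

```python
def prelu(x, a=0.25):
    """PReLU activation: identity for x > 0, slope a otherwise."""
    return x if x > 0 else a * x


def prelu_grad(x, a=0.25):
    """Return (d/dx, d/da) of PReLU at x; both are piecewise constant or linear."""
    return (1.0, 0.0) if x > 0 else (a, x)


def dense_prelu(weights, bias, inputs, a=0.25):
    """Hypothetical fully connected layer followed by PReLU."""
    return [
        prelu(sum(w * v for w, v in zip(row, inputs)) + b, a)
        for row, b in zip(weights, bias)
    ]


# Hypothetical example: one output unit, two inputs.
out = dense_prelu([[1.0, -1.0]], [0.0], [1.0, 3.0])
```

In this sketch the negative pre-activation -2 is scaled by the slope instead of being set to zero, so the unit still passes a gradient for negative inputs.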

Results.
Here, the encoder LSTM was established by repeated training with 100 iteration steps and a learning rate of 0.001. The finite element method can be used directly to solve for the modal parameters of a beam [28,29]. In addition, the natural frequency equation can be determined from the vibration differential equation [30] and the boundary conditions, from which the natural frequencies of the beam are obtained. The analytical solution is taken as the output of the network and compared with the finite element solution (as shown in Figures 3 and 4). The beam with a length of 0.8 m is taken as an example. The signal-to-noise ratio (SNR) [31] is a common index for evaluating the strength of noise in a signal; the more noise the signal contains, the smaller the SNR:

$$\mathrm{SNR} = 10\log_{10}\frac{P_{\mathrm{signal}}}{P_{\mathrm{noise}}},$$

where $P_{\mathrm{signal}}$ is the power of the effective signal and $P_{\mathrm{noise}}$ is the power of the noise in the signal. To test the anti-noise ability of the proposed method, noise with different SNRs is added to the response data of the beam. Then, the data are preprocessed to establish a dataset. Finally, the dataset is substituted into the model for training and testing.
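The noise-contamination step can be sketched as follows, assuming the usual decibel definition of the SNR as ten times the base-10 logarithm of the signal-to-noise power ratio, and zero-mean Gaussian noise. The function names, the test signal, and the SNR levels are hypothetical and only serve as an illustration.

```python
import math
import random


def power(x):
    """Mean power of a sampled signal."""
    return sum(v * v for v in x) / len(x)


def add_noise(signal, target_snr_db, rng):
    """Add Gaussian noise scaled so that the SNR equals target_snr_db."""
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    target_power = power(signal) / (10.0 ** (target_snr_db / 10.0))
    scale = math.sqrt(target_power / power(noise))
    noise = [scale * n for n in noise]
    return [s + n for s, n in zip(signal, noise)], noise


def measured_snr_db(signal, noise):
    return 10.0 * math.log10(power(signal) / power(noise))


# Hypothetical clean response: a 5 Hz sine sampled at 200 Hz.
rng = random.Random(0)
clean = [math.sin(2.0 * math.pi * 5.0 * k / 200.0) for k in range(2000)]
noisy_20, noise_20 = add_noise(clean, 20.0, rng)
noisy_5, noise_5 = add_noise(clean, 5.0, rng)
```

Scaling the generated noise by its own sample power, rather than relying on its nominal variance, makes the realized SNR match the target exactly for each record.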

$$\mathrm{Error} = \frac{\text{Estimated value} - \text{Reference value}}{\text{Reference value}} \times 100\%. \tag{19}$$

The MSE decreases sharply over the first 10 steps, is close to the optimal value by the 20th step, and reaches the order of $10^{-5}$ at the 50th and 100th steps, which indicates that the proposed method converges quickly (as shown in Figure 5). In addition, the results for the beam under different SNRs are the same, which indicates that the proposed method has strong anti-noise performance.

Dataset Processing.
A slender aluminum beam (as shown in Figure 6) is selected as the experimental specimen. The shaker table provides a base excitation along the Y direction. The sixth acceleration sensor measures the excitation signal, including white noise and non-white noise  Table 2. The acquisition equipment is the Agilent VXI plug and play system. Sensors are the sensory organs of various mechanical and electronic devices. Without sensors to capture and convert the original information accurately and reliably, no measurement or control can be realized [32,33]. The sensor type is PCB 333B32 SN 25222.
The conventional OMA method usually assumes that the excitation of the structure has a uniform spectrum. However, in the operational state of structures, such as the flight of aerospace vehicles, traffic passing over bridges, wind loads, or the ground pulsation acting on high-rise structures, the ambient excitation is mostly nonuniform. All these states restrict the application and affect the accuracy of the conventional OMA method. Therefore, modal analysis must be conducted under a nonuniform excitation spectrum. The four typical non-white noises are blue noise, pink noise, purple noise, and brown noise. In order to study different non-white noise excitations more comprehensively, the excitation spectra of the four typical colored noises and white noise are mixed to excite the structure. Here, two typical trapezoidal-spectrum ambient excitations are used to excite the beam, and the vibration environment of the trapezoidal-spectrum base excitation is controlled by the shaker table controller. Excitation spectrum 1 can be regarded as a combination of blue noise, narrow-band white noise, and pink noise. Excitation spectrum 2 can be regarded as a combination of purple noise, narrow-band white noise, and brown noise. More specifically, the inflection frequencies of the spectra are 10, 100, 600, and 1000 Hz, respectively (as shown in Figure 7). The acceleration response signals generated under the white noise excitation spectrum, excitation spectrum 1, and excitation spectrum 2 are denoted as data 1, data 2, and data 3, respectively. Under laboratory conditions, the method that uses the simultaneously measured excitation and response signals of the structure to obtain the transfer rate function of the system for parameter identification is called the experimental modal analysis method (the results obtained by this method are referred to as the expected output).
The acceleration response signals preprocessed by RDT are still used as the input of the network, but the difference is that the output data of the network are the results of experimental modal analysis.
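The two trapezoidal excitation spectra described in this subsection can be sketched as piecewise power-law functions of frequency. The sketch assumes the common conventions that the power spectral density of blue noise is proportional to f, purple noise to f², pink noise to 1/f, and brown noise to 1/f², with a flat band of unit level between the middle inflection frequencies. The slopes and level actually programmed into the shaker controller are not given in the text, so these values are assumptions.

```python
# Inflection frequencies in Hz, as stated in the text.
BREAKS = (10.0, 100.0, 600.0, 1000.0)


def trapezoid_psd(f, rise_exp, fall_exp, level=1.0):
    """Piecewise PSD: rising power law, flat band, falling power law."""
    f1, f2, f3, f4 = BREAKS
    if f < f1 or f > f4:
        return 0.0
    if f < f2:
        return level * (f / f2) ** rise_exp       # rising edge
    if f <= f3:
        return level                              # narrow-band white part
    return level * (f / f3) ** (-fall_exp)        # falling edge


def spectrum_1(f):
    """Blue-like rise, flat band, pink-like fall (assumed exponents of 1)."""
    return trapezoid_psd(f, rise_exp=1.0, fall_exp=1.0)


def spectrum_2(f):
    """Purple-like rise, flat band, brown-like fall (assumed exponents of 2)."""
    return trapezoid_psd(f, rise_exp=2.0, fall_exp=2.0)


# Sample both spectra on a coarse, hypothetical frequency grid.
grid = [10.0, 50.0, 100.0, 300.0, 600.0, 800.0, 1000.0]
table = [(f, spectrum_1(f), spectrum_2(f)) for f in grid]
```

Both spectra share the same flat band and differ only in how steeply the edges rise and fall, which is the feature that distinguishes data 2 from data 3 in this sketch.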

Model Training and Results.
Under the laboratory conditions described in Section 4.1, NExT-ERA, NExT-ARMA, data-driven SSI (SSI-DATA) [34], covariance-driven SSI (SSI-COV) [35], frequency and spatial domain decomposition (FSDD) [36], and other methods are used to identify the modal parameters of data 1, and the results are compared with those of the proposed method. The natural frequencies identified by the proposed method are consistent with the expected output, so the recognition accuracy of the proposed method is higher than that of the other methods (as shown in Figure 8). The damping is greatly affected by external noise, and the results of the proposed method are similar to those of the NExT-ARMA and FSDD methods (as shown in Figure 9). The modal shape is consistent with the actual situation (as shown in Figure 10). A mode is a natural vibration characteristic of the structure. Each mode has a specific natural frequency, damping ratio, and modal shape, so the modal parameters of the structure do not change with the excitation. Nevertheless, taking EFDD and FSDD as examples, modal leakage and false modes appear in the parameter identification of data 2 and data 3 (as shown in Figure 11), and the results are inconsistent with those for data 1, which suggests that the conventional OMA method is not suitable for non-white noise excitation.
In order to test the performance of the proposed method, different network structures are used to train and test the same data, and the data includes data 1, data 2, and data 3. More specifically, Model 1 means that the data is not processed by RDT, but directly trained by RNN. Model 2 means that the data is not processed by RDT but directly trained by the encoder LSTM.
The loss of Model 1 shows an obvious bulge at step 17, and that of Model 2 at step 25, which indicates that these networks do not train well (as shown in Figure 12). In contrast, the loss function of the proposed method is smooth, and the results are consistent with Figure 8, which shows that the network trains well and that the modal parameters of the data are identified accurately. Moreover, the MSE of the proposed method decreases sharply over the first 10 steps, is close to the optimal value by the 20th step, and reaches the order of $10^{-5}$ at the 50th and 100th steps, which shows that the proposed method converges quickly. The proposed method therefore has strong generalization ability.

Conclusion
An adaptive operational modal analysis method using a transformer encoder and LSTM is proposed and applied to extract the modes from the acceleration response of a cantilever beam model. Simulation and experimental results show that the proposed method has the advantages of strong anti-noise ability, fast convergence, and high accuracy, which provides a new method for the application of modal analysis in engineering.
(a) In the simulation, the proposed method is used to identify the response data with noise of different SNR, and the results are the same, which proves that the method has strong anti-noise ability.
(b) In the experiment, different treatment methods are used for the beam, and the recognition results show that the proposed method is the best. Furthermore, the loss function of the proposed method decays rapidly over the first 10 steps and approaches the optimal value at 20 steps, and the MSE at the 50th and 100th steps is of the order of $10^{-5}$, which shows that the proposed method has a fast convergence rate.
(c) In the experiment, compared with other conventional methods, the proposed method has higher recognition accuracy for the data 1. In addition, the result of the data recognition by the proposed method is consistent with that of the data 1, and the convergence speed is fast, which shows that the method has strong generalization ability.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.