Application of Principal Component Analysis-Assisted Neural Networks for the Rotor Blade Load Prediction

This paper presents a novel approach that uses principal component analysis- (PCA-) assisted back propagation (BP) neural networks for rotor blade load prediction. A total of 86.5 hours of real flight data were collected from many steady-state and transient flight maneuvers at different altitudes and airspeeds. The blade loads were predicted by the PCA-BP model from 16 flight parameters measured and monitored by the flight control computer already present in the helicopter. PCA was applied to reduce the dimension of the flight parameters influencing the component load and to eliminate the correlation among them. The principal components thus obtained were used as input vectors of the BP neural network. The combined PCA-BP neural network model was trained and tested with real flight data and compared with a pure BP neural network model and a multiple linear regression (MLR) model. The comparison demonstrates that the PCA-BP model has higher prediction precision, with an average error of 2.46%, compared with 4.49% for BP and 10.20% for MLR. The results also reveal that the PCA-BP model has a shorter convergence path than the BP model. This method is useful for establishing the load spectra of in-service helicopter rotors where installation of strain gauges is impractical, and it can also reduce the cost of installing and maintaining strain gauges.


Introduction
The fatigue life of helicopter structures plays an essential role in helicopter maintenance, as helicopter components can suffer fatigue damage under the loads of various working conditions. Reliable measurement and prediction of fatigue damage is therefore key to the timely replacement of components and to compliance with the Strength Regulations for Military Helicopters. The common practice in determining the fatigue life applies the Palmgren-Miner cumulative damage theory, which takes the loads in different maneuvers as input data [1][2][3].
A common method of obtaining the component loads is direct measurement with installed strain gauges. Although this method yields accurate component loads, it has several disadvantages. Firstly, it can be expensive considering installation and maintenance costs [2]. Secondly, it is often impractical because strain gauges normally cannot be mounted on an in-service helicopter [4]. Thirdly, the installed strain gauges occupy space in the helicopter [5]. In addition, it often takes more than one year to collect the data, followed by lengthy analysis, because the load spectrum involves a series of flight maneuvers [6].
Researchers have therefore tried to predict component loads with mathematical models and with intelligent models. A mathematical model is an important method for the prediction of rotor blade loads. However, the variables and the component loads are intertwined in such a complex way that a precise mathematical model is currently still not an option [7]. Therefore, many studies have proposed predicting component loads with an intelligent model that can capture this complex relationship and improve the prediction precision. For example, the component loads can be predicted by a linear regression or neural network model from the roll, pitch, airspeed, and other flight parameters measured by the flight control computer on the helicopter [8].
Cabell et al. predicted oscillatory loads using a neural network for on-line health monitoring of flight-essential components in an AH-64A helicopter. A linear model was also used to predict the load, and its accuracy was clearly lower than that of the neural network, especially at the higher load values that cause fatigue damage [9]. Haas et al. used a more elaborate method to derive load models of the helicopter rotor system. Both linear regression and neural networks were used to create load models. Nonlinear effects were found in the data and were explained either by a linear model with derived parameters created by a nonlinear mapping function or by a neural network model [10]. Allen and Dibley applied neural networks to model the wing loads of aircraft. A linear model was established first as the starting point of the network training, which improved the accuracy of the neural network model [11]. Gómez-Escalonilla et al. developed a parametric full-scale fatigue monitoring system for an Airbus A330 using an artificial neural network with several strain gauges installed on some areas of the wings and fuselage [4]. Cooper and DiMaio predicted the static load on an aircraft wing rib using an artificial neural network, with strain values obtained from a static test as the input parameters [7]. Elshafey et al. investigated the use of neural networks to predict the response of structures [12]. De Paula et al. predicted nonlinear unsteady aerodynamic loads for a NACA0012 airfoil using neural networks [13].
Although the accuracy of the fatigue damage estimation in these papers is still insufficient, encouraging progress has been made. However, the methods developed in these papers generalize poorly when the influencing factors are fed to the neural network directly. One way to solve this problem is to preprocess the data with the principal component analysis (PCA) method. The original data of the influencing factors are very likely intercorrelated and possibly redundant. PCA is designed to reduce the data to the most compact and independent set while retaining most of the information in the data. Compared with the original data, data processed with PCA reduce the complexity required of the neural network. In this paper, a combination of PCA and the back propagation (BP) neural network method was tested and compared with the pure BP method, as well as with multiple linear regression (MLR). The results show a clear improvement in the accuracy of fatigue-damage estimation by the combined method of BP with PCA.
2. PCA and BP Neural Network Structure
2.1. Principal Component Analysis. Principal component analysis is a data-dimension-reduction technique. By constructing an appropriate set of linear combinations of the original indicators, a series of uncorrelated comprehensive indicators is produced, from which a few are selected such that they retain as much of the information contained in the original indicators as possible. Thus, it is possible to use fewer indicators to explain most of the variation in the original data [14].
Suppose a data set has $n$ samples, each expressed by an $m$-dimensional transposed vector $(x_1, x_2, \cdots, x_m)$, so the original data matrix can be represented by an $n \times m$ matrix $X = (x_{ij})_{n \times m}$, with each row representing a sample. The PCA consists of the following steps, which are described in Refs. [15][16][17][18].
2.1.1. Normalization of Raw Data. The original variables need to be standardized to eliminate the influence of different dimensions, which can be done with two equations in which $\min(x^*_j)$ stands for the smallest of the $n$ entries of the $j$th column in the matrix $X$. The logic behind this choice is that the numbers in the same column quantify the same characteristic feature. Equivalently, the normalization can be written in a more compact form, in which $(\min(x^*_j))_n$ represents a vector with $n$ entries which are all equal to $\min(x^*_j)$.
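As a sketch of the normalization step, one common form consistent with the definitions above is min-max scaling. The use of $\max(x^*_j)$ in the denominator and the primed notation $x'_{ij}$, $x'_j$ for the normalized entries and columns are assumptions made here for illustration; the elementwise form comes first and the compact column form second:

```latex
% Assumed min-max scaling; x^*_j denotes the j-th column of X
x'_{ij} = \frac{x_{ij} - \min(x^*_j)}{\max(x^*_j) - \min(x^*_j)},
\qquad i = 1, \ldots, n, \quad j = 1, \ldots, m

% Compact column form, with (\min(x^*_j))_n the constant n-vector
x'_j = \frac{x^*_j - (\min(x^*_j))_n}{\max(x^*_j) - \min(x^*_j)}
```

With this choice every column is mapped to the interval [0, 1], so parameters with large physical units do not dominate the later steps.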

Calculate the Correlation Coefficient Matrix R.
The correlation coefficient matrix $R$ is calculated from the standard definition of the correlation between two sets of data, in which $\sigma(x^*_s)$ and $\mathrm{mean}(x^*_s)$ stand for the variance and average of the data, respectively. Note that $r_{ss} = 1$ and $r_{sq} = r_{qs}$.
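The construction of $R$ can be sketched in a few lines of Python. This is a minimal illustration that assumes the Pearson correlation coefficient as the "standard definition"; the function names and the toy data are hypothetical and are not taken from the flight data.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def correlation_matrix(columns):
    """Build the m x m matrix R from a list of m data columns."""
    m = len(columns)
    return [[pearson(columns[s], columns[q]) for q in range(m)] for s in range(m)]

# Hypothetical toy data: three columns of four samples each.
cols = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with the first column
    [4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated with the first column
]
R = correlation_matrix(cols)
```

The diagonal entries of `R` equal 1 and the matrix is symmetric, matching the two properties noted above.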

Calculate the Eigenvalues and Feature Vector.
Calculate the eigenvalues $\lambda_i$ $(i = 1, 2, \cdots, m)$ of matrix $R$ and arrange them by size, namely, $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_m \geq 0$. Then, the feature vectors $u_i$ are the eigenvectors associated with the eigenvalues $\lambda_i$. Specifically, $R u_i = \lambda_i u_i$.
International Journal of Aerospace Engineering
In case of degeneracy, we assume that all real eigenvectors can still be found. The principal components can then be expressed as linear combinations of the standardized variables with the feature vectors as coefficients, where $F_1$ denotes the first principal component, $F_2$ the second principal component, and $F_m$ the $m$th principal component.
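As a sketch of how the principal components follow from the feature vectors, each component can be written out term by term. The notation $x'_j$ for the $j$th standardized variable and $u_{ij}$ for the $j$th entry of $u_i$ is introduced here for illustration and is an assumption rather than the paper's own notation:

```latex
% Assumed notation: x'_j is the j-th standardized variable, u_{ij} the j-th entry of u_i
F_i = u_{i1} x'_1 + u_{i2} x'_2 + \cdots + u_{im} x'_m,
\qquad i = 1, 2, \ldots, m
```

Because the eigenvectors of a symmetric matrix such as $R$ can be chosen mutually orthogonal, the components $F_i$ built this way are uncorrelated with one another.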

Extract p Principal Components.
First, the principal component contribution rate and cumulative contribution rate are computed. Then, $p$ principal components are extracted based on the cumulative contribution rate. The contribution rate $b_j$ of the $j$th principal component and the cumulative contribution rate $a_p$ are calculated by $b_j = \lambda_j / \sum_{k=1}^{m} \lambda_k$ and $a_p = \sum_{j=1}^{p} b_j$. The 1st, 2nd, …, $p$th $(p \leq m)$ principal components are taken when the cumulative contribution rate $a_p$ reaches 90%. Finally, the original samples are expressed in terms of these $p$ principal components.
2.2. BP Neural Network. The BP network is a popular neural network technique for supervised learning. Its general structure is depicted in Figure 1. As shown in Figure 1, there are three layers in a BP model: the input layer, the hidden layer, and the output layer. Nodes in adjacent layers are connected directly, and each connection is called a link. Each link has a weight representing the strength of the relationship between the two nodes. Suppose there are $n$ input neurons, $m$ hidden neurons, and one output neuron. A training process can be divided into two steps.

Hidden Layer Stage.
Calculate the outputs of all neurons in the hidden layer, where $v_{ij}$ is the weight from the $i$th neuron node in the input layer to the $j$th node in the hidden layer, $\mathrm{net}_j$ is the activation value of the $j$th node, $y_j$ is the output of the hidden layer, and $f_H$ is the activation function of a node. The activation function is usually a sigmoid function, expressed by formula (11).

In the output layer stage, the network output is calculated from the hidden-layer outputs, where $\omega_{jk}$ is the weight from the $j$th neuron node in the hidden layer to the $k$th node in the output layer and $f_o$ is the activation function, which is generally linear. All weights are initially assigned random values. They are then modified according to the learning samples, traditionally by the delta rule [19][20][21][22].
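The two stages can be sketched as a forward pass in Python. This is a minimal illustration under stated assumptions: the logistic form of the sigmoid is assumed for $f_H$, the output activation is the identity, bias terms are left out, and the weights are hypothetical values chosen so the result is easy to check by hand.

```python
import math

def sigmoid(x):
    """Logistic sigmoid, assumed here as the hidden-layer activation f_H."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, v, w):
    """Forward pass of a network with one hidden layer and one linear output.

    x: list of n inputs
    v: n x m nested list, v[i][j] is the weight from input i to hidden node j
    w: list of m weights from the hidden nodes to the single output node
    Bias terms are omitted for brevity.
    """
    m = len(w)
    # Hidden layer stage: net_j = sum_i v_ij * x_i, then y_j = f_H(net_j)
    net = [sum(v[i][j] * x[i] for i in range(len(x))) for j in range(m)]
    y = [sigmoid(nj) for nj in net]
    # Output layer stage: linear activation applied to the weighted sum
    return sum(w[j] * y[j] for j in range(m))

# Hypothetical weights for a 2 x 2 x 1 network.
v = [[0.0, 1.0],
     [0.0, -1.0]]
w = [2.0, 4.0]
out = forward([1.0, 1.0], v, w)
```

Training then adjusts `v` and `w` to reduce the difference between `out` and the measured target; that update step is not shown here.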

Experimental Data
The flight load data and flight parameter data were collected by a specially instrumented helicopter. Rotor blade load data at the root position were collected through the use of strain gauges. The location of the strain gauges on the blade is shown in Figure 2. These strain gauges are temperature compensated to account for the expected temperature changes with altitude, and a special coating on each strain gauge provides mechanical and environmental protection.
A ground test was conducted to calibrate the strain gauges' output, as shown in Figure 3. The strain gauges were mounted on the root area of the blade, which is prone to cracking, and their outputs were measured. Static load data were taken from a blade supported in a frame. Weights were applied to the blade in specifically designed ways, and the blade's response to the weights, i.e., the strain values, was measured. To avoid exceeding the destructive limit, the strain on the blade was measured continuously with the strain gauges at a set frequency, and the data were recorded for later use. The measured gauge outputs and load values were used to develop equations, by linear regression, for calculating blade loads from the strain gauge output. As shown in Figure 4, the correlation coefficient of the load equation in this ground test reached 0.9998. This equation was later used to determine the blade loads during flight.
The blade is a rotating component. To collect the strain gauge signals, strain test equipment was designed and installed at the top of the rotor hub, where it rotates synchronously with the hub. The strain signal is transmitted wirelessly to the recording equipment in the cabin. The strain test scheme for the rotating blade is shown in Figure 5. The airspeed, roll, pitch, altitude, acceleration, overload, and other flight parameters were measured by the flight control computer already present in the helicopter.
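The calibration step can be illustrated with a short least-squares fit. This is a sketch only: it assumes a single gauge channel and a straight-line load equation, and the calibration points are hypothetical values rather than the ground-test data.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, plus Pearson r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical calibration points: gauge output vs. applied load (arbitrary units).
gauge_output = [0.0, 1.0, 2.0, 3.0, 4.0]
applied_load = [1.0, 3.0, 5.0, 7.0, 9.0]   # exactly load = 2 * output + 1

slope, intercept, r = fit_line(gauge_output, applied_load)

def load_from_gauge(output):
    """Convert a gauge reading to a load with the fitted calibration line."""
    return slope * output + intercept
```

A correlation coefficient close to 1, as reported for the ground test, indicates that a linear load equation describes the calibration data well.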
The experimental data were synchronized in time using measured time tags from the helicopter.
The data were obtained in flight from a series of maneuvers. Sensor readings were recorded for roll, left turn, hovering, sideslip-left, sideslip-right, push-pull, and other maneuvers. The maneuvers, and the speeds and altitudes at which they were performed, were chosen to cover the full range of helicopter motions and a considerable portion of its flight envelope. In total, 86.5 hours of helicopter data were collected. The percentage of the flight envelope that was covered was considered sufficient for establishing a strain prediction method.
The rotor speed of the helicopter is 212 revolutions per minute. Therefore, a sampling rate of 256 Hz for the strain data was considered sufficient to capture the associated strain variation. The flight parameters, however, were recorded at 32 Hz. The output of the chosen model is paired with the input data at each moment describing the state of the blade and its load. Therefore, the measured data that are compared with the model output (or used to train the model) must correspond to the input data for the same moment. Put another way, all the data, whether directly measured or generated by the model, must be matched at the same time points.
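The sampling figures and the time matching can be made concrete with a short sketch. The rates and rotor speed come from the text; the alignment method is an assumption, since the text does not state how the 256 Hz and 32 Hz streams were matched. Plain decimation (keeping every eighth strain sample) is used here purely for illustration.

```python
ROTOR_RPM = 212          # rotor speed stated in the text
STRAIN_HZ = 256          # strain-gauge sampling rate
PARAM_HZ = 32            # flight-parameter sampling rate

rev_per_second = ROTOR_RPM / 60.0                 # about 3.53 rev/s
samples_per_rev = STRAIN_HZ / rev_per_second      # about 72 strain samples per revolution
ratio = STRAIN_HZ // PARAM_HZ                     # 8 strain samples per parameter sample

def align(strain, params):
    """Pair each flight-parameter sample with the strain sample at the same instant.

    Assumes both streams start at the same time tag and that keeping every
    `ratio`-th strain sample (plain decimation) is acceptable.
    """
    decimated = strain[::ratio]
    n = min(len(decimated), len(params))
    return list(zip(params[:n], decimated[:n]))

# One second of hypothetical data: sample indices stand in for real readings.
strain = list(range(STRAIN_HZ))
params = list(range(PARAM_HZ))
pairs = align(strain, params)
```

Other matching schemes, such as taking a per-revolution peak or average of the strain signal, would pair the data differently; which one applies here is not specified in the text.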
An example of the raw data from the helicopter is shown in Figures 6 and 7, which give the time history curves of blade loads during an asymmetric push-pull maneuver. According to Figure 6, the waveform is smooth, periodic, and without data spikes. When a helicopter performs a steady flight maneuver, the aerodynamic load of the blade changes periodically with the azimuth of the rotor blade. Therefore, the load-time curve of the blade shows a periodic trend. The rotor speed is 212 revolutions per minute, so the rotor completes 3.53 revolutions per second. The blade load data in a hovering maneuver are periodic waves corresponding to the rotor shaft's rotation rate, which is consistent with the theoretical analysis, while the load waveform during a symmetric push-pull maneuver in Figure 7 changes with the flight parameters. The time history curves of some flight parameter data are shown in Figures 7 and 8. Those waveforms are smooth and show no data spikes. Therefore, the data satisfy the model's requirements for variables, because data spikes, noisy data, or invalid data may lead to abnormal training of neural networks, thus hindering the generalizability of the models. They are only a small
The 86.5 hours of flight data were arbitrarily divided into training data and test data, accounting for about 98% and 2%, respectively. It should be noted that the training data and test data were recorded on two different days to provide differing flight conditions, which helps determine how generalizable the models are and whether they can work in significantly different operating circumstances.
directly affects the prediction accuracy of the rotor blade load model. For the load model of the rotor blade, the output variable is the load of the rotor blade, and the input variables are determined by the flight parameters.
Sixteen flight parameters were chosen as inputs to the model. These 16 flight parameters were selected because they describe the status of the helicopter during flight. In addition, they are monitored and recorded by the flight control computer on the helicopter, so a flight load prediction system using these variables is convenient to implement by tapping into data already available on the helicopter. The parameters are height, indicated airspeed, collective pitch position, longitudinal cyclic pitch position, lateral cyclic pitch position, pedal position, pitch angle, roll angle, angle of yaw, pitch rate, roll rate, yaw rate, normal acceleration, first engine torque, second engine torque, and third engine torque. These 16 variables, as well as their minimum, maximum, units, and symbols, are shown in Table 1.
Table 2: Correlation analysis of the process variables.
The correlation analysis is given in Table 2. Only data in the lower triangle are shown, as the matrix is symmetric. The absolute value of the correlation coefficient between a given factor and the blade loads is a good measure of the latter's dependence on the former. From Table 2, it can be seen that the coefficients have the following order: $|X_{13}| > |X_{10}| > |X_{14}| > |X_{16}| > |X_{15}| > |X_{1}| > |X_{3}| > |X_{2}| > |X_{6}| > |X_{5}| > |X_{11}| > |X_{4}| > |X_{9}| > |X_{7}| > |X_{8}| > |X_{12}|$. This indicates that the blade loads have a strong dependence on the normal acceleration and pitch rate, while the reliance on the yaw rate is negligible.

Establishment of Model
The data in Table 2 illustrate that the 16 factors of the blade load are strongly correlated. Directly inputting the 16 factors into a neural network is possible but may not produce the best predictive power. Because of the large volume of data and the correlation among the factors, the structure of the neural network would be complicated, the training effort would be high, and the training could easily fall into local minima, which would eventually lead to low generalizability. Therefore, PCA was employed to extract the most relevant and independent factors influencing the loads before feeding data into a BP neural network. As a result, fewer but more effective factors were obtained, and the size (and thus the computational load) of the neural network was significantly reduced.
The result of PCA for the 16 blade-load factors is presented in Table 3, which lists the eigenvalue and variance of each factor and the cumulative variance. The criteria for the factors to be kept are that (1) they form the most economical combination and (2) they explain more than 90 percent of the total variance of independent observations. According to these criteria, four principal components were eventually selected, as specified in formulas (14)-(17). The corresponding percentage data are shown in Table 4. The four principal components have a cumulative contribution of 90.807 percent, which is sufficient to cover the characteristics of the original variables. Further, the correlation coefficients among the four selected principal components vanish. Thus, the purpose of eliminating the correlation between the original variables was achieved by PCA.
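The 90 percent selection rule can be sketched as follows. The function name and the eigenvalues are hypothetical and serve only to illustrate the rule; they are not the values of Table 3. Each contribution rate is taken as an eigenvalue divided by the sum of all eigenvalues.

```python
def select_components(eigenvalues, threshold=0.90):
    """Return (p, contribution rates, cumulative rates) for the smallest p
    whose cumulative contribution rate reaches the threshold."""
    total = sum(eigenvalues)
    ordered = sorted(eigenvalues, reverse=True)
    rates = [lam / total for lam in ordered]
    cumulative = []
    running = 0.0
    for b in rates:
        running += b
        cumulative.append(running)
    for p, a in enumerate(cumulative, start=1):
        if a >= threshold:
            return p, rates, cumulative
    return len(ordered), rates, cumulative

# Hypothetical eigenvalues of a 5 x 5 correlation matrix (they sum to 5).
eigs = [2.5, 1.5, 0.6, 0.3, 0.1]
p, rates, cumulative = select_components(eigs)
```

In this toy case the first two components reach only 80 percent, so a third is needed; applied to the 16 flight parameters, the same rule yields the four components reported above.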

Establishment of the BP Neural Network Model.
The MLR model can predict blade loads, and the BP neural network can also be designed to do the same. To achieve optimal performance, the weights and biases of the neurons are iterated until the error in the test data calculated from equation (18) reaches a predetermined level. Parametric studies show that a single-hidden-layer network is a good tradeoff between the complexity and performance of a network, mainly because increasing the number of layers and expanding the layer size have a limited effect on the performance of the network. For the hidden layer and the output layer activation functions, we have assigned a sigmoid function and a linear function, respectively. This choice is based on the analysis of suitable networks for data with internal coherence [25]. To train the network efficiently, we have adopted the Levenberg-Marquardt (LM) algorithm, which has been shown to converge faster than other algorithms [26].
Since PCA is not used to preprocess the data in this model, the 16 input variables correspond to 16 input neurons, and the one output variable corresponds to one output neuron. The training data and test data are the same as those used for the MLR model, and they were normalized before being fed into the network. The error goal of the BP model was set to 0.0001, and the learning rate was set to 0.01.
The number of neurons M in the hidden layer has a crucial influence on the performance of the model. In general, increasing the number of hidden layer neurons reduces the training error of the network. However, too many hidden layer neurons not only increase the complexity and training time of the network but may also lead to overfitting.

In this article, M was obtained from an empirical formula in which A is the number of input layer neurons, B is the number of output layer neurons, and C is a constant from 1 to 10 [27]. According to the empirical formula, M is an integer; if the square root of A + B is not an integer, it is rounded to the nearest integer. Here, A is 16 and B is 1, so the square root of 17 is rounded to 4. By trial and error, network models with the number of hidden layer neurons increasing from 5 to 14 were trained, and the output error was found to be smallest when M was 10. This study therefore used 10 hidden layer neurons, and the network was designed with the architecture 16 × 10 × 1. To reach the accuracy level of 0.0001, 40342 epochs were needed for this particular model.
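The candidate range of 5 to 14 can be reproduced with a short sketch. The formula is read here as M = round(sqrt(A + B)) + C, which is an assumption consistent with the description above; the function name is hypothetical.

```python
import math

def hidden_neuron_candidates(n_inputs, n_outputs, c_min=1, c_max=10):
    """Candidate hidden-layer sizes M = round(sqrt(A + B)) + C for C in [c_min, c_max]."""
    base = round(math.sqrt(n_inputs + n_outputs))
    return [base + c for c in range(c_min, c_max + 1)]

# Pure BP model: A = 16 inputs, B = 1 output.
bp_candidates = hidden_neuron_candidates(16, 1)
```

Each candidate size is then trained and the one with the smallest output error is kept, which is the trial-and-error step described above.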

4.5. Establishment of PCA-BP Neural Network Model.
As in the cases of the MLR and the BP neural network models, this study used the same data for training and testing the combined PCA-BP neural network model. The difference between the combined model and the pure BP model is that there are now only 4 input variables, corresponding to the four principal components, instead of 16 in the pure BP model. As with the BP model, the PCA-BP model's error goal was set to 0.0001, the learning rate was set to 0.01, and the LM optimization algorithm was used to train the network. By trial and error, networks with different numbers of hidden-layer neurons were trained, and the output error was found to be smallest when M was 7. Therefore, the selected PCA-BP model has the neural network architecture 4 × 7 × 1. Once the optimal network structure of the PCA-BP model was determined, the weights and thresholds of the network were saved. To reach the accuracy level of 0.0001, 13862 epochs were needed for this particular model.
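As a quick check on convergence, the epoch counts reported for the pure BP model (40342) and the PCA-BP model (13862) can be compared directly. The relative reduction used below is one possible summary and is not a measure defined in the text:

```latex
\frac{40342 - 13862}{40342} = \frac{26480}{40342} \approx 0.656
```

The PCA-BP model therefore needed roughly two-thirds fewer epochs than the pure BP model to reach the same error goal.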

Comparison of Models
The test data, accounting for about 2% of the 86.5 hours of flight data, were used to determine the accuracy and generalizability of the models in predicting the blade loads. The performance can be determined by comparing the percentage error from equation (14). Figures 9-12 show the comparison of the linear model prediction, BP neural network model prediction, and PCA-BP model prediction to the measured load during hovering, forward flight, horizontal turns, and symmetric push-pull maneuvers. Table 5 shows the average percentage errors of the three models and the error percentage reduced by the PCA-BP model compared to the MLR and BP models, for loads in four maneuvers.
Figure 13: Attitude parameter-time curves during a push-pull maneuver.
According to Figures 9-11 and 13, the loads predicted by the MLR model were not very consistent with the measured loads, and the errors were relatively large, occurring mainly at breakpoints. This is because there are complex nonlinear relationships between the loads and the influencing factors, while the MLR model simplifies them into a linear relationship. Therefore, the MLR model can only provide limited guidance for actual flight, and it is necessary to develop other, more accurate prediction methods.
The BP neural network model predicts more accurately than the MLR model. Highly oscillatory behavior can be seen in the linear regression model but not in the BP model. However, at several points, the prediction errors were still rather large. As pointed out earlier, the 16 factors are correlated, which degrades the performance of the neural network. Intuitively, correlated input variables are less clean than mutually independent input variables.
It is clear from Table 5 that the combined use of PCA and BP gives higher prediction precision, with an average percentage error of 2.46%, compared with 10.20% for MLR and 4.49% for the BP neural network. In addition, the PCA-BP model has consistently lower errors than the MLR model and the BP model. This summary of the model performance shows the capacity of the PCA-BP model to predict loads on a rotor blade in different flight maneuvers.
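The error comparison can be illustrated with a short sketch. Two assumptions are made here: the percentage error is taken as the mean absolute percentage error, and the improvement is expressed as a relative reduction, which may differ from the definition used in Table 5. The load values are hypothetical; only the three average errors are taken from the text.

```python
def mean_percentage_error(measured, predicted):
    """Mean absolute percentage error, in percent, over paired samples."""
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    terms = [abs(p - m) / abs(m) for m, p in zip(measured, predicted)]
    return 100.0 * sum(terms) / len(terms)

def relative_reduction(baseline_error, new_error):
    """Fraction by which new_error is lower than baseline_error."""
    return (baseline_error - new_error) / baseline_error

# Hypothetical load values (arbitrary units), not taken from the flight data.
measured = [100.0, 200.0, 400.0]
predicted = [98.0, 206.0, 400.0]
error = mean_percentage_error(measured, predicted)

# Average errors quoted in the text, in percent.
vs_mlr = relative_reduction(10.20, 2.46)   # about 0.76
vs_bp = relative_reduction(4.49, 2.46)     # about 0.45
```

Under these assumptions, the PCA-BP error is about three-quarters lower than the MLR error and a little under half lower than the pure BP error.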

Conclusion
A model combining PCA and the BP neural network has been built and tested for predicting blade loads. In the PCA stage of the model, four independent factors were extracted as principal components from the sixteen original factors influencing the blade loads. Then, a BP neural network model was built with the 4 principal components as input variables. The final architecture of the neural network is 4 × 7 × 1, with 7 neurons in the hidden layer.
To assess the quality of prediction of this combined model, we compared the results with an MLR model and a BP neural network model with the same data. The following are the main conclusions of this study.
(1) An MLR model and a BP neural network model for prediction of blade loads were established. The results show that the error percentage in the prediction is relatively high and that the errors at some data points in the two models are large. Therefore, they can only provide limited guidance for actual flight.
(2) The PCA-BP model successfully demonstrates that it is possible to predict the blade load with high accuracy from the flight parameter data collected in different maneuvers. It is beneficial for predicting the blade loads of in-service helicopters where installation of strain gauges is impractical.
(3) PCA not only simplifies the structure of the BP network but also improves the network estimation accuracy and convergence speed. The results show that an average error percentage of 2.46% was achieved by the PCA-BP model, compared to 10.20% by MLR and 4.49% by a pure BP neural network model. To reach the accuracy level of 0.0001, the PCA-BP model requires far fewer epochs than the pure BP model (in the specific cases described in this study, about two-thirds fewer).

Data Availability
The flight data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
We declare that we have no conflicts of interest.