A New Least Squares Support Vector Machines Ensemble Model for Aero Engine Performance Parameter Chaotic Prediction

To address the nonlinear, chaotic, and small-sample nature of aero engine performance parameter data, a new ensemble model, named the least squares support vector machine (LSSVM) ensemble model with phase space reconstruction (PSR) and particle swarm optimization (PSO), is presented. First, to guarantee the diversity of individual members, different single kernel LSSVMs are selected as base predictors, each of which outputs its primary prediction results independently. Then, all the primary prediction results are integrated to produce the most appropriate prediction results by another particular LSSVM, a multiple kernel LSSVM, which reduces the dependence of modeling accuracy on the kernel function and parameters. Phase space reconstruction theory is applied to extract the chaotic characteristics of the input data source and reconstruct the data samples, and the particle swarm optimization algorithm is used to obtain the best LSSVM individual members. A case study with real operation data of an aero engine is employed to verify the effectiveness of the presented model. The results show that the prediction accuracy of the proposed model is clearly higher than that of the other three models.


Introduction
With increasing demands for operation safety, asset availability, and economy, health monitoring of aero engines has been widely regarded as a key prerequisite for the competitiveness of an airline company. One of the main tasks of health monitoring is to predict the performance parameters of the aero engine. By predicting and analyzing the trend of performance parameters, one can obtain valuable information to avoid future risk and loss due to faults or accidents and to reduce associated maintenance costs [1]. Therefore, it is necessary to design a highly accurate and robust prediction model for the aero engine performance parameter (AEPP).
A variety of traditional time series prediction approaches have already been proposed for this problem, such as fuzzy rule [2], Kalman filter [3], grey prediction [4], ARMA [5], and multiple regression [6]. These approaches are very mature in theory, but in application their accuracy is not always high and their robustness is not always satisfactory [7]. With the development of artificial intelligence techniques, recent studies on AEPP prediction have mainly focused on artificial neural networks (ANN) [8,9] and support vector machines (SVM) [10,11].
Compared with standard SVM, least squares support vector machine (LSSVM) adopts equality constraints and a linear Karush-Kuhn-Tucker system, which gives it a more powerful computational ability in solving nonlinear and small-sample problems [12,13]. In addition, LSSVM avoids the local minima and structure design complexity of ANN. Therefore, LSSVM is a good choice for designing an AEPP prediction model. However, the modeling accuracy of a single LSSVM is influenced not only by the input data source but also by its kernel function and regularization parameters [12]. Thus, several main issues need to be addressed. Firstly, when a data-driven technique is used to design an LSSVM model, the data source should be considered as the first factor. AEPP data differ from the output of a purely random system: that is, the chaotic characteristics of AEPP data should be extracted to reconstruct the input data samples before modeling. Secondly, conventional cross-validation and grid search, the two common parameter optimization methods for LSSVM, have several defects, such as high time consumption and the requirement for a priori knowledge.
In addition, although a single LSSVM with optimal parameters and reconstructed input data samples may have an excellent prediction performance under certain circumstances, its kernel function is fixed, so it may have some inherent bias in other cases. In the literature, owing to its strong robustness and generalization, the ensemble model has been shown to be an effective way to reduce the biases of a single model. An ensemble model can make full use of diversity to compensate for the disadvantages of the individual members, and a reasonable combination strategy is believed to produce better prediction accuracy and generalization than a single model [13][14][15][16]. By combining submodels, multilayer networks of LSSVM ensembles have been discussed in depth, which is very encouraging and promising for further research [13]; however, up to now, the application of the LSSVM ensemble model to AEPP prediction has remained largely unexplored in the open literature.
For ensemble model design, two points should be considered. One is that the selected individual members need to exhibit both diversity (disagreement) and accuracy. The other is the effectiveness of the combination strategy [17]. For the diversity of individual members, an easy and common way is to build individual members by data decomposition [18]. However, this method has proved effective only when the original data sample is sufficient, and it is not suitable for small-sample data. Compared with the existing combination strategies such as the simple averaging weighting method, the mean squared error weighting method, and the least squares estimation weighted averaging method, intelligent method based combination strategies, including the ANN combiner and the SVM combiner, have become the current trend [18]. However, the ANN combiner cannot avoid falling into local optima, and for the SVM combiner it is not easy to select an appropriate kernel function, so it is necessary to further improve these ensemble strategies.
To address these issues, this paper proposes a new PSR-PSO-(SK)LSSVM-(MK)LSSVM ensemble (PPLLE) model for AEPP prediction. Firstly, a set of diverse single kernel LSSVMs are created as base predictors. Subsequently, these individual member LSSVMs output the primary prediction results independently. Finally, all the primary prediction results are combined to produce the most appropriate prediction results by another particular multiple kernel LSSVM. In the process of modeling, phase space reconstruction (PSR) theory is applied to extract the chaotic characteristics of the input data source and reconstruct the data samples. The particle swarm optimization algorithm is used to search for the best parameters of the LSSVM members to ensure their prediction accuracy.
The rest of this paper is organized as follows. The next section provides a brief introduction to the related knowledge. Section 3 formulates the proposed PPLLE model. For illustration purposes, the detailed application to AEPP prediction and the model comparisons are presented in Section 4. Section 5 concludes this study.

Data Samples Reconstruction Based on PSR Theory.
Although nonlinear chaotic behavior is the main challenge confronting chaotic data series prediction, the underlying data generating mechanisms can still be explored by PSR theory [19]. Because it can reveal the nature of the dynamic system state, PSR theory is useful in system characterization, in nonlinear prediction, and in estimating bounds on the size of the system [20,21].
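According to Takens' theorem, for the nonlinear time series $\{x_i\}_{i=1}^{N}$, the current state information can be represented by an $m$-dimensional vector:

$$X_i = \left(x_{i-m\tau}, x_{i-(m-1)\tau}, \ldots, x_{i-\tau}\right), \qquad x_i = f(X_i),$$

where $\tau$ is the delay time and $m$ is the embedded dimension, which are the two important parameters for phase space reconstruction, and $f$ is the mapping relation between the inputs and outputs. The lag at which the autocorrelation function of the time series reaches its first minimum is taken as the delay time $\tau$ of the reconstructed phase space; we write

$$R(\tau) = \frac{1}{N}\sum_{i=1}^{N-\tau}\left(x_i - \mu\right)\left(x_{i+\tau} - \mu\right),$$

where $\mu$ is the mean of $x_i$.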
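The delay selection and sample reconstruction described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `autocorrelation`, `first_minimum_delay`, and `delay_embed` are hypothetical, the autocorrelation is normalized by its lag-0 value (which does not move the location of the first minimum), and the input vector is indexed so that it holds the $m$ delayed values preceding the target.

```python
def autocorrelation(series, tau):
    """Sample autocorrelation at lag tau, normalized by the lag-0 value (assumed convention)."""
    n = len(series)
    mu = sum(series) / n
    var = sum((x - mu) ** 2 for x in series)
    cov = sum((series[i] - mu) * (series[i + tau] - mu) for i in range(n - tau))
    return cov / var


def first_minimum_delay(series, max_lag):
    """Smallest lag in 1..max_lag at which the autocorrelation has a local minimum.

    Requires max_lag + 1 < len(series); returns max_lag if no local minimum is found.
    """
    values = [autocorrelation(series, tau) for tau in range(max_lag + 2)]
    for tau in range(1, max_lag + 1):
        if values[tau] < values[tau - 1] and values[tau] <= values[tau + 1]:
            return tau
    return max_lag


def delay_embed(series, m, tau):
    """Build (X_i, y_i) pairs with X_i = (x_{i-m*tau}, ..., x_{i-tau}) and y_i = x_i."""
    samples = []
    for i in range(m * tau, len(series)):
        x_vec = [series[i - k * tau] for k in range(m, 0, -1)]
        samples.append((x_vec, series[i]))
    return samples
```

For example, `delay_embed(series, m=5, tau=1)` turns a series of 148 values into 143 input-target pairs, each pairing five consecutive observations with the next one.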
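To calculate the correlation dimension, the correlation integral $C(r)$ needs to be computed:

$$C(r) = \frac{1}{M^{2}}\sum_{i,j=1}^{M} H\left(r - \lVert X_i - X_j \rVert\right),$$

where $M$ is the number of reconstructed vectors, $r$ is the selected radius with $r > 0$, and $H(\cdot)$ is the Heaviside function.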
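As a rough illustration of how the correlation integral is used, the sketch below counts the fraction of vector pairs closer than a radius and estimates the correlation dimension from the slope of $\ln C(r)$ against $\ln r$ between two radii. The names `correlation_integral` and `correlation_dimension_estimate` are hypothetical, and the normalization (all ordered pairs, self-pairs included) and the two-point slope are simplifying assumptions rather than the procedure used in the paper.

```python
import math


def correlation_integral(vectors, r):
    """Fraction of ordered vector pairs (self-pairs included) whose distance is below r."""
    n = len(vectors)
    count = 0
    for a in vectors:
        for b in vectors:
            # Heaviside step H(r - distance): contributes 1 when distance < r.
            if math.dist(a, b) < r:
                count += 1
    return count / (n * n)


def correlation_dimension_estimate(vectors, r_small, r_large):
    """Two-point slope of ln C(r) versus ln r, a crude correlation dimension estimate."""
    c_small = correlation_integral(vectors, r_small)
    c_large = correlation_integral(vectors, r_large)
    return (math.log(c_large) - math.log(c_small)) / (math.log(r_large) - math.log(r_small))
```

In practice the slope is usually read from a linear region of the log-log curve over many radii, and the embedding dimension is increased until the estimated dimension saturates.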

Least Squares Support Vector Machine.
LSSVM is the least squares form of a standard SVM; it was first proposed by Suykens and Vandewalle [12]. LSSVM solves a set of linear equations during the training process and uses all training data as support vectors, so it has excellent generalization and low computational complexity [12][13][14].
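In LSSVM, the regression problem can be expressed as the following optimization problem:

$$\min_{w, b, e}\; J(w, e) = \frac{1}{2} w^{T} w + \frac{\gamma}{2}\sum_{i} e_i^{2}, \qquad \text{s.t. } y_i = w^{T}\varphi(X_i) + b + e_i \text{ for every training sample } (X_i, y_i),$$

where $\varphi(X_i)$ is a nonlinear function which maps the input data into a higher dimensional space, $w$ is the weight vector, $e_i$ is the error at time $i$, $b$ is the bias, and $\gamma$ is the regularization constant.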
According to the Lagrange function and the Karush-Kuhn-Tucker theorem, the LSSVM for nonlinear functions can be given as below:

$$f(X) = \sum_{i} \alpha_i K(X, X_i) + b,$$

where $\alpha_i$ is the Lagrange multiplier and $K(X, X_i)$ is the kernel function, which is applied to substitute the mapping process and avoid computing the function $\varphi(X_i)$.
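Training an LSSVM therefore reduces to solving one linear system. The sketch below uses the standard LSSVM dual system, in which the first row enforces that the multipliers sum to zero and the remaining rows use the kernel matrix plus $I/\gamma$. It is a minimal pure-Python illustration; the helper names `solve_linear_system`, `train_lssvm`, and `predict_lssvm` and the RBF width `sigma` are assumptions made for this example.

```python
import math


def rbf_kernel(a, b, sigma=1.0):
    """Gaussian RBF kernel with width sigma (illustrative default)."""
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / (2.0 * sigma ** 2))


def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting for a dense square system."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def train_lssvm(xs, ys, kernel, gamma):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b, alpha]^T = [0, y]^T and return (b, alpha)."""
    n = len(xs)
    system = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [kernel(xs[i], xs[j]) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # regularization term on the diagonal
        system.append(row)
    solution = solve_linear_system(system, [0.0] + list(ys))
    return solution[0], solution[1:]


def predict_lssvm(x, xs, alpha, b, kernel):
    """Evaluate f(x) = sum_i alpha_i K(x, X_i) + b."""
    return sum(a * kernel(x, xi) for a, xi in zip(alpha, xs)) + b
```

Because every training sample receives a multiplier, the model is not sparse, which is the price paid for replacing the quadratic program of a standard SVM with a linear solve.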
Typical kernel functions include the linear, polynomial, radial basis, sigmoid, and multiple kernel functions. Some of them are listed as follows:
(1) linear kernel function (LKF): $K(X, X_i) = X^{T} X_i$;
(2) polynomial kernel function (PKF): $K(X, X_i) = \left(X^{T} X_i + 1\right)^{d}$, where $d$ is the polynomial degree;
(3) Gaussian radial basis kernel function (RBF): $K(X, X_i) = \exp\left(-\lVert X - X_i \rVert^{2} / (2\sigma^{2})\right)$, where $\sigma$ is the kernel width;
(4) sigmoid kernel function (SKF): $K(X, X_i) = \tanh\left(k X^{T} X_i + \theta\right)$, where $k$ and $\theta$ are the scale and offset;
(5) multiple kernel function (MKF): $K(X, X_i) = \sum_{j} \lambda_j K_j(X, X_i)$, where the $K_j$ are single kernel functions and the weights satisfy $\lambda_j \geq 0$ and $\sum_{j} \lambda_j = 1$.
The nonlinear mapping ability of LSSVM is mainly determined by the form of its kernel function and the relevant parameter settings: that is, different kernel functions or parameters have different influences on the prediction ability of the LSSVM predictor (the parameter settings will be discussed in the next section). As to the kernel function, the LKF is suited to expressing the linear component of the mapping relation, the RBF possesses a wider convergence domain, an outstanding learning ability, and high resolution power, and the PKF has a powerful approximation and generalization ability.
Meanwhile, kernel functions can also be divided into local kernel functions and global kernel functions. A global kernel function captures the overall behavior and is commonly good at fitting the sample points which are far away from the testing points, but its fitting effect is not perfect on the sample points which are near the testing points; the opposite holds for a local kernel function [15]. Each kind of kernel function has its own advantages and disadvantages, so the prediction performances of LSSVMs with different kernel functions are not identical.
Here, we define the LSSVM configured with a multiple kernel function as the multiple kernel LSSVM (MK-LSSVM); otherwise, we call it the single kernel LSSVM (SK-LSSVM).
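The kernel families discussed above can be written down compactly. The following sketch uses common textbook forms; the parameter names (`degree`, `offset`, `sigma`, `scale`, `shift`, `weight`) and their defaults are assumptions for illustration, and the multiple kernel is shown as a convex combination of one RBF (local) and one polynomial (global) kernel.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def linear_kernel(a, b):
    return dot(a, b)


def polynomial_kernel(a, b, degree=2, offset=1.0):
    return (dot(a, b) + offset) ** degree


def rbf_kernel(a, b, sigma=1.0):
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / (2.0 * sigma ** 2))


def sigmoid_kernel(a, b, scale=1.0, shift=0.0):
    return math.tanh(scale * dot(a, b) + shift)


def multiple_kernel(a, b, weight=0.5, sigma=1.0, degree=2):
    """Convex combination of a local (RBF) and a global (polynomial) kernel, 0 <= weight <= 1."""
    return weight * rbf_kernel(a, b, sigma) + (1.0 - weight) * polynomial_kernel(a, b, degree)
```

The local and global behavior is easy to see numerically: for two points far apart, `rbf_kernel` is close to zero while `polynomial_kernel` can remain large, so the `weight` parameter controls how much each behavior contributes.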

Parameters Optimized Based on PSO.
The particle swarm optimization (PSO) algorithm is a popular swarm intelligence evolutionary algorithm used for solving global optimization problems [22]. It can search for the global optimal solution in different regions of the solution space in parallel.
In PSO, the position of each particle represents a solution to the optimization problem. $x_i = (x_{i1}, x_{i2}, \ldots, x_{id})$ is the position vector and $v_i = (v_{i1}, v_{i2}, \ldots, v_{id})$ is the velocity vector of the $i$th particle. Similarly, $p_i = (p_{i1}, p_{i2}, \ldots, p_{id})$ represents the best position that the $i$th particle has achieved, and $p_g = (p_{g1}, p_{g2}, \ldots, p_{gd})$ represents the best position among the whole particle group.
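The values of position and velocity of the particle are updated as follows:

$$v_{id}(t+1) = \omega v_{id}(t) + c_1 r_1 \left(p_{id} - x_{id}(t)\right) + c_2 r_2 \left(p_{gd} - x_{id}(t)\right), \qquad x_{id}(t+1) = x_{id}(t) + v_{id}(t+1),$$

where $c_1$ and $c_2$ are the acceleration constants, $r_1$ and $r_2$ are two random numbers in the range [0, 1], and $\omega$ is the inertia weight factor. To improve the convergence speed of PSO, $c_1$, $c_2$, and $\omega$ are adjusted during the search as functions of the current iteration number $t$ and the maximum iteration number $t_{\max}$.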
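A compact sketch of the update rule is given below. It is illustrative only: the linear schedules for the inertia weight and acceleration coefficients, their start and end values, the clamping of positions to the search bounds, and the names `linear_schedule` and `pso_minimize` are all assumptions made for this example and are not the adjustment formulas or settings of the paper.

```python
import random


def linear_schedule(start, end, t, t_max):
    """Linearly move a coefficient from start to end over t_max iterations (assumed form)."""
    return start + (end - start) * t / t_max


def pso_minimize(objective, bounds, swarm_size=20, t_max=100, seed=0):
    """Minimize objective over box bounds with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(swarm_size), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]
    for t in range(t_max):
        w = linear_schedule(0.9, 0.4, t, t_max)    # illustrative inertia schedule
        c1 = linear_schedule(2.5, 0.5, t, t_max)   # illustrative cognitive schedule
        c2 = linear_schedule(0.5, 2.5, t, t_max)   # illustrative social schedule
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val
```

When PSO tunes an LSSVM, `objective` would typically map a candidate parameter vector (for example, the regularization constant and the kernel parameters) to a validation error, so each particle position corresponds to one trained model.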

Overall Process of Designing the PPLLE Model
The core idea of the ensemble model is that all the individual members should be as accurate and as diverse as possible and that an appropriate ensemble strategy should be adopted to combine the outputs of the selected members [13][14][15][16][17][18].

Selection of the Appropriate Individual Member Predictors.
For the LSSVM prediction model, several diversity strategies, such as data diversity, parameter diversity, and kernel diversity, have proved effective for creating ensemble members with much dissimilarity [13]. Because the kernel function has a crucial and direct effect on the learning and generalization performance of LSSVM, various kernel functions can be used to create diverse LSSVMs. In this study, $n$ independent SK-LSSVMs, such as LKF-LSSVM, PKF-LSSVM, and RBF-LSSVM, are selected as individual member LSSVM predictors.
Once the individual member predictors have been selected, the other key question is how to determine the weight coefficient of each individual predictor, that is, how to construct the combiner effectively. As described in the previous section, the MKF integrates the advantages of the global kernel function and the local kernel function and offsets some shortcomings of both. Hence, another special MK-LSSVM is chosen as the combiner. In this paper, the MKF is composed of an RBF and a PKF: the former is a typical local kernel function and the latter is a representative global kernel function. Tian et al. [15] showed with chaotic time series that a similar MK-LSSVM model has high prediction accuracy and generalization ability.

Overall Process of Designing the PPLLE Model.
The basic framework of the proposed PPLLE model is given in Figure 1, where $n$ is the number of the individual member LSSVM predictors.
As shown in Figure 1, there are three main stages in the basic framework which can be summarized as follows.
Stage 1 (sample dataset reconstruction and partition). The data source is reconstructed as data samples by using PSR; then the reconstructed data samples are divided into two subsets: a training subset and a testing subset.
Stage 2 (individual member creation and prediction). Based on the kernel function diversity principle, $n$ independent SK-LSSVMs are created as the individual members. Each SK-LSSVM is trained by using the training subset. Accordingly, the computational results $\hat{x}_1, \hat{x}_2, \hat{x}_3, \ldots, \hat{x}_n$ of the $n$ SK-LSSVM predictors can be obtained, respectively. In the process of creating the SK-LSSVMs, PSO is used to optimize the parameters of each member SK-LSSVM.
Stage 3 (combiner creation and prediction). Once the computational results of the individual member predictors in the second stage are acquired, they are aggregated into an ensemble result by another special MK-LSSVM. Similarly, PSO is applied again to create the optimal MK-LSSVM.
Here, $f(\cdot)$ is the mapping function determined by the special combiner MK-LSSVM; thus, the final prediction output $\hat{y}$ of the PPLLE model can be given as below:

$$\hat{y} = f\left(\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_n\right).$$
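The two-level structure of the three stages can be expressed in a few lines. In the sketch below the base predictors and the combiner are plain callables standing in for the trained SK-LSSVMs and the MK-LSSVM; the function names `ensemble_predict` and `build_combiner_training_set` are hypothetical, and training the combiner on the base outputs over the training subset is an assumption about how the stages fit together.

```python
def ensemble_predict(x, base_predictors, combiner):
    """Compute y_hat = f(x_hat_1, ..., x_hat_n) from the primary predictions of the members."""
    primary = [predict(x) for predict in base_predictors]
    return combiner(primary)


def build_combiner_training_set(inputs, targets, base_predictors):
    """Use the members' outputs on each training input as the combiner's feature vector."""
    return [([predict(x) for predict in base_predictors], y)
            for x, y in zip(inputs, targets)]


# Stand-in members and combiner (not LSSVMs), used only to show the data flow.
last_value = lambda window: window[-1]
window_mean = lambda window: sum(window) / len(window)
average_combiner = lambda outputs: sum(outputs) / len(outputs)

y_hat = ensemble_predict([1.0, 2.0, 3.0], [last_value, window_mean], average_combiner)
```

Replacing `average_combiner` with a trained nonlinear model is what distinguishes this design from fixed weighting schemes such as simple averaging.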

Case Study
Due to different gas path component degradations such as fouling, erosion, corrosion, and foreign object damage, the performance of an aero engine declines over its service time [23]. Many gas path performance parameters, such as exhaust gas temperature (EGT), fuel flow (FF), and low pressure fan speed (N1), are used in aero engine health monitoring from different angles and levels. Among these performance parameters, EGT is considered one of the most crucial working performance parameters of the aero engine; in practice it is measured to represent the outlet temperature of the combustion chamber. When other conditions remain the same, the higher the EGT is, the more serious the performance degradation of the aero engine is [4]. EGT gradually rises as the working life of the aero engine increases. If the EGT value reaches or exceeds the scheduled threshold provided by the original equipment manufacturer, the aero engine needs to be scheduled for maintenance in a timely manner.
In this study, we select EGT as the AEPP representative to predict by using the proposed PPLLE model, and it is worth mentioning that other similar parameters can also be predicted in the same way.

Data Description and Samples Reconstruction.
In this study, the EGT data come from the real flight recorders of the cruise state of a certain type of aero engine, and the sampling interval is 5 flight cycles. The data series consists of 148 EGT data points, covering the period from February 2013 to September 2014. To increase the quality of the prediction results, some abnormal samples have been discarded from the original data series. The observed EGT data is shown in Figure 2.
For the observed EGT data series $\{\mathrm{EGT}_i\}_{i=1}^{148}$, according to (2), (3), and (4), the delay time $\tau$ is set as 1 and the embedding dimension $m = 5$ is obtained by computation. Thus, $(\mathrm{EGT}_{i-5}, \mathrm{EGT}_{i-4}, \ldots, \mathrm{EGT}_{i-1})$ is taken as the input vector $X_i$, and $y_i = \mathrm{EGT}_i$ ($i = 6, 7, \ldots, 148$) is used as the corresponding expected value, so we get the reconstructed data samples $\{X_i, y_i\}_{i=6}^{148}$. The data samples $\{X_i, y_i\}_{i=6}^{120}$ are used as the training subset to train each individual LSSVM of the ensemble model, and the samples $\{X_i, y_i\}_{i=121}^{148}$ are chosen as the testing subset to validate the ensemble model.
The one-step-ahead prediction used in this paper is explained in Figure 3. After the ensemble model has been trained, the vector $X_{121}$ is entered into the 4 individual predictors (SK-LSSVM predictors) to compute their predicted values $\hat{x}_{1,121}, \hat{x}_{2,121}, \ldots, \hat{x}_{4,121}$, respectively. Then, these predicted values are aggregated into an ensemble result by using a combination predictor (the MK-LSSVM predictor). Hence, the final predicted value $\hat{y}_{121}$ is obtained. In this way, from $i = 121$ to 148, all the final predicted values $\hat{y}_{121}$ to $\hat{y}_{148}$ are obtained in turn.
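The indexing of the reconstructed samples and the train/test partition can be checked with a short sketch. The series below is a placeholder ramp, not the EGT data, and the names `reconstruct_samples` and `split_samples` are hypothetical; indices are kept 1-based to match the text.

```python
def reconstruct_samples(series, m=5, tau=1):
    """Return {i: (X_i, y_i)} with 1-based i, X_i = (x_{i-m*tau}, ..., x_{i-tau}), y_i = x_i."""
    samples = {}
    for i in range(m * tau + 1, len(series) + 1):
        x_vec = [series[i - k * tau - 1] for k in range(m, 0, -1)]
        samples[i] = (x_vec, series[i - 1])
    return samples


def split_samples(samples, last_training_index):
    """Chronological split: indices up to last_training_index train, the rest test."""
    train = {i: s for i, s in samples.items() if i <= last_training_index}
    test = {i: s for i, s in samples.items() if i > last_training_index}
    return train, test


# Placeholder values standing in for the 148 observations (NOT real EGT data).
placeholder_series = [float(i) for i in range(1, 149)]
samples = reconstruct_samples(placeholder_series, m=5, tau=1)
train, test = split_samples(samples, last_training_index=120)
```

With 148 observations this yields 143 samples in total, 115 for training ($i = 6$ to 120) and 28 for testing ($i = 121$ to 148). The split is chronological, so no testing target appears among the training targets.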
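The prediction performance is evaluated with four indices: the mean absolute percentage error (MAPE), the mean absolute error (MAE), the mean squared error (MSE), and the Theil inequality coefficient (TIC), where $y_i$ and $\hat{y}_i$ are the observed values and corresponding prediction values, respectively.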
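For reference, the four indices can be computed as in the sketch below. The definitions are the commonly used ones and are an assumption here, since the formulas themselves are not reproduced in this text; in particular, MAPE is expressed in percent and TIC is taken as the root mean squared error divided by the sum of the root mean squares of the observed and predicted series.

```python
import math


def mape(observed, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(observed, predicted)) / len(observed)


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(y - p) for y, p in zip(observed, predicted)) / len(observed)


def mse(observed, predicted):
    """Mean squared error."""
    return sum((y - p) ** 2 for y, p in zip(observed, predicted)) / len(observed)


def tic(observed, predicted):
    """Theil inequality coefficient: 0 for a perfect fit, at most 1."""
    n = len(observed)
    rms_obs = math.sqrt(sum(y * y for y in observed) / n)
    rms_pred = math.sqrt(sum(p * p for p in predicted) / n)
    return math.sqrt(mse(observed, predicted)) / (rms_obs + rms_pred)
```

All four indices are smaller for better predictions. MAPE and TIC are scale-free, whereas MAE and MSE carry the units of the predicted quantity and its square.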

Model Parameters Setting.
In the modeling process of LSSVMs, the parameters of PSO are set as follows: $c_{1\min} = c_{2\min} = 2$, $c_{1\max} = c_{2\max} = 3$, $\omega_{\min} = 1$, $\omega_{\max} = 3$, a population size of 50, and $t_{\max} = 1000$. By using the PSO, the corresponding optimal parameters of the LSSVMs are obtained. The number of individual members affects both the prediction efficiency and the prediction ability [17]. In this study, the member number is set as 5.

Results and Discussion.
Figure 4 illustrates the prediction results of the PPLLE model for the EGT testing dataset together with the corresponding observed EGT values. The black symbols represent the observed values, and the red symbols represent the predicted values. From Figure 4, we can see that the rise and fall trends of the two curves are approximately the same and that only a few individual points show larger gaps, which means that EGT is predicted with good accuracy.
For comparison, the single LSSVM model proposed by Tian et al. [15], the RBF-chaos model proposed by Zhang et al. [24], and the PPLLE* (PSR-PSO-LSSVM-LSSVM* ensemble) model are also built. The kernel function and parameters of the single LSSVM model are the same as those of the LSSVM 5 model listed in Table 1. The RBF-chaos model combines chaos characteristics with an RBF neural network (here, the input layer, hidden layer, and output layer of the RBF neural network are set as 5, 11, and 1, respectively). The difference between the PPLLE model and the PPLLE* model is that the latter uses an RBF-LSSVM (i.e., an SK-LSSVM) as the combiner.
Table 2 lists the MAPE, MAE, MSE, and TIC values of the PPLLE, PPLLE*, RBF-chaos, and single LSSVM models on the testing dataset. The PPLLE model performs the best among the four models, with an MAPE of 0.51%, compared with 0.62%, 0.85%, and 1.10% for the PPLLE*, RBF-chaos, and single LSSVM models, respectively. The MAE values of the PPLLE, PPLLE*, RBF-chaos, and single LSSVM models are 3.67, 4.48, 6.16, and 7.99, respectively, which again demonstrates the prediction accuracy of the proposed model. The PPLLE model predicts the EGT with an MSE of 14.04, better than the PPLLE*, RBF-chaos, and single LSSVM models with 22.48, 49.70, and 75.66, respectively. Besides, the TIC of PPLLE is 0.00258, which compares favorably with those of the other three models. Figure 5 provides further support: the curve of the PPLLE model shows good prediction accuracy and an excellent ability to track the observed EGT compared with the other three models.

Conclusions
Designing a highly accurate and robust model for AEPP prediction is quite challenging, since AEPP data is nonlinear, chaotic, and small-sample, and the traditional single prediction model may have some inherent biases. To solve this problem and to achieve a high level of prediction accuracy, a new LSSVM ensemble model based on PSR and PSO is presented and applied to AEPP prediction in this paper.
For the presented PPLLE prediction model, individual member LSSVMs based on the kernel diversity principle eliminate the inherent biases of a single LSSVM and make full use of their respective advantages. PSR is applied to reconstruct the data samples, which alleviates the influence of the chaotic features of the original data source on the PPLLE model. PSO is used to guarantee that each individual LSSVM achieves the best performance. The particular ensemble strategy employs an MK-LSSVM combiner, as the MKF integrates the advantages of the global kernel function and the local kernel function and offsets some shortcomings of both; this ensemble strategy further enhances the prediction ability of the ensemble model.
EGT is selected as the representative health monitoring parameter of the aero engine for validating the effectiveness of the proposed PPLLE model. For comparison, the PPLLE*, RBF-chaos, and single LSSVM models are also developed and evaluated. The PPLLE predicts EGT with an MAPE of 0.51%, better than the PPLLE*, RBF-chaos, and single LSSVM models with 0.62%, 0.85%, and 1.10%, respectively. Similarly, the PPLLE predicts EGT with a TIC of 0.00258, better than the PPLLE*, RBF-chaos, and single LSSVM models with 0.00327, 0.00485, and 0.00598, respectively. In addition, the MAE and MSE indices also confirm that the presented model gives improved prediction accuracy. In short, the four evaluation indices consistently demonstrate that the PPLLE model is more suitable for the AEPP prediction problem and that it can meet the actual demands of engineering application. Moreover, the comparison results imply that this ensemble model has promising applications in other similar engineering areas where the data have complex nonlinear chaotic relationships.

Figure 1: Basic framework of the PPLLE model.

Figure 5: Prediction results of different models on EGT testing dataset.

Figure 6: RPE comparison of different models on EGT testing dataset.

Table 2: Comparison of different models on EGT testing dataset.