A Deep Learning Prediction Model Based on Extreme-Point Symmetric Mode Decomposition and Cluster Analysis

To address the irregularity of nonlinear signals and the difficulty of predicting them, a deep learning prediction model based on extreme-point symmetric mode decomposition (ESMD) and clustering analysis is proposed. First, the original data are decomposed by ESMD into a finite number of intrinsic mode functions (IMFs) and a residual. Second, fuzzy c-means is used to cluster the decomposed components, and a deep belief network (DBN) is then used to predict them. Finally, the predicted IMFs and residual are reconstructed to give the final prediction result. Six prediction models are compared: DBN, EMD-DBN, EEMD-DBN, CEEMD-DBN, ESMD-DBN, and the model proposed in this paper. All six models are applied to the same sunspot time series. The experimental results show that the proposed model has higher prediction accuracy and smaller error.


Introduction
At present, predicting nonlinear signals such as sunspot numbers and underwater acoustic signals remains difficult. The sunspot number is a basic parameter of the solar activity level and is closely related to geomagnetic disturbance and ionospheric electron concentration. Sunspot prediction is an important part of space forecasting and provides reference information for communication, navigation, and positioning. Some scholars have conducted extensive research on forecasting theory [1,2]. In time-frequency signal analysis, the commonly used method is the Fourier transform, which maps a time-domain signal to the frequency-domain energy spectrum, but the Fourier transform applies only to stationary signals. Compared with earlier regression analysis, the artificial neural network can learn on its own, which makes it especially suitable for nonlinear signal processing. However, because its inputs are synchronous and instantaneous, it cannot reflect the cumulative effect of a continuous signal over time, and its prediction accuracy is low [3]. The wavelet neural network combines the characteristics of the artificial neural network and wavelet analysis and has been widely applied to nonlinear signal processing. Li and Wang [1] proposed a prediction model based on complementary ensemble empirical mode decomposition and a wavelet neural network. Although its prediction accuracy is improved to a certain extent, there is room for further improvement.
The emergence of empirical mode decomposition (EMD) [4] provided a new idea for processing nonlinear signals. It does not require a basis function to be selected, but the number of sifting iterations (screenings) is difficult to determine, and Hilbert spectral analysis has many defects. The extreme-point symmetric mode decomposition (ESMD) method [5][6][7] is a further improvement of EMD: the external envelope interpolation through the extreme points is replaced by internal interpolation that is symmetric about the upper and lower extreme points. The residual mode is optimized by the least squares method, which gives an adaptive global curve and determines the number of sifting iterations. ESMD uses the direct interpolation (DI) method, which departs from the integral-transform approach of the Fourier transform. In view of these advantages, this paper uses ESMD to decompose the nonlinear time series. Fuzzy c-means [8,9] clustering analysis is then used to group data with the same membership, which facilitates prediction by the model. Finally, a deep belief network (DBN) [10][11][12][13] is trained to produce the expected output values, and the predicted outputs are reconstructed to obtain the final predicted value.

ESMD Method
ESMD is a new development of the Hilbert-Huang transform, and its algorithm is as follows:
(1) Find all the local extreme points (maxima and minima) of the data $Y$ and record them as $E = \{E_1, E_2, \ldots, E_n\}$.
(2) Connect all adjacent extreme points with line segments and mark the midpoints of the segments as $F_i$ ($1 \le i \le n-1$).
(3) Supplement the left and right boundary midpoints $F_0$ and $F_n$ by a suitable method.
(4) Construct $p$ interpolating curves $L_1, \ldots, L_p$ ($p \ge 1$) through the $n+1$ midpoints and compute their mean curve $L^* = (L_1 + \cdots + L_p)/p$.
(5) Repeat the above steps on $Y - L^*$ until the number of sifting iterations (screenings) reaches the preset maximum $K$; the first decomposed empirical mode is then recorded as $M_1$.
(6) Repeat the above steps on $Y - M_1$ to obtain $M_2, M_3, \ldots$ until the final residual $R$ has only a certain number of extreme points.
(7) Let the maximum number of sifting iterations $K$ vary over the integer interval $[K_{\min}, K_{\max}]$ and repeat the above process for each value to obtain a series of decompositions. Then calculate the variance ratio $\sigma/\sigma_0$ and plot it against $K$, where $\sigma$ is the relative standard deviation of $Y - R$ and $\sigma_0$ is the standard deviation of the original data.
(8) Select the maximum number of sifting iterations $K_0$ that corresponds to the minimum variance ratio $\sigma/\sigma_0$ in the interval $[K_{\min}, K_{\max}]$, and repeat the first six steps with $K_0$ to output the decomposition results.
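Steps (7) and (8) amount to a small selection loop over the candidate values of the maximum number of sifting iterations. The following is a minimal sketch of that selection only, not of the decomposition itself. It assumes that sigma is the root-mean-square deviation of the data about the residual curve and that sigma_0 is the standard deviation of the data about its mean; the function names and the `residuals_by_k` mapping are hypothetical.

```python
import math

def variance_ratio(y, residual):
    """Return sigma / sigma_0 for one decomposition.

    sigma   : root-mean-square deviation of y about the residual curve R (assumed definition)
    sigma_0 : standard deviation of y about its own mean
    """
    n = len(y)
    mean_y = sum(y) / n
    sigma = math.sqrt(sum((yi - ri) ** 2 for yi, ri in zip(y, residual)) / n)
    sigma_0 = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / n)
    return sigma / sigma_0

def select_k0(y, residuals_by_k):
    """Pick the K whose residual gives the smallest sigma / sigma_0.

    residuals_by_k is a hypothetical mapping from each candidate K to the
    residual curve R produced by a decomposition run with that K.
    """
    ratios = {k: variance_ratio(y, r) for k, r in residuals_by_k.items()}
    k0 = min(ratios, key=ratios.get)
    return k0, ratios

# Toy usage: a constant residual versus a residual that follows the trend.
y = [1.0, 3.0, 2.0, 4.0, 3.0, 5.0]
candidates = {5: [3.0] * 6, 10: [1.5, 2.1, 2.7, 3.3, 3.9, 4.5]}
k0, ratios = select_k0(y, candidates)
```

In this toy case the trend-following residual gives the smaller ratio, so its K is selected. In practice each residual would come from a full ESMD run.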

Clustering Algorithm
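The fuzzy clustering algorithm was originally proposed by Dunn [14] and further developed by Bezdek [15], and it is now applied in many fields. It operates as follows. The sample set $X = \{x_1, x_2, \ldots, x_n\} \subset \mathbb{R}^s$ is divided into $c$ classes. The membership degree of any sample $x_j$ in class $i$ is recorded as $u_{ij}$. The clustering result is expressed by the fuzzy membership matrix $U = \{u_{ij}\} \in \mathbb{R}^{c \times n}$, which satisfies the following conditions:

$$u_{ij} \in [0, 1], \qquad \sum_{i=1}^{c} u_{ij} = 1 \ \ \forall j, \qquad 0 < \sum_{j=1}^{n} u_{ij} < n \ \ \forall i. \tag{1}$$

Fuzzy c-means clustering is obtained by minimizing the objective function $J_m(U, V)$:

$$J_m(U, V) = \sum_{j=1}^{n} \sum_{i=1}^{c} u_{ij}^{m}\, d_{ij}^{2}(x_j, v_i), \tag{2}$$

where $U = \{u_{ij}\}$ is the membership matrix, $V = \{v_1, v_2, \ldots, v_c\} \subset \mathbb{R}^s$ is the set of $c$ cluster centers, and $m \in [1, \infty)$ is the weighting exponent. When $m = 1$, fuzzy clustering reduces to hard c-means clustering [14]. The ideal range of $m$ is [1.5, 2.5], and usually $m = 2$.
The distance from the $j$th sample to the $i$th class center is

$$d_{ij}^{2}(x_j, v_i) = \lVert x_j - v_i \rVert_A^{2} = (x_j - v_i)^{T} A\, (x_j - v_i), \tag{3}$$

where $A$ is an $s \times s$ positive definite matrix; in the special case $A = I$, (3) is the Euclidean distance. FCM [16] is achieved by iteratively optimizing the objective function. The FCM algorithm proceeds as follows:
(1) Initialize the cluster centers $V = \{v_1, v_2, \ldots, v_c\}$.
(2) Compute the membership matrix (for $m > 1$):

$$u_{ij} = \left[ \sum_{k=1}^{c} \left( \frac{d_{ij}}{d_{kj}} \right)^{2/(m-1)} \right]^{-1}. \tag{4}$$

(3) Update each cluster center as the weighted mean $v_i = \sum_{j=1}^{n} u_{ij}^{m} x_j \big/ \sum_{j=1}^{n} u_{ij}^{m}$.
(4) Repeat steps (2) and (3) until the change between successive iterations falls below a preset threshold.
When $d_{ij}(x_j, v_i) = 0$, a singularity arises and the membership cannot be calculated by (4). In that case, the membership of $x_j$ in every nonsingular class (those with $d_{ij} \neq 0$) is set to 0, and its membership in the singular classes is assigned so that condition (1) is satisfied.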
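The FCM iteration described above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the implementation used in the experiments: it takes the Euclidean distance (the identity matrix as the norm matrix), uses the standard membership and weighted-mean center updates, starts from evenly spaced samples as a simple deterministic initialization, and splits the membership equally among classes whose center coincides with a sample. The function name `fcm` and its parameters are hypothetical.

```python
def fcm(samples, c, m=2.0, max_iter=100, tol=1e-6):
    """Fuzzy c-means with Euclidean distance.

    samples : list of equal-length lists of floats
    Returns (centers, u), where u[i][j] is the membership of sample j in class i.
    """
    n, s = len(samples), len(samples[0])
    # Initialise the centers with evenly spaced samples (an assumed, simple choice).
    centers = [list(samples[(i * n) // c]) for i in range(c)]
    u = [[0.0] * n for _ in range(c)]
    for _ in range(max_iter):
        # Membership update; the zero-distance case is handled separately.
        for j, x in enumerate(samples):
            d2 = [sum((xk - vk) ** 2 for xk, vk in zip(x, v)) for v in centers]
            zeros = [i for i in range(c) if d2[i] == 0.0]
            for i in range(c):
                if zeros:
                    # Sample sits on a center: share membership among those classes.
                    u[i][j] = 1.0 / len(zeros) if i in zeros else 0.0
                else:
                    # (d_ij / d_kj)^(2/(m-1)) written with squared distances.
                    u[i][j] = 1.0 / sum(
                        (d2[i] / d2[k]) ** (1.0 / (m - 1.0)) for k in range(c)
                    )
        # Center update: mean of the samples weighted by u^m.
        new_centers = []
        for i in range(c):
            w = [u[i][j] ** m for j in range(n)]
            total = sum(w)
            new_centers.append(
                [sum(w[j] * samples[j][k] for j in range(n)) / total for k in range(s)]
            )
        shift = max(
            abs(p - q) for vn, vo in zip(new_centers, centers) for p, q in zip(vn, vo)
        )
        centers = new_centers
        if shift < tol:
            break
    return centers, u

# Toy usage: two well-separated groups of one-dimensional samples.
data = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
centers, u = fcm(data, c=2)
```

Each column of `u` sums to 1, as required by the membership conditions, and the two centers settle close to the middle of each group.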

Forecasting Model
4.1. DBN Network Structure. A DBN [17,18] is built from a number of restricted Boltzmann machine (RBM) models. The visible layer of an RBM acts like an input layer, and the hidden layer acts like an output layer. The network is trained by learning layer by layer through the stacked RBMs. The structure of the RBM model is shown in Figure 1. Every unit of the visible layer is connected to every unit of the hidden layer, whereas units within the same layer are not connected. The hidden units can therefore capture the correlations among the visible units.
The core of the DBN is the restricted Boltzmann machine unit. The RBM is a typical artificial neural network and a special log-linear Markov random field [19]. The RBM model has three parameters: the bias vector $a = (a_1, a_2, \ldots, a_n)$ gives the bias of each node of the visible layer $v$, $b = (b_1, b_2, \ldots, b_m)$ gives the bias of each node of the hidden layer, and $W$ is the weight matrix between the nodes of the two layers. These three parameters determine how the model encodes $n$-dimensional data into $m$-dimensional data, thereby realizing the conversion between features. A DBN consists of a number of RBMs stacked from bottom to top, with a BP neural network as the top layer, as shown in Figure 2. The bottom layer receives the input samples to be trained on. $V_0$ and $H_0$ are the nodes of the visible layer and the hidden layer of the first RBM, respectively, and $W_0$ is the weight between the visible and hidden layers [20].
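The encoding role of the three RBM parameters can be illustrated with a short sketch. It assumes logistic (sigmoid) units, for which the probability that a hidden unit is active is the sigmoid of its bias plus the weighted sum of the visible values; the source does not state the unit type, and the function names are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def rbm_encode(v, W, b):
    """Map n visible values to m hidden activation probabilities.

    W is an n x m list of lists (W[i][j] links visible unit i to hidden unit j),
    and b holds the m hidden biases.
    """
    n, m = len(W), len(b)
    return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(n))) for j in range(m)]

def rbm_decode(h, W, a):
    """Map m hidden values back to n visible activation probabilities."""
    n, m = len(a), len(h)
    return [sigmoid(a[i] + sum(W[i][j] * h[j] for j in range(m))) for i in range(n)]

# Toy usage: encode a 3-dimensional input into 2 dimensions and decode it again.
W = [[0.5, -0.5], [0.0, 1.0], [-1.0, 0.25]]
a = [0.0, 0.0, 0.0]
b = [0.1, -0.1]
hidden = rbm_encode([1.0, 0.0, 1.0], W, b)
visible = rbm_decode(hidden, W, a)
```

The same weight matrix is used in both directions, which reflects the absence of connections within a layer: each layer depends only on the other.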

DBN Training Process
(1) The input samples are fed in at the bottom layer.
(2) The first RBM is trained, and its output is passed to the second RBM for training. Training continues layer by layer until the top RBM has also been trained.
(3) After this training is completed, the training data are used for supervised learning, and the network model is fine-tuned by the maximum likelihood estimation method.
(4) Finally, the BP model is used to fine-tune the model parameters of the top layer so as to minimize the value of the loss function.

The Training Method of Deep Learning
(1) Unsupervised Learning from Bottom Up (Pretraining). The parameters of each layer are trained in turn using unlabeled data. This unsupervised training is the biggest difference from the traditional neural network, and it can also be regarded as a process of feature learning. The first layer is trained with unlabeled data to obtain its parameters. The output of the first layer is then used as the input of the second layer to train the second layer, and so on until the parameters of every layer are obtained.
(2) Top-Down Supervised Learning (Fine-Tuning). After the first step is completed, the network is trained discriminatively using labeled data, and the error is propagated from top to bottom. The first step plays the role that random initialization plays in a traditional neural network. The difference is that the initial parameters are learned from unlabeled data rather than drawn at random. The initial values are therefore closer to the global optimum, and the effectiveness of deep learning is largely due to the pretraining of the first step.
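The bottom-up pretraining step can be sketched as a loop in which each RBM is trained on the output of the layer below it. The source does not name the rule used to train an individual RBM; this sketch assumes one-step contrastive divergence (CD-1) with logistic units, and it leaves out the supervised fine-tuning step. All function names, layer sizes, and hyperparameter defaults are illustrative.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def hidden_probs(v, W, b):
    return [sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b))]

def visible_probs(h, W, a):
    return [sigmoid(a[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(a))]

def train_rbm(data, m, epochs=100, lr=0.1, seed=0):
    """Train one RBM with CD-1 (an assumed rule) and return (W, a, b)."""
    rng = random.Random(seed)
    n = len(data[0])
    W = [[rng.gauss(0.0, 0.01) for _ in range(m)] for _ in range(n)]
    a, b = [0.0] * n, [0.0] * m
    for _ in range(epochs):
        for v0 in data:
            p_h0 = hidden_probs(v0, W, b)
            # Sample binary hidden states, then reconstruct the visible layer.
            h0 = [1.0 if rng.random() < p else 0.0 for p in p_h0]
            v1 = visible_probs(h0, W, a)
            p_h1 = hidden_probs(v1, W, b)
            # Move towards the data statistics and away from the reconstruction.
            for i in range(n):
                for j in range(m):
                    W[i][j] += lr * (v0[i] * p_h0[j] - v1[i] * p_h1[j])
                a[i] += lr * (v0[i] - v1[i])
            for j in range(m):
                b[j] += lr * (p_h0[j] - p_h1[j])
    return W, a, b

def pretrain_stack(data, layer_sizes, **kwargs):
    """Greedy layer-wise pretraining: each RBM learns from the previous layer's output."""
    layers, inputs = [], data
    for m in layer_sizes:
        W, a, b = train_rbm(inputs, m, **kwargs)
        layers.append((W, a, b))
        inputs = [hidden_probs(v, W, b) for v in inputs]
    return layers

# Toy usage: two stacked RBMs on four-dimensional inputs scaled to [0, 1].
toy = [[0.0, 1.0, 1.0, 0.0], [1.0, 0.0, 0.0, 1.0]]
layers = pretrain_stack(toy, [3, 2], epochs=5)
```

The parameters returned for each layer would then serve as the initial values for the supervised fine-tuning described in step (2).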

ESMD and DBN Prediction Model Based on Clustering.
An ESMD and DBN prediction model based on clustering is proposed. Its structure is as follows:
(1) The original sequence is decomposed by ESMD to obtain a finite number of IMFs and a residual.
(2) Fuzzy c-means clustering analysis is performed on each IMF component and on the residual to obtain the pattern of frequency fluctuation.
(3) A DBN model is established for each IMF component and for the residual, and the predicted value of each component is obtained.
(4) The predicted values of the IMFs and the residual are reconstructed to obtain the final prediction result.
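The overall flow of predicting each component separately and then reconstructing can be sketched as follows. The sketch assumes that reconstruction means summing the component predictions, which matches how EMD-type decompositions split a signal into IMFs plus a residual. The clustering step is omitted, and `persistence` is a hypothetical stand-in for a trained DBN, used only so that the example runs.

```python
def persistence(series, horizon=3):
    """Hypothetical stand-in for a trained per-component predictor:
    repeat the last observed value for `horizon` steps."""
    return [series[-1]] * horizon

def predict_by_components(components, predictor):
    """Predict every component on its own, then sum the predictions step by step."""
    predictions = [predictor(component) for component in components]
    horizon = len(predictions[0])
    return [sum(p[t] for p in predictions) for t in range(horizon)]

# Toy usage: two "IMFs" and one "residual" (illustrative numbers only).
components = [[1.0, 2.0], [0.5, 0.25], [10.0, 10.0]]
forecast = predict_by_components(components, persistence)
```

Keeping the predictor as a parameter makes it easy to swap in a different model for each experiment while leaving the reconstruction step unchanged.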

Data Simulation and Analysis
The monthly mean total sunspot number from 1963 to 2012 was used as the original data. There are 600 data points in total, shown in Figure 3. The original data are decomposed by EMD and by ESMD, and the results are shown in Figures 4 and 5, respectively.
In Figure 5, the modal components IMF1∼IMF6 are shown from top to bottom, and the instantaneous frequency of IMF3∼IMF6 is basically stable. These components can be predicted with relatively high accuracy once they have been separated by the decomposition. However, IMF1 and IMF2 are still quite complex compared with the other components: their instantaneous frequency is very high and their nonstationarity is strong. Fuzzy c-means clustering analysis is therefore performed on them, and the results are shown in Figure 6.
The DBN structure contains two hidden layers. The number of neurons is 2 and 12, and the learning rate is 1. The DBN network includes two hidden layers. The number of neurons is 20 and 10, the learning rate is 0.1, the number of training cycles is set to 100, and the momentum is set to 0. After training is completed, each layer of the RBM model obtains initialization parameters, which constitute the simplest ...
The purple line in Figure 8 represents the predicted number of sunspots, and the blue line represents the actual number of sunspots. It can be seen that the clustering ESMD-DBN model proposed in this paper fits the original data well and predicts the number of sunspots well. Figure 9 compares the results of the DBN, EMD-DBN, EEMD-DBN, CEEMD-DBN, and ESMD-DBN prediction models. To make the predicted results easier to distinguish, a local view of them is shown in Figure 10.
To verify the prediction results, the root mean square error (RMSE) and the mean absolute percentage error (MAPE) are used:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{t=1}^{N} \bigl(\hat{x}(t) - x(t)\bigr)^{2}}, \qquad \mathrm{MAPE} = \frac{1}{N} \sum_{t=1}^{N} \left| \frac{\hat{x}(t) - x(t)}{x(t)} \right| \times 100\%,$$

where $N$ is the number of samples, $\hat{x}(t)$ is the $t$th value of the predicted data, and $x(t)$ is the $t$th value of the actual data. The performance of the six models is compared in Table 1. As shown in Table 1, the RMSE and MAPE of the proposed model are smaller than those of the other five models. The proposed model therefore predicts the sunspot number and the trend of the sunspot time series better, and it is an effective prediction model.
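Both error measures are simple to compute. The sketch below assumes the usual definitions: RMSE is the square root of the mean squared difference between predicted and actual values, and MAPE is the mean of the absolute relative errors expressed in percent. The function names are hypothetical, and MAPE is undefined when an actual value is zero.

```python
import math

def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences."""
    n = len(actual)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)

def mape(predicted, actual):
    """Mean absolute percentage error in percent; every actual value must be nonzero."""
    n = len(actual)
    return 100.0 * sum(abs((p - a) / a) for p, a in zip(predicted, actual)) / n

# Toy usage with illustrative numbers.
predicted = [2.0, 4.0]
actual = [1.0, 5.0]
errors = (rmse(predicted, actual), mape(predicted, actual))
```

RMSE is expressed in the units of the data and penalizes large deviations more strongly, whereas MAPE is scale-free, which is why the two are reported together.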

Conclusions
In this paper, a deep learning prediction model based on extreme-point symmetric mode decomposition and clustering analysis is proposed to predict the monthly mean sunspot time series. Compared with the DBN, EMD-DBN, EEMD-DBN, CEEMD-DBN, and ESMD-DBN models, the proposed model has the smallest RMSE and MAPE. The experimental results show that, on the same sunspot time series, the proposed model improves prediction precision and reduces error relative to the other models. With some modification it can also be applied to other fields, so it has high application value.

Figure 9: Predicted results of sunspot numbers for each model.

Figure 10: Local predicted results of sunspot numbers for each model.

Table 1: Performance comparison of the six models.