A New Approach for Chaotic Time Series Prediction Using Recurrent Neural Network

A self-constructing fuzzy neural network (SCFNN) has been successfully used for chaotic time series prediction in the literature. In this paper, we propose adding a recurrent path to each node of the hidden layer of the SCFNN, resulting in a self-constructing recurrent fuzzy neural network (SCRFNN). This novel network does not increase the complexity of the fuzzy inference or the learning process. Specifically, the structure learning is based on partitioning the input space, and the parameter learning is based on the supervised gradient descent method using a delta adaptation law. The network can be applied to chaotic time series prediction, including Logistic and Henon time series. More significantly, it features faster convergence and higher prediction accuracy.


Introduction
A chaotic time series can be expressed as a deterministic dynamical system that, however, is usually unknown or incompletely understood. Therefore, it becomes important to make predictions from the experimental observations of a real system. This technology has been widely studied in many science and engineering fields such as mathematical finance, weather forecasting, and intelligent transport and trajectory forecasting. Early research on nonlinear prediction of chaotic time series can be found in [1]. Most recent work focuses on different methods for improving the prediction performance, such as fuzzy neural networks [2,3], RBF [4,5], recurrent neural networks [6,7], back-propagation recurrent neural networks [8], predictive consensus networks [9,10], and biologically inspired neuronal networks [11,12].
Both fuzzy logic and artificial neural networks are potentially suitable for nonlinear chaotic series prediction as they can perform complex mappings between their input and output spaces. In particular, the self-constructing fuzzy neural network (SCFNN) [13] is capable of constructing a simple network without prior knowledge of the chaotic series. This capability is due to the SCFNN's ability to self-adjust the location of the input space fuzzy partition, so there is no need to estimate the distribution of the series states in advance. Moreover, carefully setting conditions on when new fuzzy rules may be added keeps the architecture of the constructed SCFNN fairly simple. These advantages have motivated researchers to build various chaotic series prediction algorithms by employing an SCFNN structure in, for example, [14,15].
Neural networks have wide applications in various areas; see, for example, [16-21]. In particular, the recurrent neural network (RNN) has proved successful in speech processing and adaptive channel equalization. One of the most important features of an RNN is the feedback paths in its circuit, which give it a sequential rather than a combinational behavior. RNNs have been applied not only to processing time-varying patterns or data sequences but also to handling the dissonance of input patterns, where, because of the feedback paths, the same set of input patterns may generate different outputs. Since an RNN is a highly nonlinear dynamical system that exhibits rich and complex behaviors, it is expected to perform better than traditional signal processing techniques in modeling and predicting chaotic time series. Some works on improving the performance of chaotic series prediction using RNNs can be found in, for example, [6,7].

Mathematical Problems in Engineering
From the above observation, it is a novel idea to combine the SCFNN and RNN techniques for chaotic series prediction, which results in a new architecture called the self-constructing recurrent fuzzy neural network (SCRFNN) in this paper. The structure learning and parameter learning algorithms in SCRFNN are inherited from those in SCFNN, which maintains the simplicity of implementation. Nevertheless, the recurrent path in SCRFNN makes the network function more complex and lets it exhibit richer behaviors. Extensive numerical simulation shows that both SCFNN and SCRFNN are effective in predicting chaotic time series including Logistic series and Henon series. However, the latter has superior performance in convergence rate and prediction accuracy at the cost of a slightly heavier structure (number of hidden nodes) and more fuzzy logic rules.
It is noted that a similar neural network structure has been used for nonlinear channel equalizers in [20]. The purpose of an equalizer in, for example, wireless communication systems, is to recover the transmitted sequence or its delayed version using a trained neural network. The chaotic time series prediction studied in this paper, however, has a completely different mechanism. Chaotic time series prediction is the problem of developing a dynamic model from the observed time series of a nonlinear chaotic system that exhibits deterministic behavior with a known initial condition. A neural network is used to build the chaotic time series prediction after the so-called phase space reconstruction, which is not involved in channel equalizers.
The rest of this paper is organized as follows. In Section 2, the problem of chaotic series prediction is formulated. The structure, the inference output, and the learning algorithms of SCRFNN are described in Section 3. Numerical simulation results on Logistic and Henon chaotic series predictions are presented in Section 4. Finally, the paper is concluded in Section 5.

Background and Preliminaries
The phase space reconstruction theory is commonly used for chaotic time series prediction; see, for example, [22,23]. The main idea is to find a way to reconstruct the phase space from the time series and then conduct prediction in phase space. The theory is briefly revisited in this section, followed by a general neural network prediction model.
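Consider a chaotic time series $\{x(t),\ t = 1, 2, \ldots, N\}$. The reconstructed phase vectors are

$$X_i = [x(i),\ x(i+\tau),\ \ldots,\ x(i+(m-1)\tau)]^{T}, \quad i = 1, 2, \ldots, M.$$

The parameter $\tau$ is the delay, $m$ the embedding dimension, and $M = N - (m-1)\tau$ the number of phase vectors. A prediction model contains an attractor that warps the observed data in the phase space and provides precise information about the dynamics involved. Therefore, it can be used to predict $X_{i+1}$ from $X_i$ in phase space and hence synthesize $x_{i+1}$ using time series reconstruction. The procedure can be summarized as a model with input $X_i$ and output $x_{i+1}$.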
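The reconstruction step can be illustrated with a minimal Python sketch. The function names `delay_embed` and `one_step_pairs` are hypothetical, indices are zero-based, and pairing each phase vector with the sample that follows its last component is an assumption about how the input-output pairs are formed, not a detail confirmed by the text.

```python
def delay_embed(series, m, tau):
    """Return phase vectors [x(i), x(i+tau), ..., x(i+(m-1)*tau)]."""
    count = len(series) - (m - 1) * tau  # M = N - (m - 1) * tau
    if count <= 0:
        raise ValueError("series too short for the requested embedding")
    return [[series[i + k * tau] for k in range(m)] for i in range(count)]


def one_step_pairs(series, m, tau):
    """Pair each phase vector with the sample after its last component (assumed target)."""
    vectors = delay_embed(series, m, tau)
    last = (m - 1) * tau
    # The final phase vector has no successor sample, so it is dropped.
    return [(vectors[i], series[i + last + 1]) for i in range(len(vectors) - 1)]


series = [float(k) for k in range(10)]  # toy data, not a chaotic series
vectors = delay_embed(series, m=3, tau=2)
pairs = one_step_pairs(series, m=3, tau=2)
```

With $N = 10$, $m = 3$, and $\tau = 2$, the sketch yields $M = 10 - 2 \cdot 2 = 6$ phase vectors, in line with the count given above.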
Takens' theorem provides the conditions under which a smooth attractor can be reconstructed from the observations made with a generic function. The theorem states that, if the embedding dimension satisfies $m \geq 2d + 1$, where $d$ is the dimension of the system dynamics, then the phase space constituted by the original system state variables and the dynamic behavior in the one-dimensional sequence of observations are equivalent. They are equivalent because the chaotic attractors in the two spaces are related by a differentiable homeomorphism. The reconstructed system, which includes the evolution information of all state variables, is capable of calculating the future state of the system based on its current state, which provides a basic mechanism for chaotic time series prediction.
As explained before, the prediction model with input $X_i$ and output $x_{i+1}$ provides precise information about the nonlinear dynamics under consideration. A neural network (NN) is an appropriate structure for building a nonlinear model for chaotic time series prediction. Specifically, a typical three-layer NN is discussed below.
When we apply a three-layer NN to predict a chaotic time series, better prediction performance can be achieved if the number of neurons in the input layer is equal to the embedding dimension of the phase space reconstructed from the chaotic time series. Specifically, let the number of neurons in the input layer be $m$, that in the hidden layer be $p$, and that in the output layer be 1. Then, the NN describes a mapping $f : \mathbb{R}^m \to \mathbb{R}^1$.
The input for the nodes of the hidden layer is

$$S_j = \sum_{i=1}^{m} w_{ij} x_i - \theta_j, \quad j = 1, 2, \ldots, p,$$

where $w_{ij}$ is the link weight from the input layer to the hidden layer and $\theta_j$ is the threshold. Assume the Sigmoid transfer function $f(x) = 1/(1+e^{-x})$ is used for the NN. Then the output of the hidden layer nodes is

$$b_j = f(S_j) = \frac{1}{1+e^{-S_j}}, \quad j = 1, 2, \ldots, p.$$

Similarly, the input and the output of the output layer nodes are

$$L = \sum_{j=1}^{p} v_j b_j - \gamma, \qquad y = f(L) = \frac{1}{1+e^{-L}},$$

respectively. Here $v_j$ is the link weight from the hidden layer to the output layer and $\gamma$ is the threshold. In general, the aforementioned link weights $(w_{ij}, v_j)$ and the thresholds $(\theta_j, \gamma)$ of the NN can be randomly initialized, for example, within $[0, 1]$. Then, they can be properly trained such that the resulting NN is capable of precise prediction. The specific training strategy depends on the specific architecture of the NN, which is the main scope of this line of research and will be elaborated in the next section.
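A minimal Python sketch of the forward pass of such a three-layer NN is given below. The names `init_network` and `forward` and the dictionary layout are hypothetical, and subtracting the thresholds from the weighted sums is an assumed sign convention; the weights and thresholds are drawn uniformly from $[0, 1)$ as suggested above.

```python
import math
import random


def sigmoid(x):
    """Sigmoid transfer function f(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def init_network(m, p, seed=0):
    """Randomly initialize weights and thresholds for m inputs and p hidden nodes."""
    rng = random.Random(seed)
    return {
        "w": [[rng.random() for _ in range(m)] for _ in range(p)],  # one row per hidden node
        "theta": [rng.random() for _ in range(p)],                  # hidden thresholds
        "v": [rng.random() for _ in range(p)],                      # hidden-to-output weights
        "gamma": rng.random(),                                      # output threshold
    }


def forward(net, x):
    """Map an m-dimensional input to a scalar output in (0, 1)."""
    hidden = [
        sigmoid(sum(w * x_i for w, x_i in zip(row, x)) - theta_j)
        for row, theta_j in zip(net["w"], net["theta"])
    ]
    out_in = sum(v_j * b_j for v_j, b_j in zip(net["v"], hidden)) - net["gamma"]
    return sigmoid(out_in)


net = init_network(m=2, p=4)
prediction = forward(net, [0.3, 0.7])
```

Because the output node also uses the Sigmoid function, the prediction lies in $(0, 1)$, so in this sketch the target series would need to be scaled into that range.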

Self-Constructing Recurrent Fuzzy Neural Network
Following the general description of the NN prediction model introduced in the previous section, we propose a specific architecture with the features of fuzzy logic, a recurrent path, and self-constructing ability, called a self-constructing recurrent fuzzy neural network (SCRFNN). With these features, the SCRFNN is able to exhibit rapid convergence and high prediction accuracy.

Background of SCRFNN.
The network structure of a traditional fuzzy neural network (FNN) is determined in advance. During the learning process, the structure is fixed and a supervised back-propagation algorithm is applied to adjust the membership function parameters and the weighting coefficients. Such an FNN with a fixed structure usually needs a large number of hidden layer nodes for acceptable performance, which significantly increases the system complexity [24,25].
To overcome the aforementioned drawback of a fixed structure, the SCRFNN used in this paper exploits a two-phase learning algorithm that includes both structure learning and parameter learning. Specifically, the structure of fuzzy rules is determined in the first phase and the coefficients of each rule are tuned in the second one. It is conceptually easy to carry out the two phases sequentially. However, doing so is suitable only for offline operations with a large amount of representative data collected in advance [26]. Moreover, independent realization of structure and parameter learning is time-consuming. These disadvantages can be eliminated in SCRFNN as the two phases of structure learning and parameter learning are conducted concurrently [27].
The main feature of the proposed SCRFNN is the novel recurrent path in the circuit. The schematic diagram of a recurrent neural network (RNN) is shown in Figure 1 [28], where, for example, $x$ and $y_k$, $k = 1, 2, 3$, denote the external input and the unit outputs, respectively. The dynamics of such a structure can be described by, for $k = 1, 2, 3$,

$$y_k(t+1) = f\left(\sum_{l=1}^{m+n} w_{kl}[t]\, z_l(t)\right),$$

where $z(t)$ stacks the $n$ unit outputs and the $m$ external inputs at time $t$, $m$ and $n$ denote the numbers of external inputs and hidden layer units, respectively, $w_{kl}[t]$ is the connection weight from the $l$th unit to the $k$th unit at the time instant $t$, and the activation function $f(\cdot)$ can be any real differentiable function. Clearly, the output of each unit depends on the previous external inputs to the network as well as the previous outputs of all units. The training algorithms for the recurrent parameters of RNNs have been well studied, for example, the real-time recurrent learning (RTRL) algorithm [29].
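The dependence of each unit on the previous inputs and outputs can be sketched in Python as follows. This is an illustration under assumptions: the function name `rnn_step`, the ordering of the stacked vector (unit outputs first, then external inputs), and the default `tanh` activation are all hypothetical choices.

```python
import math


def rnn_step(weights, outputs, inputs, f=math.tanh):
    """Advance a fully connected RNN by one time step.

    weights[k][l] is the connection weight into unit k from entry l of the
    stacked vector z = outputs + inputs (assumed ordering).
    """
    z = list(outputs) + list(inputs)
    return [f(sum(w * z_l for w, z_l in zip(row, z))) for row in weights]


# Three units and one external input, as in the example above.
weights = [
    [1.0, 0.0, 0.0, 2.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
# An identity activation is used here only to make the arithmetic easy to follow.
next_outputs = rnn_step(weights, [1.0, 2.0, 3.0], [0.5], f=lambda s: s)
```

In this toy call the first unit combines its own previous output with the external input, the second unit keeps its previous output, and the third unit copies the external input.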

Structure and Inference Output of SCRFNN.
The schematic diagram of SCRFNN is shown in Figure 2. The fuzzy logic rule and the functions in each layer are briefly described as follows.
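The fuzzy logic rule adopted in the SCRFNN has the following form:

Rule $j$: IF $u_{j1}$ is $A_1^j$ and $\cdots$ and $u_{jn}$ is $A_n^j$, THEN $y = w_j$, with $u_{ji} = x_i + \theta_{ji}\, O_{ji}^{(2)}(t-1)$.

The nodes in Layer 1, called the input nodes, simply pass the input signals to the next layer. There are two input variables $x_1$ and $x_2$ and two corresponding outputs $O_1^{(1)}(t) = x_1$ and $O_2^{(1)}(t) = x_2$ for the problem considered in this paper.

Each node in Layer 2 acts as a linguistic label for the input variables from Layer 1. Let $O_{ji}^{(2)}(t)$ be the output of the $j$th rule associated with the $i$th input node in status $t$ and $\theta_{ji}$'s the recurrent coefficients. Then, the Gaussian membership function is determined in this layer as follows:

$$O_{ji}^{(2)}(t) = \exp\left\{-\frac{\left[O_i^{(1)}(t) + \theta_{ji}\, O_{ji}^{(2)}(t-1) - m_{ji}\right]^2}{\sigma_{ji}^2}\right\},$$

with the mean $m_{ji}$ and the standard deviation $\sigma_{ji}$.

Each node in Layer 3, represented by the product symbol $\Pi$, works as the precondition part of the fuzzy logic rule. Specifically, the output of the $j$th rule node in status $t$ is thus expressed as

$$O_j^{(3)}(t) = \prod_{i} O_{ji}^{(2)}(t).$$

The single node in Layer 4, called the output node, acts as a defuzzifier. As a result, the final output of the SCRFNN is

$$y = O^{(4)}(t) = \sum_{j=1}^{M} w_j\, O_j^{(3)}(t),$$

where the link weight $w_j$ is the output action strength associated with the $j$th rule and $M$ is the number of rules.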
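The four layers can be traced in a short Python sketch. It is an illustration under assumptions, not the authors' implementation: the function name `scrfnn_forward`, the dictionary layout of a rule, and the Gaussian form $\exp(-(u - m)^2/\sigma^2)$ with $u = x_i + \theta\, O^{(2)}(t-1)$ are assumed here.

```python
import math


def scrfnn_forward(x, rules, prev_o2):
    """One inference step; returns output y, firing strengths, and Layer 2 outputs.

    rules: list of dicts with per-input lists "mean", "sigma", "theta" and a scalar "w".
    prev_o2: Layer 2 outputs from the previous step, one list per rule.
    """
    o2, o3 = [], []
    for rule, prev in zip(rules, prev_o2):
        memberships = []
        for x_i, m, s, th, o_prev in zip(x, rule["mean"], rule["sigma"], rule["theta"], prev):
            u = x_i + th * o_prev  # Layer 1 output plus the recurrent feedback term
            memberships.append(math.exp(-((u - m) ** 2) / s ** 2))  # Layer 2
        o2.append(memberships)
        o3.append(math.prod(memberships))  # Layer 3: product of memberships
    y = sum(rule["w"] * f for rule, f in zip(rules, o3))  # Layer 4: weighted sum
    return y, o3, o2


rules = [{"mean": [0.2, 0.5], "sigma": [0.3, 0.3], "theta": [0.0, 0.0], "w": 0.8}]
state = [[0.0, 0.0]]  # no previous Layer 2 outputs before the first pattern
y, firing, state = scrfnn_forward([0.2, 0.5], rules, state)
```

The returned Layer 2 outputs are passed back in as `prev_o2` on the next call, which is how the recurrent path carries information from one pattern to the next in this sketch.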

Online Learning Algorithm for SCRFNN.
As discussed in Section 3.1, a two-phase learning algorithm for structure learning and parameter learning is used for SCRFNN. The initial SCRFNN is composed of the input and output nodes only. The membership and the rule nodes are dynamically generated and adjusted according to the online data by performing the structure and parameter learning processes. These two phases are explained below.
The structure learning algorithm aims to find the proper input space fuzzy partitions and fuzzy logic rules with minimal fuzzy sets and rules. As the initial SCRFNN contains no membership or rule node, the main work in structure learning is to decide whether it is necessary to add a new membership function node in Layer 2 and the associated fuzzy logic rule in Layer 3. The criterion for generating a new fuzzy rule for new incoming data is based on the firing strengths $O_j^{(3)}(t) = \prod_i O_{ji}^{(2)}(t)$, $j = 1, \ldots, M$, where $M$ is the number of existing rules. Specifically, if the maximum degree $\beta_{\max} = \max_{1 \leq j \leq M} O_j^{(3)}(t)$ is not larger than a prespecified threshold parameter $\beta_{\min}$, a new membership function needs to be generated. Also, the mean value and the standard deviation of the new membership function are, respectively, assigned as $m_i^{\text{new}} = x_i$ and $\sigma_i^{\text{new}} = \sigma_i$, where $x_i$ is the new incoming data and $\sigma_i$ is an empirical prespecified constant. The value $\beta_{\min}$ is initially chosen between 0 and 1 and then keeps decaying in order to limit the growing size of the SCRFNN structure.
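The growth criterion can be sketched in Python as follows. The function name `grow_if_needed`, the rule layout, the initial values chosen for the recurrent coefficients and the link weight of a new rule, and the decay factor are all assumptions made for illustration; the text specifies only the threshold test and the new mean and standard deviation.

```python
def grow_if_needed(x, rules, firing, beta_min, sigma_init):
    """Append a rule centered at x when no existing rule fires strongly enough.

    firing holds the firing strengths of the existing rules for the input x.
    Returns True when a rule was added.
    """
    if rules and max(firing) > beta_min:
        return False  # some rule already covers x well enough
    rules.append({
        "mean": list(x),                  # new mean set to the incoming data
        "sigma": [sigma_init] * len(x),   # prespecified constant width
        "theta": [0.0] * len(x),          # assumed initial recurrent coefficients
        "w": 0.0,                         # assumed initial link weight
    })
    return True


rules = []
added_first = grow_if_needed([0.2, 0.5], rules, [], beta_min=0.5, sigma_init=0.3)
added_close = grow_if_needed([0.21, 0.5], rules, [0.9], beta_min=0.5, sigma_init=0.3)
added_far = grow_if_needed([0.9, 0.1], rules, [0.05], beta_min=0.5, sigma_init=0.3)
beta_min = 0.5 * 0.99  # hypothetical decay applied to the threshold over time
```

Lowering the threshold over time makes the test harder to satisfy, so fewer rules are added as training proceeds.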
The parameter learning algorithm aims to minimize a predefined energy function by adaptively adjusting the vector of network parameters based on a given set of input-output pairs. The particular energy function used in the SCRFNN is as follows:

$$E = \frac{1}{2}\left(y_d - O^{(4)}(t)\right)^2,$$

where $y_d$ is the desired output associated with the input pattern and $O^{(4)}(t)$ is the inferred output. The parameter vector is adjusted along the negative gradient of the energy function with respect to the vector. In the four-layer SCRFNN, a back-propagation learning rule is adopted as the gradient vector is calculated in the direction opposite to the data flow, as described below.
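In what follows, $e = y_d - O^{(4)}(t)$ denotes the output error and $u_{ji} = O_i^{(1)}(t) + \theta_{ji}\, O_{ji}^{(2)}(t-1)$ denotes the argument of the Gaussian membership function $O_{ji}^{(2)}(t) = \exp\{-(u_{ji} - m_{ji})^2/\sigma_{ji}^2\}$.

First, the link weight $w_j$ for the output node in Layer 4 is updated along the negative gradient of the energy function; that is,

$$\Delta w_j = -\frac{\partial E}{\partial w_j} = e\, O_j^{(3)}(t).$$

In particular, it is updated according to

$$w_j(k+1) = w_j(k) + \eta_w\, \Delta w_j,$$

where $k$ is the pattern number of the $j$th link and the factor $\eta_w$ is the learning-rate parameter.

Next, in Layer 3, the mean $m_{ji}$ and the standard deviation $\sigma_{ji}$ of the membership functions are updated by

$$m_{ji}(k+1) = m_{ji}(k) + \eta_m\, \Delta m_{ji}, \qquad \sigma_{ji}(k+1) = \sigma_{ji}(k) + \eta_\sigma\, \Delta \sigma_{ji},$$

with the learning-rate parameters $\eta_m$ and $\eta_\sigma$. The terms $\Delta m_{ji}$ and $\Delta \sigma_{ji}$ are also calculated from the gradient of the energy function as follows:

$$\Delta m_{ji} = -\frac{\partial E}{\partial m_{ji}} = e\, w_j\, O_j^{(3)}(t)\, \frac{2\,(u_{ji} - m_{ji})}{\sigma_{ji}^2}, \qquad \Delta \sigma_{ji} = -\frac{\partial E}{\partial \sigma_{ji}} = e\, w_j\, O_j^{(3)}(t)\, \frac{2\,(u_{ji} - m_{ji})^2}{\sigma_{ji}^3}.$$

Finally, the recurrent coefficient $\theta_{ji}$ in Layer 2 is updated by

$$\theta_{ji}(k+1) = \theta_{ji}(k) + \eta_\theta\, \Delta \theta_{ji},$$

where

$$\Delta \theta_{ji} = -\frac{\partial E}{\partial \theta_{ji}} = -e\, w_j\, O_j^{(3)}(t)\, \frac{2\,(u_{ji} - m_{ji})\, O_{ji}^{(2)}(t-1)}{\sigma_{ji}^2}$$

is again the gradient of the energy function, with the previous-state output $O_{ji}^{(2)}(t-1)$ treated as a constant.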
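One update of all four parameter groups can be sketched in Python as below. The sketch assumes the energy function $E = \frac{1}{2}(y_d - O^{(4)})^2$, the Gaussian form $\exp(-(u - m)^2/\sigma^2)$ with $u = x_i + \theta\, O^{(2)}(t-1)$, and that the previous-state output is treated as a constant when differentiating; the names `forward` and `train_step`, the rule layout, and the learning rates are hypothetical.

```python
import math


def forward(x, rules, prev_o2):
    """Return output y, firing strengths o3, and the Gaussian arguments u."""
    o3, u_all = [], []
    for rule, prev in zip(rules, prev_o2):
        u = [x_i + th * o for x_i, th, o in zip(x, rule["theta"], prev)]
        o2 = [
            math.exp(-((u_i - m) ** 2) / s ** 2)
            for u_i, m, s in zip(u, rule["mean"], rule["sigma"])
        ]
        o3.append(math.prod(o2))
        u_all.append(u)
    y = sum(rule["w"] * f for rule, f in zip(rules, o3))
    return y, o3, u_all


def train_step(x, target, rules, prev_o2, lr_w=0.05, lr_m=0.05, lr_s=0.05, lr_t=0.05):
    """Apply one gradient-descent update in place; return the error before the update."""
    y, o3, u_all = forward(x, rules, prev_o2)
    err = target - y
    for rule, f, u, prev in zip(rules, o3, u_all, prev_o2):
        common = err * rule["w"] * f  # shared factor, uses the weight before its update
        for i in range(len(x)):
            diff = u[i] - rule["mean"][i]
            s = rule["sigma"][i]
            d_mean = common * 2.0 * diff / s ** 2
            d_sigma = common * 2.0 * diff ** 2 / s ** 3
            d_theta = -common * 2.0 * diff * prev[i] / s ** 2
            rule["mean"][i] += lr_m * d_mean
            rule["sigma"][i] += lr_s * d_sigma
            rule["theta"][i] += lr_t * d_theta
        rule["w"] += lr_w * err * f  # output weight moves along the negative gradient
    return err


rules = [{"mean": [0.3, 0.4], "sigma": [0.5, 0.5], "theta": [0.1, 0.1], "w": 0.2}]
prev_o2 = [[0.5, 0.5]]
x, target = [0.2, 0.5], 1.0
err_before = train_step(x, target, rules, prev_o2)
err_after = target - forward(x, rules, prev_o2)[0]
```

For this toy pattern a single step with small learning rates reduces the magnitude of the error, which is the behavior expected of a gradient-descent update.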

Simulation Results
In order to evaluate the effectiveness of the proposed SCRFNN, we apply it to two benchmark chaotic time series data sets: Logistic series and Henon series. Each benchmark problem uses 2000 data points. In particular, we use the first 1000 data points for training and the remaining 1000 for validation. It will be shown that both SCFNN and SCRFNN are effective in predicting Logistic series and Henon series. However, the latter has superior performance in convergence rate and prediction accuracy at the cost of a slightly heavier structure (number of hidden nodes) and more rules.

Logistic Chaotic Series.
A Logistic chaotic series is generated by the following equation:

$$x(n+1) = \mu\, x(n)\,[1 - x(n)].$$

In particular, the parameter $\mu$ is restricted within the range of $3.57 < \mu \leq 4$ for chaotic behavior.
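A minimal Python sketch of generating such a series and splitting it into training and validation halves is given below. The function name `logistic_series`, the parameter value $\mu = 4$, and the initial value $0.3$ are assumptions chosen for illustration; the values used in the experiments are not stated here.

```python
def logistic_series(n, mu=4.0, x0=0.3):
    """Generate n points of the Logistic map x(k+1) = mu * x(k) * (1 - x(k))."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(mu * xs[-1] * (1.0 - xs[-1]))
    return xs


data = logistic_series(2000)
train, validation = data[:1000], data[1000:]  # first half for training, rest for validation
```

For $\mu \leq 4$ and an initial value in $[0, 1]$ the iterates stay in $[0, 1]$, so no additional scaling is needed before feeding the series to a network with bounded outputs.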
The training performance for both SCFNN and SCRFNN is demonstrated in Figure 3, where the profiles with 5, 20, and 50 learning cycles are compared in three graphs, respectively. It is observed that the proposed learning algorithm is effective in terms of fast convergence. In all three cases, the SCRFNN converges faster than the SCFNN.
Root mean squared error (RMSE) is the main error metric for describing the training errors. It is explicitly defined as follows:

$$\text{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(\hat{y}_i - y_i\right)^2},$$

where $N$ is the number of trained patterns and $\hat{y}_i$ and $y_i$ are the FNN derived values and the actual chaotic time series data, respectively. Table 1 lists the RMSEs of SCFNN and SCRFNN after 5, 20, and 50 learning cycles. In all three cases, SCRFNN results in smaller RMSEs than SCFNN.
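The metric can be computed with a few lines of Python; the function name `rmse` and the toy values are illustrative only and are not taken from the reported experiments.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equally long sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    total = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(total / len(actual))


perfect = rmse([0.1, 0.5, 0.9], [0.1, 0.5, 0.9])
offset = rmse([0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0])
```

A perfect prediction gives an RMSE of zero, and a constant offset of one unit gives an RMSE of one, which makes the metric easy to read in the units of the series itself.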
For the well-trained SCFNN and SCRFNN, the prediction performance is compared and shown in Figures 4 and 5. It is observed in Figure 4 that the predicted outputs from the SCFNN and the SCRFNN match the real data well. More specifically, the prediction errors are smaller in the SCRFNN than in the SCFNN, as shown in Figure 5. Finally, Table 2 lists the numbers of hidden nodes in SCFNN and SCRFNN after 5, 20, and 50 learning cycles. The SCRFNN has slightly more hidden nodes in all three cases. This heavier structure of SCRFNN, together with the extra rules for the recurrent path, is the cost of the aforementioned performance improvement.

Henon Chaotic Series.
Henon mapping was proposed by the French astronomer Michel Henon for studying globular clusters [30]. It is one of the most famous simple dynamical systems with wide applications. In recent years, Henon mapping has been well studied in chaos theory. Its dynamics are given as follows:

$$x(n+1) = 1 - a\, x^2(n) + y(n), \qquad y(n+1) = b\, x(n).$$

For example, chaos is produced with $a = 1.4$ and $b = 0.3$.

A similar simulation is conducted for the Henon series, with the corresponding results shown in Figure 6 for the comparison of convergence rates in 5, 20, and 50 learning cycles, in Figure 7 for the comparison of prediction performance, and in Figure 8 for the prediction errors. RMSEs and the numbers of hidden nodes for SCFNN and SCRFNN are also listed in Tables 1 and 2, respectively.
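The Henon series can be generated with the following Python sketch, which assumes the standard form of the map, $x(n+1) = 1 - a x^2(n) + y(n)$ and $y(n+1) = b x(n)$. The function name `henon_series` and the initial state $(0, 0)$ are assumptions; only the $x$ component is kept as the scalar time series.

```python
def henon_series(n, a=1.4, b=0.3, x0=0.0, y0=0.0):
    """Generate n points of the x component of the Henon map."""
    xs = []
    x, y = x0, y0
    for _ in range(n):
        # Both updates use the previous x, so they are assigned together.
        x, y = 1.0 - a * x * x + y, b * x
        xs.append(x)
    return xs


data = henon_series(2000)
train, validation = data[:1000], data[1000:]  # same split as for the Logistic series
```

Unlike the Logistic series, the Henon iterates take negative values, so a network whose output is confined to $(0, 1)$ would need the series rescaled first.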

Conclusions
A novel type of network architecture called SCRFNN has been proposed in this paper for chaotic time series prediction. It inherits the practically implementable algorithms, the self-constructing ability, and the fuzzy logic rule from the existing SCFNN. In addition, it introduces a new recurrent path in each node of the hidden layer of SCFNN. Two numerical studies have demonstrated that SCRFNN outperforms the existing SCFNN in convergence rate and prediction accuracy, at the cost of a slightly heavier structure (number of hidden nodes) and extra rules for the recurrent path. Hardware implementation of the proposed SCRFNN will be interesting for future research.

Figure 4: Logistic series and the predictions with SCFNN and SCRFNN.

Figure 7: Henon series and the predictions with SCFNN and SCRFNN.
In the fuzzy logic rule of the SCRFNN, $A_i^j$ is the membership function, $x_i$'s and $y$ denote the input and output variables, respectively, and $\theta_{ji}$ and $O_{ji}^{(2)}(t-1)$ represent the recurrent coefficient and the last state output of the $j$th term associated with the $i$th input variable.