Time Series Prediction Based on Complex-Valued S-System Model

Symbolic regression has been utilized to infer mathematical formulas in order to solve the complex prediction and classiﬁcation problems. In this paper, complex-valued S-system model (CVSS) is proposed to predict real-valued time series data. In a CVSS model, input variables and rate constants are complex-valued. The time series data need to be translated into complex numbers. The hybrid evolutionary algorithm based on complex-valued restricted additive tree and ﬁreﬂy algorithm is proposed to search the optimal CVSS model. Three ﬁnancial time series data and Mackey–Glass chaos time series are collected to evaluate our proposed method. The experiment results show that the predicted data are very close to the target ones and our method could obtain the better RMSE, MAP, MAPE, POCID, R 2 , and ARV performances than ARIMA, radial basis function neural network (RBFNN), ﬂexible neural tree (FNT), ordinary diﬀerential equation (ODE), and S-system.


Introduction
One of the main purposes of time series analysis is to predict the future data based on the existing historical one. Its idea is to search a model or function, in which the past values are set as inputs and the future values are utilized as outputs [1][2][3][4]. Time series exist in almost all fields of natural science and social science, so researches of time series analysis methods are of great significance for prediction, control, and diagnosis of practical issues [5][6][7]. However, due to the nonlinear characteristics of time series such as noise, irregularity, randomness, and chaos in practical application, the traditional time series prediction methods, such as exponential smoothing method and autoregressive integrated moving average (ARIMA) model, are not effective in solving such problems [8,9]. us, time series prediction problem has been considered to be a difficult job.
Because the artificial neural network (ANN) has good abilities of learning, generalization, and error tolerance, many ANN models have been utilized widely to capture nonlinear characteristics for time series prediction problems in the past decades, such as radial basis function neural network (RBFNN) [10], Elman neural network [11], wavelet process neural network [12], recurrent predictor neural network (RPNN) [13], beta basis function interval type-2 fuzzy neural network [14], flexible neural tree (FNT) [15], functional link network [16], and deep neural networks [17,18]. However, most of the ANN models are black boxes, in which the relationships between input variables and output variables are not explained and understood easily. Especially for some practical problems, internal mechanism could not be understood from the models obtained, which may lead to restrict the problems to being solved.
Recently, various symbolic regression (SR) methods are being utilized to solve time series prediction problems. Compared with ANN models, with the given sets of independent variables and functions, SR methods could search the hidden mathematical formulas, computer programs, and logical expressions, which could explain well the internal mechanisms of the practical problems [19][20][21][22]. Johari et al. presented a genetic programming (GP) approach to predict the soil-water characteristic curves of soils according to terminal and function sets [23]. Azzawi et al. proposed gene expression programming (GEP) and gene extracted approaches to predict lung cancer from biology data. Mahmoodi et al. inferred mathematical equations based on GEP and computational fluid dynamics data in order to predict the marine propeller data [24]. Zhang et al. proposed an improved multiexpression programming (MEP) to predict 28-day cement compressive strength, which performed better than MEP, neural network, and fuzzy logical [25]. We have utilized an ordinary differential equation (ODE) model to forecast the small-scale traffic data and model stock index, and the results shown that the ODE model was more feasible and efficient for predicting time series data [26]. Zhang et al. utilized an S-system model for Shanghai stock exchange composite index prediction and the experiment results revealed that the S-system has the better performance than an ODE model [27].
In order to improve the prediction accuracy of time series and explain the internal mechanisms of the practical problems future, a novel SR method based on complexvalued S-system (CVSS) and complex-valued hybrid evolutionary algorithm are proposed in this paper. A complex number could broaden the dimension of solution, which improves the abilities of modeling and generalization, so complex-valued methods have shown great potential for forecasting time series data [28][29][30][31][32][33][34][35]. us, this paper proposes the complex-valued version of S-system (CVSS) to solve the time series prediction problem. CVSS could contain a complex-valued structure and coefficients, which can enhance the prediction ability of the S-system. e representation of a complex-valued restricted additive tree (CVRAT) is very close to the form of a CVSS model, so CVRAT is utilized to search the optimal structure of a CVSS model. In order to improve the optimization ability of firefly algorithm (FA), complex-valued firefly algorithm (CVFA) is presented to optimize the complex-valued and real-valued parameters of CVSS with faster convergence and ability and more population diversity. e novelty in our approach is to predict real time series data by a novel complex-valued model and evolutionary algorithms. ree real time series datasets collected from the Shanghai stock exchange composite index, NASDAQ index and exchange rate of Hong Kong dollar to renminbi, and Mackey-Glass chaos time series are utilized to test the performance of our method. e experiment results prove that our proposed method is superior to classical statistical method, neural network, and real-valued differential equation models, and our method could also obtain the clear and well-understood mathematical formulas.

Complex-Valued S-System Model.
e complex-valued S-system (CVSS) model is the complex-valued version of S-system, whose coefficients and functions are complexvalued. Suppose that the time series contain N variables. e i − th complex-valued ordinary differential equation model in the CVSS model is described as follows: where X i and X j are the i − th and j − th complex-valued input variables, respectively; α i and β i are complex-valued rate constants of the i − th variables; and h ij and g ij are realvalued kinetic orders.

Structure Optimization.
A real-valued restricted additive tree model (RVRAT) was proposed to optimize the structure of a real-valued S-system model [36,37]. In order to search the optimal CVSS model, the complex-valued restricted additive tree (CVRAT) is proposed. In CVRAT algorithm, the root node is set to subtraction (− ). Other nodes are created by variable (V) and function (F) sets, which are described as follows: where × n is the product of n complex-valued input variables. An example of the complex-valued restricted additive tree model is depicted in Figure 1. In order to represent the parameters of the CVSS model, a real-valued parameter (h ij or g ij ) is assigned to each variable node and a complexvalued parameter (α i or β i ) is assigned to each branch of the root node. e corresponding CVSS model is In order to search the optimal CVSS model, genetic operators, such as selection, crossover, and mutation, are utilized. e selection operation is consistent with standard genetic programming. e greater the fitness values of individuals, the greater the probability that the individuals are selected to the next generation. In the crossover operation, two CVSS models are randomly selected, and two subtrees are selected to be crossed, which are shown in Figure 2. In the mutation operation, a mutation node is randomly selected. Another node is created randomly and utilized to replace the selected node, which are shown in Figure 3.

Parameter Optimization.
Complex-valued firefly algorithm (CFA) is the complex-valued variant of firefly algorithm, in which each firefly represents a complex-valued solution containing two parts: real part and imaginary part [38]. e multiplication of individual dimension could increase the information capacity and diversity of population. In this paper, CFA is utilized to evolve the parameters of CVSS.
In CFA, complex-valued firefly can find its collaborators and move to the place of the better firefly according to brightness property. e fireflies with low brightness are attracted by the fireflies with high brightness. For each individual, the attraction and luminance of other individuals vary according to distance [39]. e flowchart of parameters optimization of CVSS with CFA is given in detailed as followed: (1) In the CVSS model, rate constants (α i and β i ) are complex-valued and kinetic orders (h ij and g ij ) are real-valued. us, the parameter vector of CVSS contains complex-valued and real-valued parameters. According to the given CVSS model, add up the number of rate constants R and kinetic orders K.

Complexity
Initialize the m complex-valued firefly individuals . . , f m ] are calculated. e firefly of CFA is complex-valued, so complex-valued kinetic orders need to be converted into real-valued ones. Complex-valued kinetic order Z k is converted as follows [40]: where ρ k is the modulus and a k and b k are the minimum and maximum values, respectively. If the optimal solution is obtained, CFA is stopped. e real and imaginary parts of brightness and attractiveness of the i − th firefly are calculated.
where B i0 denotes the maximum brightness of the i − th firefly which is equal to the fitness value f i of the i − th firefly; c is a coefficient of light absorption; B i,R and B i,I are the real part and imaginary part of the brightness of firefly i, respectively; and r ij,R and r ij,I are the real part and imaginary part of the distance between firefly i and firefly j, respectively. (4) For each firefly, search the most attractive firefly around it and update its position containing real and imaginary parts.
where Z i,R (t + 1) and Z i,I (t + 1) are the real part and imaginary part of firefly i at the t + 1 − th time point, Figure 2: e crossover operator. respectively; Z i,R (t) and Z i,I (t) are the real part and imaginary part of firefly i at the t − th time point, respectively; ε i,R and ε i,I are the real part and imaginary part of complex-valued Gaussian random number, respectively; and α is a step size randomly created from the interval [0,1]. en, Go to step (2).

Time Series Prediction Based on Complex-Valued S-System
Model. e complex-valued S-system model is utilized to solve the time series prediction problem. In the prediction process, real-valued time series are firstly translated into complex-valued data with Algorithm 1. Complex-valued time series data are utilized to search the optimal CVSS model, which predicts the complex-valued outputs. Finally, complex-valued outputs are translated into real-valued data in order to evaluate the prediction performance of the model with Algorithm 2. Suppose that time series data [D 1 , D 2 , . . . , D m ] include m variables and With complex-valued time series data, our proposed hybrid evolutionary algorithm is utilized to evolve the complex-valued structure and parameters of the CVSS model. e overall optimization process is described as follows: (1) Define the parameters in the algorithm and initialize N CVSS models. (2) e fitness value of each CVSS model could be calculated according to Algorithm 3.
(3) RVRAT is applied to search the optimal structure of the CVSS model, which is introduced in Section 2.2. At some iterations, CFA is utilized to search the optimal complex-valued parameters of the CVSS model, whose structure is fixed.
(4) If the satisfied condition is achieved, the optimal process stops; otherwise, go to step 2. --

Data and Evaluation Standard.
ree real time series datasets are collected daily to test the performance of the complex-valued S-system model, which include the Shanghai stock exchange composite index (SSEI), NASDAQ index (NASI) and exchange rate of Hong Kong dollar to renminbi (RHKRMB). e data of the past seven days (x 1 , x 2 , . . . , x 7 ) are utilized to predict the current value (y).
Mackey-Glass chaos time series are also utilized to evaluate our method. RMSE (root mean square error), MAP (mean absolute percentage), MAPE (mean absolute percentage error), R 2 (coefficient of multiple determinations for multiple regressions), ARV (average relative variance), and POCID (prediction of change in direction) are utilized to evaluate the performance of time series prediction models [16]. Suppose that time series dataset D contains m time points. e six criterions are defined as follows: Input: real-valued input vector [g 1 , g 2 , . . . , g m ] (m is the number of sample points) Output: complex-valued output vector [z 1 , z 2 , . . . , z m ] Count the maximum and minimum values of the input vector (g max and g min ); ; zo � z t + h × ((k 1 + 2k 2 + 2k 3 + k 4 )/6); ro t ⟵ convert zo into real value with Algorithm 2; end f k � F(r o , Z); ALGORITHM 3: Pseudocode of calculating the fitness of the k − th CVSS model.
where y t k is the target value at k − th time point and y f k is the forecasting value at k − th time point. f is the mean of dataset D.
In our experiment, ARIMA model [41], radial basis function neural network (RBFNN) [42], FNT, ordinary differential equation (ODE) [26], and S-system model [27] are also utilized to predict three real time series datasets and Mackey-Glass chaos time series. ARIMA is the most common statistic model used for time series prediction. Real-valued firefly algorithm is utilized to optimize the parameters of RBFNN, FNT, ODE, and S-system, which are the same as the CVSS model. e parameters of the experiment are set consistently, which are chosen based on the experience and references [36][37][38], and listed in Table 1.

Shanghai Stock Exchange Composite Index.
is section collects the Shanghai stock exchange composite index (SSEI) from 1 January 1996 to 29 December 2017. 70% of the data are utilized for training, which contains 3866 time points, and the rest of the data are utilized for testing.
rough several runs, the optimal CVSS model is obtained as follows: From equation (7), it could be seen that our method selects two important features (x 3 and x 7 ), which reveal that the data of the past third and seven days may play the most important role in predicting the Shanghai stock exchange composite index. e predicted results of 6 methods are depicted in Figure 4, which contains the predicted data and predicted errors. From the results, it could be seen that the predicted data of CVSS, ODE, and S-system are very close to the target ones and the predicted errors are very small and close to zero. e prediction errors of FNT, RBFNN, and ARIMA are relatively large. e predicted performances of 6 methods are listed in Table 2. From Table 2, we can see that differential equation models (ODE, S-system, and CVSS) have smaller RMSE, MAP, ARV, and MAPE and higher R 2 and POCID than the neural network model (FNT and RBFNN) and classical time series prediction method (ARIMA). Among three kinds of differential equation models, the CVSS model has better performance than real-valued differential equation models except for POCID. In terms of POCID, S-system performs best, but there is little difference among these three methods, which reveals that our method could predict directional tendency of data fluctuation accurately.

NASDAQ Index.
is section collects the NASDAQ index (NASI) from 2 January 1990 to 29 December 2017. 70% of the data are utilized for training, which contains 4936 time points, and the rest of the data are utilized for testing. rough several runs, the optimal CVSS model is obtained as follows: In equation (8), our method selects three important features (x 3 , x 4 , and x 6 ) automatically. e stock indexes of the past third, fourth, and sixth days may play the important role in predicting the current one. e predicted NASI data of 6 methods are depicted in Figure 5, which also contains the predicted errors. For CVSS, the curves of actual data and predicted data are almost coincident. e predicted errors distribute mainly around zero. From the prediction results, we can see that our proposed method could predict the NASI dataset more accurately.
e predicted performances of 6 methods are listed in Table 3. In terms of RMSE, CVSS is 31.7% smaller than S-system, 33.07% smaller than ODE, 61.65% smaller than

(9)
From equation (9), we could see that our proposed method selects three important features (x 1 , x 2 , and x 4 ), which show that the data of the past first, second, and fourth days help to forecast the current exchange rate accurately. e predicted data and errors by 6 models are depicted in Figure 6, which show that the predicted errors by CVSS are very small and the predicted curve is almost the same as the real one. Other models could gain the larger prediction errors. e predicted performances of 6 methods are listed in Table 4. For RMSE, ARV, MAPE, R 2 , and POCID, the CVSS model performs best among 6 methods. RBFNN has the lowest MAP. CVSS has worse MAP performance, which reveals that the predicted errors by CVSS may be very large at some time points. In terms of RMSE, CVSS is 6.42% smaller than S-system, 19.59% smaller than ODE, 64.42% smaller than FNT, 73.87% smaller than RBFNN, and 75.88% smaller than ARIMA. In terms of ARV, CVSS is 12.11% smaller than S-system, 43.52% smaller than ODE, 89.61% smaller than FNT, 94.48% smaller than RBFNN, and 95.29% smaller than ARIMA, which show that our method has the stronger generalization ability. In terms of MAPE, CVSS is 5.75% smaller than S-system, 24.54% smaller than ODE, 87.54% smaller than FNT, 93.02% smaller than RBFNN, and 93.05% smaller than ARIMA. In terms of R 2 , CVSS is 0.048% higher than S-system, 0.269% higher than ODE, 3.0% higher than FNT, 6.34% higher than RBFNN, and 7.59% higher than ARIMA. In terms of POCID, CVSS is 0.022% higher than S-system, 1.11% higher than ODE, 1.99% higher than FNT, 4.87% higher than RBFNN, and 22.26% higher than ARIMA, which prove that CVSS could predict directional tendency of data fluctuation of exchange rate more accurately.

Mackey-Glass Chaos Time Series. Since Mackey and
Glass first discovered the chaotic phenomena in time-delay systems, time-delay chaotic systems have attracted wide attention and are often utilized to test the performances of nonlinear system models. e chaos time series could be generated by the time-delayed differential equation [43,44]: e variable vector [x(t − 18), x(t − 12), x(t − 6), x(t)] is utilized to forecast variable x(t + 6). 500 samples are utilized to search the optimal structures and parameters of the CVSS model and the rest of the 500 samples are utilized as testing data to verify the validity of the CVSS model. e  8 Complexity predicted Mackey-Glass chaos time series by 5 methods are depicted in Figure 7, which also displays the predicted errors. e curves of actual data and predicted data obtained by CVSS are almost the same. e predicted errors of CVSS are smaller than S-system, ODE, FNT, and RBFNN. us, the CVSS model could predict chaos time series accurately. e predicted performances of RBFNN, FNT, ODE, S-system, and CVSS are listed in Table 5. From Table 5, it could be clearly seen that the CVSS model has the best performances in terms of RMSE, Map, ARV, MAPE, POCID, and R 2 .
According to Tables 2-5, we summarize the performances of 6 models in three aspects: fitting the data, generalization ability, and prediction effect. 6 models are divided into three levels: strong, medium, and week. e analysis results are listed in Table 6.

Statistical Significance Test.
In order to measure the differences of six forecasting models, the Friedman test is utilized. e null hypothesis of the Friedman test is that all the prediction models are equivalent, so these models have the same ranks. In order to overturn this null hypothesis, Friedman statistic χ 2 F and F-distribution F F are defined as follows [16]: where N is the number of the criterions; k is the number of the prediction models; and R i is the sum of the averaged ranks of N criterions for the i − th model. If χ 2 F is larger than F F , the null hypothesis could be rejected.
According to the criterions of six models for four time series prediction problems, six models are sorted and the averaged ranks are given, which are listed in Table 7. k and N are set as 6. According to Table 7, Friedman statistic χ 2 F and F-distribution F F are calculated as 22.67 and 15.46, respectively. χ 2 F is larger than F F , so the null hypothesis is rejected, which reveals that there is the significant difference among six prediction models.
In order to find out the significant difference between a pair of models, the Nemenyi post-hoc test is utilized. e null hypothesis of the Nemenyi test is that there is no significant difference between a pair of models. If there is no significant difference, the null hypothesis will be rejected. In the Nemenyi test, critical difference (CD) is calculated to judge whether the significant difference exits, which is defined as follows: where q α is a critical value. When α is selected as 0.01, q α is found to be 0.4643. en, the CD could be calculated as 0.502. If the difference of the average rank values of two models exceeds the CD value, the null hypothesis is rejected with corresponding confidence. According to Table 7, the results of the Nemenyi test by the pairwise comparisons of six models are depicted in Figure 8. If the difference of the average rank values of two models is smaller than CD (0.502), the responding place is set as white box, which  reveals that two models have no significant difference. e gray value of the box indicates the difference between two models, which is also shown in Figure 8. From Figure 8, it could be seen that there is no significant difference between ODE and S-system, while other pairs of models satisfy the Nemenyi test for significant difference.

Effect of the Number of Training Sets.
In order to test the effect of increasing the training for our proposed method, two datasets (SSEI and NASDAQ index) are chosen. e numbers of training sets are set to 500, 1000, 1500, and 2000, respectively, while the number of testing sets is set as 500. e RMSE performances of our method with different training sets for SSEI and NASDAQ prediction problems are listed in Table 8. From Table 8, we could see that with the increase of training sets our method could gain the smaller and more stable prediction RMSE values. When the number of training sets reaches a certain bound, the increase of the training set could improve the prediction ability of our model slightly.

Conclusion
Efficient and accurate time series prediction has been considered to be a difficult job. In this paper, we have presented complex-valued version of the S-system model to predict real-valued time series data. To search the optimal CVSS model, time series data need to be converted into complex data firstly. Complex-valued versions of the restricted additive tree and firefly algorithm are proposed to optimize the CVSS model. Stock index data from SSEI and NASI, exchange rate of Hong Kong dollar to renminbi, and Mackey-Glass chaos time series are collected to evaluate our proposed method. e forecasting capability of the CVSS model is compared with ARIMA, RBFNN, FNT, ODE, and S-system. e experiment results show that the predicted curves of the CVSS model are very closer to the target ones than other models. For SSEI dataset prediction, CVSS has the better performance than S-system, ODE, FNT, RBFNN, and ARIMA except for POCID. For NASI dataset prediction, compared with other five methods, CVSS could improve 0.103%-99.195% prediction performances in terms of RMSE, MAP, ARV, MAPE, R 2 , and POCID. For RHKRMB dataset prediction, our method could improve 0.022%-95.29% prediction performances in terms of RMSE, ARV, MAPE, R 2 , and POCID. Our method gains worse MAP performance, which reveals that the predicted errors by CVSS may be very large at some time points. For Mackey-Glass chaos time series prediction, CVSS could make 1.02%-90.74% prediction improvements in terms of six criterions. e reasons that our method performs better are analyzed as follows. (1) A complex-valued parameter of the CVSS model contains two real-valued variables (real and imaginary parts), so the CVSS model contains twice more parameters than S-system. CVSS also has complex-valued functions. Complex-valued functions and parameters make the CVSS model to have stronger modeling and generalization abilities. (2) Complex-valued firefly algorithm has more population diversity and faster convergence speed than real-valued firefly algorithm and could search the optimal solution of the model faster. (3) e representation of CVRAT is very close to the form of the CVSS model, so CVRAT could search the optimal structure of the CVSS model faster.

Data Availability
e data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
e authors declare no conflicts of interest.