A Novel Approach for Nonstationary Time Series Analysis with Time-Invariant Correlation Coefficient

We concentrate on the modeling and analysis of a class of nonstationary time series, called correlation coefficient stationary series, which commonly exists in practical engineering. First, the concept and scope of correlation coefficient stationary series are discussed to give a better understanding. Second, a theorem is proposed to determine the standard deviation function for correlation coefficient stationary series. Third, we propose a moving multiple-point average method to determine the function forms for the mean and standard deviation, which can help to improve the analysis precision, especially in the context of limited sample size. Fourth, the conditional likelihood approach is utilized to estimate the model parameters. In addition, we discuss the correlation coefficient stationarity test method, which can contribute to the verification of modeling validity. A Monte Carlo simulation study verifies the theorem and the validity of the established method. An empirical study shows that the approach can satisfactorily explain the nonstationary behavior of many practical data sets, including stock returns, maximum power load, China money supply, and foreign currency exchange rate. The effectiveness of these processes is assessed through forecasting performance.


Introduction
Time series methods have been generally accepted as one of the most important means in an increasing number of real-world applications including finance. In the past several decades, considerable efforts have been made for time series analysis and prediction [1][2][3]. Time series approaches [4], regression models [5], artificial intelligence methods [6], and Grey theory [7] are the commonly used techniques [8]. Many analyses are based on the assumption that the probabilistic properties of the underlying process are time invariant; that is, the series to be analyzed is covariance stationary. When modeling such a stationary time series, one frequently chooses time series methods because of their high performance and robustness; these mainly include autoregressive (AR), moving average (MA), autoregressive moving average (ARMA), autoregressive integrated moving average (ARIMA), and Box-Jenkins models.
Although the stationarity assumption is very useful for the construction of simple models, it does not seem to be the best strategy in practice, and such assumptions are often questionable [9], because time series with time-varying means and variances are commonly seen in economic forecasting [10], fault diagnosis [11], quality control [12], signal processing [13], performance testing [14], automatic control [15], biopharmaceuticals [16], and other fields. When a heteroscedastic time series is processed by an existing covariance stationary time series analysis method, the model parameters lose the minimum variance property, and the variance estimator is no longer unbiased [17]. With time series approaches and regression analysis, reasonable analysis and accurate prediction cannot be achieved for nonstationary time series. With artificial intelligence, such as expert systems and neural networks, abundant prediction rules and practical experience from specific experts and large historical data banks are required for precise forecasts. Although the Grey prediction model has been successfully applied in various fields and has demonstrated satisfactory results, its prediction performance could still be improved. The reason is that the Grey forecasting model is constructed from an exponential function, and hence it may have worse prediction precision in the case of more random data sets.
Meanwhile, great progress has been achieved in process monitoring in industrial fields. To solve the multimode problem, which arises in industrial processes because of multiple production patterns in the same production line, various methods have been constructed, including partial least squares methods [18], model library-based methods [19], the Gaussian mixture model [20], the localized Fisher discriminant analysis approach [21], and the recent independent component analysis (ICA) based statistical processing methods [22,23]. Industrial process monitoring is of significant importance in the literature. This study, however, focuses on the statistical method based on time series analysis.
In 1982, Engle [24] proposed the concept of conditional heteroscedasticity, with which he solved the conditional heteroscedasticity estimation problem for time series with constant unconditional variance. The proposed theory has been widely applied in financial risk evaluation. For this significant contribution, Engle was awarded the 2003 Nobel Prize in Economics. However, the analysis problem for time series with unconditional time-varying variance still exists, and such series are commonly seen in applications [25,26]. In addition, some hybrid models are also seen in the literature [27,28], which combine dissimilar models or models that disagree with each other strongly to lower the generalization variance or error. Although hybrid models have shown advantages in some circumstances, there is no denying that they are much more complicated to apply.
A simple nonstationary model contains a second-order stationary process modulated by a deterministic time-varying mean and a deterministic unconditional time-varying variance [29]. Let $Y_t$, $t = 1, 2, 3, \ldots$, be a stationary process with zero mean; a simple nonstationary model can then be given by $X_t = \mu(t) + \sigma(t) Y_t$, where $\mu(t)$ is the deterministic time-varying mean function and $\sigma(t)$ is the deterministic unconditional time-varying standard deviation function, which is strictly positive. The nonstationarity of $X_t$ is thus expressed by its evolving mean and unconditional variance. The efficient analysis of this nonstationary process remains a substantial difficulty in practice and has gradually attracted great attention from researchers.
Based on a systematic study of a large body of measured data, Fu and Liu [30] found that some common characteristics are shared by certain nonstationary time series. Like general nonstationary ones, these series exhibit time-varying mean $\mu(t)$ and variance $\sigma^2(t)$, and their autocovariance function $\gamma(t, t+k) = \mathrm{Cov}(X_t, X_{t+k}) = E\{[X_t - \mu(t)][X_{t+k} - \mu(t+k)]\}$ is no longer a univariate function of the time interval $k$, that is, $\gamma(t, t+k) \neq \gamma(k)$, while their correlation coefficient function $\rho(t, t+k) = \mathrm{Cov}(X_t, X_{t+k})/[\sigma(t)\sigma(t+k)]$ is still a univariate function of the time interval $k$, that is, $\rho(t, t+k) = \rho_k$. Accordingly, we can conclude that (i) they are not covariance stationary time series [31], whose autocovariance function $\gamma(t, t+k)$ is a univariate function of the time interval $k$, that is, $\gamma(t, t+k) = \gamma(k)$; (ii) they are a certain class of nonstationary time series, different from other nonstationary time series whose correlation coefficient function $\rho(t, t+k)$ varies with time $t$. On this basis, Fu and Liu [30] proposed the concept of the "correlation coefficient stationary process" and discussed the establishment of the correlation coefficient autoregressive moving average (CCARMA) model.
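The distinction between a time-varying autocovariance and a time-invariant correlation coefficient can be checked numerically. The following is a minimal sketch, assuming the modulated form $X_t = \mu(t) + \sigma(t) Y_t$ with $Y_t$ a unit-variance AR(1) process whose lag-$k$ autocorrelation is $\phi^k$; the function names, the linear `sigma`, and the value of `phi` are illustrative assumptions, not taken from the paper.

```python
import math

def sigma(t):
    # Hypothetical slowly varying standard deviation (illustrative choice).
    return 1.0 + 0.01 * t

def autocov(t, k, phi=0.6):
    # Cov(X_t, X_{t+k}) when X_t = mu(t) + sigma(t) * Y_t and Y_t is a
    # unit-variance AR(1) process with lag-k autocorrelation phi**k.
    return sigma(t) * sigma(t + k) * phi ** k

def autocorr(t, k, phi=0.6):
    # Correlation coefficient: the sigma factors cancel, leaving phi**k.
    return autocov(t, k, phi) / (sigma(t) * sigma(t + k))

for t in (1, 50, 100):
    # The autocovariance changes with t; the correlation coefficient does not.
    print(t, round(autocov(t, 2), 4), round(autocorr(t, 2), 4))
```

Under these assumptions the lag-2 autocovariance grows with $t$ while the lag-2 correlation coefficient stays at $\phi^2$ for every $t$.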
In this paper, we further study the nonstationary behavior of this correlation coefficient stationary series. First, the characteristics of the variance function are studied further, and a rigorous theorem is proposed, which helps not only in determining the standard deviation but also in verifying the modeling process. Second, a rolling window determination scheme named the moving multiple-point average method is established to obtain the mean and standard deviation functions. This technique can enhance the accuracy for a given sample size, and the effect is more pronounced when the sample size is limited. Third, we study the scope of the correlation coefficient stationary process; this discussion helps to better understand the concept. Finally, the correlation coefficient stationarity test method is investigated, which can assess the validity of the modeling process and make the modeling process a closed-loop system.
In the next section, the concept of the CCARMA process and its basic properties are introduced; we also discuss the CCARMA model. In Section 3, we develop a method for determining the function forms for the mean and standard deviation. Section 4 establishes the parameter determination method and a correlation coefficient stationarity test method. Section 5 presents simulation studies to assess the validity of the approach. Section 6 is devoted to the practical evaluation of the proposed method on several data sets, including daily returns to the Shanghai composite index, Guangxi monthly maximum power load, China monthly money supply, and the daily foreign exchange (FX) rate EUR/USD. A comparison between our forecasting results and ARIMA, variable differential, GARCH, GM(1, 1), and Modified GM(1, 1) models is also provided in this section. Finally, we conclude this paper with a discussion in Section 7.

The Gaussian CCARMA($p$, $q$) model can be denoted by

Concept of CCARMA Process
Suppose that $Y_t$ is a covariance stationary series and $\mu(t)$ is a deterministic function; then we can conclude that the series $X_t = Y_t + \mu(t)$, $t = 1, 2, \ldots, N$, is a correlation coefficient stationary time series, with time-varying mean $\mu(t)$ and constant variance $\mathrm{Var}(X_t) = \sigma^2$. For this series, $Y_t$ and $\mu(t)$ represent the random part and the deterministic part, respectively. Sequences of this kind are very common in practice, such as ground movement and deformation time series, meteorological data, and observation sequences in other fields. Wang [32] called it a variance stationary sequence.
Let $Y_t$ be a zero mean covariance stationary series and let $\sigma(t)$ be a deterministic positive function; then $X_t = \sigma(t) \times Y_t$, $t = 1, 2, \ldots, N$, is a correlation coefficient stationary time series with zero mean and time-varying variance. Amplitude modulation signals, commonly seen in radio communication, monitoring, and other fields, belong to this case. In these signals, the carrier signal is the zero mean covariance stationary series $Y_t$, and the modulating signal is the positive deterministic function $\sigma(t)$.
Considering a more composite circumstance, we assume that $Y_t$ is a zero mean covariance stationary series, $\mu(t)$ is a deterministic function, and $\sigma(t)$ is a positive deterministic function; then, it can be inferred that $X_t = \sigma(t) \times Y_t + \mu(t)$, $t = 1, 2, \ldots, N$, is a correlation coefficient stationary time series with time-varying mean and variance. This is a combination of the former two cases.
Furthermore, correlation coefficient stationary series also include any sequences that satisfy the correlation coefficient stationarity conditions. The correlation coefficient stationarity test method will be discussed in Section 4.
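The composite case can be illustrated by generating a synthetic series. The sketch below is only an illustration under stated assumptions: the stationary part is taken to be a Gaussian AR(1) process with unit variance, and the function name `simulate_ccar1`, the coefficient `phi`, and the example mean and standard deviation functions are hypothetical choices, not the specifications used in the paper.

```python
import math
import random

def simulate_ccar1(n, phi, mu, sigma, seed=0):
    """Generate X_t = mu(t) + sigma(t) * Y_t for t = 1..n.

    Y_t is a zero-mean, unit-variance Gaussian AR(1) process (an assumption
    made here for illustration); mu and sigma are deterministic callables.
    """
    rng = random.Random(seed)
    innov_sd = math.sqrt(1.0 - phi ** 2)   # keeps Var(Y_t) equal to 1
    y = rng.gauss(0.0, 1.0)                # start from the stationary law
    xs = []
    for t in range(1, n + 1):
        if t > 1:
            y = phi * y + rng.gauss(0.0, innov_sd)
        xs.append(mu(t) + sigma(t) * y)
    return xs

# Hypothetical linear mean and linear standard deviation.
x = simulate_ccar1(200, 0.5,
                   mu=lambda t: 0.05 * t,
                   sigma=lambda t: 1.0 + 0.01 * t)
print(len(x), x[:3])
```

Setting `sigma` to a constant reproduces the first case (time-varying mean only), and setting `mu` to zero reproduces the amplitude modulation case.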

Function Form Determination of Mean and Standard Deviation
One can hardly obtain the mean and standard deviation functions efficiently when these two functions vary discontinuously with time. Consequently, in this study, we consider the general case in which the mean and standard deviation functions vary with time continuously and slowly. This is a common assumption in model construction for time series analysis. In our theoretical study, we found that some constraints have to be satisfied for a rigorous derivation. Accordingly, we propose the following theorem for determining the mean and standard deviation functions for nonstationary time series.
where the associated constant is a positive real number. See Appendix A for the proof of the theorem.
Generally speaking, $[\Delta\sigma(t)/\sigma(t)]_{\max}$ can be considered negligibly small when it is two orders of magnitude smaller than one. It can be inferred from the above theorem that the standard deviation $\sigma(t)$ has the same function form as the mean function of the series $|X_{t+1} - \mu(t+1) - X_t + \mu(t)|$, $t = 1, 2, \ldots, N-1$. Consequently, in the process of mean and standard deviation function form determination, we need to conduct the following steps: (1) obtain the trend estimator

Otherwise, when the theorem conditions cannot be met, that is, when $\Delta\sigma(t)/\sigma(t)$ is not a constant and $[\Delta\sigma(t)/\sigma(t)]_{\max}$ is not negligibly small compared with one, we have to change the determination strategy. In this case, the standard deviation function $\sigma(t)$ has the same function form as the trend item of the series $|X_t - \mu(t)|$, $t = 1, 2, \ldots, N$. That is, for the correlation coefficient stationary series $X_t$, $t = 1, 2, \ldots, N$, the standard deviation function can be determined by (4), where the associated constant is a positive real number. See Appendices for proof.
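The idea that the mean of the absolute differenced series tracks $\sigma(t)$ can be checked in a special case. The sketch below is not the theorem itself; it assumes a zero-mean series with a Gaussian AR(1) stationary part of unit variance, for which $E|X_{t+1} - X_t|$ has a closed form, and it compares the ratio $E|X_{t+1} - X_t| / \sigma(t)$ across $t$. All names and numerical values are illustrative assumptions.

```python
import math

def mean_abs_diff(t, sigma, phi):
    # E|X_{t+1} - X_t| for X_t = sigma(t) * Y_t, Y_t Gaussian AR(1) with unit
    # variance and lag-1 correlation phi (assumed setting, zero mean).
    s0, s1 = sigma(t), sigma(t + 1)
    var = s0 ** 2 + s1 ** 2 - 2.0 * phi * s0 * s1
    # For a zero-mean Gaussian variable Z, E|Z| = sqrt(2/pi) * sd(Z).
    return math.sqrt(2.0 / math.pi) * math.sqrt(var)

def ratio_spread(sigma, phi, n):
    # Smallest and largest value of E|X_{t+1} - X_t| / sigma(t), t = 1..n-1.
    ratios = [mean_abs_diff(t, sigma, phi) / sigma(t) for t in range(1, n)]
    return min(ratios), max(ratios)

slow = lambda t: 1.0 + 0.005 * t      # hypothetical: relative step about 0.5%
fast = lambda t: 1.0 + 0.5 * t        # hypothetical: relative step up to 33%

print(ratio_spread(slow, 0.5, 200))   # nearly constant ratio
print(ratio_spread(fast, 0.5, 200))   # ratio clearly varies with t
```

When the relative increment of `sigma` is small, the ratio is almost constant, so the mean of the absolute differences has nearly the same function form as `sigma`; when the increment is large, the ratio drifts with $t$, which is the situation in which the alternative strategy based on $|X_t - \mu(t)|$ applies.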
The function form determination of the mean and variance focuses on obtaining the trend item of a series $X_t$, $t = 1, 2, \ldots, N$. We consider that the trend function contains a nonperiodic part and a periodic part. In this paper, we propose a rolling window method called the "moving multiple-point average method" for the determination of the sequence trend item. In order to determine the nonperiodic part of the trend item, the proposed method moves the multiple-point average along the whole sample data length $N$. Meanwhile, we adopt the sample periodogram method to obtain the periodic part. The following steps can be performed.
(1) Determine the periodic part of trend item with sample periodogram method.
When the sample size $N$ is an odd number, the coefficients $\alpha_j$ and $\beta_j$, $j = 1, 2, \ldots, q$, can be calculated by (6). When the sample size $N$ is an even number, the coefficients $\alpha_j$ and $\beta_j$, $j = 1, 2, \ldots, q - 1$, can also be worked out by (6) and (7). Then, we introduce a parameter $A_j = \sqrt{\alpha_j^2 + \beta_j^2}$ depicting the amplitude of frequency $\omega_j$, $j = 1, 2, \ldots, q$. When one or more $A_k$ is significantly greater than the others $A_1, A_2, \ldots, A_{k-1}, A_{k+1}, \ldots, A_q$, we can affirm that a periodic item with frequency $\omega_k$ exists. The existing periodic item with frequency $\omega_k$ can then be expressed as $\alpha_k \cos[\omega_k (t - 1)] + \beta_k \sin[\omega_k (t - 1)]$. For the circumstance of multiple frequencies, the periodic item is the sum of the periodic items corresponding to each crest value, as in (8).

(2) Calculate nonperiodic part of trend item with moving multiple-point average method.
With the results of step (1), the nonperiodic trend part can be obtained by (9). Then we select the point number of each averaging segment $m$ (subsequence length) and the moving time interval $\Delta$ based on the volatility of the series obtained from (9). Generally speaking, $m$ is in the range $[N/50, N/2]$, where $N$ is the sample length, and $\Delta$ is in the range $[m/10, m/2]$. In order to get an accurate periodic part of the trend item, the averaging point number $m$ must be not less than $\mathrm{ent}(2\pi/\omega_{\min})$, where $\omega_{\min}$ is the smallest frequency in the determined periodic function, that is, (8).
Then, we implement the moving $m$-point average over the whole sample data length $N$ with the moving time interval $\Delta$ and obtain a group of mean values, where $\mathrm{ent}((N - m)/\Delta) + 1$ is the total number of averaging operations.
(3) Redetermine the periodic part of trend item.
Based on the nonperiodic trend function obtained above, the series in (12) can be calculated. Processing the series obtained from (12), the periodic part function expressed by (8) can be redetermined with the periodogram method. In this way, the function forms of the mean $\mu(t)$ and the standard deviation $\sigma(t)$ can be determined. In order to facilitate the parameter estimation process, we express the time-varying mean and standard deviation functions in parametric form, where $f_0(t) = g_0(t) = 1$, the functions $f_i(t)$, $i = 1, 2, \ldots$, and $g_j(t)$, $j = 1, 2, \ldots$, are known through the above determination process, and $\mathbf{a} = (a_0, a_1, \ldots)^T$ and $\mathbf{b} = (b_0, b_1, \ldots)^T$ are the sets of unknown parameters to be calculated.
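The two building blocks of steps (1) to (3) can be sketched in code. This is a minimal illustration, not the paper's implementation: the periodogram coefficients use the standard Fourier regression formulas at the frequencies $2\pi j/N$ (an assumption, since the paper's own expressions are not reproduced here), the Nyquist term for even sample sizes is omitted, and each window average is paired with the mean of its time indices (also an assumption). The function and parameter names are hypothetical.

```python
import math

def periodogram_amplitudes(x):
    """Return (omega_j, alpha_j, beta_j, A_j) at omega_j = 2*pi*j/N.

    Standard Fourier-regression coefficients (assumed form); only the
    frequencies strictly below the Nyquist frequency are evaluated.
    """
    n = len(x)
    q = (n - 1) // 2
    out = []
    for j in range(1, q + 1):
        w = 2.0 * math.pi * j / n
        # enumerate() starts at 0, which plays the role of (t - 1), t = 1..N.
        alpha = 2.0 / n * sum(v * math.cos(w * k) for k, v in enumerate(x))
        beta = 2.0 / n * sum(v * math.sin(w * k) for k, v in enumerate(x))
        out.append((w, alpha, beta, math.hypot(alpha, beta)))
    return out

def moving_multipoint_average(x, m, step):
    """Average m consecutive points, shifting the window by `step` each time."""
    n = len(x)
    count = (n - m) // step + 1              # ent((N - m)/step) + 1 windows
    points = []
    for i in range(count):
        start = i * step
        window = x[start:start + m]
        t_mean = start + (m + 1) / 2.0       # mean of the 1-based time indices
        points.append((t_mean, sum(window) / m))
    return points

# Hypothetical series: one cosine component of amplitude 2 at j = 5, N = 100.
series = [2.0 * math.cos(2.0 * math.pi * 5 * k / 100) for k in range(100)]
amps = periodogram_amplitudes(series)
peak = max(range(len(amps)), key=lambda i: amps[i][3])
print(peak + 1, round(amps[peak][3], 6))

# Windows of m = 4 points moved by 2 over x_t = t, t = 1..10.
print(moving_multipoint_average(list(range(1, 11)), m=4, step=2))
```

In this sketch the dominant amplitude identifies the periodic component, which would be subtracted before the moving average is applied; a trend function could then be fitted to the averaged points by least squares.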

Model Construction and Testing
See Appendices for proof.

CCARMA Model Testing.
Here, we focus on the model testing method, which can demonstrate whether the proposed model is appropriate to describe the observation data. For the nonstationary series $X_t$, $t = 1, 2, \ldots, N$, based on the parameter estimation process given above, we can obtain its mean function $\mu(t)$ and standard deviation function $\sigma(t)$. Then we can implement the correlation coefficient stationarity test of the series $X_t$ through the covariance stationarity test of the series $Y_t$, where $Y_t = [X_t - \mu(t)]/\sigma(t)$. Several tests have been proposed and applied to examine covariance stationarity in the literature. In this paper, we take the postsample prediction testing method presented by Pagan and Schwert [35] as an example to illustrate the testing procedure. The postsample prediction test for covariance stationarity is a nonparametric method; it is easy to implement and familiar to scholars.
Second, split the series $Y_t$, $t = 1, 2, \ldots, N$, evenly into two parts, and calculate the sample variance of each part. If $Y_t^2$ is a covariance stationary process with autocovariances $\gamma_j$, let $\nu = \gamma_0 + 2\sum_{j=1}^{\infty} \gamma_j$; then $\nu$ can be estimated from the estimated serial correlation coefficients of $Y_t^2$ calculated over the whole sample. Finally, define the null hypothesis $H_0$: $Y_t$, $t = 1, 2, \ldots, N$, is a covariance stationary series, versus the alternative hypothesis $H_1$: $Y_t$, $t = 1, 2, \ldots, N$, is not a covariance stationary series. Construct the test statistic; the rejection region is expressed through the percentiles of the standard normal distribution, where $\alpha$ is the selected significance level indicating the probability of type I error.
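The testing procedure can be sketched as follows. This is a hedged illustration in the spirit of the postsample prediction test, not a transcription of the paper's formulas: the Bartlett-weighted estimator of $\nu$ with truncation lag 8, the normalization of the statistic, and the two-sided rejection rule are assumptions, and the function name and arguments are hypothetical.

```python
import math
from statistics import NormalDist

def postsample_prediction_test(y, alpha=0.05, max_lag=8):
    """Compare the mean of y_t**2 over the two halves of the sample.

    Returns (statistic, critical value, reject flag). The long-run variance
    estimator and its truncation lag are assumptions of this sketch.
    """
    n = len(y)
    half = n // 2
    sq = [v * v for v in y]
    m1 = sum(sq[:half]) / half               # second moment, first part
    m2 = sum(sq[half:]) / (n - half)         # second moment, second part
    mean_sq = sum(sq) / n
    dev = [s - mean_sq for s in sq]

    def gamma(j):
        # Sample autocovariance of y_t**2 at lag j over the whole sample.
        return sum(dev[t] * dev[t - j] for t in range(j, n)) / n

    nu = gamma(0) + 2.0 * sum((1.0 - j / (max_lag + 1.0)) * gamma(j)
                              for j in range(1, max_lag + 1))
    stat = (m1 - m2) / math.sqrt(nu * (1.0 / half + 1.0 / (n - half)))
    crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return stat, crit, abs(stat) > crit

# Hypothetical standardized series whose scale jumps halfway through.
y_demo = [1.0, -1.0] * 100 + [5.0, -5.0] * 100
print(postsample_prediction_test(y_demo))
```

For a series whose variance clearly differs between the two halves, as in the demonstration input, the statistic falls in the rejection region; in the modeling workflow the input would be the standardized series $[X_t - \mu(t)]/\sigma(t)$.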

Simulation Experiment
To assess the computational performance of the proposed method and determine whether the approach gives reasonable results, we study the effectiveness and performance of the methods presented in Sections 3 and 4 using a Monte Carlo simulated example.
The following zero mean CCAR(1) model

Nonzero mean CCAR($p$) and CCARMA($p$, $q$) models with higher orders and different parameter specifications were also considered to verify the theorem, and the results were very similar to those reported in this paper.

Empirical Results
In this section, we focus on the practical performance of the proposed approach. Experiments are presented for the four different economic data sets described in Section 6.1. In Section 6.2, the mean and standard deviation functions are determined. Then, in Section 6.3, we apply the statistical test method discussed in Section 4. Finally, Section 6.4 is devoted to evaluating the forecasting performance of the correlation coefficient stationary method.

The Data Sets
Daily CIR. The daily returns to the composite index for Shanghai from January 5, 1999, through September 30, 2003 (1131 observations).
Monthly MPL. The monthly maximum power load for Guangxi from January 1990 through December 1999 (120 observations).

Determination of Mean and Variance.
In order to test the stability of the four data sets, we apply the determination method presented in Sections 3 and 4 to them. All parameter results are summarized in Tables 3 and 4. Note that the standard deviation functions $\sigma(t)$ are obtained by (3) for the Daily CIR, Monthly MPL, and Daily FX rate data sets and by (4) for Monthly M2, because the theorem condition is satisfied for the former three sequences.

Testing for Correlation Coefficient Stationarity.
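As a second step, we apply the correlation coefficient stationarity test presented in Section 4 to the four data sets. All results are summarized in Table 5. We first derive the sequence $Y_t$ by the transformation $Y_t = [X_t - \mu(t)]/\sigma(t)$ with the results listed in Table 3. Let the significance level be $\alpha = 0.05$; the rejection region is then where the test statistic exceeds the $100(1 - \alpha/2)$th percentile of the standard normal distribution, which equals 1.96.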
From the results, we can conclude that the four data sets are correlation coefficient stationary time series, and the application of the proposed method is reasonable.

Prediction.
Based on the CCAR models constructed above, the forecast of the original sequence $X_t$ can be obtained by (1). In this section, the autoregressive integrated moving average (ARIMA) model, the variable differential (VD) model, the generalized autoregressive conditional heteroscedasticity (GARCH) model, the Grey prediction model GM(1, 1), and the modified GM(1, 1) model are used as benchmarks to study the forecasting accuracy of the proposed approach. The future 5 daily Shanghai composite index data, the future 24 monthly maximum power load data for Guangxi from January

From the results, we can conclude that the application of the proposed method to correlation coefficient stationary time series is reasonable and effective. Furthermore, the presented methodology shows promise for future applications in nonstationary time series analysis and forecasting.
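The forecasting step can be illustrated for the simplest case. The sketch below assumes a CCAR(1) structure in which the standardized series follows a zero-mean AR(1) process with coefficient `phi`, so that its $h$-step forecast is $\phi^h$ times the last standardized value; the forecast is then mapped back through the mean and standard deviation functions. The function name and all arguments are hypothetical, and the sketch is not the paper's forecasting equation.

```python
def forecast_ccar1(x_last, t_last, horizon, phi, mu, sigma):
    """Point forecasts of X at times t_last + 1, ..., t_last + horizon.

    Assumes Y_t = (X_t - mu(t)) / sigma(t) is a zero-mean AR(1) process,
    whose h-step forecast is phi**h * Y_{t_last}.
    """
    y_last = (x_last - mu(t_last)) / sigma(t_last)
    return [mu(t_last + h) + sigma(t_last + h) * (phi ** h) * y_last
            for h in range(1, horizon + 1)]

# Hypothetical constant mean 10 and standard deviation 2; last value 14.
print(forecast_ccar1(14.0, 100, 2, phi=0.5,
                     mu=lambda t: 10.0, sigma=lambda t: 2.0))
```

With time-varying `mu` and `sigma`, the same mapping extrapolates the fitted trend and volatility functions to the forecast horizon, which is why their function forms should be chosen with the prediction range in mind.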

Discussion
In this paper, we discussed the category of correlation coefficient stationary series, a class of nonstationary time series with time-varying mean and variance. We proposed a moving determination method for the time-varying mean function and standard deviation function. We also discussed the correlation coefficient stationarity test method.
The determination of the function form and order of the mean $\mu(t)$ and standard deviation $\sigma(t)$ cannot be separated from an analysis of the primary sequence. It is worth noting that, in prediction problems, the function models of $\mu(t)$ and $\sigma(t)$ should also trade off between sequence volatility and accuracy requirements.

B. Proof of PDF Relationship
Let us derive the relationship between joint PDF    , −1 ,...,

Monthly M2. The monthly money supply for China from January 2000 through December 2009 (112 observations).
Daily FX Rate EUR/USD. Euro to United States dollar parity from January 1, 2005, through December 30, 2005 (260 observations).
specifies the MA polynomial, and the innovations are i.i.d.

Basic properties show that the covariance stationary series is a special case of the correlation coefficient stationary sequence. When the mean and variance do not vary with time, that is, $E(X_t) = \mu$ and $\mathrm{Var}(X_t) = \sigma^2$, the correlation coefficient stationary series $X_t$, $t = 1, 2, \ldots, N$, degenerates to a covariance stationary sequence.

Table 2 .
) is considered with four simulation function forms for the standard deviation, including a linear function, a quadratic function, a periodic function, and a combination of a periodic function and a linear function. The parameters assumed in each experiment are summarized in Table 1 for convenience. The sample size is $N = 200$. The point number of averaging and the moving time interval for each experiment are also listed in Table 1. The series $|X_{t+1} - X_t|$ and $|X_t|$ can be taken to determine the standard deviation function $\sigma(t)$ by (3) (Theorem method) presented in Section 3, since the mean of the series $X_t$ equals zero, that is, $\mu(t) = 0$. Simulation results are analyzed by the index of average percent relative error $\mathrm{err} = (1/N)\sum_{t=1}^{N} |\sigma(t) - \cdots$

The results are based on 20000 Monte Carlo simulations with innovations drawn from an IID Gaussian distribution.

Table 1 :
Model specifications by experiments. The table lists the point number of each averaging and the moving time interval, the autocorrelation coefficient of the simulation model, the three model parameters of the standard deviation function, and $[\Delta\sigma(t)/\sigma(t)]_{\max}$, the maximum of $\Delta\sigma(t)/\sigma(t)$. Symbol "\" indicates that the parameter does not exist.

Table 2 :
Results of parameter and the average percent relative error for each experiment.

Table 1 .
Symbol "\" indicates that the parameter does not exist.

Table 4 :
Model parameter results.

Table 5 :
Testing results of correlation coefficient stationarity.

Table 6 :
Prediction results for future five daily Shanghai composite indexes.

Table 7 :
Prediction results of the Guangxi monthly maximum power load (%).

Table 8 :
Prediction results of China money supply.