DNA Optimization Threshold Autoregressive Prediction Model and Its Application in Ice Condition Time Series

The threshold autoregressive prediction model for nonlinear time series has many parameters that are very difficult to calibrate. The threshold value, the autoregressive coefficients, and the delay time are its key parameters. To improve prediction precision and reduce the uncertainties in determining these parameters, a new DNA (deoxyribonucleic acid) optimization threshold autoregressive prediction model (DNAOTARPM) is proposed by combining the threshold autoregressive method with the DNA optimization method. The optimal parameters are selected by minimizing an objective function. Real ice condition time series at Bohai are used to validate the new method. The prediction results indicate that the new method can choose the optimal parameters in the prediction process. Compared with the improved genetic algorithm threshold autoregressive prediction model (IGATARPM) and the standard genetic algorithm threshold autoregressive prediction model (SGATARPM), DNAOTARPM has higher precision and faster convergence speed for predicting nonlinear ice condition time series.


Introduction
Many natural phenomena, such as ice conditions and runoff, are nonlinear, complex, and dynamic processes. Prediction of ice conditions is of primary importance for weather forecasting, agriculture, geosciences, and marine transportation safety. Nonlinear time series are very difficult to simulate with traditional deterministic mathematical models, which poses new challenges for calibrating the parameters [1, 2]. There are many methods for predicting nonlinear time series [3-10]. Threshold autoregressive (TAR) models are typically applied to time series data as an extension of autoregressive models, giving a higher degree of flexibility in the model parameters through regime-switching behavior. TAR models were introduced by Tong and Li in 1977 and more fully developed in the seminal paper [11]. The threshold autoregressive model is a special case of Tong's general threshold autoregressive models. The latter allow the threshold variable to be very flexible, for example an exogenous time series in the open-loop threshold autoregressive system [11-13]. For a comprehensive review of developments over the 30 years since the birth of the model, see Tong [14]. However, the existing threshold autoregressive models still carry uncertainties in determining the threshold parameters, the autoregressive coefficients, and the delay time. To improve prediction accuracy, the key problem is therefore how to determine the parameters of the prediction model.
Global optimization over all the parameters is mathematically intractable. When an objective function has many local extreme points, traditional optimization methods may fail to obtain the global optimal solution. The genetic algorithm (GA), based on the genetic evolution of a species, was proposed by Holland [15]. GA is a global optimization algorithm. However, its computational cost is very large, and it suffers from premature convergence [16-20]. Adleman [21] showed that DNA can be used to solve a computationally hard problem, and many scientists have since used DNA computation to solve real problems [22-24]. In this study, the DNA optimization threshold autoregressive prediction model (DNAOTARPM) is presented to determine the parameters and to improve the calculation precision for predicting ice condition time series. Real ice condition time series are used to validate the new method.

DNA Optimization Threshold Autoregressive Prediction Model (DNAOTARPM)
The TAR model is a tool for predicting future values in time series assuming that the behavior of the time series changes once the time series shifts to a different regime.
where $r_0 = -\infty$, $r_k = \infty$, and $r_j$ ($j = 1, 2, \ldots, k-1$) are $k-1$ nontrivial threshold parameters dividing the domain into $k$ different regimes; $d$ is the delay time parameter, $b_{j,l}$ are the regressive coefficients in the $j$th regime, $e_{j,i}$ stands for the white-noise error term with constant variance, and $p_j$ is the autoregressive order in the $j$th regime of the model. The threshold parameters satisfy the ordering constraint $r_0 < r_1 < \cdots < r_{k-1} < r_k$. Here $d$, $k$, $r_1, r_2, \ldots, r_{k-1}$, $p_1, p_2, \ldots, p_k$, and $b_{j,l}$ are the parameters of the TAR model. It is very difficult to determine these parameters with traditional methods.
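The symbols defined above fit the standard self-exciting TAR form, which can be sketched as follows. This is a reconstruction under assumptions rather than the original equation: the intercept term $b_{j,0}$ and the exact placement of the strict and non-strict inequalities are assumed.

```latex
x_i = b_{j,0} + \sum_{l=1}^{p_j} b_{j,l}\, x_{i-l} + e_{j,i},
\qquad r_{j-1} < x_{i-d} \le r_j, \quad j = 1, 2, \ldots, k .
```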
In this paper, we use the DNA optimization method to determine the parameters and improve model accuracy. The new model, the DNA optimization threshold autoregressive prediction method (DNAOTARPM), is described as follows.
Step 1 (Determine the delay time d and the number of regressive coefficients). The delay time $d$ is determined by the autocorrelation function method [21]. The autocorrelation function $R_j$ is calculated for each delay time $j$ (2.3).
The delay time $d$ is selected when the autocorrelation function $R_j$ [25] satisfies condition (2.4), where $u_{\alpha/2}$ is the upper $100 \cdot \alpha/2$ percentage point of the normal distribution for the $1 - \alpha$ confidence level. The number of regressive coefficients satisfies $p_l \le \max j$, $l = 1, 2, \ldots, k$. Some of the $j$ values are regarded as the delay time.
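The lag selection in Step 1 can be sketched in Python as follows. This is a minimal illustration, not the paper's exact formulas: the sample autocorrelation normalization and the Anderson-type tolerance band used in `significant_lags` are assumptions, and both function names are hypothetical.

```python
import math
from statistics import NormalDist


def autocorrelation(x, j):
    """Sample autocorrelation of the sequence x at lag j (0 < j < len(x))."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[i] - mean) * (x[i - j] - mean) for i in range(j, n))
    return num / denom


def significant_lags(x, max_lag, alpha=0.30):
    """Return the lags whose autocorrelation falls outside the tolerance band.

    Assumption: Anderson-type bounds (-1 +/- u * sqrt(n - j - 1)) / (n - j),
    where u is the upper alpha/2 point of the standard normal distribution.
    """
    n = len(x)
    u = NormalDist().inv_cdf(1 - alpha / 2)
    lags = []
    for j in range(1, max_lag + 1):
        half_width = u * math.sqrt(n - j - 1)
        lower = (-1 - half_width) / (n - j)
        upper = (-1 + half_width) / (n - j)
        r = autocorrelation(x, j)
        if r < lower or r > upper:
            lags.append(j)
    return lags
```

With `alpha=0.30` the band corresponds to the 70% confidence level used for the ice condition series; the lags returned are the candidates for the delay time.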
Step 2 (Determine the number and ranges of threshold parameters). Considering the pairs $(x_{i-d}, x_i)$ formed from the time series $x_i$ ($i = 1, 2, \ldots, n$), we divide the values of $x_{i-d}$ into $s$ parts ($s > k$). Suppose there are $N_j$ values of $x_{i-d}$ in the $j$th part, and denote the corresponding $x_i$ by $x_{i,j}$. In the $j$th part, the conditional expectation of $x_i$ given the event $X = x_{i-d}$, written $E(x_i \mid x_{i-d})$, is then evaluated. Taking $x_{i-d}$ as the horizontal axis and $E(x_i \mid x_{i-d})$ as the vertical axis gives a scatter plot. When the scatter plot forms a piecewise linear map, we can estimate the number and ranges of the threshold parameters: the number of breakpoints of the piecewise linear map is the number of threshold parameters, and the ranges of the breakpoints are the ranges of the threshold parameters.
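The scatter plot data of Step 2 can be produced with a short Python sketch. It assumes equal-width parts and estimates the conditional expectation in each part by the average of the corresponding $x_{i,j}$; both choices, and the function name, are illustrative assumptions.

```python
def conditional_expectation(x, d, s):
    """Estimate E(x_i | x_{i-d}) on s equal-width parts of the x_{i-d} axis.

    Returns a list of (part_centre, mean_of_x_i) for the non-empty parts.
    """
    pairs = [(x[i - d], x[i]) for i in range(d, len(x))]
    lo = min(lagged for lagged, _ in pairs)
    hi = max(lagged for lagged, _ in pairs)
    width = (hi - lo) / s or 1.0  # fall back to 1.0 if all lagged values coincide
    sums = [0.0] * s
    counts = [0] * s
    for lagged, current in pairs:
        j = min(int((lagged - lo) / width), s - 1)  # the largest value goes to the last part
        sums[j] += current
        counts[j] += 1
    return [
        (lo + (j + 0.5) * width, sums[j] / counts[j])
        for j in range(s)
        if counts[j] > 0
    ]
```

Plotting the returned points and looking for changes of slope gives a rough estimate of where the thresholds lie.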
Step 3 (Construct the objective function). The parameter estimates for DNAOTARPM are obtained from the following objective function, namely, the mean of least residual absolute value sum.
Step 4 (Solve the objective function by the DNA optimization method). Solving for the parameters $r_1, r_2, \ldots, r_{k-1}$ and $b_{j,l}$ ($j = 1, 2, \ldots, k$; $l = 1, 2, \ldots, p_j$) in the optimization objective function (2.7) is a nonlinear optimization problem, which is rather difficult to handle with a traditional optimization method. The model can be solved by the DNA optimization method described below [24]. The $k$-regime prediction formula is given in detail in the application part.
If objective function (2.7) is solved with the improved genetic algorithm [18], we call the method the improved genetic algorithm threshold autoregressive prediction method (IGATARPM); if it is solved with the standard genetic optimization method [15], we call the method the standard genetic algorithm threshold autoregressive prediction method (SGATARPM).
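The objective of Step 3 can be illustrated with a small Python sketch that evaluates the mean absolute residual of a TAR model on a series. The data layout (one dictionary of lag-to-coefficient per regime), the absence of an intercept, and the function names are assumptions made for the example.

```python
def tar_predict(history, thresholds, coeffs, d):
    """One-step TAR prediction from past values (most recent value last).

    thresholds: sorted list [r_1, ..., r_{k-1}].
    coeffs: list of k dicts mapping lag l to coefficient b_{j,l} (no intercept).
    """
    switch = history[-d]  # threshold variable x_{i-d}
    regime = sum(1 for r in thresholds if switch > r)  # number of thresholds exceeded
    return sum(b * history[-lag] for lag, b in coeffs[regime].items())


def mean_abs_residual(x, thresholds, coeffs, d):
    """Mean absolute one-step residual of the TAR model over the series x."""
    max_lag = max(max(max(c) for c in coeffs), d)
    errors = [
        abs(x[i] - tar_predict(x[:i], thresholds, coeffs, d))
        for i in range(max_lag, len(x))
    ]
    return sum(errors) / len(errors)
```

Any optimizer, whether GA-based or the DNA optimization method, then searches over `thresholds` and `coeffs` to minimize the value returned by `mean_abs_residual`.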

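Consider the following optimization problem (3.1): minimize $f(c)$ subject to $a_j \le c_j \le b_j$ ($j = 1, 2, \ldots, p$), where $c = \{c_j,\ j = 1, 2, \ldots, p\}$, $c_j$ is a parameter to be optimized, $f$ is an objective function with $f \ge 0$, and $[a_j, b_j]$ is the range of $c_j$.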
The procedure of the DNA optimization method (DNAOM) is as follows [25].
Step 1 (DNA encoding). Suppose the DNA-encoding length is $m$ for every parameter and the range of the $j$th parameter is the interval $[a_j, b_j]$. Each interval is divided into $2^m - 1$ subintervals, where the length of each subinterval of the $j$th parameter, $h_j = (b_j - a_j)/(2^m - 1)$, is constant. The searching location $I_j = \theta_j \cdot (2^m - 1)$ is an integer with $0 \le I_j < 2^m$, and $\theta_j$ is a random variable with $0 \le \theta_j \le 1$, for $j = 1, 2, \ldots, p$.
The DNA code array of the $j$th parameter is denoted by the grid points $\{d_{j,k} \mid k = 1, 2, \ldots, m\}$ for every individual (3.3). DNAOM operates on a population of individuals (also called DNA code arrays, strings, or chromosomes). Each individual represents a potential solution to the problem. In the correspondence between DNA code and binary code, the first position value ("1" or "0") expresses the position of the DNA code, and the second position value ("1" or "0") expresses the true value of the binary code and the value of the DNA code.
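The encoding step can be sketched in Python as below. The sketch treats each parameter's code as a plain $m$-bit binary string and maps a searching location back to a parameter value as $a_j + I_j h_j$; that decoding rule, the truncation to an integer, and all function names are assumptions for illustration.

```python
import random


def random_location(m, rng=random):
    """Draw a searching location I in [0, 2**m - 1] from a uniform theta in [0, 1)."""
    theta = rng.random()
    return int(theta * (2 ** m - 1))


def decode(location, a, b, m):
    """Map a searching location to a parameter value in [a, b]."""
    h = (b - a) / (2 ** m - 1)  # constant subinterval length h_j
    return a + location * h


def to_bits(location, m):
    """Encode a location as a list of m bits, most significant bit first."""
    return [(location >> (m - 1 - k)) & 1 for k in range(m)]


def from_bits(bits):
    """Inverse of to_bits: rebuild the integer location from its bits."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value
```

With $m = 10$, each parameter range is covered by 1024 grid points, so the resolution of one encoding pass is $(b_j - a_j)/1023$.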
Step 2 (Creating the initial population). To cover the whole solution space and to prevent individuals from clustering in the same region, a large, uniformly distributed random population is used in this algorithm. Once the initial parent population has been generated, decoding and fitness evaluation are carried out.
Step 3 (Evaluating the fitness value of each individual). The smaller the value $f_i$ is, the higher the fitness of the corresponding $i$th chromosome is ($i = 1, 2, \ldots, N$). The fitness function of the $i$th chromosome is therefore defined to increase as $f_i$ decreases.
Step 5 (Two-point crossover and two-point mutation). Perform crossover and mutation on the chromosomes in the same way as in GA.
Step 6 (DNA evolution). Repeat Steps 3-6 until the number of evolutions $q = Q$ ($Q$ is the total number of evolutions) or the termination condition is satisfied.
Step 7 (Accelerating cycle). The parameter ranges of the $n_e$ excellent individuals obtained after $Q$ rounds of DNA-encoded optimal evolution are taken as the new ranges of the parameters, and the whole process returns to the DNA-encoding step. The DNAOM computation ends when the number of algorithm runs reaches the designed value $T$ or there exists an optimal chromosome $C_{fit}$ whose fitness satisfies a given criterion. In the former case, $C_{fit}$ is the fittest, that is, the most excellent, chromosome in the population. The chromosome $C_{fit}$ represents the solution [25].
The parameters of the DNAOM are selected as follows: the encoding length $m = 10$, the population size $N = 100$, the number of excellent individuals $n_e = 10$, the number of evolution rounds $Q = 3$, the crossover probability $p_c = 1.0$, and the mutation probability $p_m = 0.5$.
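The overall procedure (encoding, evolution for $Q$ rounds, then shrinking the parameter ranges to those of the $n_e$ best individuals) can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: the selection rule (truncation to the better half with elitism), the per-parameter application of crossover and mutation, the default `T`, and the function name `dnaom_minimize` are all assumptions.

```python
import random


def dnaom_minimize(f, bounds, m=10, N=100, n_e=10, Q=3,
                   p_c=1.0, p_m=0.5, T=20, seed=0):
    """Minimize f over box bounds with an accelerating, binary-coded GA sketch."""
    rng = random.Random(seed)
    top = 2 ** m - 1
    bounds = [tuple(b) for b in bounds]
    best_c, best_f = None, float("inf")

    def decode(ind, bnds):
        return [a + loc * (b - a) / top for loc, (a, b) in zip(ind, bnds)]

    for _ in range(T):  # accelerating cycles
        pop = [[rng.randint(0, top) for _ in bounds] for _ in range(N)]
        for _ in range(Q):  # evolution rounds within one cycle
            scored = sorted(pop, key=lambda ind: f(decode(ind, bounds)))
            parents = scored[: N // 2]  # assumed selection: keep the better half
            children = []
            while len(children) < N:
                p1, p2 = rng.sample(parents, 2)
                child = list(p1)
                for j in range(len(bounds)):
                    if rng.random() < p_c:  # two-point crossover on the m-bit code
                        lo, hi = sorted(rng.sample(range(m + 1), 2))
                        mask = ((1 << (hi - lo)) - 1) << lo
                        child[j] = (p1[j] & ~mask) | (p2[j] & mask)
                    if rng.random() < p_m:  # two-point mutation: flip two bits
                        for bit in rng.sample(range(m), 2):
                            child[j] ^= 1 << bit
                children.append(child)
            pop = scored[:n_e] + children[: N - n_e]  # assumed elitism
        scored = sorted(pop, key=lambda ind: f(decode(ind, bounds)))
        elite = [decode(ind, bounds) for ind in scored[:n_e]]
        if f(elite[0]) < best_f:
            best_c, best_f = elite[0], f(elite[0])
        # Shrink each parameter range to the span of the n_e best individuals.
        bounds = [
            (min(c[j] for c in elite), max(c[j] for c in elite))
            for j in range(len(bounds))
        ]
    return best_c, best_f
```

A call such as `dnaom_minimize(lambda c: (c[0] - 3) ** 2, [(0.0, 10.0)])` returns the best parameter vector found and its objective value. Because the ranges shrink around the current best individuals, later cycles search a finer grid, which is the intent of the accelerating cycle in Step 7.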

Application in Ice Condition Time Series
The real ice condition time series in this study is the annual ice condition at Bohai in China for the period 1966 to 1994 (29 years) [25]. For this time series, the first modeling data set covers the period 1966 to 1993 (28 years). The prediction period is 1970-1994 (25 years).

The Autocorrelation Function R j for Delay Time j
The autocorrelation functions of the time series are presented in Figure 1 at the 70% confidence level.
From Figure 1, we can see that only the values of $R_1$, $R_3$, and $R_4$ satisfy condition (2.4). So the delay time $d$ is 1, 3, or 4 in DNAOTARPM.

The Number and Ranges of Threshold Parameters
The number and ranges of the threshold parameters of the above ice condition time series are determined by the conditional expectation of $x_i$ given the event $X = x_{i-d}$. The scatter plot of the conditional expectation is shown in Figure 2.
From Figure 2, we can see that the map consists of two piecewise linear parts, and the breakpoint is around the mean value of the time series. So we set $y_i = x_i - \text{mean value}$, and the $k$-regime TAR($d$; $p_1, p_2, \ldots, p_k$) model takes the form (4.1) for $d = 1, 3, 4$, with $x_i = y_i + \text{mean value}$.
The parameters $r_1$ and $b_{j,l}$ ($j = 1, 2$; $l = 1, 3, 4$) are required in this model. In this work, these parameters are estimated with respect to one criterion, namely, the mean of least residual absolute value sum shown in (2.6).
For DNAOTARPM, the mean least residual absolute value sum $f$ is 0.5737, and the number of objective function evaluations is 900. The computational results of the above model are given in Table 1.
For IGATARPM, the number of objective function evaluations is 2700, and the prediction error $f$ is 0.6016.
For SGATARPM, the number of objective function evaluations is 2700, and the prediction error $f$ is 0.6380.
From Table 1, we can see that the prediction results for DNAOTARPM are better than those of the other methods. The prediction results of the practical example with the different methods are shown in Figure 3.
From Table 1 and Figure 3, we can see that the results achieved with DNAOTARPM are satisfactory in terms of global optimization and prediction precision. Compared with IGATARPM and SGATARPM, DNAOTARPM converges faster (900 objective function evaluations against 2700) and is more precise (a prediction error of 0.5737 against 0.6016 and 0.6380). It is therefore useful for parameter optimization of the nonlinear ice condition time series model.

Conclusions
To improve prediction precision and reduce the uncertainties in determining the parameters for forecasting nonlinear ice condition time series, a new DNA optimization threshold autoregressive prediction model (DNAOTARPM) is proposed in this paper. The ice condition time series at Bohai in China is studied by using DNAOTARPM. The main conclusions are as follows.

Figure 1: The autocorrelation function figure for the observed time series.

Figure 2: The scatter plot of the conditional expectation: (a) $E(x_i \mid x_{i-1})$ against $x_{i-1}$; (b) $E(x_i \mid x_{i-3})$ against $x_{i-3}$; (c) $E(x_i \mid x_{i-4})$ against $x_{i-4}$.

(1) DNAOTARPM is established by using the DNA optimization method and the threshold autoregressive model. The delay time $d$ is selected with the autocorrelation function, and the results indicate that $x_{i-1}$, $x_{i-3}$, and $x_{i-4}$ have a significant influence on the ice condition time series ($\alpha = 0.30$) at Bohai.